Nvidia GTC Keynote Insights

[Music] I am a visionary, illuminating galaxies to witness the birth of stars and sharpening our understanding of extreme weather events. I am a helper, guiding the blind through a crowded world ("I was thinking about running to the store") and giving voice to those who cannot speak ("to not make me laugh"). I am a transformer, harnessing gravity to store renewable power and paving the way towards unlimited clean energy for us all. I am a trainer, teaching robots to assist, to watch out for danger, and help save lives. I am a healer, providing a new generation of cures and new levels of patient care. ("Doctor, I am allergic to penicillin. Is it still okay to take the medications?" "Definitely. These antibiotics don't contain penicillin, so it's perfectly safe for you to take them.") I am a navigator, generating virtual scenarios to let us safely explore the real world and understand every decision. I even helped write the script, breathe life into the words. I am AI, brought to life by Nvidia, deep learning, and brilliant minds everywhere. [Music]

Please welcome to the stage Nvidia founder and CEO, Jensen Huang. [Applause]

Welcome to GTC. I hope you realize this is not a concert. You have arrived at a developers conference. There will be a lot of science described: algorithms, computer architecture, mathematics. I sensed a very heavy weight in the room all of a sudden, almost like you were in the wrong place. In no conference in the world is there a greater assembly of researchers from such diverse fields of science, from climate tech to radio sciences, trying to figure out how to use AI to robotically control MIMOs for next-generation 6G radios, robotic self-driving cars, even artificial intelligence. Even artificial intelligence. First, I noticed a sense of relief there all of a sudden. Also, this conference is represented by some amazing companies. This list, this is not the attendees; these are the presenters, and
what's amazing is this: if you take away all of my friends, close friends (Michael Dell is sitting right there) in the IT industry, all of the friends I grew up with in the industry, if you take away that list, this is what's amazing. These are the presenters of the non-IT industries using accelerated computing to solve problems that normal computers can't. It's represented in life sciences, healthcare, genomics, transportation of course, retail, logistics, manufacturing, industrial. The gamut of industries represented is truly amazing. And you're not here to attend only; you're here to present, to talk about your research. $100 trillion of the world's industries is represented in this room today. This is absolutely amazing.

There is absolutely something happening. There is something going on. The industry is being transformed, not just ours, because the computer is the single most important instrument of society today. Fundamental transformations in computing affect every industry. But how did we start? How did we get here? I made a little cartoon for you. Literally, I drew this. In one page, this is Nvidia's journey, started in 1993. This might be the rest of the talk.

1993: this is our journey. We were founded in 1993. There are several important events that happened along the way; I'll just highlight a few. In 2006, CUDA, which has turned out to have been a revolutionary computing model. We thought it was revolutionary then; it was going to be an overnight success, and almost 20 years later it happened. We saw it coming, two decades later. In 2012, AlexNet: AI and CUDA made first contact. In 2016, recognizing the importance of this computing model, we invented a brand new type of computer. We called it the DGX-1: 170 teraflops in this supercomputer, eight GPUs connected together for the very first time. I hand-delivered the very first DGX-1 to a startup located in San Francisco called OpenAI. DGX-1 was the world's first AI supercomputer. Remember, 170 teraflops. 2017: the Transformer arrived.
2022: ChatGPT captured the world's imagination and had people realize the importance and the capabilities of artificial intelligence. And 2023: generative AI emerged, and a new industry begins. Why is it a new industry? Because the software never existed before. We are now producing software, using computers to write software, producing software that never existed before. It is a brand new category. It took share from nothing. It's a brand new category, and the way you produce the software is unlike anything we've ever done before: in data centers, generating tokens, producing floating-point numbers at very large scale. It is as if, in the beginning of this last industrial revolution, people realized that you would set up factories, apply energy to them, and this invisible, valuable thing called electricity came out: AC generators. And 100 years later, 200 years later, we are now creating new types of electrons, tokens, using infrastructure we call factories, AI factories, to generate this new, incredibly valuable thing called artificial intelligence. A new industry has emerged.

Well, we're going to talk about many things about this new industry. We're going to talk about how we're going to do computing next. We're going to talk about the type of software that you build because of this new industry, the new software, how you would think about this new software. What about applications in this new industry? And then maybe what's next, and how can we start preparing today for what is about to come next?

But before I start, I want to show you the soul of Nvidia, the soul of our company, at the intersection of computer graphics, physics, and artificial intelligence, all intersecting inside a computer, in Omniverse, in a virtual world simulation. Everything we're going to show you today, literally everything we're going to show you today, is a simulation, not animation. It's only beautiful because it's physics; the world is beautiful. It's only amazing because it's being animated with robotics; it's being animated
with artificial intelligence. What you're about to see all day is completely generated, completely simulated in Omniverse, and all of it, what you're about to enjoy, is the world's first concert where everything is homemade. Everything is homemade. You're about to watch some home videos, so sit back and enjoy yourself. [Music]

God, I love it. Nvidia accelerated computing has reached the tipping point. General-purpose computing has run out of steam. We need another way of doing computing so that we can continue to scale, so that we can continue to drive down the cost of computing, so that we can continue to consume more and more computing while being sustainable. Accelerated computing is a dramatic speedup over general-purpose computing, and in every single industry we engage (and I'll show you many) the impact is dramatic. But in no industry is it more important than our own: the industry of using simulation tools to create products. In this industry it is not about driving down the cost of computing; it's about driving up the scale of computing. We would like to be able to simulate the entire product that we do, completely in full fidelity, completely digitally, in essentially what we call digital twins. We would like to design it, build it, simulate it, operate it completely digitally. In order to do that, we need to accelerate an entire industry, and today I would like to announce that we have some partners who are joining us in this journey to accelerate their entire ecosystem, so that we can bring the world into accelerated computing. But there's a bonus: when you become accelerated, your infrastructure is CUDA GPUs, and when that happens, it's exactly the same infrastructure for generative AI. And so I'm just delighted to announce several very important partnerships. These are some of the most important companies in the world. Ansys does engineering simulation for what the world makes. We're partnering with them
to CUDA-accelerate the Ansys ecosystem, to connect Ansys to the Omniverse digital twin. Incredible. The thing that's really great is that the installed base of Nvidia GPU-accelerated systems is all over the world, in every cloud, in every system, all over enterprises. And so the applications they accelerate will have a giant installed base to go serve. End users will have amazing applications, and of course system makers and CSPs will have great customer demand.

Synopsys. Synopsys is Nvidia's literally first software partner. They were there on the very first day of our company. Synopsys revolutionized the chip industry with high-level design. We are going to CUDA-accelerate Synopsys. We're accelerating computational lithography, one of the most important applications that nobody's ever known about. In order to make chips, we have to push lithography to the limit. Nvidia has created a domain-specific library that accelerates computational lithography incredibly. Once we can accelerate and software-define all of TSMC, who is announcing today that they're going to go into production with Nvidia cuLitho, once this is software-defined and accelerated, the next step is to apply generative AI to the future of semiconductor manufacturing, pushing geometry even further.

Cadence builds the world's essential EDA and SDA tools. We also use Cadence. Between these three companies, Ansys, Synopsys, and Cadence, we basically build Nvidia together. We are CUDA-accelerating Cadence. They're also building a supercomputer out of Nvidia GPUs so that their customers could do fluid dynamics simulation at a hundred, a thousand times the scale: basically a wind tunnel in real time. Cadence Millennium, a supercomputer with Nvidia GPUs inside. A software company building supercomputers; I love seeing that. We're building Cadence copilots together. Imagine a day when Cadence, Synopsys, Ansys, tool providers, would offer you AI copilots, so that we have thousands and thousands of copilot assistants helping us design chips, design systems. And
we're also going to connect the Cadence digital twin platform to Omniverse. As you can see the trend here: we're accelerating the world's CAE, EDA, and SDA so that we could create our future in digital twins, and we're going to connect them all to Omniverse, the fundamental operating system for future digital twins.

One of the industries that benefited tremendously from scale, and you all know this one very well, is large language models. Basically, after the Transformer was invented, we were able to scale large language models at incredible rates, effectively doubling every six months. Now, how is it possible that by doubling every six months we have grown the industry, we have grown the computational requirements, so far? The reason for that is quite simply this: if you double the size of the model, you double the size of your brain, you need twice as much information to go fill it. And so every time you double your parameter count, you also have to appropriately increase your training token count. The combination of those two numbers becomes the computation scale you have to support. The latest, the state-of-the-art OpenAI model is approximately 1.8 trillion parameters. 1.8 trillion parameters required several trillion tokens to go train. So a few trillion parameters, on the order of a few trillion tokens; when you multiply the two of them together, approximately 30, 40, 50 billion quadrillion floating-point operations per second. Now we just have to do some CEO math right now; just hang with me. So you have 30 billion quadrillion. A quadrillion is like a peta, and so if you had a petaflop GPU, you would need 30 billion seconds to go compute, to go train that model. 30 billion seconds is approximately 1,000 years. Well, 1,000 years, it's worth it. I'd like to do it sooner, but it's worth it, which is usually my answer when most people tell me, "Hey, how long is it going to take to do something?" "20 years." "It's worth it. But can we do it next week?" And so, 1,000 years.
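The back-of-the-envelope estimate above can be checked with a few lines of arithmetic. This is only a sketch of the numbers as quoted in the talk: the 30 billion quadrillion figure is treated here as a total operation count, the "petaflop GPU" as a sustained rate of 10^15 operations per second, and the function and variable names are illustrative choices, not anything from the keynote.

```python
# Rough check of the "30 billion seconds is about 1,000 years" estimate.
# Assumption: the quoted figure is a TOTAL number of floating-point
# operations, and the hypothetical GPU sustains exactly 1 petaFLOP/s.

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # about 31.6 million seconds


def training_time_seconds(total_flop: float, flop_per_second: float) -> float:
    """Time to finish a job of total_flop operations at a fixed rate."""
    return total_flop / flop_per_second


total_flop = 30e9 * 1e15   # 30 billion quadrillion operations
petaflop_gpu = 1e15        # one petaFLOP per second

seconds = training_time_seconds(total_flop, petaflop_gpu)
years = seconds / SECONDS_PER_YEAR

print(f"{seconds:.1e} seconds, roughly {years:.0f} years on one such GPU")
```

The result lands a little under 1,000 years, which matches the rounding used on stage. Dividing the same total by a larger aggregate rate (more GPUs, or faster ones) is the whole argument for building bigger systems.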
So what we need are bigger GPUs. We need much, much bigger GPUs. We recognized this early on, and we realized that the answer is to put a whole bunch of GPUs together and, of course, innovate a whole bunch of things along the way, like inventing Tensor Cores and advancing NVLink so that we could create essentially virtually giant GPUs, and connecting them all together with amazing networks from a company called Mellanox, InfiniBand, so that we could create these giant systems. And so DGX-1 was our first version, but it wasn't the last. We built supercomputers all along the way. In 2021 we had Selene, 4,500 GPUs or so, and then in 2023 we built one of the largest AI supercomputers in the world. It's just come online: Eos. And as we're building these things, we're trying to help the world build these things, and in order to help the world build these things, we've got to build them first. We build the chips, the systems, the networking, all of the software necessary to do this. You should see these systems. Imagine writing a piece of software that runs across the entire system, distributing the computation across thousands of GPUs, but inside are thousands of smaller GPUs, millions of GPUs, to distribute work across all of that and to balance the workload so that you can get the most energy efficiency, the best computation time, and keep your cost down. And so those fundamental innovations are what got us here.

And here we are. As we see the miracle of ChatGPT emerge in front of us, we also realize we have a long way to go. We need even larger models. We're going to train them with multimodal data, not just text on the internet; we're going to train them on text and images and graphs and charts, and, just as we learn watching TV, there's going to be a whole bunch of watching video, so that these models can be grounded in physics and understand that an arm doesn't go through a wall. And so these models would have common sense by watching a lot of the
world's video combined with a lot of the world's languages. It'll use things like synthetic data generation, just as you and I do when we try to learn. We might use our imagination to simulate how it's going to end up, just as I did when I was preparing for this keynote; I was simulating it all along the way. I hope it's going to turn out as well as I had it in my head. As I was simulating how this keynote was going to turn out, somebody did say that another performer did her performance completely on a treadmill so that she could be in shape to deliver it with full energy. I didn't do that. If I get a little winded at about 10 minutes into this, you know what happened.

And so, where were we? We're sitting here using synthetic data generation. We're going to use reinforcement learning. We're going to practice it in our mind. We're going to have AI working with AI, training each other, just like student, teacher, debaters. All of that is going to increase the size of our model. It's going to increase the amount of data that we have, and we're going to have to build even bigger GPUs. Hopper is fantastic, but we need bigger GPUs. And so, ladies and gentlemen, I would like to introduce you to a very, very big GPU, [Applause] named after David Blackwell: mathematician, game theorist, probability. We thought it was a perfect name. Blackwell, ladies and gentlemen. Enjoy this. [Applause]

Blackwell is not a chip; Blackwell is the name of a platform. People think we make GPUs, and we do, but GPUs don't look the way they used to. Here's the, if you will, the heart of the Blackwell system. And this, inside the company, is not called Blackwell; it's just a number. And this is Blackwell, sitting next to... oh, this is the most advanced GPU in the world in production today. This is Hopper. Hopper changed the world. This is Blackwell. It's okay, Hopper, you're very good. Good boy. Well, good girl. 208 billion transistors. And so you
could see, I can see, that there's a small line between two dies. This is the first time two dies have abutted like this together in such a way that the two dies think it's one chip. There's 10 terabytes of data between them, 10 terabytes per second, so that these two sides of the Blackwell chip have no clue which side they're on. There are no memory locality issues, no cache issues. It's just one giant chip. And so when we were told that Blackwell's ambitions were beyond the limits of physics, the engineers said, "So what?" And so this is what happened.

And so this is the Blackwell chip, and it goes into two types of systems. The first one is form-fit-function compatible with Hopper, and so you slide out Hopper and you push in Blackwell. That's the reason why one of the challenges of ramping is going to be so efficient: there are installations of Hoppers all over the world, and they could be, you know, the same infrastructure, same design; the power, the electricity, the thermals, the software, identical. Push it right back. And so this is a Hopper version for the current HGX configuration, and this is what the other, the second Hopper, looks like. Now, this is a prototype board, and... Janine, could I just borrow... Ladies and gentlemen, Janine Paul. And so this is a fully functioning board, and I'll just be careful here. This right here is, I don't know, $10 billion. The second one's five. It gets cheaper after that, so any customers in the audience, it's okay. All right, but this one's quite expensive. This is the bring-up board, and the way it's going to go to production is like this one here. Okay, and so you're going to take this. It has two Blackwell chips and four Blackwell dies connected to a Grace CPU. The Grace CPU has a super-fast chip-to-chip link. What's amazing is this computer is the first of its kind where this much computation, first of all, fits into this small of a place. Second, it's memory coherent. They feel
like they're just one big happy family working on one application together, and so everything is coherent within it. Just the amount of, you know, you saw the numbers: there's a lot of terabytes this and terabytes that. But this is a miracle. Let's see, what are some of the things on here? There's NVLink on top, PCI Express on the bottom, on your... which one is mine and your left? One of them, it doesn't matter. One of them is a CPU chip-to-chip link. It's my left or your... depending on which side. I was trying to sort that out, and it just kind of doesn't matter. Hopefully it comes plugged in. Okay, so this is the Grace Blackwell system.

But there's more. It turns out all of the specs are fantastic, but we need a whole lot of new features in order to push the limits beyond, if you will, the limits of physics. We would like to always get a lot more X factors. And so one of the things that we did was we invented another Transformer Engine, the second generation. It has the ability to dynamically and automatically rescale and recast numerical formats to a lower precision whenever it can. Remember, artificial intelligence is about probability, and so you kind of have, you know, approximately 1.7 times approximately 1.4 to be approximately something else. Does that make sense? And so the ability for the mathematics to retain the precision and the range necessary in that particular stage of the pipeline is super important. It's not just about the fact that we designed a smaller ALU; the world's not quite that simple. You've got to figure out when you can use that across a computation that is thousands of GPUs, that's running for weeks and weeks, and you want to make sure that the training job is going to converge. And so, this new Transformer Engine. We have a fifth-generation NVLink. It's now twice as fast as Hopper, but very
importantly, it has computation in the network. The reason for that is because when you have so many different GPUs working together, we have to share our information with each other; we have to synchronize and update each other. Every so often we have to reduce the partial products and then rebroadcast the sum of the partial products back to everybody else. And so there's a lot of what is called all-reduce and all-to-all and all-gather. It's all part of this area of synchronization and collectives, so that we can have GPUs working with each other. Having extraordinarily fast links and being able to do mathematics right in the network allows us to essentially amplify even further. So even though it's 1.8 terabytes per second, it's effectively higher than that, and so it's many times that of Hopper.

The likelihood of a supercomputer running for weeks on end is approximately zero, and the reason for that is because there are so many components working at the same time that the probability of them all working continuously is very low. And so we need to make sure that whenever there is a... well, we checkpoint and restart as often as we can, but if we have the ability to detect a weak chip or a weak node early, we could retire it and maybe swap in another processor. That ability to keep the utilization of the supercomputer high, especially when you just spent $2 billion building it, is super important. And so we put in a RAS engine, a reliability engine, that does a 100% self-test, an in-system test, of every single gate and every single bit of memory on the Blackwell chip and all the memory that's connected to it. It's almost as if we shipped with every single chip its own advanced tester that we test our chips with. This is the first time we're doing this. Super excited about it.

Secure AI. Only at this conference do they clap for RAS. Secure AI: obviously you've just spent hundreds of millions of dollars creating a very important AI, and the code, the
intelligence of that AI, is encoded in the parameters. You want to make sure that on the one hand you don't lose it, and on the other hand it doesn't get contaminated. And so we now have the ability to encrypt data, of course at rest, but also in transit and while it's being computed. It's all encrypted. And so we now have the ability to encrypt in transmission, and when we're computing it, it is in a trusted environment, a trusted engine environment.

And the last thing is decompression. Moving data in and out of these nodes when the compute is so fast becomes really essential, and so we've put in a high-line-speed compression engine that effectively moves data 20 times faster in and out of these computers. These computers are so powerful, and there's such a large investment, that the last thing we want to do is have them be idle. And so all of these capabilities are intended to keep Blackwell fed and as busy as possible.

Overall, compared to Hopper, it is two and a half times the FP8 performance for training per chip. It also has this new format called FP6, so that even though the computation speed is the same, the bandwidth is amplified because of the memory; the amount of parameters you can store in the memory is now amplified. FP4 effectively doubles the throughput. This is vitally important for inference. One of the things that is becoming very clear is that whenever you use a computer with AI on the other side, when you're chatting with the chatbot, when you're asking it to review or make an image, remember, in the back is a GPU generating tokens. Some people call it inference, but it's more appropriately generation. The way that computing was done in the past was retrieval. You would grab your phone, you would touch something, some signals go off, basically an email goes off to some storage somewhere. There's pre-recorded content: somebody wrote a story, or somebody made an image, or somebody recorded a video. That pre-recorded content
is then streamed back to the phone and recomposed, in a way based on a recommender system, to present the information to you. You know that in the future the vast majority of that content will not be retrieved, and the reason for that is because it was pre-recorded by somebody who doesn't understand the context, which is the reason why we have to retrieve so much content. If you can be working with an AI that understands the context, who you are, for what reason you're fetching this information, and produces the information for you just the way you like it, the amount of energy we save, the amount of networking bandwidth we save, the amount of wasted time we save will be tremendous. The future is generative, which is the reason why we call it generative AI, which is the reason why this is a brand new industry. The way we compute is fundamentally different. We created a processor for the generative AI era, and one of the most important parts of it is content token generation. This format is FP4.

Well, that's a lot of computation. 5x the token generation, 5x the inference capability of Hopper, seems like enough. But why stop there? The answer is it's not enough, and I'm going to show you why. And so we would like to have a bigger GPU, even bigger than this one, and so we decided to scale it. But first let me just tell you how we've scaled. Over the course of the last eight years we've increased computation by 1,000 times. Eight years, 1,000 times. Remember back in the good old days of Moore's Law: it was 2x, well, 5x every... what, 10x every 5 years. That's the easiest math: 10x every 5 years, 100 times every 10 years. 100 times every 10 years in the middle, in the heyday, of the PC revolution. In the last 8 years we've gone 1,000 times, and we have two more years to go. And so that puts it in perspective. The rate at which we're advancing computing is insane, and it's still not fast enough. So we built another chip. This chip is
just an incredible chip. We call it the NVLink Switch. It's 50 billion transistors; it's almost the size of Hopper all by itself. This switch chip has four NVLinks in it, each 1.8 terabytes per second, and it has computation in it, as I mentioned. What is this chip for? If we were to build such a chip, we could have every single GPU talk to every other GPU at full speed at the same time. That's insane. It doesn't even make sense. But if you could do that, if you could find a way to do that and build a system to do that that's cost-effective, how incredible would it be that we could have all these GPUs connect over a coherent link so that they effectively are one giant GPU? Well, one of the great inventions, in order to make it cost-effective, is that this chip has to drive copper directly. The SerDes of this chip is just a phenomenal invention, so that we could do direct drive to copper, and as a result you can build a system that looks like this.

Now, this system is kind of insane. This is one DGX. This is what a DGX looks like now. Remember, just six years ago, it was pretty heavy, but I was able to lift it. I delivered the first DGX-1 to OpenAI and the researchers there. The pictures are on the internet, and we all autographed it. If you come to my office, it's autographed there; it's really beautiful. But you could lift it. That DGX, by the way, was 170 teraflops. If you're not familiar with the numbering system, that's 0.17 petaflops. So this is 720. The first one I delivered to OpenAI was 0.17. You could round it up to 0.2; it won't make any difference. And back then it was like, wow, you know, 30 more teraflops. And so this is now 720 petaflops, almost an exaflop, for training, and the world's first one-exaflops machine in one rack. Just so you know, there are only a couple, two, three exaflops machines on the planet as we speak. And so this is an exaflops AI system in one single
rack. Well, let's take a look at the back of it. So this is what makes it possible. That's the back: the DGX NVLink spine. 130 terabytes per second goes through the back of that chassis. That is more than the aggregate bandwidth of the internet, so we could basically send everything to everybody within a second. And so we have 5,000 cables, 5,000 NVLink cables, in total 2 miles. Now this is the amazing thing: if we had to use optics, we would have had to use transceivers and retimers, and those transceivers and retimers alone would have cost 20,000 watts, 20 kilowatts, of just transceivers alone, just to drive the NVLink spine. As a result, we did it completely for free over the NVLink Switch, and we were able to save the 20 kilowatts for computation. This entire rack is 120 kilowatts, so that 20 kilowatts makes a huge difference. It's liquid cooled. What goes in is 25°C, about room temperature. What comes out is 45°C, about your jacuzzi. So room temperature goes in, jacuzzi comes out, 2 liters per second. We could sell a peripheral.

600,000 parts. Somebody used to say, "You know, you guys make GPUs," and we do, but this is what a GPU looks like to me. When somebody says GPU, I see this. Two years ago, when I saw a GPU, it was the HGX: it was 70 lb, 35,000 parts. Our GPUs now are 600,000 parts and 3,000 lb. 3,000 lb, that's kind of like the weight of a, you know, carbon fiber Ferrari. I don't know if that's a useful metric, but everybody's going, "I feel it, I feel it, I get it. Now that you mention that, I feel it." I don't know what's 3,000 lb. Okay, so 3,000 lb: a ton and a half. So it's not quite an elephant. So this is what a DGX looks like.

Now let's see what it looks like in operation. Okay, let's imagine: how do we put this to work, and what does that mean? Well, if you were to train a GPT model, a 1.8 trillion parameter model, it took, apparently, about 3 to 5 months or so with 25,000 Amperes. If we were to do it with Hopper, it would probably take
something like 8,000 GPUs, and it would consume 15 megawatts. 8,000 GPUs on 15 megawatts; it would take 90 days, about 3 months, and that would allow you to train something that is, you know, this groundbreaking AI model. And this is obviously not as expensive as anybody would think, but it's 8,000 GPUs; it's still a lot of money. And so, 8,000 GPUs, 15 megawatts. If you were to use Blackwell to do this, it would only take 2,000 GPUs. 2,000 GPUs, same 90 days, but this is the amazing part: only 4 megawatts of power. So from 15... yeah, that's right. And that's our goal. Our goal is to continuously drive down the cost and the energy (they're directly proportional to each other), the cost and energy associated with the computing, so that we can continue to expand and scale up the computation that we have to do to train the next-generation models.

Well, this is training. Inference, or generation, is vitally important going forward. You know, probably some half of the time that Nvidia GPUs are in the cloud these days, they're being used for token generation. They're either doing Copilot this or ChatGPT that, or all these different models that are being used when you're interacting with them, or generating images or generating videos, generating proteins, generating chemicals. There's a bunch of generation going on. All of that is in the category of computing we call inference. But inference is extremely hard for large language models, because these large language models have several properties. One, they're very large, and so they don't fit on one GPU. Imagine Excel doesn't fit on one GPU. Imagine some application you're running on a daily basis doesn't fit on one computer; like, a video game doesn't fit on one computer. Most, in fact, do, and many times in the past, in hyperscale computing, many applications for many people fit on the same computer. And now all of a sudden this one inference application, where
you're interacting with this chatbot, that chatbot requires a supercomputer in the back to run it. And that's the future. The future is generative with these chatbots, and these chatbots are trillions of tokens, trillions of parameters, and they have to generate tokens at interactive rates. Now what does that mean? Well, three tokens is about a word. You know, "Space, the final frontier. These are the adventures..." That's like 80 tokens. Okay, I don't know if that's useful to you. And so, you know, the art of communications is selecting good analogies. Yeah, this is not going well. "I don't know what he's talking about. Never seen Star Trek."

And so here we are. We're trying to generate these tokens. When you're interacting with it, you're hoping that the tokens come back to you as quickly as possible, and as quickly as you can read them. And so the ability to generate tokens is really important. You have to parallelize the work of this model across many, many GPUs so that you could achieve several things. On the one hand, you would like throughput, because that throughput reduces the cost, the overall cost per token, of generating. So your throughput dictates the cost of delivering the service. On the other hand, you have another, interactive rate, which is another tokens per second, where it's about per user, and that has everything to do with quality of service. And so these two things compete against each other, and we have to find a way to distribute work across all of these different GPUs and parallelize it in a way that allows us to achieve both. And it turns out the search space is enormous. You know, I told you there's going to be math involved, and everybody's going, "Oh dear." I heard some gasps just now when I put up that slide. So this right here: the y-axis is tokens per second, data center throughput; the x-axis is tokens per second, interactivity of the person. And notice the upper right is the
best. You want interactivity to be very high, the number of tokens per second per user. You want the tokens per second per data center to be very high. The upper right is terrific. However, it's very hard to do that. And in order for us to search for the best answer across every single one of those intersections, XY coordinates (okay, so you just look at every single XY coordinate), all those blue dots came from some repartitioning of the software. Some optimizing solution has to go and figure out whether to use tensor parallel, expert parallel, pipeline parallel, or data parallel, and distribute this enormous model across all these different GPUs and sustain the performance that you need. This exploration space would be impossible if not for the programmability of NVIDIA's GPUs. And so, because of CUDA, because we have such a rich ecosystem, we could explore this universe and find that green roofline. It turns out that on that green roofline, notice you've got TP2 EP8 DP4. It means tensor parallel across two GPUs, expert parallel across eight, data parallel across four. Notice on the other end you've got tensor parallel across four and expert parallel across 16. The configuration, the distribution of that software, it's a different runtime that would produce these different results, and you have to go discover that roofline. Well, that's just one model, and this is just one configuration of a computer. Imagine all of the models being created around the world and all the different configurations of systems that are going to be available. So now that you understand the basics, let's take a look at inference of Blackwell compared to Hopper. And this is the extraordinary thing: in one generation, because we created a system that's designed for trillion-parameter generative AI, the inference capability of Blackwell is off the charts. In fact, it is some 30 times Hopper. Yeah. For large language models, for large language
models like ChatGPT and others like it. The blue line is Hopper. Imagine we didn't change the architecture of Hopper; we just made it a bigger chip. We just used the latest, you know, greatest 10 terabytes per second, we connected the two chips together, we got this giant 208 billion parameter chip. How would we have performed if nothing else changed? And it turns out quite wonderfully, quite wonderfully, and that's the purple line, but not as great as it could be. And that's where the FP4 Tensor Core, the new Transformer Engine, and, very importantly, the NVLink Switch come in. And the reason for that is because all these GPUs have to share the results, the partial products. Whenever they do all-to-all, all-gather, whenever they communicate with each other, that NVLink Switch is communicating almost 10 times faster than what we could do in the past using the fastest networks. Okay, so Blackwell is going to be just an amazing system for generative AI. And in the future, data centers are going to be thought of, as I mentioned earlier, as an AI factory. An AI factory's goal in life is to generate revenues, to generate, in this case, intelligence in this facility: not generating electricity, as with the AC generators of the last Industrial Revolution, but, in this Industrial Revolution, the generation of intelligence. And so this ability is super, super important. The excitement for Blackwell is really off the charts. You know, a year and a half ago, two years ago, I guess, when we first started to go to market with Hopper, we had the benefit of two CSPs joining us in a launch, and we were delighted. And so we had two customers. We have more now. Unbelievable excitement for Blackwell, unbelievable excitement. And there's a whole bunch of different configurations. Of course, I showed you the configurations that slide into the Hopper form factor, so that's easy to upgrade. I
showed you examples that are liquid cooled, that are the extreme versions of it: one entire rack that's connected by NVLink 72. Blackwell is going to be ramping to the world's AI companies, of which there are so many now doing amazing work in different modalities. The CSPs, every CSP is geared up. All the OEMs and ODMs, regional clouds, sovereign AIs, and telcos all over the world are signing up to launch with Blackwell. Blackwell would be the most successful product launch in our history, and so I can't wait to see that. I want to thank some partners that are joining us in this. AWS is gearing up for Blackwell. They're going to build the first GPU with secure AI. They're building out a 222 exaflops system. You know, just now, when we animated the digital twin, if you saw all of those clusters coming down: by the way, that is not just art, that is a digital twin of what we're building. That's how big it's going to be. Besides infrastructure, we're doing a lot of things together with AWS. We're CUDA-accelerating SageMaker AI. We're CUDA-accelerating Bedrock AI. Amazon Robotics is working with us using NVIDIA Omniverse and Isaac Sim. AWS Health has NVIDIA Health integrated into it. So AWS has really leaned into accelerated computing. Google is gearing up for Blackwell. GCP already has A100s, H100s, T4s, L4s, a whole fleet of NVIDIA CUDA GPUs, and they recently announced the Gemma model that runs across all of it. We're working to optimize and accelerate every aspect of GCP. We're accelerating Dataproc, their data processing engine, JAX, XLA, Vertex AI, and MuJoCo for robotics. So we're working with Google and GCP across a whole bunch of initiatives. Oracle is gearing up for Blackwell. Oracle is a great partner of ours for NVIDIA DGX Cloud, and we're also working together to accelerate something that's really important to a lot of
companies: Oracle Database. Microsoft is accelerating, and Microsoft is gearing up for Blackwell. Microsoft and NVIDIA have a wide-ranging partnership. We're CUDA-accelerating all kinds of services. When you chat, obviously, and use AI services that are in Microsoft Azure, it's very, very likely NVIDIA is in the back doing the inference and the token generation. They built the largest NVIDIA InfiniBand supercomputer, basically a digital twin of ours, or a physical twin of ours. We're bringing the NVIDIA ecosystem to Azure: NVIDIA DGX Cloud to Azure, NVIDIA Omniverse is now hosted in Azure, NVIDIA Healthcare is in Azure, and all of it is deeply integrated and deeply connected with Microsoft Fabric. The whole industry is gearing up for Blackwell. This is what I'm about to show you. Most of the scenes that you've seen so far of Blackwell are the full-fidelity design of Blackwell. Everything in our company has a digital twin, and in fact this digital twin idea is really spreading, and it helps companies build very complicated things perfectly the first time. And what could be more exciting than creating a digital twin to build a computer that was built in a digital twin? And so let me show you what Wistron is doing. To meet the demand for NVIDIA accelerated computing, Wistron, one of our leading manufacturing partners, is building digital twins of NVIDIA DGX and HGX factories using custom software developed with Omniverse SDKs and APIs. For their newest factory, Wistron started with a digital twin to virtually integrate their multi-CAD and process simulation data into a unified view. Testing and optimizing layouts in this physically accurate digital environment increased worker efficiency by 51%. During construction, the Omniverse digital twin was used to verify that the physical build matched the digital plans. Identifying any discrepancies early has helped avoid costly change orders, and the results have been
impressive. Using a digital twin helped bring Wistron's factory online in half the time: just two and a half months instead of five. In operation, the Omniverse digital twin helps Wistron rapidly test new layouts to accommodate new processes or improve operations in the existing space, and monitor real-time operations using live IoT data from every machine on the production line, which ultimately enabled Wistron to reduce end-to-end cycle times by 50% and defect rates by 40%. With NVIDIA AI and Omniverse, NVIDIA's global ecosystem of partners is building a new era of accelerated, AI-enabled digitalization. [Music] That's the way it's going to be in the future. We're going to manufacture everything digitally first, and then we'll manufacture it physically. People ask me, how did it start? What got you guys so excited? What was it that you saw that caused you to put it all in on this incredible idea? And it's this. Hang on a second. Guys, that was going to be such a moment. That's what happens when you don't rehearse. This, as you know, was first contact: 2012, AlexNet. You put a cat into this computer, and it comes out and it says "cat". And we said, oh my God, this is going to change everything. You take one million numbers across three channels, RGB. These numbers make no sense to anybody. You put it into this software, and it compresses it, dimensionally reduces it. It reduces it from a million dimensions and turns it into three letters, one vector, one number. And it's generalized. You could have the cat be different cats, and you could have it be the front of the cat and the back of the cat. And you look at this thing and you say, unbelievable. You mean any cat? Yeah, any cat. And it was able to recognize all these cats. And we realized how it did it: systematically, structurally. It's scalable. How big can you make it? Well, how big do you want to make it? And so we imagined that this is a completely new way of writing software. And now, today, as you know, you could have
you type in the word c-a-t, and what comes out is a cat. It went the other way. Am I right? Unbelievable. How is it possible? That's right, how is it possible that you took three letters and you generated a million pixels from them, and it made sense? Well, that's the miracle. And here we are, just literally 10 years later, where we recognize text, we recognize images, we recognize videos and sounds. Not only do we recognize them, we understand their meaning. We understand the meaning of the text; that's the reason why it can chat with you. It can summarize for you. It understands the text: it doesn't just recognize the English, it understood the English. It doesn't just recognize the pixels, it understood the pixels. And you can even condition it between two modalities: you can have language condition an image and generate all kinds of interesting things. Well, if you can understand these things, what else can you understand that you've digitized? The reason why we started with text and images is because we digitized those. But what else have we digitized? Well, it turns out we've digitized a lot of things: proteins and genes and brain waves. Anything you can digitize, so long as there's structure, we can probably learn some patterns from it. And if we can learn the patterns from it, we can understand its meaning. If we can understand its meaning, we might be able to generate it as well. And so, therefore, the generative AI revolution is here. Well, what else can we generate? What else can we learn? One of the things that we would love to learn is climate. We would love to learn extreme weather. We would love to learn how we can predict future weather at regional scales, at sufficiently high resolution, such that we can keep people out of harm's way before harm comes. Extreme weather costs the world $150 billion, surely more than that, and it's not evenly distributed. $150 billion is concentrated in some parts of the
world, and of course to some people of the world. We need to adapt, and we need to know what's coming. And so we are creating Earth-2, a digital twin of the Earth for predicting weather. And we've made an extraordinary invention called CorrDiff: the ability to use generative AI to predict weather at extremely high resolution. Let's take a look. As the Earth's climate changes, AI-powered weather forecasting is allowing us to more accurately predict and track severe storms like Super Typhoon Chanthu, which caused widespread damage in Taiwan and the surrounding region in 2021. Current AI forecast models can accurately predict the track of storms, but they are limited to 25 km resolution, which can miss important details. NVIDIA's CorrDiff is a revolutionary new generative AI model trained on high-resolution, radar-assimilated WRF weather forecasts and ERA5 reanalysis data. Using CorrDiff, extreme events like Chanthu can be super-resolved from 25 km to 2 km resolution with 1,000 times the speed and 3,000 times the energy efficiency of conventional weather models. By combining the speed and accuracy of NVIDIA's weather forecasting model FourCastNet and generative AI models like CorrDiff, we can explore hundreds or even thousands of kilometer-scale regional weather forecasts to provide a clear picture of the best, worst, and most likely impacts of a storm. This wealth of information can help minimize loss of life and property damage. Today CorrDiff is optimized for Taiwan, but soon generative super-sampling will be available as part of the NVIDIA Earth-2 inference service for many regions across the globe. [Music] The Weather Company is the trusted source of global weather predictions. We are working together to accelerate their weather simulation, first-principles-based simulation. However, they're also going to integrate Earth-2 CorrDiff so that they can help businesses and countries do regional high-resolution weather prediction. And so if you have some weather prediction you'd like to do,
reach out to The Weather Company. Really exciting, really exciting work. NVIDIA Healthcare: something we started 15 years ago. We're super, super excited about this. This is an area where we're very, very proud. Whether it's medical imaging or gene sequencing or computational chemistry, it is very likely that NVIDIA is the computation behind it. We've done so much work in this area. Today we're announcing that we're going to do something really, really cool. Imagine all of these AI models that are being used to generate images and audio, but instead of images and audio (because it understood images and audio), all the digitization that we've done for genes and proteins and amino acids: that digitization capability is now passed through machine learning so that we understand the language of life. The ability to understand the language of life: of course, we saw the first evidence of it with AlphaFold. This is really quite an extraordinary thing. After decades of painstaking work, the world had only digitized and reconstructed, using cryo-electron microscopy or X-ray crystallography, these different techniques, painstakingly reconstructed 200,000 proteins. In just, what is it, less than a year or so, AlphaFold has reconstructed 200 million proteins, basically every protein of every living thing that's ever been sequenced. This is completely revolutionary. Well, those models are incredibly hard to use, incredibly hard for people to build. And so what we're going to do is we're going to build them. We're going to build them for the researchers around the world. And it won't be the only one; there'll be many other models that we create. And so let me show you what we're going to do with it. Virtual screening for new medicines is a computationally intractable problem. Existing techniques can only scan billions of compounds and require days on thousands of standard compute nodes to identify new drug candidates. NVIDIA BioNeMo NIMs enable a new generative
screening paradigm. Using NIMs for protein structure prediction with AlphaFold, molecule generation with MolMIM, and docking with DiffDock, we can now generate and screen candidate molecules in a matter of minutes. MolMIM can connect to custom applications to steer the generative process, iteratively optimizing for desired properties. These applications can be defined with BioNeMo microservices or built from scratch. Here, a physics-based simulation optimizes for a molecule's ability to bind to a target protein while optimizing for other favorable molecular properties in parallel. MolMIM generates high-quality, drug-like molecules that bind to the target and are synthesizable, translating to a higher probability of developing successful medicines faster. BioNeMo is enabling a new paradigm in drug discovery, with NIMs providing on-demand microservices that can be combined to build powerful drug discovery workflows like de novo protein design or guided molecule generation for virtual screening. BioNeMo NIMs are helping researchers and developers reinvent computational drug design. NVIDIA MolMIM, CorrDiff: there's a whole bunch of other models, computer vision models, robotics models, and even, of course, some really, really terrific open-source language models. These models are groundbreaking; however, it's hard for companies to use them. How would you use it? How would you bring it into your company and integrate it into your workflow? How would you package it up and run it? Remember, earlier I just said that inference is an extraordinary computation problem. How would you do the optimization for each and every one of these models and put together the computing stack necessary to run that supercomputer so that you can run the models in your company? And so we have a great idea. We're going to invent a new way for you to receive and operate software. This software comes basically in a digital box. We call it a container, and we call it the NVIDIA Inference Microservice,
a NIM. And let me explain to you what it is. A NIM: it's a pre-trained model, so it's pretty clever, and it is packaged and optimized to run across NVIDIA's install base, which is very, very large. What's inside it is incredible. You have all these pre-trained, state-of-the-art models. They could be open source, they could be from one of our partners, they could be created by us, like NVIDIA MolMIM. It is packaged up with all of its dependencies: so CUDA, the right version; cuDNN, the right version; TensorRT-LLM, distributing across the multiple GPUs; Triton Inference Server; all completely packaged together. It's optimized depending on whether you have a single GPU, multi-GPU, or multi-node of GPUs; it's optimized for that. And it's connected up with APIs that are simple to use. Now, think about what an AI API is. An AI API is an interface that you just talk to. And so this is a piece of software, in the future, that has a really simple API, and that API is called human. And these packages, incredible bodies of software, will be optimized and packaged, and we'll put them on a website, and you can download them. You could take it with you, you could run it in any cloud, you could run it in your own data center, you can run it in workstations if it fits. And all you have to do is come to ai.nvidia.com. We call it NVIDIA Inference Microservice, but inside the company we all call it NIMs. Okay, just imagine: someday there's going to be one of these chatbots, and this chatbot is going to just be in a NIM, and you'll assemble a whole bunch of chatbots. And that's the way software is going to be built someday. How do we build software in the future? It is unlikely that you'll write it from scratch or write a whole bunch of Python code or anything like that. It is very likely that you assemble a team of AIs. There's probably going to be a super AI that you use that takes the mission that you give it and breaks it down into an execution plan. Some of that execution plan could be handed off to another NIM. That NIM would maybe understand SAP (the language of SAP is ABAP). It might understand ServiceNow, and it would go retrieve some information from their platforms. It might then hand that result to another NIM, which goes off and does some calculation on it. Maybe it's optimization software, a combinatorial optimization algorithm. Maybe it's just some basic calculator. Maybe it's pandas to do some numerical analysis on it. And then it comes back with its answer, and it gets combined with everybody else's. And because it's been presented with "this is what the right answer should look like," it knows what right answers to produce, and it presents it to you. We can get a report every single day, at the top of the hour, that has something to do with a build plan, or some forecast, or some customer alert, or some bugs database, or whatever it happens to be. And we could assemble it using all these NIMs. And because these NIMs have been packaged up and are ready to work on your systems, so long as you have NVIDIA GPUs in your data center or in the cloud, these NIMs will work together as a team and do amazing things. And so we decided this is such a great idea, we're going to go do that. And so NVIDIA has NIMs
running all over the company. We have chatbots being created all over the place, and one of the most important chatbots, of course, is a chip designer chatbot. You might not be surprised: we care a lot about building chips, and so we want to build chatbots, AI copilots, that are co-designers with our engineers. And so this is the way we did it. We got ourselves a Llama, Llama 2; this is a 70B, and it's, you know, packaged up in a NIM. And we asked it, you know, what is a CTL? Well, it turns out CTL is an internal program, and it has an internal proprietary language, but it thought the CTL was combinatorial timing logic, and so it describes, you know, conventional knowledge of CTL. But that's not very useful to us. And so we gave it a whole bunch of new examples. You know, this is no different than onboarding an employee. We say, thanks for that answer, it's completely wrong, and then we present to them: this is what a CTL is. Okay, and so this is what a CTL is at NVIDIA. And the CTL, as you can see, stands for Compute Trace Library, which makes sense; you know, we are tracing compute cycles all the time. And it wrote the program. Isn't that amazing? And so the productivity of our chip designers can go up. This is what you can do with a NIM. The first thing you can do with it is customize it. We have a service called NeMo microservice that helps you curate the data, preparing the data so that you can teach this, onboard this AI. You fine-tune it, and then you guardrail it. You can even evaluate the answers, evaluate its performance against other examples. And so that's called the NeMo microservice. Now, the thing that's emerging here is this: there are three elements, three pillars of what we're doing. The first pillar is, of course, inventing the technology for AI models, and running AI models, and packaging it up for you. The second is to create tools to help you modify it. First is having the AI technology, second is to help you modify it, and third
is infrastructure for you to fine-tune it and, if you like, deploy it. You could deploy it on our infrastructure called DGX Cloud, or you can deploy it on-prem; you can deploy it anywhere you like. Once you develop it, it's yours to take anywhere. And so we are effectively an AI foundry. We will do for you and the industry on AI what TSMC does for us building chips. We go to TSMC with our big ideas, they manufacture them, and we take them with us. And so, exactly the same thing here: AI Foundry, and the three pillars are the NIMs, NeMo microservice, and DGX Cloud. The other thing that you could teach the NIM to do is to understand your proprietary information. Remember, inside our company, the vast majority of our data is not in the cloud; it's inside our company. It's been sitting there, you know, being used all the time, and gosh, it's basically NVIDIA's intelligence. We would like to take that data, learn its meaning, like we learned the meaning of almost anything else that we just talked about, and then reindex that knowledge into a new type of database called a vector database. And so you essentially take structured data or unstructured data, you learn its meaning, you encode its meaning, so now this becomes an AI database. And that AI database, in the future, once you create it, you can talk to it. And so let me give you an example of what you could do. Suppose you've got a whole bunch of multi-modality data, and one good example of that is PDFs. So you take all of your PDFs, all your favorite, you know, the stuff that is proprietary to you, critical to your company. You can encode it: just as we encoded the pixels of a cat and it became the word cat, we can encode all of your PDFs, and they turn into vectors that are now stored inside your vector database. It becomes the proprietary information of your company, and once you have that proprietary information, you can chat with it. It's a smart database, and so
you just chat with data, and how much more enjoyable is that? You know, for our software team, they just chat with the bugs database: you know, how many bugs were there last night? Are we making any progress? And then, after you're done talking to this bugs database, you need therapy, and so we have another chatbot for you. You can do it. Okay, so we call this NeMo Retriever, and the reason for that is because ultimately its job is to go retrieve information as quickly as possible. And you just talk to it: hey, retrieve me this information. It goes and brings it back to you: do you mean this? You go, yeah, perfect. Okay, and so we call it the NeMo Retriever. Well, the NeMo service helps you create all these things, and we have all these different NIMs. We even have NIMs of digital humans. "I'm Rachel, your AI care manager." Okay, so it's a really short clip, but there were so many videos to show you, so many other demos to show you, and so I had to cut this one short. But this is Diana. She is a digital human NIM, and you just talk to her, and she's connected in this case to Hippocratic AI's large language model for healthcare. And it's truly amazing. She is just super smart about healthcare things, you know. And so after Dwight, my VP of software engineering, talks to the chatbot for the bugs database, then you come over here and talk to Diana. And so Diana is completely animated with AI, and she's a digital human. There are so many companies that would like to build. They're sitting on gold mines. The enterprise IT industry is sitting on a gold mine. It's a gold mine because they have so much understanding of the way work is done. They have all these amazing tools that have been created over the years, and they're sitting on a lot of data. If they could take that gold mine and turn it into copilots, these copilots could help us do things. And so just about every IT franchise, IT platform in the world that has
valuable tools that people use is sitting on a gold mine for copilots, and they would like to build their own copilots and their own chatbots. And so we're announcing that NVIDIA AI Foundry is working with some of the world's great companies. SAP generates 87% of the world's global commerce. Basically, the world runs on SAP. We run on SAP. NVIDIA and SAP are building SAP Joule copilots using NVIDIA NeMo and DGX Cloud. ServiceNow: 80, 85% of the world's Fortune 500 companies run their people and customer service operations on ServiceNow, and they're using NVIDIA AI Foundry to build ServiceNow Assist virtual assistants. Cohesity backs up the world's data. They're sitting on a gold mine of data: hundreds of exabytes of data, over 10,000 companies. NVIDIA AI Foundry is working with them, helping them build their Gaia generative AI agent. Snowflake is a company that stores the world's digital warehouse in the cloud and serves over 3 billion queries a day for 10,000 enterprise customers. Snowflake is working with NVIDIA AI Foundry to build copilots with NVIDIA NeMo and NIMs. NetApp: nearly half of the files in the world are stored on-prem on NetApp. NVIDIA AI Foundry is helping them build chatbots and copilots, like those vector databases and retrievers, with NVIDIA NeMo and NIMs. And we have a great partnership with Dell. Everybody who is building these chatbots and generative AI, when you're ready to run it, you're going to need an AI factory, and nobody is better at building end-to-end systems of very large scale for the enterprise than Dell. And so anybody, any company, every company will need to build AI factories. And it turns out that Michael is here. He's happy to take your order. Ladies and gentlemen, Michael Dell. Okay, let's talk about the next wave of robotics, the next wave of AI: robotics, physical AI. So far, all of the AI that we've talked about is one computer. Data comes into one computer: lots of the world's, if you will, experience in digital
text form. The AI imitates us by reading a lot of the language to predict the next words. It's imitating you by studying all of the patterns and all the other previous examples. Of course, it has to understand context and so on and so forth, but once it understands the context, it's essentially imitating you. We take all of the data, we put it into a system like DGX, we compress it into a large language model. Trillions and trillions of tokens become billions of parameters, and these billions of parameters become your AI. Well, in order for us to go to the next wave of AI, where the AI understands the physical world, we're going to need three computers. The first computer is still the same computer. It's that AI computer that now is going to be watching video, and maybe it's doing synthetic data generation, and maybe there are a lot of human examples. Just as we have human examples in text form, we're going to have human examples in articulation form, and the AIs will watch us, understand what is happening, and try to adapt it for themselves into the context. And because it can generalize with these foundation models, maybe these robots can also perform in the physical world fairly generally. So I just described in very simple terms essentially what just happened in large language models, except the ChatGPT moment for robotics may be right around the corner. And so we've been building the end-to-end systems for robotics for some time. I'm super, super proud of the work. We have the AI system, DGX. We have the lower system, which is called AGX, for autonomous systems: the world's first robotics processor. When we first built this thing, people said, what are you guys building? It's an SoC; it's one chip, it's designed to be very low power, but it's designed for high-speed sensor processing and AI. And so if you want to run transformers in a car, or you want to run transformers in, you know, anything that moves, we have the perfect computer for you. It's
called the Jetson. And so the DGX on top for training the AI, the Jetson is the autonomous processor, and in the middle we need another computer. Whereas large language models have the benefit of you providing your examples and then doing reinforcement learning from human feedback, what is the reinforcement learning from human feedback of a robot? Well, it's reinforcement learning from physical feedback. That's how you align the robot. That's how the robot knows that, as it's learning these articulation capabilities and manipulation capabilities, it's going to adapt properly to the laws of physics. And so we need a simulation engine that represents the world digitally for the robot, so that the robot has a gym to go learn how to be a robot. We call that virtual world Omniverse, and the computer that runs Omniverse is called OVX. And OVX, the computer itself, is hosted in the Azure cloud. Okay, and so basically we built these three things, these three systems, and on top of them we have algorithms for every single one. Now I'm going to show you one super example of how AI and Omniverse are going to work together. The example I'm going to show you is kind of insane, but it's going to be very, very close to tomorrow. It's a robotics building. This robotics building is called a warehouse. Inside the robotics building are going to be some autonomous systems. Some of the autonomous systems are going to be called humans, and some of the autonomous systems are going to be called forklifts. And these autonomous systems are going to interact with each other, of course, autonomously, and it's going to be overseen by this warehouse to keep everybody out of harm's way. The warehouse is essentially an air traffic controller, and whenever it sees something happening, it will redirect traffic and give new waypoints to the robots and the people, and they'll know exactly what to do. This warehouse, this building, you can also talk to. Of course you could talk to it: hey, you know, SAP Center, how
are you feeling today, for example? And so you could ask the warehouse the same questions. Basically, the system I just described will have Omniverse Cloud that's hosting the virtual simulation, and AI running on DGX Cloud, and all of this is running in real time. Let's take a look.

The future of heavy industries starts as a digital twin. The AI agents helping robots, workers, and infrastructure navigate unpredictable events in complex industrial spaces will be built and evaluated first in sophisticated digital twins. This Omniverse digital twin of a 100,000-square-foot warehouse is operating as a simulation environment that integrates digital workers, AMRs running the Nvidia Isaac Perceptor stack, centralized activity maps of the entire warehouse from 100 simulated ceiling-mount cameras using Nvidia Metropolis, and AMR route planning with Nvidia cuOpt. Software-in-the-loop testing of AI agents in this physically accurate simulated environment enables us to evaluate and refine how the system adapts to real-world unpredictability. Here, an incident occurs along this AMR's planned route, blocking its path as it moves to pick up a pallet. Nvidia Metropolis updates and sends a real-time occupancy map to cuOpt, where a new optimal route is calculated. The AMR is enabled to see around corners and improve its mission efficiency. With generative AI-powered Metropolis vision foundation models, operators can even ask questions using natural language. The visual model understands nuanced activity and can offer immediate insights to improve operations. All of the sensor data is created in simulation and passed to the real-time AI running as Nvidia inference microservices, or NIMs. And when the AI is ready to be deployed in the physical twin, the real warehouse, we connect Metropolis and Isaac NIMs to real sensors, with the ability for continuous improvement of both the digital twin and the AI models.

Isn't that incredible? And so remember, a future facility, warehouse, factory, building, will be software defined,
and so the software is running. How else would you test the software? So you test the software to build the warehouse, the optimization system, in the digital twin. What about all the robots? All of those robots you are seeing just now, they're all running their own autonomous robotic stack. And so the way you integrate software in the future, CI/CD in the future for robotic systems, is with digital twins.

We've made Omniverse a lot easier to access. We're going to create basically Omniverse Cloud APIs: four simple APIs and a channel, and you can connect your application to it. So this is how wonderfully, beautifully simple Omniverse is going to be in the future, and with these APIs you're going to have these magical digital twin capabilities. We also have turned Omniverse into an AI and integrated it with the ability to chat USD. Our language is, you know, human, and Omniverse's language, as it turns out, is Universal Scene Description. And so that language is rather complex, and so we've taught our Omniverse that language. And so you can speak to it in English, and it would directly generate USD, and it would talk back in USD but converse back to you in English. You could also look for information in this world semantically. Instead of the world being encoded semantically in language, now it's encoded semantically in scenes. And so you could ask it of certain objects or certain conditions and certain scenarios, and it can go and find that scenario for you. It also can collaborate with you in generation. You could design some things in 3D, it could simulate some things in 3D, or you could use AI to generate something in 3D. Let's take a look at how this is all going to work.

We have a great partnership with Siemens. Siemens is the world's largest industrial engineering and operations platform. You've seen now so many different companies in the industrial space. Heavy industries is one of the greatest final frontiers of IT, and we finally now have
the necessary technology to go and make a real impact. Siemens is building the industrial metaverse, and today we're announcing that Siemens is connecting their crown jewel, Xcelerator, to Nvidia Omniverse. Let's take a look.

Siemens technology is transformed every day for everyone. Teamcenter X, our leading product life cycle management software from the Siemens Xcelerator platform, is used every day by our customers to develop and deliver products at scale. Now we are bringing the real and the digital worlds even closer by integrating Nvidia AI and Omniverse technologies into Teamcenter X. Omniverse APIs enable data interoperability and physics-based rendering to industrial-scale design and manufacturing projects. Our customer HD Hyundai, market leader in sustainable ship manufacturing, builds ammonia- and hydrogen-powered ships, often comprising over 7 million discrete parts. With Omniverse APIs, Teamcenter X lets companies like HD Hyundai unify and visualize these massive engineering data sets interactively, and integrate generative AI to generate 3D objects or HDRI backgrounds to see their projects in context. The result: an ultra-intuitive, photoreal, physics-based digital twin that eliminates waste and errors, delivering huge savings in cost and time. And we are building this for collaboration, whether across more Siemens Xcelerator tools like Siemens NX or STAR-CCM+, or across teams working on their favorite devices in the same scene together. And this is just the beginning. Working with Nvidia, we will bring accelerated computing, generative AI, and Omniverse integration across the Siemens Xcelerator portfolio.

The professional voice actor happens to be a good friend of mine, Roland Busch, who happens to be the CEO of Siemens. Once you get Omniverse connected into your workflow, your ecosystem, from the beginning of your design to engineering to manufacturing planning, all the way to digital twin operations, once you connect everything together, it's insane how much productivity
you can get, and it's just really, really wonderful. All of a sudden, everybody is operating on the same ground truth. You don't have to exchange data and convert data, make mistakes. Everybody is working on the same ground truth, from the design department to the art department, the architecture department, all the way to the engineering and even the marketing department. Let's take a look at how Nissan has integrated Omniverse into their workflow, and it's all because it's connected by all these wonderful tools and these developers that we're working with. Take a look.

[Music]

That was not an animation. That was Omniverse. Today we're announcing that Omniverse Cloud streams to the Vision Pro, and it is very, very strange that you walk around virtual doors when I was getting out of that car, and everybody does it. It is really, really quite amazing. Vision Pro connected to Omniverse portals you into Omniverse, and because all of these CAD tools and all these different design tools are now integrated and connected to Omniverse, you can have this type of workflow. Really incredible.

Let's talk about robotics. Everything that moves will be robotic. There's no question about that. It's safer, it's more convenient, and one of the largest industries is going to be automotive. We build the robotic stack from top to bottom, as I mentioned, from the computer system, but in the case of self-driving cars, including the self-driving application. At the end of this year, or I guess beginning of next year, we will be shipping in Mercedes, and then shortly after that, JLR. And so these autonomous robotic systems are software defined. They take a lot of work to do: has computer vision, has obviously artificial intelligence, control and planning, all kinds of very complicated technology, and takes years to refine. We're building the entire stack. However, we open up our entire stack for all of the automotive industry. This is just the way we work, the way we work in every single industry. We try to build as much of
it as we can so that we understand it, but then we open it up so everybody can access it. Whether you would like to buy just our computer, which is the world's only fully functional-safe, ASIL-D system that can run AI, this functional-safe, ASIL-D-quality computer, or the operating system on top, or of course our data centers, which is in basically every AV company in the world, however you would like to enjoy it, we're delighted by it. Today we're announcing that BYD, the world's largest EV company, is adopting our next generation. It's called Thor. Thor is designed for transformer engines. Thor, our next-generation AV computer, will be used by BYD.

You probably don't know this fact, that we have over a million robotics developers. We created Jetson, this robotics computer. We're so proud of it. The amount of software that goes on top of it is insane, but the reason why we can do it at all is because it's 100% CUDA compatible. Everything that we do in our company is in service of our developers, and by us being able to maintain this rich ecosystem and make it compatible with everything that you access from us, we can bring all of that incredible capability to this little tiny computer we call Jetson, a robotics computer.

We also today are announcing this incredibly advanced new SDK. We call it Isaac Perceptor. Most of the bots today are pre-programmed. They're either following rails on the ground, digital rails, or they'd be following AprilTags. But in the future, they're going to have perception, and the reason why you want that is so that you could easily program it. You say, would you like to go from point A to point B, and it will figure out a way to navigate its way there. So by only programming waypoints, the entire route could be adaptive. The entire environment could be reprogrammed, just as I showed you at the very beginning with the warehouse. You can't do that with pre-programmed AGVs. If those boxes fall down, they just all gum up and they just wait there for
somebody to come clear it. And so now, with the Isaac Perceptor, we have incredible state-of-the-art visual odometry, 3D reconstruction, and, in addition to 3D reconstruction, depth perception. The reason for that is so that you can have two modalities to keep an eye on what's happening in the world. Isaac Perceptor.

The most used robot today is the manipulator, manufacturing arms, and they are also pre-programmed. The computer vision algorithms, the AI algorithms, the control and path planning algorithms that are geometry aware: incredibly computationally intensive. We have made these CUDA accelerated, so we have the world's first CUDA-accelerated motion planner that is geometry aware. You put something in front of it, it comes up with a new plan and articulates around it. It has excellent perception for pose estimation of a 3D object, not just its pose in 2D but its pose in 3D, so it has to imagine what's around and how best to grab it. So the foundation pose, the grip foundation, and the articulation algorithms are now available. We call it Isaac Manipulator, and they also just run on Nvidia's computers.

We are starting to do some really great work in the next generation of robotics. The next generation of robotics will likely be humanoid robotics. We now have the necessary technology, as I was describing earlier, the necessary technology to imagine generalized humanoid robotics. In a way, humanoid robotics is likely easier, and the reason for that is because we have a lot more imitation training data that we can provide the robots, because we are constructed in a very similar way. It is very likely that humanoid robotics will be much more useful in our world, because we created the world to be something that we can interoperate in and work well in. And the way that we set up our workstations and manufacturing and logistics, they were designed for humans, they were designed for people. And so these humanoid robotics will likely be much more productive to deploy. While we're creating,
just like we're doing with the others, the entire stack. Starting from the top: a foundation model that learns from watching video, human examples. It could be in video form, it could be in virtual reality form. We then created a gym for it called Isaac Reinforcement Learning Gym, which allows the humanoid robot to learn how to adapt to the physical world. And then an incredible computer, the same computer that's going to go into a robotic car. This computer will run inside a humanoid robot, called Thor. It's designed for transformer engines. We've combined several of these into one video. This is something that you're going to really love. Take a look.

It's not enough for humans to imagine. [Music] We have to invent and explore and push beyond what's been done. Fair amount of detail. We create smarter and faster. We push it to fail so it can learn. We teach it, then help it teach itself. We broaden its understanding to take on new challenges with absolute precision, and succeed. We make it perceive and move and even reason, so it can share our world with us. [Music]

This is where inspiration leads us: the next frontier. This is Nvidia Project GR00T, a general-purpose foundation model for humanoid robot learning. The GR00T model takes multimodal instructions and past interactions as input and produces the next action for the robot to execute. We developed Isaac Lab, a robot learning application, to train GR00T on Omniverse Isaac Sim, and we scale out with OSMO, a new compute orchestration service that coordinates workflows across DGX systems for training and OVX systems for simulation. With these tools, we can train GR00T in physically based simulation and transfer zero-shot to the real world. The GR00T model will enable a robot to learn from a handful of human demonstrations, so it can help with everyday tasks, and emulate human movement just by observing us. This is made possible with Nvidia's technologies that can understand humans from videos, train models
in simulation, and ultimately deploy them directly to physical robots. Connecting GR00T to a large language model even allows it to generate motions by following natural language instructions. "Hi, Go1, can you give me a high five?" "Sure thing, let's high five." "Can you give us some cool moves?" "Sure, check this out." All this incredible intelligence is powered by the new Jetson Thor robotics chips, designed for GR00T, built for the future. With Isaac Lab, OSMO, and GR00T, we're providing the building blocks for the next generation of AI-powered robotics. [Applause] [Music]

About the same size. The soul of Nvidia, the intersection of computer graphics, physics, artificial intelligence: it all came to bear at this moment. The name of that project: General Robotics 003. I know, super good, super good. Well, I think we have some special guests. Do we? [Music]

Hey guys. So I understand you guys are powered by Jetson. They're powered by Jetson, little Jetson robotics computers inside. They learned to walk in Isaac Sim. Ladies and gentlemen, this is Orange, and this is the famous Green. They are the BDX robots of Disney. Amazing Disney research. Come on, you guys, let's wrap up. Let's go. Five things. Where are you going? I sit right here. Don't be afraid. Come here, Green, hurry up. What are you saying? No, it's not time to eat. It's not time to... I'll give you a snack in a moment. Let me finish up real quick. Come on, Green, hurry up. Stop wasting time.

Five things. Five things. First, a new industrial revolution. Every data center should be accelerated. A trillion dollars' worth of installed data centers will become modernized over the next several years. Second, because of the computational capability we brought to bear, a new way of doing software has emerged: generative AI, which is going to create new infrastructure dedicated to doing one thing and one thing only, not for multi-user data centers, but AI generators. This AI generation will create incredibly valuable software. A new industrial revolution. Second, the computer of this
revolution, the computer of this generation, generative AI, trillion parameters: Blackwell. Insane amounts of computers and computing. Third... I'm trying to concentrate. Good job. Third: new computer. New computer creates new types of software. New type of software should be distributed in a new way, so that it can, on the one hand, be an endpoint in the cloud and easy to use, but still allow you to take it with you, because it is your intelligence. Your intelligence should be packaged up in a way that allows you to take it with you. We call them NIMs. And third, these NIMs are going to help you create a new type of application for the future, not one that you wrote completely from scratch, but you're going to integrate them like teams, create these applications. We have a fantastic capability between NIMs, the AI technology, the tools, NeMo, and the infrastructure, DGX Cloud, in our AI Foundry to help you create proprietary applications, proprietary chatbots. And then lastly, everything that moves in the future will be robotic. You're not going to be the only one. And these robotic systems, whether they are humanoid, AMRs, self-driving cars, forklifts, manipulating arms, they will all need one thing. Giant stadiums, warehouses, factories: there can be factories that are robotic, orchestrating factories, manufacturing lines that are robotics, building cars that are robotics. These systems all need one thing. They need a platform, a digital platform, a digital twin platform, and we call that Omniverse, the operating system of the robotics world.

These are the five things that we talked about today. What does Nvidia look like? What does Nvidia look like when we talk about GPUs? There's a very different image that I have when people ask me about GPUs. First, I see a bunch of software stacks and things like that, and second, I see this. This is what we announced to you today. This is Blackwell, this is the platform: amazing, amazing processors, NVLink switches, networking systems, and the system design is a miracle. This is
Blackwell, and this to me is what a GPU looks like in my mind. Listen, Orange, Green, I think we have one more treat for everybody. What do you think? Should we? Okay, we have one more thing to show you. Roll it.

[Music]

Thank you. Thank you. Have a great GTC. Thank you all for coming. Thank you.
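
Editor's illustration of the warehouse demo described earlier in the keynote, where a blocked route leads to an updated occupancy map and a newly calculated route. This is only a minimal sketch under stated assumptions: it does not use Nvidia Metropolis, cuOpt, or any Nvidia API. The function names `plan_route` and `replan_if_blocked` are hypothetical, the warehouse is reduced to a small grid, and a plain breadth-first search stands in for the real route optimizer.

```python
from collections import deque


def plan_route(grid, start, goal):
    """Shortest 4-connected path on an occupancy grid via breadth-first search.

    grid[r][c] == 1 marks an occupied cell, 0 a free one. The start cell is
    assumed to be free. Returns a list of (row, col) cells from start to goal,
    or None if the goal cannot be reached. Hypothetical stand-in for a real
    route optimizer.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


def replan_if_blocked(grid, route, current, goal):
    """Keep the rest of the route if it is still free, otherwise replan.

    `grid` plays the role of the updated occupancy map; `current` must be a
    cell on `route`.
    """
    remaining = route[route.index(current):]
    if all(grid[r][c] == 0 for r, c in remaining):
        return remaining
    return plan_route(grid, current, goal)


if __name__ == "__main__":
    warehouse = [[0] * 4 for _ in range(3)]       # 3 x 4 grid, all free
    route = plan_route(warehouse, (0, 0), (0, 3))
    warehouse[0][2] = 1                           # an incident blocks the aisle
    print(replan_if_blocked(warehouse, route, (0, 1), (0, 3)))
```

The point of the sketch is the division of labor the demo describes: one component maintains the occupancy map, and the planner only needs the current cell and the goal waypoint to produce a new route.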
