Transcript for:
Jensen Huang's Vision for AI Evolution

Jensen Huang, the CEO of NVIDIA, gave a keynote in India on November 1st, and I don't think we're paying enough attention to what he actually said in this keynote. In this video, I've watched the keynote three times over, used my GPT bots to break down the transcript, and thought through the implications of what I thought were some absolute golden nuggets from Jensen's keynote. If you're at all interested in the future of AI from the perspective of one of the most powerful tech CEOs in our current day and age, you're going to want to watch this video. Hey, if we haven't met, I'm Julie McCoy. I work full-time in AI. I'm the CEO and lead AI integrator at my company, First Movers, which literally just started in late October 2024. I am building it from the ground up. I've worked in AI full-time for almost two years. And by the way, in this video, I'm not traveling. I'm here in my home office, so I actually have nice lighting and a good mic. You guys don't need to come here and rescue my lighting or my mic situation, although I appreciate the thoughts shared on my prior videos. But yes, we're back here in my home office with a decent mic in front of me. All right, so on to this topic. On November 1st, Jensen Huang stood on stage in India and gave a special address called AI Driving Digital Transformation in India and Beyond. Now, if you've watched any content on my YouTube channel, you know it is not my style to jump on a headline just because it came out. And that's why I have sat on this keynote, digested it more than three times, and picked apart what I think are some key golden nuggets you need to know that literally affect computing itself and the future of technology as humanity evolves and we become one with the robots. Just kidding. Or am I? 
That's for you to decide. All right, so first of all, at the very end of this keynote, Jensen Huang stepped away from the stage and played this video. It's a little under three minutes long, so I want to go ahead and play it for you. In this video, he drops some major hints as to the future of the technological revolution, and Jensen himself narrated it. Take a listen, and then I'll explain what I believe are the mic drop moments from what Jensen is talking about. Here it is: For 60 years, software 1.0, code written by programmers, ran on general-purpose CPUs. Then, software 2.0 arrived: machine learning neural networks running on GPUs. This led to the big bang of generative AI, models that learn and generate anything. Today, generative AI is revolutionizing $100 trillion in industries. Knowledge enterprises use agentic AI to automate digital work. Hello, I'm James, a digital human. Industrial enterprises use physical AI to automate physical work. Physical AI embodies robots like self-driving cars that safely navigate the real world, manipulators that perform complex industrial tasks, and humanoid robots that work collaboratively alongside us. Plants and factories will be embodied by physical AI, capable of monitoring and adjusting their operations, or speaking to us. NVIDIA builds three computers to enable developers to create physical AI. The models are first trained on DGX. Then, the AI is fine-tuned and tested using reinforcement learning physics feedback in Omniverse. And the trained AI runs on NVIDIA Jetson AGX robotics computers. NVIDIA Omniverse is a physics-based operating system for physical AI simulation. Robots learn and fine-tune their skills in Isaac Lab, a robot gym built on Omniverse. This is just one robot. Future factories will orchestrate teams of robots and monitor entire operations through thousands of sensors. For factory digital twins, they use an Omniverse blueprint called Mega. 
With Mega, the factory digital twin is populated with virtual robots and their AI models, the robots' brains. The robots execute a task by perceiving their environment, reasoning, planning their next motion, and finally converting it to actions. These actions are simulated in the environment by the World Simulator in Omniverse, and the results are perceived by the robot brains through Omniverse Sensor Simulation. Based on the sensor simulations, the robot brains decide the next action, and the loop continues, while Mega precisely tracks the state and position of everything in the factory digital twin. This software-in-the-loop testing brings software-defined processes to physical spaces and embodiments, letting industrial enterprises simulate and validate changes in an Omniverse digital twin before deployment to the physical world, saving massive risk and cost. The era of physical AI is here, transforming the world's heavy industries and robotics. All right, so I'm going to lay out one by one the major takeaways from Jensen's keynote. First of all, agents. We see it coming: level three in OpenAI's five levels to AGI is agentic behavior, where these LLMs spawn bots that can be deployed in swarms to take over entire organizations and get work done. But the bridge to that end point in time is still being built. So of course, agents were a huge focus in Jensen's talk. Let me actually paint you a picture of what work could look like in a world run by AI agents, which, by the way, I think could be a beautiful thing if you think of all the mundane rote tasks that we really shouldn't be doing. It isn't innate human behavior to sit at a computer for eight hours and code. It's innate human behavior to go be productive at something we enjoy, at something that brings meaning, and then step away and be with our family. So what if AI agents could curate and finish more work autonomously and then just serve us the priorities? This is the world of work that agents could bring us. 
Imagine waking up and your digital colleague is an AI agent. It's worked with all the other AI agents that other business professionals have deployed, or that work on their behalf. It's drafted all your reports, scheduled your meetings for you, prepared talking points for your next presentation. It knows the news, what's important in your industry, what you should be weaving into what you're doing or saying. You just wake up and jump into the priorities. That's it. Your AI agent gets all the mundane done for you. That is the world that we are heading into. And it's not sci-fi in some movie. That world could be here in less than two years. If you think about AI agents and customer service, it will completely revolutionize the game. And the best way companies can be prepared for that shift is to have comprehensive knowledge bases drawn up. Because when you can deploy an AI agent to go curate information from a comprehensive knowledge base that has all the answers in a rich library of information, well, that AI agent will have the expertise, the data points, and the training to get the answer right every time, at lightning speed. I mean, talk about transformation of customer service. AI agents could bring the end of busy work, stuff that takes up our time but doesn't really give us a lot of results. It's just busy work. They could test product ideas before we even launch them, test the viability of marketing campaigns, draft, deploy, iterate, produce, iterate again. The world of work will change forever. So listen to this clip where Jensen Huang talks about this idea of supercharging work and increasing productivity exponentially with AI agents: The large language models and the fundamental AI capabilities have reached a level of capability where we're able to now create what is called agents, large language models that understand the data that, of course, is being presented. It could be streaming data, video data, language model data. It could be data of all kinds. 
The first stage is perception. The second is reasoning about, given its observations, what is the mission and what is the task it has to perform. In order to perform that task, the agent would break down that task into steps of other tasks. And it would reason about what it would take, and it would connect with other AI models. Some of them are good at, for example, understanding PDFs. Maybe it's a model that understands how to generate images. Maybe it's a model that is able to retrieve information, AI information, AI semantic data, from a proprietary database. So each one of these large language models is connected to the central reasoning large language model we call the agent. And so these agents are able to perform all kinds of tasks. Some of them are maybe marketing agents, some of them are customer service agents, some of them are chip design agents. NVIDIA has chip design agents all over our company helping us design chips. Maybe they're software engineering agents. Maybe they're able to do marketing campaigns, supply chain management. And so we're going to have agents that are helping our employees become super employees. These agents, or agentic AI models, augment all of our employees to supercharge them, make them more productive. Okay, so the next big thing was what Jensen Huang said about general-purpose computing. Now, I have shared this factoid in a lot of talks I've given, even in some YouTube videos, but honestly it deserves another spotlight. For 60 years, general-purpose computing has existed. It's been the mainframe of our technology. But Jensen Huang said we've left that era and entered a new era of accelerated computing, where, he said, across multiple industries we can accelerate productivity by 20 to 30 to 50 times. Listen to this: General-purpose computing as we know it has existed for 60 years, until now. For the last 30 years, we've had the benefit of Moore's Law, an incredible phenomenon. 
Without changing the software, the hardware can continue to improve in an architecturally compatible way. And the benefit of that software doubles every year. As a result of doubling in performance every year, depending on your application, you're reducing your cost by a factor of two every single year. The most incredible depreciating force of any technology the world's ever known. By depreciation, cost reduction, it made it possible for society to use more and more of IT. As we continue to consume IT, as we continue to process more data, Moore's Law made it possible for us to continue to drive down cost, democratizing computing as we know it today. Those two events, the invention of the System/360 and Moore's Law with the Windows PC, drove what is unquestionably one of the most important industries in the world. Every single industry has subsequently been built on top of it: IT. But we know now that the scaling of CPUs has reached its limit. We can't continue to ride that curve, that free ride. The free ride of Moore's Law has ended. We have to now do something different, or depreciation will end. And we now will not enjoy depreciation, but experience inflation, computing inflation. And that's exactly what's happening around the world. We no longer can afford to do nothing in software and expect that our computing experience will continue to improve, that cost will decrease, and continue to spread the benefits of IT and to benefit from solving greater and greater challenges. We started our company to accelerate software. Our vision was that there are applications that would benefit from acceleration if we augmented general-purpose computing. We take the workload that is very compute-intensive, and we offload it and accelerate it using a programming model that we invented called CUDA, which made it possible for us to accelerate applications tremendously. That acceleration benefit has the same qualities as Moore's Law. 
For applications that were impossible or impractical to perform using general-purpose computing, we have the benefits of accelerated computing to realize that capability. For example, computer graphics. Real-time computer graphics was made possible because of NVIDIA coming into the world and making possible this new processor we call the GPU. The GPU was really the first accelerated computing architecture, running CUDA, running computer graphics. A perfect example. We democratized computer graphics as we know it. 3D graphics is now literally everywhere. It could be used as a medium for almost any application. But we felt that long-term, accelerated computing could be far, far more impactful. And so over the last 30 years, we've been on a journey to accelerate one domain of application after another. The reason why this has taken so long is simply because of this: there is no such magical processor that can accelerate everything in the world. Because if you could do that, you would just call it a CPU. You need to reinvent the computing stack, from the algorithms to the architecture underneath, and connect it to applications on top. In one domain after another, computer graphics at the beginning, we've taken this CUDA architecture from one industry after another industry after another industry. Today, we accelerate so many important industries. cuLitho is fundamental to semiconductor manufacturing: computational lithography. Simulation, computer-aided engineering, even 5G radios that we've recently announced partnerships around, so that we can accelerate the 5G software stack. Quantum computing, so that we can invent the future of computing with classical-quantum hybrid computing. Parabricks, our gene sequencing software stack. cuVS, one of the most important things every single company is working on: going from databases to knowledge bases, so that we can create AI databases using cuVS and vectorize all of your data. 
cuDF, data frames. Data frames are essentially another word for structured data, and SQL acceleration is possible with cuDF. In each one of these different libraries, we're able to accelerate the application 20, 30, 50 times. It amazed me that Jensen actually put a number like 50x on the increase in productivity we can gain. And this wasn't something far off, attached to AGI. It was something he attached to the here and now. Next, he talked about something that I have believed all year is the biggest gateway to bringing AGI into our world. And that is closing the gap of helping AI understand the physical world around us as we know it. When AI understands the real world and the nature of real-time context, it will understand the essence of humanity itself. And that is when AI becomes truly relevant across the board. And I believe we will see this general nature of artificial intelligence that is capable at a generalized level across the board, instead of what we have now, which is artificial narrow intelligence, where AI is actually better at specialized tasks than it is at generalized ones. But understanding the physical world will change that forever. So Jensen Huang talked about how they're building three computers to solve for helping AI understand the physical world. First, the Omniverse: they're creating a world of digital twins where physical AI can learn, train, refine its behaviors, and become fine-tuned before it's deployed, iRobot style, into the real world. This is pretty mind-blowing, because this means you can eradicate a lot of unsafe behaviors out of physical AI before it even gets into the real world. Next, the Blackwell DGX is a big part of that physical real-world training. 
And then he brought up that the computer form itself is something they're involved in as well: something called a manipulator, which is that mechanical arm that does specialized tasks inside warehouses, assembly lines, etc.; autonomous cars; and of course the humanoid, the robot that looks like a human. Listen to what he says about how the next generation of AI needs to understand the physical world: That next generation of AI needs to understand the physical world. We call it physical AI. In order to create physical AI, we need three computers, and we created three computers to do so. The DGX computer, for which Blackwell, for example, is a reference design and architecture, to create things like DGX computers for training the model. That model needs a place to be refined. It needs a place to learn. It needs a place to apply its physical capability, its robotics capability. We call that Omniverse, a virtual world that obeys the laws of physics, where robots can learn to be robots. And then when you're done with the training of it, that AI model could then run in the actual robotic system. That robotic system could be a car, it could be a robot, it could be an autonomous moving robot, it could be a picking arm, it could be an entire factory or an entire warehouse that's robotic. And that computer we call AGX, Jetson AGX. DGX for training, and then Omniverse for doing the digital twin. Now here in India, we've got a really great ecosystem that is working with us to take this infrastructure, take this ecosystem of capabilities, to help the world build physical AI systems. And you know what I've really loved is that Addverb is one of the largest robotics companies. They build robots and, more importantly, they put them in a digital twin where optimization takes place. They teach the robot all the inputs that come out of the physical world. 
Not only is that work taking place, our system integrators Accenture, TCS, and Tech Mahindra are taking that knowledge not only into India but also outside India. So do it in India for India, and from India for the globe. The last huge golden nugget that Jensen Huang dropped was how we have in fact broken Moore's Law, and he put very specific numbers on it. You know, we kind of hear these great abstract numbers, like we're 10x-ing every six months, from Elon Musk. But really, that's just a hypey statement. Jensen gave us actual data. So Moore's Law says that compute doubles every one and a half years. What's actually happening right now, Jensen said, is we are 4x-ing compute every single year. The Blackwell system is extraordinary. Of course the computation is incredible. Each rack is 3,000 pounds, 120 kilowatts, 120,000 watts in each rack, the density of computing the highest the world's ever known. And what we're trying to do is to learn larger and smarter models. It's called the scaling law. The scaling law comes from the empirical observation and measurements that suggest that the more data you have to train a large language model with, the correspondingly larger the model size: the more information you want to learn from, the larger the model has to be, or the larger the model you would like to train, the more data you need to have. And each year, we're increasing the amount of data and the model size each by about a factor of two, which means that every single year, the computation, which is the product of those two, has to increase by a factor of four. All right. Now remember, there was a time when the world of Moore's Law was two times every year and a half, or 10 times every five years, 100 times every 10 years. We are now moving technology at a rate of four times every year. Four times every year over the course of 10 years. Incredible scaling. That's insane. 
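The arithmetic behind those numbers is easy to check for yourself. Here's a quick sketch (plain Python; the doubling figures for data and model size are the ones Jensen quotes, and everything else is just compounding):

```python
# Jensen's scaling-law arithmetic: data and model size each double
# every year, and compute is (roughly) their product.
data_growth = 2       # data doubles each year (Jensen's figure)
model_growth = 2      # model size doubles each year (Jensen's figure)
compute_per_year = data_growth * model_growth   # -> 4x per year

years = 10

# Moore's Law era: 2x every 1.5 years -> about 100x per decade
moore_decade = 2 ** (years / 1.5)

# The new regime: 4x every year, compounded over a decade
new_decade = compute_per_year ** years

print(f"Moore's Law over {years} years: ~{moore_decade:,.0f}x")
print(f"4x per year over {years} years: {new_decade:,}x")
```

That first number lines up with the "100 times every 10 years" figure Jensen cites for the Moore's Law era, and the second makes the "incredible scaling" remark concrete: a decade of 4x-per-year growth is roughly a million-fold increase in compute.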
He said we've also revolutionized scaling laws as they exist. He said that they've discovered intelligence is more than just one shot, and that thinking results in higher quality answers. The second thing that we've discovered recently, and this is a very big deal: after you're done training the model, of course, all of you have used ChatGPT. When you use ChatGPT as a one-shot, you give it a prompt. Instead of writing a program to communicate with a computer today, you write a prompt. You just talk to the computer the way you talk to a person. You describe the context, you describe what it is you're querying about, you could ask it to write a program for you, you could ask it to write a recipe for you, whatever question you would like to have. And the AI processes it through a very large neural network and produces a sequence of answers, producing one word after another word. In the future, and starting with Strawberry, we realized that, of course, intelligence is not just one shot. Intelligence requires thinking, and thinking is reasoning. Maybe you're doing path planning, maybe you're doing some simulations in your mind, you're reflecting on your own answers. And so, as a result, thinking results in higher quality answers. We've now discovered a second scaling law, and this is a scaling law at the time of inference: the longer you think, the higher quality answer you can produce. This is not illogical; this is very, very intuitive to all of us. If you were to ask me what's my favorite Indian food, I would tell you chicken biryani. And I don't have to think about that very much, and I don't have to reason about that. I just know it. And there are many things that you can ask like that. For example, what's NVIDIA good at? NVIDIA's good at building AI supercomputers. NVIDIA's great at building GPUs. And those are things that you know, that are encoded into your knowledge. However, there are many things that require reasoning. 
You know, for example, if I had to travel from Mumbai to California, I want to do it in a way that allows me to enjoy four other cities along the way. You know, today, I got here at 3 a.m. this morning. I got here through Denmark. And right before Denmark, I was in Orlando, Florida. And before Orlando, Florida, I was in California. That was two days ago. And I'm still trying to figure out what day we're in right now. But anyways, I'm happy to be here. If I were to tell it, I would like to go from California to Mumbai, I would like to do it within three days, and I give it all kinds of constraints about what time I'm willing and able to leave, what hotels I like to stay at, so on and so forth, the people I have to meet, the number of permutations of that is, of course, quite high. And so the planning of that process, coming up with an optimal plan, is very, very complicated. And so that's where thinking, reasoning, and planning come in, and the more you compute, the higher quality answer you could provide. So a second scaling law has been born, and it's called time of inference. Now, what's interesting is we've known about these things for a while, but we haven't actually been able to build them into computer models, into LLMs. And now, of course, we have reasoning inside ChatGPT's o1, which is where the model debates itself and thinks through a chain of thought before producing a final answer. And Jensen said that the longer you think, the higher quality of an answer you can produce. Which is interesting. It points back to the importance of quiet time and thinking, which is something often lost in our real world and in humanity itself, in this go, go, go nature and culture we live in. Listen to what Jensen says about how we have learned to understand the meaning of data itself: Several years ago, about a decade ago, something very important happened. And most of you have seen the same thing. 
AlexNet made a gigantic leap in the performance of computer vision. Computer vision is a very important field of artificial intelligence. AlexNet surprised the world with how much of a leap it was able to produce. We had the benefit of taking a step back and asking ourselves, what are we witnessing? Why is AlexNet so effective? How far can it scale? What else can we do with this approach called deep learning? And if we were to find ways to apply deep learning to other problems, how does it affect the computer industry? And if we wanted to do that, if we believe in that future and we're excited about what deep learning can do, how would we change every single layer of the computing stack so that we could reinvent computing altogether? 12 years ago, we decided to dedicate our entire company to go pursue this vision. It is now 12 years later. Every single time I've come to India, I've had the benefit of talking to you about deep learning, had the benefit of talking to you about machine learning, and I think it's very, very clear now the world has completely changed. Now let's think about what happened. The first thing that happened, of course, is how we do software. Our industry is underpinned by the method by which software is done. The way that software was done, call it software 1.0: programmers would code algorithms we call functions to run on a computer. And we would apply it to input information to predict an output. Somebody would write Python or C or Fortran or Pascal or C++ code algorithms that run on a computer. You apply input to it, and output is produced. Very classically, the computer model that we understood quite well. And it, of course, created one of the largest industries in the world, right here in India: the production of software. Coding, programming, became a whole industry. This all happened within our generation. However, that approach to developing software has been disrupted. 
It is now not coding, but machine learning: using a computer to study the patterns and relationships of massive amounts of observed data, to essentially learn from it the function that predicts it. And so we are essentially designing a universal function approximator, using machines to learn the expected output that would produce such a function. And so, going back and forth: this is software 1.0 with human coding, to now software 2.0 using machine learning. This is what a GPU looks like today. This is Blackwell. An incredible system that is designed to study data at an enormous scale. Yeah, thank you. This is the great breakthrough. In the last several years, we have now learned the representation, or the meaning, of words and numbers and images and pixels and videos, chemicals, proteins, amino acids, fluid patterns, particle physics. We have now learned the meaning of so many different types of data. We have learned how to represent information in so many different modalities. Not only have we learned the meaning of it, we can translate it to another modality. Let me tell you, this was a revolutionary keynote that Jensen gave. I think it will be extremely powerful to see what the entrepreneurs do with these new scaling laws and the new opportunities that the compute breakthroughs, thanks to NVIDIA, have given the world. We went from 2x-ing our compute every one and a half years to 4x-ing it every single year. Just think on that for a minute. Think on what that means for your business, your work life, your home life, your efficiency capabilities. Yeah, it means the future is now. It's happening this year, and it will only grow from here. I'd love to hear what you think in the comments. Let me know. It's always a joy to hear from you. I've got a lot of smart people watching my channel. So it is an honor to be here in this new age with you. 
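Before I sign off, the software 1.0 versus software 2.0 contrast Jensen draws can be made concrete in a few lines. This is a toy sketch, assuming a deliberately simple target: a linear temperature conversion, which is my own illustrative stand-in for the vastly larger functions neural networks learn. Version 1.0 is a function a human writes; version 2.0 is the same function recovered from observed data.

```python
# Software 1.0: a human writes the function.
def fahrenheit_1_0(celsius):
    return celsius * 9 / 5 + 32

# Software 2.0: the function is *learned* from observed data.
# A minimal least-squares line fit stands in for the neural
# networks Jensen describes; the data points are (celsius, fahrenheit).
data = [(0.0, 32.0), (10.0, 50.0), (25.0, 77.0), (100.0, 212.0)]

n = len(data)
sum_x = sum(x for x, _ in data)
sum_y = sum(y for _, y in data)
sum_xx = sum(x * x for x, _ in data)
sum_xy = sum(x * y for x, y in data)

# Closed-form least-squares solution for slope and intercept
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
intercept = (sum_y - slope * sum_x) / n

def fahrenheit_2_0(celsius):
    return slope * celsius + intercept

print(fahrenheit_1_0(37.0))   # 98.6, from the hand-written rule
print(fahrenheit_2_0(37.0))   # ~98.6, recovered from data alone
```

Scaled up from a four-point line fit to billions of parameters and tokens, that second pattern, learning the function instead of writing it, is exactly the shift Jensen is describing.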
Let me know what you think of the nature of compute scaling laws and what is possible in this new age. And as always, hit subscribe here on my channel at Julie McCoy, and I'll see you down the next AI Rabbit Hole.