Transcript for:
Mark Zuckerberg and Satya Nadella's Tech Dialogue

[Music] Please welcome Meta founder and CEO Mark Zuckerberg and Microsoft chairman and CEO Satya Nadella. [Applause] [Music]

Zuckerberg: All right, good to see you all again. I hope it's been a good day; a lot of exciting stuff. I'm really grateful to be here with Satya, who really needs no introduction. He's the legend behind the great transformation of the greatest technology company of all time, leading the push toward AI and cloud and all these other important areas. And you've always been, from my perspective, a friend and ally on the open source work that we've done. So I've really appreciated our partnership over time, and the counsel you've given me on how we should approach building out the Llama ecosystem and the infrastructure around it. Thank you for being here.

Nadella: Absolutely, it's my pleasure, Mark. And I should say, my earliest memory is meeting you when I was working on Bing in 2008 or '09 and getting a massive lecture from you on something I was wrong about, which is even more embarrassing in retrospect.

Zuckerberg: Yeah, I remember this.

Nadella: I'll always remember it. You said the web needs people: the ability to actually have a profile page everywhere you go. I'll never forget that.

Zuckerberg: Well, I appreciate that you've forgiven me for that. Although the web does need people, so I guess at that level I was correct. But maybe agents now.

Nadella: Yeah, maybe. Thank God.

Zuckerberg: But anyway, you've said a number of times that this moment in technology, around the growth of AI, reminds you of some of the important transformations of the past, like the move to client-server and the beginning of the web. So I'm curious about that.

Nadella: Yeah. For me, I grew up when the client was being born. I joined Microsoft just after Windows 3, so I saw the birth of client-server, then the web, mobile, and cloud, and you could say this is the fourth or the fifth, depending on how you count. And it's interesting: each time there's one of these transitions, everything in the stack gets relitigated, and you get to go back to first principles and start building. Even the shape of the cloud infrastructure I built starting in, say, 2007 or 2008: the core storage system there doesn't look like the core storage system you'd build for training. This training workload, the data-parallel synchronous workload, is so different from, say, Hadoop. The fact that you have to rethink everything up and down the tech stack with each of these platform shifts is what we face from time to time. It also grows from what was there: the web was born on Windows, but it went far beyond it. That's how I think about this one as well.

Zuckerberg: Yeah, that makes sense. You've made this point a bunch of times: as things get more efficient, it changes the way everything works, and people just end up consuming a lot more of the services.
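The dynamic being pointed at here, where falling unit costs grow total consumption rather than shrinking the market, is often called the Jevons paradox. A toy calculation makes it concrete; the prices and growth factors below are made-up illustrative numbers, not figures from the conversation:

```python
# Toy numbers (illustrative only): when the cost per token falls, demand can
# grow enough that total spend on the service still rises.
old_price = 10.0      # $ per million tokens, generation N (assumed)
price_drop = 10       # generation N+1 is ~10x cheaper per token (assumed)
usage_growth = 20     # usage grows ~20x at the lower price (assumed)

old_spend = old_price * 1                        # baseline: 1M tokens consumed
new_spend = (old_price / price_drop) * usage_growth
print(f"spend change: {new_spend / old_spend:.1f}x")  # 2.0x: cheaper per token, yet a bigger business
```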
Zuckerberg: And one of the things I'm curious about, because you have this great enterprise business and we don't have as much visibility into it, is how you're seeing that play out across all these AI models. Generation over generation, they're getting so much more efficient and delivering more intelligence than the last, and obviously it's all happening super quickly. So I'm curious what you can see there.

Nadella: Right. If you think about it, a few years ago we were all sitting around saying, "oh my, what's happened to Moore's law, is it over, what do we do?" And here we are in some crazy hyperdrive Moore's law. It's always been the case that any one of these tech platform shifts has not been about one S-curve; it's been multiple S-curves that compound. Take just the fact that the chips are getting better, with people like Jensen or Lisa doing tremendous innovation and their cycle times getting faster. Call that Moore's law. But on top of that there's everything at the fleet level: the system software optimization, the model architecture optimizations, the kernel optimizations for inference, the app server, even how good we've gotten at prompt caching. Add all of that up, and every six to twelve months you have perhaps a 10x improvement. And when you have capability improving at that rate, and prices dropping at that rate, fundamentally consumption goes up.

So I'm very optimistic that we're at a stage where deep applications can get built: things where you have an orchestration layer with agents and multiple models. I feel like we're at that place, because the first generation of apps was very coupled to one model, but we're finally getting to multi-model applications, where I can orchestrate a deterministic workflow in which an agent built on one model talks to an agent built on another. We even have protocols that are helpful, whether it's MCP or A2A; these are all good things if we can standardize a bit. Then we can build applications that take advantage of this capability curve but keep flexibility. And that's where I think open source absolutely has a massive role to play.

Zuckerberg: Well, I definitely want to make sure we get into how to use multiple models together. There's this whole concept of a distillation factory, and the information and infrastructure around it, that you think Microsoft is well positioned to provide as there are multiple models. Maybe we'll come back to that in a minute. But before we do: Microsoft has obviously been on an interesting journey around open source, and embracing it was one of the big things you did early under your leadership. You had the early partnership with OpenAI, but you were also very clear that in addition to working with closed models, you wanted to make sure Microsoft served open models well.
I'm curious how you think about that, how you think the open source ecosystem is going to evolve, why it's important to your customers, and how it fits with all the infrastructure you're building.

Nadella: It's interesting you ask, because one of my formative jobs at Microsoft was making sure we had interoperability between NT and the various flavors of Unix out there. That taught me one thing: interoperability is, first of all, what customers demand, and if you do a good job of it, that's good for your business, because you're meeting customers where they are. That's what has shaped my thinking on open source. I'm not dogmatic about closed source or open source; both are needed in the world, and customers will demand both. Even if any one of us has dogma, it doesn't matter, because at the end of the day the world will break that way; there will be a need for it. There was SQL Server, and there was MySQL or Postgres. There's Linux and there's Windows, and in fact there's Linux on Windows: my favorite thing to use is WSL on Windows, because it makes it so easy to take a lot of the dev tools and deploy them on Windows. So overall, a posture that allows you to mix and match these two things is super helpful.

It also fits with what you just talked about, because a lot of my enterprise customers want to distill models that in many cases they then own; it's their IP. That's a place where an open-weight model has a huge structural advantage over a closed model. So I do feel the world is better served with great closed-source frontier models and great open-source frontier models. And for us as a hyperscaler this is a great thing, because our job, after all, is to serve: if you go to Azure you can get fantastic Postgres or great SQL Server, Linux VMs or Windows VMs. In the same way, we want the choice available, with great tooling around it.

Zuckerberg: So what's the pitch, or the role you see Azure playing, for open source? And I guess it doesn't need to be exclusively that: for developers who are getting started, where are the areas where you're trying to differentiate and be the best?

Nadella: The first thing is that an AI workload is not just an AI accelerator and a model at inference time. Underneath any AI workload there's storage, there's compute other than the AI accelerator, and there's a lot of dependency on the network. So in Azure we want to build compute, storage, and network, plus AI accelerators, as infrastructure-as-a-service that's world class for anyone who wants to build the next generation of agents. Then above that, with Foundry, we're also building an app server; with every platform shift of ours there has been an app server.
How do you package up all the services that every developer needs, whether that's search, memory, safety, or evals? If you wrap them all in frameworks and tools, that's the core. And then the other piece is GitHub Copilot as the tooling; we're excited about the progress it's making. So the combination of great tools, a great app server, and great infrastructure is, for us, what's needed to accelerate application development.

Zuckerberg: You mentioned agents and increasing productivity, and that's obviously a huge theme for the whole ecosystem and community. How are you seeing that play out inside Microsoft, and what are some of the most interesting examples you're seeing with development?

Nadella: The thing that has been most helpful for us to see is what's happened with software development. Look at the evolution of GitHub Copilot. You started with code completions. Then we added chat, so you don't need to go to Reddit or Stack Overflow; you can stay in the flow. Then the agentic workflow, where you can just go assign a task. If I look at any one of us using it, you're using all three all the time; it's not that one substitutes for the other. And now you have even a proto-agent, so you can give a high-level prompt, or just get a PR assigned to a SWE agent. So all four of those.

The biggest lesson learned on the productivity gains, Mark, is that you have to integrate all of that with your current repo and your current developer workflow. It's one thing to build a new greenfield app, but none of us gets to work on pure greenfield all the time. You're working in a large code base with a large set of workflows, so you have to integrate the tool chain. That's the systems work any engineering team has to do, and that's when you see the productivity.

The same applies, quite frankly, to the rest of knowledge work. Take our Copilot deployment for knowledge work, in sales for example. The workflow for how I get ready for an enterprise customer meeting had not changed since 1992, when I joined Microsoft: someone writes a report, it comes in email or gets shared in a document, and I read it the night before. Now I just go to Researcher in Copilot and get the whole thing, a combination of what's on the web, what's internal, and even what's in my CRM, all done in real time. There's no need for somebody to prepare anything, because it's all available on tap. So it requires you to change the work artifact and the workflow, and that's a lot of change. It happens slowly at first, and then all of a sudden. I saw that with PCs. Think about how the world did forecasting before email and Excel: faxes went around. I guess you never lived in that world.

Zuckerberg: I was in middle school.

Nadella: Nuts.
But there was a world where people sent around faxes and inter-office memos, and then somebody said, "hey, I'll send a spreadsheet in an email," people entered numbers, and that changed how forecasting was done. That's what I think we're at the very beginning of, and you see it in customer service, in marketing collateral creation, in content creation. That's where we are, and you're seeing tangible progress and tangible productivity gains.

Zuckerberg: Interesting. In terms of coding, do you have a sense of how much of the code being written inside Microsoft at this point is written by AI as opposed to by the engineers?

Nadella: There are two things we're tracking. One is the accept rate itself, which is somewhere around 30 to 40 percent and growing monotonically. It depends on the language: one of the big challenges we've had for a long time is that a lot of our code is still C++. C# support is pretty good, C++ was not that great, and Python is fantastic. We've gotten better at that, and as language support has increased, the code completions have gotten good. Agentic code is still nascent; for new greenfield work it's very, very high, but as I said, almost nothing is greenfield. Oh, and by the way, code reviews are very high: usage of the agents we have for reviewing code has really increased. So I would say maybe 20 to 30 percent of the code inside our repos today, in some of our projects, is probably written by software. What about you guys?

Zuckerberg: I actually don't have the number off the top of my head, but I think a lot of the stats people cite are still effectively of the autocomplete variety. We do have a bunch of teams working on feed-ranking and ads-ranking experiments, very contained domains where you can study the history of all the changes that have been made and then make a change. That's an interesting area for us to work in. But the big one we're focused on is building an AI and machine learning engineer to advance Llama development itself. Our bet is that within the next year or so, probably half of the development is going to be done by AI rather than by people, and that will just keep increasing from there. So I was curious whether you were seeing something different.

Nadella: To me, the SWE agent is the first attempt. So the question for us is, in the next year, take something like a kernel optimization: will we get to where something like that happens? I think that's more likely. Will it come up with a novel model architecture change? Probably not. So the question is which tasks.

Zuckerberg: Yeah, optimizations, security improvements, that type of thing seems like a pretty high opportunity. We're also trying to solve a different problem, because you serve a lot of developers and engineers; that's your core business.
Whereas for us, we think of this more as a way to improve our internal development, and then to improve the Llama models that other people can use; it's not something where we do the end-to-end workflow the way you do. So it's always interesting to hear how you're thinking about it.

Nadella: And the other thing for us, to your point about our core business: Bill started the company as a tools company. So the interesting thing to think about now is that maybe we should reconceptualize our tools, and our infrastructure quite frankly, as the tools and infrastructure for agents to use. Even the SWE agent needs a bunch of tools. What shape should they be? What should its infrastructure be, and what should its sandboxes be? A lot of what we're going to do is essentially evolve what even the GitHub repo construct looks like for the SWE agent.

Zuckerberg: That's a very interesting concept. I tend to think every engineer is effectively going to end up being more of a tech lead in the future, with their own little army of engineering agents they work with. So with all of that in mind, there are a few directions to go. You mentioned your personal workflow for using AI; I'm curious how that's changed. And I'm also curious, since you were talking about how Microsoft got started and the legacy there: if you were getting started as a developer today, building something, how would you think about which tools to use?

Nadella: One of the biggest, call it dreams, pursuits, questions, that Bill inculcated in all of us was: what's the difference between a document, an application, and a website? Now, if you use Meta AI, ChatGPT, Copilot, what have you, it's unclear to me what the difference is between a chat session and any of those. In our case I go to Pages. Literally, coming down here, I was reading up on everything about Llama 4 and all the models: I was doing a bunch of chat sessions, adding them to what is effectively a document in Pages, and persisting it. And then, since you have code completion, you can go make it an app, or what have you. So this idea that you can start with a high-level intent and end up with a living artifact, something you would in the past have called an application, is going to have profound implications for workflows, and I think we're at the beginning of that. That's my dream, as a builder of infrastructure and tools and as a user of them. These category boundaries, not artificial exactly, but boundaries created mostly because of limitations in how our software worked: perhaps you transcend them. In fact, the other thing we always used to ask is: why are Word, Excel, and PowerPoint different things? Why isn't it one thing? We've made multiple attempts at that. But now you can conceive of it: you can start in Word, visualize things the way you would in Excel, present it, and it can all be persisted as one data structure.
And so to me, that malleability, which was not as robust before, is now here.

Zuckerberg: Interesting. Makes sense. One of the things from our conversations over the years that has stuck with me is that you have a very reasonable way of looking at how technology trends unfold. There's all this hype around AI, and I feel like you've been able to see through it and make very rational investments at each step along the way. One of the points you've made is that, hype aside, if this is going to lead to massive increases in productivity, that needs to show up as major increases in GDP, and that's going to take many years to play out. So I'm curious about your current outlook: what should we be looking for to understand the progress this is making, and where do you expect it to be over a three, five, seven year period?

Nadella: To me, that's right, because for us this is a pretty existential priority. Quite frankly, the world needs a new factor of production, an input that allows us to deal with a lot of the challenges we have. The best way to think about it is: what would it take for the developed world to grow at 10 percent, which may have been among the peak numbers during, say, the industrial revolution? For that to happen, you have to have productivity gains in every function: in healthcare, in retail, in broad knowledge work, in every industry. AI has that promise, but you now have to really have it deliver the real change in productivity, and that requires software and also management change, because in some sense people have to work with it differently. People always quote what happened with electricity: it was around for 50 years before people figured out they had to really redesign the factory to use electricity differently, which was the famous Ford case study. To me, we're somewhere in between. I hope we don't take 50 years. But just thinking of this as the horseless carriage is also not how we're going to get to the other side. So it's not just tech. The tech has to progress, but you also have to put it into systems that actually deliver the new work artifact and workflow.

Zuckerberg: Well, we're all investing as if it's not going to take 50 years, so I hope it doesn't. All right, so we've been doing the more technical questions up front and then getting to the big-picture stuff, but I realized we forgot to dive into the distillation factory idea: how you combine all the different AI models that are getting built in open source, and what infrastructure you think is going to be necessary to build that out.
This is something you've talked about, and I think you have a view on it.

Nadella: Yeah. To me, that's one of the biggest roles of open source: being able to take a large model, even within the Llama family, and distill it into a smaller model that keeps even that same model shape. I think that's a big use case. And then, to build the tooling for it as a service and lower the barrier. To your point, standing up some of these large models takes a lot of infrastructure, and not everybody needs to do it. But if you do it as a cloud, and pull the tools around it, and the outcome is a distilled model: in our case, suppose every tenant of Microsoft 365 could have a distilled, task-specific model they create as an agent or a workflow, which can then be invoked from within Copilot. That to me is a breakthrough scenario. People are already doing a lot of this, and we want to make it a lot easier. So when I say distillation factory, it's that many-to-many, or one-to-many, relationship I want between one large model and these distilled models, which then get composed with a lot of other workflows inside something like GitHub Copilot or Copilot, because they all now support, with MCP servers and so on, the invocation of these other agents.

Zuckerberg: I've always been fascinated by this. I think distillation is one of the most powerful parts of open source. And given our respective roles here, where we're training the initial Llama models but don't build out most of the developer infrastructure ourselves, it matters to have companies like yours building that complex infrastructure. We have models like the Behemoth one we're working on, where I think it's really unclear how you would use it except to distill it into more reasonable forms. Even for us to use it ourselves, we had to build a bunch of infrastructure internally just to be able to post-train it.

Nadella: And Maverick, you said, is distilled?

Zuckerberg: Yeah, a bunch of it. The way we got Maverick's performance to the level it's at: it's multimodal, it's leading on multimodal, and its text performance is basically up there with the other leading text models, but it's smaller. DeepSeek is a bigger model, but on text Maverick is basically comparable, and on images and multimodal its capabilities exist where the others' don't. A lot of how we got there is that the pre-training of Behemoth is done and we're working on the post-training, and then distillation. It's just like magic: you can get 90 or 95 percent of the intelligence of something 20 times larger in a form factor that is so much cheaper and more efficient to use.
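To ground what "distill" means here: below is a minimal sketch of standard teacher-student distillation, assuming PyTorch. The temperature, the loss weighting, and the frozen-teacher training step are textbook choices in the style of Hinton's soft-target formulation, illustrative only, and not Meta's actual Behemoth-to-Maverick pipeline. The student is trained to match the teacher's output distribution while still fitting the ground-truth tokens:

```python
# Minimal knowledge-distillation sketch (illustrative; not Meta's pipeline).
# A small "student" learns to match a large "teacher"'s token distribution.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target KL term against the teacher with hard-label CE."""
    # Soften both distributions; the KL term pulls the student toward the teacher.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kl = F.kl_div(soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2
    # Ordinary next-token cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(student_logits.view(-1, student_logits.size(-1)),
                         labels.view(-1))
    return alpha * kl + (1 - alpha) * ce

# One training step (teacher frozen, student updated) would look like:
#   with torch.no_grad():
#       teacher_logits = teacher(input_ids).logits
#   loss = distillation_loss(student(input_ids).logits, teacher_logits, labels)
#   loss.backward(); optimizer.step()
```

The "90 or 95 percent of the intelligence" claim is part of why this works: the teacher's soft distribution carries more signal per token than one-hot labels alone, which is what lets a much smaller student recover most of the behavior.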
Zuckerberg: So then the question is how you make that available to people who can't build out their own infrastructure and aren't as technically sophisticated, because right now there's a relatively small number of labs in the world that could do that kind of distillation, or even operate models of that scale. By the time this vision of yours is built out, and it's accessible for most developers around the world to not only distill from a single model but hopefully, over time, mix and match and take different aspects of intelligence from different models where they're stronger, that just seems like one of the coolest things that's going to get built.

Nadella: I think that's correct. So then: if you have multiple models you're distilling from, what's the eval around the distilled model that lets you qualify it? That's where we can do a lot of our tooling work and our infrastructure work to reduce the barriers for people to have that flexibility. And the good news is that it's already started; there's existence proof of all of this. It's just a question of whether you can reduce the barrier to building it. The other thing is the speed with which people can move. One of the challenges to date has been: I do something with one model, I fine-tune it, a new model drops, and I need to move fast to the new one. That's the other thing we have to get good at, because you can't be saddled with what you did; the world is moving too fast.

Zuckerberg: Yeah. And developers just need things in different shapes. The Llama 4 shape of 17 billion parameters per expert was designed because the atomic unit we have at Meta is an H100, so we want to be able to run these things very efficiently on it. It's one of those things where you look at some of the other open source models that have come out, and they're good intelligence, but sort of awkward to inference because of their scale; maybe they're targeting different kinds of infrastructure. We built this for server production, but a lot of the open source community wants even smaller models. The most popular Llama 3 model was the 8B, and we'll work on a smaller version. I talked about that earlier; internally we refer to it as Little Llama, but we'll see what we actually ship. Being able to take whatever intelligence you have in bigger models and distill it into whatever form factor you want, to run on your laptop, on your phone, on whatever the thing is, is to me one of the most important things.

Nadella: And you're obviously working on this too, which is good to see: if we can get to these hybrid models, say dense plus thinking models that combine, where you're able to get the latency you want or the thinking time you want, and it's flexible, that I think is where we'll all want to end up.
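For a rough sense of why a model's "shape" gets designed around a specific atomic unit of hardware, here is some back-of-envelope arithmetic. The figures below are public spec-sheet approximations and assumptions, not Meta's actual sizing exercise: in a mixture-of-experts model the total parameters can be sharded across many GPUs, but the parameters active per token are what every decode step must stream through, so they bound single-stream speed:

```python
# Back-of-envelope MoE sizing (illustrative assumptions, approximate figures).
ACTIVE_PARAMS = 17e9     # ~17B parameters activated per token (Llama 4 shape)
TOTAL_PARAMS = 400e9     # Maverick-scale total across all experts (approx.)
BYTES_PER_PARAM = 2      # bf16/fp16 weights
H100_HBM_GB = 80         # HBM capacity of a single H100
H100_HBM_BW = 3.35e12    # ~3.35 TB/s HBM bandwidth (H100 SXM, spec sheet)

total_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9
active_bytes = ACTIVE_PARAMS * BYTES_PER_PARAM
print(f"all weights: {total_gb:.0f} GB vs {H100_HBM_GB} GB HBM, so shard across GPUs")
print(f"active weights per token: {active_bytes / 1e9:.0f} GB")          # ~34 GB

# At small batch sizes, decoding is memory-bandwidth bound: each new token
# must read the active weights from HBM at least once.
print(f"rough single-stream decode ceiling: {H100_HBM_BW / active_bytes:.0f} tokens/s")  # ~100
```

The same arithmetic explains the pull toward smaller distilled models: an 8B-class model at 2 bytes per parameter is around 16 GB of weights, which is what starts to fit consumer hardware rather than a server fleet.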
Zuckerberg: All right, well, maybe a good note to end on: when you look at everything that's going on, what gives you the most optimism, or what are you most excited about, for what developers are going to be doing over the next few years?

Nadella: Look, I always take my inspiration from that Bob Dylan line: you're either busy being born or you're busy dying, and it's better to be busy being born, especially at a time like this. What gives me optimism is that, even with all the various constraints, it turns out software, in this new form of AI, is still the most malleable resource we have for going after hard problems. That's what gives me optimism, and I'd also say it's the call to action for the people in this room and for all of us: take the opportunity to lean into this, and build solutions. Whether I look at the IT backlog in a company or the unsolved problems in the real world, both need something new. That's where the greatest benefit of all this tech is, and then it only comes down to developers, in particular, being able to go at it fearlessly.

Zuckerberg: Awesome. All right, well, thank you, Satya, and thank you all for coming out. This has been an exciting day, and I'm very excited about what we're all building. [Music]