Transcript for:
AI Agents Overview and Tools

Hello everyone, good morning, afternoon, or evening, wherever you are today. I believe we have people from about 50 countries, maybe more, joining us. My name is Maline Perez. I am a program manager here at Google Cloud, and I'm very excited to be here with you as the host of today's session. Today we'll be talking about the Agentic Revolution: why now is the time to build intelligent agents. Proful will explore the agentic shift, uncovering why agents are revolutionizing industries, what their core capabilities are, and why this is the golden era for startups to begin building them.

But before we jump into today's session, I just have a bit of a spiel to go through. We'll get started with the curriculum for this year's school. If you haven't heard, we already had the week zero session, which is available on demand; you can check it out on our platform to see what we've already covered. We're currently in week one of the sessions, where you'll learn more about why now is the time to build intelligent agents. In week two you'll learn more about the Agent Engine and its potential, and in week three we will deep dive on building and deploying your first agent, and we'll talk about LLM applications with Datadog.

For those who are here with us for the first time, this is a quick overview of Startup School at Google Cloud. Each week we have live sessions like this one on Tuesdays and Thursdays, which are later available to watch on demand if you want to go back to some of the topics that have been covered. Most of our sessions also have non-mandatory but highly recommended labs and notebook sessions, so you can put the knowledge that you gained into practice. In the Cloud Skills Boost platform you can also obtain skill-specific badges. As for the Cloud Skills Boost piece of all of this, you'll be able to access these labs free of charge; you'll just need to fuel your account with credits in our platform.
All the information you see on this slide was provided to you in a reminder email or confirmation email that you all received. If you haven't seen it already, please double check your spam folder; we hear all the time that this kind of gets lost there. You'll just see an email from [email protected]. But again, in case you have any problems, as you can see on the bottom of this slide, you can reach out to startupschool [email protected] and we can help get you all set up and make sure that you have no issues, hopefully.

One last thing here is just to go over a little bit of housekeeping and explain again how the session will go before we jump in. First off, on the right-hand side of the screen there's a chat function, and a Q&A panel at the bottom. We have Google AI experts on hand to answer any questions that come up throughout class. Please ask your agentic AI specific questions in the Q&A panel below, and interact with other founders and developers in the chat. And then one more request: please don't share your LinkedIn profiles in the chat space. We'll go into where you can share that instead afterwards.

We're going to save the last 10 minutes of class for Q&A with our instructor, and we'll select the most common questions and those related specifically to the content they cover. Please upvote questions in the Q&A panel and use the chat space to converse with other participants. We're recording each class, and you'll be able to watch them again on demand at a later stage; you can access these recordings via the main program page. We won't be sharing slides, unfortunately, but we will share links to the labs covered in class, both during the class in the chat and Q&A and also in our weekly recap emails. You can also find the list on the resources tab of our website.

And last but not least, this is a hands-on training series, and most classes will cover a lab. All of these, again, are available on the Cloud Skills Boost platform or on our Google Cloud GitHub
repo. We'll share links with you to access both of these. We recommend using this time during class to watch our instructors going through the lab, and completing it after the session. The credits we provide will last you another few months, so you have plenty of time to get these done.

That is all from me. Now it's a massive pleasure for me to welcome to the virtual stage our amazing speaker, Proful. Welcome to the Startup School stage. I will see you all later for the Q&A. The floor is yours. Thank you.

Great, thanks, Maline. Hey everyone, nice to meet all of you, if only virtually. My name is Proful. I'm a customer engineer here at Google Cloud. I've been around for the last five or six years at this point, so I've been lucky to get a front seat to see how quickly AI has evolved, to a point where people are now building agents very, very quickly.

Now, in terms of what we will cover today, I want to focus today's session on the why. Everyone is building agents, everyone's talking about it, everyone's discussing it; there's so much going on when it comes to agents right now. But this session is more about us taking a step back and reflecting on why exactly you should even be building agents in the first place. I know I'm talking to founders, aspiring founders, and folks who are fairly far along in their journey when it comes to building their organizations, but I believe this session will really help you understand why agents and why now.

So with that, not surprisingly, let's talk about agents. What exactly are AI agents? This is a term that's used so loosely, so often, in so many forums and in so many ways, that people just assume everyone knows what they mean when they say AI agents. But let's start with a few simple definitions. We had this era of what I like to call the chatbot era. Imagine this entity sitting at the
bottom right-hand side of your web page, answering a few specific questions for you and then completely failing when you try to ask it something outside this context. What we call a conventional agent today is the evolved variant of what used to be the chatbot. Think of conventional agents as agents that have a single purpose: they have lean interactions, and they are fairly limited in their ability and in their memory. A simple example would be an agent that can just set up a meeting for you. That's what we call a conventional agent.

But a big chunk, in fact this entire series, not just the session today but the entire series from today till the end, for the next two and a half to three weeks, will focus more on what you see on the right-hand side, which is the agentic system: how exactly can you build a solution that is goal-oriented, that is multimodal, that can combine agents for you? That's what we will be focusing on, and we will build up to a point where not only will we talk about it, but we will show you how to build it and, at a high level, how to use multiple pieces to build a production-grade solution.

Now, I spoke a lot about agents, but I like this definition of agents; it really captures what we want people to know when we talk about agents. Primarily, agents are autonomous entities. They do not require any human intervention: they're designed by us, but they don't really need a human-in-the-loop element to actually work. They are goal-oriented, which means you can have one single goal for all of these agents. And think of these as multi-agent systems that are working with each other and using tools at an agent level as well. Again, if these are terms that you don't necessarily understand, don't worry about it; we will talk about and show
you some of these things today. In fact, this is a great white paper that we released; I would very strongly recommend that you take some time to really go through it.

Now, it's one thing to talk about it; it's another to actually show you something. So with that in mind, let me switch to a very simple demo of what an agent looks like. Give me a second while I do that.

So this is AI Studio. I don't know if you folks are familiar with it, but think of this as your playground of sorts. When it comes to building AI solutions on Google Cloud today, there are two approaches: one is AI Studio, which you see right now, and one is Vertex, which we will show you a little bit later today and which will be the focus of the next two and a half weeks. Once you come here, you can already see that there are a bunch of things you can quickly try out. You can click on, let's say, one of these models and just get started and play around with them. You can go ahead and set up a few things as well. So there's a lot that you can potentially do right away.

What I will focus on right now is something very simple: build a simple agent for me. You will get to see this in action right now, where I will build a conversational agent using Gemini that will book a travel reservation for me and can use Google Search grounding as well. So let me copy, let me paste my prompt. Okay, great. Let's go ahead and use this to build a very simple agent right away.

What you will see here now is the model thinking, figuring out what the conversation flow should be, refining a bunch of things, and as it does that, on the right-hand side you'll start seeing some code appear. So give it a few minutes; actually not a few minutes, maybe a minute, because the last few times I've done this it took about 50 to 55 seconds to finish the thinking and actually design all of this for you. So let's wait and see this come up. Okay, so now it's defining a
data structure. So as you watch the thinking, as a developer, a lot of these will be things that you would be doing on your own when you're building something from scratch. Here Gemini is doing all of the thinking for you. It's thinking through the logic, thinking through the databases, thinking through the API setup, thinking through how to interact with Gemini. It's effectively behaving as an architect for you for that one simple prompt that you've given it, and based on that prompt it's figuring out what makes sense and wiring all of this together for you.

So while we wait for this, now it's building the UI structure as well. I think we're very, very close to it wrapping up its thinking, which means that in a few seconds' time you'll start seeing code show up on the right-hand side. Okay: streamline the core, the UI, the logic. I think Gemini is fairly thorough, as thorough as you would want all your software engineers to be, which is a good thing, I suppose.

Okay, while we wait for this, let's switch back to the presentation and go to the next slide, and I'll come back to this in a second. So we were here. The focus of our session today and the rest of the entire Startup School series will be the agentic system: how do you focus on building an agentic system? Because, to be very honest, it's become very easy to build conventional agents, and you will see this in a minute. In fact, let's go back and see if it's ready. Okay, it's still building it out, so we'll come back to it later. But it's become very simple, even trivial, to build these agents. So the focus here is how you build solutions that are multi-agent in nature, that are able to solve one big problem statement. So, for example, something like, let's say,
an executive assistant that can manage your work life for you.

In fact, now let's go back to the code that we were trying to power up. You can see here, this is the same prompt. It thought for a few minutes, and now it's designing all your code right in front of you. Now, a lot of you might be familiar with this whole concept of vibe coding; you're getting a feel for that right now as you see all of this happening, where with just a simple prompt Gemini has figured out all the structures that need to be put together, and based on those structures it's going ahead and putting the entire code together for you. And this is all live, by the way. I actually put in the prompt, and it chose today to take 3 minutes; it usually takes only about a minute or so. But as you can see, it's writing all of the code as we speak. So let's wait for it to finish this up.

And there we go. We now have a very simple app which you can talk to, which basically behaves like an assistant for you, where you can, let's say, book a trip to Vietnam. Vietnam was one of those places that I've not been able to visit this year, so let's maybe try that. Okay, so what you will see is it's now doing this in a conversational manner, where you can give it a start and end date. So let's say 1st of July to 10th of July. Anyway, I'm assuming you folks get the gist: this is how you can just keep talking to it and keep designing an entire itinerary for yourself.

Keep in mind this was done in a matter of minutes. There was no existing code, there was no existing UI; just through a simple prompt you were able to build a very simple conventional agent. Which goes back to the point I was trying to make earlier: it's become so simple to build agents that now you want to look at how to build these systems as a way to get started. So let me go back to my presentation again. And that's where we are: we were talking about agentic systems, which leads
us to the question: why agents, and why now? All of you folks are founders, all of you are building businesses, all of you know exactly the value of building these companies. You will know that most modern businesses work in a specific manner, and at a very high level, and I know I'm making a massive approximation when I say this, they work in two layers. One is what we call the SaaS layer. This is the solution layer, where all the solutions that you might be using in your organization sit: let's say Salesforce for tracking, Workday from an HR standpoint, Canva, Figma. There are so many tools that you might be using day in and day out; that's the SaaS layer. But there's also this second layer, which is in my opinion the most important layer, which is the human layer. This is the layer of the smart, talented folks in your organization who are hired to operate all of that software and derive real business value from it.

Now, when I put this together I drew the layers the same size, but the reality is quite different. The dollar value of the work that goes into building the human layer is massive. In fact, it may well be called the most inefficient layer of the two, because of the sheer amount of work that goes into, let's say, hiring people, training them, getting them to understand what sort of priorities they need to meet and achieve within an organization, and getting them to execute a certain number of tasks. From that point of view, it's important to then see how AI can help you in that scenario, and that's where agents come in. Because agents are autonomous, like I said a little bit earlier, you could have a scenario where vertical agents sit in the human layer and allow humans to augment everything that they're doing day in and day out. And for all of you
who are building startups today, this is where the money sits, this is where the opportunity sits, and this is exactly why we very strongly recommend that folks look at building agents.

And just to close the loop on that point, this is the entire value chain when it comes to building an agent. You will see that most of the pieces of this value chain have been covered in some shape or form by multiple companies, be it, let's say, Google and Microsoft, who cover things like fine-tuning, training, creation of agents, combining agents, and inference from agents. So the horizontal stack, for want of a better word, is covered. So where does the opportunity sit? That's where vertical agents really come in, and that's the point I was trying to make on the previous slide: building a solution specific to that human layer, specific to a vertical, is where the opportunity really sits for all of you.

Which then brings me to the question of why now. Why exactly do you need to think about building agents right now? Well, that's because AI itself has evolved quite a bit. Like I said at the beginning of this session, I've been at Google for almost six years now, and in the first two years of my time here, any AI discussion would be centered on things like: can I do a very simple classification model on my data, let's say data sitting in BigQuery, which is our serverless data warehouse solution? Can I do very simple regression models? But now AI has changed, and it all changed in the last two to three years, where we started with this whole focus on LLMs. I'm sure most of you know what LLMs can do; I don't need to talk too much about it, but at a very high level LLMs give you the ability to generate content, to retrieve data from
existing sources as well. So you have this powerful ability through LLMs.

The next thing that appeared was what we call a RAG solution, or retrieval-augmented generation: how could you have a retriever attached to an LLM and use that as a way to retrieve data? Because, to be very honest, LLMs are fixed from the point of view of the data that they're trained on. You can't really ask them for information that they're not trained on. They can generate content from the data they're trained on, but they are not able to retrieve data they're not trained on. That's where RAG can really help you.

The next evolution was tooling. This is another way by which we've been able to augment what is possible through your RAG solution. You can have tools that you use to access other systems. Imagine a scenario where, as a software engineer, you want to look up a Jira case that you've logged and need to work on. With this ability, through an LLM, you can now interact with Jira.

And now we are in this whole world of multi-agents, where you don't have just one agent; there are multiple agents potentially talking to each other, trying to keep each other in the loop. So that's the evolution we have seen in the AI agents ecosystem, especially in the last year.

But more than that, there's another important evolution that's happened, and that's the confluence of three very specific advancements. One is the models. At this point, as someone who follows AI quite a bit, I see a new model being announced just about every other week. Even here at Google we have announced so many models, we have evolved so much just in the last six months, and we will continue to evolve. But that's not just us; that's going to be all the other players in the
space. But the great thing you will notice as a pattern is that with every successive iteration of a model, it's become cheaper and cheaper to use the models themselves. That results in lower cost for you to build applications; that results in a cheaper inference setup. More than that, models have become multimodal, which means you can now give inputs to a model in different formats, be it text, video, or image. And more importantly, we now have models that give you a response in different formats as well, which we will see a little later in the session, where I will talk to a model and get a live response from it.

The second thing is the tooling. With the focus on building production-grade applications, one of the biggest areas where we have seen a lot of growth is the tooling. Most of you would have heard of frameworks like LangChain, LangGraph, and CrewAI. These are all frameworks that allow you to combine a model with a tool and drive real business value.

And then the third bit is the platform: the ability to build, test, and deploy all of these agents. What we have seen over the last, let's say, 18 months is that these three specific elements have evolved so much that you are now at a point where, quite frankly, this is the best time to build agents. It's as simple as that. And that's what we are here to help you with. Over the next three weeks we will focus on how you can use the Google platform, specifically elements like Vertex AI Agent Engine. Again, if these are terms you have not heard of, don't worry about it; we will spend quite a bit of time going over all of them over the next two and a half weeks. So, just going back to this again, this confluence makes it a very unique time to go ahead and build agents for yourself.

Now again, I don't expect all of you to follow along with this slide, which is
fine, but the hope is that by the end of the Startup School series you will be able to understand every single thing we have on this slide, because every single thing on it will be talked about in the future sessions. This, in some ways, is a high-level architecture of what agents can look like. You have a serving layer, which could be our own Vertex solution or, let's say, FastAPI. You can use agent orchestration; that could be LangChain or LangGraph, and I will spend some time today talking about ADK, the Agent Development Kit, which is Google's orchestration layer. Then you have the evaluation, the LLM itself, the data stores, and how you log and monitor all the activity. So this is how the architecture of a production-grade agent solution should look, and we will cover all of this in more detail over the next few sessions.

Which then brings me to why Google Cloud. We talked about why you should look at building agents, and, taking it all the way back to the beginning, what exactly agents are and why you should look at using agents now. Why should you look at using Google Cloud for building agents? That's primarily because, when it comes to these three specific innovations, Google may well be the only platform out there that has a solution for all of them. Starting with the model, we have the Gemini family of models. This is the family of models you can use as a way of just getting started; there are so many models that we have already released, and I'll show you one of them in action as well. Then you have the ADK, the Agent Development Kit. That's the framework you can use to build your own agents; just like you have LangGraph and CrewAI, you also have ADK as a great way of building out agents on the Google platform. And then you have Vertex. Vertex is our platform for building and deploying agents. Now, just taking a step
back: there are solutions for tools, there are solutions for models, there are solutions for a platform. But with Google you're getting a single solution, a single platform, for all three at the same time.

With that in mind, let's start by talking about Gemini. Gemini, as I was saying a little earlier, is our family of foundation models, and Gemini is truly multimodal. What I mean by that is that not only is the input to Gemini multimodal, but the output can be multimodal as well, depending on the model you end up choosing from the platform. It comes in many flavors. As you start working with the platform, you'll hear terms like the Gemini Pro family of models or the Gemini Flash family of models. As the term suggests, Pro is the leader, in some ways, of the family of models that we have. It's our best-performing model, ideal for reasoning, ideal for coding and any sort of software agents; it's ideal for all of those sorts of workflows. Flash is equally good in a lot of use cases, but more than that, it's an extremely cheap model as well. So it's good, it's fast, and it's cheap, which is what I think most startups want as they're building out solutions.

Now, it's one thing to talk about them; it's another to actually hear them and engage with them. So I'll just switch to another quick demo. Give me a second while I do that. Okay, so we are now back in AI Studio. What we will do now is click on Stream. What I'll be showing you here is how you can talk to Gemini and get a response back from Gemini through voice. Like I was saying a little earlier, the output could be text, it could be audio; you can give a video input, an audio input, an image input. So it's truly multimodal. What we will do here is I will also enable a few things, specifically grounding with Google Search. Grounding is
the ability for the model to retrieve data that was not in its training data set. In this case I'll be talking to Gemini, and I'll ask it a very simple question. For those of you who follow sports and follow tennis quite a bit, you may have watched the French Open recently. So I'm just going to ask Gemini who exactly won the French Open. With that in mind, and I think all the settings are already enabled, let's go ahead and talk to the model.

Hello. "Hello, how can I..." Hello, hi. "Hi there, how can I help you today?" Can you tell me who exactly won the French Open last week? "Carlos Alcaraz won the French Open last week. He had a remarkable comeback, winning the final against Jannik Sinner after losing the first two sets."

Okay, so what you saw here was a very simple demo of Gemini, where I asked who won the French Open last week, and it came back to me with a very simple response: that Carlos Alcaraz won it over a five-set game. What happened here was that the LLM was doing the conversation with me, and the LLM retrieved data from Google Search and came back to me with the right response. So Gemini allows you to ground against Google Search, and you just got to see that in action. Okay, with that, let's go back to the slides.

So that was Gemini in a nutshell: how simple and easy it is to test out Gemini. I would very strongly recommend that all of you try out AI Studio today, because it really helps you get up and running very quickly. What you saw when I was talking to the model was a Gemini 2.5 family model. With Gemini 2.5 we have reached this unique phase where what we are building now are called hybrid models. Until now you had Gemini 1.5 and Gemini 2.0. These models did a very good job of executing against multiple sorts of tasks. With 2.5 you now have the ability to get enhanced performance, but also thinking, or reasoning. So you can take advantage
of all of that to get better responses as well. More than that, a lot of the use cases I see when I talk to startups today are: can I use Gemini as someone I can pair program with? Can I work with Gemini to come up with a few ideas for this use case I have in mind? Gemini is extremely powerful in all of those scenarios. In fact, today, when you go and look at LMArena, Gemini 2.5 Pro tops the leaderboard, and by quite a margin. We've announced so many new versions of it over the last month that we have managed to stay the leaders on the leaderboard. But more than that, I would very strongly recommend that you look at using AI Studio. You can just type ai.dev in your search bar and get started right away and play around with Gemini 2.5. Just get a feel for what the model can do for you, get a feel for the multiple use cases that are possible. That is one thing I would very strongly recommend all of you do.

Now, what exactly did you see me doing in that demo? This gives you a sense of it. There was a multimodal input that went to the model. As you can see on this slide, it can be text, image, video, or audio; I sent an audio input to the model. The model used its ability to connect to Google Search to give me a response. But that's not the only thing Gemini can do for you; you have other features as well, things like multimodal understanding and generation. It understood the audio input, understood the crux of the question, and then generated a response in an audio format for me. Gemini has a massive context window. This is where we are very unique compared to everyone else in the market: we have a 1-million-token context window that you can take advantage of. But more than that, Gemini also allows you to connect
with tools in a native manner, and if you're familiar with MCP, you can use MCP with Gemini as well. So that's another thing to potentially try out.

Which then brings us to ADK, the Agent Development Kit. What exactly is ADK? ADK is our flexible and modular framework for developing and deploying AI agents. ADK is not a solution that can only deploy Gemini products or Gemini models. You can use ADK with all the popular LLMs out there. Take Claude: if you want to use ADK to build an agent that's powered by Claude, you can do that. And with ADK, the advantage you get is the tight integration with the Google ecosystem, which I'll talk about in a bit, which is where we talk about Vertex AI as a platform of choice for you.

Now, where does ADK sit? It sits somewhere in the middle in terms of complexity and ease of use. With LangChain and LangGraph you get a little bit more flexibility. And then you have other solutions: I don't know if you folks have built out agents from scratch, or, let's say, a simple bot from scratch. If you have, you will remember how painful it can sometimes get to behave like a conversational linguist who has to design an entire conversation flow going from point A to point B, understand what potential questions could be asked, and understand potential responses. There's quite a bit of pain involved in doing all of that when you're trying to build from scratch. With ADK, what you're getting is the flexibility of residing somewhere in the middle, where you can very quickly prototype and build an agent; you can very quickly combine the model, combine the tools, wrap it all up together, and expose it as an agent. In fact, what I'll do now is go ahead and show you some code and execute that code
and show you how easy it is to actually build an agent with ADK. Give me a second. Before I do that, I just want to show you a few quick things. This is the landing page for ADK. This is super new, and by super new I mean it's just a few months old, so I would very strongly recommend that, in addition to the two things that I have mentioned, which is to use AI Studio and Gemini, you look at ADK as well. We will share these links with you, so do take a look at this. The second thing is we have a lot of very simple samples that you can take advantage of. This is a very well-designed code repository that you can take a look at, which has a lot of ADK samples. It will walk you through how you can build ADK agents for simple use cases like, let's say, customer service, academic research, or a simple RAG solution. For all those sorts of agents, we've already built examples that you can go ahead and deploy and play around with. Now what exactly will I show you? Well, something very simple. I'll go ahead and share my editor. So this is the Google Cloud Platform itself, and you will see the editor here. Let me just pull this down. This is what we call the console. Most of you folks who are using Google Cloud Platform today will be very familiar with this: you can use this as a way to look up resources, look up what you've already built, or maybe try to build something, so this would be your starting page for all of it. What I am now pulling up here is the editor. This is an editor that sits within the GCP console itself. Okay, now what exactly have I done here? I have gone ahead and created a very simple multi-tool agent, which you can see on the left-hand side. I'll then pull up the code for agent.py. So this is the very simple agent that I have built, and all that it does is get the weather details and the current time of a given city for me. And with ADK, what we're doing is we are providing
these two specific functions, wrapping them up into tools, combining that with a model, and creating an agent for you. So anyone who is familiar with Python code will notice that there's nothing new, at least at this point. This is where the actual ADK bit really comes in. This is all you would be doing when you're trying to wire up two tools to build an agent: you just name the agent, identify which model you want to use as part of the agent, give a description for the agent, give it a specific instruction of what exactly the agent should be able to do for you, and then pass the two functions as tools for your agent. So, very simple: you just use those two existing Python functions, and it combines them all into a simple agent for you. Okay, now what I'll do is I'll actually go ahead and execute this and show you how you can interact with it. Just for the sake of the demo, I've already installed ADK and I have all this running in a virtual environment, so I'll just go ahead and type adk web, and I'll wait for the server to get up and running. Okay, now the server is up. I'll go ahead and just preview this. By the way, just to let all of you know, this ability to work with the agents through the web comes with ADK. I have not built, let's say, a front end of sorts; this is just ADK's own way of getting you to interact with its agents. So I'll just go ahead and type a very simple hello. Okay, I think there is an issue with my agent. So that's the danger with live demos: there's always one demo that seems to break at the wrong time. But just going back to this, what will actually happen here is that it will interact with the agent and retrieve the details for, let's say, the weather or the time for a given city. In fact, sometime later in the session I'll get this demo up and running and I'll show this to you folks again. Okay, with that I just come back to the session
again and share the presentation deck. Okay. So now, just going back to the architecture that we were talking about at the beginning, you have ADK sitting as part of the orchestration layer. And again, we will go very deep into ADK in week three, if I remember this correctly, where we will walk you through a level 300 session in which we take an idea and build an agent using ADK. But at a very high level, just for all of you to understand what exactly it is that you get with agents: you get all of the pieces that I was showing you. There's the instruction, which in the case of the demo that I was trying to show was very simply "you are a helpful agent that can help with weather and time"; the tools, which in this case were those two functions that I had created; and then the model, which in this case is Gemini 2.0. So it just wrapped all of it together as a single agent. You will see two more terms on this slide. One is Model Context Protocol, or MCP. I'm assuming most of you folks have heard of MCP in some shape or form, but MCP basically is the ability for you folks to interact with multiple tools and multiple agents as well, so that also is something that we support through ADK. And then we have also announced support for A2A. A2A, or the Agent-to-Agent protocol, is a protocol that we at Google have designed that allows agents to talk to each other. So again, just to take a step back and repeat myself: you have the instructions, which is basically what the agent should do; then you have the tools, which is basically how the agent can execute the instruction; and then you have the model that powers all of it. All of these are combined together as a single agent. You then have A2A, which allows agents to talk to each other, and then you have MCP, where one given agent can talk to multiple other APIs or also have grounding set up. Okay, with that let's get
to the last bit, which is Vertex AI. So we spoke about Gemini and the Gemini family of models, and you got to see the 2.5 model in action. Then you saw ADK, which is the ability to actually build something with the model. Now with Vertex AI, this is where you think about how to build a production system. I'm assuming some of you folks are familiar with Vertex AI; if you're not, that's absolutely not a problem. Vertex AI is Google's end-to-end machine learning platform, and when I say that I truly mean it. If you look at the entire ML workflow, think of steps like pre-processing your data, identifying the right model, designing experiments, running a pipeline, triggering a training job, triggering an inference job: all of these are activities that a lot of ML engineers and architects would be doing day in and day out. Vertex AI is the single platform for all of it, and you can use Vertex AI as a way to power all of those activities that you might be doing in your respective startups. And this is the Vertex AI stack. We will focus on a very specific part of it for the sake of this session, but this entire stack comprises a bunch of things. Starting from the bottom and going all the way up, you have Model Garden. Model Garden is Vertex AI's single source for all the models. It could be Google's own models, so that's the Gemini family of models that you saw a little bit earlier. Then you have the open and partner models: let's say you want to use the Llama models, or Claude, or Mistral, or anything from Hugging Face for that matter. All of those models can be used through Vertex AI itself, so it becomes your single source for all the models in the model ecosystem. And as of today, unless things have changed as of this morning, we support more than 200 models through Model Garden. Then going up one layer of the stack
you have the Vertex AI Model Builder, so this is where you actually interact with the models. You have the ability to design your prompts: you can use Model Builder to come up with your prompts, you can use it to manage your prompts as well, and you can use it as a way to run an inference job or even serve a response to the end user. So all of that is possible. The other thing is fine-tuning the model. And you know what, let's take a step back: what exactly is fine-tuning? These models are trained on massive data sets, but they are general purpose in nature. Let's say you want to build for a very specific use case, say a model which is an expert in existing manufacturing processes. The Vertex AI Model Builder will allow you to tune the model, where you bring a small amount of data and use that as a way to make a general-purpose LLM a specialist LLM. Tuning gives you that ability, and that is possible through Vertex AI Model Builder. You then have the ability to distill a model. Distillation is basically how a large model can teach a specific skill to a smaller model. This is a very powerful thing, because let's say you are trying to save on inference cost or infrastructure cost, and let's say you potentially want to run models at the edge. It's not possible to use Gemini or Gemini-scale models all the time. With distillation you can get Gemini-like abilities in substantially smaller models, which can then be deployed anywhere. And then you have the last bit, which is Agent Builder, where you can build an agent just using the UI, or you can build custom agents as well. This is the core part that you will end up using as you start building out your agentic solutions on Google Cloud; this is the part that you'll
end up using quite a bit. With Agent Builder you have access to extensions, to connectors, to retrieval engines. So let's say you want a RAG solution for your agent: you can use Agent Builder for that. Let's say you want grounding enabled for your agents: you will use Agent Builder for that. So Agent Builder becomes your single solution for all of those experiences. Specifically, what I want to talk about today is Agent Engine. Now let's say you've built your agent: you've used ADK, you've used your model, you've used a few Python functions. That's not the only way to build agents, by the way, but just for the sake of continuing what I was showing you a little bit earlier, let's say you have done all of that. Then what? Once you have that piece of code, that actual agent needs to be sitting somewhere. Now you can go ahead and deploy that agent on your own infrastructure, but there's a lot of work involved in doing all of it. Things like: how do you ensure that your agents are scalable? Because that's where things can get a little bit painful for you. So with Vertex AI Agent Engine, that's the first advantage: we will scale the agents for you. We give you APIs that can be used to manage and query the agents and to scale them as well. The second thing is security. A lot of you might be in industries where security and compliance are a big deal, so how do you ensure that the right best practices are being followed? Vertex AI Agent Engine does that out of the box. The next thing is session management and memory management. Especially when you're building multi-agent solutions, how do you track session state across multiple agents? How do you ensure that there is consistency across that entire flow? Now again, you can do that yourself, but with Agent Engine you get a solution that allows
you to do all of that. The next thing: I spoke about security, but there's also monitoring and logging. How do you monitor the actions and activity of the agent, and how do you analyze all of that? Again, Agent Engine can help you with all of it. So there's so much that you get as part of Agent Engine, and the best part is that it's not limited to the Google family of models. It doesn't have to be working with just Gemini; you can use it with any other model in the Vertex AI Model Garden. Let's say you have used your own fine-tuned model or any of the open-source models: you can use it for all of them. And again, from a framework standpoint, it need not be ADK from us. It need not be the Agent Development Kit that I was showing you a little bit earlier; it could be LangChain, it could be LangGraph. So you can potentially have a scenario where you use Gemini with LangChain, build an agent, and deploy that on top of Agent Engine as a way to build in a scalable and secure manner. That is absolutely possible; in fact, that is what we would very strongly recommend as you start looking to productionize your agents. So what does it all look like? This is the entire flow, where you have the agent framework, which could be ADK, LangGraph, LangChain, or CrewAI. These are the most popular ones that I see day in and day out when I'm talking to startups in the ecosystem today. Obviously you have LlamaIndex as well, and other frameworks too, and all of these frameworks can work with Agent Engine. Then you have the tools. The tools could be ones that you have built, completely custom. What we have also done as part of the Vertex AI platform is release a bunch of Google Cloud tools that you can use as a starting point. You can also use MCP; we have support for that, so tools exposed by an
MCP server can also be used by your agent, and then there are the OpenAPI tools as well. All of these tools that you see on the slide can work in conjunction with your framework and the Agent Engine setup. And then you have the models. Like I said, it could be any of those models; it need not be just Gemini. And then finally, all of this sits on top of Agent Engine as your runtime solution, where it's fully managed and fully secure. Patching, all of that is managed for you, so you don't have to manage actual infra when you're deploying your agent. And context management, as I was saying on the previous slide, is completely managed for you. You can also take advantage of one very important thing that comes with Agent Engine, which is the ability to evaluate your agent. We have something called the Vertex AI evaluation service, where you can track the ability of the agent over a period of time to see whether or not the responses are meeting a certain standard. Okay, and I personally feel like this slide does a very good job of capturing everything that we have talked about today. It just puts it all together as a single thing for you to understand what all is possible. So where exactly does Agent Engine fit in? You can see that red box on the slide: that's the deployment part, that's where you would deploy your agent on top of Agent Engine. Again, a lot of these things will be talked about a little bit later, so you don't have to follow along and understand everything right now, because I don't expect you to, but we will go into a little bit more detail in the next few sessions. Okay, that brings me to the end of my session. Thank you, folks, for following along, and as far as the demo that didn't work, I'll try to power it up again and see if I can get it to work and show it to you guys. Thank you. Thank you so much, Proful, that was fantastic. Now I'm going to let you catch your breath a little
bit, take a moment, and I will walk through a few more bits of information on the Google Cloud front. First off, for any startup that's early stage, we have something called the Google for Startups Cloud Program, which I actually own in EMEA. As you're building out, learning more about all the offerings at Google Cloud, and really trying to get your hands around our GenAI tooling, you can start learning about the different products and solutions and apply for the credits that we have available through the Cloud Program. You can see the QR code there as well as the website. These are credits that you can use on both standard GCP offerings as well as any of our GenAI first-party tooling. So again, you can click on the link or use the QR code to understand a bit more of the nuances around the program and the other benefits, not just those related to credits. This whole program is really tailored to early-stage startups across all of the journey and stages, in terms of funding or product development, so we go up from $2,000 all the way to $250, depending on if you're an AI-first startup, if you're Web3. So again, you can learn more about the program with these links. We have covered a lot of information in just one session, and I know the challenge with this sort of session is that you're taking a lot of the excitement and knowledge back to your teams and to your work, and you have to get a lot of buy-in from them. Sometimes you just need a simple, powerful, and visual way to share this vision and this information and show them exactly how to get started. That's why I'm so excited to share the AI agent handbook we have. It's not just a recap of what we discussed, it's a strategic asset for you. It's an incredibly well-designed, easily digestible guide that you can forward to your team leads or your colleagues to help spark their imagination. We have 10 specific, actionable use cases, and it makes the potential of AI
agents tangible for everyone, regardless of role. Use it as your go-to source to build momentum and start a real conversation about implementation in your organization. The link to download should be in the chat now, and you can also use the QR code on the screen. Consider it your first tool for driving adoption. If you want to keep the conversation going before, during, or after class, please join our AI school Discord channel to connect with everyone. Now I will welcome Proful back to the virtual stage, and we've got some great questions from our audience. Hey Proful, how are you doing? Good, good. I think there was a lot that we covered, and I hope folks are following along, but more than that, I hope you folks are all excited about what is going to come very soon over the next few sessions. Awesome. So we will jump into the Q&A. Question one comes from Carolyn: can I build even when I'm not an engineer, or if I'm a non-technical founder? I think absolutely. Today you would have seen the very first demo, a very simple vibe coding example, where I just gave it a simple question and it designed the entire code for you. That's exactly what is happening today as we speak. There are a lot of startups, a lot of founders, and a lot of companies that are now building this as a culture within their organizations, so it's absolutely possible. Now from a solution standpoint, there are so many players in this space. We are one of the players as well, but just to be a little bit neutral about this, Cursor, for example, is a great example of using vibe coding as a way to build applications. When it comes to a Google-native way of doing this, we have something called Firebase Studio, so do try out Firebase Studio. You can just go to firebase.studio, type that in your search bar, and that will allow you to go ahead and just type a very simple prompt. You can say things like "build a calculator app for me in Python"
and it'll just go ahead and build the entire solution for you in a matter of seconds. So it's absolutely possible, Carolyn, to answer your original question. In fact, a lot of people who are non-technical are using this as a way to revive their undergrad days and what they missed out on. Yeah, awesome. And I guess, sort of piggybacking on that first question, how might a user then deploy that example agent that you created, the travel chatbot, onto a server? Absolutely. In some ways the two are linked to each other, depending on the platform that you use for the first one. So let's take the example of Firebase, which is Google's way of building applications in a vibe coding manner. Firebase Studio has a publish feature that you can use to publish the entire code that you have created automatically onto Firebase. You don't need to do anything yourself; all you need is a credit card ready to set up your billing account, and that's it. You're not really becoming a DevOps engineer or an infrastructure engineer to build all of this; that will be managed for you through Firebase Studio. Now for other agents, that's where Agent Engine becomes powerful. And the great thing is that the Agent Development Kit that I was showing you folks a little bit earlier has this native integration with Agent Engine, so you can use ADK itself to build everything and use ADK itself to deploy to Agent Engine. So that becomes even easier, and this is for your developers. My first answer was for the non-coders and the ones who don't want to be DevOps engineers, and this one is for those of you who have engineers within your organizations and might want an easy way of deploying this. That's where Agent Engine becomes your answer. Perfect, thank you. And then our last question comes from defeat, and he's really trying to understand how audio works, especially as you compare it to speech. So, do they
work the same way? Is it sort of the same as, like, a Web Speech API? And then also, how is it answering in an emotional voice? So that's what has happened because of the latest launch of Gemini 2.5. Now if you take a step back, in December of last year we announced something called Gemini 2.0, and that was the first time we said that voice will be generated in a native manner. What I mean by that is you give a voice input to a model and you get a voice response back. This is not text-to-speech, let me be very clear. This is not a model that's giving you a response in text, with me then using something on top of it to generate speech. That's not what's happening. In fact, that's another reason why I recommend that all of you try out AI Studio: go to ai.dev, click on the Stream option on the left-hand side, and try it out yourself. You're streaming an input to the model and the response is being streamed back to you, and it could be text or it could be audio as well. So I'm not doing anything special to generate the response. Now obviously, because this is a streaming solution, I might use something like a WebSocket to have consistency in my interaction with the model, but except for that I'm not doing anything else under the covers. Gemini is doing all of this: it's trained on these data sets and it has this innate ability to generate the response. Now to answer the bit about emotional voices, this is the newest feature that we have announced, as of 20 days ago. I think it was on the 20th of May, if I'm getting my dates right, that we announced this feature called native audio. Till now we had an audio response available, but with native audio the voices have become very humanlike. The voice you would have heard was the Zephyr voice from the Gemini 2.5 model. We support multiple voices for the Gemini 2.5 family of models across 20-plus languages, if
I remember my numbers right. But yeah, it's all native; there's no middleware sitting in between that's doing the conversion for you. Perfect. Well, thank you so much for answering our questions and hosting the session, Proful. I'm sure everyone really appreciated it. With that said, I think we are all set to wrap everything up. Great, thank you, folks. Thank you once again. Thank you so much to Proful for walking us through that. I know I'm inspired to start building after this session. With that being said, I just wanted to let you know that next up we have "Architecting Intelligent Agents: Building the Foundation for Success", which will be at the same time as today's session, on Thursday, June 12th. If you haven't already added it to your calendar, now's the time to do so to make sure you don't miss out. Of course, it'll also be available for you on demand after we complete it. It'll be packed full of practical tips from our AI expert Matias, so you definitely don't want to miss this one. Have a great week, and looking forward to seeing you in the next class. Thank you so much, everyone.
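
For readers following along without the screen share, the multi-tool agent described in the demo (two plain Python functions, wrapped as tools, combined with a model, a description, and an instruction) might look roughly like the sketch below. This is not the speaker's actual agent.py. The lookup tables, the agent name, and the exact model string "gemini-2.0-flash" are assumptions made for illustration; the talk only says the model was Gemini 2.0 and quotes the instruction text. The ADK import is guarded so that the two functions can still be run where ADK is not installed.

```python
import datetime
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

# Hypothetical toy data standing in for real weather and timezone services.
_WEATHER = {"new york": "sunny, 25 degrees Celsius"}
_TIMEZONES = {"new york": "America/New_York"}


def get_weather(city: str) -> dict:
    """Return a canned weather report for a city (toy data, not a real API)."""
    report = _WEATHER.get(city.lower())
    if report is None:
        return {"status": "error", "error_message": f"No weather data for {city}."}
    return {"status": "success", "report": f"The weather in {city} is {report}."}


def get_current_time(city: str) -> dict:
    """Return the current local time for a city listed in the toy timezone table."""
    tz_name = _TIMEZONES.get(city.lower())
    if tz_name is None:
        return {"status": "error", "error_message": f"No timezone known for {city}."}
    try:
        now = datetime.datetime.now(ZoneInfo(tz_name))
    except ZoneInfoNotFoundError:
        # The host system has no timezone database installed.
        return {"status": "error", "error_message": f"No timezone data for {tz_name}."}
    return {
        "status": "success",
        "report": f"The current time in {city} is {now:%Y-%m-%d %H:%M:%S %Z}.",
    }


try:
    from google.adk.agents import Agent  # provided by the google-adk package
except ImportError:
    Agent = None  # ADK not installed; the plain functions above still work.

if Agent is not None:
    # Name the agent, pick a model, describe it, instruct it, and attach the tools.
    root_agent = Agent(
        name="weather_time_agent",  # assumed name
        model="gemini-2.0-flash",  # assumed model id
        description="Answers questions about the weather and time in a city.",
        instruction="You are a helpful agent that can help with weather and time.",
        tools=[get_weather, get_current_time],
    )
```

With ADK installed and this saved as agent.py inside an agent folder, running adk web from the parent directory, as in the demo, should start the local browser UI for chatting with the agent.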