Transcript for:
Building Advanced Research Agents with LangGraph

Today we're going to look at how we can use LangGraph to build more advanced agents. Specifically, we're focusing on the LangGraph that is compatible with v2 of LangChain, so we're using all the most recent methods and ways of doing things in the library, and we're going to build a research agent, which gives us a bit more complexity in the agentic graph we'll be building.

When I say "research agent", what I'm referring to is essentially a multi-step AI agent, similar to a conversational ReAct agent, but rather than aiming to provide fast back-and-forth conversation, it aims to provide more in-depth, detailed responses to the user. In most cases, people don't mind a research agent taking a little longer to respond if that means it references multiple sources and returns well-researched information, so we can afford to wait a few extra seconds. And because we have less time pressure with these research agents, we can also design them to work through multiple research steps, referencing different sources, and going through those sources multiple times: it can do a Google search, reference arXiv papers, and keep doing that, and so on. At the same time, we might also want to make our research agent conversational — in many cases we still want to allow a user to chat with it — so as part of the graph we're going to build, we also want something simple where it can just respond.

Okay, so let's start by taking a quick look at the graph we're going to build. We have the start of the graph, which goes down into what we call the oracle. The oracle is our decision-maker: it's an LLM with a prompt and with access to each of the different nodes you can see — rag_search_filter, rag_search, fetch_arxiv, web_search, and final_answer. The oracle decides, given the user's query, what it should do. For example, if I say something like "hello, how are you", hopefully it will go straight over to final_answer, provide an answer, and end the execution of the graph. But if I ask something that requires a little more detail — say I ask about a particular LLM — what it might do first is a web search to find out about that LLM, which returns information to the oracle, and the oracle decides: do I have enough information here, or is there anything in these results that's particularly interesting? For example, is there an arXiv paper mentioned in the Google results? It might decide to give us an answer straight away, or it might decide to refer to our arXiv papers database, which we have within the RAG components — one of those two — and we can also fetch an arXiv paper's abstract directly. So maybe it looks at the abstract, decides it's relevant, and then refers to rag_search with a filter: the filter allows it to restrict the search to that specific paper, which might return all the information it needs. At that point it would come over to final_answer, build us a little research report, and return everything to us.

So that's what we're building. There are a fair few steps and a fair few nodes we need to build for all of this to work, but before we jump into building the graph, I want to talk a little more about graphs for agents, and about LangGraph at a higher level. Using a graph-based approach to building agents is relatively new — at least, I haven't seen it around for very long.
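The routing just described can be sketched as a tiny table in plain Python — the node names here are assumptions matching the diagram, not LangGraph code yet:

```python
# Nodes from the diagram: the oracle plus the five tools it can route to.
TOOLS = ["rag_search_filter", "rag_search", "fetch_arxiv", "web_search", "final_answer"]

# Every tool reports back to the oracle, except final_answer, which ends the run.
NEXT_NODE = {tool: ("oracle" if tool != "final_answer" else "END") for tool in TOOLS}
```

Later on, this is exactly the edge structure we'll express with LangGraph's normal and conditional edges.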
What we had before was more like: okay, we have this way of executing agents, or we have that way. The most popular of those agent execution frameworks is called ReAct. ReAct encourages an LLM to break its generation down into iterative reasoning and action steps. The reasoning step encourages the LLM to outline the steps it should take to fulfill some objective — that's what you can see in the "thought" steps here — and the action (or acting) steps are where the LLM calls a particular tool or function, as you can see with the "act" steps. When a tool is called — for example, here we have "search Apple Remote" — we of course return some observation from it, which is fed back to the LLM so it now has more information. That's very typical: the ReAct framework has been around for a fairly long time and has by and large been the most popular agent type, and in some form or another it found its way into most of the popular libraries — LangChain, LlamaIndex, Haystack — they all ended up with ReAct or ReAct-like agents.

Now, the way most of those frameworks implemented ReAct is with an object-oriented approach, which is nice because it just works very easily: you plug in a few parameters — your system prompt, some tools — and it can just go. But it doesn't leave much in the way of flexibility, and it also makes it hard for us to understand what is actually going on. We're not really doing anything; we're just defining an object in one of these frameworks and the rest is done for us. That means we miss out on a lot of the logic happening behind the scenes, and it becomes very hard to adjust for our own particular use cases.

An interesting solution to that problem is, rather than building agents in this object-oriented way, to instead view agents as graphs. Even the ReAct agent can be represented as a graph, which is what we're doing here. In this ReAct-like agent graph, the input from the user at the top goes into our LLM, and we ask the LLM to reason and act in the same step (so it's ReAct-like, not necessarily ReAct). What it's doing is saying: I have this input, and I have these tools available to me — tool A, tool B, or a final-answer output — which of those should I use to produce an answer? Maybe it needs tool A, so it generates the action to use tool A, and we return an observation back to the LLM. Then it can go ahead and maybe use another tool — it could use tool B, get more information — and then say: okay, I'm done now, I'll go through to my final answer and return the output to the user.
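The ReAct-like loop just described can be sketched in a few lines of plain Python — this is a toy illustration, not LangChain's implementation; `llm_decide` stands in for the model call:

```python
# Minimal sketch of a ReAct-style loop: the LLM repeatedly picks a tool,
# we execute it, and the observation is fed back until it answers.
def react_loop(llm_decide, tools, user_input, max_steps=5):
    """llm_decide(user_input, scratchpad) -> (tool_name, tool_input)."""
    scratchpad = []
    for _ in range(max_steps):
        tool_name, tool_input = llm_decide(user_input, scratchpad)
        if tool_name == "final_answer":
            return tool_input
        observation = tools[tool_name](tool_input)  # act, then observe
        scratchpad.append((tool_name, tool_input, observation))
    return "max steps reached"
```

The graph-based version we build later makes exactly this loop explicit as nodes and edges, which is what lets us modify it.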
So that's just a ReAct-like agent built as a graph, and when you really look at it, it's not all that different from the research agent we're building. It's similar in some ways to a ReAct agent, but we have far more flexibility in what we're doing. If we really wanted, we could add another step or modify the graph — we could say: after doing a web search, you must always do a RAG search, and only after doing that can you come back to the oracle. We can modify the graph to make it far more specific to our use case, and we can also just see what is going on. For that reason, I really like the graph-based approach to building agents. I think it's probably the way to go when you need something more custom, you need transparency, and you want that degree of control over what you're doing.

That brings us to LangGraph. LangGraph is from LangChain, and the whole point of it is to allow you to build agents as graphs. As far as I'm aware, it's the most popular of the frameworks built for graph-based agents, and it allows us to get far more fine-grained control over what we're building. So, at a very high level, let's have a quick look at a few of the components we'd find in LangGraph. To understand LangGraph — and of course to build our agent — I'm going to be going through this notebook; you can find a link in the description of the video, or I'll leave a link in the comments below.

Getting started, we just need to install a few prerequisites. First we install graphviz and its related libraries — everything here is just so we can visualize the graph we're going to build. You don't need to install these if you're just building things with LangGraph; it's purely if you want to see the graph you're building, which I think is, to be fair, very useful when you're developing something, just so you can understand what you've actually built, because sometimes it's not that clear from the code. Visualizing things helps a ton. There are also quite a few Python libraries we'll be using, because we're doing a lot here: Hugging Face datasets, for the data we're putting into the RAG components; Pinecone, for the RAG components; OpenAI, for the LLM; LangChain in general (note that we're using v2 of LangChain here); LangGraph; semantic-router, for the encoders; SerpAPI, for the Google search API; and pygraphviz, for visualizing the agent graph.

Before we start building the components, let me go through — again at a high level — what each of them is actually going to do. The fetch_arxiv component, given a paper ID, returns the abstract of that paper to our LLM. The web_search component provides our LLM with access to Google search for more general-purpose queries. The rag_search component searches a knowledge base of AI arXiv papers that we'll construct — it's the access route for our LLM to that information. Very similar is the rag_search_filter component: it accesses the same knowledge base, but adds a filter parameter so that if, for example, the agent finds an arXiv ID in a web search that it would like to filter for, it can do so with this tool. Finally, we have the final_answer component, which is basically a custom format for our final answer: it outputs an introduction, research steps, report, conclusion, and any sources the agent used, all in JSON, and we simply reformat that into plain text after the fact.

We're going to set up the knowledge base first. For that we use Pinecone and the AI arXiv 2 chunks dataset — a prebuilt, pre-chunked dataset of AI arXiv papers. It takes a moment to download, and what you'll see inside is something like this — I think this one is the authors and abstract of the Mixture of Experts paper. It was all constructed using semantic chunking, so the chunks you'll see vary in size, but for the most part they should be relatively concise in what they're talking about, which is ideally what we want from chunks. The first thing we need in order to build the knowledge base is the embedding model: we're using OpenAI's text-embedding-3-small, so you'll need an OpenAI API key, which you get from the OpenAI dashboard, and then we also need a Pinecone API key. Once those are entered, we're going to connect to — I think us-east-1 is the free region, so you should use that.
What we do next is create a new index. To create an index we need the dimensionality of our encoder, which is 1536, and then we create the index: we have the dimensionality; the index name, which you can make whatever you like; the metric, which should be cosine or dot product (I think you may also be able to choose Euclidean with the embed-3 models); and then we specify that we'd like to use serverless. I already have this index created, so you can see here that the vectors are already in there — 10,000 of them. That 10,000 comes from this cell: you can index the full dataset, but I think it's around 200,000 records, which will take a bit of time, and there's also the cost of embedding, so it's up to you — 10,000 is fine for this example. You just need to run this; I'm not going to rerun it because I already have my records in there. With that, our knowledge base is ready and we can move on to all the graph stuff.

The first thing is graph state. At the core of a graph in LangGraph is the agent state. The state is a mutable object that we use to track the current state of the agent execution as we pass through the graph. We can include different parameters within this agent state, but in our example I want to be very minimal and include only what we really need. In here we have the input, which is the actual input from the user; the chat history, since we do want this to be more of a conversational research agent; and intermediate_steps, which is where I'll be tracking what's going on within the graph. Probably the main confusing thing here is the operator.add annotation and how we're constructing intermediate_steps: essentially, operator.add tells LangGraph that when we pass things back to the intermediate_steps parameter of our state, we don't replace intermediate_steps with the new information — we add that information to a list. That's probably the main slightly different thing to be aware of there.

Now we're going to start on our custom tools. As I mentioned before, we have those different components, and the first of them is relatively straightforward, so we'll build it first: fetch_arxiv. Given an arXiv ID, the fetch_arxiv tool returns the abstract for that paper. All we do is import requests — we're going to test it with the Mixture of Experts paper here — and all we really need is a GET request to the export.arxiv.org site, passing in the arXiv ID. When we do that, we get back something like this. This isn't everything — let me expand it; it's relatively big, and there's quite a lot going on. What we need to do, within this mess, is extract what we actually want: the abstract, which is somewhere in here — I'm not entirely sure where — but we can extract it with this regex. We use that, and there we go: that's our abstract. So it's a relatively straightforward tool: it takes in the arXiv ID, and we get the abstract for the paper.

Now, the way LangGraph works is that it expects all of this to be built into a tool, which is essentially a function that consumes a particular value — the arXiv ID in this case — and is decorated with the @tool decorator from LangChain. We specify the name of our tool there; I'm keeping things simple and using the same name for the tool as for the function. I run that, and with that our tool is ready and we can test it.
To run any tool, we call invoke, and to invoke we pass a dictionary that must align with the parameters we set for that particular function.

The next component is web search. For web search we're going to use SerpAPI, and again we need an API key. I don't remember exactly where to get it, so I'll give you the link: you'll need to create an account, go to serpapi.com, then "manage API key", and you'll get an API key to enter here. Then we can do a search like this: we have a query — we're searching for "coffee" here — and we get all of our results back in a dictionary. What we want to do is restructure that into a string that's a bit more readable, which is what I'm doing here, and using that we get something like this. Cool. Again, we put all of that into a tool. One thing I didn't mention before is that we also use the docstring of the tool to provide a description to our LLM of when to use the tool, how to use it, and so on. Same thing: we initialize that.

Now, the RAG tools — we have two of them: plain rag_search, and rag_search with a filter. Let's go through them; we'll create the tools first. For the filtered one, I have a query, which is a natural-language string, and also the arXiv ID we'd like to filter for. All we do is encode the query to create a query vector, run the query, and include our filter. From the results, we use format_rag_contexts to format the responses into another string — the title, content, arXiv ID, and any related papers for each record returned. We also have rag_search, which does the exact same thing but without the arXiv ID filter, and it returns a smaller top_k — I'm not actually sure why I did that, but I'm going to keep it; you can adjust those values if you'd like to return more.

Finally, we have final_answer. The way I've done it here is as basically another tool, and the reason I've set it up as a tool is that there's a particular structure I want our LLM to output everything in. We're also using the docstring here to describe a little of what we want for each field. Note that the body here isn't even used — you could remove it; it doesn't matter. I'm returning nothing, and the reason is that we're actually going to take what the LLM generates when it calls this tool straight out of our graph. Then there's another function, which you'll see in a moment, that takes that structure and reformats it as we'd like. We could also do that restructuring inside the tool and just return a string, but I've left it outside the graph for now.

So that's all of our components — we've now covered basically all of them. Next, let's take a look at the oracle, the main decision-maker in our graph. The oracle is built from a few main components: the LLM, the prompt (which you can see here), and binding the LLM to all of our tools, then putting it all together. There are a few parts, which we'll go through quickly. The first is the prompt: we're using LangChain's prompting system — the ChatPromptTemplate — and we also include a MessagesPlaceholder, because that's where we're going to put our chat history, plus a few other variables as well. So we have the system prompt and chat history, followed by our most recent user input, and then we follow up with the scratchpad.
The scratchpad is essentially where we're going to put all of the intermediate steps for our agent. You can imagine it: you have your system prompt, the chat history, the most recent user message, and then, following that, we'll be adding assistant messages saying "okay, I'm going to go do this search", "I've got this information", and so on. So that's our prompt.

Then we set up our LLM. We're just using GPT-4o here — nothing special going on — but one thing that is important (let me import something quickly and rerun) is the tool_choice option. Setting tool_choice to "any" essentially tells our LLM that it has to use a tool. Otherwise, the LLM may use a tool — it may use one of the components — or it may just generate some text, and we don't want it generating free text; we want it to always use a tool. Even if it just wants to respond, it has to go through the final_answer tool, so we're forcing it to always use a tool. That's what tool_choice="any" does.

We can go through the rest quickly as well. We have the LLM and the tools; the scratchpad function looks at all of the intermediate steps we've taken and reformats them into a string — the name of the tool that was used, the input to that tool, and the result (the output) from that tool — and all of that goes into the agent scratchpad section of the prompt. Then we use the LangChain Expression Language to put everything together: our oracle consists of the input parameters — input, chat_history, and scratchpad — which are fed into our prompt. If I go to the prompt, you can see it uses chat_history, input, and scratchpad, the exact same parameters, so they need to align. They populate our prompt, and the prompt is passed over to our LLM, which has access to the tools we defined. And that's our oracle.

Now we can test it quickly to confirm it works. We run it, and we basically get a ton of output from the model. The AI message content is actually empty, because what we really want to see is which tool the oracle decided to use — and there it is: the name is rag_search, so it's deciding to use the rag_search tool. (We're not training this to give us facts about dogs, so it's not perfect tool usage, but anyway.) We have the rag_search tool and the input query, "interesting facts about dogs", and it's going to go and search for that. You can keep rerunning it and seeing what it comes out with — it will probably vary now and again, because there is of course some degree of randomness in there.

Now, our oracle is going to output a decision to use a tool, and when it does, we want to look at that output and say: it wants to use the rag_search tool, so let's go execute the rag_search tool; it wants to use the web_search tool, so let's go execute that. That's what our router does, which you can see down here. The router literally consumes the state, looks at the most recent intermediate step — which will have been output by our run_oracle function (you can see it returns intermediate_steps containing the action output) — and returns the name of the tool the oracle chose. I'm going to run that; we don't need these extra cells.
With that, the only remaining thing we need to turn into a function — which we'll use to add to our graph in a moment — is the run_tool function. We could split all of our tools into multiple functions, kind of like how we have our own function for run_oracle, but all of these tools can be executed using the same bit of code, so there's no real reason to do that. Instead, I define a single function that handles all of the tools. Again, it takes the state and looks at the most recent intermediate step for the tool name and tool input that were generated by our oracle. I'm going to print those out so we can see what's actually happening when it's running. Then tool_str_to_func is just a dictionary that takes the tool name, maps it to a function, and we invoke it with the input we extracted. After running that, we create an AgentAction object, which is basically a way of describing what happened — which tool we used, what the inputs to that tool were, and the observation, i.e. the log we got back from it — and once that's done, we pass all of that information back to intermediate_steps. We run that, and now we can define our graph.

We have all the components ready, so now it's time to define the graph. We already have the agent state we defined, and we actually need that agent state to initialize our graph: we use a StateGraph object. Once we have our initialized graph, it's empty — it doesn't know about all the stuff we've just done or the components we've defined — so we need to tell it which components we have and start adding them. We do that using the graph's add_node method: we take a string name and map it to the function that runs that tool or component. For "oracle", we'll be hitting the run_oracle function, and for the others — rag_search_filter, rag_search, and so on — as I mentioned before, they can all be executed using the exact same bit of code, so that's exactly what we do: we just pass in the run_tool function for each.

With all of our nodes defined, the next step is to define which of those nodes is our entry point — where the graph starts. That is of course our oracle: we saw the graph earlier, and we always begin at the start node, input our query, and go directly to the oracle, so in reality the oracle is our entry point. Then, following the oracle, we don't have just one direction to go — we have many different directions — and the way we set that up is with our router and what are called conditional edges, which are the dotted lines in the diagram. We add those conditional edges with the oracle as the source, and the thing that decides which path we should take is our router.
Our oracle outputs the tool name, our router parses that name, and it directs us in whichever direction we should go. Then we need to add edges from our tools back to the oracle. If you look at the diagram, those are the lines saying: if you're at one of these tools, you go back to the oracle — it can't go anywhere else. It's not a conditional edge — it isn't dotted — it's a normal edge: when you're at this component, this is where you go next. All of these components except final_answer go back to the oracle, and that's what we define here: for each tool object in tools, if it is not final_answer, we add an edge from tool.name back to the oracle. Finally, we have the last edge, which goes from final_answer to the END node, exactly as you can see in the diagram, all the way over to the end. And that's the definition of our graph.

We can confirm it looks about right by visualizing it, which we do here. (Interestingly, the visualization shows the oracle with a conditional edge straight over to END — I'm not sure why that is — but for the most part this is what we're looking for, so we'll stick with it.) Everything is compiled, the graph looks about right, and we can go ahead and try it out.

Let's see what that looks like. For the first question — again not on topic for our AI researcher, but we'll just try it — "tell me something interesting about dogs". We run that, and we can see all the steps being processed, because I've put a load of print statements throughout the code. It hits the oracle, goes to rag_search, and invokes the query "interesting facts about dogs" — it's probably not finding much — so then it goes back to run_oracle, then to web_search and performs the same query. At that point it probably has some interesting information, so we go back to the oracle, and the oracle says: we have the information, let's go invoke the final answer. And in the final answer, you can see we have the introduction, the research steps, and basically the full format I mentioned before.

With that full format, I'm going to define a function that builds the report and formats it into a nicer, easier-to-read layout. That's the build_report function: it consumes the output from our graph and restructures everything into what you see here. Let's see what it came up with for the first question. There's quite a bit in there, given the question: an introduction — "dogs are fascinating creatures that have been companions to humans for thousands of years", and so on; it's a real introduction. Then the research steps that were performed: it actually says it searched arXiv for academic papers related to interesting facts about dogs, then performed a web search to gather general knowledge and fun facts about dogs from various reputable sources. Then it gives the little report, which I think looks pretty good, then a little conclusion, and finally our sources. From the sources we can see it didn't really rely much on the arXiv papers — not surprising, given that they are arXiv papers — and I assume all of these are actually coming from the web_search tool. So that's our first little research report.

Let's try something a little more on topic, though still quite general and broad: we'll ask it to tell us about AI. We run this, and it goes with rag_search first (the query is just "AI"), then back to the oracle, then web_search, then back to the oracle again, and then rag_search_filter.
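The build_report formatter described above can be sketched as follows — the field names match the final_answer tool's arguments, while the exact plain-text layout is an assumption:

```python
def build_report(output: dict) -> str:
    """Render the final_answer tool's arguments as a plain-text report."""
    research_steps = output["research_steps"]
    if isinstance(research_steps, list):
        research_steps = "\n".join(f"- {r}" for r in research_steps)
    sources = output["sources"]
    if isinstance(sources, list):
        sources = "\n".join(f"- {s}" for s in sources)
    return (
        f"INTRODUCTION\n------------\n{output['introduction']}\n\n"
        f"RESEARCH STEPS\n--------------\n{research_steps}\n\n"
        f"REPORT\n------\n{output['report']}\n\n"
        f"CONCLUSION\n----------\n{output['conclusion']}\n\n"
        f"SOURCES\n-------\n{sources}"
    )
```

The list checks let the LLM return either a single string or a list for the steps and sources; either way the output is rendered consistently.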
With rag_search_filter, it's really going for it here: it's looking at a particular paper, which I assume it got from the references in a previous step — probably the web search step, or maybe the rag_search step as well. So it goes for that arXiv paper, then another arXiv paper, and finally we have our final answer — there's a lot of information coming in here, so let's see what we get. All right, nice: we have a nice little introduction; the three research steps it performed — a specialist search on AI using the arXiv database, a web search to gather general information, and then filtering specific arXiv papers to extract detailed insights on AI-related topics. Cool. Then we have the report — I'm not going to go through all of it, but at a high level it looks relevant, and it has some recent stuff in there: GPT-4, ChatGPT, and Phi-3, so it's getting relatively recent information. Then the sources, which are probably the most interesting bit to me: an in-depth survey of large language model-based AI agents, which seems pretty relevant; "Cognitive Architectures for Language Agents", another interesting-sounding paper; the Wikipedia page for AI; and Google Cloud's "What is AI" page. So some good sources there, I think.

Let's try one more, and we'll get a lot more specific this time: I want to ask "what is RAG?". Let's see what we get. We run the oracle, go to rag_search asking about RAG, back to the oracle, then web_search, again asking about RAG, back to the oracle again, and from there it's: okay, final answer — it seems it had enough with that. Let's see what we get. Nice: "RAG is an advanced technique in the field of AI, integrating external knowledge sources". Cool. The research steps: a specialized search using the rag_search tool; a web search to obtain general knowledge and additional perspectives on RAG from various sources; then compiling and synthesizing the information to provide a comprehensive understanding of RAG. We have the little report here — looks reasonable: two main components, retriever and generator, nice — and yes, addressing the limitations of traditional LLMs, and so on. Then we have all of these sources: AutoGen, Simply Retrieve, AWS, NVIDIA, Google Cloud, and IBM. The first two seem to be the arXiv papers it found, and the rest, I assume, are from the web search. So that looks pretty good.

You can already see that LangGraph was pretty nice in allowing us to build a relatively complex agent structure, and it was, at least in my opinion, far more transparent in what you're building: we know where the prompts are, we know what tools we're using, and we can modify the order and structure of when and where we use those tools — which is nice to have, and is the kind of thing that's hidden away when you're using the more typical, or past, approach of ReAct agents as objects or classes in Python. We've seen how it all comes together quite nicely to build a research agent which, given we didn't really do much tweaking, worked pretty well: it was going to different sources, doing the web search and the RAG search, and then filtering for specific papers to get more information. Those were just a few quick tries, so it's hard to say how it would perform with a broader set of questions, but from those quick experiments it seems pretty cool, in my opinion.

I think this approach of pairing graphs and agents is just nice for all the reasons I mentioned — it works well. And it's not just LangGraph that does it: LangGraph is probably the best-known library, but I'm also aware that Haystack has something in this space at least, and I believe LlamaIndex either has something or may be putting something together — I'm not sure, but I've heard something about LlamaIndex in that space as well. Those are probably things I'll look into in the future, but for now I'm going to leave it with LangGraph and this research agent. I hope all of this has been useful and interesting. Thank you very much for watching, and I'll see you again in the next one. Bye!