A while ago, Anthropic published an article called "Building Effective Agents," in which they discuss their experiences and findings from building agentic systems for their clients. After working with dozens of teams across industries for over a year, Anthropic identified a set of agentic patterns that are effective in production and shared them in the article. Upon reading it, I realized just how valuable the insights really were; after all, they were sharing what they had learned building agents in production for over a year. I figured this is definitely not something we should take for granted, but something to actually dig into, so I decided to fully grasp these concepts by implementing the workflow blueprints from the article as practical examples in n8n. In this video I'm going to walk you through each of the workflows step by step so that you can understand them too. If you're interested in building agentic systems, understanding each of these workflow patterns is crucial to making the right design choices, so I'd definitely recommend mastering them. By the way, the article also discusses the distinction between workflows and agents, an important concept we'll explore in more detail after we've gone through the practical examples to make sure it fully clicks, so stay tuned for that as well. Without further ado, let's get started.

Before we start, I'd like to let you know that I'll be uploading this template to my Skool community, Business AI Alliance, which is entirely free, so feel free to join to get access to the resources shown in this video and my previous videos.

With that said, let's go to the first example, prompt chaining. Prompt chaining decomposes a task into a sequence of steps, where each LLM call processes the output of the previous one, which is exactly what's going on in the diagram: there are three LLMs processing different parts of a single task. LLM call 1 processes the first part of the task and generates an output; LLM call 2 uses that output to complete the second part of the task and generates its own output; and LLM call 3 uses that to process the final part of the task and produce the final response.

When should you use this workflow? It's ideal for situations where the task can be easily and cleanly decomposed into fixed subtasks. The main goal is to trade off latency for higher accuracy by making each LLM call an easier task: we're adding three LLM calls instead of one, but in return we get higher accuracy, which is what we'd want in most cases. The core idea behind prompt chaining is to break a larger, complex task into manageable steps that can be handled by multiple LLMs, so that instead of loading all the responsibility onto one LLM, each part is handled more effectively.

Let's take a look at the example I created for this blueprint. Here we're going to create a report on a given topic, and I've broken the task into three parts. In the first part we generate key points and angles for the topic ("List concise, structured key points and angles about the following topic"). We then pass those points and angles to our report planner, which uses them as a reference to create a more cohesive outline for the report: an introduction, plus the main body sections (based on the notes from the first step), and a conclusion, each with a brief description.
Once we have the outline, we pass it to our report generator, which generates the report based on the outline from the report planner. Before we see it in action: once we get the output, we send it to me on Gmail so we can see the result, and for the input I'm just using a Set node to save time. The topic we'll be researching is obesity. Without further ado, let's test this workflow.

I'll click on "Test workflow." It starts processing and generates the key points; as you can see, we're using separate models, by the way, just to show how much more flexible this can get, since we can decide which model to use at each step. We got our key points and angles generated, we pass them to the report planner, which uses them as a reference to come up with the outline of the report, and then we pass that to the report generator, which uses the outline to generate the actual report. Once it's done, it sends us an email, and we can look at the result.

All right, we received the email, so let's see what it looks like. This is the report generated for us, "Comprehensive Report on Obesity": we have the introduction as specified by the report planner, then the main body, and finally a conclusion section. With that, we have our report. Going back to the canvas, we saw how we were able to split this task into subtasks and assign them to separate LLMs, which worked together to produce the report. Another benefit of prompt chaining, besides the higher accuracy and the flexibility, is the ease of debugging.
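The three-step chain we just ran (key points → outline → report) can be sketched outside n8n as a few chained calls. This is a minimal Python sketch of the pattern, not the n8n implementation; `call_llm` is a stub standing in for a real model API.

```python
def call_llm(prompt: str) -> str:
    # Stub: a real implementation would send the prompt to a model API here.
    return f"[response to: {prompt[:40]}]"

def prompt_chain(topic: str) -> str:
    # Step 1: generate key points and angles for the topic.
    points = call_llm(f"List concise, structured key points and angles about: {topic}")
    # Step 2: turn those points into a report outline (the "report planner").
    outline = call_llm(f"Create a report outline from these notes:\n{points}")
    # Step 3: write the full report from the outline (the "report generator").
    return call_llm(f"Write the report following this outline:\n{outline}")

print(prompt_chain("obesity"))
```

Each step only sees the previous step's output, which is exactly what makes a single failing step easy to isolate.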
If this whole task were done by a single LLM and an issue occurred, we wouldn't know which part of the prompt was causing it; we'd have to look harder and probably end up fixing something that isn't broken. But because we've broken it into subtasks, we can see exactly which part is causing the issue and fix only the LLM that actually has the problem. You can imagine how much easier this makes our jobs. With that, we have prompt chaining out of the way; now let's look at the second example, which is routing.

Routing classifies an input and directs it to a specialized follow-up task. This workflow allows for separation of concerns and building more specialized prompts; without it, optimizing for one kind of input can hurt performance on others. This is super simple to understand, and it's the same underlying idea as prompt chaining: we're trying to minimize the scope of responsibility given to a single LLM. Imagine you want to create a system that can manage your Gmail, your calendar, and your Slack. You don't want to dump all of those responsibilities and all those tools onto a single LLM; instead you create specialized agents for each platform, one for your calendar, one for your Gmail, and one for your Slack. That way each LLM has minimal responsibility, and we can also write more focused prompts. Based on the diagram, if you wanted to do something with Gmail, for example "get my last emails," we'd pass in the input, the classifier LLM would decide what kind of request it is, and based on that it would route the workflow; in our case it would go to the Gmail LLM, which would process the request and respond. And that's pretty much it, really. Let's take a look at the example I created for this.
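The classifier-plus-specialists structure can be sketched in a few lines. In this hedged sketch, the keyword-based `classify` stands in for the LLM text classifier and the handlers stand in for the specialized generator LLMs; a real system would make model calls in both places.

```python
def classify(text: str) -> str:
    # Stub classifier: a real system would ask an LLM to pick the category.
    lowered = text.lower()
    for category in ("joke", "poem", "story"):
        if category in lowered:
            return category
    return "unclear"

# Each handler stands in for a specialized LLM with its own focused prompt.
HANDLERS = {
    "joke": lambda t: f"joke about: {t}",
    "poem": lambda t: f"poem about: {t}",
    "story": lambda t: f"story about: {t}",
    "unclear": lambda t: "Could you clarify what you'd like?",
}

def route(text: str) -> str:
    # Classify once, then hand the input to exactly one specialist.
    return HANDLERS[classify(text)](text)

print(route("tell me a joke about a cat"))
```

The point of the pattern is that each handler's prompt can be optimized for its one job without degrading the others.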
It's super simple: we have a bunch of agents specialized in different things. One is specialized in generating poems, another in generating jokes, and finally one in generating stories. We also have another route for when it's not clear what exactly the user wants. Our input in this case is "tell me a joke about a cat." We pass it to our text classifier, where I've created the categories: it will classify the text and figure out what exactly the user wants, which could be a poem, a joke, a story, or, again, unclear. So I'm expecting it to route us to the joke generator. Without further ado, let's test it. I'll remove the manual trigger and attach it to this workflow, refresh the test, and click "Test workflow." Exactly what we expected happens: we pass the input to the text classifier, the classifier figures out that the user wants a joke and routes the workflow to the joke generator, and the joke generator produces one: "Why did the cat sit on the computer? Because it wanted to keep an eye on the mouse." That's exactly what routing is. Of course this is a very simple example, but imagine a scenario with genuinely complicated tasks that require detailed prompts and even multiple steps of actions to complete; that's where this pattern truly shines. With that, we're done with routing; let's move on to our next example, parallelization.

With parallelization, LLMs can work simultaneously on a task and have their outputs aggregated programmatically. This workflow manifests in two key variations, and we'll see both in a bit.
The first is called sectioning and the second is called voting. Sectioning is breaking a task into independent subtasks that run in parallel; voting is running the same task multiple times to get diverse outputs.

Let's look at the first example, sectioning. I'll again remove the trigger and attach it to this workflow. In this scenario we have a sort of travel agent: we tell it where we want to go, for instance Taksim, Istanbul, or some other location in the world, and it takes that and finds us restaurants, hotels, and activities. To make this more detailed I attached SerpAPI, a browsing tool, so it can look these things up in real time and give us accurate data. We have a restaurant finder with a simple prompt that finds five restaurants; a hotel finder that does the same thing but for hotels; and an activity finder that does the same for activities. When we pass in our location, they run in parallel, and once they're all done we merge their outputs, aggregate them, and send the result to us by email, so we'll get hotel, restaurant, and activity recommendations. Let's quickly test this: I'll click "Reset" and then "Test workflow." One caveat before we start: due to limitations in n8n, we're not able to actually run these in parallel this way. There is a workaround, but it's too complicated to include in this video.
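Outside n8n, where true fan-out is awkward, the sectioning idea is easy to sketch with a thread pool: independent subtasks run concurrently and their outputs are merged afterwards. In this sketch, `find` is a stub standing in for an LLM call backed by a search tool (e.g. SerpAPI); the location and categories mirror the demo.

```python
from concurrent.futures import ThreadPoolExecutor

def find(kind: str, location: str) -> str:
    # Stub for an LLM + browsing-tool call; a real version would query a
    # model with search results for the location.
    return f"top 5 {kind} in {location}"

def travel_research(location: str) -> str:
    kinds = ["restaurants", "hotels", "activities"]
    # The three finders are independent, so they can run concurrently.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda k: find(k, location), kinds))
    # Merge + aggregate the three outputs into one response.
    return "\n".join(results)

print(travel_research("Taksim, Istanbul"))
```

Because the subtasks don't depend on each other's outputs (unlike prompt chaining), the total latency is roughly that of the slowest finder rather than the sum of all three.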
This video is supposed to be beginner friendly, but at least this way we can still follow the same concept; the idea is the same, just without the parallel part. With that out of the way, I'll click "Test workflow" and pass in the location, Taksim, Istanbul. The first LLM looks for restaurants, and for this it actually hits SerpAPI to get the Google results. Once restaurants are done, the second LLM looks for hotels, and finally we look for activities we can do in Taksim, Istanbul. All three are done: we got all the outputs, merged them, aggregated them, formatted the response, and sent it to my email. Here's the email: hotel recommendations, restaurant recommendations, and activity recommendations. Each of these aspects was taken care of by a separate LLM, and all the responses were merged, formatted, and emailed to us. That's exactly what sectioning is and how it works.

Now let's look at the second variation of parallelization, which is voting. I'll attach the trigger to this workflow and reset the test. First, remember what voting is: running the same task multiple times to get diverse outputs. In our example we have a copywriter system: "You are a creative copywriter. The user wants three catchy slogans for their brand or product. Emphasize originality, memorability, and brand appeal in each slogan. Do not include any extraneous details or explanations." We use this exact same prompt in all three LLMs, but if you notice, the models attached to each of them are different, which further helps produce diverse outputs. Once we have all the responses, we merge them.
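Voting can be sketched much like sectioning, except every call gets the identical prompt and only the model differs. The model names and the `call_model` stub below are placeholders, not real APIs; a real version would dispatch the same prompt to several actual models.

```python
def call_model(model: str, prompt: str) -> str:
    # Stub: each model name stands in for a different real model API.
    return f"{model}: slogans for '{prompt}'"

def vote(prompt: str, models: list[str]) -> list[str]:
    # Same task, several models; the diverse outputs are kept side by side
    # so they can be compared, merged, or scored downstream.
    return [call_model(m, prompt) for m in models]

outputs = vote("AI Athletic fitness app", ["model-a", "model-b", "model-c"])
for line in outputs:
    print(line)
```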
We do that with this node here, then further aggregate and format them into a text we can read, so we can tell which slogans were generated by which model, and finally send it to us on Gmail. Without further ado, let's get started. First the input: "AI Athletic, an AI-driven fitness app that personalizes exercises based on user profiles." We're going to create a set of slogans for this app. I'll click "Test workflow," and there we go: all the responses are ready and the email was sent. I just noticed that I misplaced the model labels; as you can see, I wrote OpenAI here but attached Google's Gemini, and for the Gemini label I attached o3-mini. That's a mistake, sorry about that. Let's go look at the email. As you can see, we got three different versions of output from the same input, and that's exactly what the voting variation of parallelization is. With that, we're done with parallelization and can move on to our next example, the orchestrator-workers pattern.

In the orchestrator-workers workflow, a central LLM dynamically breaks down tasks, delegates them to worker LLMs, and synthesizes their results. If you look at the diagram, it's quite similar to the one in parallelization, and it also involves running LLMs in parallel to complete a task. The difference is that here we add the flexibility of choosing how many of them to run for a particular task. In the travel example earlier, our execution was guaranteed to run all three LLMs, always looking for restaurants, hotels, and activities. If we implemented that example using the orchestrator-workers pattern, we could instead have given the user the flexibility to decide which of the three to research.
For instance, they could decide to go with just restaurants, or just activities, or activities plus hotels, or all of them together, or even none of them. That's the beauty of the orchestrator-workers pattern, and honestly I'm really in love with it; it allows us to create a very flexible and dynamic system. We're going to look at an example I created around a different scenario so we can see more diverse use cases. To test this, I'll remove the trigger from the previous workflow, attach it here, and reset.

Before we start, let me explain the scenario. We have an orchestrator agent, as in the diagram; let's look at its prompt: "You are an orchestrator. A user will provide a text and the language or languages they want the text translated into. Your goal is to identify all the languages the user wants the text translated into, identify the text, and produce a valid JSON array where each element has a language field and a text field." The constraints: output only JSON, nothing else; if multiple languages are listed, create one object per language; if no languages are mentioned, output an empty array. So the expected output looks something like this: the text, which is always the same, paired with each language the user wants it translated into. Let's look at our input before we start: it's created with a Set node and says, "Please translate the following to Spanish, Turkish, and French," followed by the text we actually want translated. Then we have a translator agent, which is really straightforward: it just translates the text into the given language.
So overall, we pass in the input, the orchestrator agent receives it and, based on it, generates a structured output like the one above: an array of language-and-text pairs. We then split out that response so we get a flat array rather than an array nested in an object, run the translator agent once per item in the array, in parallel, aggregate the text, and finally send it to us by email. Let's test this out: I'll click "Test workflow." In our case we have three languages, Spanish, Turkish, and French, so we expect three output items, and that's what we get: Spanish with the text, then Turkish, then French. Finally, the translator agent runs three times in parallel to generate the desired output, and the whole thing is sent to us by email. Looking at the email, the text was translated into three different versions and sent to us on Gmail.

That's the idea of the orchestrator-workers pattern: before seeing the input, the orchestrator didn't know how many subtasks to create. Instead of three languages, we could also have given it ten or fifteen, and based on that it would run that many LLMs in parallel. By the way, I accidentally called these agents; even though we're using the agent node, we're really using them as simple augmented LLMs. As I mentioned earlier, we'll come back to this at the end of the video, once we're done going through the examples, and then we'll understand the difference between a workflow and an agent.
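The orchestrator-workers flow we just ran (orchestrator emits a JSON array, one worker run per element, results aggregated) can be sketched as follows. Both `orchestrate` (which hard-codes the JSON a real orchestrator LLM would emit for this one request) and `translate` are stubs standing in for actual model calls.

```python
import json
from concurrent.futures import ThreadPoolExecutor

def orchestrate(request: str) -> list[dict]:
    # Stub orchestrator: a real LLM would parse the request and emit this
    # JSON array itself; here it is hard-coded for the example request.
    return json.loads(
        '[{"language": "Spanish", "text": "Hello"},'
        ' {"language": "Turkish", "text": "Hello"},'
        ' {"language": "French", "text": "Hello"}]'
    )

def translate(item: dict) -> str:
    # Stub worker standing in for a translator LLM call.
    return f"{item['language']}: <{item['text']} translated>"

def run(request: str) -> str:
    # The orchestrator decides how many workers to spawn: one per item.
    items = orchestrate(request)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(translate, items))
    return "\n".join(results)  # aggregate the worker outputs

print(run("translate 'Hello' to Spanish, Turkish and French"))
```

The key contrast with sectioning is that the number of workers isn't fixed in the workflow: it's whatever length of array the orchestrator produces, which could be zero, three, or fifteen.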
Anyway, that was the orchestrator-workers pattern; now let's move on to the next one, the evaluator-optimizer, which I think is super cool. The evaluator-optimizer is super simple to understand, but it's also super effective, especially when used in the right places. Here's the description: in the evaluator-optimizer workflow, one LLM call generates a response while another provides evaluation and feedback in a loop. In the diagram, an input goes to the first LLM, the generator, which produces an output. A second LLM then evaluates that output against given criteria and determines whether it meets them. If it does not, it rejects the output and gives the generator feedback on what's missing, so the generator can reflect on it and produce its next output accordingly. This loop continues until the evaluator accepts the output, at which point the workflow proceeds.

When should you use this workflow? It's particularly effective when we have clear evaluation criteria and when iterative refinement provides measurable value. The two signs of a good fit are, first, that LLM responses can be demonstrably improved when a human articulates their feedback, and second, that the LLM can provide such feedback itself. This is analogous to the iterative writing process a human writer might go through when producing a polished document.

Let's look at the example I created, an email-based customer support system. We receive an email, and an email classifier determines whether it's a customer inquiry; if it's not, it outputs "no action required" and does nothing. Then we have a customer support LLM that plays the role of the generator LLM call.
Then we have our evaluator LLM, which evaluates the customer support email response. If it accepts the response, the workflow proceeds and the email is sent; if it rejects it, it provides feedback, which is stored and later passed back to the customer support LLM to reflect on and refine its output.

Let's walk through the example. The "classify email" step we already looked at has two categories, customer inquiry and no action required. If the email is a customer inquiry, it goes to our customer support LLM, with the email content passed in as the user message and a system message down here providing instructions: "You are a customer support specialist at TechSpark Solutions." There's also another section called "feedback and prepared email," which will be empty on the first run, simply because there won't be any feedback yet; we'll see when it actually gets populated. The customer support LLM outputs an email response, which is forwarded to the evaluator. The evaluator examines the email content generated by customer support; its system message tells it to evaluate the email for clarity and completeness, tone (professional, friendly, etc.), and for signing off as "John Doe," which, by the way, I purposely excluded from the customer support LLM's instructions so we can see this workflow pattern in action. So I'd expect our evaluator to at the very least provide feedback on that in the first run. Apart from that, we tell it to output its response in a structured format with two fields, pass and feedback: if pass is true, we don't need any feedback; if pass is false, it also provides feedback. Then we have an If node that checks whether the result is a pass or a fail.
If it's a fail, we go to this node, which holds the current email that customer support generated, the feedback provided by our evaluator, and the original text of the customer's email. All of this is passed back to the customer support LLM, which this time has the feedback present and can use it to refine its output. The cycle continues until our evaluator accepts the email, and finally the email gets sent back to the customer.

Without further ado, let's test this workflow. I have an email prepared, sent to the address that represents customer support: the subject is "Operational hours" and the content is "Are you open on Sunday, and what are your operational hours?" I'll click send, go back to the canvas, click the email trigger, and see this baby in action. I'll click "Test workflow" and let's see what happens. We're classifying the email; it should hit the customer inquiry branch, and that's exactly what happened. Our customer support LLM created a response, our evaluator decided it wasn't correct and provided feedback, and finally the email got sent. Let's trace what happened. Looking at customer support's first output, the text contains a subject line and we can't see any "John Doe" sign-off. The evaluator gets this input, and its first output was: "Remove the subject line from the email content and update the sign-off to John Doe instead of using a placeholder." Perfect, exactly what we wanted. Going back to customer support's second output, this time you can see there's no subject line and we're signing off as John Doe.
That's exactly what we wanted, so our evaluator accepted it, the workflow proceeded, and finally we received an email. Let's take a look at it: as you can see, there's no subject line on top, and it did sign off with "John Doe"; apart from that, the body came up with some fictional operational hours. With that, we saw how the evaluator-optimizer pattern ensures the email meets the specified criteria, and it did exactly that. We're done with the evaluator-optimizer, and we can move on to the final design pattern: agents.

By this point we already know what an agent is: it's a framework where we provide tools and a memory, connect an LLM to it as the brain, and then tell it what we need; it decides on the steps to take and the tools to use in order to complete our request. But what exactly is the difference between an agent and the workflows we saw above? The difference is that none of those were actually agents. Even though I used the agent node, we didn't use them as agents; we used them as augmented LLMs with access to tools. What they had to do was fixed: they had no real agency or autonomy in their decisions beyond the content of the responses they generated. So what makes agents different? If you notice, with our agent there are no predefined steps; we just provide the tools. Here we have a calendar agent with access to a Gmail tool and calendar tools. I can tell it, for instance, "create a calendar event for me with this person and send them an email to inform them about the event," and it's up to the agent to decide which actions to take: it might decide to send the email first, or it might decide to first create the calendar event and then send the email.
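The agent loop (no predefined steps, just a goal, tools, and a decision step) can be sketched like this. `pick_action` is a stub for the LLM's reasoning; in a real agent the model itself would choose the next tool call based on the goal and what has already happened, and the tool names here are hypothetical.

```python
def pick_action(goal: str, done: list[str]) -> str:
    # Stub for the agent's reasoning step: a real LLM would decide the next
    # tool call (and the ordering) itself; here the order is fixed for demo.
    if "create_event" not in done:
        return "create_event"
    if "send_email" not in done:
        return "send_email"
    return "finish"

# Hypothetical tools; real ones would hit the Calendar and Gmail APIs.
TOOLS = {
    "create_event": lambda goal: f"event created for: {goal}",
    "send_email": lambda goal: f"email sent about: {goal}",
}

def agent(goal: str) -> list[str]:
    # No predefined workflow: loop, letting the model pick tools until it
    # decides the goal is met.
    done, log = [], []
    while (action := pick_action(goal, done)) != "finish":
        log.append(TOOLS[action](goal))
        done.append(action)
    return log

print(agent("meeting with Alex on Friday"))
```

The contrast with every workflow above is that the control flow lives inside `pick_action` (the model's decision), not in a fixed graph of nodes.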
With the other patterns, by contrast, we have predetermined steps: the steps that should be taken to complete the task are fixed in advance. For instance, in the travel agent system, given an input it was guaranteed to pass through each stage of the workflow, merge the items, aggregate them, and send us an email. In the orchestrator-workers workflow, we first get the input, pass it to the orchestrator (again, I misspoke calling it an agent; it's an LLM), it outputs the arrays, we split them out, pass each item to the translator LLM, aggregate the translated text, and send it to Gmail. The workflow is determined: we decided which steps should be taken. With agents, those steps are not predefined, and it's up to the agent to decide the course of action to complete a particular task. That's exactly what the difference is. Knowing it, we can make an important design decision when building these systems. A good rule of thumb to follow: if the task involves repetitive, well-defined steps and predictable outcomes, use a fixed workflow like the ones above; if the task is complex, dynamic, and requires decision-making and adaptation, then definitely go with AI agents.

And with that, we're done with the video. We went through all the patterns shown in the article by Anthropic, and it was really cool to learn and really fun to film. I hope you enjoyed it as much as I enjoyed making it, and I hope it adds a lot of value to you as a developer of these systems. To reiterate what I said at the beginning of the video, I'll be uploading this template to my Skool community, and it's completely free, so again, feel free to join to get access to the resources shown in all of my videos. With that, have a great day guys, take care, peace out.