Groq Mixture of Agents Setup Guide

So Groq just dropped Mixture of Agents natively. For those of you who don't remember, Mixture of Agents allows you to take quote-unquote less capable models and make them incredibly capable, nearly GPT-4o level. And I made an entire video about exactly how it works. I'll link that in the description below.

But today, I'm going to show you how to set it all up. And it's actually really easy, so this video is going to be on the shorter side. The only things you're going to need are VS Code and a Groq API key. First, open up VS Code.

Then you're going to cd into whatever directory you like to store your projects in. For me, when I'm playing around with things, I like to go to the desktop. Now here's the project repository, and it was created by Soami from Groq. So thank you so much to him for putting this together. It is so much better than the hacky version that I put together, and it comes with an interface, so it is especially good.

So again, this is how Mixture of Agents works. Here's the prompt. It goes in, and multiple agents work together over multiple layers to come up with the best possible output.
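A minimal sketch of that layered flow, assuming each layer's agents get the original prompt plus the previous layer's answers and a main model combines the last layer at the end. The agents here are hypothetical stand-in functions, not real model calls, and none of these names come from the project itself.

```python
from typing import Callable, List

# An "agent" is any function from prompt text to response text.
# In a real setup each one would call a hosted model; these are stand-ins.
Agent = Callable[[str], str]


def with_references(prompt: str, previous: List[str]) -> str:
    """Attach the previous layer's answers to the prompt as reference material."""
    if not previous:
        return prompt
    refs = "\n".join(f"{i}. {text}" for i, text in enumerate(previous, start=1))
    return f"{prompt}\n\nResponses from the previous layer:\n{refs}"


def mixture_of_agents(prompt: str, layers: List[List[Agent]], aggregator: Agent) -> str:
    """Run the prompt through each layer in turn, then let the aggregator combine the last layer."""
    previous: List[str] = []
    for agents in layers:
        layer_prompt = with_references(prompt, previous)
        previous = [agent(layer_prompt) for agent in agents]
    return aggregator(with_references(prompt, previous))


def make_stub(name: str) -> Agent:
    """Hypothetical agent that only reports how many lines of input it received."""
    return lambda text: f"{name} saw {len(text.splitlines())} line(s)"


# Three layers of three agents each, then one main model at the end.
layers = [[make_stub("a1"), make_stub("a2"), make_stub("a3")] for _ in range(3)]
final = mixture_of_agents(
    "Write 10 sentences that end with the word apple.", layers, make_stub("main")
)
print(final)
```

Swapping a stub for a function that calls a real model would leave the loop unchanged, and that loop is also where the speed concern comes from: every layer has to wait for the one before it.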

Now, when I first read this paper, what came to mind immediately was that it's probably pretty slow, and they actually pointed that out in the paper. But of course, immediately my mind went to, let's power this with Groq, because then you have the massive speed advantage. So I will drop the link to this in the description below. So back in VS Code, you're going to type git clone and then the GitHub URL for this project. And switching back, you can find it right here.

So if you click this little green Code button, there's a little copy button right there. Paste that in, then hit enter. Once that's done, let's cd into groq-moa.

Now we're in that project directory. Next, you're going to want to click this Explorer button, then Open Folder, go to your desktop, click that new project that you just downloaded, and click Open. Okay.

Now we have it open. Let's open up the terminal down here. The next thing we're going to do is create a new conda environment.

So conda create -n groq-moa python=3.11. Then hit enter, and hit enter again to confirm. And then when it's done, you're going to grab this command right here to activate it, copy it, paste it into the terminal, and then hit enter.

And you can see it's active because it says so right there. Next, we need to install all of the dependencies. So we're going to type pip install -r requirements.txt, then hit enter. And I didn't run into a single problem while getting this spun up.
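For reference, here are the terminal steps so far collected in one place, as a small Python script that only prints the commands and runs nothing. The repository URL is left as a placeholder to copy from the Code button, and the folder and environment name groq-moa and the conda activate line are assumptions based on what is said in the video.

```python
from typing import List

REPO_URL = "<repository-url>"  # placeholder: copy the real URL from the green Code button
ENV_NAME = "groq-moa"          # assumed environment name, matching the spoken command


def setup_commands(repo_url: str, env_name: str) -> List[List[str]]:
    """Return the setup steps in order, one command per inner list."""
    return [
        ["git", "clone", repo_url],
        ["cd", "groq-moa"],  # assumed folder name created by the clone
        ["conda", "create", "-n", env_name, "python=3.11"],
        ["conda", "activate", env_name],
        ["pip", "install", "-r", "requirements.txt"],
    ]


# Print only; nothing is executed here.
lines = [" ".join(command) for command in setup_commands(REPO_URL, ENV_NAME)]
for line in lines:
    print(line)
```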

So hopefully you don't either. And it also looks like it comes with a Dockerfile if you wanted to use that. Next, you're going to right-click in this project directory, hit New File, and type .env, because we need to set our environment variables. From there, it's just going to be one line: GROQ_API_KEY= and then you place your Groq API key right here. Now, if you don't already have a Groq account, go ahead and sign up.
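To make the .env step concrete, here is a rough sketch of what reading that one-line file amounts to, using only the standard library. How the project actually loads the file is not shown in the video, so parse_env_text and load_env_file are hypothetical helpers, and the key value is a placeholder.

```python
import os


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str = ".env") -> dict:
    """Read the file if it exists and copy its values into os.environ."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        values = parse_env_text(handle.read())
    os.environ.update(values)
    return values


# Placeholder value only; a real key never belongs in source code.
sample = parse_env_text("# local settings\nGROQ_API_KEY=your-key-here\n")
print(sorted(sample))
```

Keeping the key in .env rather than in the code also makes it easier to keep it out of version control.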

Go to console.groq.com/keys. Create key, and we're going to type M-O-A-Y-T, so I know it's for YouTube. Hit enter.

I am going to revoke this key before publishing the video. Let's copy it, switch back to VS Code, paste it in, and hit save. And we're pretty much done. So last, to get it up and running, you're going to type streamlit run app.py and then hit enter. All right, then it opens up localhost.

So this is running locally. It'll take a second to spin up. And there we go.

This is the interface. So let me give you a demo of it really quickly. On the left side, this is where we have all of our settings. We can select our main model, and we probably want the most capable model, so right now Llama 3 70B sounds good. We can actually subtract from and add to the number of layers.

So you can experiment with the number of layers that works best for your use case. Now, according to the paper, three layers seems to be optimal, and that is the default for this project, so we're going to leave it at three. But of course, if you want to play around with it and experiment to see what works for you, please do.

Then we have the main model temperature. I am going to leave it where it is, but again, that's something else you can experiment with. And here is where we can actually customize the agents for each layer.

So we have layer agent one, layer agent two, and layer agent three. We're going to be using Llama 3 8B, Gemma 7B, and then Llama 3 8B again. And of course, Groq has other models that you can play around with, and you would just change the model name right there. You can also change the temperature and other settings as you see fit. So with that set, let's give it a try.
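One way to picture those sidebar settings is as a small data structure: a main model, a layer count, and a list of layer agents. This is only an illustrative sketch. The class names are hypothetical, the model strings are assumed spellings of the models mentioned, and the temperatures are placeholders, since the video does not say what the values are.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AgentConfig:
    """One layer agent: which model it uses and how it samples."""
    name: str
    model: str
    temperature: float


@dataclass
class MoaSettings:
    """Sidebar-style settings: a main model plus the agents run in every layer."""
    main_model: str
    main_temperature: float
    layers: int = 3
    layer_agents: List[AgentConfig] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError for settings that could not produce a run."""
        if self.layers < 1:
            raise ValueError("need at least one layer")
        if not self.layer_agents:
            raise ValueError("need at least one layer agent")
        for agent in self.layer_agents:
            if agent.temperature < 0:
                raise ValueError(f"{agent.name}: temperature must not be negative")


# Model strings are assumed spellings; temperatures are placeholders.
settings = MoaSettings(
    main_model="llama3-70b-8192",
    main_temperature=0.1,
    layer_agents=[
        AgentConfig("layer_agent_1", "llama3-8b-8192", 0.3),
        AgentConfig("layer_agent_2", "gemma-7b-it", 0.7),
        AgentConfig("layer_agent_3", "llama3-8b-8192", 0.1),
    ],
)
settings.validate()
print(settings.layers, [agent.model for agent in settings.layer_agents])
```

Trying a different model for one agent then means changing a single string, which mirrors editing the model name in the sidebar.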

Write 10 sentences that end with the word apple. Now it is going through multiple layers with multiple agents, but still, look how fast that is. And the cool thing is you can actually dig into each layer and each agent to see what they've done.

So here's layer one, agent one. Here's the output. And it looks like layer one, agent one actually got it right, right out of the box. Layer one, agent two: thought and response. So that is a really poor response, and that just happens to be Gemma. And then agent three looks like it almost got it right, but here's one sentence that it did not.

So you can see that was layer one. Now we have layer two with the three agents, and layer three, and then at the end of it we have Llama 3 70B putting it all together. And it got the right answer, and it was super fast too. So really cool project out of Groq. Thank you for making Mixture of Agents native to Groq now.
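The apple test from the demo is easy to check mechanically rather than by eye. Here is a small sketch that splits a response into sentences and lists the ones that do not end with the target word. The sentence splitting is a rough heuristic and the function names are made up for illustration.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Rough splitter: drop leading list numbers, then break after '.', '!' or '?'."""
    cleaned = re.sub(r"(?m)^\s*\d+[.)]\s*", "", text.strip())
    return [part for part in re.split(r"(?<=[.!?])\s+", cleaned) if part]


def ends_with_word(sentence: str, word: str) -> bool:
    """True when the last word, ignoring punctuation and case, equals `word`."""
    words = re.findall(r"[A-Za-z']+", sentence)
    return bool(words) and words[-1].lower() == word.lower()


def failing_sentences(text: str, word: str = "apple") -> List[str]:
    """Return every sentence that does not end with `word`."""
    return [s for s in split_sentences(text) if not ends_with_word(s, word)]


response = "1. I ate an apple.\n2. She likes apples.\n3. He picked a red apple!"
misses = failing_sentences(response)
print(len(split_sentences(response)), "sentences,", len(misses), "miss")
```

Running each agent's output through a check like this would show which agents in which layer missed, much like clicking through the layers in the interface.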

I actually hope they build this into the main interface as an option. And I hope inference companies actually start building a lot of these types of things. Mixture of Agents, RouteLLM, chain of thought, all of these things should be built into the inference interface.

So a couple last things. Up in the top right we have a Deploy button, if you wanted to deploy this with Streamlit, which makes it very easy. Under the three dots we also have Rerun, which we can do just like that. We also have different settings, so run on save, wide mode, and the app theme. You can print, you can record a screencast, which I wasn't expecting to be in here, and you can also clear the cache. So a lot of cool things. I have a feeling this project is going to evolve quite a bit, so I'm really excited to see it. So check it out. I'll drop all the links in the description below.

If you liked this video, please consider giving it a like and subscribing, and I'll see you in the next one.
