Transcript for:
Ranking of Popular Large Language Models

At this point, there are so many LLMs, also referred to as AIs, out there: ChatGPT, Claude, Gemini, Llama, Perplexity, and so many more. All of these have something unique about them, but most people realistically only need one, maybe two of these in their everyday usage. And that's why I went ahead and compiled all the various LLM platforms out there into one comprehensive list. And look, I know some of this might be subjective, but what isn't is how useful it is to an individual. And that's why we compiled this list based on my experience, the AI Advantage team's experience, and the entire AI Advantage community's experience. That's a paid membership we have with some of the most hardcore AI enthusiasts in this entire space. And based on all that collective experience, we formed this ranking of the various LLMs for September 2024. Quick side note: this entire ranking is freely available in the free area of our community. Link is below. You can find all the different LLMs we talk about today linked in there too. And we update this once a month, so you can bookmark it and check back for what's best in October 2024. So let's get into it.
Starting from the bottom in D tier, this is the sort of useless tier. And the reason for that is that there are just better tools for this specific purpose. I will keep these two brief. Pi basically is a conversational AI that is quite good at what it does. But if you want to talk to AI characters, Character AI, which we talk about later on, is just better for that. And if you're looking for help, ChatGPT's world knowledge is just unbeatable, so just use the voice mode in there to get the best results. Pi does sound a little bit more like a human, but that's not really a use case as a standalone feature. And that's why it ended up in the D tier of our ranking today. And the same goes for Meta AI. This is their platform where they give you access to their Llama models, just like ChatGPT gives you access to the GPT-4 model.
The only difference is that Meta doesn't really have any tooling, and the model is not state of the art. So unfortunately, at this point in time, they get outranked by other solutions that are also freely available but offer you a more powerful language model in the background.
Okay, now let's talk about C tier, which basically means: keep your eye on this one. Starting with the Copilot app. And look, this is sort of a weaker version of ChatGPT, but it passes a lot of the privacy requirements that big corporations have, meaning that in a lot of corporations, the Copilot app is the only thing you can use within your company, and therefore it's your only choice. If you're a common consumer, an entrepreneur, a marketer, or whatever else, there's no reason to use the Copilot app, just because ChatGPT, for example, is a more capable version of the same large language model, with more tooling and more features. The Copilot app is only useful if you don't have any other choice.
And then there's Phind. And this is an interesting one. This came up in one of the office hour sessions that I hold inside of the community. A community member basically brought this up as an alternative to Perplexity that is entirely focused on development-related questions. So if you were going to use Perplexity for development-related questions, consider Phind instead. For this one particular use case, I've been told it's superior. I don't use it in my everyday workflow, but after it was pointed out, multiple community members actually brought up the fact that, yeah, they use Phind in their workflow. And that's why it landed in the C tier of our ranking, as it can be useful for a more focused search. But generally speaking, Perplexity will give you the same type of answers. But let's keep our eye on Phind.
Moving into B tier of this ranking, we have some interesting ones, starting with the Dolphin model. If you're not familiar, the Dolphin series of models are basically uncensored open-source models.
So when something like Llama releases, we have access to the weights, and then the creators of Dolphin go ahead and tweak these weights in a way that the language model spits out any sort of response. And by any, I literally mean anything. But as B tier is about the fact that it can be useful in a specific situation: if you need fully uncensored answers, the Dolphin models are your friend.
And beyond that, we have Le Chat from Mistral. This is basically Mistral's ChatGPT-style interface. This one wasn't on my radar and would have never landed in the B tier without the discussions inside the community. Because what I have been told is that Le Chat actually offers a free alternative to GPTs that runs on the Mistral Large model, which is pretty much on par with GPT-4o. The way it works is quite simple. You go to console.mistral.ai, and then here you can create a brand new agent. And if you select Mistral Large 2, which is their best model, you can add a custom set of instructions here, and I have a separate field for the few-shot prompt here. And then here's the key: once you publish this, you can deploy it to Le Chat, like so. And then if I move over to the Le Chat interface, you will see that I have access to my various agents here, which I can now freely access. And as the model is super capable, this is just a free way of running GPT-4-level intelligence in a convenient web interface, which I wasn't aware of. And so Le Chat actually landed in B tier, which means it can be useful in specific situations. In this case, if you don't want to pay for the ChatGPT Plus plan to create GPTs.
Okay, moving on with Poe. We talked about this many times, but it's basically a chatbot marketplace, hub, directory sort of site where you have all these different chatbots. You can build your own ones, and they actually have this functionality of offering chatbots and then charging per usage.
This is the one feature that really made it stand out, and that's why we put it into the B tier. But also beyond that, you can build custom bots with all sorts of models. You can easily switch models in the middle of a conversation, and you have various Flux fine-tunes or LLM fine-tunes that you wouldn't find on any other platform. So basically, this is a fantastic testing hub for all LLM connoisseurs.
And with that, we move to the last LLM in our B tier category, which is Gemini Advanced. This is the web interface of Gemini. I've actually been playing with this over the past week, ever since they introduced this Gem manager, and now they even added Imagen 3, and I'll be covering that in this week's episode of AI News You Can Use. But if you ask it for an image of our cat with a hat with the text "LLM ranking" on it... So one caveat with this, while it generates, is that sometimes it's a bit unreliable, and if I go into different accounts, it's not accessible there, so the rollout is staggered. But there you go: LLM ranking, cat with hat. Works. The Gem manager is basically a weaker version of the GPT builder, with this interesting fact that these actions in here actually are interesting ones, like a full Gmail integration or a YouTube integration. But the problem is they're just not very consistent yet. So while I think this direction is right, whenever I build a YouTube-enabled GPT, it cannot actually pull in the transcripts every time. It's quite unreliable. Nevertheless, Gemini is a bit of a unique and interesting alternative to some of the S-tier LLMs presented in this ranking. The tone is quite unique. And if you're unhappy with ChatGPT's tone, I would strongly recommend you test out Claude first and then also check out Gemini to see how it sounds. Because let me tell you, it's a bit unexpected. And in my experience, people wouldn't even recognize that this is written by AI because it sounds so different.
It sounds more like a friend, whereas ChatGPT is a professional assistant.
And if you find this type of conversation useful, chances are you would absolutely love our weekly newsletter, which is also completely free. Just head on over to myaiadvantage.com/newsletter or follow the link in the description, and you can sign up for our weekly digest that gives you one innovative AI app, one prompt, and one news story that actually matters and why it matters. And as a freebie for signing up, you get this Notion template and ebook with various prompts and assistant presets that will get you kick-started on your AI journey. So go ahead and subscribe for free. And now let's talk about the next best LLM in our AI Advantage tool ranking.
Okay, moving on to A tier, meaning all of these LLM platforms are excellent at a single thing. And in the case of GigaBrain over here, it's literally one thing, and that's what it's best at: it's a search engine for all of Reddit. So if you're looking for a human opinion, this is one of your best bets. I can ask something like, "What do people think of the new iPhone 16 Pro?" Search for it, and GigaBrain on the free version will give me this response based on various Reddit threads and various subreddits. It reads all the comments. It pulls all of that in. It gives you a synopsis of everything, like so. And it's super fast. So again, if you're looking for human opinions, this is my go-to over Perplexity, because that looks at all of the internet.
Moving on to Copilot in 365. This is one that I personally don't use, but I've heard many very positive reviews from community members that use it in their work regularly. Basically, if you use Office 365 (Microsoft Word, Excel, PowerPoint, et cetera), then this is the paid AI assistant that helps you within the context of the application, which is helpful, but
at the end of the day, if you have something like ChatGPT, you know how to prompt a little bit, and you're aware of the different functionalities within your software like Excel, then you could easily just feed screenshots to ChatGPT as context and get very similar results as you would from Copilot in 365. But every time there's a native integration into an application, the LLM has an easier time pulling in all the context that it needs to give you relevant answers. And that's why this is in A tier: if you want an AI for Excel, Copilot in 365 is your best bet.
And now let's talk about Character AI, because this is the site that goes unnoticed many times. But this is the site where young people go to talk to various characters that are powered by LLMs. Just for July 2024, which is lower than usual because it's summer: 226 million visits, with an average visit duration of almost 14 minutes. That is insane. This site is so popular, it's unbelievable. And it's for good reason. Their product just works, and the interface is quite good. So if you want to do something like practice interviewing, you can actually go here. And let me tell you, the mobile app is so good. You can just click on this phone call icon and have a phone call with an interview assistant right there. Is it perfect? No. Does it work flawlessly and get the job done? Yep. It's very reliable and free to use. And there are so many characters on here. You could even create ones of your own. So yeah, if you want to talk to various chatbots, Character AI is the best choice for that.
Okay, the last one in A tier is Gemini Studio. And this is something many people should seriously consider, because it offers a few unique capabilities within the AI space. Now, is this the best option for your everyday usage? No, probably not. It's more of a developer interface and there are many manual options, but that also comes with great opportunities.
Because with this Gemini 1.5 Pro model (now they even have this experimental version, but I'll just go with this one), you get a whopping 2 million tokens of context, which, if you're not aware, is around 3,000 pages of text, basically multiple books. And not just that, this also allows you to upload a file format which no other LLM at this point in time allows, which is video. As you can see in this little tooltip, yes, this can take videos. And that's what makes it super unique. But if you don't need this big context limit and the video upload functionality, there are some slightly better models with better tooling out there that we'll talk about now as we're moving into S tier.
And S tier basically means best in class. These tools excel at a multitude of functionalities, starting with Perplexity, which at this point is basically the go-to AI search engine. If you compare it to something like GigaBrain, which only does Reddit, or SearchGPT, which doesn't even properly give you the references and lacks 95% of the Perplexity features, especially when it comes to the Pro tier of Perplexity, you'll quickly find that Perplexity basically is the market leader for AI search. Now, when do you want to use AI search? That'd be kind of a separate video. I did one News You Can Use episode where we compared the different search engines and talked a little bit about this, but it basically comes down to this: if you're looking for a specific answer to a question or a solution to a problem, AI search is your friend, whereas Google would be better if you're looking for further resources to dive deeper. So if I'm researching new tools, I absolutely love Perplexity, because I can easily compare the release notes of the last version with the new version. I can easily search for various comment sections and websites and blogs and see what different people think of it. And Perplexity gives me a synopsis of all of that.
Whereas in Google search, I would have to read through it all and make up my own mind. Perplexity is faster, that's what I'm trying to say. And that's why it landed in S tier as the best AI search engine to date. The free tier is absolutely excellent; you don't really need the Pro tier. So I would just encourage you, if you haven't done so yet, to go and try out Perplexity and see if some of the recent Google searches you did would perform better inside of Perplexity. You might just be surprised by what you find.
All right, and now we get to the top of the list. We have the two big players here: Claude and ChatGPT. Let me start by talking about Claude and what makes it different from ChatGPT, because at this point, I think most people have an opinion on ChatGPT, and a lot of times it's the only LLM they're familiar with. But let me tell you, Claude really carved out a bit of a niche for itself and found a place in not just my heart, but also my toolbar. As you can see here on top, it's one of the LLMs that I use on a regular basis, not every single day, but regularly. ChatGPT I use every single day. Now, why would you use Claude over some of these other tools? Well, it's really two use cases that it comes down to. First of all, code generation. It's just head and shoulders above all other models in September 2024 when it comes to code generation. I don't really have to defend this point. Every developer that ever used Sonnet 3.5 will know what I'm talking about. And it's not even about the Artifacts feature, which is also unique to Claude: this ability to not just write code, but to display it right away in the web interface. That's well and good, but it's really about the quality of the code it produces, with the exception of the output limit, because the output limit on ChatGPT is actually longer. So if you want to create a longer application and get the code all in one, then you want to go to ChatGPT.
That's the one case where ChatGPT code generation is preferable to Claude. And the second big reason to use Claude over ChatGPT is actually the tone. And this is the main argument that many people have. It just sounds a bit more human. It sounds more like a conversation that you and I would have rather than this robotic tone: "I am a large language model, how can I help you?" No, Claude is more relaxed than that, and it doesn't sound so AI-like, which is a wonderful thing, because people are tired of this ChatGPT tone these days. But those are basically the two big categories of use cases where you want to prefer Claude.
For everything else, pretty much, ChatGPT still is king, and that's why it sits at the top of our ranking. The base model is state of the art. There are other competitors which are arguably equally as good in terms of reasoning and prompt adherence and the world knowledge they have. These are the things I care about as a consumer. But then the tooling of ChatGPT is just unmatched, between the different ways of infusing custom context, like custom instructions or memories or the GPTs feature. They just have all these extra modalities that are useful sometimes, like the code interpreter or the ability to search the web within your chat. I mean, I guess they have DALL-E too, but at this point that's long overdue for an update. And I suppose that will come soon. I mean, with the voice assistant announcements back in May, they already showed us the new image generation tool, which is excellent. That does text with no problem. So I suppose that will be coming soon. But that's just where the list begins. They have the mobile app. They have the desktop app. They have GPTs with actions and knowledge bases, and you can @-mention them and have them talk to each other. The ability to upload files and images is best in class, pretty much. There are maybe other players that match it, but nobody that is really better. And you have context that is long enough.
And did I mention that when you combine some of these features and have a GPT that uses code interpreter, all of that also works perfectly on your phone? So you can start or continue your work from wherever you are. Now, did they kind of stop innovating a few months ago? And are people mad about that? Yes, absolutely. And some of these competitors are catching up. As I talked about with these different use cases, like uploading video or searching Reddit or code generation, there are other players out there who are better at these specific things. But when it comes to the overall tool set, ChatGPT still is king.
But that might change soon, because we're going to be updating this ranking for the 1st of October, completely for free. As I mentioned before, this is accessible in the public area of our community. You can basically go to the link in the description and check out the full ranking with all the explanations and every single link to the apps. And we have two more rankings, one for image generation tools and one for video generation tools. Those, too, we update once a month, and you can access them for free in the public area of our community. We used to keep this in our community square, which is inside of the membership, but we decided that all of us in the community would love the feedback and the participation of all you lovely YouTube viewers. So if you have an opinion on one of these rankings or we missed a tool, feel free to create an account completely for free and leave a comment here below with your thoughts. We will certainly consider it in next month's ranking of the best LLM platforms out there. All right, that's all I got for today. Check out the other rankings and I'll see you soon.