In this video I want to do a comprehensive head-to-head test comparing GPT-4o versus Claude 3.5 Sonnet, which just came out and beats GPT-4o in their benchmark testing. But I don't want to do a scientific test like they do in their benchmarks. I want to do a practical test of how we use these AI models every day for work and business, so we can actually figure out which one is the most practical model: if we had to pick one, which one is the right one to pick? That's typically how I do these tests, with very practical applications. So I'm going to cover about 10 different things. We're going to use things like writing text and summarizing text, then we'll get into vision and data analytics, and then we'll do some coding and reasoning too at the end.

Both the GPT-4o that I'm going to use and the Claude 3.5 that I'm going to use are going to be the paid versions. Both of those are available completely for free, but they are extremely limited in how much you can use them right now. I was using Claude 3.5 Sonnet and I only got about 10 messages before I ran out, so I had to upgrade and get the subscription to do a real head-to-head test.

First, let's start with a writing prompt. A lot of us use these models to write all kinds of different things, right? So this one is going to be a little bit of creative writing in the world of marketing: "You're launching a game-changing software tool revolutionizing customer relationship management for business. Write a short, punchy product description." I told it exactly how many words in this case, and I'll do the same thing for Claude 3.5 here. Here's the result from ChatGPT. It went right to the answer: "Introducing Customer Connect, the ultimate CRM tool that transforms customer interactions." Pretty good. I asked how many words it was: 41. Our instruction was approximately 50 words. Now, Claude gave us a little bit of a longer one. It's 54 words here, but again, I said approximately, and it's asking me if I want to adjust it. This one
named it AutoCRM: "Revolutionize your customer relationships. Our AI-powered platform automates follow-ups, delivers real-time insights, and boosts retention effortlessly." Okay, so they both did a good job. Nothing here screams that an AI wrote it.

I typically use these models to help with writing email drafts, so: "Write an email introducing this to our email list and keep it short." Let's see what it comes up with. Here we've got Claude's answer, and this time again it does the same thing: it gives us that sentence that we don't need, that's not part of our email, and it gives you a quick recap of what it's done, so you have to copy and paste the middle part here. ChatGPT just gives us the copy option right here. I can just copy this; there are no extra words. I personally prefer to have zero extra words when I ask something in a prompt. I just want the answer. I don't want it to explain it to me; if I needed an explanation, I would ask for one. And I've found that every time I give ChatGPT a task where it has to give me an answer that I can just copy and paste, it does exactly that.

As I'm reading the tone of this email here, they both again did about an equal job. This one is a little bit overly excited, so it sounds a little too promotional. It uses words like "boost," which is very common with AI. And if I go to ChatGPT, the exact same thing happens. There's "boost." I commonly see that. I erased my memory here on ChatGPT, because I've actually trained it in the memory function (which I've made a different video about) not to use specific words. But since Claude doesn't have memory, I excluded that and just cleared down my memory here, so it writes like it would without any kind of back-end instructions.

When it comes to writing, I ran it across 10 different tests and there was not a clear-cut winner. I think if I was comparing Claude 3.5 versus the previous GPT model, it would have been a clear
winner, but with 4o and Claude 3.5 Sonnet right now, I think when it comes to writing it's pretty much on par.

Next I want to show you text summary. A lot of times I consume information now using these models. When there's a large amount of information, or a huge article or a big newsletter, I usually just put it here and let it summarize it for me very quickly. So let's do that here. Okay, here's the prompt: "Provide two summaries for this article. The first one two to three sentences; the second one five to six sentences, and include more details." I'm going to use this Anthropic introduction here about Sonnet, so I'm going to go ahead and copy this whole page and just paste it here. I typically just copy the entire page, with the footer and everything, and it knows what to do with it. It's a much faster way to do this. I'll send this out here, and this time let me show you the speed, because this 3.5 Sonnet is actually pretty quick. So it paid attention to the length; it got that right. I read through this and everything was accurate. There was no hallucination with this answer; everything was exactly from that article.

Okay, here's ChatGPT's answer. The short summary, again, right on point. The detailed summary: I really like the tone. It was not at all emotional, very factual, kind of how I like it, without much detailed prompting. I didn't really tell it what kind of tone to take; I wanted to see what it does by default.

One functional thing I like here: with ChatGPT, if you have the paid version, you can use GPT-4 and actually get a different response using a different model and compare your results. Inside of Claude, if I try to do the same thing with the paid version, I can change models and use Opus, for example, but that's going to require me to start a new chat, so I can't functionally use it the same way. Not that I would do that often, but sometimes I find myself not quite liking what 4o gave me, and I want to see what the older GPT-4 model, which was good, which
is good still, gave me. That's something I can do in ChatGPT.

Okay, now let's test its vision capabilities. Later I'll test vision with some data analytics, but here I want to see if it can find out what's in a very complex image. This is as weird an image as I could find. This is "world history in one image"; that's what I Googled. Just looking at it, you can't make out just about anything that's happening. Maybe some of the years here you can make out, but nothing else. Let's see if Claude and ChatGPT can figure it out.

Again, with these models you can upload images. Here with Claude you can upload five, and each can be up to 10 megabytes. With ChatGPT you can upload 10, so that is one of the benefits. It also has connected apps, so you can connect to your OneDrive with Microsoft, or mine is connected to my Google Drive, which makes this a whole lot more useful. When it comes to these basic functionalities, Claude is really lacking. I'll point out a bunch of them as we go through this video.

But let me upload that image, and I'm just going to press enter. I'm not even going to give it a prompt. This is ChatGPT with GPT-4o. From what I can see, it found the name for it, "Stream of Time," a timeline map of the rise and fall of different civilizations, empires, and nations, and it says it's from 250 AD to 1700 AD. I'm going to say, "Analyze this image and explain this to me." Sometimes just asking for it in a different format, like bullet points or a table, helps; I use that all the time. Okay, and it's giving us a really nice presentation here in table format. It looks like it's putting the different countries and civilizations here and showing the time period when they were around, the key events, and so on, plus a summary. Okay, a really nice answer here that we got out of ChatGPT, assuming it's correct. Let's go to Claude and do the same thing. Okay, again I'll just press enter to start with
no prompt. Okay, so it got the same theme, it picked up on some of the colors, and it's telling us it's from 580 to 1180. Let's make that same table here. This one is using this thing called Artifacts, which is something you can turn on in settings. It's completely beta right now, but it creates things on the right side, and it's great for coding and visual presentations like tables like this. I really like this new update; I covered it in a different video that I posted about Claude 3.5. It created kind of a different table: it just gave us the elements and the descriptions for those elements, and ChatGPT gave us something completely different. So ChatGPT's looks more usable, but when I'm looking at it, it actually got the time wrong. I took a little bit closer look at this image, and if you look closely at the very bottom, the timeline ends at 1100 AD, so Claude was correct; it got the information right. So I'm just going to follow up here to ask for a table similar to the one ChatGPT gave me. I'm going to ask it for a timeline with each civilization and its rise and fall. Okay, and this time it looks like it did a better job. Just to be sure, I ran this three different times, and each time I got a very similar result.

So basically, ChatGPT gave us something that looks better, right? At first glance it looks more interesting, and it looks like it gave us better information, but it totally made up the time. That graph ran from 500 to 1100, and ChatGPT did not give us anything in that time frame; it kind of extended the time frame. So I wouldn't take any of this information at face value. These are some of the limitations of large language models in general: you can't just look at the output and assume it's right. Sometimes it even makes sense to have two different subscriptions, if you're using vision capabilities or data analytics, to run both and then use your own judgment to see which one actually makes sense. I had to
really take a close look at this picture to try to figure it out, and this was one of the more complex things that I've given these large language models to analyze. But I can tell Claude 3.5 is winning there right now.

Here's kind of a challenging graph, to see what it comes up with. I'm giving it this graph, but I cropped out what the graph represents. What it is, is the interest rate for used cars in the US. So I'm going to see what it comes up with. Okay, here's ChatGPT telling us this is a trend graph from 2014 to 2024. It's stable all the way until the pandemic, then it has a dip, where it's telling us right here that it dips below 3%, which is right. It was at around four all the way, dipped to closer to three, and then there's a significant increase, which is accurate. Let's see if it can actually figure out what this represents. It thinks it represents the federal funds rate, and the Fed does set interest rates, so that's pretty close. But I didn't really think it would figure out that this is for the car market in the US in this time frame. I was curious to see if it was going to do any type of research, if it was going to look online, but it came up with two different conclusions, federal funds rate or central bank policy, which is not correct. That was not my test, by the way; I just wanted to see if it did that extra step. Right now I want to see if it can analyze things, pull in the numbers, and then use those numbers to do deeper analysis.

Let's go see what Claude did. It looks like Claude, again, had no problems here. It gave us the range of the interest rate and told us exactly how it's changing over time. This time I also asked it what the graph represented, and it gave us five different options. None of them were very specific to what I was going for, but generally it's all about the interest rate, and it kind of figured that out. It just wasn't specific enough to get to the used car market in the US and the interest rate
on that.

Now I'm going to follow up with ChatGPT. I'm going to say, "Create a presentation based on this information." It's going to go through and create these slides for us. It's giving us titles for the slides, then it's asking whether we'd like a PowerPoint file created with this content. Let's say yes. Okay, it's done and it gave us a link. Let me go ahead and download it to see what it gave us. Okay, it looks like it gave us a detailed PowerPoint here. It does need some styling; it typically doesn't do the styling. But PowerPoint has this AI, the Designer AI, where you can just go ahead and select different designs from the side and get yourself a finished presentation. So nice job with ChatGPT.

I asked Claude to create a visual presentation, and look what Claude did here. This is with the Artifacts option turned on; again, you can turn that on in the settings. It wrote some code, then it created this preview window with this nice visual graph. I mean, this is kind of the same as our current graph, but this is really cool, right inside of your viewport. Let me see if it can make us a PowerPoint presentation. Okay, it's doing the same kind of thing: it's creating the text for the different slides and what bullet points should go in each one. And it looks like it cannot do that: it says it can't create, edit, or provide download links to PowerPoints directly. So all it was able to do is write code and create this nice visual presentation for us right within the chat, but in this case I did want a PowerPoint presentation.

That's one of the big limitations of Claude. There are a lot of functionalities like this. This one was a really useful, practical thing, right? I want to give it some data just from an image, get the context from that, and turn it into a presentation. ChatGPT could do that in one minute, right, and we could then use PowerPoint to design it
using the AI inside of that. Claude can't do that, so it can only do things like these visual representations. Again, I ran this through a bunch of different tests, and I think with data analytics, so far in my early testing, they were pretty equal. So functionality goes to ChatGPT, but in the function of data analytics they are both about the same right now.

Now, at this point I usually do a head-to-head test with image generation, but of these two, the only way you get image generation right now is using ChatGPT with a paid subscription, which gives you access to DALL-E 3 to generate images for you. Claude cannot and never has been able to generate images, so I can't compare that. That obviously is a point for ChatGPT. If you need image generation in your day-to-day work, you'll have to use another tool, like Copilot, which is free and has DALL-E 3 built into it. But you can't just use Claude, because it doesn't have image generation at all. So if that's part of your workflow, keep that in mind.

Now let's do a little bit of research. I'm going to ask ChatGPT: "Write about AI's disruption in the accounting industry and give me specific links, articles, and reports." Here it gave us some information. I'm going to go to the bottom of it to make sure it's giving us some relevant links, and for some reason the links are not clickable. Sometimes it makes up links; sometimes it gives us links that look like hyperlinks, but when you go to click them, they're not clickable. I told it they're not clickable and asked it to give me the links again, and I still couldn't click them. The third time, I still couldn't click them, so I asked it to give me the links like this, so I could copy and paste them. Let's see. Nope, the page isn't there. Let's try this one: "Cannot find this page." Okay. So a lot of times, when you use ChatGPT for research, when you need specific information from specific websites and resources like this, it just does not work. It literally makes up links, like you're seeing
here. This has happened to me probably every other time that I've used ChatGPT for research.

On the other hand, let's look at Claude. Claude again did a nice job: it gave us specific use cases of things that could be disrupted by AI, and potential challenges. If I go to the bottom here: "I don't have access to current articles or reports." Okay, so keep this in mind: Claude does not have internet access. It never has had internet access, whereas GPT-4o has internet access. Sometimes it makes huge mistakes, like you just saw, but sometimes it works. With Claude, it doesn't work at all. ChatGPT can follow URLs. A lot of times when I'm optimizing my website, for example, I give it a URL, it goes and crawls the website, and it tells me things to improve. I can't do things like that here.

For research, I would not use either of these tools. I wouldn't use Claude, I wouldn't use ChatGPT. I would use Perplexity AI. That is a great research tool. It uses the power of these models in the background, but it's really designed to be a search engine that's AI-powered. I have a different video about that. Or I'll use Google Gemini and let Gemini give me a snippet, and I know that is pulling from more accurate listings based on Google Search, right? So both of these, in my opinion, get a zero.

Okay, now let's do some coding. I'm going to see if I can make a dashboard with these models. We're going to go to the Nvidia website and pull in one of their financial reports. This is a massive, massive document; I believe it's 98 pages. Let's download this. Okay, I uploaded this document and I'm going to ask Claude, "Turn this into a visual dashboard," to see what we get. Usually, if you have that Artifacts option turned on, it starts writing the code right here on the side, and as soon as it's done it turns into preview mode where you can actually see the output, which is awesome. This is one of my favorite updates. Look at this: it created this visual dashboard for me, and it's interactive, so I can hover
over things. Wow, this is nice. All right, let's see if ChatGPT can do the same. The nice thing, by the way, is that both GPT-4o and Claude 3.5 Sonnet now have such a big context window that I can just use a 98-page document as part of my prompt and upload it. Okay, so ChatGPT just gave us a bunch of information from that document. It pulled in a bunch of different numbers and things like that. It did not create the visual presentation. It's asking me if I want to proceed; I'll say yes. And again, it looks like it gave us a ridiculously long step-by-step process on how to use another app to do this outside of ChatGPT. It's not even attempting to write... oh, it's still going. It's not even attempting to write any code for us. Again, I went back and forth three different times with ChatGPT to try to get it to do this. It used to create interactive graphs, and I think it still does, but for some reason in the last couple of days I haven't gotten it to do any type of coding or create any type of interactive graph, even when I give it very specific instructions to do so. So when it comes to visualization of data using code, Claude is obviously the winner there.

Now let's see if we can create a game. This time I'm going to create a game of checkers. I typically do a game of Snake or tic-tac-toe. Let's see if it can do a game of checkers, again without any information about what kind of language to use. In just 10 seconds it wrote the code and gave us the preview. Now let's see if it actually works. It says "Current player: red." Let's go ahead and try to move our piece from here to here, black from here to here, and I'll take this piece. Nope. Oh, it almost worked, but it doesn't quite know how to take a piece.

I asked ChatGPT to create a game of checkers, and this time it's giving us, again, a bunch of text, board layouts... okay, where's the code? Okay, there we go, we finally got it. It chose Python here, and here's the Python game that ChatGPT wrote. It does not have any pieces. I can't start
a new game. Okay, so it just made the board. I'll give it one prompt to try to fix it, although I didn't give Claude any extra prompts. Okay, here's a new one. We've got pieces this time, and... okay, it did not add any functionality. It basically designed a game that doesn't do anything. So far I've tested this a handful of times, and every single time, when it came to any type of coding, Claude 3.5 Sonnet beat GPT-4o.

Okay, let's test out complex reasoning. Here's the prompt: "At a party, each guest shakes hands with every other guest exactly once. If there were a total of 66 handshakes, how many guests were at the party?" ChatGPT created a nice formula over here, and I know the answer is 12, so let's see if we get to that answer. It came up with two answers, 12 and -11. Since the number of guests can't be negative, the number is 12, and it gave us that answer. Let's try Claude. Okay, Claude took the same path. It came up with 12 or -11, and since the answer can't be negative, the number must be 12.

Okay, let's try this one: "What has a voice that can't speak, a bed but never sleeps, a mouth but never eats, and runs but has no feet?" Claude says this is a riddle and the answer is a river, and ChatGPT, same thing, thinks it's a river. I'm obviously not doing a scientific test, but you can see they're both doing a good job when it comes to logic and solving riddles and puzzles.

This next one is for content creation. I'm going to give it a YouTube script. I'm just going to upload my last YouTube script here and ask it to turn it into a tweet. This is my prompt: "Extract the core lessons and actionable items from this YouTube script. Condense it into a concise tweet or LinkedIn post suitable for quick consumption." And I'll add that YouTube transcript here. Okay, here's our tweet: Claude 3.5 Sonnet, a new free AI model from Anthropic, outperforms GPT-4 (it forgot the "o" part) on most benchmarks, improvement in speed, vision, 200K
context window, introduces Artifacts. Okay, really nice. Besides fixing this little missing "o" here, there is no reason why I wouldn't use this as is. It created a couple of hashtags. It's good. Usually I have a hard time, even when I'm building my own GPTs, getting a result good enough that I would actually use that tweet or that LinkedIn post.

Okay, so ChatGPT decided to give us a tweet and a LinkedIn post. The other one only gave us one, which I guess I could use for both. But look at this tweet: "New Claude 3.5 is here, faster, forget subscriptions, dive into the future." Never would I use that. That's just extremely bad. If I see a post like this, I'm not following that person, right? That is not a useful post. So once again, it looks like Claude 3.5 Sonnet is beating ChatGPT.

Okay, so with my practical test, Claude 3.5 Sonnet came out ahead overall, but there are huge limitations I want to point out if you had to choose one of them. Here's a huge limitation: even with the paid subscription, you don't have web browsing with Claude. That's a huge downside; the information is not going to be relevant and up to date. If you use this for research, I recommend you use Perplexity anyway. Even the free version of Perplexity is going to beat both of these. But I think that's a huge downside for Claude. If you need to create images, huge downside for Claude. Claude doesn't have a good way to search previous conversations. Neither does the ChatGPT website, but ChatGPT does if you have the desktop app or the mobile app; they have a search function, which is huge. It's missing right now inside of the ChatGPT website for some reason.

Now, two huge downsides with Claude. It doesn't yet have a memory function, and I had no idea how much that was going to improve the functionality of ChatGPT. By default, when you talk to it, it sometimes stores things to memory, and it gets smarter and smarter based on your previous conversations and gives you better responses. So that's a huge benefit of
ChatGPT. And the biggest reason why I would choose GPT-4o over Claude is that with a paid version of ChatGPT you can build custom GPTs. Those are very specific little mini GPTs, basically, with your knowledge base and your very specific set of instructions. For my company, we've built, I think, 15 to 20 different custom GPTs, and each one does a very specific task. At this point I wouldn't really even know how to function day to day without those. Claude obviously just doesn't have those. Copilot for some reason is getting rid of those. But that is my favorite part of generative AI: those custom GPTs that I can train to do just one thing really, really well, where the broad versions of ChatGPT and Claude are just not going to be that good at it, right? They don't have that specific knowledge base; they don't have that specific set of instructions. I've covered custom GPTs in a different video, and exactly how to build them, so if you haven't watched that and you're not using them, I highly recommend them. They will solve so many issues for you and save you so much time throughout the week. I'll link that video here.

I hope you found this head-to-head useful and that you can make a clear decision between the two right now, just from the functionality. For my personal use, I kind of have to use both, because for some of these basic coding things that I'm doing, Claude is just so much better. So I'm going to use Claude for that kind of stuff, I'm going to use ChatGPT for my everyday writing, summarization, things like that, and I'm going to use Perplexity for research. I'll see you in the next video.
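
A quick note on the handshake puzzle from the reasoning test, for anyone who wants to check it themselves. This is only a minimal sketch of the math, not the exact working either model showed on screen. With n guests, every pair shakes hands once, so there are n(n - 1)/2 handshakes, and 66 handshakes means solving n^2 - n - 132 = 0. That has two roots, 12 and -11, and only the positive one makes sense as a guest count. The function name `guests_from_handshakes` is made up for this example.

```python
import math


def guests_from_handshakes(handshakes: int) -> int:
    """Return n such that n * (n - 1) / 2 == handshakes.

    Hypothetical helper for illustration only. Raises ValueError when
    no whole number of guests produces that many handshakes.
    """
    # n * (n - 1) / 2 = H  rearranges to  n^2 - n - 2H = 0,
    # whose roots are (1 +/- sqrt(1 + 8H)) / 2.
    discriminant = 1 + 8 * handshakes
    root = math.isqrt(discriminant)
    if root * root != discriminant:
        raise ValueError("no whole number of guests gives that many handshakes")
    # Keep only the positive root; the negative one is not a valid guest count.
    return (1 + root) // 2


print(guests_from_handshakes(66))
```

For 66 handshakes the discriminant is 529, which is 23 squared, so the positive root is (1 + 23) / 2 = 12, matching the answer both models gave.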