Comparing AI Models for Coding Tasks

DeepSeek versus Claude 4 Opus versus ChatGPT o3 versus Gemini 2.5 Pro. Who wins? Today we're going to be testing them out side by side in an epic coding gauntlet between these AI models. Bear in mind that DeepSeek has just released a brand-new update called DeepSeek-R1-0528. It literally just got announced. And today we're going to be testing all of these models side by side to see who's the winner, who's the best. You can get all of the prompts that I use today from the AI Success Lab link in the comments and description.

The first prompt that we're going to go with is building an AI-powered SEO therapy chatbot. Let's get straight into it. I'm going to run the prompt inside each of these. We're going to be using Claude Opus 4, and I'm going to make sure that I have extended thinking switched on as well, just to get the most out of each of these models. We'll do the same inside ChatGPT o3, and then we'll do the same inside Gemini. Of course, you want to make sure you have Canvas switched on with Gemini 2.5 Pro in the top right there as well. So, let's see who comes back with the best output. I'll open these up in separate windows just so you can see how fast each one moves as well. And let the coding commence, peeps.

So, what you can see here is we've got DeepSeek on the left, Claude 4 Opus over here, ChatGPT o3, and then Gemini 2.5 Pro. Each of these took a similar amount of time to start coding out. I honestly think that it... Oh, look at this. We've got Gemini. It's just come up with the first output. So, let's see what we got back here. Dr. SERP. Pretty basic app, to be fair, but we'll see how this performs. And it actually responds to us straight away as well. So we put "hey," and this is working nicely, and I'll vent my frustrations here. Okay, "I got hit by the Google updates." Is it going to reply to me? It actually does. Look at that. It's a very expressive AI therapist, to be fair. "Oh honey, those words carry so much pain."
"I hear you loud and clear." Blah blah blah. I'm actually surprised this is responding to us. That's pretty crazy. Maybe that's built in, or it's using the AI to do that. I don't know how it's doing that. The UI of Gemini was not that great, but it created the output, and it created it pretty quickly as well.

Now, I've got the output from Claude. If we move the cursor, it actually uses its eyes to follow us around, like you can see. And then inside here, it says, "Tell me about your ranking losses. I'm here to listen." Let's say "I miss my SEO traffic," and it actually came back to us and said, "I hear you. Losing your top three positions, blah blah blah, must be hard." I'm actually going to open that up in a separate tab just to make it easier to view. And there we go. I say "what to do," but it doesn't really seem to give me answers that are relevant to what I'm talking about. Does that make sense? So, for example, I said "what to do," and it said, "It's completely normal to feel this way. Remember, you are more than your domain authority. You have intrinsic value beyond PageRank." It's quite a funny, creative sort of response, but it doesn't actually answer our questions, whereas Gemini seemed to. What would be interesting is if you could plug in an API key and then get outputs based on the AI.

ChatGPT o3: has it come back to us? Yeah, it's come back to us. All right, let's preview this bad boy. See what we got. La la la. So, ChatGPT o3 seems to have totally failed. You can see it shows an error message right there. So, the UI does not work at all. So, I'm going to place that behind Gemini and Claude. So far, I would say that Gemini is winning, simply because it gave us the most relevant responses when we asked it questions. And then DeepSeek R1. It's giving us an output, but it's in Python. I'm going to say "turn that into HTML" and see what we get back in a sec.

History by says Gemini 2.5 Pro is better at coding, but worse at other things.
Every model has its weaknesses, right? So, for example, if you want to generate an article or a blog, honestly, I don't even need to check all these. I would say that Claude 4 Sonnet creates the best outputs by far. Gemini is really good at coding, but it's not the best model out there, to be honest with you.

Just waiting for this to come back from DeepSeek. It's holding us back here. Slowing us down. I'll put on the next prompt to use HTML instead. ChatGPT o3 is almost guaranteed to come in last here unless DeepSeek gives us something really bad. In the meantime, we'll come back to DeepSeek whilst it's generating that, because it might take a little while, and we'll move on to the next task, which is "build a Schema Shogun puzzle game." You're a samurai cutting and placing structured data into HTML scrolls, match FAQs, blah blah blah, and we'll turn that into a game as well.

All right, so let's wait for DeepSeek R1 to load. It's actually taking absolutely ages to load the page right here. So, what I might do is just go over to OpenRouter instead. All right, so we're going to run this inside DeepSeek, and then we'll compare it against the rest as well. Let's go inside Opus here. We've got ChatGPT o3 ready to go. And then Gemini as well, Gemini 2.5 Pro. Again, I think on all of these, from what I've seen, DeepSeek is by far the slowest.

Now, we can actually run this. So, let's click on run and see what we get back. It says, "Before we begin, what's your domain name?" We'll go with okay. And then we've got the chatbot ready here. The UI is actually better than Gemini 2.5 Pro's, which is awesome. And if we go inside, okay, we'll do a little chat here. It's writing back to us. "How are you feeling about your SERPs?" I'm going to say "terrible," and then it sends the same message again. All right. So, if I had to give... Let's see if this... We'll give it one final shot. "Absolutely terrible." Say Sunshine. Oh, it's just, bro, it's a broken record, mate.
You'd give up on that therapist quickly. "How are you feeling about your SERPs today?" It's asked us that three times. The UI is great, but the actual back end is terrible.

So, if I had to grade each one of these, I'm going to go with Gemini 2.5 Pro as the winner. Claude 4 Opus coming in second. DeepSeek coming in third: the UI was great, but the functionality was terrible. And then finally, the last one, I would say, is ChatGPT o3. It didn't create anything. Now bear in mind DeepSeek is free, and you can even code with it for free with the free API of OpenRouter, whereas, for example, ChatGPT o3 is paid, and it can be pretty expensive, I think, on the API as well. So we're going to close that. We know who the winners are.

We're waiting for each of these to load on the next task. So we have the output. Again it's Gemini and then Claude coming in first in terms of speed. So, these are the fastest models by far. And then we've got Schema Shogun. "Slice the data into the correct thing," but it doesn't really seem to do anything. Doesn't make any sense. All right, let's close that. I'm going to say inside Claude, just to get a better output, and I'll do the same inside Gemini as well, let's just say, "make this a Three.js samurai runner game." So, same prompt inside Gemini and Claude.

We have the output back from ChatGPT o3. Let's preview that. There we go. That's working nicely, to be fair. I'm just going to say, "okay, make this like a Three.js runner game" on that too. And then DeepSeek is still coding out now. I'm just going to stop it, and then I'm going to say the same thing inside here. We'll wait for each of these to load. So Gemini actually has an error on this. We'll try and get that fixed. And then ChatGPT o3 says stop the HTML and then run it again. It's got a bug. So, we'll try and fix that. And back in a sec, peeps.

Here we go. So, we've got Gemini ready to go. Let's play this bad boy. It's got sound effects. There we go. Oh, no. It's not much of a samurai though.
That's what I don't like. Claude is stuck. Doesn't seem to be doing anything. ChatGPT is not working. DeepSeek is still coding away.

Here we are. All right. So, we've got Opus 4 ready to go. We've got a little samurai stickman just collecting honor. That's probably the best output so far, to be fair. So, ChatGPT o3 just doesn't seem to be working at all. I'm just going to preview that once more. Yeah, it's not working. And we've got the HTML back from DeepSeek. So, let's test this out. "Begin the journey." Ah, okay. I'm not really sure what's going on there. Well, that's the most playable, to be fair. Gemini, it's too basic. I'm sure if you go back and forth inside the chat here, you could create something interesting, but Claude is winning by far, to be fair. That's an actual playable game. Again, you could go back and forth with the details a lot more if we had more time, but just as a quick preview, I'm going to say the Claude one here. So, if I was to choose any of these, I've got to go with Claude.

Now, let's create something a bit more basic just to give each a chance. So, we're going to go back to Claude, DeepSeek, Gemini, and o3. And by the way, if you want the prompts from today, feel free to get them inside the AI Success Lab. I'm going to run this prompt for Dopamine Space Invaders. Let's turn that off. It should be on o3. So, just to recap here, if I had to pick one so far, Gemini, I would say, is probably performing the best. DeepSeek is creating good stuff as well. Claude as well is up there, but ChatGPT is not even in the race at this point.

So, ChatGPT o3 finished first here. Let's test this out. Here we go. So, space fires. Here we go. That's a pretty nice game, to be fair. Nice effects. I like the background of all the stars and everything like that. It's looking pretty cool. Gemini looks just about finished. DeepSeek is still coding. And we've got Claude ready to go as well here.
With Claude, you can actually control the game with your mouse, it looks like. And here's a shoot button. I would say that's still better than, for example, ChatGPT o3. If you look at the effects, the screen moving, the colors, the way you can control the mouse, the feel, etc., it's still better than o3, but o3 did okay.

Let's have a look at Gemini now. So, Gemini is finished. This is looking pretty nice. I like the retro feel. It's got sound effects as well. That's what we're talking about. Oh, yeah. Gemini is winning. You old rogue. He's crushing it.

All right, let's try DeepSeek. DeepSeek is still taking its time. I think so far, from the tests, I would say though that Claude Opus has created the best outputs each time, right? It's a close tie between Gemini 2.5 Pro and Opus. But I just think Opus is overpowered in terms of what it can create. The functionality, etc., is pretty awesome. So, we've got the results back from DeepSeek. Let's play this. Ah, it's lagging. It's not bad, but it's lagging.

Yeah. So, I think that's pretty conclusive then. Claude Opus would be my pick from all of these. And I know some people are going to hate me for that in the comments and say that Gemini is the GOAT, but yeah, I'm going to go with Opus, then Gemini. DeepSeek comes in a solid third, right? It's definitely better than ChatGPT o3 from what I've seen. And o3 really only worked on the last test. It didn't work on the rest.

So, thanks so much for watching. If you want to get access to all the notes from today, all the prompts, etc., feel free to get that inside the AI Profit Boardroom. This also comes with courses on DeepSeek. You can see the links here: courses and recipes on DeepSeek, Claude 4, Gemini 2.5 Pro, and ChatGPT o3. So, for example, if we click on this link, you can see we actually have a 4-hour course on DeepSeek via this link. Claude 4 as well, we have full courses on.
So feel free to get this stuff inside the AI Success Lab, along with all the prompts from today, if you want to test this stuff out. And if you want to get coaching support, community access, Q&As, if you want to be able to interact with me, etc., feel free to get that inside the AI Profit Boardroom. You can also ask any questions you have inside there. I'm very active inside this community. It comes with lots of templates, agents, workflows, all the stuff that I've actually implemented inside my business. It's fun creating games, but let's talk about what's actually going to make you money and save you time. That stuff is all included inside the classroom. And then it also comes with weekly coaching calls plus a Q&A as well. And on top of that, each week I make a personalized video answering everyone's questions.

Now, if you just want us to implement all this stuff for you, maybe you want to save a lot of time and you want someone, an agency, to just build automations for you, feel free to get a free AI automation strategy session. The link is in the comments and description. You can book in a free AI strategy session, then book in a call with us, and we'll be happy to help you and plan out, step by step, what you can automate, how you can automate it, and then the next steps for doing that. So feel free to get that link in the comments and description. Appreciate you watching.
