Agentic AI Browsers Overview

As someone that's been covering AI daily for almost three years, I feel I'm kind of able to spot trends before most people. And don't get me wrong, there's a lot of fads and duds and hype in the AI world, and then there's some big splashes that amount to nothing. But there's been a recent tidal wave of momentum over the past 3 weeks in one area that I don't see fading, and that's agentic browsers. And not only do I think this isn't a fad, I actually think this is the future of work for most of us. And it requires a much different approach than we've been working with over the past 3 years since the generative AI boom. And that's why I think agentic AI browsers are the next frontier of artificial intelligence. So, we're going to be talking about that and a lot more on today's episode of Everyday AI. What's going on, y'all? My name's Jordan Wilson and welcome to Everyday AI. This is your daily unedited, unscripted live stream podcast and free daily newsletter helping everyday business leaders like you, like me, not just keep up with this flurry of AI developments, but make sense of it and leverage all of it to grow our companies and our careers. So, if that's what you're trying to do, this is your new home. It starts here with the podcast and live stream, but if you really want to take it to the next level, you need to go to youreverydayai.com. There, sign up for the free daily newsletter, where we will be recapping the biggest takeaways from today's show as well as keeping you up to date with everything else you need to know in the world of AI. And you can go listen to now like 570 back episodes from some of the smartest people in the world, all for free, sorted by category on our website. So if you want the AI news, make sure to check today's newsletter for that.
But let's get straight into it and let's talk about agentic browsers and agentic AI in the browser, and how I think this is actually going to be the default way that we work with today's large language models. All right, so here's my hottest take yet this year. I think that agentic browsers will be more powerful than their AI chatbot counterparts, and, maybe not by total number of users but just by overall direction, I think they're going to be more popular as well. So what does that mean? As an example, and we're going to be talking about it a little bit in today's episode: Perplexity. Perplexity came out with their very popular agentic AI browser called Comet. So what I'm saying is, I think in 2025 Comet will probably have more upside, more momentum, and be more useful overall than its large language model front-end counterpart, Perplexity. Right? So the comparison is going to perplexity.com and using Perplexity as a chatbot versus using it as an agentic AI browser inside the Perplexity Comet browser. So yeah, still in this year, I think agentic browsers are going to be more popular and more commonplace, with more momentum behind developing them. And why? Well, once you start using them you'll understand, but I think agentic AI browsers are all about action and task completion, whereas I think front-end chatbots are usually more about research, personalization, and content creation. Right? But it still feels like there's a step to bring better data in before you get started, and then a couple of steps to make use of all of this content that you create in a front-end chatbot. So, let's preview what we're going to go over the rest of today's show. Stick around and I'm going to give you the five main advantages of AI browsers over front-end chatbots. I'm going to show you the recent momentum behind this movement from big tech, which makes this hot take, I would say, kind of undeniable.
And then I'm going to preview some of the up-and-coming agentic AI browsers that are worth paying attention to. So why does this matter? Right? Maybe you don't care. "I don't care about agentic AI browsers. I like using ChatGPT here or there. I like using Gemini or Claude. Why does it matter?" Well, internet browsing is habitual. It has become so ingrained. If you're a knowledge worker sitting in front of the computer, which I'm guessing is like 99% of our audience, these kind of old-school processes are literally ingrained in your brain. And I talk so much about unlearning, and how large language models have grown in maturity and capabilities, with reasoning, being able to think through problems step by step, being able to agentically go through their tool use, right? I think you have to unlearn these now-automated ways that we work as humans. And I think one of the biggest culprits is just opening a browser and going about something the old way that you've always done. But I think the future of work is humans orchestrating large language models in agentic AI browsers, not just manually going through the task like we've been rewarded for doing for many decades. And I think the other reason why this really matters now is that developments are happening so fast, faster than in the general generative AI space, which is actually hard to believe, but it is true. All right. Also, I'm curious, live stream audience: have you used an agentic AI browser yet? There's a couple of them out. I'll say, at least among the big names, probably one of the more popular ones is Perplexity's Comet. We've now seen some reports that said OpenAI is working on their browser, and their new ChatGPT agent, I think, is essentially setting the groundwork for how it will work inside of a dedicated browser.
But let me know, yes or no: are you using agentic AI browsers right now? Yes, no, interested, not interested. Because I might end up doing a follow-up show, maybe this Wednesday or next Wednesday, for putting AI to work on Wednesdays. So if this is something where yes, you're very interested in it, yes, you care about it, yes, you're using them, we'll probably schedule another couple shows, so please let me know. But now, let's go over the five reasons why agentic browsers are the future of everyday AI use. Number one, there's no middleman. All right. And I should probably zoom out a little bit and first explain the difference between agentic AI browsers and large language models. So I kind of already gave you an example of it: Perplexity versus Perplexity Comet. But essentially you have your AI chatbots, right? That's what we all call them. That's going to a large language model on the front end. So there's the front end and the back end of large language models, right? Most of us are probably front-end users. We go to chatgpt.com, gemini.google.com, claude.ai, whatever it is, and we log in. And then there's back-end development, where you can work with APIs. We're not talking about that. So, you have your front-end AI chatbots. And then you now have a new category, agentic AI browsers. So in the example of Perplexity's, which we'll talk about a little bit more, instead of logging into perplexity.com, you download their browser. It's based on Chromium, so if you're a Chrome user, which I think most people are, it easily imports all of your data. And then it has this hybrid AI approach. So that's one of the big differentiators, and that's point number one: there is no middleman. When you log into a front-end chatbot, it almost feels like an unneeded degree of separation between our data context and our end goal. Right?
And I've been seeing that and realizing that more and more in the last week or two. The more I've been using Perplexity's Comet, ChatGPT's agent, and Google's Mariner, which I'll talk about a little bit more, the more I use these agentic browsers, or agentic tools that have virtual computers and virtual desktops, the more I realize that working with large language models the traditional way doesn't always seem like the most intuitive, right? It almost seems like there's this degree of separation, this middleman. And I think that's one of the reasons why agentic browsers are going to pick up in popularity. Reason number two: direct access to logged-in content. So let's say you are a power user, right? Let's just say you're spending 8 to 10 hours a day in large language models. That's not most of us. For me, that's a lot of my days sometimes. But I think you can actually save hours a day, and not only that, but get better results, by having direct access via an agentic AI browser. So what that means is, when you are logged in to an agentic browser, it just knows it. It has access to all of your open tabs. File sharing and copying and pasting are cut down. Even just the traditional context window kind of goes out the window, right? Because you can still open multiple tabs in an agentic browser, but those tabs also serve as memory for your context, and so does wherever you're logged in. Which brings us to point number three: richer context. In agentic browsers you have much richer context for your tasks. You can connect online through your browsing history, your emails, your web pages, PDFs, whatever you're reading, etc. All of those are instant context.
So again, that saves you from having to go, "Oh, I need to open this file first, and then I need to download it, and then I need to upload it to the LLM. Oh, wait, this is the wrong format, or it's not reading something correctly." Right? It's just so much faster and more intuitive to have that richer context available immediately. All right, let's go to reason number four. It's just less duct tape, right? Don't get me wrong, I'm very bullish on these new agentic AI protocols. So what that means, and this is the easiest way to explain it that I've used before here on the show: websites have traditionally talked to each other via APIs, right? It's essentially a language that allows two different websites to talk to each other, share data dynamically, and update things on the back end without a human needing to be involved. So we have that now with AI, right? APIs didn't really work with AI and large language models. They needed their own language. So then we had the very popular Model Context Protocol, MCP, from Anthropic. We have Google's version of that, which is A2A, agent to agent, etc. So these are all great, don't get me wrong, and they're all very much needed. I think with agentic AI browsers there's just less duct tape. You don't need to set up and configure MCPs or agent-to-agent protocols, and there's another half dozen now very popular agentic AI languages that allow agents and large language models to talk to each other and share data in the background. So yes, I'm still bullish on that category. I still believe it's very much needed. But when you're using an agentic browser, you don't need any of that. You just need to log into the websites that you need, right? And I gave this example when we did a Perplexity Comet show. As an example, our podcast is hosted on Buzzsprout.
We use beehiiv for our newsletter. We have Google Analytics data for our website, right? All of these different things. Generally, what I would have to do is log into all those systems. Then I would have to download all of this data. I'd have to make sure it's in the right format and make sure it's all clean data. Then I would have to upload it to a certain large language model, right? And if I really wanted to take it further agentically, and I wanted dynamic, easy access, then I would have to configure an MCP server for one of those avenues. But there's less duct tape with an agentic browser. If you're logged in, you don't need to do all of those things. It is automatically able to access your dynamic data. Is it slower? Absolutely. Is it more stable? Probably, right? And we've seen so far over the last couple of months, with the explosion of the Model Context Protocol, there's also been a surge of problematic MCPs as well. Right, some phishing scams, all of these things. When something gets popular way too quickly, the bad actors capitalize on that as well. So yes, I understand this less-duct-tape approach in a browser might be much slower, but I do think ultimately it requires less tech know-how. And for personal use, if I have to choose between an agentic AI browser versus configuring an MCP, I mean, it obviously depends on the use case, right? But in many instances I'm just going to go for the one that's in the browser. All right. And then reason number five that I think agentic AI browsers are going to potentially overtake using large language models on the front end is that it's just faster, right? And I know this goes against the MCP example. What I'm talking about right now is a lot of these virtual browsers. So let's talk about Operator, OpenAI's Operator. It's very powerful. It's very slow, right?
You have to log in on the front end. It's actually operator.chatgpt.com. And it essentially uses computer vision and kind of takes screenshots to navigate around the web. So if you're trying to do something that involves navigating around the web, it can actually be hard using these more agentic features of a front-end large language model. I know we're getting a little tricky there, and I'm kind of walking a tight line here, but I think the easiest example is Operator: very powerful, but extremely slow. But when you bring in agentic AI, you're moving everything into the browser, right? So it's almost like you have this hybrid approach, where for basic and much-needed things, you're able to process those in the browser, kind of like with edge AI. If you have an AI chip on your smartphone, or an NPU, an AI chip, in your computer, something like that, certain things are faster because you don't have to send every single process to the cloud. So that's another huge advantage: just having that on-device speed. And the big tech momentum on this is undeniable. All right. But before we get to that, a quick word from our partners. This podcast is supported by Google. Hi folks, Paige Bailey here from the Google DeepMind DevRel team. For our developers out there, we know there's a constant trade-off between model intelligence, speed, and cost. Gemini 2.5 Flash aims right at that challenge. It's got the speed you expect from Flash, but with upgraded reasoning power. And crucially, we've added controls like setting thinking budgets so you can decide how much reasoning to apply, optimizing for latency and cost. So try out Gemini 2.5 Flash at aistudio.google.com and let us know what you built. So, the momentum around agentic browsers from big tech companies is undeniable. And I'm looking here.
Everything on my list, except for some of these agentic browser pieces from Google, has really developed over the last two weeks. Right. So the first, and I think one of the more impressive, is actually Perplexity's Comet. We did do a dedicated show on that, but let me just give you a little bit of information. So this is, again, agentic AI in the browser. It does have that hybrid AI architecture. So it kind of cuts out the middleman of an AI chatbot, and everything is happening in the browser. And just about a week and a half ago, in episode 568, we went over five business use cases. So again, I would say by most arguments, Perplexity is probably a top five, top 10 AI company, depending on what you're looking at. So hats off to them. Six months ago, I said they're going to have to hard pivot or they're going to get squashed, and they are leading the pack, I'd say, at least in getting a truly agentic AI browser to market. All right. Next, OpenAI's ChatGPT agent. So yes, you're still using this on the front end of a large language model, but hear me out, because it has its own virtual browser. That's the key here. ChatGPT's agent operates by essentially launching its own virtual computer, and on that virtual computer, it uses a virtual browser. And I talked about how eventually OpenAI will have its own browser. It's been reported, and we talked about that with the reporter from Reuters who broke that story in episode 565, if you want to go back and listen to that. So the ChatGPT agent uses its own virtual computer, including a virtual desktop, virtual browser, etc. And it combines deep research, Operator, terminal access, and some new capabilities. And I do assume that at some point this summer we'll see OpenAI's browser.
It's been reported by just about every single outlet that they're going to be releasing a browser based on Google's Chromium. And FYI, Perplexity Comet is based on Google's Chromium. OpenAI's ChatGPT agent: based on Google's Chromium, right? So that's their kind of open-source version of Chrome. And hey, here's one other company going agentic in the browser based on Google's Chromium: one of their biggest competitors, Microsoft, with their Edge browser. All right. And there is a pretty interesting update that just dropped days ago to Microsoft Edge and the Copilot Vision feature within Edge. So I won't say that Edge is a fully agentic AI browser, but over the last few months, and especially with this update last week, they are bringing more and more agentic capabilities to the Edge browser. So one is just Copilot Vision. That's Microsoft's AI tool that can analyze and assist with content shown on your screen. The July 2025 update lets it view any app or window on your desktop that you choose, not just Microsoft Edge. So that one is pretty cool. Obviously you can use Copilot Vision and it can understand and interact with anything on your screen, but being able to bring your desktop into it, I think, starts to build a more cohesive agentic workflow. Also, you can control what Copilot sees, making it easier to get context-aware help with anything on your desktop. And like I said, it's based on Chromium. And now the quiet leader in the space. I don't think Google has the best product when it comes to agentic AI. But here's the thing I'm thinking of: okay, the three other big leaders that I just talked about, they're all using Google's Chromium.
So I have to believe it's going to be any month now that Google is going to come in with a huge update, whether it's to their Chrome product, whether it's to Project Mariner, which we're going to talk about now, or maybe their Gemini assistant, their in-browser assistant. But you have to believe Google is probably going to be striking back. And they already have, like I said, two or three different versions of agentic browsing going on. So one is with Project Mariner. Unfortunately, Project Mariner, I think, is only rolled out to some trusted testers and then to people on their Gemini Ultra plan, which is something that I pay for, obviously. So I've been able to use Project Mariner. If you don't know, here's what Project Mariner is and some of its agentic capabilities. So it's a virtual browser, right? It's kind of weird. It's like a browser in a browser, just like Operator. So it's not truly agentic at the core yet, because this is more of a service that you would use while logged into Google Chrome. It agentically browses the web. One unique feature that I really love with Google's Project Mariner is that it has the teach-a-task mode, where essentially you can share a tab, you can talk to Project Mariner, go do a series of actions, and then it learns that and can repeat it, which is amazing. I hope that all other agentic browsers get something like that. Also, Google does have a Gemini assistant in the browser. I don't think a lot of people know about that, but it's just in the upper right-hand corner. So, pretty cool. Also, I think that Google's Chromium holds the key here, right? Because Comet and Edge and OpenAI's browser are all going to be based on Chrome, or sorry, Chromium. So I am expecting something big from Google in this regard when it comes to agentic browsers.
They have the feelers out there. They're getting data and feedback from their trusted testers. They're putting out some different agentic capabilities in different parts of their suite of products. So I would expect something big from Google, probably before the end of 2025. All right. So there you have it. The biggest companies, at least, are going all in now, right? Aside from Google, where I think most of those developments were announced in March and April, everything else has been over the last week or two. So the momentum that we've seen from Microsoft, from OpenAI, and from Perplexity, and now from a lot of these startup contenders, is going absolutely bonkers. Okay, so now, last but not least, let's talk about some up-and-coming agentic browsers that are worth keeping an eye on. So first is the Fellou AI browser. This is an agentic AI browser featuring a Deep Action workflow that automates cross-site, multi-step tasks in shadow windows. It's built on the Eko framework, enabling customizable agentic workflows via natural language or JavaScript code, and it can access both public and password-protected sites, that's important, to generate comprehensive reports or perform research. Then we have Neon from Opera. So Opera is one of the, kind of, second-tier leaders, I would say, in the browser wars, right? You have Google Chrome, and then everyone based off Google Chrome, you have Firefox, you have Safari, and that I would say is tier one. And then tier two, I think Opera, Brave, and others are right there in the mix. So now we are getting more of these agentic offerings. And this is different than their normal Opera browser. Opera Neon is a new experimental agentic AI browser offering different modes, labeled Chat, Do, and Make, to autonomously code websites, generate games, fill forms, and also book travel.
So it's kind of like half large language model, half agentic browser with Neon. That does require a paid subscription, like so many of these do, and it's waitlisted right now, but it integrates contextual AI to perform offline tasks without switching apps or tabs. So, some pretty unique features in Opera's Neon. Next we have another Chromium-based offering, which I think has probably been one of the more popular agentic browsing startups, and that's Dia. This is from The Browser Company. They've kind of gone all in on Dia after a pivot themselves. So Dia chats with open tabs, rewrites or translates text inline, and also helps plan tasks. It can also route your queries through a skills system that selects the best LLM or tool for each task. And it also leverages local browser cookies and encryption to access logged-in sites safely for contextual assistance. That's one of the biggest advantages. As you can see, I'm bullet-pointing some of the features of these more startup agentic browsers, and one of the biggest things is, hey, once you log in, and it stores that via cookies, you don't have to log in again, right? Let's just say there's 10 sites. I kind of gave you my example, right? We use Buzzsprout for distribution. I check Spotify for our podcast stats. I check Apple Podcasts for our podcast stats, Google Analytics for our traffic, Google Search Console. I track everything in beehiiv. That's for our free daily email newsletter. So I have like 10 or so websites that I'm constantly logging into almost every day, right? Our YouTube analytics, all of these things. With these agentic AI browsers, Dia and all these others that I mentioned that have this logged-in state, that's what's so great about it.
You don't have to set up and monitor custom MCP servers, or wait maybe months or years for a connector to come to one of these services if they are more niche. So just think of all those websites that you constantly log into. Right, that's the big advantage of these agentic AI browsers: being able to take advantage of those logged-in sites and go on its own. You can just kind of work out a prompt that says, "All right, go visit these 10 sites, pull all my stats, go look at my email, and then give me a plan for the day." Or if a certain KPI that you're tracking has been tanking the last week, an agentic browser can go through and know that, and then it can go see if there are any emails, any open conversations, about that stat or what maybe caused that spike or drop. So that's a huge part. And I do think right now, depending on the browser you're talking about, you do still have to get a little bit back into the prompt engineering side of things if you want something super autonomous and impressive. But I think in the short run, agentic browsers are just about giving you better results in less time than a large language model. And then last but not least, this one has a little bit of an asterisk because it's not technically a browser, but I think I still have to mention Manus. They do have a cloud browser. I would love to see Manus bring an actual browser out. We'll see. This is more one of the more popular agentic AI platforms, but recently they launched a cloud browser, which is why I think, okay, this is close enough that we should probably include it in the agentic browsers to keep an eye on.
So it's a general AI agent, very much like ChatGPT agent, that turns thoughts into actions, autonomously executing complex tasks across work and life domains without continuous human prompts. It's built on a multi-agent architecture. That's pretty cool. It uses Claude 3.5 Sonnet and Qwen models. Manus is a Chinese product, so it uses Qwen and some other models. There's 29 specialized tools, including browser use for web automation. And then, like I said, it does have a cloud browser. I think it's better to have a local browser, but a cloud browser, I think, will still suffice if you want agentic browsing, because the biggest thing is, again, keeping that login state so you can have it go check your email, check whatever piece of software is very important for your day-to-day work. So it's able to continue those tasks independently via the logged-in state in the cloud browser. All right, so that was a lot of information in a short amount of time. So let's just recap, and let me make one more case for why I think that agentic AI in the browser is the next frontier of artificial intelligence and why you and your company need to be paying attention. If I'm being honest, front-end chatbots, yes, they're great. Yes, they're going to continue to be developed. Yes, I'm going to talk about them every day on this show. But that's yesteryear's technology. I do think that all these developments that we're seeing right now in front-end chatbots, right, that's chatgpt.com, gemini.google.com, claude.ai, copilot.microsoft.com, many of these features are, for the most part, going to end up in the browsers. And I think agentic browsers are the future. We can't ignore the facts, the receipts, the writing on the wall, the momentum, all the big tech companies and startups getting behind this movement.
And like I said, the progress in this space is currently outpacing large language model development, and nothing outpaces large language model development. Y'all, I've covered this every single day for almost 3 years. Large language models have been developed and innovated and iterated upon faster than probably any technology we've ever seen. And I think, especially over the last couple of weeks, the momentum behind the agentic browser space, both from big tech and startup companies, is undeniable. It is strong and it is not going to stop, because every big tech company is investing in agentic browsers. And I do think it's that major missing link between the potential of large language models and what business leaders like you and I actually need. We need actions. We need observability. And we actually need task completion. And I think that's what agentic AI will deliver on. And remember, today's versions: not good, right? But think back to the first time you used a chatbot. Think back to the first time you used Bard, and then think of how good Gemini 2.5 Pro is. Think of the first time that you used ChatGPT 3.5, and then think of how good the models are now. Think of how good these agentic browsers are going to be. And I will end by saying this: if you're not experimenting weekly with agentic browsers, you will quickly fall behind. All right. I hope this was helpful, y'all. If it was, please let me know. Reach out in the show notes for the podcast, which I would love it if you would subscribe to. Leave us a rating on the podcast. We always leave information there to reach out to me. Also, if you are listening on the podcast and you want to watch the video, there wasn't too much on the video side today, but we always have that in there as well. So thank you for tuning in. If this was helpful, please tell someone about it and then go to youreverydayai.com.
There, we're going to be recapping the highlights from today's show, as well as keeping you up to date with everything you need to be the smartest person in AI at your company or in your department. So, thank you for tuning in. Hope to see you back tomorrow and every day for more Everyday AI. Thank you so
