Transcript for:
Google I/O 2024 Presentation: AI Innovations

Right after OpenAI's presentation, all of Google I/O has been dedicated to AI, as expected. And since a big theme today has been letting Google do the work for you, we went ahead and counted so that you don't have to. That might be a record for how many times someone has said "AI." Google have integrated AI into almost every product they have. So what did Google I/O bring this year? Let's talk about it.

One of the things that became very obvious while watching OpenAI's presentation was that they are focused on optimizing and improving their model. Google, unlike OpenAI, has a huge library of different products and tools, and that is where Google's strength comes from. They're not just optimizing the model, they're not just teaching the model new things; they are connecting a whole ecosystem of different products, different tools, and the knowledge graph that they've been building over the last 30 years, all of that tied into a powerful AI model. And even if that model is not as powerful as OpenAI's, which is debatable, the ability of that model to connect to all the different tools and all the knowledge and technology that Google has is huge. It's super powerful. And this year, unlike last year, Google did not disappoint at Google I/O. They presented a lot of amazing tools, a lot of amazing libraries; all in all, the presentation was impressive. We had image generation, text generation, music generation, integration with Gmail, integration with Maps, almost all of those kinds of things.

So what did they present? Let's start with the models. Google this year showed the new version of Gemini, and Gemini 1.5 Pro is going to have its context window increased to 2 million tokens later this year, which means you can basically feed it, I don't know, whole libraries, and it can read them within a few seconds. Google have started working towards integrating agents and making them part of the automation process. They even introduced what they call the AI
teammates, so far only for developers and Workspace users, but still, it's pretty impressive.

We got the new image model, Imagen 3, "our most capable image generation model yet," which is integrated into the ImageFX tool. Imagen is actually in many ways on par with Midjourney v6: it's really realistic, the images are very high quality, and it finally got much, much better with text. Now, I'm not sure if it's as good as what OpenAI showed yesterday for image generation with longer text as part of GPT-4o (if you didn't see that, watch the video here), but they're definitely not waiting to be out-competed. They added generated music to the MusicFX tool, with what they call DJ Mode, and they even had a nice little pre-conference show where a musician was generating music live with generative AI: "so we have the sort of melodic element of the vibe still in there, we can pull that out... chiptunes, what do you think about chiptunes? Let's see what happens."

Google finally showed the new video model, called Veo, and it can generate pretty impressive 1080p video from text. Google's main advantage is their infrastructure, and for that they've been going deeper and deeper into hardware, basically building a supercomputer, what they call an AI Hypercomputer, to support AI generation. OpenAI just got the most advanced GPUs from Nvidia and that's it, right? They don't necessarily go into hardware yet, even though there are rumors that they are going to. Google announced the new Trillium TPU chips, which will be available at the end of 2024, and they are going to make Blackwell, the latest GPU from Nvidia, available on their cloud in 2025.

Google are also making their watermarking, or fingerprinting, tool for marking AI-generated synthetic media (videos, images, and other types of media) open source and available to everybody. They call it SynthID, and it's a tool that adds invisible watermarks to synthetic media.
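To make the idea of an invisible watermark concrete, here is a minimal toy sketch. To be clear, this is NOT SynthID's actual algorithm: Google's scheme is a learned, far more robust method, and everything below (the least-significant-bit trick, the function names) is my own illustration of the general concept that a mark can be embedded in pixel data without a visible change.

```python
# Toy illustration only: hide one watermark bit per pixel in the
# least-significant bit (LSB) of 8-bit grayscale pixel values.
# SynthID's real method is different and much more robust; this just
# shows that embedding a mark need not visibly alter the media.

def embed(pixels, bits):
    # Clear each pixel's LSB, then write the watermark bit into it.
    return [(p & ~1) | b for p, b in zip(pixels, bits)]

def extract(pixels):
    # Read the watermark back out of the LSBs.
    return [p & 1 for p in pixels]

image = [120, 121, 119, 200, 201, 202, 50, 51]
mark  = [1, 0, 1, 1, 0, 0, 1, 0]

stamped = embed(image, mark)
assert extract(stamped) == mark
# Each pixel value changes by at most 1, which is imperceptible:
assert all(abs(a - b) <= 1 for a, b in zip(image, stamped))
print("watermark recovered:", extract(stamped))
```

A real scheme also has to survive compression, cropping, and re-encoding, which is exactly what a naive LSB trick cannot do and why SynthID uses a learned approach instead.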
By the way, until they actually release that, my next video is going to be about how to recognize and detect AI images, so be sure you're subscribed so you don't miss it. Google's biggest advantage is that they have both hardware and extremely strong software and algorithms teams, and so they developed Gemini Nano, which can run directly on your device and will be available on Pixel phones later this year. What that allows is doing things like voice recognition, summarization, and image manipulation right on the device, without sending any data to the cloud; everything happens on device, which means it's very fast, it's pretty accurate, and it's very private. Google presented a very interesting demo of the platform detecting when there is a spam or scam call and notifying the user, never sending the audio to any server, doing it fully on device, and that was actually very impressive.

And Google being Google, basically a company built by developers for developers, they presented a lot of really impressive and really interesting things for developers. They presented Gemini 1.5 Flash, which lets you use 1 million tokens of context very, very quickly, with super fast response times, and I feel like Gemini 1.5 Flash is kind of Google's response to GPT-4o from yesterday, because the goal here is to minimize latency, not to increase capabilities. Google introduced a very interesting project called Project Astra, which is again very similar to the GPT-4o demo we saw yesterday, and they even revived the Google Glass, at least that's what it seems; not officially, but it was in all the demos and it was used on stage, so that's kind of impressive and interesting.

As you might know, Google have released Gemma, an open-source LLM which competes with Meta's Llama, and you can run it yourself on your own machine and use it to build other, more specific models on top of it. It's not as powerful as Gemini, but it's a lot more flexible, and you can build a lot on top of it. To add to the existing collection of Gemma models, they announced PaliGemma, a version of Gemma that supports images alongside text inside the model, so you could build your own mini Gemini with open-source code. That said, I'm not sure about the license of that one, like, can you make a commercial product based on Gemma? Knowing Google, probably not, but it's definitely a very powerful and interesting tool for research and for building new tools on top of it.

And of course, not to be outdone by OpenAI with the ChatGPT app, Google released a new Gemini app. The Gemini app has a mode called Gemini Live, which is a conversational mode, and it supports Gems, as Google calls them, which are very similar to OpenAI's custom GPTs, and I anticipate that Google will release some kind of store for Gems. I must say, I like the name "Gems" a lot more than "custom GPTs"; it just makes more sense.

The Google product that everybody is most interested in, and most curious how it will look in this AI era, the era of Gemini, is Google Search. Google started experimenting with the AI search experience last year: "in the past year we've answered billions of queries as part of our Search Generative Experience." And it's
an interesting idea, but there's a lot of risk of it killing both Google's ads and the SEO industry. I don't think Google have figured out how to handle ads in that product yet; they're still not there. Google have added AI Overviews, which summarize whatever you're searching for, and search now supports multistep reasoning, so for example you can say "show me all the pizza places that are within walking distance from me and have vegan options" and Google will Google for you. It became super powerful in all kinds of planning, so if you want to build a travel plan or a meal plan, you just ask it, "give me a meal plan for up to X calories for the next week," and it will do that, with a one-click export to Google Sheets, or it can order the ingredients. Super powerful, super interesting, still disruptive to the ecosystem; we'll see how Google handles that.

And this "ask Google," or "ask Gemini," feature set is not limited to Search. Google have integrated it into Google Photos as well, so now it can actually extract ideas and concepts from your photos. You can ask "show me my daughter's first steps" and it will find that moment in your videos and show it to you; the depth of what you can ask Photos got crazy smart.

Another product that is very interesting to watch evolve, because it's such a flagship product for Google, is Gmail. I'm super happy to see that they finally integrated "summarize this email" and suggested replies, and you can also talk to your email. So if you have ten different invoices in your inbox, you can say "summarize those invoices for me" and it will collect and summarize all of them; with a click of a button you get a summary of the invoices in Google Sheets and copies of the invoices themselves in Google Drive, and with another click of a button you can automate that process and it will keep doing it forever. Super powerful stuff, super
exciting; I cannot wait to use it. I personally have been using Superhuman for quite a while, and a lot of those features are in there, but Google can do it better, and I've been very disappointed with Superhuman's AI integration. Gemini looks so much better, and in so many ways Gemini is more powerful than ChatGPT, so that's really exciting.

One product that was noticeably absent from the presentation is Android. In fact, Google presented the Pixel 8a a week before Google I/O, even though in previous years Google have presented new hardware and new Android versions at Google I/O. Now, there is an Android keynote for the latest beta on the second day, but Android went from being center stage, the main thing, to something like third priority, not even second. There are a lot of interesting things, like tap-to-search being back. Remember when you could tap and search what's on your screen? Well, now there are two versions of it. One is Circle to Search, which honestly I don't get why Google are promoting so much: they have marketing campaigns, they showed it on stage like five times, they promoted it with Samsung. What is going on, why does Google care about Circle to Search so much? No idea; if you have any idea, let me know in the comments. Another interesting thing is context-aware search with Gemini. Let's say you're watching a video on YouTube: you can press search, Gemini pops up, and it has a button saying "ask this video," and you can talk to the video, ask it to summarize it or extract key parts. So bye-bye, watch time, it was nice meeting you.

As far as availability of all of those things: ImageFX, MusicFX, and VideoFX (with Veo) are all available only in the US, with a waitlist at labs.google, and you can sign up today, but God knows when you'll get actual access. The new search capabilities are going to be deployed throughout the US shortly. The new summarize feature in Gmail is available now, the Q&A and reply features in Gmail are going to be available in July, and the more advanced summaries and automation are going to be available in September to Google Labs users, so again, US only. Gemini Live is going to be available this summer, presumably to everybody, but we'll see. Gems will open up in the coming months; there's no commitment to a specific time. Gemini Nano should appear on Pixel devices later this year, and probably in the coming betas it's already going to be available if you own a Pixel device. I'm not sure when it will appear on other devices, if at all, because it relies on Google's custom Tensor chip inside the Pixel. For developers, PaliGemma is available right now, and Gemma 2 will be available in June with a 27-billion-parameter model that is optimized to run on Nvidia hardware, so pretty quickly.

One thing that's really interesting, as a way to summarize, is that Google relies very heavily on the fact that Apple cannot replicate on-device AI logic, especially as we saw now with the partnership that Apple did with OpenAI: Apple's internal AI team, even though they had a few major stars, cannot deliver. And look, I love Apple, I'm using quite a few Apple products, but for some reason Apple always sucked at software and were amazing with hardware, and that didn't change; almost nothing that they built by themselves has ever worked efficiently as software. That's not true of the open-source platforms that Apple are involved in, like Safari's WebKit or CUPS; all of those are high quality, but again, not fully developed in-house.

One thing I kept thinking while watching Google I/O this year is: will next year's Google I/O be fully AI-generated by itself? I'll leave you with that thought. So, I'll see you next time. I feel like making a video. Bye!