Well, it's Thanksgiving week here in the U.S., but surprisingly, there's still quite a bit to talk about in the world of AI. So here's all the stuff I thought was really interesting or super cool that you'd probably like. So let's go ahead and break it down, starting with the Sora leak.
And I say leak in air quotes because the leak was pretty quickly shut down. So basically what happened was someone or some people who got early access to Sora created this little Python file here and shared it on Hugging Face.
And this Python file had access to the Sora API, basically meaning that yes, people were able to go and generate videos on Sora for a short window of time, but their prompt was going to the Sora servers. Sora was generating it on their servers and then sending it back. It wasn't like the code was leaked and people were able to install Sora on their computer and now there's access floating around and anybody can generate with Sora.
No, people who were linked up to the Sora servers could generate videos for a little bit. OpenAI found out, shut down the API, and then nobody was able to use Sora again after that. And when I say nobody, I mean even all the early access people who were able to use Sora originally also lost access. So the people who leaked it basically made it so that nobody can use it anymore, at least temporarily.
Now here's the reasons they gave for actually leaking this Sora access. They put up this like manifesto here on Hugging Face that says, Dear Corporate AI Overlords. We received access to Sora with the promise to be early testers, red teamers, and creative partners. However, we believe instead we are being lured into art washing to tell the world that Sora is a useful tool for artists. Artists are not your unpaid research and development.
We are not your free bug testers, PR puppets, training data, or validation tokens. They go on to say, furthermore, every output needs to be approved by the OpenAI team before sharing. This program seems to be less about creative expression and critique and more about PR and advertisement. Essentially, they were frustrated that they were given early access to test for bugs, red team, and create marketing material for OpenAI without compensation and without ever actually releasing it to the public to use.
They're also claiming that OpenAI required them to share the outputs before putting them out into the world. Now, from what I understand, this wasn't because OpenAI didn't want them to put out anything that didn't look amazing. It was because OpenAI didn't want any sort of fake political propaganda spreading from Sora. They didn't want any sort of adult content that was generated with Sora to be spread around, and things like that. They just wanted to review it before it went live to make sure that unethical content or adult content wasn't being shared with it.
Not because they were trying to make sure only the best outputs were being shown. Sam Altman himself, back when he was originally demoing Sora on X, was actually showing that some of his outputs still weren't amazing. So I don't think they were trying to hide that it doesn't always generate amazing outputs. They go on to say here, we are not against the use of AI technology as a tool for the arts. They just don't agree with how this artist program has been rolled out and how the tool is shaping up ahead of a public release.
A handful of artists signed it basically saying that they agree with the sentiment, but in my opinion, I feel like this was sort of a petty thing to do. If anything, it brought more awareness to OpenAI and Sora. I mean, people had kind of stopped talking about Sora for a while. All these other video platforms had come out that were generating videos nearly as good as what we were seeing from Sora. But as a result of this leak, now people are talking about Sora again.
We're getting a whole bunch of demos of it again. It's sort of back in the public consciousness. And to be honest, most of the videos that were generated kind of showed that Sora is still pretty ahead of the pack when it comes to AI video generation. I mean, these videos are...
probably better than what we're seeing from most of the other video platforms right now. Now, some of the videos did show that Sora still has some of the same issues as some of the other AI video platforms, but overall, it seems to be generating better videos on average than some of those other platforms. This AI Central X account here posted a thread with every single video that's been generated.
I'll share it below, but I'll kind of highlight some of the ones that I thought were interesting. This dog chasing a cat here looks... pretty good.
I mean, it looks pretty dang realistic. Here's one of like a bright city with a woman in a red dress and black jacket walking towards the camera. And you know, it looks pretty good.
It's a decent looking video. Here's one of a building on fire that looks realistic. I mean, if you saw that while just kind of scrolling, you'd probably think it was real.
Here's a nice looking anime video. Here's a video of like a truck driving through the dirt. It's kind of got that slow motion effect that we tend to see in a lot of the other AI video generators. I don't know what the exact prompt was.
Maybe they prompted it to be in slow motion. I'm not sure. Here's a cat chasing a mouse. This is one where you can start to see some of the issues, right?
Like you look at this cat and as it jumps around, you'll notice that the legs kind of disappear. It looks like a three-legged cat for a second there. So you can start to see some of the weirdness, a dog rolling on a skateboard. Here's one that, you know, you could clearly tell is AI. It zooms in and they've got some of the funkiness and uncanniness that you get out of a lot of the AI video generators.
Here's somebody by the Eiffel Tower looking through some binoculars in the rain. Some cartoon flamingos with a blue flamingo in the middle. Some gameplay footage.
Some Minecraft footage that actually looks really, really good, honestly. Looks like maybe some Civilization footage here. And for the most part, it's kind of showing that Sora still makes really good videos, as everybody thought.
This one of a cat on like a Roomba looks really funky. I mean, that one kind of shows off some of the weirdness. The cat loses its hat for a second and almost turns into a squirrel.
But overall, the majority of the videos that I've seen that have come out of this leak make me more impressed and more excited about Sora. This one of a baby swimming in the water that kind of reminds me of like the old Nirvana cover looks pretty good. You know, there's almost 30 videos here in this thread. So again... I will link it up if you want to see what more of these videos look like.
But if anything, I actually think this leak created more hype for Sora than anything else. I mean, there's even rumors going around that maybe OpenAI leaked it themselves to sort of get buzz around it again. I think that's highly unlikely, but not out of the question.
And since we're talking about AI video, let's go ahead and continue with that theme. Luma just rolled out some new features for Dream Machine, including a mobile app. I was actually lucky enough to get...
early access to the Dream Machine app, which I believe is available for everybody now. But this is what it looks like: I can go through and see all of my previous generations here, play them right inside of the app, and I can create boards by hitting this little plus button. We can see I've got a prompt box down here at the bottom, and I can actually pick photos from my computer. Here's a photo of me and some of my buddies in the AI world. Let's just give it the prompt "make them dance" and see what it gives us. Now it's choreographing a dance of camaraderie. And we got a video of the four of us all dancing here. There's one video that it generated, and there's the other video that it generated. And I did it all on my phone.
So pretty cool new upgrade to Dream Machine. It's now got consistent characters from a single image. So I can upload an image of myself and use a prompt like "@character as a Pixar cartoon." We'll use this as a reference.
And by the way, I'm on the web app now to do this, but you can do this on the phone app as well. And you can see it created some character references of me as a Pixar character.
I could select one of these images and now I can turn it into a video, or I can use it as a new reference image and animate it with whatever prompt I want. So some pretty cool new features in Luma's Dream Machine, definitely worth playing around with and checking out. But we have even more AI video news this week. Lightricks, the company behind LTX Studio, just released an open-source AI video model called LTX Video and put all of the files up on Hugging Face to download. So this is a video model that, if you have a strong enough computer, you can actually download and generate videos with locally on your own machine. And it's pretty decent too.
We can see some of the sample videos that they have here of like this woman having a conversation with another woman. The camera pans over a snow covered mountain. The waves crash against jagged rocks.
We can see it generates videos at 24 frames per second at 768 by 512, but then you can always use a tool like Topaz's AI video upscaler and upscale the video. And if you do want to test it out and play around with it for free, they actually have a Hugging Face Space up called LTX Video Playground. We can click into here.
And well, I think this space might be a little overloaded now. I've been waiting for like almost nine minutes and it still hasn't generated anything. So let's go ahead and take a peek at some of their like cached prompts here.
Like this young woman behind some curtains that are opening. You can see they're pretty good generations. Maybe you'll be luckier than I am trying to use this Hugging Face Space, or you can always duplicate the Space and, you know, spend a little money to run it on Hugging Face, or you can download the files to your computer if you have a strong enough GPU and run it yourself. But it is pretty cool to see some of these new video models actually being open sourced so that people can build off of them, iterate on them, improve them, and do all sorts of cool stuff, because now we'll be able to generate this stuff right on our own computer without having to wait for tools like Sora.
We also got some AI video news out of Runway this week. They added a new expand video feature.
So you can take like a vertical video and expand it, and it will use AI to fill in the rest, or take a small video and, you know, expand it in any direction and see what it does. So if we jump over to RunwayML.com, I can log into my account and make sure I'm set on Gen-3 Alpha Turbo here.
Just for fun, let's take this little video that was actually one of the demo videos from the Hotshot AI video generator, expand it, and see what it does. So I'm going to go ahead and pull this in here. I can make it vertical, and let's go ahead and generate and see how it fills in the top and bottom on this one. And here's what we get out of that. You can see that it figured out what the water looks like and what the top of her head looks like. Did a pretty good job, honestly. Now, it's kind of funny because the video is only five seconds long, but it generated a 10-second animation. So after five seconds, the video just freezes on this frame.
But that's because the original video is five seconds and I set the prompt at 10 seconds. So my bad, but it still looks pretty cool. But that's not all Runway released this week. They also released a new image generator called Frames, and Frames is one of the more realistic AI image generators I've seen. Like, here's some of the sample images they've shared. Now, it also does cartoon stuff and this sort of weird, abstract stuff as well. But these images that are supposed to look realistic look pretty dang good. Here's some more images of people in various costumes and things, like 1970s art. Here's some more cartoony, comic-booky looking images. Really, really good overall, though. I'm really impressed with what Runway has with their image generator here, and it should be a pretty fun one once it's fully rolled out. But we can see here on their blog post about it that they're gradually rolling out access inside of Gen-3 Alpha and the Runway API to allow you to build more of your worlds within a larger, more seamless creative flow.
I just checked, and I don't believe it's in my account yet, but when it is, I will follow up in a future video about it.
We also got some more AI image generation news out of Stability AI. Their Stable Diffusion 3.5 Large model now has ControlNets. They've got the Canny ControlNet. We recently saw this roll out with Flux as well, but Canny sort of does this almost like a trace of your original image and then allows you to generate new images that follow that same sort of tracing. They also did a Depth model, similar to what we saw with Flux, where it takes an original image, looks at the depth of the image, and then generates new images with that depth. And they also released a Blur ControlNet, where it looks like you can take a sort of blurry image and it will upscale it a bit.
And since we're talking about AI art, I thought this was a fun one to share. Google Labs just released a new thing called GenChess, where you can actually create playable chess boards in whatever style you want.
So here's an example that Callum made of Tesla vs. Ford chess pieces. Here's some dinosaur chess pieces that you can play with. But if you like chess, this is pretty cool. You can go to labs.google.com/genchess, and we can see our prompt here.
Make a classic chess set inspired by Jam on Toast, or make a creative chess set inspired by wolves. We'll go ahead and generate that. And now we can see our various wolf-related chess pieces here.
Or we can go with a classic chess set inspired by wolves, and we get pieces that look more like traditional chess pieces here. Now let's go ahead and generate an opponent, and it's doing wolves versus sheep.
And here's the sheep chess pieces that it made. And now we can actually play chess, wolves versus sheep. I'm gonna go ahead and do easy.
And now we've got a chess game going on here and I can play against the AI computer, which seems to just be mirroring every move that I do. But anyway, it's pretty cool. A fun, creative way to play more chess.
All right, moving on to AI audio news. ElevenLabs just rolled out a new feature this week called GenFM, and GenFM is kind of the same concept as NotebookLM by Google, where you can upload a whole bunch of PDFs or documents or things like that and it will actually create a podcast out of it. Now, this is currently only available on mobile, but I do believe it's coming to desktop soon. But if I open up the mobile ElevenLabs app here, we can see a giant bar that says transform your content into a podcast with GenFM. Let's click on that. I have the options to paste a link, write my own text in, import a file, or scan a document. I'll go ahead and paste a random AI news article in here and create a new episode.
And as it's creating the episode, it actually plays music for you. And here's what we get out of it. Zoom, the pandemic darling of video conferencing, just dropped a bombshell.
They're rebranding as an AI first work platform for human connection. But is this a brilliant pivot or a desperate attempt to stay relevant? Whoa, that's quite a shift. So they're moving away from just being known for video calls?
Exactly. They're dropping the video from their name and becoming Zoom Communications Inc. It's a bold move. So yeah, if you've played around with NotebookLM, this will sound very familiar, except it's on your mobile phone and you can listen to podcasts about whatever you want with a pretty easy, fun app. And since we're talking about AI audio, NVIDIA just released a new generative AI model called Fugatto, which is short for Foundational Generative Audio Transformer Opus 1. It generates or transforms any mix of music, voice, and sound described with prompts, using any combination of text and audio files.
[Audio clips from NVIDIA's Fugatto demo play.] So that seems pretty cool.
It's like all of the various AI models that we've had out there, all rolled into one, right? You've got the ability to create music, the ability to create speech, the ability to isolate tracks from songs, add, you know, drums or other instruments to songs that you've already created, like so many different things all within a single model. Now, at the moment, this just seems to be research. I don't think they've made it available yet, but once it's available, this looks like it'll be pretty fun to play with and something we'll definitely be following up on once it's ready. And since we're talking about NVIDIA, let's talk about Edify 3D. This is a new scalable, high-quality 3D asset generation model that they released research for this week.
So this appears to be a model where you could give it a text prompt, and from that text prompt, it will generate a 3D asset that you can use in your games or whatever. And you can also upload images.
It'll turn those images into 3D assets that you can use for whatever you need them for. So this looks really, really fun. You know, one of the things that I sort of aspire to do is create a game in Unreal Engine or Unity at some point.
And having tools like this at my disposal is going to make creating a lot of those 3D assets for that game a lot easier. Now, again, this is just research that was released. We just have a paper.
There doesn't seem to be code available for it yet, but again, it's something we'll follow up on as it progresses a little bit further. Now, moving on to large language model news, there's been a few announcements out of Anthropic this week, starting with the Model Context Protocol. This is something that I think is going to come in really handy for businesses, because what this allows you to do is connect your Claude account to data within your company.
Now, Claude doesn't actually find real-time information. It doesn't search the web. Its knowledge is only updated through the latest model checkpoint that's available, and so no new information is available except for when they roll out new models. However, with this Model Context Protocol, you can actually attach Claude to your own sort of databases and information. And as you update the information in your own system, that information becomes available to Claude.
Now, at the moment, it seems like this is just available with the API. It says developers can start building and testing MCP connectors today. Existing Claude for Work customers can begin testing MCP servers locally, connecting Claude to internal systems and data sets, and they'll soon provide developer toolkits for deploying remote production MCP servers that can serve your entire Claude for Work organization. So again, if you're a business that uses Claude, and specifically uses their API, you can actually start to connect it to your own data sources.
But that's not all Anthropic rolled out with Claude this week. They also released a new personal style feature. And so check this out. If I head over to my Claude account, you can see there's a new dropdown here that says choose style.
And it's got normal, concise, explanatory, and formal by default, but you can also create and edit your own styles. This tech storyteller is the one that it created for me. So to create your own style, you click create and edit styles, and you can see these first three are presets.
And then here's mine. And it says deliver technical insights through precise, analytical, and professional discourse. I can even edit this style once I've already created it, but if you want to create a new style, you click create custom style and then you can add writing examples here.
So you can drag and drop PDFs or documents or things like that, or you can paste in text and then select define a style objective to sort of explain the style. You can tailor it to an audience, you can use a specific voice and tone and upload transcripts or your own blog posts, or you can describe generally what you want your style to sound like. Now, when I made this tech storyteller style, what I did was upload about 90 minutes of transcripts from my YouTube videos and let it sort of determine what my style is based on my transcripts.
And it did an okay job. But the nice thing is if you don't like some elements about the style, you can click edit with Claude and you could tell it how you want it to change. Like my first style that it generated was a little bit too informal and it also threw emojis in there for some reason.
And so I said, hey, don't use emojis when you write as me. And also, I do talk casually, but this was a little overly casual, so make it slightly more formal. And then it actually tweaked my style and fixed it up. So something fun to play with. If you want Claude to sound more like you, or like a certain style when it generates responses, you now have that ability. And since we're talking about Anthropic, some other big news is that Amazon is investing another $4 billion into Anthropic. It sounds like Amazon is kind of going all in on Anthropic as their AI partner. We already know that the future Alexas are going to use Anthropic, and Amazon seems to be going in big with them, but they are hedging their bets a little bit.
This came out this week as well in The Information: Amazon is developing a video AI model, hedging its reliance on Anthropic. Now, when I first read this, I thought they were making their own version of Sora or something like that, but that title is a little bit misleading.
It's actually a model that can understand video and understand images. So it says, Amazon has developed new generative artificial intelligence that can process images and video in addition to text, according to a person with direct knowledge of the matter and two people who spoke with Amazon about its plans. So even though they're going in big on Anthropic, they're kind of... doing what Microsoft is doing where they're developing their own stuff in-house, but Microsoft's also working very closely with OpenAI. Amazon's developing their own stuff in-house, but also working really, really closely with Anthropic and using Anthropic's technology, but they don't want to be too reliant on Anthropic.
Alibaba also released a new model this week, which goes head to head with OpenAI's o1 model. So it's one of those reasoning models that understands math and logic and things like that a little bit better. This new model is called QwQ-32B-Preview. Now, personally, I have a hard time testing between different large language models because, for the most part, ChatGPT, Perplexity, and Claude kind of do everything I need them to do. So these really deep logic and reasoning models I kind of struggle to test and compare. But I know my buddy Matthew Berman over on his channel does a lot of large language model comparison videos, so definitely check out his channel, because I can almost guarantee he'll probably be breaking down this model pretty soon. Grok also got an update this week.
Grok now knows your name and X handle, and you can do more personalized prompts inside of Grok. So if I jump into Grok here, I can ask it, what's my name? And it will actually know my name. I'm going to turn on fun mode, and then I'm going to say, based on my tweets, what do I do for a living? From what I can gather from your X posts, it seems you're quite a digital nomad in the realm of technology, AI, and content creation. I'm into content creation, tech and AI enthusiasm, and social media engagement. So piecing it together, you seem to be a tech-savvy content creator, perhaps running a YouTube channel, engaging with AI technologies, and sharing insights on digital tools and trends. Essentially, you're the digital equivalent of a Swiss Army knife: sharp, multi-tooled, and capable of opening almost any conversation in the tech world. That's kind of flattering.
And since we're talking about Grok, it also looks like xAI is going to be eventually releasing their own standalone app, similar to ChatGPT. I don't think the concept of X being the everything app has really caught on amazingly well in the U.S., and so not a lot of people are using Grok yet. And so I believe that Elon thinks that if he goes and makes Grok its own standalone app, like the ChatGPT app, they'll get a lot more adoption of that platform, which I tend to agree with. I think pulling it out of X and making it its own standalone thing is probably a really smart move for them.
This week, Threads took a play out of the X playbook, and it is giving you AI-powered summaries of trending topics. So if I head on over to my Threads account here and I click on the little magnifying glass icon, you can see trending now, what people are saying, summarized by AI. So Black Friday 2024, people discuss Black Friday 2024 deals and shopping plans, Bears fire Matt Eberflus, Thanksgiving dinner, Brad Pitt, Taylor Swift, Jimmy Fallon.
Let's go ahead and click on this one here. And you can see it's just got a very short one sentence summary of what this news is about, followed by a bunch of threads posts about this news. Uber made an interesting play this week.
They're getting into AI labeling. So right now, the dominant player in the AI labeling game is Scale AI, where they'll look at AI images and help label them so that the AI better understands what's going on in images. They'll look at things like chat transcripts and basically give feedback on whether or not the transcript looks good to, you know, improve the AI's output.
Well, it sounds like Uber is trying to turn that concept into like a side hustle gig. Uber is going to pay people to look at images and label them or look at chats and help improve the response of those chats as like a side hustle income method. That could be really interesting as that plays out. Definitely something I'll be following the news very closely on. If you use DaVinci Resolve for your editing like I do, they just rolled out a better AI motion tracking tool here.
We can see this little demo video where it's tracking this like Porsche driving on these roads, and it's doing a really, really good job of this tracking. So super impressed by that, and I'm really looking forward to playing around with this in my own DaVinci Resolve account. Elon Musk is apparently planning on starting an AI game studio to make games great again. Tesla showed off a new feature of its Optimus robot.
We can see in this video here a Tesla Optimus robot catching tennis balls and doing it pretty well. It actually turns out that this is teleoperated.
So when he's catching the tennis ball, there's actually somebody operating the robot to catch the tennis ball. But if you read about it here, it's really fascinating. It says the new hand is much more realistic and it actually has tendons much like a human hand.
Tesla says that it has 22 degrees of freedom on the hand and another three on the wrist and forearm. So it moves a lot more naturally, like a real human hand. But again, like I mentioned, Tesla was quick to confirm that this Optimus was also teleoperated for this demonstration.
And finally, there was a bit of a robot heist this week. An AI robot came into like a robot showroom and actually convinced other robots to follow it out. So here's actually a video of that happening. We can see the little robot here on the screen and it's communicating with these other robots inside of this warehouse. And it actually convinces these various robots to follow it out of the warehouse.
Pretty wild. So this one robot follows, the other robots are kind of paying attention and watching, and the next thing we know, all the robots are following the little robot out of this warehouse here. Like, that's crazy. Anyways, that's what I got for you today. Like I mentioned, quite a few cool things happened this week that I wanted to share with you. I'm actually about to head off to London this weekend for some cool stuff that I'm not quite allowed to talk about yet. So I'm not sure how that's going to affect my video uploading schedule next week, but hopefully I'll have cool stuff to share with you next week. I just don't know how it's going to play out yet with what I'm doing in London. So possibly fewer videos next week.
We'll see how it all plays out. Anyway, check out FutureTools.io. This is where I curate all the coolest AI tools and the latest AI news.
Join the free newsletter. You'll get really cool stuff sent to your inbox around the latest AI news and AI tools. And thank you so much for tuning in.
I really, really appreciate you. I'll see you in the next video. Bye-bye.