AI Weekly Update: New Releases and Major Developments
Jul 7, 2024
AI Weekly Update
Overview
Tracking significant AI releases and news.
Focus on practical applications usable by the public.
Moshi AI by Kyutai
Open-source real-time voice assistant.
Developed by a France-based lab; aims to compete with OpenAI's GPT-4o voice assistant.
Features:
Low-latency web interface.
7 billion parameters in base model.
Emotional awareness and tone modification (performance varies).
Issues encountered during testing:
Frequently interrupts the user.
Limited emotional detection and voice modulation capabilities.
Expected to be integrated into many applications due to open-source nature.
Runway Gen-3 Video Generator
State-of-the-art AI video generator now widely available.
Previously showcased by Andrej Karpathy.
Examples and capabilities:
Various scenarios like a T-Rex surfing or a medieval village in the style of Hieronymus Bosch.
High-quality, immersive video generation.
Cost concerns:
Uses credits; expensive ($1 per 10-second clip).
Requires multiple iterations for desired output, increasing cost.
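To make the cost concern concrete, here is a rough back-of-the-envelope sketch. It assumes a flat price of $1 per 10-second clip (the figure quoted above) and that every regeneration is billed at the same rate; the function name and the attempts-per-clip numbers are hypothetical and are not taken from the tool's actual pricing page.

```python
import math


def estimated_cost(target_seconds: float, attempts_per_clip: int,
                   price_per_clip: float = 1.0, clip_seconds: float = 10.0) -> float:
    """Estimate spend for a video assembled from fixed-length generated clips.

    Assumption: every attempt (including discarded ones) costs price_per_clip.
    """
    # Number of clips needed to cover the target length, rounded up.
    clips_needed = math.ceil(target_seconds / clip_seconds)
    # Each clip is regenerated attempts_per_clip times before one is usable.
    return clips_needed * attempts_per_clip * price_per_clip


# A 30-second video where each clip takes 5 tries to get right:
print(estimated_cost(30, 5))  # 15.0
```

Under these assumptions, even a short 30-second piece costs about $15 once retries are counted, which is why the number of iterations matters more than the per-clip price.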
Real-World Use Case:
Motorola used AI video tools in an ad campaign.
ElevenLabs Reader App
iOS app (US only, future global rollout planned) for text-to-speech using AI voices.
Features iconic voices like James Dean and Burt Reynolds.
Also introduced an AI tool that isolates voices to improve audio quality.
Suno Mobile App
iOS app (US only, future Android version and global rollout planned) for AI music generation.
Luma AI Keyframes
New feature allowing smooth transitions in AI video from one object to another.
Testing showed variable success; requires many iterations to avoid hard cuts.
Usefulness is limited by the need to regenerate frequently to get the desired effect.
Perplexity AI Pro Search
New feature with multi-step reasoning and access to math, programming, and Wolfram Alpha.
Aimed at enhancing agent-based search processes.
Other Interesting Developments
WebSim AI
Creates websites from scratch using Claude 3.5 Sonnet.
Fun application example: recreating interdimensional TV from Rick and Morty.
AI Multimodal Model: Dolphin Vision 72B
Largest multimodal, uncensored AI model to date.
Requires significant computing power to run.
Figma AI Features
New AI-powered tools for UI design, including prompt-to-UI and visual search.
Recent controversy over copyright issues with the generated designs.
Features disabled temporarily for ethical review.
Hugging Face Leaderboard Update
Introduction of new benchmarks (MMLU-Pro, GPQA, MuSR).
Community voting system added for result validation.
Qwen 72B model leads current rankings.
Noteworthy Mentions
Google’s AI-powered crossword game.
New ElevenLabs audio isolation feature.
Motorola ad campaign using AI video tools.
Conclusion
AI is advancing rapidly, and practical applications are becoming more accessible.
Enthusiasts encouraged to explore and test new tools and features.
Stay updated with weekly AI news for more insights and developments.