AI Weekly Update: New Releases and Major Developments

Jul 7, 2024

Overview

  • Tracking significant AI releases and news.
  • Focus on practical applications usable by the public.

Moshi AI by Kyutai Labs

  • Open-source real-time voice assistant.
  • Built by a France-based lab; aims to compete with OpenAI's GPT-4o voice assistant.
  • Features:
    • Low-latency web interface.
    • 7 billion parameters in base model.
    • Emotional awareness and tone modification (performance varies).
  • Issues encountered during testing:
    • Frequently interrupts the user.
    • Limited emotional detection and voice modulation capabilities.
  • Expected to be integrated into many applications due to open-source nature.

Runway Gen-3 Video Generator

  • State-of-the-art AI video generator now widely available.
  • Previously showcased by Andrej Karpathy.
  • Examples and capabilities:
    • Sample prompts include a T-Rex surfing and a medieval village in the style of Hieronymus Bosch.
    • High-quality, immersive video generation.
  • Cost concerns:
    • Credit-based pricing; expensive at roughly $1 per 10-second clip.
    • Requires multiple iterations for desired output, increasing cost.
  • Real-World Use Case:
    • Motorola used AI video tools in an ad campaign.
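
The cost concern above comes down to simple arithmetic, sketched here under stated assumptions: every attempt is billed at the quoted $1 per 10-second clip, and the attempt counts are illustrative, not measured. The function name `total_cost` is hypothetical.

```python
def total_cost(price_per_clip: float, attempts: int) -> float:
    """Total spend when every generation attempt is billed at the same price."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return price_per_clip * attempts


# USD per 10-second clip, the figure quoted in the notes above.
PRICE_PER_CLIP = 1.00

# Illustrative attempt counts (assumed, not taken from the source).
for attempts in (1, 5, 10):
    print(f"{attempts:>2} attempts: ${total_cost(PRICE_PER_CLIP, attempts):.2f}")
```

If a usable clip takes ten tries, the effective price of that one clip is ten times the listed per-clip price.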

ElevenLabs Reader App

  • iOS app (US only, future global rollout planned) for text-to-speech using AI voices.
  • Features iconic voices like James Dean and Burt Reynolds.
  • Also introduced an AI tool that isolates voices to improve audio quality.

Suno Mobile App

  • iOS app (US only, future Android version and global rollout planned) for AI music generation.

Luma AI Key Frames

  • New feature allowing smooth transitions in AI video from one object to another.
  • Testing showed mixed results; many iterations were needed to avoid hard cuts.
  • Usefulness is limited by how often clips must be regenerated to get the desired effect.

Perplexity AI Pro Search

  • New feature with multi-step reasoning, math and programming capabilities, and Wolfram Alpha integration.
  • Aimed at making search more agent-like.

Other Interesting Developments

WebSim AI

  • Creates websites from scratch using Claude 3.5 Sonnet.
  • Fun application example: recreating interdimensional TV from Rick and Morty.

AI Multimodal Model: Dolphin Vision 72B

  • Described as the largest uncensored multimodal model to date.
  • Requires significant computing power to run.

Figma AI Features

  • New AI-powered tools for UI design, including prompt-to-UI and visual search.
  • Recent controversy over copyright issues with the generated designs.
  • Features disabled temporarily for ethical review.

Hugging Face Leaderboard Update

  • Introduction of new benchmarks (MMLU-Pro, GPQA, MuSR).
  • Community voting system added for result validation.
  • Qwen 72B model leads the current rankings.

Noteworthy Mentions

  • Google’s AI-powered crossword game.
  • New ElevenLabs audio isolation feature.
  • Motorola ad campaign using AI video tools.

Conclusion

  • AI is advancing rapidly, and practical applications are becoming more accessible.
  • Enthusiasts are encouraged to explore and test the new tools and features.
  • Stay updated with weekly AI news for more insights and developments.