Building the Future of Audio Computing

Jul 15, 2024

Introduction

  • Lecturer's reflection on modern technology usage.
  • Observations about people being glued to phones.
  • Irony of criticizing excessive screen time while being on the phone.
  • Question: How can we create a healthier relationship with technology?

Background

  • Lecturer's experience in deep tech:
    • Funded deep tech at ARPA-E
    • Worked at Google X (Google's moonshot factory)
    • Created a spin-out called iyo
  • Spent 10 years researching future technologies.
  • Conclusion: Need for an entirely new kind of computer
    • Computer that speaks our language
    • Computer for seamless, intuitive conversations.

Introduction of Audio Computers

  • 6 years of secret development.
  • Characteristics of the prototype:
    • No screen
    • Entirely ear-based
    • Not just fancy earbuds but a complete computer
  • Aim: Replace visual computer interactions
  • Reliance on new audio-based user interface (AUI)
    • Natural language input
    • Auditory space output

Demonstration of Audio Computer

  • Examples of natural language conversation:
    • Conversational interaction with the system "Q"
    • Tasks like receiving positive messages, telling jokes
  • Emphasis on intuitive, less robotic interactions.

Technological Foundations

  • Revolutionary aspect: leveraging natural spoken language
  • Key elements:
    • Natural language
    • Superhuman processing speed
    • Access to internet and human knowledge databases
  • Distinction from existing voice commands.

Potential Applications

  • Examples of possible uses:
    • Conversational email briefings
    • Voice-based search
    • Personal context building through conversations
    • Less reliance on visual apps

Privacy and Wearability

  • Designed for all-day wear
  • Enhances natural hearing
  • Mixed audio-reality capabilities:
    • Enhanced ambient acoustics control
    • Ultra-high fidelity spatial sound
    • Giant audio structure for research
  • Underlying research: psychoacoustics
  • Reverse-engineering brain's sound positioning into software
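
One cue the brain uses to position sound is the interaural time difference (ITD): a sound from one side reaches the nearer ear slightly earlier. The notes do not say which models the lecturer's software uses, so the following is only a minimal sketch of a textbook approximation (Woodworth's spherical-head formula). The head radius, the speed of sound, and the function name are assumptions chosen for illustration.

```python
import math

# Assumed illustrative constants, not values from the lecture.
SPEED_OF_SOUND_M_S = 343.0   # air at roughly room temperature
HEAD_RADIUS_M = 0.0875       # commonly used average head radius


def interaural_time_difference(azimuth_deg,
                               head_radius=HEAD_RADIUS_M,
                               speed_of_sound=SPEED_OF_SOUND_M_S):
    """Woodworth approximation of ITD in seconds for a distant source.

    azimuth_deg is the angle from straight ahead, valid from 0 to 90 degrees.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius / speed_of_sound) * (theta + math.sin(theta))


if __name__ == "__main__":
    for angle in (0, 30, 60, 90):
        itd_us = interaural_time_difference(angle) * 1e6
        print(f"azimuth {angle:2d} deg: ITD about {itd_us:.0f} microseconds")
```

Spatial audio rendering can apply a delay like this (together with level and spectral cues) to make a sound appear to come from a chosen direction.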

Practical Experience Simulation

  • Restaurant noise scenario
  • Auditory transformations: enhancement or reduction of specific sounds, real-time translation.
  • Techniques involved: beamforming, auditory scene analysis, machine-learning denoising, AI transcription and translation, text-to-speech.
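
Of the techniques listed, beamforming is the simplest to illustrate. The sketch below is a basic delay-and-sum beamformer with whole-sample delays. It is an assumption for illustration, not the method used in the demonstrated device, and the helper names `shift` and `delay_and_sum` are hypothetical.

```python
def shift(signal, delay):
    """Delay a signal by `delay` whole samples, padding the start with zeros."""
    if delay == 0:
        return list(signal)
    return [0.0] * delay + list(signal[:-delay])


def delay_and_sum(channels, delays):
    """Undo each microphone's arrival delay, then average the channels.

    Sound from the steered direction adds up coherently, while sound from
    other directions is smeared across samples and so attenuated.
    """
    n = len(channels[0])
    out = [0.0] * n
    for signal, delay in zip(channels, delays):
        for i in range(n):
            j = i + delay  # sample on this mic that carries the same wavefront
            if 0 <= j < n:
                out[i] += signal[j]
    return [value / len(channels) for value in out]


# A single click reaching three microphones 0, 1 and 2 samples late.
pulse = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
arrival_delays = [0, 1, 2]
mics = [shift(pulse, d) for d in arrival_delays]

steered = delay_and_sum(mics, arrival_delays)   # aimed at the click
unsteered = delay_and_sum(mics, [0, 0, 0])      # aimed elsewhere

if __name__ == "__main__":
    print("steered peak:  ", max(steered))
    print("unsteered peak:", max(unsteered))
```

Steering toward the source recovers the click at full strength, while the unsteered sum spreads it over several samples at lower amplitude. This is the basic idea behind picking one voice out of restaurant noise.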

Future Possibilities

  • Endless potential for app development
  • Examples:
    • Personalized educational apps
    • Conversational fitness coaches
    • Relaxation soundscapes
  • Developer creativity as the limit
  • Aspiration: to create the first audio computer and a human-centric computing experience

Conclusion

  • Vision: Computing interface that speaks our language intuitively
  • Not monetizing attention, but enhancing life
  • Invitation to move towards audio-based computing.