Developing Advanced AI Assistants

Apr 11, 2025

AI Assistant Development Lecture Notes

Introduction

  • Presenter showcases a Red Bull energy drink while holding a magazine with Spanish text: "Las Florida Delo".
  • Previous project involved building an AI assistant using a microphone and webcam, which received positive feedback.

Collaboration with Life Kit

  • A company called LiveKit contacted the presenter about developing a more advanced AI assistant.
  • LiveKit provides the real-time infrastructure behind OpenAI's ChatGPT voice mode.
  • The presenter decided to rewrite the AI assistant on LiveKit's platform.

Demonstration of AI Assistant

  • The assistant can identify objects and events through webcam interaction.
  • Example: recognizing a "Happy Father's Day" card held up to the webcam and responding appropriately.

Code Overview

  • The source code is around 139 lines long, with comments for clarity.
  • Essential steps before running the code:
    • Create a virtual environment.
    • Install required libraries.
    • Set up environment variables for the LiveKit, Deepgram, and OpenAI APIs.
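
The setup steps above might look like the following. This is a sketch: the exact package names (e.g. livekit-agents and the Deepgram/OpenAI plugins) and the environment-variable names are assumptions based on common LiveKit conventions, not taken from the video.

```shell
# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate

# Install the LiveKit agents framework plus Deepgram and OpenAI plugins
# (package names are assumptions; check the project's requirements file)
pip install livekit-agents livekit-plugins-deepgram livekit-plugins-openai

# Credentials for LiveKit, Deepgram, and OpenAI
# (variable names are assumptions; consult each SDK's documentation)
export LIVEKIT_URL="wss://your-project.livekit.cloud"
export LIVEKIT_API_KEY="..."
export LIVEKIT_API_SECRET="..."
export DEEPGRAM_API_KEY="..."
export OPENAI_API_KEY="..."
```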

Assistant Functionality

  • The assistant interacts primarily through text, only accessing the webcam for specific queries.
  • This prevents unnecessary data transfer and speeds up interactions.

Function Calling

  • The assistant uses function calling to determine when an image is needed.
  • When a question requires visual context, the assistant requests an image instead of trying to fetch it on every request.
  • This is achieved by returning a function call indicating the requirement for an image.
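
The on-demand image flow described above can be sketched as a small dispatch loop. This is a simplified stand-in: the model call is mocked, and the function name `capture_image` is a hypothetical example rather than the video's actual schema.

```python
# Sketch of function-calling dispatch: the model only asks for an image
# when the question needs visual context. The "model" here is mocked.

def mock_model(question: str) -> dict:
    """Stand-in for an LLM with a `capture_image` tool registered."""
    visual_cues = ("see", "look", "holding", "webcam", "how many fingers")
    if any(cue in question.lower() for cue in visual_cues):
        # The model returns a function call instead of a direct answer.
        return {"type": "function_call", "name": "capture_image"}
    return {"type": "text", "content": f"(answer to: {question})"}

def capture_image() -> bytes:
    """Stand-in for grabbing a frame from the webcam."""
    return b"<jpeg bytes>"

def handle(question: str) -> str:
    response = mock_model(question)
    if response["type"] == "function_call" and response["name"] == "capture_image":
        frame = capture_image()
        # In the real assistant, the frame would be sent back to the model
        # as additional context for a second completion.
        return f"(answer using {len(frame)}-byte image for: {question})"
    return response["content"]
```

A purely textual question goes straight to a text answer, while a visual one triggers the image path first, which is what keeps ordinary requests fast.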

Source Code Implementation

  • The main entry point is defined in the code, establishing a chat context and initializing the AI model.
  • The assistant's personality is injected through a system message.
  • Voice activity detection and speech-to-text are integrated using Deepgram.
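
The personality injection via a system message might look like this minimal chat-context sketch (the class and field names are illustrative, not LiveKit's actual API):

```python
from dataclasses import dataclass, field

@dataclass
class ChatMessage:
    role: str      # "system", "user", or "assistant"
    content: str

@dataclass
class ChatContext:
    messages: list[ChatMessage] = field(default_factory=list)

def build_context() -> ChatContext:
    """Create a chat context seeded with the assistant's personality."""
    ctx = ChatContext()
    # The system message defines the assistant's personality and rules,
    # e.g. when it is allowed to request a webcam image.
    ctx.messages.append(ChatMessage(
        role="system",
        content=(
            "You are a helpful voice assistant. Only request an image "
            "when the user's question requires visual context."
        ),
    ))
    return ctx
```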

Handling Events

  • Events are used to manage messages and function calls:
    • message received: fired when a new message arrives and has been parsed.
    • function call finished: fired when a function call completes, letting the assistant respond using the captured context.
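
The event wiring can be sketched with a tiny emitter. The event names below mirror the notes; they are not LiveKit's exact API.

```python
from collections import defaultdict
from typing import Callable

class EventEmitter:
    """Minimal pub/sub: handlers are registered per event name."""
    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable]] = defaultdict(list)

    def on(self, event: str, handler: Callable) -> None:
        self._handlers[event].append(handler)

    def emit(self, event: str, *args) -> None:
        for handler in self._handlers[event]:
            handler(*args)

log: list[str] = []
assistant = EventEmitter()

# Fired when a new message arrives and has been parsed.
assistant.on("message_received", lambda msg: log.append(f"got: {msg}"))

# Fired when a function call completes; the assistant can now respond
# using the captured context (e.g. the webcam frame).
assistant.on("function_call_finished", lambda result: log.append(f"done: {result}"))

assistant.emit("message_received", "how many fingers?")
assistant.emit("function_call_finished", "image captured")
```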

Practical Usage and Test

  • The assistant can be run through the command: python assistant.py start.
  • The presenter connects to the playground, allowing the assistant to access the webcam and microphone.
  • Demonstrated functionalities include counting fingers and identifying objects, showcasing how the assistant interacts.

Conclusion

  • The presentation concludes with an invitation for viewers to like and subscribe for more content.