AI-Powered Virtual Assistant: Jarvis

Jul 12, 2024

AI-Powered Virtual Assistant: Jarvis

Introduction

  • Development of a virtual assistant named Jarvis which responds to voice commands.
  • Utilizes speech recognition, web browser interaction, and OpenAI for text generation.
  • Capabilities include opening websites, playing YouTube music, fetching news, and AI-powered responses.
  • Built to authenticate developing and working knowledge of Python packages such as speech_recognition, pyttsx3, webbrowser, and openai.

Pre-requisites

  • Install the required Python packages using pip:
    pip install speech_recognition pyttsx3 webbrowser openai pyaudio
    
  • Install setup tools:
    pip install setup tools
    

Main Steps in Building Jarvis

Initialize Speech Recognition and Text-to-Speech

  • Import required libraries and initialize modules:
    import speech_recognition as sr
    import pyttsx3
    import webbrowser
    from openai import OpenAI
    import random
    import os
    
  • Set up recognizer and text-to-speech engine:
    recognizer = sr.Recognizer()
    engine = pyttsx3.init()
    

Define Speak Function

  • Create a function to make Jarvis speak using pyttsx3:
    def speak(text):
        engine.say(text)
        engine.runAndWait()
    

Implement Voice Command Listening and Responding

  • Use speech_recognition to listen for commands and convert them to text:
    while True:
        with sr.Microphone() as source:
            print("Listening...")
            audio = recognizer.listen(source)
            try:
                command = recognizer.recognize_google(audio)
                command = command.lower()
                print(f"Command: {command}")
                if 'jarvis' in command:
                    speak("Yes, what can I do for you?")
                    # Further commands can be processed here
            except Exception as e:
                print(e)
    

Adding Capabilities

Open Websites

  • Add functionality to open websites by voice command:
    if 'open google' in command:
        webbrowser.open('https://www.google.com')
    elif 'open facebook' in command:
        webbrowser.open('https://www.facebook.com')
    elif 'open youtube' in command:
        webbrowser.open('https://www.youtube.com')
    

Play YouTube Music

  • Use a predefined music library to play specific YouTube songs:
    music_library = {
        'stealth': 'youtube-url',
        'march': 'youtube-url',
        'skyfall': 'youtube-url'
    }
    
    if 'play' in command:
        song = command.replace('play ', '')
        if song in music_library:
            webbrowser.open(music_library[song])
        else:
            speak("Sorry, I don't have that song in my library.")
    

Fetch News

  • Integrate news API to fetch latest news headlines:
    import requests
    
    def get_news():
        url = 'http://newsapi.org/v2/top-headlines?country=in&apiKey=yourapikey'
        news = requests.get(url).json()
    
        headlines = [article['title'] for article in news['articles'][:5]]
        for headline in headlines:
            speak(headline)
    

AI-Powered Responses using OpenAI

  • Integrate OpenAI to handle complex queries:
    import openai
    openai.api_key = 'your-openai-api-key'
    
    def ai_response(prompt):
        response = openai.Completion.create(
            engine="text-davinci-003",
            prompt=prompt,
            max_tokens=150
        )
        return response.choices[0].text.strip()
    
    # Use in your command processing
    output = ai_response(command)
    speak(output)
    

Enhancements and Final Words

  • This assistant is a starting project. You can extend its capabilities by adding more commands.
  • Add scheduler capability to remind you to take breaks or notify you of appointments.
  • Fine-tune speech recognition timeout and phrase time limit for better performance.

Example Code

import speech_recognition as sr
import pyttsx3
import webbrowser
from openai import OpenAI
import requests
import random
import os

recognizer = sr.Recognizer()
engine = pyttsx3.init()

# Open AI
import openai
openai.api_key = "your-openai-api-key"

def speak(text):
    engine.say(text)
    engine.runAndWait()

def listen():
    with sr.Microphone() as source:
        print("Listening...")
        audio = recognizer.listen(source)
        try:
            command = recognizer.recognize_google(audio)
            command = command.lower()
            return command
        except Exception as e:
            print("Could not understand the audio")
            return ""

def open_website(url):
    webbrowser.open(url)

while True:
    command = listen()
    if 'jarvis' in command:
        speak("Yes, what can I do for you?")
        command = listen()
        if 'open google' in command:
            open_website('https://www.google.com')
        elif 'open youtube' in command:
            open_website('https://www.youtube.com')
        elif 'play' in command:
            song = command.replace('play', '').strip()
            # Assign URL for songs
            songs = {
                'stealth': 'https://www.youtube.com/watch?v=your-url',
                'march': 'https://www.youtube.com/watch?v=your-url',
            }
            if song in songs:
                open_website(songs[song])
            else:
                speak("Sorry, I don't have that song in my library.")
        elif 'news' in command:
            get_news()
        else:
            response = ai_response(command)
            speak(response)

Additional Libraries

  • In case you want to use Google's GTTS (Google Text-to-Speech):

    from gtts import gTTS
    import os
    
    def speak(text):
        tts = gTTS(text=text, lang='en')
        tts.save('temp.mp3')
        os.system('mpg321 temp.mp3')
    
  • To play audio in python:

    import pygame
    
    def play_audio(file_path):
        pygame.mixer.init()
        pygame.mixer.music.load(file_path)
        pygame.mixer.music.play()
    

Enjoy building and enhancing your Python virtual assistant!