OpenCV Object Detection Tutorial

Jul 18, 2024

Introduction

  • Objective: Implement object detection using OpenCV.
  • Outcome: The computer announces aloud the objects it detected in a live video feed.

Dependencies

Libraries to Install

  • OpenCV Contrib: pip install opencv-contrib-python
    • Includes the main OpenCV modules plus the extra contrib modules.
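  • cvlib: pip install cvlib
    • Used for object detection. Its documentation lists TensorFlow as a prerequisite, so pip install tensorflow may also be needed.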
  • gtts: pip install gtts
    • Allows the computer to generate speech.
  • playsound: pip install playsound
    • Plays sound files.
  • pyobjc (macOS only): pip install pyobjc
    • Lets playsound play audio more efficiently on macOS.
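
Before running the tutorial code, it can help to confirm that the packages above are importable. The following is a minimal sketch using only the standard library. The helper name missing_modules is hypothetical, and the names it checks (cv2, cvlib, gtts, playsound) are the import names used in the next section, not the pip package names.

```python
from importlib.util import find_spec

# Import names used later in this tutorial (not the pip package names).
REQUIRED_MODULES = ("cv2", "cvlib", "gtts", "playsound")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

An empty result means every listed module can be imported; anything reported as missing can be installed with the matching pip command above.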

Importing Libraries

import cv2
import cvlib as cv
from cvlib.object_detection import draw_bbox
from gtts import gTTS
from playsound import playsound

Accessing the Camera

  • Access the video feed from the camera:
video = cv2.VideoCapture(1) # or use index 0 for default camera
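
If the chosen index does not correspond to a connected camera, cv2.VideoCapture still returns an object, but its isOpened() method reports False. A minimal sketch of trying several indices in turn follows. The helper open_first_camera and its factory parameter are hypothetical names added for illustration; factory defaults to cv2.VideoCapture.

```python
def open_first_camera(indices=(1, 0), factory=None):
    """Return (index, capture) for the first camera that opens, else (None, None).

    `factory` is a hypothetical hook for illustration; by default it is
    cv2.VideoCapture, imported lazily so the function can be exercised
    without a camera by passing a stand-in.
    """
    if factory is None:
        import cv2  # imported here so a stand-in factory needs no OpenCV
        factory = cv2.VideoCapture

    for index in indices:
        capture = factory(index)
        if capture.isOpened():
            return index, capture
        capture.release()  # free the handle before trying the next index
    return None, None
```

Calling index, video = open_first_camera() tries index 1 and then index 0; if video is None, no camera could be opened.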

Capturing Frames

  • Use a loop to read frames from the video feed:
while True:
    ret, frame = video.read()  # ret is False when no frame could be read
    if not ret:
        break

Detecting Objects & Drawing Bounding Boxes

  • Detect objects and draw bounding boxes with labels:
    # inside the while loop
    bbox, label, conf = cv.detect_common_objects(frame)
    output_image = draw_bbox(frame, bbox, label, conf)
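
detect_common_objects returns three parallel lists: bounding boxes, class labels, and confidence scores, so entry i of each list describes the same detection. As a sketch of working with that layout, the hypothetical helper below keeps only detections at or above a chosen score. The sample values are made up for illustration and are not real detector output.

```python
def filter_detections(bbox, label, conf, min_conf=0.6):
    """Keep only detections whose confidence is at least `min_conf`.

    The three inputs are parallel lists, as returned by
    cv.detect_common_objects(frame); the outputs have the same layout.
    """
    kept = [
        (box, name, score)
        for box, name, score in zip(bbox, label, conf)
        if score >= min_conf
    ]
    if not kept:
        return [], [], []
    boxes, names, scores = zip(*kept)
    return list(boxes), list(names), list(scores)


# Made-up sample data in the same layout (boxes as [x1, y1, x2, y2]).
sample_bbox = [[10, 20, 110, 220], [50, 60, 90, 100], [0, 0, 30, 30]]
sample_label = ["person", "cup", "chair"]
sample_conf = [0.92, 0.41, 0.77]

boxes, names, scores = filter_detections(sample_bbox, sample_label, sample_conf)
print(names)  # ['person', 'chair']
```

The filtered lists can be passed to draw_bbox in place of the originals. cvlib also documents a confidence argument on detect_common_objects itself; check the version you have installed before relying on it.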

Showing Live Object Detection

  • Display the output image in a window:
    cv2.imshow('Object Detection', output_image)  # inside the while loop

Breaking the Loop

  • Stop the video feed when 'q' is pressed:
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
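
cv2.waitKey(1) waits about one millisecond for a key press and returns its code, or -1 if no key was pressed. Masking with 0xFF keeps only the low byte so that it can be compared with ord('q'). The hypothetical helper below isolates that comparison so it can be tried without opening a window.

```python
def is_quit_key(key_code, quit_char="q"):
    """Return True if `key_code` (as returned by cv2.waitKey) matches `quit_char`."""
    return (key_code & 0xFF) == ord(quit_char)


print(ord("q"))               # 113
print(is_quit_key(113))       # True
print(is_quit_key(-1))        # False: -1 & 0xFF is 255, meaning "no key"
print(is_quit_key(0x100071))  # True: extra high bits are masked off
```

Once the loop has ended, the usual cleanup is to call video.release() and cv2.destroyAllWindows() so the camera and the display window are freed.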

Handling Detected Labels

Managing Labels List

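  • Store detected labels in a list without duplicates. Create the list once, before the loop, and add to it inside the loop after each detection:
labels = []  # create once, before the while loop
    # then, inside the loop, after detection:
    for item in label:
        if item not in labels:
            labels.append(item)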
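
The membership check keeps the first occurrence of each label and preserves the order in which objects were first seen. Below is a small sketch using made-up per-frame labels; the helper name add_new_labels is hypothetical, and dict.fromkeys is shown as an equivalent built-in alternative for a single flat list.

```python
def add_new_labels(labels, detected):
    """Append each label in `detected` to `labels` unless it is already there."""
    for item in detected:
        if item not in labels:
            labels.append(item)
    return labels


# Made-up labels from three consecutive frames.
frames = [["person", "cup"], ["person", "person", "chair"], ["cup"]]

labels = []
for detected in frames:
    add_new_labels(labels, detected)
print(labels)  # ['person', 'cup', 'chair']

# Equivalent for a flat list: dict keys are unique and keep insertion order.
flat = [item for detected in frames for item in detected]
print(list(dict.fromkeys(flat)))  # ['person', 'cup', 'chair']
```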

Generating Speech from Detected Labels

Creating a New Sentence from Labels

  • Create a sentence that lists all detected items:
new_sentence = []
for i, label in enumerate(labels):
    if i == 0:
        new_sentence.append(f'I found a {label},')
    else:
        new_sentence.append(label)
sentence = ' '.join(new_sentence)
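
With the labels ['person', 'cup', 'chair'], the loop above produces "I found a person, cup chair". If a more natural sentence is wanted, one possible variation (not part of the original tutorial) gives every item an article and joins the items with commas and a final "and". The helper name build_sentence is hypothetical.

```python
def build_sentence(labels):
    """Build e.g. 'I found a person, a cup, and an apple.' from a list of labels."""
    if not labels:
        return "I did not find anything."

    # Choose 'an' before a vowel letter, 'a' otherwise (a simple heuristic).
    items = [("an " if name[0].lower() in "aeiou" else "a ") + name for name in labels]

    if len(items) == 1:
        listing = items[0]
    elif len(items) == 2:
        listing = f"{items[0]} and {items[1]}"
    else:
        listing = ", ".join(items[:-1]) + f", and {items[-1]}"
    return f"I found {listing}."


print(build_sentence(["person", "cup", "apple"]))  # I found a person, a cup, and an apple.
print(build_sentence(["person"]))                  # I found a person.
```

The returned string can be passed to the speech function in place of sentence.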

Defining the Speech Function

  • Function to convert text to speech and play it:
def speech(text):
    print(text)
    language = 'en'
    output = gTTS(text=text, lang=language, slow=False)
    output.save('sounds/output.mp3')  # the sounds/ folder must already exist
    playsound('sounds/output.mp3')
  • Call the speech function with the sentence:
speech(sentence)
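
The speech function writes its audio to sounds/output.mp3, and saving is expected to fail with a file-not-found error if the sounds folder does not exist. A minimal sketch of preparing the path first with the standard library's pathlib follows; the helper name prepare_output_path is hypothetical.

```python
from pathlib import Path


def prepare_output_path(directory="sounds", filename="output.mp3"):
    """Create `directory` if needed and return the full file path as a string."""
    folder = Path(directory)
    folder.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return str(folder / filename)
```

Inside speech, the result could be stored once, for example path = prepare_output_path(), and then used for both output.save(path) and playsound(path).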

Conclusion

  • Review: In this tutorial, we used OpenCV and cvlib to detect objects in a live video feed, then announced them aloud using gTTS and playsound.
  • Next steps: Try other camera indices, adjust the detection criteria (for example, the confidence threshold), or add other features.