Voice Video Manipulator

Project Overview

The Voice Video Manipulator (VVM) was my first venture into ROS2 development, combining voice commands, computer vision, and robotic manipulation. The project demonstrates how AI can help a robot understand and respond to spoken human commands naturally.

How It Works

The system works in three main steps:

1. Speech recognition converts a spoken command such as "pick up the red ball" into text and extracts the target object.
2. The computer vision pipeline locates that object in the camera feed and calculates its position.
3. The manipulator moves to the calculated position and grabs the object.

Say "pick up the red ball" and the robot will find the red ball, calculate its position, and grab it!

System in Action

VVM System Overview

[Figure: Voice Video Manipulator system architecture]
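
As a rough sketch of how this architecture maps onto ROS2, the node below stands in for the voice front end and publishes recognized commands on a topic for the vision and manipulation nodes to consume. The node name, the /voice_command topic, and the std_msgs/String message are assumptions, not the project's actual interfaces.

```python
import rclpy
from rclpy.node import Node
from std_msgs.msg import String

class VoiceCommandNode(Node):
    """Publishes recognized voice commands for downstream nodes.

    Node and topic names here are assumptions made for this sketch;
    the real VVM interfaces may differ.
    """

    def __init__(self):
        super().__init__("voice_command_node")
        self.publisher = self.create_publisher(String, "/voice_command", 10)
        # In the real system a speech recognizer would drive this;
        # a timer stands in for recognized speech here.
        self.timer = self.create_timer(2.0, self.publish_command)

    def publish_command(self):
        msg = String()
        msg.data = "pick up the red ball"
        self.publisher.publish(msg)
        self.get_logger().info(f"Published: {msg.data}")

def main():
    rclpy.init()
    node = VoiceCommandNode()
    rclpy.spin(node)
    rclpy.shutdown()

if __name__ == "__main__":
    main()
```

Splitting the system into small nodes like this is the usual ROS2 pattern: each subsystem can be developed, tested, and restarted independently.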

Object Detection System

[Figure: Object detection and distance calculation in real time]

Computer Vision Pipeline

The vision system processes camera feeds to identify target objects, locate them in the frame, and estimate their distance from the camera.
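
Below is a minimal sketch of one way the color-based detection and distance estimation could work with OpenCV: HSV thresholding to find the red ball, the largest contour as the detection, and a pinhole-camera model for distance. The thresholds, focal length, and ball diameter are placeholder values, not the project's calibration.

```python
import cv2
import numpy as np

# Assumed camera constants for illustration; the project's actual
# calibration values are not given on this page.
FOCAL_LENGTH_PX = 600.0   # focal length in pixels
BALL_DIAMETER_M = 0.07    # real-world diameter of the target ball

def find_red_ball(frame_bgr):
    """Return (cx, cy, distance_m) of the largest red blob, or None."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Red wraps around the hue axis, so combine two ranges.
    mask = cv2.inRange(hsv, (0, 120, 70), (10, 255, 255)) | \
           cv2.inRange(hsv, (170, 120, 70), (180, 255, 255))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    (cx, cy), radius_px = cv2.minEnclosingCircle(largest)
    if radius_px < 5:          # ignore tiny noise blobs
        return None
    # Pinhole model: distance = focal_length * real_size / pixel_size.
    distance_m = FOCAL_LENGTH_PX * BALL_DIAMETER_M / (2.0 * radius_px)
    return int(cx), int(cy), distance_m
```

The pinhole relation distance = focal_length_px * real_diameter / apparent_diameter_px gives a metric distance from a single image once the object's real size is known; a depth or stereo camera would replace this step with a direct measurement.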

Complete Voice Control Demo

[Video: Complete voice-controlled pick-and-place operation]
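
To show how the full pick-and-place flow could be glued together, the sketch below waits for a pick command, then forwards the next detected object position to the arm controller. All topic names and message types here are assumptions made for illustration.

```python
import rclpy
from rclpy.node import Node
from std_msgs.msg import String
from geometry_msgs.msg import PointStamped

class PickCoordinator(Node):
    """Sketch of the glue between voice, vision, and the arm.

    The /voice_command, /detected_object, and /grasp_target topics
    are hypothetical names chosen for this example.
    """

    def __init__(self):
        super().__init__("pick_coordinator")
        self.create_subscription(String, "/voice_command",
                                 self.on_command, 10)
        self.create_subscription(PointStamped, "/detected_object",
                                 self.on_detection, 10)
        self.target_pub = self.create_publisher(PointStamped,
                                                "/grasp_target", 10)
        self.pending = False

    def on_command(self, msg):
        # A real parser would extract the color/object here.
        self.pending = "pick up" in msg.data.lower()

    def on_detection(self, msg):
        if self.pending:
            # Forward the detected position to the arm controller.
            self.target_pub.publish(msg)
            self.pending = False
            self.get_logger().info("Grasp target sent")

def main():
    rclpy.init()
    rclpy.spin(PickCoordinator())
    rclpy.shutdown()

if __name__ == "__main__":
    main()
```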

Technical Implementation

Built With

ROS2, Gazebo, Python, OpenCV, Machine Learning, Speech Recognition, Computer Vision

Key Features

- Voice-controlled operation: natural commands like "pick up the red ball"
- Real-time object detection with distance calculation
- Automated pick-and-place with a robotic manipulator
- Simulation support in Gazebo

Challenges Solved

My First ROS2 Experience

This project was my introduction to ROS2 development. It taught me how to:

- Structure a robot application as separate nodes communicating over topics
- Integrate speech recognition and computer vision with robotic manipulation
- Test and debug robot behavior in Gazebo simulation

The experience gained from VVM became the foundation for all my future robotics projects.