2022 Robotics & Computer Vision

Neato Gesture Commands

An intuitive gesture-based control system for Neato robots, using MediaPipe for hand detection and machine learning for custom gesture recognition. This project enables natural human-robot interaction through hand movements, allowing users to control robot navigation and behaviors without physical interfaces.

Controls

8+

Distinct gesture commands

Framework

ROS2

Robot Operating System integration

Recognition

Custom

Trainable gesture models

The Challenge

Creating a reliable and intuitive gesture-based control system presented a unique opportunity to explore human-robot interaction. The project required hand detection that works reliably under varying conditions, a system for training custom gesture recognizers rather than relying on pre-trained models, and intuitive mappings between gestures and robot behaviors.

The system also needed real-time processing with minimal latency for responsive control, and advanced features like path drawing had to translate hand movements into robot trajectories. Integrating gesture recognition with ROS2 demanded careful coordination between the computer vision pipeline and the robot control loop to create a seamless user experience.

Gesture Recognition System

Gesture Recognition System in Action

The Solution

We developed a comprehensive gesture recognition system built on modern computer vision techniques. MediaPipe provides accurate detection and tracking of hand key points, including fingertips and knuckles; on top of it we created a custom training pipeline for recognizing user-defined gestures and an intuitive number-based command set for basic robot control. We also implemented specific behavior routines triggered by different gestures and built a prototype path drawing feature that tracks finger movements to create custom robot paths.

Our approach to gesture control focused on user-friendliness and extensibility. We used MediaPipe's hand landmark detection to identify 21 key points on each hand, implemented custom gesture training so users could build their own gesture dataset, and designed a speed control that uses the distance between the thumb and index finger as a continuous input. The robot responds to gestures in real time via ROS2 communication, enabling seamless interaction between human gestures and robotic behaviors.
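As a rough illustration of that approach, the sketch below uses MediaPipe's Python solutions API to read webcam frames, extract the 21 hand landmarks, and compute the thumb-to-index "pinch" distance that could drive speed control. The camera index and the scaling constant are illustrative assumptions, not values from the project.

```python
# Minimal sketch: MediaPipe hand landmarks and thumb-index "pinch" distance.
import math
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def pinch_distance(hand_landmarks):
    """Normalized distance between the thumb tip (landmark 4) and index fingertip (landmark 8)."""
    thumb = hand_landmarks.landmark[mp_hands.HandLandmark.THUMB_TIP]
    index = hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP]
    return math.hypot(thumb.x - index.x, thumb.y - index.y)

cap = cv2.VideoCapture(0)  # default webcam, assumed for this sketch
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB images; OpenCV delivers BGR frames.
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            # 21 landmarks per detected hand; treat the pinch distance as a speed knob.
            dist = pinch_distance(results.multi_hand_landmarks[0])
            speed = min(dist / 0.3, 1.0)  # illustrative scaling, not the project's tuning
            print(f"pinch={dist:.3f}  speed={speed:.2f}")
cap.release()
```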

Hand Tracking System

MediaPipe Hand Tracking Visualization

How It Works

Our implementation went through several distinct development phases. After evaluating several computer vision libraries, we chose MediaPipe for its robust hand detection: it provides pre-built functionality for identifying and tracking hands, including the detailed finger position data our gesture recognition depends on. Rather than using pre-trained models limited to standard gestures, we built a pipeline for creating and training recognition of custom gestures, capturing images of our own gestures in real time and using them to assemble a dataset for machine learning-based recognition.
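A minimal sketch of what such a custom training step might look like, assuming the captured landmarks are flattened into fixed-length feature vectors and fed to a scikit-learn classifier; the file names and the classifier choice are placeholders, not the project's actual pipeline.

```python
# Illustrative custom gesture training: 21 (x, y) landmarks -> 42-dim feature vector per sample.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def landmarks_to_features(hand_landmarks):
    """Flatten MediaPipe landmarks into [x0, y0, x1, y1, ...]."""
    return np.array([[lm.x, lm.y] for lm in hand_landmarks.landmark]).flatten()

# X: samples captured live with the detector above; y: user-chosen gesture labels.
X = np.load("gesture_features.npy")   # shape (n_samples, 42), hypothetical file
y = np.load("gesture_labels.npy")     # shape (n_samples,), hypothetical file

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```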

We mapped specific gestures to robot behaviors, initially with arbitrary assignments before moving to a more intuitive number-based scheme. The control system evolved from basic commands to more sophisticated interactions: movement control uses the robot's odometry rather than timed commands, so the positioning data yields consistent, precise movements and turns for driving in various shapes.
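The sketch below shows one way odometry-based turning could look in a ROS2 (rclpy) node: subscribe to odometry, track the change in yaw, and keep publishing an angular velocity until the target angle is reached. The topic names ("cmd_vel", "odom"), the 90-degree target, and the 0.5 rad/s turn rate are assumptions for illustration, not the project's exact values.

```python
# Hedged sketch of odometry-based turning with rclpy.
import math
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import Twist
from nav_msgs.msg import Odometry

def yaw_from_quaternion(q):
    """Extract yaw from a geometry_msgs Quaternion."""
    return math.atan2(2.0 * (q.w * q.z + q.x * q.y),
                      1.0 - 2.0 * (q.y * q.y + q.z * q.z))

class TurnByAngle(Node):
    def __init__(self, target_rad=math.pi / 2):  # 90-degree turn, assumed target
        super().__init__("turn_by_angle")
        self.pub = self.create_publisher(Twist, "cmd_vel", 10)
        self.sub = self.create_subscription(Odometry, "odom", self.on_odom, 10)
        self.start_yaw = None
        self.target = target_rad

    def on_odom(self, msg):
        yaw = yaw_from_quaternion(msg.pose.pose.orientation)
        if self.start_yaw is None:
            self.start_yaw = yaw
        # Wrap the accumulated rotation into [-pi, pi] before comparing to the target.
        turned = abs(math.atan2(math.sin(yaw - self.start_yaw),
                                math.cos(yaw - self.start_yaw)))
        cmd = Twist()
        # Keep turning until odometry says the target angle has been covered.
        cmd.angular.z = 0.5 if turned < self.target else 0.0
        self.pub.publish(cmd)

rclpy.init()
rclpy.spin(TurnByAngle())
```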

The number gesture system used intuitive commands, with each number triggering a specific robot behavior:

Number 1 (pointer finger): drive forward
Number 2 (peace sign): turn right
Number 3 (three fingers): turn left
Number 4 (four fingers): stop
Number 0 (fist): spin continuously
Open hand: draw a square
Triangle gesture: draw a triangle
Thumb and index finger pinch: speed control, with speed proportional to the distance between the fingertips

We also began work on an advanced path drawing feature that would allow users to "draw" paths in the air for the robot to follow, involving tracking finger movement, recognizing the intended shape, and converting it to precise robot navigation commands. A sketch of the finger-counting logic behind the number commands appears below.
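This is a hedged sketch of that finger-counting idea: compare each fingertip's vertical position with its PIP joint to decide whether it is extended, then look the count up in a gesture-to-behavior table. The landmark indices are MediaPipe's; the thresholding rule and the action table are simplified placeholders, not the project's exact classifier.

```python
# Rough finger counting from MediaPipe landmarks, assuming an upright hand facing the camera.
FINGER_TIPS = [8, 12, 16, 20]   # index, middle, ring, pinky tip landmark indices
FINGER_PIPS = [6, 10, 14, 18]   # corresponding PIP joint indices

def count_extended_fingers(hand_landmarks):
    lm = hand_landmarks.landmark
    count = 0
    for tip, pip in zip(FINGER_TIPS, FINGER_PIPS):
        # A finger reads as "extended" when its tip sits above its PIP joint
        # (smaller y in normalized image coordinates).
        if lm[tip].y < lm[pip].y:
            count += 1
    return count

# Hypothetical mapping from the counted fingers to robot behaviors.
GESTURE_ACTIONS = {
    0: "spin",        # fist
    1: "forward",     # pointer finger
    2: "turn_right",  # peace sign
    3: "turn_left",   # three fingers
    4: "stop",        # four fingers
}
```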

Number Gesture System

Number-Based Gesture Control System

System Performance

The finished system reliably recognized multiple distinct hand gestures and mapped them to intuitive robot behaviors on the Neato. We also delivered a functional prototype for custom gesture training and laid the groundwork for the path drawing feature, demonstrating the practical application of computer vision techniques for hand detection and tracking.

The implementation brought together machine learning for custom gesture recognition, ROS2 integration for translating gesture commands into robot actions, and real-time processing optimizations for responsive control. The project provided valuable insight into designing intuitive human-robot interfaces, showing that gesture-based control can serve as an effective alternative to traditional robotic interfaces.

Future Development

Future enhancements could include a waypoint-based drawing system to create more precise paths, recognition of pauses in finger movement to mark corner points, and support for multiple path geometries including lines, zigzags, and shapes. The system could benefit from improved curve handling and path smoothing algorithms, enhanced gesture recognition accuracy in varied lighting conditions, and implementation of more sophisticated gesture combinations.
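One speculative way the pause-to-corner idea could work is sketched below: keep a short history of fingertip positions and record a waypoint whenever the finger stays nearly still for a window of frames. The window length and distance thresholds are illustrative assumptions, not a tested design.

```python
# Speculative sketch of pause-based waypoint capture for path drawing.
import math
from collections import deque

class WaypointRecorder:
    def __init__(self, window=15, move_thresh=0.01):
        self.history = deque(maxlen=window)   # recent (x, y) fingertip positions
        self.move_thresh = move_thresh
        self.waypoints = []

    def update(self, x, y):
        self.history.append((x, y))
        if len(self.history) < self.history.maxlen:
            return
        # Maximum displacement across the window; a small value means the finger paused.
        x0, y0 = self.history[0]
        spread = max(math.hypot(px - x0, py - y0) for px, py in self.history)
        if spread < self.move_thresh:
            # Avoid recording the same corner repeatedly while the finger stays still.
            if not self.waypoints or math.hypot(x - self.waypoints[-1][0],
                                                y - self.waypoints[-1][1]) > 0.05:
                self.waypoints.append((x, y))
```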

Additional improvements might involve user-specific gesture calibration and preferences, integration with other robot capabilities such as mapping and object interaction, and expansion of the gesture vocabulary for more complex commands. The translation of detected paths into precise robot movement commands would complete the path drawing functionality, creating a comprehensive gesture-based control system for advanced robotic applications.