Hexapod

This project develops an autonomous hexapod robot using auditory scene analysis for navigation. It integrates sound source localization (DOA) and beamforming via ODAS with a circular microphone array for precise spatial detection. A machine learning-based Keyword Spotting (KWS) module enables voice command recognition for human-robot interaction.

Install / Use

/learn @Gl0dny/Hexapod

README

Thesis: "Hexapod autonomous control system based on auditory scene analysis: real-time sound source localization and keyword spotting for voice command recognition"

Diploma project completed at Warsaw University of Science and Technology as part of the Master of Science in Engineering (Computer Science) degree.

This project aims to develop an autonomous control system for a hexapod walking robot, using auditory scene analysis as the primary modality for navigation and environmental interaction. The system integrates sound source localization (Direction of Arrival estimation - DOA) and beamforming techniques via the ODAS framework, employing a circular microphone array for enhanced spatial precision. This enables the robot to accurately detect and characterize sound sources, allowing real-time responses to acoustic stimuli for dynamic, context-aware behavior.

A Keyword Spotting (KWS) module, powered by machine learning, is incorporated to recognize predefined voice commands, enabling effective human-robot interaction. The research focuses on developing the hardware and software infrastructure to seamlessly integrate acoustic processing with the robot's control system.

The project includes designing and building the robot's platform, encompassing both the mechanical structure and embedded systems. The hexapod's platform is engineered to support advanced auditory processing, ensuring optimal performance in real-world scenarios. This involves creating a robust mechanical framework for stable, agile locomotion and an embedded system architecture for real-time processing and decision-making.

The hardware is designed to accommodate the circular microphone array, ensuring precise sound capture, while the software facilitates seamless communication between auditory processing modules, the control system, and actuators. This comprehensive approach ensures the robot can perform complex tasks, such as navigating dynamic environments and responding accurately to auditory cues.

Real-Time Sound Source Localization: Hexapod Robot with ODAS Audio Processing

This video demonstrates an autonomous hexapod robot performing auditory scene analysis in real time. It showcases the complete ODAS (Open embeddeD Audition System) pipeline with beamforming, featuring:

  • Real-time Direction of Arrival (DOA) estimation using a 6-microphone circular array
  • Live GUI visualization showing sound source tracking and spatial mapping
  • Terminal debug output displaying active sound sources with coordinates and activity levels
  • Elevation and azimuth time charts showing temporal tracking of sound source positions
  • System monitoring panel showing CPU usage, temperature, memory usage, and IP address
  • Robot view - top-down view of the hexapod responding to acoustic stimuli
  • LED feedback system indicating detected sound sources through visual cues
  • Multi-source tracking - demonstrating the system's ability to track up to 4 simultaneous sound sources
  • Automatic audio stream separation and recording of individual source audio files

This represents a complete autonomous control system where the hexapod can navigate and interact based purely on auditory cues, enabling sophisticated human-robot interaction through voice commands and environmental sound awareness.
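
As an illustration of the tracking output described above, the sketch below converts tracked source positions into azimuth and elevation angles. It assumes each ODAS frame arrives as JSON with a `src` list whose entries carry `id`, `x`, `y`, `z` and `activity`; the function name `source_angles`, the activity threshold, and the azimuth convention (counter-clockwise from the x axis) are assumptions made for this example, not the project's actual code.

```python
import json
import math


def source_angles(frame_json, min_activity=0.5):
    """Return (id, azimuth_deg, elevation_deg, activity) for active sources."""
    frame = json.loads(frame_json)
    results = []
    for src in frame.get("src", []):
        # Assumption: id 0 marks an unused tracking slot.
        if src["id"] == 0 or src["activity"] < min_activity:
            continue
        x, y, z = src["x"], src["y"], src["z"]
        azimuth = math.degrees(math.atan2(y, x)) % 360.0
        elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
        results.append((src["id"], azimuth, elevation, src["activity"]))
    return results


# Hypothetical frame: one active source on the +y axis, one empty slot.
example = json.dumps({"timeStamp": 1, "src": [
    {"id": 7, "x": 0.0, "y": 1.0, "z": 0.0, "activity": 0.9},
    {"id": 0, "x": 0.0, "y": 0.0, "z": 0.0, "activity": 0.0},
]})
print(source_angles(example))
```

Filtering on activity keeps weak or stale detections out of the time charts and LED feedback; the threshold would need tuning on the real array.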

Gamepad Control System

The hexapod supports three control modes through a connected DualSense controller, providing both manual control and automated gait control capabilities:

Control Modes

1. Body Control Mode (Default)

"Direct body positioning" - Direct control of the hexapod's body position and orientation using inverse kinematics. The left stick controls translation (forward/back/left/right), the right stick controls rotation (roll/pitch), the L2/R2 triggers control up/down movement, and L1/R1 control yaw rotation. LEDs show a blue pulsing animation (blue base with black pulse) during body control operations.
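
A minimal sketch of how stick readings could be turned into a body pose target for the inverse kinematics solver. Everything here is hypothetical: the `BodyPose` fields, the deadzone value, and the travel limits (`max_shift_mm`, `max_tilt_deg`) are illustrative assumptions, and the triggers and L1/R1 buttons are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class BodyPose:
    x_mm: float = 0.0       # left/right translation
    y_mm: float = 0.0       # forward/back translation
    roll_deg: float = 0.0
    pitch_deg: float = 0.0


def apply_deadzone(value, deadzone=0.1):
    """Ignore small stick noise and rescale the remainder to [-1, 1]."""
    if abs(value) < deadzone:
        return 0.0
    sign = 1.0 if value > 0 else -1.0
    return sign * (abs(value) - deadzone) / (1.0 - deadzone)


def clamp(value, low, high):
    return max(low, min(high, value))


def sticks_to_pose(left_x, left_y, right_x, right_y, sensitivity=1.0,
                   max_shift_mm=40.0, max_tilt_deg=15.0):
    """Map stick axes in [-1, 1] to a clamped body pose target."""
    lx, ly, rx, ry = (apply_deadzone(v) for v in (left_x, left_y, right_x, right_y))
    return BodyPose(
        x_mm=clamp(lx * sensitivity * max_shift_mm, -max_shift_mm, max_shift_mm),
        y_mm=clamp(ly * sensitivity * max_shift_mm, -max_shift_mm, max_shift_mm),
        roll_deg=clamp(rx * sensitivity * max_tilt_deg, -max_tilt_deg, max_tilt_deg),
        pitch_deg=clamp(ry * sensitivity * max_tilt_deg, -max_tilt_deg, max_tilt_deg),
    )


print(sticks_to_pose(0.5, 0.0, 0.0, -1.0))
```

Clamping after the sensitivity multiplier means a high sensitivity setting changes how quickly the limit is reached, not the limit itself.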

2. Gait Control Mode

"Natural walking movement" - Uses the hexapod's gait generator for realistic walking. The left stick controls movement direction (forward/back/left/right/diagonal), the right stick controls rotation while walking, and the X button toggles marching in place. LEDs show an indigo thinking animation pattern during gait control operations.

3. Voice Control Mode

"Voice command processing" - The system switches to voice control mode, in which manual inputs are disabled and the robot responds to voice commands. This mode can be toggled from any manual mode. LEDs show a blue and green pulsing animation (blue base with green pulse) synchronized with the voice control system.

Gamepad Features

  • Automatic mode detection - System automatically detects connected DualSense controller
  • LED feedback integration - Controller LEDs provide visual feedback matching robot status and mode
  • Seamless mode switching - Switch between body control, gait control, and voice control modes on-the-fly
  • Voice control integration - Voice commands can interrupt and override manual control
  • Precise movement control - Analog sticks provide smooth, proportional control with adjustable sensitivity
  • Safety features - Built-in safety limits and emergency stop functionality
  • Sensitivity adjustment - Real-time sensitivity control via D-pad for fine-tuning movement
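
The mode switching and sensitivity adjustment listed above can be sketched as a small state holder. This is a hypothetical `ModeManager`, not the project's class: the method names, the sensitivity range and step, and the choice to return to the last manual mode when voice control is toggled off are all assumptions.

```python
from enum import Enum


class ControlMode(Enum):
    BODY = "body control"
    GAIT = "gait control"
    VOICE = "voice control"


class ModeManager:
    def __init__(self):
        self.mode = ControlMode.BODY          # body control is the default
        self._last_manual = ControlMode.BODY
        self.sensitivity = 1.0

    def toggle_manual_mode(self):
        """Switch between body and gait control; ignored in voice mode."""
        if self.mode is ControlMode.VOICE:
            return
        self.mode = (ControlMode.GAIT if self.mode is ControlMode.BODY
                     else ControlMode.BODY)
        self._last_manual = self.mode

    def toggle_voice_control(self):
        """Enter voice mode from any manual mode, or return to the last one."""
        if self.mode is ControlMode.VOICE:
            self.mode = self._last_manual
        else:
            self._last_manual = self.mode
            self.mode = ControlMode.VOICE

    def accepts_manual_input(self):
        return self.mode is not ControlMode.VOICE

    def adjust_sensitivity(self, steps, step_size=0.1, low=0.1, high=2.0):
        """Apply D-pad steps (positive or negative) within fixed bounds."""
        raw = self.sensitivity + steps * step_size
        self.sensitivity = round(max(low, min(high, raw)), 2)


manager = ModeManager()
manager.toggle_manual_mode()
manager.toggle_voice_control()
print(manager.mode, manager.accepts_manual_input())
```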

Voice Control System

The hexapod operates through a sophisticated voice control system that processes commands through distinct phases, each with specific functionality and visual feedback:

System Phases

1. Wake Word Detection Mode

"Listening for 'Hexapod'..." - The system continuously monitors audio input for the wake word using the Picovoice Porcupine engine. LEDs show a pulsing animation (blue base with green pulse) during the passive listening state.

2. Intent Recognition Mode

"What would you like me to do?" - After wake word detection, the system switches to active command listening using the Picovoice Rhino engine. LEDs show an alternating rotating light pattern while waiting for a voice command.

3. Command Processing Mode

"Processing your request..." - The system analyzes the recognized intent, extracts parameters, determines the appropriate action, and dispatches the command to the matching subsystem (movement, lights, audio, or system control). LEDs show a lime green opposite-rotation pattern during processing.

4. Error Handling Mode

"Command not recognized" - The system handles unrecognized commands, invalid parameters, or execution failures. LEDs show a pulsing animation (red base with orange pulse) for error states.
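
The four phases can be read as a small state machine. The sketch below is one way to write that down, under assumptions: the event names are invented for illustration, and returning to wake word detection after a command finishes or an error is shown is an assumed transition that the description above does not state explicitly. The LED descriptions are taken from the phase descriptions.

```python
from enum import Enum, auto


class Phase(Enum):
    WAKE_WORD = auto()
    INTENT = auto()
    PROCESSING = auto()
    ERROR = auto()


# LED feedback per phase, as described for each mode.
LED_PATTERN = {
    Phase.WAKE_WORD: "pulsing, blue base with green pulse",
    Phase.INTENT: "alternating rotating pattern",
    Phase.PROCESSING: "lime green opposite rotation",
    Phase.ERROR: "pulsing, red base with orange pulse",
}

# Hypothetical event names; only these transitions are allowed.
TRANSITIONS = {
    (Phase.WAKE_WORD, "wake_word_detected"): Phase.INTENT,
    (Phase.INTENT, "intent_recognized"): Phase.PROCESSING,
    (Phase.INTENT, "not_understood"): Phase.ERROR,
    (Phase.PROCESSING, "command_done"): Phase.WAKE_WORD,
    (Phase.PROCESSING, "command_failed"): Phase.ERROR,
    (Phase.ERROR, "error_shown"): Phase.WAKE_WORD,
}


def next_phase(phase, event):
    """Return the next phase, staying put on events that do not apply."""
    return TRANSITIONS.get((phase, event), phase)


phase = Phase.WAKE_WORD
for event in ("wake_word_detected", "intent_recognized", "command_done"):
    phase = next_phase(phase, event)
    print(event, "->", phase.name, "|", LED_PATTERN[phase])
```

Keeping the transitions in a table makes it easy to check that every phase has a way back to passive listening.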

System Features

  • Multi-intent processing - Handles complex commands with multiple parameters
  • Task interruption - Wake word detection automatically interrupts current tasks (gait tasks are gracefully stopped after completing a cycle)
  • Real-time feedback - Visual and audio confirmation of system state
  • Error recovery - Graceful handling of command failures and system errors
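
Graceful task interruption, where a gait task finishes its current cycle before stopping, is commonly built on a stop flag that is only checked between cycles. The sketch below shows that pattern with `threading.Event`; the `GaitTask` class and its callback are hypothetical stand-ins, not the project's implementation.

```python
import threading


class GaitTask:
    """Runs gait cycles until asked to stop; never aborts mid-cycle."""

    def __init__(self, step_cycle, max_cycles=None):
        self._step_cycle = step_cycle          # callable that performs one full cycle
        self._max_cycles = max_cycles
        self._stop_requested = threading.Event()
        self.completed_cycles = 0

    def request_stop(self):
        # Safe to call from another thread, e.g. a wake word callback.
        self._stop_requested.set()

    def run(self):
        # The flag is only checked between cycles, so a cycle always completes.
        while not self._stop_requested.is_set():
            if (self._max_cycles is not None
                    and self.completed_cycles >= self._max_cycles):
                break
            self._step_cycle()
            self.completed_cycles += 1


log = []


def fake_cycle():
    log.append("cycle")
    if len(log) == 3:
        task.request_stop()   # simulates the wake word arriving mid-cycle


task = GaitTask(fake_cycle)
task.run()
print(task.completed_cycles)
```

In a real system `run()` would execute on a worker thread and `request_stop()` would be called from the wake word handler.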

Usage Examples

Movement Commands

Walk

"Hexapod, walk/move [direction] [for X seconds/minutes/cycles]" - Omnidirectional movement in 8 directions: forward, backward, left, right, forward left, forward right, backward left, backward right. Supports time-based (seconds/minutes) or cycle-based movement.
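
To make the command parameters concrete, the sketch below turns a recognized walk command into a unit direction vector plus either a duration or a cycle count. It assumes the intent engine hands over a dictionary of string slots; the slot names (`direction`, `value`, `unit`), the forward default, and the axis convention (x to the right, y forward) are assumptions for this example.

```python
import math

# The eight directions from the walk command, as (x, y) with y pointing forward.
DIRECTIONS = {
    "forward": (0, 1), "backward": (0, -1), "left": (-1, 0), "right": (1, 0),
    "forward left": (-1, 1), "forward right": (1, 1),
    "backward left": (-1, -1), "backward right": (1, -1),
}
SECONDS_PER_UNIT = {"seconds": 1.0, "minutes": 60.0}


def direction_vector(name):
    """Return a unit-length (x, y) vector for a named direction."""
    x, y = DIRECTIONS[name]
    length = math.hypot(x, y)
    return (x / length, y / length)


def parse_walk(slots):
    """Build a walk command from hypothetical intent slots."""
    command = {
        "vector": direction_vector(slots.get("direction", "forward")),
        "seconds": None,   # set for time-based movement
        "cycles": None,    # set for cycle-based movement
    }
    if "value" in slots:
        value = float(slots["value"])
        unit = slots.get("unit", "seconds")
        if unit == "cycles":
            command["cycles"] = int(value)
        else:
            command["seconds"] = value * SECONDS_PER_UNIT[unit]
    return command


print(parse_walk({"direction": "forward left", "value": "2", "unit": "minutes"}))
```

Normalizing the diagonal vectors keeps the commanded speed the same in all eight directions.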

Rotate

"Hexapod, rotate/turn [clockwise/counterclockwise] [for X seconds/minutes/cycles]" - Smooth rotation in both directions using inverse kinematics. Supports time-based (seconds/minutes) or cycle-based rotation.

March in Place

"Hexapod, march in place/step in place [for X seconds/minutes]" - In-place marching demonstration with optional duration control

Idle Stance

"Hexapod, go to idle stance/neutral position" - Return to neutral default position

Entertainment Commands

Sit Up

"Hexapod, make some sit ups/do sit ups" - Dynamic sit-up exercise routine

Say Hello

"Hexapod, say hello/wave" - Friendly greeting gesture with leg movement

Helix

"Hexapod, helix/spiral" - Helical movement pattern

Audio Commands

Sound Source Localization

"Hexapod, run sound source localization/analyze sounds" - Analyze environment for sound sources

Sound Source Following

"Hexapod, follow me/track me" - Audio-based target following using ODAS

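Audio-based following can be reduced to repeatedly turning toward the azimuth of the tracked source. The sketch below picks a rotation direction from an azimuth angle. It is an assumption-laden illustration: azimuth is taken as counter-clockwise positive with 0° straight ahead of the robot, and the tolerance and function names are hypothetical.

```python
def shortest_turn(heading_deg, target_deg):
    """Signed smallest rotation from heading to target, in [-180, 180)."""
    return (target_deg - heading_deg + 180.0) % 360.0 - 180.0


def turn_command(source_azimuth_deg, tolerance_deg=10.0):
    """Choose a rotation direction that faces the robot toward the source."""
    error = shortest_turn(0.0, source_azimuth_deg)
    if abs(error) <= tolerance_deg:
        return "hold"   # already facing the source closely enough
    return "counterclockwise" if error > 0 else "clockwise"


for azimuth in (5.0, 90.0, 270.0, 350.0):
    print(azimuth, turn_command(azimuth))
```

The tolerance band prevents the robot from oscillating left and right when the source is almost straight ahead.
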
Stream ODAS Audio

"Hexapod, stream ODAS audio" - Stream processed audio from ODAS system to remote host

Start/Stop Recording

"Hexapod, start recording/begin recording [for X seconds/minutes]" / "Hexapod, stop recording/end recording" - Begin/end audio recording with optional duration control

Light Commands

Police Lights

"Hexapod, activate police mode/police lights" - Police-style flashing lights

Rainbow Lights

"Hexapod, activate rainbow/rainbow mode" - Rainbow color sequence

Change Color

"Hexapod, change color/set color to [blue/red/green/etc.]" - Change LED
