# Reachy Mini DanceML Architecture

## System Architecture

```mermaid
flowchart TB
    subgraph Input["🎤 Input Layer"]
        USER["User Voice"]
        MIC["Browser Microphone<br/>(Laptop/Mobile)"]
    end
    subgraph Streaming["⚡ Streaming Layer"]
        GRADIO["Gradio UI<br/>:8042"]
    end
    subgraph AI["🧠 AI Layer (OpenAI Realtime)"]
        ASR["Speech-to-Text<br/>(Whisper)"]
        REASON["gpt-realtime<br/>+ SYSTEM_INSTRUCTIONS"]
        TTS["Text-to-Speech"]
    end
    subgraph Tools["🔧 11 Tools"]
        direction TB
        subgraph Core["Core Movement"]
            GOTO["goto_pose"]
            LOOK["look_at"]
            STOP["stop_movement"]
        end
        subgraph Library["Library Moves"]
            SEARCH["search_moves"]
            PLAY["play_move"]
        end
        subgraph Procedural["Procedural Motion"]
            GENMOTION["generate_motion"]
        end
        subgraph Sequences["Multi-Step"]
            EXECSEQ["execute_sequence"]
        end
        subgraph BuiltIn["Lifecycle & Control"]
            WAKE["wake_up"]
            SLEEP["goto_sleep"]
            MOTOR["motor_control"]
        end
        subgraph Reference["Reference"]
            GUIDE["get_choreography_guide"]
        end
    end
    subgraph Planner["🤖 Sequence Planner (GPT-4.1)"]
        PLAN["SequencePlanner<br/>+ PLANNER_SYSTEM_PROMPT"]
    end
    subgraph Backend["📦 Backend"]
        HANDLER["RealtimeHandler<br/>(tool dispatch)"]
        GENERATOR["MovementGenerator<br/>(50Hz motor thread)"]
        EXECUTOR["SequenceExecutor"]
        PROCMOVE["ProceduralMove"]
        MOVELIBRARY["MoveLibrary<br/>(101 moves)"]
    end
    subgraph Robot["🤖 Reachy Mini"]
        HEAD["Head<br/>roll/pitch/yaw"]
        BODY["Body<br/>yaw ±180°"]
        ANTENNAS["Antennas<br/>left/right"]
        SPEAKER["Speaker"]
    end

    %% Flow
    USER --> MIC --> GRADIO --> ASR --> REASON
    REASON --> TTS --> SPEAKER
    REASON -->|"function_call"| Tools
    Tools --> HANDLER

    %% Tool routing
    HANDLER --> GENERATOR
    HANDLER --> EXECUTOR
    HANDLER --> MOVELIBRARY
    HANDLER --> PROCMOVE

    %% Sequence planning
    EXECSEQ -.->|"plan request"| PLAN
    PLAN -.->|"SequencePlan"| EXECUTOR

    %% Execution to hardware
    GENERATOR --> HEAD
    GENERATOR --> BODY
    GENERATOR --> ANTENNAS
```
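The `function_call` edge above is the hinge of the system: the Realtime model emits a tool name plus JSON-encoded arguments, and `RealtimeHandler` routes the call to the right backend component. A minimal sketch of that dispatch step, with illustrative method names (the real handler's API may differ):

```python
# Hypothetical sketch of RealtimeHandler's tool dispatch: the Realtime API
# emits a function_call event {name, arguments}; the handler decodes the
# JSON arguments and routes to a registered backend callable.
import json


class RealtimeHandler:
    """Routes function_call events to backend callables (names illustrative)."""

    def __init__(self):
        # In the real project these entries would be bound to
        # MovementGenerator, SequenceExecutor, MoveLibrary, ProceduralMove.
        self._tools = {}

    def register(self, name, fn):
        self._tools[name] = fn

    def dispatch(self, function_call):
        name = function_call["name"]
        args = json.loads(function_call.get("arguments", "{}"))
        if name not in self._tools:
            return {"error": f"unknown tool: {name}"}
        return self._tools[name](**args)


handler = RealtimeHandler()
handler.register("goto_pose", lambda **kw: {"ok": True, "pose": kw})
result = handler.dispatch(
    {"name": "goto_pose", "arguments": '{"head_pitch": 10, "duration": 1.0}'}
)
print(result)
```

A registry keyed by tool name keeps the dispatch table flat, so adding a twelfth tool is one `register` call plus one schema entry.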
---

## Tool Reference (11 Tools)

| Tool | Category | Description |
|------|----------|-------------|
| `goto_pose` | Core | Move to specific head/body angles over a given duration |
| `look_at` | Core | Look at a direction (up/down/left/right/floor/ceiling) or a 3D point |
| `stop_movement` | Core | Stop all movement and return to neutral |
| `search_moves` | Library | Semantic search over 101 pre-recorded moves |
| `play_move` | Library | Play a named library move |
| `generate_motion` | Procedural | Continuous procedural motion with waveforms, drifts, and antenna control |
| `execute_sequence` | Sequences | Multi-step choreography with timing (uses the GPT-4.1 planner) |
| `wake_up` | Lifecycle | Play the built-in wake animation |
| `goto_sleep` | Lifecycle | Play the built-in sleep animation |
| `motor_control` | Control | Enable/disable motors or gravity compensation |
| `get_choreography_guide` | Reference | Load the choreography guide for custom movements |

---
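Each row in the table corresponds to a function-calling schema handed to the Realtime session. As an illustration only, a `goto_pose` schema might look like the following; the parameter names (`head_roll`, `head_pitch`, `head_yaw`, `body_yaw`, `duration`) are assumptions based on the axes listed in the architecture diagram, not the project's actual signature:

```python
# Illustrative OpenAI function-calling schema for goto_pose.
# All parameter names here are assumptions for demonstration.
GOTO_POSE_TOOL = {
    "type": "function",
    "name": "goto_pose",
    "description": "Move to specific head/body angles over a duration.",
    "parameters": {
        "type": "object",
        "properties": {
            "head_roll": {"type": "number", "description": "degrees"},
            "head_pitch": {"type": "number", "description": "degrees"},
            "head_yaw": {"type": "number", "description": "degrees"},
            "body_yaw": {"type": "number", "description": "degrees, within ±180"},
            "duration": {"type": "number", "description": "seconds"},
        },
        "required": ["duration"],
    },
}

print(sorted(GOTO_POSE_TOOL["parameters"]["properties"]))
```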
## Tool Selection Flow

```mermaid
flowchart TD
    START(("👤 User<br/>Request")) --> INTENT{"Classify<br/>Intent"}
    INTENT -->|"look left<br/>tilt head"| SIMPLE["🎯 SIMPLE"]
    INTENT -->|"stop<br/>freeze"| EMERGENCY["🛑 STOP"]
    INTENT -->|"show happy<br/>do a dance"| EMOTION["😊 EMOTION"]
    INTENT -->|"spiral motion<br/>wiggle antenna"| PROCEDURAL["🌊 PROCEDURAL"]
    INTENT -->|"peek-a-boo<br/>multi-step"| SEQUENCE["🎬 SEQUENCE"]

    SIMPLE --> GOTO_POSE["goto_pose()"]
    EMERGENCY --> STOP_MOVE["stop_movement()"]
    EMOTION --> SEARCH_LIB["search_moves()"]
    SEARCH_LIB --> FOUND{"Results?"}
    FOUND -->|"Yes"| PLAY_MOVE["play_move()"]
    FOUND -->|"No"| GEN_MOTION
    PROCEDURAL --> GEN_MOTION["generate_motion()"]
    SEQUENCE --> EXEC_SEQ["execute_sequence()"]

    GOTO_POSE --> EXECUTE["⚡ Execute"]
    STOP_MOVE --> EXECUTE
    PLAY_MOVE --> EXECUTE
    GEN_MOTION --> EXECUTE
    EXEC_SEQ --> EXECUTE
    EXECUTE --> ROBOT(("🤖 Robot<br/>Moves"))
```
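The EMOTION branch above has the only conditional in the flow: try the move library first, and fall back to procedural motion when the search comes up empty. A sketch of that fallback, with a trivial substring match standing in for the real semantic search:

```python
# Sketch of the EMOTION branch: library lookup first, procedural fallback.
# The substring match is a stand-in for search_moves' semantic search, and
# the returned tuples stand in for play_move / generate_motion tool calls.
def handle_emotion(query, library):
    matches = [name for name in library if query in name]
    if matches:
        return ("play_move", matches[0])      # FOUND -->|"Yes"| play_move()
    return ("generate_motion", query)          # FOUND -->|"No"|  generate_motion()


library = ["happy_dance", "sad_slump", "curious_tilt"]
print(handle_emotion("happy", library))   # -> ('play_move', 'happy_dance')
print(handle_emotion("spiral", library))  # -> ('generate_motion', 'spiral')
```

The fallback is what keeps the robot responsive to requests the 101-move library doesn't cover: an unmatched emotion still produces motion instead of an error.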
---

## Component Summary

| Layer | Component | Purpose |
|-------|-----------|---------|
| **Input** | Gradio UI | Web interface + audio capture |
| **AI** | OpenAI Realtime API | Speech recognition, reasoning, TTS |
| **AI** | GPT-4.1 (Planner) | Sequence planning for multi-step actions |
| **Tools** | 11 functions | Intent execution via function calling |
| **Backend** | MoveLibrary | 101 pre-recorded HuggingFace moves |
| **Backend** | MovementGenerator | 50Hz motor control thread |
| **Backend** | ProceduralMove | Waveform-based motion generation |
| **Backend** | SequenceExecutor | Step-by-step sequence execution |
| **Output** | Reachy Mini SDK | Motor control, audio playback |

---
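MovementGenerator's 50Hz motor thread is the one hard real-time piece in the table. A minimal sketch of such a loop, assuming a 20 ms tick and using a placeholder `apply_pose` callback where the Reachy Mini SDK call would go:

```python
# Minimal 50 Hz control-loop sketch in the spirit of MovementGenerator's
# motor thread: compute a target pose each tick and hand it to the hardware.
# apply_pose is a placeholder for the actual SDK call; the sine sweep is an
# example waveform, not the project's motion model.
import math
import time


def run_control_loop(apply_pose, duration_s=0.1, rate_hz=50):
    period = 1.0 / rate_hz          # 20 ms at 50 Hz
    t0 = time.monotonic()
    ticks = 0
    while (t := time.monotonic() - t0) < duration_s:
        # Example waveform: gentle head-yaw sine sweep, ±15° at 0.5 Hz.
        apply_pose({"head_yaw": 15.0 * math.sin(2 * math.pi * 0.5 * t)})
        ticks += 1
        time.sleep(period)
    return ticks


poses = []
n = run_control_loop(poses.append, duration_s=0.1)
print(n, "ticks")
```

Running the generator on its own thread at a fixed rate decouples motion smoothness from the latency of the AI layer: tool calls only change the target trajectory, while the loop keeps emitting poses every tick.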
## System Prompts

The agent uses **two system prompts**:

1. **SYSTEM_INSTRUCTIONS** ([realtime_handler.py](../reachy_mini_danceml/realtime_handler.py#L19))
   - Main conversational AI instructions
   - Tool selection guide, physical conventions, physics envelope
   - ~200 lines
2. **PLANNER_SYSTEM_PROMPT** ([sequence_planner.py](../reachy_mini_danceml/sequence_planner.py#L56))
   - GPT-4.1 sequence planning instructions
   - Step types: move, wait, speak, motion
   - ~35 lines
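Given the four step types the planner prompt names (move, wait, speak, motion), a SequencePlan handed from the planner to SequenceExecutor might look like the following. This is a hedged sketch: the field names and validation are illustrative, not the project's actual schema.

```python
# Illustrative SequencePlan shape based on the four step types named in
# PLANNER_SYSTEM_PROMPT. Field names (type, args) are assumptions.
from dataclasses import dataclass, field


@dataclass
class Step:
    type: str                      # "move" | "wait" | "speak" | "motion"
    args: dict = field(default_factory=dict)


@dataclass
class SequencePlan:
    steps: list

    def validate(self):
        # Reject any step type the executor would not know how to run.
        allowed = {"move", "wait", "speak", "motion"}
        bad = [s.type for s in self.steps if s.type not in allowed]
        if bad:
            raise ValueError(f"unknown step types: {bad}")
        return self


# Hypothetical peek-a-boo plan, matching the multi-step example in the
# tool selection flow above.
plan = SequencePlan([
    Step("speak", {"text": "Peek-a-boo!"}),
    Step("move", {"name": "hide_face", "duration": 1.0}),
    Step("wait", {"seconds": 0.5}),
    Step("motion", {"waveform": "sine", "axis": "head_yaw"}),
]).validate()
print(len(plan.steps), "steps")
```

Validating the plan before execution keeps a malformed planner response from reaching the motor thread: the executor only ever sees step types it knows how to run.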