Amir Mahla committed on
Commit
2cd87a8
·
1 Parent(s): f5d0df5

CHG README.md

Files changed (1)
  1. README.md +2 -214
README.md CHANGED
@@ -7,218 +7,6 @@ sdk: docker
7
  pinned: false
8
  ---
9
 
10
- # CUA2 - Computer Use Agent 2
11
 
12
- An AI-powered automation interface featuring real-time agent task processing, VNC streaming, and step-by-step execution visualization.
13
-
14
- ## 🚀 Overview
15
-
16
- CUA2 is a full-stack application that provides a modern web interface for AI agents to perform automated computer tasks. The system features real-time WebSocket communication between a FastAPI backend and React frontend, allowing users to monitor agent execution, view screenshots, track token usage, and stream VNC sessions.
17
-
18
- ## πŸ—οΈ Architecture
19
-
20
- ![CUA2 Architecture](assets/architecture.png)
21
-
22
- ## πŸ› οΈ Tech Stack
23
-
24
- ### Backend (`cua2-core`)
25
- - **FastAPI**
26
- - **Uvicorn**
27
- - **smolagents** - AI agent framework with OpenAI/LiteLLM support
28
-
29
- ### Frontend (`cua2-front`)
30
- - **React TS**
31
- - **Vite**
32
-
33
- ## 📋 Prerequisites
34
-
35
- - **Python** 3.10 or higher
36
- - **Node.js** 18 or higher
37
- - **npm**
38
- - **uv** - Python package manager
39
-
40
- ### Installing uv
41
-
42
- **macOS/Linux:**
43
- ```bash
44
- curl -LsSf https://astral.sh/uv/install.sh | sh
45
- ```
46
-
47
- For more installation options, visit: https://docs.astral.sh/uv/getting-started/installation/
48
-
49
-
50
-
51
- ## 🚀 Getting Started
52
-
53
- ### 1. Clone the Repository
54
-
55
- ```bash
56
- git clone https://github.com/huggingface/CUA2.git
57
- cd CUA2
58
- ```
59
-
60
- ### 2. Install Dependencies
61
-
62
- Use the Makefile for quick setup:
63
-
64
- ```bash
65
- make sync
66
- ```
67
-
68
- This will:
69
- - Install Python dependencies using `uv`
70
- - Install Node.js dependencies for the frontend
71
-
72
- Or install manually:
73
-
74
- ```bash
75
- # Backend dependencies
76
- cd cua2-core
77
- uv sync --all-extras
78
-
79
- # Frontend dependencies
80
- cd ../cua2-front
81
- npm install
82
- ```
83
-
84
- ### 3. Environment Configuration
85
-
86
- Copy the example environment file and configure your settings:
87
-
88
- ```bash
89
- cd cua2-core
90
- cp env.example .env
91
- ```
92
-
93
- Edit `.env` with your configuration:
94
- - API keys for OpenAI/LiteLLM
95
- - Database connections (if applicable)
96
- - Other service credentials
97
-
98
- ### 4. Start Development Servers
99
-
100
- #### Option 1: Using Makefile (Recommended)
101
-
102
- Open two terminal windows:
103
-
104
- **Terminal 1 - Backend:**
105
- ```bash
106
- make dev-backend
107
- ```
108
-
109
- **Terminal 2 - Frontend:**
110
- ```bash
111
- make dev-frontend
112
- ```
113
-
114
- #### Option 2: Manual Start
115
-
116
- **Terminal 1 - Backend:**
117
- ```bash
118
- cd cua2-core
119
- uv run uvicorn cua2_core.main:app --reload --host 0.0.0.0 --port 8000
120
- ```
121
-
122
- **Terminal 2 - Frontend:**
123
- ```bash
124
- cd cua2-front
125
- npm run dev
126
- ```
127
-
128
- ### 5. Access the Application
129
-
130
- - **Frontend**: http://localhost:8080
131
- - **Backend API**: http://localhost:8000
132
- - **API Documentation**: http://localhost:8000/docs
133
- - **ReDoc**: http://localhost:8000/redoc
134
-
135
- ## πŸ“ Project Structure
136
-
137
- ```
138
- CUA2/
139
- ├── cua2-core/                       # Backend application
140
- │   ├── src/
141
- │   │   └── cua2_core/
142
- │   │       ├── app.py               # FastAPI application setup
143
- │   │       ├── main.py              # Application entry point
144
- │   │       ├── models/
145
- │   │       │   └── models.py        # Pydantic models
146
- │   │       ├── routes/
147
- │   │       │   ├── routes.py        # REST API endpoints
148
- │   │       │   └── websocket.py     # WebSocket endpoint
149
- │   │       ├── services/
150
- │   │       │   ├── agent_service.py # Agent task processing
151
- │   │       │   └── simulation_metadata/ # Demo data
152
- │   │       └── websocket/
153
- │   │           └── websocket_manager.py # WebSocket management
154
- │   ├── pyproject.toml               # Python dependencies
155
- │   └── env.example                  # Environment variables template
156
- │
157
- ├── cua2-front/                      # Frontend application
158
- │   ├── src/
159
- │   │   ├── App.tsx                  # Main application component
160
- │   │   ├── pages/
161
- │   │   │   └── Index.tsx            # Main page
162
- │   │   ├── components/
163
- │   │   │   └── mock/                # UI components
164
- │   │   ├── hooks/
165
- │   │   │   └── useWebSocket.ts      # WebSocket hook
166
- │   │   └── types/
167
- │   │       └── agent.ts             # TypeScript type definitions
168
- │   ├── package.json                 # Node dependencies
169
- │   └── vite.config.ts               # Vite configuration
170
- │
171
- ├── Makefile                         # Development commands
172
- └── README.md                        # This file
173
- ```
174
-
175
- ## 🔌 API Endpoints
176
-
177
- ### REST API
178
-
179
- | Method | Endpoint | Description |
180
- |--------|----------|-------------|
181
- | GET | `/health` | Health check with WebSocket connection count |
182
- | GET | `/tasks` | Get all active tasks |
183
- | GET | `/tasks/{task_id}` | Get specific task status |
184
- | GET | `/docs` | Interactive API documentation (Swagger) |
185
- | GET | `/redoc` | Alternative API documentation (ReDoc) |
186
-
187
- ### WebSocket
188
-
189
-
190
- #### Client → Server Events
191
-
192
- - `user_task` - New user task request
193
-
194
- #### Server → Client Events
195
-
196
- - `agent_start` - Agent begins processing
197
- - `agent_progress` - New step completed with image and metadata
198
- - `agent_complete` - Task finished successfully
199
- - `agent_error` - Error occurred during processing
200
- - `vnc_url_set` - VNC stream URL available
201
- - `vnc_url_unset` - VNC stream ended
202
- - `heartbeat` - Connection keep-alive
203
-
204
- ## 🧪 Development
205
-
206
- ### Available Make Commands
207
-
208
- ```bash
209
- make sync # Sync all dependencies (Python + Node.js)
210
- make dev-backend # Start backend development server
211
- make dev-frontend # Start frontend development server
212
- make pre-commit # Run pre-commit hooks
213
- make clean # Clean build artifacts and caches
214
- ```
215
-
216
- ### Code Quality
217
-
218
- ```bash
219
- # Backend
220
- make pre-commit
221
- ```
222
-
223
-
224
- **Happy Coding! 🚀**
 
7
  pinned: false
8
  ---
9
 
10
+ # CUA2 - Computer Use Agent
11
 
12
+ An AI-powered automation interface featuring real-time agent task processing, VNC streaming, and step-by-step execution visualization.
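
For reference, the WebSocket section removed by this commit listed one client-to-server event (`user_task`) and seven server-to-client events. The following is a minimal Python sketch of that event vocabulary, not code from the CUA2 repository: the names `ClientEvent`, `ServerEvent`, and `classify_event`, and the `"type"` message key, are illustrative assumptions, since the removed README does not specify the wire format.

```python
from enum import Enum


class ClientEvent(str, Enum):
    """Events sent from the client to the server (per the removed README)."""
    USER_TASK = "user_task"


class ServerEvent(str, Enum):
    """Events sent from the server to the client (per the removed README)."""
    AGENT_START = "agent_start"
    AGENT_PROGRESS = "agent_progress"
    AGENT_COMPLETE = "agent_complete"
    AGENT_ERROR = "agent_error"
    VNC_URL_SET = "vnc_url_set"
    VNC_URL_UNSET = "vnc_url_unset"
    HEARTBEAT = "heartbeat"


def classify_event(message: dict) -> str:
    """Return 'client', 'server', or 'unknown' for a decoded message.

    Assumption: the event name is stored under a "type" key. This key is
    hypothetical; the README only lists the event names.
    """
    name = message.get("type")
    if name in {event.value for event in ClientEvent}:
        return "client"
    if name in {event.value for event in ServerEvent}:
        return "server"
    return "unknown"
```

A handler could call `classify_event` on each decoded message and ignore anything reported as `unknown`, which keeps unexpected event names from reaching application logic.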