Commit 50ec2c4 by JatsTheAIGen (parent: 48f2898)

Add comprehensive Flask API documentation for integration

Files changed (1): API_DOCUMENTATION.md (added, +745 -0)

# Flask API Documentation

## Overview

The Research AI Assistant API provides a RESTful interface for interacting with an AI-powered research assistant. The API runs inference on local GPU models and supports conversational interactions with context management.

**Base URL:** `https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API`

**API Version:** 1.0

**Content-Type:** `application/json`

## Features

- 🤖 **AI-Powered Responses** - Local GPU model inference (Tesla T4)
- 💬 **Conversational Context** - Maintains conversation history and user context
- 🔒 **CORS Enabled** - Ready for web integration
- ⚡ **Async Processing** - Efficient request handling
- 📊 **Transparent Reasoning** - Returns reasoning chains and performance metrics

---

## Authentication

The API does not currently require authentication. For production use, however, you should:

1. Set the `HF_TOKEN` environment variable for Hugging Face model access
2. Implement API key authentication if needed
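
If you do add API key authentication, the client-side change is just an extra request header. A minimal sketch, assuming a hypothetical `X-API-Key` header and `API_KEY` environment variable (neither is part of the current API; the server would need to be extended to validate the key):

```python
import os

def build_auth_headers(api_key=None):
    """Build request headers, attaching a hypothetical X-API-Key when configured."""
    headers = {"Content-Type": "application/json"}
    # API_KEY and X-API-Key are illustrative names; the server currently
    # accepts unauthenticated requests.
    key = api_key or os.environ.get("API_KEY")
    if key:
        headers["X-API-Key"] = key
    return headers
```

The result can then be passed as `headers=` to `requests.post` in the examples below.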

---

## Endpoints

### 1. Get API Information

**Endpoint:** `GET /`

**Description:** Returns API information, version, and available endpoints.

**Request:**
```http
GET / HTTP/1.1
Host: huggingface.co
```

**Response:**
```json
{
  "name": "AI Assistant Flask API",
  "version": "1.0",
  "status": "running",
  "orchestrator_ready": true,
  "features": {
    "local_gpu_models": true,
    "max_workers": 4,
    "hardware": "NVIDIA T4 Medium"
  },
  "endpoints": {
    "health": "GET /api/health",
    "chat": "POST /api/chat",
    "initialize": "POST /api/initialize"
  }
}
```

**Status Codes:**
- `200 OK` - Success

---

### 2. Health Check

**Endpoint:** `GET /api/health`

**Description:** Checks whether the API and orchestrator are ready to handle requests.

**Request:**
```http
GET /api/health HTTP/1.1
Host: huggingface.co
```

**Response:**
```json
{
  "status": "healthy",
  "orchestrator_ready": true
}
```

**Status Codes:**
- `200 OK` - API is healthy
  - `orchestrator_ready: true` - Ready to process requests
  - `orchestrator_ready: false` - Still initializing

**Example Response (Initializing):**
```json
{
  "status": "initializing",
  "orchestrator_ready": false
}
```
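
Because the orchestrator may still be initializing after startup, clients can poll this endpoint before sending chat requests. A sketch of one client-side strategy (not an API feature); `check_health` is any callable returning the parsed `/api/health` JSON, e.g. `lambda: requests.get(f"{BASE_URL}/api/health").json()`:

```python
import time

def wait_until_ready(check_health, timeout=60.0, interval=2.0):
    """Poll /api/health until orchestrator_ready is true, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if check_health().get("orchestrator_ready"):
                return True
        except Exception:
            pass  # transient network error: keep polling until the deadline
        time.sleep(interval)
    return False
```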

---

### 3. Chat Endpoint

**Endpoint:** `POST /api/chat`

**Description:** Sends a message to the AI assistant and receives a response with reasoning and context.

**Request Headers:**
```http
Content-Type: application/json
```

**Request Body:**
```json
{
  "message": "Explain quantum entanglement in simple terms",
  "history": [
    ["User message 1", "Assistant response 1"],
    ["User message 2", "Assistant response 2"]
  ],
  "session_id": "session-123",
  "user_id": "user-456"
}
```

**Request Fields:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `message` | string | ✅ Yes | User's message/question (max 10,000 characters) |
| `history` | array | ❌ No | Conversation history as an array of `[user, assistant]` pairs |
| `session_id` | string | ❌ No | Unique session identifier for context continuity |
| `user_id` | string | ❌ No | User identifier (defaults to `"anonymous"`) |

**Response (Success):**
```json
{
  "success": true,
  "message": "Quantum entanglement is when two particles become linked...",
  "history": [
    ["Explain quantum entanglement", "Quantum entanglement is when two particles become linked..."]
  ],
  "reasoning": {
    "intent": "educational_query",
    "steps": ["Understanding request", "Gathering information", "Synthesizing response"],
    "confidence": 0.95
  },
  "performance": {
    "response_time_ms": 2345,
    "tokens_generated": 156,
    "model_used": "mistralai/Mistral-7B-Instruct-v0.2"
  }
}
```

**Response Fields:**

| Field | Type | Description |
|-------|------|-------------|
| `success` | boolean | Whether the request was successful |
| `message` | string | AI assistant's response |
| `history` | array | Updated conversation history including the new exchange |
| `reasoning` | object | AI reasoning process and confidence metrics |
| `performance` | object | Performance metrics (response time, tokens, model used) |

**Status Codes:**
- `200 OK` - Request processed successfully
- `400 Bad Request` - Invalid request (missing message, empty message, too long, wrong type)
- `500 Internal Server Error` - Server error while processing the request
- `503 Service Unavailable` - Orchestrator not ready (still initializing)

**Error Response:**
```json
{
  "success": false,
  "error": "Message is required",
  "message": "Error processing your request. Please try again."
}
```
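
The 400 conditions listed above can all be caught client-side before a request is sent. A sketch mirroring the documented rules (the order in which the server applies its checks is an assumption):

```python
MAX_MESSAGE_LENGTH = 10_000

def validate_chat_payload(payload):
    """Return an error string matching a documented 400 response, or None if valid."""
    if "message" not in payload:
        return "Message is required"
    message = payload["message"]
    if not isinstance(message, str):
        return "Message must be a string"
    if not message.strip():
        return "Message cannot be empty"
    if len(message) > MAX_MESSAGE_LENGTH:
        return "Message too long. Maximum length is 10000 characters"
    return None
```

Running this before `requests.post` saves a round trip for requests that would be rejected anyway.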

---

### 4. Initialize Orchestrator

**Endpoint:** `POST /api/initialize`

**Description:** Manually triggers orchestrator initialization (useful if initialization failed on startup).

**Request:**
```http
POST /api/initialize HTTP/1.1
Host: huggingface.co
Content-Type: application/json
```

**Request Body:**
```json
{}
```

**Response (Success):**
```json
{
  "success": true,
  "message": "Orchestrator initialized successfully"
}
```

**Response (Failure):**
```json
{
  "success": false,
  "message": "Initialization failed. Check logs for details."
}
```

**Status Codes:**
- `200 OK` - Initialization successful
- `500 Internal Server Error` - Initialization failed
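
A client can combine the health and initialize endpoints into a single readiness helper. A sketch with the two HTTP calls injected as callables so the logic stays transport-agnostic; in practice they would be `lambda: requests.get(f"{BASE_URL}/api/health").json()` and `lambda: requests.post(f"{BASE_URL}/api/initialize", json={}).json()`:

```python
def ensure_orchestrator(get_health, post_initialize):
    """Return True when the orchestrator is (or becomes) ready.

    If /api/health reports it is not ready, trigger POST /api/initialize
    and report whether that succeeded.
    """
    if get_health().get("orchestrator_ready"):
        return True
    result = post_initialize()
    return bool(result.get("success"))
```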

---

## Code Examples

### Python

```python
import requests

BASE_URL = "https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API"

# Check health
def check_health():
    response = requests.get(f"{BASE_URL}/api/health")
    return response.json()

# Send a chat message
def send_message(message, session_id=None, user_id=None, history=None):
    payload = {
        "message": message,
        "session_id": session_id,
        "user_id": user_id or "anonymous",
        "history": history or []
    }

    response = requests.post(
        f"{BASE_URL}/api/chat",
        json=payload,
        headers={"Content-Type": "application/json"}
    )

    if response.status_code == 200:
        return response.json()
    else:
        raise Exception(f"API Error: {response.status_code} - {response.text}")

# Example usage
if __name__ == "__main__":
    # Check if the API is ready
    health = check_health()
    print(f"API Status: {health}")

    if health.get("orchestrator_ready"):
        # Send a message
        result = send_message(
            message="What is machine learning?",
            session_id="my-session-123",
            user_id="user-456"
        )

        print(f"Response: {result['message']}")
        print(f"Reasoning: {result.get('reasoning', {})}")

        # Continue the conversation
        history = result['history']
        result2 = send_message(
            message="Can you explain neural networks?",
            session_id="my-session-123",
            user_id="user-456",
            history=history
        )
        print(f"Follow-up Response: {result2['message']}")
```

### JavaScript (Fetch API)

```javascript
const BASE_URL = 'https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API';

// Check health
async function checkHealth() {
  const response = await fetch(`${BASE_URL}/api/health`);
  return await response.json();
}

// Send a chat message
async function sendMessage(message, sessionId = null, userId = null, history = []) {
  const payload = {
    message: message,
    session_id: sessionId,
    user_id: userId || 'anonymous',
    history: history
  };

  const response = await fetch(`${BASE_URL}/api/chat`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(payload)
  });

  if (!response.ok) {
    const error = await response.json();
    throw new Error(`API Error: ${response.status} - ${error.error || error.message}`);
  }

  return await response.json();
}

// Example usage
async function main() {
  try {
    // Check if the API is ready
    const health = await checkHealth();
    console.log('API Status:', health);

    if (health.orchestrator_ready) {
      // Send a message
      const result = await sendMessage(
        'What is machine learning?',
        'my-session-123',
        'user-456'
      );

      console.log('Response:', result.message);
      console.log('Reasoning:', result.reasoning);

      // Continue the conversation
      const result2 = await sendMessage(
        'Can you explain neural networks?',
        'my-session-123',
        'user-456',
        result.history
      );
      console.log('Follow-up Response:', result2.message);
    }
  } catch (error) {
    console.error('Error:', error);
  }
}

main();
```

### cURL

```bash
# Check health
curl -X GET "https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API/api/health"

# Send a chat message
curl -X POST "https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API/api/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is machine learning?",
    "session_id": "my-session-123",
    "user_id": "user-456",
    "history": []
  }'

# Continue the conversation
curl -X POST "https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API/api/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Can you explain neural networks?",
    "session_id": "my-session-123",
    "user_id": "user-456",
    "history": [
      ["What is machine learning?", "Machine learning is a subset of artificial intelligence..."]
    ]
  }'
```

### Node.js (Axios)

```javascript
const axios = require('axios');

const BASE_URL = 'https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API';

// Check health
async function checkHealth() {
  const response = await axios.get(`${BASE_URL}/api/health`);
  return response.data;
}

// Send a chat message
async function sendMessage(message, sessionId = null, userId = null, history = []) {
  try {
    const response = await axios.post(`${BASE_URL}/api/chat`, {
      message: message,
      session_id: sessionId,
      user_id: userId || 'anonymous',
      history: history
    }, {
      headers: {
        'Content-Type': 'application/json'
      }
    });

    return response.data;
  } catch (error) {
    if (error.response) {
      throw new Error(`API Error: ${error.response.status} - ${error.response.data.error || error.response.data.message}`);
    }
    throw error;
  }
}

// Example usage
(async () => {
  try {
    const health = await checkHealth();
    console.log('API Status:', health);

    if (health.orchestrator_ready) {
      const result = await sendMessage(
        'What is machine learning?',
        'my-session-123',
        'user-456'
      );

      console.log('Response:', result.message);
    }
  } catch (error) {
    console.error('Error:', error.message);
  }
})();
```

---

## Error Handling

### Common Error Responses

#### 400 Bad Request

**Missing Message:**
```json
{
  "success": false,
  "error": "Message is required"
}
```

**Empty Message:**
```json
{
  "success": false,
  "error": "Message cannot be empty"
}
```

**Message Too Long:**
```json
{
  "success": false,
  "error": "Message too long. Maximum length is 10000 characters"
}
```

**Invalid Type:**
```json
{
  "success": false,
  "error": "Message must be a string"
}
```

#### 503 Service Unavailable

**Orchestrator Not Ready:**
```json
{
  "success": false,
  "error": "Orchestrator not ready",
  "message": "AI system is initializing. Please try again in a moment."
}
```

**Solution:** Wait a few seconds and retry, or check the `/api/health` endpoint.

#### 500 Internal Server Error

**Generic Error:**
```json
{
  "success": false,
  "error": "Error message here",
  "message": "Error processing your request. Please try again."
}
```
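
These error classes call for different client reactions: 400 means fix the request, 503 means wait and retry, 500 means surface the failure. A sketch of that mapping (the action names are illustrative, not part of the API):

```python
def classify_error(status_code):
    """Map a documented status code to a suggested client action."""
    if status_code == 200:
        return "ok"
    if status_code == 400:
        return "fix_request"   # invalid message: missing, empty, too long, wrong type
    if status_code == 503:
        return "retry_later"   # orchestrator still initializing
    # 500 and anything undocumented: surface the failure to the caller
    return "report"
```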

---

## Best Practices

### 1. Session Management

- **Use consistent session IDs** to maintain conversation context
- **Generate unique session IDs** per user conversation thread
- **Include conversation history** in subsequent requests for better context

```python
# Good: maintains context
session_id = "user-123-session-1"
history = []

# First message
result1 = send_message("What is AI?", session_id=session_id, history=history)
history = result1['history']

# Follow-up message (includes context)
result2 = send_message("Can you explain more?", session_id=session_id, history=history)
```

### 2. Error Handling

Always implement retry logic for 503 errors:

```python
import time

def send_message_with_retry(message, max_retries=3, retry_delay=2):
    for attempt in range(max_retries):
        try:
            return send_message(message)
        except Exception as e:
            if "503" in str(e) and attempt < max_retries - 1:
                time.sleep(retry_delay)
                continue
            raise
```

### 3. Health Checks

Check API health before sending requests:

```python
def is_api_ready():
    try:
        health = check_health()
        return health.get("orchestrator_ready", False)
    except Exception:
        return False

if is_api_ready():
    # Send the request
    result = send_message("Hello")
else:
    print("API is not ready yet")
```

### 4. Rate Limiting

- **No explicit rate limits** are currently enforced
- **Recommended:** Implement client-side rate limiting (e.g., 1 request per second)
- **Consider:** Implementing request queuing for high-volume applications

### 5. Message Length

- **Maximum:** 10,000 characters per message
- **Recommended:** Keep messages concise for faster processing
- **For long content:** Split it into multiple messages or summarize

### 6. Context Management

- **Include history** in requests to maintain conversation context
- **Session IDs** help track conversations across multiple requests
- **User IDs** enable personalization and user-specific context

---

## Integration Examples

### React Component

```jsx
import React, { useState } from 'react';

const AIAssistant = () => {
  const [message, setMessage] = useState('');
  const [history, setHistory] = useState([]);
  const [loading, setLoading] = useState(false);
  const [sessionId] = useState(`session-${Date.now()}`);

  const sendMessage = async () => {
    if (!message.trim()) return;

    setLoading(true);
    try {
      const response = await fetch('https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          message: message,
          session_id: sessionId,
          user_id: 'user-123',
          history: history
        })
      });

      const data = await response.json();
      if (data.success) {
        setHistory(data.history);
        setMessage('');
      }
    } catch (error) {
      console.error('Error:', error);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div>
      <div className="chat-history">
        {history.map(([user, assistant], idx) => (
          <div key={idx}>
            <div><strong>You:</strong> {user}</div>
            <div><strong>Assistant:</strong> {assistant}</div>
          </div>
        ))}
      </div>
      <input
        value={message}
        onChange={(e) => setMessage(e.target.value)}
        onKeyDown={(e) => e.key === 'Enter' && sendMessage()}
        disabled={loading}
      />
      <button onClick={sendMessage} disabled={loading}>
        {loading ? 'Sending...' : 'Send'}
      </button>
    </div>
  );
};

export default AIAssistant;
```

### Python CLI Tool

```python
#!/usr/bin/env python3
import uuid

import requests

BASE_URL = "https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API"

class ChatCLI:
    def __init__(self):
        # Random per-run session ID so each CLI run gets a fresh conversation
        self.session_id = f"cli-session-{uuid.uuid4().hex[:8]}"
        self.history = []

    def chat(self, message):
        response = requests.post(
            f"{BASE_URL}/api/chat",
            json={
                "message": message,
                "session_id": self.session_id,
                "user_id": "cli-user",
                "history": self.history
            }
        )

        if response.status_code == 200:
            data = response.json()
            self.history = data['history']
            return data['message']
        else:
            return f"Error: {response.status_code} - {response.text}"

    def run(self):
        print("AI Assistant CLI (Type 'exit' to quit)")
        print("=" * 50)

        while True:
            user_input = input("\nYou: ").strip()
            if user_input.lower() in ['exit', 'quit']:
                break

            print("Assistant: ", end="", flush=True)
            response = self.chat(user_input)
            print(response)

if __name__ == "__main__":
    cli = ChatCLI()
    cli.run()
```

---

## Response Times

- **Typical Response:** 2-10 seconds
- **First Request:** May take longer because of model loading (10-30 seconds)
- **Subsequent Requests:** Faster thanks to cached models (2-5 seconds)

**Factors Affecting Response Time:**
- Message length
- Model loading (first request)
- GPU availability
- Concurrent requests
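
To see how much of the observed latency is network and queuing overhead rather than inference, compare wall-clock time against the `performance.response_time_ms` field returned by `/api/chat`. A sketch; `send` is any callable that performs the request and returns the parsed JSON:

```python
import time

def timed_request(send):
    """Measure wall-clock latency and compare it with the server-reported
    performance.response_time_ms, when present."""
    start = time.perf_counter()
    result = send()
    elapsed_ms = (time.perf_counter() - start) * 1000
    reported = result.get("performance", {}).get("response_time_ms")
    overhead_ms = elapsed_ms - reported if reported is not None else None
    return result, elapsed_ms, overhead_ms
```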

---

## Support

For issues, questions, or contributions:
- **Repository:** [GitHub Repository URL]
- **Hugging Face Space:** [https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API](https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API)

---

## Changelog

### Version 1.0 (Current)
- Initial API release
- Chat endpoint with context management
- Health check endpoint
- Local GPU model inference
- CORS enabled for web integration

---

## License

This API is provided as-is. Please refer to the main project README for license information.