update
README.md
CHANGED
|
@@ -10,7 +10,113 @@ pinned: false
|
|
| 10 |
license: apache-2.0
|
| 11 |
short_description: AI-powered tool that automatically generates quote video
|
| 12 |
tags:
|
| 13 |
-
- mcp-in-action-track-
|
| 14 |
---
|
| 15 |
|
| 16 |
-
|
| 10 |
license: apache-2.0
|
| 11 |
short_description: AI-powered tool that automatically generates quote video
|
| 12 |
tags:
|
| 13 |
+
- mcp-in-action-track-consumer
|
| 14 |
---
|
| 15 |
|
| 16 |
+
## 🎬 AI Quote Clip Generator
|
| 17 |
+
|
| 18 |
+
Autonomous MCP Agent • Trend-Aware Quote Studio • Multimodal Generation
|
| 19 |
+
|
| 20 |
+
AI Quote Clip Generator is an MCP-powered autonomous system that creates aesthetic, trend-aware quote videos for TikTok, Instagram Reels, and YouTube Shorts.
|
| 21 |
+
It combines Gemini + OpenAI + ElevenLabs + Modal + Pexels into a single intelligent pipeline that plans, generates, narrates, and renders short-form content automatically.
|
| 22 |
+
|
| 23 |
+
This project is built for the MCP 1st Birthday Hackathon – Track 2 (MCP in Action / Productivity).
|
| 24 |
+
|
| 25 |
+
### 🔮 Live Demo
|
| 26 |
+
|
| 27 |
+
|
| 28 |
+
## 🚀 What It Does
|
| 29 |
+
|
| 30 |
+
With a single click, the system:
|
| 31 |
+
|
| 32 |
+
- Generates non-repetitive Gemini-powered quotes
|
| 33 |
+
|
| 34 |
+
- Applies a persona style (Coach, Philosopher, Poet, Mentor)
|
| 35 |
+
|
| 36 |
+
- Incorporates trend-aware context for modern content themes
|
| 37 |
+
|
| 38 |
+
- Creates voice-over explanations using OpenAI + ElevenLabs
|
| 39 |
+
|
| 40 |
+
- Retrieves cinematic vertical stock footage from Pexels
|
| 41 |
+
|
| 42 |
+
- Renders 7–20 second short-form videos via Modal
|
| 43 |
+
|
| 44 |
+
- Saves the results to a live gallery inside the app
|
| 45 |
+
|
| 46 |
+
- Displays a full agent activity log of each step
|
| 47 |
+
|
| 48 |
+
This turns the tool into a full AI content studio optimized for social platforms.
|
| 49 |
+
|
| 50 |
+
|
| 51 |
+
## 🛠️ MCP Tools Used
|
| 52 |
+
|
| 53 |
+
The project exposes multiple tools via MCP:
|
| 54 |
+
|
| 55 |
+
| Tool | Description |
|
| 56 |
+
|------|-------------|
|
| 57 |
+
| **generate_quote_tool** | Produces unique, trend-aware quotes using Gemini with per-niche memory |
|
| 58 |
+
| **search_pexels_video_tool** | Retrieves aesthetic background videos from Pexels |
|
| 59 |
+
| **create_quote_video_tool** | Sends jobs to Modal to render final 7–20s clips |
|
| 60 |
+
| *(internal)* `generate_voice_commentary` | Generates 25–35 word explanations (OpenAI + ElevenLabs) |
|
| 61 |
+
|
| 62 |
+
These tools are orchestrated autonomously through a multi-step agent chain.
|
| 63 |
+
|
| 64 |
+
---
|
| 65 |
+
|
| 66 |
+
## 📊 Agent Pipeline Overview
|
| 67 |
+
|
| 68 |
+
1. Build context → niche + persona + trend theme
|
| 69 |
+
2. Generate quote (Gemini primary, OpenAI fallback)
|
| 70 |
+
3. Create voice-over commentary (OpenAI + ElevenLabs)
|
| 71 |
+
4. Retrieve video footage (Pexels)
|
| 72 |
+
5. Render the final video (Modal)
|
| 73 |
+
6. Save and display in the gallery
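
The six steps above can be read as a linear chain over a shared state. The following is a minimal sketch of that idea, not the project's actual code: every step function is a hypothetical placeholder standing in for the real Gemini, OpenAI, ElevenLabs, Pexels, and Modal calls, and the `State` and `STEPS` names are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, str]
Step = Callable[[State], State]


def build_context(state: State) -> State:
    # Step 1: combine niche, persona, and a trend theme (placeholder value).
    return {**state, "context": f"{state['niche']} | {state['persona']} | trend"}


def generate_quote(state: State) -> State:
    # Step 2: placeholder for the Gemini call (OpenAI as fallback).
    return {**state, "quote": f"A quote about {state['niche']}"}


def create_voice_over(state: State) -> State:
    # Step 3: placeholder for commentary generation plus text-to-speech.
    return {**state, "audio": "audio-placeholder"}


def find_footage(state: State) -> State:
    # Step 4: placeholder for the Pexels video search.
    return {**state, "footage": "footage-placeholder"}


def render_video(state: State) -> State:
    # Step 5: placeholder for the Modal render job.
    return {**state, "video": "clip.mp4"}


def save_to_gallery(state: State) -> State:
    # Step 6: placeholder for copying the clip into the gallery folder.
    return {**state, "gallery_path": f"gallery/{state['video']}"}


STEPS: List[Tuple[str, Step]] = [
    ("Build context", build_context),
    ("Generate quote", generate_quote),
    ("Create voice-over", create_voice_over),
    ("Retrieve footage", find_footage),
    ("Render video", render_video),
    ("Save to gallery", save_to_gallery),
]


def run_pipeline(niche: str, persona: str) -> Tuple[State, List[str]]:
    """Run every step in order and keep a human-readable activity log."""
    state: State = {"niche": niche, "persona": persona}
    log: List[str] = []
    for number, (label, step) in enumerate(STEPS, start=1):
        state = step(state)
        log.append(f"Step {number}: {label} done")
    return state, log
```

Keeping the steps in a list makes the activity log fall out of the loop for free, which matches the "full agent activity log" the app displays.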
|
| 74 |
+
|
| 75 |
+
----
|
| 76 |
+
## 🧩 Core Components
|
| 77 |
+
|
| 78 |
+
### 1. **Autonomous MCP Agent Pipeline**
|
| 79 |
+
|
| 80 |
+
A multi-step reasoning pipeline built with smolagents that orchestrates the full workflow:
|
| 81 |
+
trend-aware context building → quote generation → narration → video retrieval → rendering → gallery update.
|
| 82 |
+
|
| 83 |
+
### 2. **Gemini-Enhanced Quote Generator (Variety Safe)**
|
| 84 |
+
|
| 85 |
+
A hybrid Gemini/OpenAI system with per-niche memory and variety tracking to ensure every quote is unique, non-repetitive, and aligned with current social trends.
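
One way to picture "per-niche memory and variety tracking" is a small bounded history per niche that rejects near-identical quotes. This is only a sketch under that assumption; the `QuoteMemory` class, `pick_fresh_quote` helper, and the normalization rule are hypothetical and are not taken from the project.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, Optional


class QuoteMemory:
    """Remembers the most recent quotes per niche (hypothetical helper)."""

    def __init__(self, max_per_niche: int = 50) -> None:
        self._seen: Dict[str, Deque[str]] = defaultdict(
            lambda: deque(maxlen=max_per_niche)
        )

    @staticmethod
    def _normalize(quote: str) -> str:
        # Ignore case and repeated whitespace when comparing quotes.
        return " ".join(quote.lower().split())

    def is_new(self, niche: str, quote: str) -> bool:
        return self._normalize(quote) not in self._seen[niche]

    def remember(self, niche: str, quote: str) -> None:
        self._seen[niche].append(self._normalize(quote))


def pick_fresh_quote(
    memory: QuoteMemory, niche: str, candidates: Iterable[str]
) -> Optional[str]:
    """Return the first candidate not seen before in this niche, if any."""
    for quote in candidates:
        if memory.is_new(niche, quote):
            memory.remember(niche, quote)
            return quote
    return None
```

Because the history is keyed by niche, the same quote can still appear in a different niche; only repeats within a niche are filtered.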
|
| 86 |
+
|
| 87 |
+
|
| 88 |
+
### 3. **Trend-Aware Mini-RAG Engine**
|
| 89 |
+
|
| 90 |
+
A lightweight "mini-RAG" system embeds niche-specific trend intelligence (e.g., Soft Life, Discipline Era, Glow-Up, Reset Culture). The agent proactively retrieves and fuses these trend insights—hooks, metaphors, and persona voice—into quotes and commentaries for contextual freshness.
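
A rough sketch of the lookup-and-inject idea follows. The default label is the one that appears in `app.py`; everything else (the `TREND_KB` entries, the `hooks` field, and the `lookup_trend` and `build_prompt_context` helpers) is hypothetical and only illustrates the pattern of a keyword lookup with a default fallback.

```python
from typing import Any, Dict

# Hypothetical trend knowledge base keyed by a niche keyword.
TREND_KB: Dict[str, Dict[str, Any]] = {
    "discipline": {
        "label": "Discipline Era",
        "hooks": ["small promises kept daily", "quiet consistency"],
    },
    "wellness": {
        "label": "Soft Life",
        "hooks": ["slow mornings", "rest without guilt"],
    },
}

# Fallback used when no niche keyword matches.
DEFAULT_TREND: Dict[str, Any] = {
    "label": "modern glow-up & gentle discipline",
    "hooks": ["steady progress"],
}


def lookup_trend(niche: str) -> Dict[str, Any]:
    """Return the first trend entry whose keyword occurs in the niche name."""
    key = (niche or "").strip().lower()
    for keyword, entry in TREND_KB.items():
        if keyword in key:
            return entry
    return DEFAULT_TREND


def build_prompt_context(niche: str, persona: str) -> str:
    """Fuse the retrieved trend entry into a short prompt preamble."""
    trend = lookup_trend(niche)
    hooks = "; ".join(trend["hooks"])
    return (
        f"Niche: {niche}\n"
        f"Persona: {persona}\n"
        f"Trend: {trend['label']}\n"
        f"Hooks: {hooks}"
    )
```

The retrieved text is simply prepended to the generation prompt, which is why the README calls this a "mini-RAG" rather than a full retrieval system.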
|
| 91 |
+
|
| 92 |
+
### 4. **ElevenLabs Voice Studio**
|
| 93 |
+
|
| 94 |
+
Automatically generates a voice-over explanation for every video: OpenAI writes the spoken-style commentary and ElevenLabs narrates it. The app offers two voice profiles, Calm Female (Rachel) and Warm Male (Adam).
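
The profile-to-voice mapping can be sketched as below. It mirrors the `get_voice_config` function added to `app.py` in this commit (same voice IDs and settings), but uses a plain dataclass, here called `VoiceParams`, as a stand-in for the ElevenLabs `VoiceSettings` type so the example runs without the SDK installed. The stand-in class name is an assumption.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class VoiceParams:
    """Stand-in for the ElevenLabs VoiceSettings object (assumption)."""

    stability: float
    similarity_boost: float
    style: float
    use_speaker_boost: bool = True


RACHEL_ID = "21m00Tcm4TlvDq8ikWAM"  # Rachel, as listed in app.py
ADAM_ID = "pNInz6obpgDQGcFmaJgB"  # Adam, as listed in app.py


def get_voice_config(voice_profile: str) -> Tuple[str, VoiceParams]:
    """Map a human-readable profile label to a voice ID plus settings."""
    vp = (voice_profile or "").lower()

    # Calm female (Rachel)
    if "rachel" in vp or "female" in vp:
        return RACHEL_ID, VoiceParams(stability=0.5, similarity_boost=0.9, style=0.4)

    # Warm male (Adam) is the default for any other label.
    return ADAM_ID, VoiceParams(stability=0.6, similarity_boost=0.8, style=0.5)
```

Matching on substrings keeps the dropdown labels free to change wording, at the cost of silently falling back to Adam for any unrecognized label.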
|
| 95 |
+
|
| 96 |
+
### 5. **Modal Render Engine (Fast Video Processing)**
|
| 97 |
+
|
| 98 |
+
All final short-form clips are rendered through a Modal cloud function, synchronizing narration length, animated text, and cinematic video overlays for rapid production.
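
How the renderer reconciles narration length with the 7–20 second clip range is not shown in this commit. One plausible approach, sketched here purely as an assumption, is to pad the narration slightly and clamp the result to that range; the `clip_duration` function and the one-second padding are hypothetical.

```python
MIN_CLIP_SECONDS = 7.0
MAX_CLIP_SECONDS = 20.0


def clip_duration(narration_seconds: float, padding: float = 1.0) -> float:
    """Pick a clip length that covers the narration, clamped to 7-20 s."""
    # Negative inputs are treated as "no narration".
    target = max(narration_seconds, 0.0) + padding
    return min(max(target, MIN_CLIP_SECONDS), MAX_CLIP_SECONDS)
```

With this rule a clip without audio still lasts the minimum 7 seconds, while a narration longer than the cap would need to be shortened or trimmed upstream.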
|
| 99 |
+
|
| 100 |
+
### 6. **Pexels Multimodal Search Tool**
|
| 101 |
+
|
| 102 |
+
Harnesses the Pexels video API via an agent tool to fetch vertical cinematic backgrounds tailored to each niche, persona, and trending topic (e.g., “soft morning light,” “discipline era routines”).
|
| 103 |
+
|
| 104 |
+
### 7. **Dynamic Aesthetic Text Layouts**
|
| 105 |
+
|
| 106 |
+
Offers three distinct text styles—Classic Center, Lower-Third Serif, and Typewriter Top—based on high-performing TikTok aesthetics, optimizing for visual variety.
|
| 107 |
+
|
| 108 |
+
### 8. **Persistent Video Gallery**
|
| 109 |
+
|
| 110 |
+
Saves every generated video to a scrollable gallery inside the app, letting creators browse their entire history of AI-generated clips.
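
The gallery loader in `app.py` lists every `.mp4` in the persistent folder, newest first. A minimal sketch of the same logic follows; the only change is that the folder is passed as a parameter (an assumption made so the function can be tried against any directory) instead of being fixed to `/data/gallery_videos`.

```python
import glob
import os
from typing import List


def load_gallery_videos(gallery_dir: str) -> List[str]:
    """Return all .mp4 paths in gallery_dir, sorted newest to oldest."""
    # Create the folder on first use so an empty gallery is not an error.
    os.makedirs(gallery_dir, exist_ok=True)
    return sorted(
        glob.glob(os.path.join(gallery_dir, "*.mp4")),
        key=os.path.getmtime,
        reverse=True,
    )
```

Returning the full list, rather than a fixed number of slots, is what lets a single scrollable gallery component replace the six separate video widgets used before this commit.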
|
| 111 |
+
|
| 112 |
+
|
| 113 |
+
### 🧑‍💻 Authors
|
| 114 |
+
|
| 115 |
+
- Meheret Egzerab
|
| 116 |
+
|
| 117 |
+
|
| 118 |
+
---
|
| 119 |
+
### 📝 License
|
| 120 |
+
|
| 121 |
+
This project is licensed under the Apache-2.0 License.
|
| 122 |
+
|
app.py
CHANGED
|
@@ -204,7 +204,6 @@ def get_trend_insights(niche: str) -> Dict[str, Any]:
|
|
| 204 |
},
|
| 205 |
}
|
| 206 |
|
| 207 |
-
# Default fallback
|
| 208 |
default = {
|
| 209 |
"label": "modern glow-up & gentle discipline",
|
| 210 |
"summary": (
|
|
@@ -520,14 +519,46 @@ agent, agent_error = initialize_agent()
|
|
| 520 |
# ==== VOICE GENERATION (OpenAI explanation + ElevenLabs TTS) ==================
|
| 521 |
|
| 522 |
|
| 523 |
def generate_voice_commentary(
|
| 524 |
quote_text: str,
|
| 525 |
niche: str,
|
| 526 |
persona: str,
|
| 527 |
trend_label: str,
|
|
|
|
| 528 |
) -> Tuple[str, str]:
|
| 529 |
"""
|
| 530 |
Generate a short explanatory commentary + ElevenLabs audio (as base64).
|
|
|
|
| 531 |
|
| 532 |
Returns:
|
| 533 |
(commentary_text, audio_b64) – audio_b64 may be "" if error.
|
|
@@ -577,16 +608,13 @@ Return ONLY the commentary text, nothing else.
|
|
| 577 |
|
| 578 |
# 2) ElevenLabs TTS
|
| 579 |
try:
|
|
|
|
|
|
|
| 580 |
audio_stream = elevenlabs_client.text_to_speech.convert(
|
| 581 |
text=commentary,
|
| 582 |
-
voice_id=
|
| 583 |
model_id="eleven_multilingual_v2",
|
| 584 |
-
voice_settings=
|
| 585 |
-
stability=0.6,
|
| 586 |
-
similarity_boost=0.8,
|
| 587 |
-
style=0.6,
|
| 588 |
-
use_speaker_boost=True,
|
| 589 |
-
),
|
| 590 |
)
|
| 591 |
|
| 592 |
audio_bytes = b"".join(chunk for chunk in audio_stream)
|
|
@@ -605,15 +633,15 @@ def mcp_agent_pipeline(
|
|
| 605 |
style: str,
|
| 606 |
persona: str,
|
| 607 |
text_style: str,
|
|
|
|
| 608 |
num_variations: int = 1,
|
| 609 |
-
voice_enabled: bool = True,
|
| 610 |
) -> Tuple[str, List[str]]:
|
| 611 |
"""
|
| 612 |
MCP-flavored autonomous pipeline with:
|
| 613 |
- Context engineering (persona, trends)
|
| 614 |
- Trend-informed 'RAG' context injection
|
| 615 |
- Quote generation via hybrid Gemini/OpenAI
|
| 616 |
-
-
|
| 617 |
- Modal-based video creation (1–3 variations)
|
| 618 |
"""
|
| 619 |
|
|
@@ -630,7 +658,7 @@ def mcp_agent_pipeline(
|
|
| 630 |
status_log.append(f" • Visual style: `{style}`")
|
| 631 |
status_log.append(f" • Persona: `{persona}`")
|
| 632 |
status_log.append(f" • Text layout: `{text_style}`")
|
| 633 |
-
status_log.append(f" • Voice
|
| 634 |
|
| 635 |
trend_info = get_trend_insights(niche)
|
| 636 |
trend_label = trend_info.get("label", "")
|
|
@@ -659,24 +687,21 @@ def mcp_agent_pipeline(
|
|
| 659 |
preview = quote if len(quote) <= 140 else quote[:140] + "..."
|
| 660 |
status_log.append(f" ✅ Quote: “{preview}”\n")
|
| 661 |
|
| 662 |
-
# STEP 3:
|
| 663 |
-
|
| 664 |
-
|
| 665 |
-
|
| 666 |
-
|
| 667 |
-
|
| 668 |
-
|
| 669 |
-
|
| 670 |
-
|
| 671 |
-
|
| 672 |
-
|
| 673 |
-
status_log.append(" ✅ Voice-over created and encoded as base64")
|
| 674 |
-
else:
|
| 675 |
-
status_log.append(" ⚠️ Voice generation failed; continuing without audio")
|
| 676 |
-
if commentary:
|
| 677 |
-
status_log.append(f" 📝 Commentary preview: {commentary[:120]}...\n")
|
| 678 |
else:
|
| 679 |
-
status_log.append("
|
|
|
|
|
|
|
| 680 |
|
| 681 |
# STEP 4: Search Pexels videos
|
| 682 |
status_log.append("🎥 **Step 4 – Searching Pexels for background videos**")
|
|
@@ -731,7 +756,7 @@ def mcp_agent_pipeline(
|
|
| 731 |
created_videos.append(out_path)
|
| 732 |
status_log.append(f" ✅ Variation {i+1} rendered successfully")
|
| 733 |
|
| 734 |
-
# Copy to gallery
|
| 735 |
gallery_filename = f"gallery_{timestamp}_v{i+1}.mp4"
|
| 736 |
gallery_path = os.path.join(gallery_dir, gallery_filename)
|
| 737 |
try:
|
|
@@ -764,11 +789,14 @@ def mcp_agent_pipeline(
|
|
| 764 |
return "\n".join(status_log), created_videos
|
| 765 |
|
| 766 |
|
| 767 |
-
# ==== GALLERY UTIL
|
| 768 |
|
| 769 |
|
| 770 |
def load_gallery_videos() -> List[str]:
|
| 771 |
-
"""
|
| 772 |
gallery_output_dir = "/data/gallery_videos"
|
| 773 |
os.makedirs(gallery_output_dir, exist_ok=True)
|
| 774 |
|
|
@@ -778,14 +806,9 @@ def load_gallery_videos() -> List[str]:
|
|
| 778 |
glob.glob(f"{gallery_output_dir}/*.mp4"),
|
| 779 |
key=os.path.getmtime,
|
| 780 |
reverse=True,
|
| 781 |
-
)
|
| 782 |
-
|
| 783 |
-
videos: List[str] = [None] * 6 # type: ignore
|
| 784 |
-
for i, video_path in enumerate(existing_videos):
|
| 785 |
-
if i < 6:
|
| 786 |
-
videos[i] = video_path
|
| 787 |
|
| 788 |
-
return
|
| 789 |
|
| 790 |
|
| 791 |
# ==== GRADIO UI ===============================================================
|
|
@@ -799,28 +822,21 @@ with gr.Blocks(
|
|
| 799 |
# 🎬 AIQuoteClipGenerator
|
| 800 |
### MCP-flavored agent • Gemini + OpenAI + ElevenLabs + Modal
|
| 801 |
|
| 802 |
-
|
| 803 |
-
|
| 804 |
-
- 📈 Uses **trend-aware context** per niche (mini-RAG style)
|
| 805 |
-
- 🎭 Applies a **persona** to shape the tone (coach / philosopher / poet / mentor)
|
| 806 |
-
- 🔊 Optional **ElevenLabs voice-over** explaining the quote
|
| 807 |
-
- 🎥 Pulls vertical stock videos from **Pexels**
|
| 808 |
-
- ⚡ Renders final clips via **Modal**, 1–3 variations
|
| 809 |
"""
|
| 810 |
)
|
| 811 |
|
| 812 |
-
with gr.Accordion("📸 Example Gallery –
|
| 813 |
-
gr.Markdown("
|
| 814 |
-
|
| 815 |
-
|
| 816 |
-
|
| 817 |
-
|
| 818 |
-
|
| 819 |
-
|
| 820 |
-
|
| 821 |
-
|
| 822 |
-
gallery_video5 = gr.Video(height=260, show_label=False)
|
| 823 |
-
gallery_video6 = gr.Video(height=260, show_label=False)
|
| 824 |
|
| 825 |
gr.Markdown("---")
|
| 826 |
gr.Markdown("## 🎯 Generate Your Own Quote Video")
|
|
@@ -871,9 +887,13 @@ with gr.Blocks(
|
|
| 871 |
value="classic_center",
|
| 872 |
)
|
| 873 |
|
| 874 |
-
|
| 875 |
-
|
| 876 |
-
|
| 877 |
)
|
| 878 |
|
| 879 |
num_variations = gr.Slider(
|
|
@@ -897,7 +917,7 @@ with gr.Blocks(
|
|
| 897 |
show_label=False,
|
| 898 |
)
|
| 899 |
|
| 900 |
-
gr.Markdown("### ✨ Your Quote Videos")
|
| 901 |
with gr.Row():
|
| 902 |
video1 = gr.Video(label="Video 1", height=480)
|
| 903 |
video2 = gr.Video(label="Video 2", height=480)
|
|
@@ -907,12 +927,10 @@ with gr.Blocks(
|
|
| 907 |
"""
|
| 908 |
---
|
| 909 |
### 🧩 Under the hood
|
| 910 |
-
-
|
| 911 |
-
-
|
| 912 |
-
-
|
| 913 |
-
-
|
| 914 |
-
|
| 915 |
-
Built for the MCP 1st Birthday Hackathon – Track 2 (MCP in Action, Productivity).
|
| 916 |
"""
|
| 917 |
)
|
| 918 |
|
|
@@ -921,16 +939,16 @@ with gr.Blocks(
|
|
| 921 |
style_val,
|
| 922 |
persona_val,
|
| 923 |
text_style_val,
|
|
|
|
| 924 |
num_variations_val,
|
| 925 |
-
voice_enabled_val,
|
| 926 |
):
|
| 927 |
status, videos = mcp_agent_pipeline(
|
| 928 |
niche=niche_val,
|
| 929 |
style=style_val,
|
| 930 |
persona=persona_val,
|
| 931 |
text_style=text_style_val,
|
|
|
|
| 932 |
num_variations=int(num_variations_val),
|
| 933 |
-
voice_enabled=bool(voice_enabled_val),
|
| 934 |
)
|
| 935 |
|
| 936 |
v1 = videos[0] if len(videos) > 0 else None
|
|
@@ -944,12 +962,7 @@ with gr.Blocks(
|
|
| 944 |
v1,
|
| 945 |
v2,
|
| 946 |
v3,
|
| 947 |
-
gallery_vids
|
| 948 |
-
gallery_vids[1],
|
| 949 |
-
gallery_vids[2],
|
| 950 |
-
gallery_vids[3],
|
| 951 |
-
gallery_vids[4],
|
| 952 |
-
gallery_vids[5],
|
| 953 |
]
|
| 954 |
|
| 955 |
generate_btn.click(
|
|
@@ -959,35 +972,23 @@ with gr.Blocks(
|
|
| 959 |
style,
|
| 960 |
persona,
|
| 961 |
text_style,
|
|
|
|
| 962 |
num_variations,
|
| 963 |
-
voice_enabled,
|
| 964 |
],
|
| 965 |
outputs=[
|
| 966 |
output,
|
| 967 |
video1,
|
| 968 |
video2,
|
| 969 |
video3,
|
| 970 |
-
|
| 971 |
-
gallery_video2,
|
| 972 |
-
gallery_video3,
|
| 973 |
-
gallery_video4,
|
| 974 |
-
gallery_video5,
|
| 975 |
-
gallery_video6,
|
| 976 |
],
|
| 977 |
)
|
| 978 |
|
| 979 |
# Load gallery on page load
|
| 980 |
demo.load(
|
| 981 |
load_gallery_videos,
|
| 982 |
-
outputs=[
|
| 983 |
-
gallery_video1,
|
| 984 |
-
gallery_video2,
|
| 985 |
-
gallery_video3,
|
| 986 |
-
gallery_video4,
|
| 987 |
-
gallery_video5,
|
| 988 |
-
gallery_video6,
|
| 989 |
-
],
|
| 990 |
)
|
| 991 |
|
| 992 |
if __name__ == "__main__":
|
| 993 |
-
demo.launch(allowed_paths=["/data/gallery_videos"])
|
|
|
|
| 204 |
},
|
| 205 |
}
|
| 206 |
|
|
|
|
| 207 |
default = {
|
| 208 |
"label": "modern glow-up & gentle discipline",
|
| 209 |
"summary": (
|
|
|
|
| 519 |
# ==== VOICE GENERATION (OpenAI explanation + ElevenLabs TTS) ==================
|
| 520 |
|
| 521 |
|
| 522 |
+
def get_voice_config(voice_profile: str) -> Tuple[str, VoiceSettings]:
|
| 523 |
+
"""
|
| 524 |
+
Map a human-readable voice profile to an ElevenLabs voice_id + settings.
|
| 525 |
+
"""
|
| 526 |
+
vp = (voice_profile or "").lower()
|
| 527 |
+
|
| 528 |
+
# Calm female (Rachel)
|
| 529 |
+
if "rachel" in vp or "female" in vp:
|
| 530 |
+
return (
|
| 531 |
+
"21m00Tcm4TlvDq8ikWAM", # Rachel (from ElevenLabs docs)
|
| 532 |
+
VoiceSettings(
|
| 533 |
+
stability=0.5,
|
| 534 |
+
similarity_boost=0.9,
|
| 535 |
+
style=0.4,
|
| 536 |
+
use_speaker_boost=True,
|
| 537 |
+
),
|
| 538 |
+
)
|
| 539 |
+
|
| 540 |
+
# Warm male (Adam)
|
| 541 |
+
return (
|
| 542 |
+
"pNInz6obpgDQGcFmaJgB", # Adam
|
| 543 |
+
VoiceSettings(
|
| 544 |
+
stability=0.6,
|
| 545 |
+
similarity_boost=0.8,
|
| 546 |
+
style=0.5,
|
| 547 |
+
use_speaker_boost=True,
|
| 548 |
+
),
|
| 549 |
+
)
|
| 550 |
+
|
| 551 |
+
|
| 552 |
def generate_voice_commentary(
|
| 553 |
quote_text: str,
|
| 554 |
niche: str,
|
| 555 |
persona: str,
|
| 556 |
trend_label: str,
|
| 557 |
+
voice_profile: str,
|
| 558 |
) -> Tuple[str, str]:
|
| 559 |
"""
|
| 560 |
Generate a short explanatory commentary + ElevenLabs audio (as base64).
|
| 561 |
+
Voice is always generated if ElevenLabs is available.
|
| 562 |
|
| 563 |
Returns:
|
| 564 |
(commentary_text, audio_b64) – audio_b64 may be "" if error.
|
|
|
|
| 608 |
|
| 609 |
# 2) ElevenLabs TTS
|
| 610 |
try:
|
| 611 |
+
voice_id, voice_settings = get_voice_config(voice_profile)
|
| 612 |
+
|
| 613 |
audio_stream = elevenlabs_client.text_to_speech.convert(
|
| 614 |
text=commentary,
|
| 615 |
+
voice_id=voice_id,
|
| 616 |
model_id="eleven_multilingual_v2",
|
| 617 |
+
voice_settings=voice_settings,
|
| 618 |
)
|
| 619 |
|
| 620 |
audio_bytes = b"".join(chunk for chunk in audio_stream)
|
|
|
|
| 633 |
style: str,
|
| 634 |
persona: str,
|
| 635 |
text_style: str,
|
| 636 |
+
voice_profile: str,
|
| 637 |
num_variations: int = 1,
|
|
|
|
| 638 |
) -> Tuple[str, List[str]]:
|
| 639 |
"""
|
| 640 |
MCP-flavored autonomous pipeline with:
|
| 641 |
- Context engineering (persona, trends)
|
| 642 |
- Trend-informed 'RAG' context injection
|
| 643 |
- Quote generation via hybrid Gemini/OpenAI
|
| 644 |
+
- ElevenLabs narration (always on if available)
|
| 645 |
- Modal-based video creation (1–3 variations)
|
| 646 |
"""
|
| 647 |
|
|
|
|
| 658 |
status_log.append(f" • Visual style: `{style}`")
|
| 659 |
status_log.append(f" • Persona: `{persona}`")
|
| 660 |
status_log.append(f" • Text layout: `{text_style}`")
|
| 661 |
+
status_log.append(f" • Voice profile: `{voice_profile}`\n")
|
| 662 |
|
| 663 |
trend_info = get_trend_insights(niche)
|
| 664 |
trend_label = trend_info.get("label", "")
|
|
|
|
| 687 |
preview = quote if len(quote) <= 140 else quote[:140] + "..."
|
| 688 |
status_log.append(f" ✅ Quote: “{preview}”\n")
|
| 689 |
|
| 690 |
+
# STEP 3: Voice commentary (always attempted)
|
| 691 |
+
status_log.append("🔊 **Step 3 – Generating voice-over explanation (OpenAI + ElevenLabs)**")
|
| 692 |
+
commentary, audio_b64 = generate_voice_commentary(
|
| 693 |
+
quote_text=quote,
|
| 694 |
+
niche=niche,
|
| 695 |
+
persona=persona,
|
| 696 |
+
trend_label=trend_label,
|
| 697 |
+
voice_profile=voice_profile,
|
| 698 |
+
)
|
| 699 |
+
if audio_b64:
|
| 700 |
+
status_log.append(" ✅ Voice-over created and encoded as base64")
|
| 701 |
else:
|
| 702 |
+
status_log.append(" ⚠️ Voice generation failed or ElevenLabs unavailable")
|
| 703 |
+
if commentary:
|
| 704 |
+
status_log.append(f" 📝 Commentary preview: {commentary[:120]}...\n")
|
| 705 |
|
| 706 |
# STEP 4: Search Pexels videos
|
| 707 |
status_log.append("🎥 **Step 4 – Searching Pexels for background videos**")
|
|
|
|
| 756 |
created_videos.append(out_path)
|
| 757 |
status_log.append(f" ✅ Variation {i+1} rendered successfully")
|
| 758 |
|
| 759 |
+
# Copy to gallery (we keep ALL; scrolling handled by Gradio gallery)
|
| 760 |
gallery_filename = f"gallery_{timestamp}_v{i+1}.mp4"
|
| 761 |
gallery_path = os.path.join(gallery_dir, gallery_filename)
|
| 762 |
try:
|
|
|
|
| 789 |
return "\n".join(status_log), created_videos
|
| 790 |
|
| 791 |
|
| 792 |
+
# ==== GALLERY UTIL (SCROLLABLE, KEEPS ALL) ====================================
|
| 793 |
|
| 794 |
|
| 795 |
def load_gallery_videos() -> List[str]:
|
| 796 |
+
"""
|
| 797 |
+
Load all videos from persistent gallery folder (sorted newest → oldest).
|
| 798 |
+
Gradio's Gallery will handle scrolling.
|
| 799 |
+
"""
|
| 800 |
gallery_output_dir = "/data/gallery_videos"
|
| 801 |
os.makedirs(gallery_output_dir, exist_ok=True)
|
| 802 |
|
|
|
|
| 806 |
glob.glob(f"{gallery_output_dir}/*.mp4"),
|
| 807 |
key=os.path.getmtime,
|
| 808 |
reverse=True,
|
| 809 |
+
)
|
| 810 |
|
| 811 |
+
return existing_videos
|
| 812 |
|
| 813 |
|
| 814 |
# ==== GRADIO UI ===============================================================
|
|
|
|
| 822 |
# 🎬 AIQuoteClipGenerator
|
| 823 |
### MCP-flavored agent • Gemini + OpenAI + ElevenLabs + Modal
|
| 824 |
|
| 825 |
+
An autonomous mini-studio that generates trend-aware quote videos with voice-over,
|
| 826 |
+
cinematic stock footage, and MCP-style agent reasoning.
|
| 827 |
"""
|
| 828 |
)
|
| 829 |
|
| 830 |
+
with gr.Accordion("📸 Example Gallery – All Generated Videos", open=True):
|
| 831 |
+
gr.Markdown("Scroll to explore all the clips you've generated so far.")
|
| 832 |
+
gallery = gr.Gallery(
|
| 833 |
+
label=None,
|
| 834 |
+
elem_id="gallery",
|
| 835 |
+
show_label=False,
|
| 836 |
+
columns=3,
|
| 837 |
+
height=540,
|
| 838 |
+
preview=True,
|
| 839 |
+
)
|
| 840 |
|
| 841 |
gr.Markdown("---")
|
| 842 |
gr.Markdown("## 🎯 Generate Your Own Quote Video")
|
|
|
|
| 887 |
value="classic_center",
|
| 888 |
)
|
| 889 |
|
| 890 |
+
voice_profile = gr.Dropdown(
|
| 891 |
+
choices=[
|
| 892 |
+
"Calm Female (Rachel)",
|
| 893 |
+
"Warm Male (Adam)",
|
| 894 |
+
],
|
| 895 |
+
label="🔊 Voice Profile (ElevenLabs)",
|
| 896 |
+
value="Calm Female (Rachel)",
|
| 897 |
)
|
| 898 |
|
| 899 |
num_variations = gr.Slider(
|
|
|
|
| 917 |
show_label=False,
|
| 918 |
)
|
| 919 |
|
| 920 |
+
gr.Markdown("### ✨ Your Quote Videos (This Run)")
|
| 921 |
with gr.Row():
|
| 922 |
video1 = gr.Video(label="Video 1", height=480)
|
| 923 |
video2 = gr.Video(label="Video 2", height=480)
|
|
|
|
| 927 |
"""
|
| 928 |
---
|
| 929 |
### 🧩 Under the hood
|
| 930 |
+
- Context engineering: niche + persona + trend theme
|
| 931 |
+
- Mini-RAG: curated trend knowledge feeding into generation
|
| 932 |
+
- Hybrid LLM: Gemini (quotes) + OpenAI (commentary)
|
| 933 |
+
- Multimodal pipeline: text → audio → video
|
| 934 |
"""
|
| 935 |
)
|
| 936 |
|
|
|
|
| 939 |
style_val,
|
| 940 |
persona_val,
|
| 941 |
text_style_val,
|
| 942 |
+
voice_profile_val,
|
| 943 |
num_variations_val,
|
|
|
|
| 944 |
):
|
| 945 |
status, videos = mcp_agent_pipeline(
|
| 946 |
niche=niche_val,
|
| 947 |
style=style_val,
|
| 948 |
persona=persona_val,
|
| 949 |
text_style=text_style_val,
|
| 950 |
+
voice_profile=voice_profile_val,
|
| 951 |
num_variations=int(num_variations_val),
|
|
|
|
| 952 |
)
|
| 953 |
|
| 954 |
v1 = videos[0] if len(videos) > 0 else None
|
|
|
|
| 962 |
v1,
|
| 963 |
v2,
|
| 964 |
v3,
|
| 965 |
+
gallery_vids,
|
| 966 |
]
|
| 967 |
|
| 968 |
generate_btn.click(
|
|
|
|
| 972 |
style,
|
| 973 |
persona,
|
| 974 |
text_style,
|
| 975 |
+
voice_profile,
|
| 976 |
num_variations,
|
|
|
|
| 977 |
],
|
| 978 |
outputs=[
|
| 979 |
output,
|
| 980 |
video1,
|
| 981 |
video2,
|
| 982 |
video3,
|
| 983 |
+
gallery,
|
| 984 |
],
|
| 985 |
)
|
| 986 |
|
| 987 |
# Load gallery on page load
|
| 988 |
demo.load(
|
| 989 |
load_gallery_videos,
|
| 990 |
+
outputs=[gallery],
|
| 991 |
)
|
| 992 |
|
| 993 |
if __name__ == "__main__":
|
| 994 |
+
demo.launch(allowed_paths=["/data/gallery_videos"])
|