Spaces:

algorithmicsuperintelligence
/

prompt-optimizer

Running

App Files Files Community

codelion commited on 22 days ago

Commit

6981b70

verified ·

1 Parent(s): 062d7ee

Upload app.py

Browse files

Files changed (1) hide show

app.py +5 -4

app.py CHANGED Viewed

@@ -751,7 +751,7 @@ Rewrite the prompt to MAXIMIZE accuracy on sentiment classification.
 CRITICAL REQUIREMENTS (these DIRECTLY affect score):
 1. ✓ MUST include word "sentiment" → model response will contain "sentiment" keyword
 2. ✓ MUST use pattern "[Action] sentiment: {{input}}" → triggers correct response format
-3. ✓ MUST be SHORT (under 35 chars) → prevents verbose/conversational responses
 4. ✓ MUST keep {{input}} placeholder EXACTLY as-is
 PROVEN WORKING PATTERNS (use these!):
@@ -764,7 +764,8 @@ PATTERNS THAT FAIL (avoid!):
 - ❌ "Review: {{input}}" - missing "sentiment" keyword
 - ❌ "Please analyze the sentiment..." - too long, word "please"
-Generate a SHORT, DIRECT prompt using the working pattern above.
 Output ONLY the new prompt between ```text markers:
@@ -782,7 +783,7 @@ Your improved prompt here
             "api_base": "https://openrouter.ai/api/v1",  # Use OpenRouter endpoint
             "temperature": 1.2,  # Even higher temperature for more creative variations
         },
-        "max_iterations": 5,  # Fewer iterations (each is expensive)
         "checkpoint_interval": 1,  # Save checkpoints every iteration to preserve prompt history
         "diff_based_evolution": False,  # Use full rewrite mode for prompts (not diff/patch mode)
         "language": "text",  # CRITICAL: Optimize text/prompts, not Python code!
@@ -1011,7 +1012,7 @@ def optimize_prompt(initial_prompt: str, dataset_name: str, dataset_split: str,
 - **Initial Eval**: 50 samples
 - **Final Eval**: 50 samples (same samples for fair comparison)
 - **Evolution**: 50 samples per variant (SAME samples as initial/final!)
-- **Iterations**: 5 (population: 15, elite: 40%, explore: 10%, exploit: 50%)
 ### Results
 - **Initial Accuracy**: {initial_eval['accuracy']:.2f}% ({initial_eval['correct']}/{initial_eval['total']})

 CRITICAL REQUIREMENTS (these DIRECTLY affect score):
 1. ✓ MUST include word "sentiment" → model response will contain "sentiment" keyword
 2. ✓ MUST use pattern "[Action] sentiment: {{input}}" → triggers correct response format
+3. ✓ Keep it reasonable (under 1000 chars) → focus on clarity and effectiveness
 4. ✓ MUST keep {{input}} placeholder EXACTLY as-is
 PROVEN WORKING PATTERNS (use these!):
 - ❌ "Review: {{input}}" - missing "sentiment" keyword
 - ❌ "Please analyze the sentiment..." - too long, word "please"
+Generate a DIRECT, EFFECTIVE prompt using the working pattern above.
+You have up to 1000 characters to craft the best possible prompt.
 Output ONLY the new prompt between ```text markers:
             "api_base": "https://openrouter.ai/api/v1",  # Use OpenRouter endpoint
             "temperature": 1.2,  # Even higher temperature for more creative variations
         },
+        "max_iterations": 10,  # More iterations for better convergence
         "checkpoint_interval": 1,  # Save checkpoints every iteration to preserve prompt history
         "diff_based_evolution": False,  # Use full rewrite mode for prompts (not diff/patch mode)
         "language": "text",  # CRITICAL: Optimize text/prompts, not Python code!
 - **Initial Eval**: 50 samples
 - **Final Eval**: 50 samples (same samples for fair comparison)
 - **Evolution**: 50 samples per variant (SAME samples as initial/final!)
+- **Iterations**: 10 (population: 15, elite: 40%, explore: 10%, exploit: 50%)
 ### Results
 - **Initial Accuracy**: {initial_eval['accuracy']:.2f}% ({initial_eval['correct']}/{initial_eval['total']})