codelion committed on
Commit 6981b70 · verified · 1 Parent(s): 062d7ee

Upload app.py

Files changed (1)
  1. app.py +5 -4
app.py CHANGED
@@ -751,7 +751,7 @@ Rewrite the prompt to MAXIMIZE accuracy on sentiment classification.
  CRITICAL REQUIREMENTS (these DIRECTLY affect score):
  1. ✓ MUST include word "sentiment" → model response will contain "sentiment" keyword
  2. ✓ MUST use pattern "[Action] sentiment: {{input}}" → triggers correct response format
- 3. ✓ MUST be SHORT (under 35 chars) → prevents verbose/conversational responses
  4. ✓ MUST keep {{input}} placeholder EXACTLY as-is
 
  PROVEN WORKING PATTERNS (use these!):
@@ -764,7 +764,8 @@ PATTERNS THAT FAIL (avoid!):
  - ❌ "Review: {{input}}" - missing "sentiment" keyword
  - ❌ "Please analyze the sentiment..." - too long, word "please"
 
- Generate a SHORT, DIRECT prompt using the working pattern above.
 
  Output ONLY the new prompt between ```text markers:
 
@@ -782,7 +783,7 @@ Your improved prompt here
  "api_base": "https://openrouter.ai/api/v1", # Use OpenRouter endpoint
  "temperature": 1.2, # Even higher temperature for more creative variations
  },
- "max_iterations": 5, # Fewer iterations (each is expensive)
  "checkpoint_interval": 1, # Save checkpoints every iteration to preserve prompt history
  "diff_based_evolution": False, # Use full rewrite mode for prompts (not diff/patch mode)
  "language": "text", # CRITICAL: Optimize text/prompts, not Python code!
@@ -1011,7 +1012,7 @@ def optimize_prompt(initial_prompt: str, dataset_name: str, dataset_split: str,
  - **Initial Eval**: 50 samples
  - **Final Eval**: 50 samples (same samples for fair comparison)
  - **Evolution**: 50 samples per variant (SAME samples as initial/final!)
- - **Iterations**: 5 (population: 15, elite: 40%, explore: 10%, exploit: 50%)
 
  ### Results
  - **Initial Accuracy**: {initial_eval['accuracy']:.2f}% ({initial_eval['correct']}/{initial_eval['total']})
 
  CRITICAL REQUIREMENTS (these DIRECTLY affect score):
  1. ✓ MUST include word "sentiment" → model response will contain "sentiment" keyword
  2. ✓ MUST use pattern "[Action] sentiment: {{input}}" → triggers correct response format
+ 3. ✓ Keep it reasonable (under 1000 chars) → focus on clarity and effectiveness
  4. ✓ MUST keep {{input}} placeholder EXACTLY as-is
 
  PROVEN WORKING PATTERNS (use these!):
 
  - ❌ "Review: {{input}}" - missing "sentiment" keyword
  - ❌ "Please analyze the sentiment..." - too long, word "please"
 
+ Generate a DIRECT, EFFECTIVE prompt using the working pattern above.
+ You have up to 1000 characters to craft the best possible prompt.
 
  Output ONLY the new prompt between ```text markers:
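
The requirements in the updated meta-prompt (contains "sentiment", keeps the input placeholder, stays under 1000 characters, reply wrapped in a text fence) could be checked mechanically before a candidate is scored. The following is only a sketch: `extract_prompt`, `check_prompt`, and the `{input}` default are hypothetical names and assumptions, not code from app.py. In particular, it assumes the doubled braces in the diff are f-string escapes for a literal `{input}`; pass a different `placeholder` if that is not the case.

```python
import re
from typing import List, Optional

FENCE = "`" * 3  # built indirectly so this example can itself sit inside a fence
MAX_PROMPT_CHARS = 1000  # limit stated in the updated meta-prompt


def extract_prompt(response: str) -> Optional[str]:
    """Return the text inside the first text-fenced block, or None if absent."""
    pattern = re.escape(FENCE) + r"text\s*\n(.*?)" + re.escape(FENCE)
    match = re.search(pattern, response, flags=re.DOTALL)
    return match.group(1).strip() if match else None


def check_prompt(prompt: str, placeholder: str = "{input}") -> List[str]:
    """Return the violated requirements; an empty list means the prompt passes."""
    problems = []
    if "sentiment" not in prompt.lower():
        problems.append('missing the word "sentiment"')
    if placeholder not in prompt:
        problems.append(f"missing the {placeholder} placeholder")
    if len(prompt) >= MAX_PROMPT_CHARS:
        problems.append(f"not under {MAX_PROMPT_CHARS} characters")
    return problems


# Hypothetical model reply following the "[Action] sentiment: ..." pattern.
reply = "Here it is:\n" + FENCE + "text\nClassify sentiment: {input}\n" + FENCE
candidate = extract_prompt(reply)
print(candidate)                         # Classify sentiment: {input}
print(check_prompt(candidate))           # []
print(check_prompt("Review: {input}"))   # ['missing the word "sentiment"']
```

A candidate that fails any check could be rejected before spending evaluation calls on it; whether app.py does anything like this is not visible in the diff.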
 
 
  "api_base": "https://openrouter.ai/api/v1", # Use OpenRouter endpoint
  "temperature": 1.2, # Even higher temperature for more creative variations
  },
+ "max_iterations": 10, # More iterations for better convergence
  "checkpoint_interval": 1, # Save checkpoints every iteration to preserve prompt history
  "diff_based_evolution": False, # Use full rewrite mode for prompts (not diff/patch mode)
  "language": "text", # CRITICAL: Optimize text/prompts, not Python code!
 
  - **Initial Eval**: 50 samples
  - **Final Eval**: 50 samples (same samples for fair comparison)
  - **Evolution**: 50 samples per variant (SAME samples as initial/final!)
+ - **Iterations**: 10 (population: 15, elite: 40%, explore: 10%, exploit: 50%)
 
  ### Results
  - **Initial Accuracy**: {initial_eval['accuracy']:.2f}% ({initial_eval['correct']}/{initial_eval['total']})
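
Since the removed comment called each iteration expensive, a rough evaluation budget helps show what doubling `max_iterations` costs. This is a back-of-the-envelope sketch only: `evaluation_calls` is a hypothetical helper, and it assumes one evaluated variant per iteration and one model call per sample, neither of which is confirmed by the diff.

```python
def evaluation_calls(iterations: int, samples: int = 50,
                     variants_per_iteration: int = 1) -> int:
    """Initial eval + final eval + one evaluation pass per evolved variant.

    The 50-sample default comes from the run summary in the diff; the
    one-variant-per-iteration default is an assumption.
    """
    if iterations < 0 or samples < 0 or variants_per_iteration < 0:
        raise ValueError("arguments must be non-negative")
    initial_and_final = 2 * samples
    evolution = iterations * variants_per_iteration * samples
    return initial_and_final + evolution


before = evaluation_calls(5)    # old setting: max_iterations = 5
after = evaluation_calls(10)    # new setting: max_iterations = 10
print(before, after)            # 350 600
```

Under these assumptions the change raises the estimated number of sample evaluations per run from 350 to 600.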