violetxi/clbench-exploitable-poker-wm-sft-opponent-dynamics-thinking Viewer • Updated 13 days ago • 49 • 41
violetxi/clbench-exploitable-poker-wm-sft-opponent-dynamics Viewer • Updated 17 days ago • 1.12k • 41
violetxi/clbench-exploitable-poker-wm_summar-gemini-flash-3-1-lite Viewer • Updated 19 days ago • 2.74k • 95
violetxi/clbench-exploitable-poker-step3-state-action-wm-labels Viewer • Updated 19 days ago • 1.12k • 44
violetxi/single-turn-eval-int_qwen3-4b_distill_teacher_reverse_kl_lr1e-7-n32 Viewer • Updated 22 days ago • 566 • 69
violetxi/single-turn-eval-meta_feedback_qwen3-4b_step2_gpt-5-nano_gepa-n32 Viewer • Updated May 5 • 1.01k • 44
violetxi/single-turn-eval-meta_feedback_qwen3-4b_step2_gpt-5.4_gepa-n32 Viewer • Updated May 5 • 1.01k • 37
violetxi/stage1_proof-qwen3-4b-grpo-imoproofbench-summary-reasoning-graded Viewer • Updated Apr 30 • 960 • 132
violetxi/stage1_proof-qwen3-4b-dense-process-rubric-imoproofbench-summary-graded Viewer • Updated Apr 30 • 960 • 19
violetxi/stage1_proof-qwen3-4b-dense-process-imoproofbench-summary-reasoning-graded Viewer • Updated Apr 30 • 960 • 13
violetxi/stage1_proof-qwen3-4b-dense-process-proofbench-summary-graded Viewer • Updated Apr 27 • 2.32k • 56
violetxi/stage1_proof-qwen3-4b-dense-process-imoproofbench-summary-graded Viewer • Updated Apr 27 • 960 • 8
violetxi/stage1_proof-qwen3-4b-self-distill-proofbench-summary-graded Viewer • Updated Apr 27 • 2.32k • 21
violetxi/stage1_proof-qwen3-4b-self-distill-imoproofbench-summary-graded Viewer • Updated Apr 27 • 960 • 62
violetxi/stage1_proof-qwen3-4b-sft-imoproofbench-summary-graded-gemini3_pro Viewer • Updated Apr 27 • 960 • 11