GUI-Libra
Training GUI agents with augmented reasoning data and a tailored post-training recipe
Ray2333/GUI-Libra-3B 4B • Updated 4 days ago • 34
Ray2333/Libra-81K-SFT Updated 12 days ago • 18
Ray2333/Offline_Evaluation Viewer • Updated 11 days ago • 35.2k • 10
Ray2333/Libra-81K Viewer • Updated 13 days ago • 738 • 14
GRM
Generalizable Reward Models
Ray2333/GRM-llama3-8B-sftreg Text Classification • 8B • Updated Feb 5, 2025 • 8 • 5
Ray2333/GRM-llama3-8B-distill Text Classification • 8B • Updated Feb 5, 2025 • 71 • 6
Ray2333/GRM-Gemma-2B-sftreg Text Classification • 3B • Updated Feb 5, 2025 • 345 • 3
Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs Paper • 2406.10216 • Published Jun 14, 2024 • 2
Ray2333/reward-model-Mistral-7B-instruct-Unified-Feedback Text Classification • 7B • Updated Feb 5, 2025 • 186 • 11