Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity
Zhang Xingjian
Zhang199
AI & ML interests
Large Multimodal Models
TinyLLaVA-Video-R1
Towards Smaller LMMs for Video Reasoning.
Zhang199/TinyLLaVA-Video-R1
Video-Text-to-Text • 4B • Updated • 5 • 4
Zhang199/TinyLLaVA-Video-Coldstart_NextQA_16
Video-Text-to-Text • 4B • Updated • 9 • 1
Zhang199/TinyLLaVA-Video-R1-training-data
Updated • 59 • 1
TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning
Paper • 2504.09641 • Published • 16
EDGE-GRPO
Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity
models 11
Zhang199/TinyLLaVA-Qwen2-0.5B-SigLIP
Image-Text-to-Text • 1B • Updated • 160 • 7
Zhang199/EDGE-GRPO-Qwen-1.5B
Text Generation • 2B • Updated • 2
Zhang199/EDGE-GRPO-Qwen-7B
Text Generation • 8B • Updated • 1 • 1
Zhang199/TinyLLaVA-Video-Qwen2.5-3B-Group-16-512
Video-Text-to-Text • 4B • Updated • 469 • 1
Zhang199/TinyLLaVA-Video-Qwen2.5-3B-Naive-16-512
Video-Text-to-Text • 4B • Updated • 2
Zhang199/TinyLLaVA-Video-Phi2-Naive-16-512
Video-Text-to-Text • 3B • Updated • 12
Zhang199/TinyLLaVA-Qwen2.5-3B-SigLIP
Image-Text-to-Text • 4B • Updated • 646
Zhang199/TinyLLaVA-Video-R1
Video-Text-to-Text • 4B • Updated • 5 • 4
Zhang199/TinyLLaVA-Video-Coldstart_NextQA_16
Video-Text-to-Text • 4B • Updated • 9 • 1
Zhang199/TinyLLaVA-Video-Qwen2.5-3B-Group-1fps-512
Video-Text-to-Text • 4B • Updated • 1