In a Training Loop 🔄

Rui Yang PRO

Ray2333

https://yangrui2015.github.io

YangRui2015

AI & ML interests

Deep Reinforcement Learning

Recent Activity

updated a collection 12 days ago

published a model 12 days ago

Ray2333/Qwen3-VL-3B-sft-reasoning_and_grounding_changecoord_mixnoreasoning_cpt636

Organizations

Collections 2

Papers 8

arxiv:2602.22190

arxiv:2510.27623

arxiv:2510.12693

arxiv:2506.03143

Models 26

Ray2333/Qwen3-VL-3B-sft-reasoning_and_grounding_changecoord_mixnoreasoning_cpt636

4B • Updated 12 days ago • 29

Ray2333/Qwen3-VL-7B-sft-reasoning_and_grounding_changecoord_mixnoreasoning_cpt636

8B • Updated 12 days ago • 25

Ray2333/Qwen3-VL-4B-weighted_sft_ratio2-reasoning_and_grounding_changecoord_mixnoreasoning_cpt637

4B • Updated 12 days ago • 39

Ray2333/Qwen3-VL-8B-weighted_sft_ratio2-reasoning_and_grounding_changecoord_mixnoreasoning_cpt637

9B • Updated 12 days ago • 25

Ray2333/Qwen3-VL-4B-sft-reasoning_and_grounding_changecoord_mixnoreasoning_cpt637

4B • Updated 12 days ago • 18

Ray2333/Qwen3-VL-8B-sft-reasoning_and_grounding_changecoord_mixnoreasoning_cpt637

9B • Updated 12 days ago • 30

Ray2333/Qwen2.5-VL-3B-sft-reasoning_and_grounding_changecoord_cpt636

4B • Updated 12 days ago • 16

Ray2333/Qwen2.5-VL-3B-weighted_sft_ratio2-reasoning_and_grounding_changecoord_mixnoreasoning_cpt636

4B • Updated 12 days ago • 18

Ray2333/Qwen2.5-VL-7B-weighted_sft_ratio2-reasoning_and_grounding_changecoord_mixnoreasoning_cpt636

8B • Updated 12 days ago • 27

Ray2333/Qwen2.5-VL-7B-sft-reasoning_and_grounding_changecoord_cpt636

8B • Updated 12 days ago • 26

Datasets 4

Ray2333/Libra-81K-SFT

Updated Mar 31 • 30

Ray2333/Offline_Evaluation

Viewer • Updated Feb 22 • 35.2k • 29

Ray2333/Libra-81K

Viewer • Updated Feb 20 • 738 • 73

Ray2333/RiC_harmless_helpful

Viewer • Updated Jul 12, 2024 • 291k • 63