LIU Shih-yang

sliuau

AI & ML interests

None yet

Recent Activity

authored a paper 1 day ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

upvoted a paper 2 days ago

TiDAR: Think in Diffusion, Talk in Autoregression

upvoted a paper 2 days ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Papers 7

arxiv:2601.05242

arxiv:2510.15110

arxiv:2410.21271

arxiv:2402.09353

Models 11

sliuau/llama-3.2-3b_4bits_128group_size_eora_rank128_mmlu_c4

Updated Mar 10, 2025

sliuau/llama-3.2-3b_4bits_128group_size_eora_rank64_mmlu_c4

Updated Mar 7, 2025

sliuau/Llama-3.2-3B_4bits_128group_size

Text Generation • 3B • Updated Mar 7, 2025 • 249

sliuau/llama3.2-1b-4bit-group128-eora-rank128-arc-v2

Updated Feb 6, 2025

sliuau/llama3.2-1b-4bit-group128

1B • Updated Feb 5, 2025 • 69

sliuau/llama3.2-1b-4bit-group128-eora-rank128-arc

Updated Feb 5, 2025

sliuau/llama3_8b_gptq_w4_eora_arc

Updated Dec 30, 2024

sliuau/llama3_8b_gptq_4bits

Updated Dec 30, 2024

sliuau/llama3_8b_gptq_w3_eora_arc

Updated Dec 24, 2024

sliuau/llama3_8b_gptq_3bits

Updated Dec 24, 2024 • 2

Datasets 8

sliuau/DeepScaleR-Preview-Dataset-verl-format

Viewer • Updated Nov 3, 2025 • 40.8k • 21

sliuau/FULL-MATH

Viewer • Updated Apr 30, 2025 • 7.5k • 10

sliuau/c4-val

Viewer • Updated Apr 14, 2025 • 45.6k • 10

sliuau/c4-train

Viewer • Updated Apr 14, 2025 • 356k • 23

sliuau/gsm9k_reasoning_openr1_eval_format

Viewer • Updated Apr 3, 2025 • 1.32k • 8

sliuau/truncated_openr1_math_0.1k

Viewer • Updated Feb 28, 2025 • 100 • 4

sliuau/truncated_openr1_math_1k

Viewer • Updated Feb 28, 2025 • 1k • 7

sliuau/c4

Updated Jan 9, 2025 • 1