Васильев Сергей

miljones2024

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 16 hours ago

DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards

Paper • 2605.21467 • Published 4 days ago • 190

liked a dataset 2 days ago

wegrthj/l36l5h-v654-data

Updated 2 minutes ago • 21.2k • 4

liked a model 3 days ago

TAUR-dev/rankalign-v6-gemma-2-2b-d0.15-e1-persona-v1-all-tcs-fsx-sm0.1

Text Generation • 3B • Updated 3 days ago • 14 • 1

upvoted a paper 4 days ago

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

Paper • 2605.11609 • Published 12 days ago • 189

liked a model 6 days ago

arepaconcafe/neko-base

Updated about 4 hours ago • 4

liked a dataset 10 days ago

mrmrx/CADS-dataset

Viewer • Updated 3 days ago • 21.8k • 4.82k • 53

upvoted a paper 10 days ago

Mean Mode Screaming: Mean--Variance Split Residuals for 1000-Layer Diffusion Transformers

Paper • 2605.06169 • Published 17 days ago • 215

upvoted a paper 13 days ago

Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling

Paper • 2604.28075 • Published 24 days ago • 20

upvoted a paper 17 days ago

MolmoAct2: Action Reasoning Models for Real-world Deployment

Paper • 2605.02881 • Published 20 days ago • 335

liked a model about 1 month ago

Bialy17/qwen-finetuned-Reasoning-Socratic-QandA-unsloth

Updated about 1 month ago • 1

upvoted a paper about 1 month ago

LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model

Paper • 2604.20796 • Published Apr 22 • 240

liked 2 models about 1 month ago

openbmb/VoxCPM2

Text-to-Speech • Updated Apr 16 • 199k • 1.32k

tencent/HY-Embodied-0.5

Image-Text-to-Text • 4B • Updated Apr 14 • 858 • 906

liked a dataset about 1 month ago

rahuljoy/stack_binary_subset_chat

Viewer • Updated Apr 13 • 2.03k • 10 • 1

liked a model about 1 month ago

mradermacher/Vero-MiMo-7B-i1-GGUF

Reinforcement Learning • 8B • Updated Apr 21 • 542 • 2

upvoted a paper about 1 month ago

GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

Paper • 2604.02721 • Published Apr 3 • 629

liked a model about 2 months ago

PhoenixHu/ral_grpo_internvl2_5_how2sign_1b_bleu1_rouge_kl05_temp07_0405_metta

Updated Apr 8 • 1

upvoted 2 papers about 2 months ago

Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published Apr 2 • 503

AgentWatcher: A Rule-based Prompt Injection Monitor

Paper • 2604.01194 • Published Apr 1 • 3

liked a model about 2 months ago

Outlier-Ai/Outlier-10B

Text Generation • Updated 19 days ago • 207 • 2
