SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning • Paper • 2602.13515 • Published 10 days ago • 41
WorldCompass: Reinforcement Learning for Long-Horizon World Models • Paper • 2602.09022 • Published 14 days ago • 20
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos • Paper • 2602.06949 • Published 17 days ago • 34
DFlash: Block Diffusion for Flash Speculative Decoding • Paper • 2602.06036 • Published 18 days ago • 42
Pathwise Test-Time Correction for Autoregressive Long Video Generation • Paper • 2602.05871 • Published 18 days ago • 3
Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR • Paper • 2602.05261 • Published 18 days ago • 49
ProAct: Agentic Lookahead in Interactive Environments • Paper • 2602.05327 • Published 18 days ago • 25
RISE-Video: Can Video Generators Decode Implicit World Rules? • Paper • 2602.05986 • Published 18 days ago • 26
HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing • Paper • 2602.03560 • Published 20 days ago • 44
Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization • Paper • 2602.02958 • Published 20 days ago • 33
Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation • Paper • 2602.02214 • Published 21 days ago • 24
Language-based Trial and Error Falls Behind in the Era of Experience • Paper • 2601.21754 • Published 25 days ago • 16
Towards Pixel-Level VLM Perception via Simple Points Prediction • Paper • 2601.19228 • Published 27 days ago • 18