One-Minute Video Generation with Test-Time Training
Paper
• 2504.05298
• Published
• 110
MoCha: Towards Movie-Grade Talking Character Synthesis
Paper
• 2503.23307
• Published
• 139
Towards Understanding Camera Motions in Any Video
Paper
• 2504.15376
• Published
• 155
Antidistillation Sampling
Paper
• 2504.13146
• Published
• 59
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization
Paper
• 2503.19901
• Published
• 41
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance
Paper
• 2504.01724
• Published
• 68
Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation
Paper
• 2412.01316
• Published
• 10
STIV: Scalable Text and Image Conditioned Video Generation
Paper
• 2412.07730
• Published
• 74
VidGen-1M: A Large-Scale Dataset for Text-to-video Generation
Paper
• 2408.02629
• Published
• 15
VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation
Paper
• 2503.01739
• Published
• 9
Video-T1: Test-Time Scaling for Video Generation
Paper
• 2503.18942
• Published
• 90
VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide
Paper
• 2410.04364
• Published
• 29
Improving Video Generation with Human Feedback
Paper
• 2501.13918
• Published
• 52
Training-free Long Video Generation with Chain of Diffusion Model Experts
Paper
• 2408.13423
• Published
• 23
VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models
Paper
• 2502.02492
• Published
• 66
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
Paper
• 2502.01061
• Published
• 223
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps
Paper
• 2501.09732
• Published
• 72
LTX-Video: Realtime Video Latent Diffusion
Paper
• 2501.00103
• Published
• 50
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
Paper
• 2412.05271
• Published
• 160
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
Paper
• 2501.00599
• Published
• 46
Shifting AI Efficiency From Model-Centric to Data-Centric Compression
Paper
• 2505.19147
• Published
• 145
AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection
Paper
• 2505.07293
• Published
• 28
Alchemist: Turning Public Text-to-Image Data into Generative Gold
Paper
• 2505.19297
• Published
• 84
Predictive Data Selection: The Data That Predicts Is the Data That Teaches
Paper
• 2503.00808
• Published
• 56
R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training
Paper
• 2505.00358
• Published
• 26
ICon: In-Context Contribution for Automatic Data Selection
Paper
• 2505.05327
• Published
• 12
SWE-smith: Scaling Data for Software Engineering Agents
Paper
• 2504.21798
• Published
• 14
MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning
Paper
• 2505.24871
• Published
• 23
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale
Paper
• 2409.17115
• Published
• 64