One Model, Many Latencies: Universal Speech Enhancement for Diverse Real-Time Applications Paper • 2606.25621 • Published 2 days ago • 13
Zone of Proximal Policy Optimization: Teacher in Prompts, Not Gradients Paper • 2606.18216 • Published 10 days ago • 61
Cosmos3 Collection Omnimodal World Models for Physical AI • 16 items • Updated about 6 hours ago • 131
SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning Paper • 2606.13673 • Published 15 days ago • 106
FineBench: Benchmarking and Enhancing Vision-Language Models for Fine-grained Human Activity Understanding Paper • 2605.19846 • Published May 20 • 3
Physics in 2-Steps: Locking Motion Priors Before Visual Refinement Erases Them Paper • 2606.06361 • Published 22 days ago • 16
Why Far Looks Up: Probing Spatial Representation in Vision-Language Models Paper • 2605.30161 • Published 29 days ago • 60
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published about 1 month ago • 93
Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation Article • nvidia • May 18 • 21