Point-It-Out: Benchmarking Embodied Reasoning for Vision Language Models in Multi-Stage Visual Grounding • Paper • 2509.25794 • Published Sep 30, 2025 • 2
MoGAN: Improving Motion Quality in Video Diffusion via Few-Step Motion Adversarial Post-Training • Paper • 2511.21592 • Published Nov 26, 2025 • 1
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model • Paper • 2401.09417 • Published Jan 17, 2024 • 62