- Φeat: Physically-Grounded Feature Representation
  Paper • 2511.11270 • Published • 10
- TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation
  Paper • 2412.03069 • Published • 34
- UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation
  Paper • 2510.10575 • Published • 1

Collections including paper arxiv:2511.11270
- A Survey of Reinforcement Learning for Large Reasoning Models
  Paper • 2509.08827 • Published • 190
- Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference
  Paper • 2508.02193 • Published • 133
- Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations
  Paper • 2510.23607 • Published • 177
- Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation
  Paper • 2510.08673 • Published • 125

- End-to-End Vision Tokenizer Tuning
  Paper • 2505.10562 • Published • 22
- Global and Local Entailment Learning for Natural World Imagery
  Paper • 2506.21476 • Published • 1
- DINOv3
  Paper • 2508.10104 • Published • 291
- Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic
  Paper • 2509.01363 • Published • 58