Closing the Loop: Universal Repository Representation with RPG-Encoder Paper • 2602.02084 • Published 11 days ago • 82
TwinBrainVLA: Unleashing the Potential of Generalist VLMs for Embodied Tasks via Asymmetric Mixture-of-Transformers Paper • 2601.14133 • Published 24 days ago • 60
MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences Paper • 2601.06789 • Published Jan 11 • 78
KnowMe-Bench: Benchmarking Person Understanding for Lifelong Digital Companions Paper • 2601.04745 • Published Jan 8 • 58
Finch: Benchmarking Finance & Accounting across Spreadsheet-Centric Enterprise Workflows Paper • 2512.13168 • Published Dec 15, 2025 • 52
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference Paper • 2508.02193 • Published Aug 4, 2025 • 136
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published Aug 6, 2025 • 129
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs Paper • 2508.16153 • Published Aug 22, 2025 • 160
The Consistency Critic: Correcting Inconsistencies in Generated Images via Reference-Guided Attentive Alignment Paper • 2511.20614 • Published Nov 25, 2025 • 38
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13, 2025 • 191
Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5? Paper • 2311.07587 • Published Nov 8, 2023 • 5
Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer Paper • 2311.06720 • Published Nov 12, 2023 • 9
The Impact of Large Language Models on Scientific Discovery: a Preliminary Study using GPT-4 Paper • 2311.07361 • Published Nov 13, 2023 • 14
LayoutPrompter: Awaken the Design Ability of Large Language Models Paper • 2311.06495 • Published Nov 11, 2023 • 12
GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation Paper • 2311.07562 • Published Nov 13, 2023 • 15
MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks Paper • 2311.07463 • Published Nov 13, 2023 • 15
SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models Paper • 2311.07575 • Published Nov 13, 2023 • 15
Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models Paper • 2311.06783 • Published Nov 12, 2023 • 28