FIRE-Bench: Evaluating Agents on the Rediscovery of Scientific Insights Paper • 2602.02905 • Published 3 days ago • 5
ThinkSum: Probabilistic reasoning over sets using large language models Paper • 2210.01293 • Published Oct 4, 2022 • 1
Bootstrapping a User-Centered Task-Oriented Dialogue System Paper • 2207.05223 • Published Jul 11, 2022
Reasoning with Language Model is Planning with World Model Paper • 2305.14992 • Published May 24, 2023 • 4
PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization Paper • 2310.16427 • Published Oct 25, 2023 • 2
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings Paper • 2305.11554 • Published May 19, 2023 • 2
Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning Paper • 2303.02861 • Published Mar 6, 2023 • 2
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts Paper • 2406.12034 • Published Jun 17, 2024 • 16
LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models Paper • 2404.05221 • Published Apr 8, 2024 • 1
Dynamic Rewarding with Prompt Optimization Enables Tuning-free Self-Alignment of Language Models Paper • 2411.08733 • Published Nov 13, 2024 • 1
Flow Equivariant World Models: Memory for Partially Observed Dynamic Environments Paper • 2601.01075 • Published Jan 3 • 6
SimWorld: An Open-ended Realistic Simulator for Autonomous Agents in Physical and Social Worlds Paper • 2512.01078 • Published Nov 30, 2025 • 34
Next-Embedding Prediction Makes Strong Vision Learners Paper • 2512.16922 • Published Dec 18, 2025 • 86
PAN: A World Model for General, Interactable, and Long-Horizon World Simulation Paper • 2511.09057 • Published Nov 12, 2025 • 80
AimBot: A Simple Auxiliary Visual Cue to Enhance Spatial Awareness of Visuomotor Policies Paper • 2508.08113 • Published Aug 11, 2025 • 11
From Behavioral Performance to Internal Competence: Interpreting Vision-Language Models with VLM-Lens Paper • 2510.02292 • Published Oct 2, 2025 • 1