PlanningBench: Generating Scalable and Verifiable Planning Data for Evaluating and Training Large Language Models Paper • 2605.20873 • Published 2 days ago • 3
PlantMarkerBench: A Multi-Species Benchmark for Evidence-Grounded Plant Marker Reasoning Paper • 2605.10032 • Published 11 days ago • 2
A^2RD: Agentic Autoregressive Diffusion for Long Video Consistency Paper • 2605.06924 • Published 15 days ago • 15
HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation Paper • 2604.28196 • Published 22 days ago • 71
ThinkTwice: Jointly Optimizing Large Language Models for Reasoning and Self-Refinement Paper • 2604.01591 • Published Apr 2 • 42
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 629
arithmetic-circuit-overloading/Llama-3.3-70B-Instruct-v2-3d-5M-500K-0.1-reverse-padzero-99-128D-1L-4H-512I Text Generation • 465k • Updated Apr 9 • 45
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 503
Diffutron: A Masked Diffusion Language Model for Turkish Language Paper • 2603.20466 • Published Mar 20 • 9
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 351
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published Mar 30 • 342