97 3

pangpangxuan

pangxuan

AI & ML interests

None yet

Organizations

None yet

upvoted a paper about 10 hours ago

CAR-bench: Evaluating the Consistency and Limit-Awareness of LLM Agents under Real-World Uncertainty

Paper • 2601.22027 • Published 9 days ago • 68

upvoted 3 papers 3 days ago

MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods

Paper • 2601.21821 • Published 9 days ago • 57

TTCS: Test-Time Curriculum Synthesis for Self-Evolving

Paper • 2601.22628 • Published 8 days ago • 32

Kimi K2.5: Visual Agentic Intelligence

Paper • 2602.02276 • Published 5 days ago • 202

upvoted a paper 25 days ago

X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests

Paper • 2601.06953 • Published 27 days ago • 44

upvoted 4 papers 26 days ago

RL-AWB: Deep Reinforcement Learning for Auto White Balance Correction in Low-Light Night-time Scenes

Paper • 2601.05249 • Published 30 days ago • 46

Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization

Paper • 2601.05432 • Published 30 days ago • 166

The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning

Paper • 2601.06002 • Published 29 days ago • 51

Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards

Paper • 2601.06021 • Published 29 days ago • 43

upvoted 2 papers 29 days ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published 30 days ago • 222

AT^2PO: Agentic Turn-based Policy Optimization via Tree Search

Paper • 2601.04767 • Published about 1 month ago • 28

upvoted 7 papers about 1 month ago

Benchmark^2: Systematic Evaluation of LLM Benchmarks

Paper • 2601.03986 • Published about 1 month ago • 34

Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization

Paper • 2512.24615 • Published Dec 31, 2025 • 119

Evaluating Parameter Efficient Methods for RLVR

Paper • 2512.23165 • Published Dec 29, 2025 • 26

Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

Paper • 2512.24618 • Published Dec 31, 2025 • 147

Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling

Paper • 2512.23959 • Published Dec 30, 2025 • 112

Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

Paper • 2512.24617 • Published Dec 31, 2025 • 64

Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding

Paper • 2512.17220 • Published Dec 19, 2025 • 112

upvoted 2 papers about 2 months ago

Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding

Paper • 2512.17532 • Published Dec 19, 2025 • 67

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Paper • 2512.16649 • Published Dec 18, 2025 • 26