Github - a tgy2024 Collection

tgy2024's Collections

Github

updated Oct 23, 2025

MMMR: Benchmarking Massive Multi-Modal Reasoning Tasks

Paper • 2505.16459 • Published May 22, 2025 • 45

Can LLMs Correct Themselves? A Benchmark of Self-Correction in LLMs

Paper • 2510.16062 • Published Oct 17, 2025 • 1