Article: Distribution Matching Prevents Mode Collapse in Training Reasoning Models (18 days ago)