Step-Level Sparse Autoencoder for Reasoning Process Interpretation Paper • 2603.03031 • Published 9 days ago
SSAE Collection Training and evaluation dataset, model checkpoints in 'Step-Level Sparse Autoencoder for Reasoning Process Interpretation' • 3 items • Updated 8 days ago
RLVR Linearity Collection RL training and evaluation datasets, and checkpoints in 'Not All Steps are Informative: On the Linearity of LLMs’ RLVR Training' • 3 items • Updated Jan 26
HAF-RM: A Hybrid Alignment Framework for Reward Model Training Paper • 2407.04185 • Published Jul 4, 2024
ARKS: Active Retrieval in Knowledge Soup for Code Generation Paper • 2402.12317 • Published Feb 19, 2024
ALaRM: Align Language Models via Hierarchical Rewards Modeling Paper • 2403.06754 • Published Mar 11, 2024
DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation Paper • 2211.11501 • Published Nov 18, 2022