Mrinaal Arora
mrinaalarora
AI & ML interests
None yet
Recent Activity
updated a collection 1 day ago
124M-Base-Experiments
updated a model 1 day ago
mrinaalarora/mrinaal-124m-instruct-v3-mathmix-smoltalk-150k
published a model 1 day ago
mrinaalarora/mrinaal-124m-instruct-v3-mathmix-smoltalk-150k
Organizations
124M-Base-Experiments
Checkpoints from my first 124M LLM pre-training project, covering scratch training, continued pre-training, and SFT experiments.
Nanbeige4-3B Cold Start Reasoning LoRA Experiments
Two LoRA cold-start SFT experiments teaching structured think/answer reasoning to Nanbeige4-3B-Base using distilled traces from frontier models.
spaces 6
pinned
Running
Agents
Trackio
🎯
Show training metrics for your LLM project
Sleeping
RL
DryLabSim
🚀
LLM agents plan noisy biological experiment pipelines
Running
RL
Crisisops Environment Server
🎛
Manage crisis response units in a dynamic simulation
Sleeping
RL
Textarena Environment Server
👀
Interacting with TextArena games via a web chat interface
Sleeping
RL
Json Cleaning Env Environment Server
🖥
Clean messy JSON to fit a target schema
Sleeping
Agents
2
Qwen3-1.7B Wordle GRPO Training Dashboard
📈
Live training metrics for Qwen3-1.7B GRPO run on Wordle
models 9
mrinaalarora/mrinaal-124m-instruct-v3-mathmix-smoltalk-150k
0.1B • Updated
mrinaalarora/mrinaal-124m-base-v3-mathmix
0.1B • Updated
mrinaalarora/mrinaal-124m-instruct-smoltalk-50k
0.1B • Updated
mrinaalarora/mrinaal-124m-base-v2
0.1B • Updated
mrinaalarora/mrinaal-124m-base
Updated
mrinaalarora/wordle-grpo-Qwen3-1.7B
Reinforcement Learning • 2B • Updated • 41
mrinaalarora/Nanbeige4-3B-Cold-Start-Reasoning-LoRA-Opus-Epoch3
Text Generation • Updated
mrinaalarora/nanbeige4-3b-cold-start-reasoning-lora-glm-12k
Text Generation • Updated
mrinaalarora/Nanbeige4-3B-Cold-Start-Reasoning-LoRA
Text Generation • Updated • 1
datasets 0
None public yet