VORTEX
Abhaykoul
177 followers · 38 following
OEvortex
AI & ML interests
None yet
Recent Activity
reacted to KingNish's post with 🔥 about 1 hour ago
Muon vs MuonClip vs Muon+AdamW

Muon has gone from an experiment to a mainstream optimizer, but does it hold up for fine-tuning? We ran head-to-head tests on Qwen3-4B (10k+ high-quality instruction rows) to find out.

Short story: Pure Muon converged fastest at the start, but its gradient-norm spikes made training unstable. MuonClip (Kimi K2's clipping) stabilizes long pretraining runs, yet in our small-scale fine-tune it underperformed, with lower token accuracy and slower convergence. The winner was the hybrid: Muon for 2D layers + AdamW for 1D layers. It delivered the best balance of stability and final performance and even beat vanilla AdamW.

Takeaway: for small-scale fine-tuning, the hybrid is practical and reliable.

Next step: scale to larger models and datasets to see whether Muon's spikes become catastrophic or clipping wins out.

Full blog link: https://huggingface.co/blog/KingNish/optimizer-part1
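The hybrid setup described in the post (Muon for 2D layers, AdamW for 1D layers) comes down to partitioning parameters by tensor rank before building the two optimizers. The following is a minimal sketch of that partitioning step only, not the post's actual code. The function name `split_params_for_hybrid`, the parameter names, and the shapes are all hypothetical. Tensors with any rank other than 2 are sent to the AdamW group here, which is an assumption; the post does not say how such tensors, or 2D embedding tables, were handled.

```python
from typing import Dict, List, Tuple


def split_params_for_hybrid(
    shapes: Dict[str, Tuple[int, ...]]
) -> Tuple[List[str], List[str]]:
    """Split parameter names into a Muon group (2D) and an AdamW group (rest).

    `shapes` maps a parameter name to its tensor shape. This helper is
    hypothetical and only illustrates the rank-based routing rule.
    """
    muon_names: List[str] = []
    adamw_names: List[str] = []
    for name, shape in shapes.items():
        if len(shape) == 2:
            # Weight matrices: candidates for the Muon update.
            muon_names.append(name)
        else:
            # Biases, norm scales, and any other non-2D tensors (assumption).
            adamw_names.append(name)
    return muon_names, adamw_names


# Illustrative names and shapes, not taken from any real model config.
example_shapes = {
    "layers.0.attn.q_proj.weight": (64, 64),
    "layers.0.mlp.up_proj.weight": (256, 64),
    "layers.0.input_layernorm.weight": (64,),
    "layers.0.attn.q_proj.bias": (64,),
}

muon_group, adamw_group = split_params_for_hybrid(example_shapes)
```

In a real training script each group would then be passed to its own optimizer instance, with both optimizers stepped every iteration.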
Organizations
Abhaykoul's activity
liked a dataset 5 days ago
HelpingAI/KS-WIKI • Viewer • Updated 8 days ago • 1.24M • 34 • 1
liked a model 5 days ago
EssentialAI/rnj-1-instruct • Text Generation • 8B • Updated 3 days ago • 445k • 220
liked 2 datasets 16 days ago
datavorous/entrance-exam-dataset • Viewer • Updated Jan 3 • 97.4k • 262 • 1
EssentialAI/essential-web-v1.0 • Preview • Updated Oct 2 • 13.8k • 210
liked a model 27 days ago
WeiboAI/VibeThinker-1.5B • Text Generation • 2B • Updated 18 days ago • 28.5k • 502
liked a model about 2 months ago
espnet/xeus • Automatic Speech Recognition • Updated Jun 17 • 39 • 143
liked a dataset 2 months ago
0xZee/dataset-CoT-Advanced-Calculus-268 • Viewer • Updated Feb 9 • 268 • 28 • 1
liked 2 models 3 months ago
HelpingAI/Dhanishta-2.0-A30B-Base • Text Generation • 100B • Updated Sep 22 • 9 • 3
PerceptronAI/Isaac-0.1 • Text Generation • 3B • Updated Oct 9 • 4.84k • 112
liked 2 models 4 months ago
HelpingAI/hai3.1-checkpoint-0002 • Text Generation • 16B • Updated Sep 15 • 27 • 8
HelpingAI/hai3.1-checkpoint-0001 • Text Generation • 16B • Updated Aug 14 • 10 • 3
liked a dataset 4 months ago
HelpingAI/Intermediate-Thinking-130k • Viewer • Updated Sep 22 • 135k • 68 • 46
liked 2 models 5 months ago
HelpingAI/Dhanishtha-2.0-preview-0825 • Text Generation • 15B • Updated Jul 29 • 17 • 18
HelpingAI/Dhanishtha-nsfw • Text Generation • 15B • Updated Jul 29 • 34 • 24
liked a dataset 5 months ago
UnfilteredAI/unfiltered-thinker • Updated Jul 27 • 14 • 7
liked 2 models 5 months ago
HelpingAI/Dhanishtha-2.0-preview-0725 • Text Generation • 15B • Updated Jul 29 • 3 • 11
HelpingAI/Dhanishtha-2.0-preview-mlx • Text Generation • 15B • Updated Jul 3 • 12 • 2
liked 2 Spaces 5 months ago
Dhanishtha 2.0 Preview 🏆 • Running on Zero • 7 • Chat with an AI that shows its thinking process
Dhanishtha 2.0 Preview 🏆 • Running on Zero • 9 • Generate responses with step-by-step thinking
liked a dataset 5 months ago
HelpingAI/Dhanishtha-2.0-SUPERTHINKER • Viewer • Updated 5 days ago • 11.7k • 218 • 23