Hugging Face
Models
Datasets
Spaces
Community
Docs
Enterprise
Pricing
Log In
Sign Up
Open to Collab
165.8
TFLOPS
49
22
625
Mike Ravkine
PRO
mike-ravkine
Follow
AbstractPhil's profile picture
webxos's profile picture
OwenArli's profile picture
69 followers
ยท
63 following
the-crypt-keeper
AI & ML interests
LLM Research / Development / Evaluation
Recent Activity
posted
an
update
2 days ago
My hat is off to the https://huggingface.co/upstage team ๐ฉ https://huggingface.co/upstage/Solar-Open-100B is a very interesting, permissively licensed (Apache-with-attribution), trained from scratch (19T tokens), 12B active MoE - but that's not even the cool part. The cool part is that their fork of vLLM comes with the addition of a `reasoning_effort` parameter and a corresponding reasoning/tool-calling controller FSM to consume it! https://github.com/UpstageAI/vllm/blob/c9a05e077cd82df8cab4f729396c178c29c81aa8/vllm/model_executor/models/solar_open_logits_processor.py Looks like only "medium" and "high" are actually implemented, but still absolutely love to see this sorta thing. To make this model a little more accessible, I have created a FP8-Dynamic quant at https://huggingface.co/mike-ravkine/Solar-Open-100B-FP8-Dynamic which makes it fit nicely into 2xPro-6000 or 4xA6000 GPUs. My ReasonScape evaluations are currently running, will take me a couple days for this one but early results are quite strong: it's showing the competency expected from a 100B reasoning model (it can count the r's in strawberry, it can do basic arithmetic, etc..) and I haven't seen a truncation yet.
updated
a model
2 days ago
mike-ravkine/Solar-Open-100B-FP8-Dynamic
published
a model
2 days ago
mike-ravkine/Solar-Open-100B-FP8-Dynamic
View all activity
Organizations
None yet
mike-ravkine
's datasets
3
Sort:ย Recently updated
mike-ravkine/AlteredWorlds
Viewer
โข
Updated
Aug 31, 2024
โข
447
โข
11
โข
3
mike-ravkine/rosettacode-parsed
Viewer
โข
Updated
Jun 20, 2023
โข
4.26k
โข
107
โข
11
mike-ravkine/can-ai-code_junior-dev_v1
Viewer
โข
Updated
May 30, 2023
โข
24
โข
33