Activity Feed

AI & ML interests

None defined yet.

Recent Activity

Locutusque 
posted an update 1 day ago
view post
Post
103
🚀 Introducing Esmeralda-Llama-3.1-8B-control
The first release in the Esmeralda model family by Locutusque.

This model is intentionally small and experimental — a control/baseline proof-of-concept designed to answer one question:

«“How strong is my new "Locutusque/esmeralda-agentic" dataset before scaling to larger runs?”»

Training Details

- Base: Llama 3.1 8B
- Training precision: bf16 mixed precision
- Chat template: modified ChatML
- Dataset size: ~37k examples
- Examples actually used for this run: ~5k

The dataset includes:

- multi-turn agentic traces
- reasoning traces
- structured assistant behavior
- generalist instruction data

Benchmark Results

Compared against:

- Llama 3.1 8B Instruct
- Hermes-3-Llama-3.1-8B

HumanEval

57.3 — Esmeralda
56.1 — Llama 3.1 Instruct
52.4 — Hermes-3

MBPP

53.2 — Esmeralda
56.8 — Llama 3.1 Instruct
48.2 — Hermes-3

GPQA Diamond

15.7 — Esmeralda
15.7 — Llama 3.1 Instruct
18.2 — Hermes-3

EQ-Bench

59.2 — Esmeralda
61.1 — Llama 3.1 Instruct
63.1 — Hermes-3

EQ-Bench Parseable (Syntax Stability)

🔥 100.0% — Esmeralda
92.4% — Llama 3.1 Instruct
91.2% — Hermes-3

Here Be Dragons 🐉

I also experimented with a new TruthfulQA free-generation evaluation setup.

- Responses were judged by Gemma 4 26B A4B
- The judge compared generations directly against ground-truth answers
- Models were evaluated in 8-bit quantized form to speed up inference

TruthfulQA (LLM Judge)

0.682 — Esmeralda-Llama-3.1-8B-control
0.587 — Hermes-3-Llama-3.1-8B (reported MC2 score; methodology differs)

For a lightweight control run trained on only a fraction of the dataset, I’m pretty encouraged by the results.

The model is released under the standard Llama 3.1 license, and I’d genuinely love feedback from people testing it in real workflows.

Model: Locutusque/Esmeralda-Llama-3.1-8B-control

Dataset: Locutusque/esmeralda-agentic

danielhanchen 
posted an update 6 days ago
danielhanchen 
posted an update 14 days ago
view post
Post
5750
We’re excited to announce that Unsloth has joined the PyTorch Ecosystem! 🔥🦥

Unsloth is an open-source project that makes training & running models more accurate and faster with less compute. Our mission is to make local AI accessible to everyone. Thanks to all of you for making this possible! 💕

Blog: https://unsloth.ai/blog/pytorch
GitHub: https://github.com/unslothai/unsloth
  • 2 replies
·
danielhanchen 
posted an update 18 days ago
view post
Post
7669
We collaborated with NVIDIA to teach you how we made LLM training ~25% faster! 🚀

Learn how 3 optimizations help your home GPU train models faster:
1. Packed-sequence metadata caching
2. Double-buffered checkpoint reloads
3. Faster MoE routing

Guide: https://unsloth.ai/blog/nvidia-collab
GitHub: https://github.com/unslothai/unsloth
danielhanchen 
posted an update 22 days ago
view post
Post
8812
We made a guide on how to run open LLMs in Claude Code, Codex and OpenClaw.

Use Gemma 4 and Qwen3.6 GGUFs for local agentic coding on 24GB RAM

Run with self-healing tool calls, code execution, web search via the Unsloth API endpoint and llama.cpp

Guide: https://unsloth.ai/docs/basics/api
danielhanchen 
posted an update 29 days ago
view post
Post
10790
Unsloth is now one of the top 10 most followed organizations on Hugging Face. 🤗🦥

Thanks so much for all the support!
Our HF page:
unsloth
  • 5 replies
·
mlabonne 
posted an update 30 days ago
view post
Post
1958
Big update to llm-datasets, my curated list of datasets and tools for post-training LLMs.

> Added many new datasets
> New "thinking" column
> Refreshed recommended tools.

Thanks to everyone who told me they used it for their research at ICLR, you motivated this update!
  • 2 replies
·
danielhanchen 
posted an update about 1 month ago
danielhanchen 
posted an update about 1 month ago
danielhanchen 
posted an update about 2 months ago
danielhanchen 
posted an update about 2 months ago
danielhanchen 
posted an update about 2 months ago
view post
Post
2786
A new way to use Unsloth.

Coming soon...