zai-org/GLM-4.7-Flash
Juan JuliΓ‘n
juanjucm
AI & ML interests
Machine Learning Engineer
Recent Activity
liked a model 6 days ago: nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16
upvoted a collection 6 days ago: NVIDIA Nemotron v3
new activity 15 days ago on internlm/JanusCoder-8B: Update `pipeline_tag` from `Image-Text-to-Text` to `Text-Generation`
replied to their post about 2 months ago
unsloth/GLM-4.7-Flash-GGUF
https://ai.azure.com/catalog/models/unsloth-glm-4.7-flash-gguf
posted an update about 2 months ago
Last week, zai-org dropped zai-org/GLM-4.7-Flash. Now, we bring it to Microsoft Foundry!
- 30B-A3B MoE, the strongest model in the 30B class. It excels at coding tasks, agentic workflows and reasoning.
- A lighter version of its 358B big brother, balancing performance and efficiency.
Not light enough for you? We are also adding unsloth/GLM-4.7-Flash-GGUF to the catalog, with GPU and CPU support powered by llama.cpp
Go join the hype and deploy them from the Hugging Face collection on Microsoft Foundry!
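For local testing, the GGUF build can also be served straight from the Hub with llama.cpp's built-in Hugging Face download support. A minimal sketch, assuming a recent llama.cpp build; the quantization tag and context size here are illustrative choices, so pick a quant the repo actually ships:

```shell
# Serve the GGUF directly from the Hugging Face Hub with llama.cpp.
# Q4_K_M is an assumed quantization tag, not confirmed for this repo.
llama-server -hf unsloth/GLM-4.7-Flash-GGUF:Q4_K_M \
  --ctx-size 8192 \
  --n-gpu-layers 99   # offload layers to GPU; omit for a CPU-only run
```

The same `-hf` flag works with `llama-cli` for a one-off interactive session instead of an HTTP server.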
reacted to alvarobartt's post about 2 months ago
hf-mem v0.4.1 now also estimates KV cache memory requirements for any context length and batch size with the --experimental flag!
uvx hf-mem --model-id ... --experimental will automatically pull the required information from the Hugging Face Hub to include the KV cache estimation, when applicable.
Alternatively, you can also set the --max-model-len, --batch-size and --kv-cache-dtype arguments (à la vLLM) manually if preferred.
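For intuition, the KV cache term that hf-mem estimates boils down to a simple product over the model shape and the requested context. This is an illustrative back-of-envelope formula, not hf-mem's actual code, and the example model dimensions are made up:

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   max_model_len: int, batch_size: int, dtype_bytes: int = 2) -> int:
    """Rough KV cache size: keys + values for every layer, token, and sequence."""
    # The leading 2x accounts for storing both the key and the value tensors.
    return 2 * num_layers * num_kv_heads * head_dim * max_model_len * batch_size * dtype_bytes

# Hypothetical 32-layer model, 8 KV heads of dim 128, 8k context, batch 1, fp16:
size = kv_cache_bytes(32, 8, 128, 8192, 1, dtype_bytes=2)
print(f"{size / 2**30:.2f} GiB")  # -> 1.00 GiB
```

Doubling either the context length or the batch size doubles the estimate, which is why both flags matter when sizing a deployment.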
reacted to sergiopaniego's post about 2 months ago
New TRL + OpenEnv example!
Fine-tune an LLM for playing Sudoku using an RL env via OpenEnv
Includes a script that runs on 1 or multiple GPUs with vLLM, plus a Colab-ready notebook.
Enjoy!
Notebook: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/openenv_sudoku_grpo.ipynb
Script: https://github.com/huggingface/trl/blob/main/examples/scripts/openenv/sudoku.py
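The core of an RL setup like this is a dense reward signal the trainer can optimize against. A minimal sketch of what a Sudoku reward could look like — illustrative only, not the OpenEnv or TRL API; the 81-character grid encoding is an assumption:

```python
def sudoku_reward(proposed: str, solution: str) -> float:
    """Fraction of correctly filled cells; grids are 81-char row-major digit strings."""
    assert len(proposed) == 81 and len(solution) == 81
    correct = sum(p == s for p, s in zip(proposed, solution))
    return correct / 81

# A fully correct grid earns the maximum reward of 1.0:
solved = "534678912672195348198342567859761423426853791713924856961537284287419635345286179"
print(sudoku_reward(solved, solved))  # -> 1.0
```

A fractional reward like this gives the policy a smoother learning signal than a binary solved/unsolved check, which is what makes GRPO-style training tractable on puzzles.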
reacted to pagezyhf's post 5 months ago
Big news for AI builders!
We're thrilled to announce that the Qwen3-VL family of vision-language models is now available on Azure AI Foundry, thanks to our collaboration with Microsoft.
We bring open-source innovation to enterprise-grade AI infrastructure, making it easier than ever for enterprises to deploy and scale the latest and greatest models from Hugging Face securely within Azure.
Highlights:
- Deploy Qwen3-VL instantly via managed endpoints
- Built-in governance, telemetry, and lifecycle management
- True multimodal reasoning: vision, language, and code understanding
- State-of-the-art performance, outperforming closed-source models like Gemini 2.5 Pro and GPT-5
- Available in both *Instruct* and *Thinking* modes, across 24 model sizes
Get started today: search for Qwen3-VL in the Hugging Face Collection on Azure AI Foundry.
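Once deployed, a managed endpoint like this is typically called with an OpenAI-style chat payload that mixes image and text parts. A sketch of building such a request body; the model name, image URL, and exact schema are assumptions that depend on your deployment:

```python
def build_vl_payload(image_url: str, question: str, model: str = "Qwen3-VL") -> dict:
    """Assemble an OpenAI-style multimodal chat request for a VL endpoint.

    The model name and URL here are placeholders, not a confirmed deployment.
    """
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    # Image part first, then the text question about it.
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": question},
                ],
            }
        ],
        "max_tokens": 512,
    }

payload = build_vl_payload("https://example.com/chart.png", "What does this chart show?")
```

Send the payload with any HTTP client to your endpoint's chat-completions route, passing the endpoint key in the request headers as your deployment requires.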
reacted to pagezyhf's post 8 months ago
In our recent push to make more models available on Azure, we added SmolLM3 to the catalog!
@juanjucm wrote a really detailed guide on how to deploy it on Azure AI
https://huggingface.co/docs/microsoft-azure/azure-ai/examples/deploy-smollm3
If you want to see other models, please let us know