zai-org/GLM-4.7-Flash
Juan JuliΓ‘n
juanjucm
AI & ML interests
Machine Learning Engineer
Recent Activity
liked a model 6 days ago: nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16
upvoted a collection 6 days ago: NVIDIA Nemotron v3
new activity 15 days ago on internlm/JanusCoder-8B: Update `pipeline_tag` from `Image-Text-to-Text` to `Text-Generation`
replied to their post about 2 months ago
unsloth/GLM-4.7-Flash-GGUF
https://ai.azure.com/catalog/models/unsloth-glm-4.7-flash-gguf
posted an update about 2 months ago
Last week, zai-org dropped zai-org/GLM-4.7-Flash. Now, we bring it to Microsoft Foundry!
- 30B-A3B MoE, the strongest model in the 30B class. It excels at coding tasks, agentic workflows and reasoning.
- A lighter version of its 358B big brother, balancing performance and efficiency.
Not light enough for you? We are also adding unsloth/GLM-4.7-Flash-GGUF to the catalog, with GPU and CPU support powered by llama.cpp
Go join the hype and deploy them from the Hugging Face collection on Microsoft Foundry!
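For local testing, the GGUF build can also be served straight from the Hub with llama.cpp's built-in Hugging Face download support. A minimal sketch, assuming a recent llama.cpp build; the quantization tag and context size here are illustrative choices, so pick a quant the repo actually ships:

```shell
# Serve the GGUF directly from the Hugging Face Hub with llama.cpp.
# Q4_K_M is an assumed quantization tag, not confirmed for this repo.
llama-server -hf unsloth/GLM-4.7-Flash-GGUF:Q4_K_M \
  --ctx-size 8192 \
  --n-gpu-layers 99   # offload layers to GPU; omit for a CPU-only run
```

The same `-hf` flag works with `llama-cli` for a one-off interactive session instead of an HTTP server.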
reacted to alvarobartt's post about 2 months ago
hf-mem v0.4.1 now also estimates KV cache memory requirements for any context length and batch size with the --experimental flag!
uvx hf-mem --model-id ... --experimental will automatically pull the required information from the Hugging Face Hub to include the KV cache estimation, when applicable.
Alternatively, you can also set the --max-model-len, --batch-size and --kv-cache-dtype arguments (à la vLLM) manually if preferred.
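For intuition, the KV cache term that hf-mem estimates boils down to a simple product over the model shape and the requested context. This is an illustrative back-of-envelope formula, not hf-mem's actual code, and the example model dimensions are made up:

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   max_model_len: int, batch_size: int, dtype_bytes: int = 2) -> int:
    """Rough KV cache size: keys + values for every layer, token, and sequence."""
    # The leading 2x accounts for storing both the key and the value tensors.
    return 2 * num_layers * num_kv_heads * head_dim * max_model_len * batch_size * dtype_bytes

# Hypothetical 32-layer model, 8 KV heads of dim 128, 8k context, batch 1, fp16:
size = kv_cache_bytes(32, 8, 128, 8192, 1, dtype_bytes=2)
print(f"{size / 2**30:.2f} GiB")  # -> 1.00 GiB
```

Doubling either the context length or the batch size doubles the estimate, which is why both flags matter when sizing a deployment.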
reacted to sergiopaniego's post about 2 months ago
New TRL + OpenEnv example!
Fine-tune an LLM for playing Sudoku using an RL env via OpenEnv
Includes a script that runs on 1 or multiple GPUs with vLLM, plus a Colab-ready notebook.
Enjoy!
Notebook: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/openenv_sudoku_grpo.ipynb
Script: https://github.com/huggingface/trl/blob/main/examples/scripts/openenv/sudoku.py
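The core of an RL setup like this is a dense reward signal the trainer can optimize against. A minimal sketch of what a Sudoku reward could look like — illustrative only, not the OpenEnv or TRL API; the 81-character grid encoding is an assumption:

```python
def sudoku_reward(proposed: str, solution: str) -> float:
    """Fraction of correctly filled cells; grids are 81-char row-major digit strings."""
    assert len(proposed) == 81 and len(solution) == 81
    correct = sum(p == s for p, s in zip(proposed, solution))
    return correct / 81

# A fully correct grid earns the maximum reward of 1.0:
solved = "534678912672195348198342567859761423426853791713924856961537284287419635345286179"
print(sudoku_reward(solved, solved))  # -> 1.0
```

A fractional reward like this gives the policy a smoother learning signal than a binary solved/unsolved check, which is what makes GRPO-style training tractable on puzzles.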
reacted to pagezyhf's post 5 months ago
Big news for AI builders!
We're thrilled to announce that the Qwen3-VL family of vision-language models is now available on Azure AI Foundry, thanks to our collaboration with Microsoft.
We bring open-source innovation to enterprise-grade AI infrastructure, making it easier than ever for enterprises to deploy and scale the latest and greatest models from Hugging Face securely within Azure.
Highlights:
- Deploy Qwen3-VL instantly via managed endpoints
- Built-in governance, telemetry, and lifecycle management
- True multimodal reasoning: vision, language, and code understanding
- State-of-the-art performance, outperforming closed-source models like Gemini 2.5 Pro and GPT-5
- Available in both *Instruct* and *Thinking* modes, across 24 model sizes
Get started today: search for Qwen3-VL in the Hugging Face Collection on Azure AI Foundry.
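Once deployed, a managed endpoint like this is typically called with an OpenAI-style chat payload that mixes image and text parts. A sketch of building such a request body; the model name, image URL, and exact schema are assumptions that depend on your deployment:

```python
def build_vl_payload(image_url: str, question: str, model: str = "Qwen3-VL") -> dict:
    """Assemble an OpenAI-style multimodal chat request for a VL endpoint.

    The model name and URL here are placeholders, not a confirmed deployment.
    """
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    # Image part first, then the text question about it.
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": question},
                ],
            }
        ],
        "max_tokens": 512,
    }

payload = build_vl_payload("https://example.com/chart.png", "What does this chart show?")
```

Send the payload with any HTTP client to your endpoint's chat-completions route, passing the endpoint key in the request headers as your deployment requires.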
reacted to pagezyhf's post 8 months ago
In our recent push to make more models available on Azure, we added SmolLM3 to the catalog!
@juanjucm wrote a really detailed guide on how to deploy it on Azure AI
https://huggingface.co/docs/microsoft-azure/azure-ai/examples/deploy-smollm3
If you want to see other models, please let us know