--- title: GPU Memory Calculator emoji: 🎮 colorFrom: blue colorTo: purple sdk: docker pinned: true license: mit tags: [llm, gpu, deep-learning, pytorch, training, inference, memory-calculator, deepspeed, megatron, fsdp, vllm, quantization, machine-learning, ai, tools] --- # 🎮 GPU Memory Calculator for LLM Training & Inference **Instantly calculate GPU memory requirements for training and running Large Language Models.** Plan your infrastructure, avoid OOM errors, and optimize costs before you start. [![GitHub Stars](https://img.shields.io/github/stars/George614/gpu-mem-calculator?style=social)](https://github.com/George614/gpu-mem-calculator) [![GitHub Issues](https://img.shields.io/github/issues/George614/gpu-mem-calculator)](https://github.com/George614/gpu-mem-calculator/issues) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) ## 🚀 Why Use This Tool? - **💰 Save Money** - Know exactly what GPUs you need before spending thousands - **⚡ Avoid OOM** - Validate your config fits in memory before training - **📊 Compare Strategies** - DeepSpeed vs Megatron vs FSDP at a glance - **🎯 Plan Infrastructure** - From 7B to 175B+ parameter models - **⚙️ Export Configs** - Generate working configs for your training framework ## ✨ Features ### Training Memory Calculation Calculate memory for all major training frameworks: - **PyTorch DDP** - Baseline distributed training - **DeepSpeed ZeRO** (Stages 0-3) with CPU/NVMe offloading - **Megatron-LM** - Tensor + Pipeline parallelism - **PyTorch FSDP** - Fully sharded data parallel - **Megatron + DeepSpeed** - Hybrid approach ### Inference Memory Estimation Optimize your deployment with: - **HuggingFace Transformers** - Baseline inference - **vLLM** - PagedAttention optimization - **TGI** - Text Generation Inference - **TensorRT-LLM** - Maximum throughput - **SGLang** - RadixAttention caching ### Smart Features - 🎯 **Model Presets** - LLaMA 2, GPT-3, Mixtral, GLM, Qwen, DeepSeek-MoE - 📦 **Export Configs** - Accelerate, Lightning, Axolotl, DeepSpeed, YAML, JSON - 🔢 **Batch Optimizer** - Auto-find max batch size for your hardware - 🌐 **Multi-Node** - Calculate network overhead for distributed training - 💾 **KV Cache** - Quantization options (INT4/INT8/FP8/None) ## 🎯 Supported Models | Model | Parameters | Use Case | |-------|-----------|----------| | LLaMA 2 | 7B, 13B, 70B | General purpose | | GPT-3 | 175B | Large scale training | | Mixtral 8x7B | 47B | Mixture of Experts | | GLM-4 | 9B - 355B | Chinese/English | | Qwen MoE | 2.7B | Efficient inference | | DeepSeek-MoE | 16B | sparse training | ## 📖 How to Use 1. **Select a Model** - Choose from presets or enter custom parameters 2. **Pick Your Engine** - Training (DeepSpeed/Megatron/FSDP) or Inference (vLLM/TGI/SGLang) 3. **Configure** - Adjust batch size, GPUs, precision, offloading 4. **Calculate** - Get instant memory breakdown 5. **Export** - Generate working configs for your framework ## 💡 Example Use Cases - **"Can I train a 7B model on 4x A100s?"** → Calculate and find out - **"What's the max batch size for DeepSpeed ZeRO-3?"** → Batch optimizer tells you - **"vLLM vs TGI - which uses less memory?"** → Compare instantly - **"How many GPUs for 175B with Megatron?"** → Plan your cluster ## 🔗 Links & Resources - **[GitHub Repository](https://github.com/George614/gpu-mem-calculator)** - Star us on GitHub! ⭐ - **[Full Documentation](https://github.com/George614/gpu-mem-calculator#readme)** - Complete guide - **[Report Issues](https://github.com/George614/gpu-mem-calculator/issues)** - Bug reports & feature requests - **[Contributing Guide](https://github.com/George614/gpu-mem-calculator/blob/main/CONTRIBUTING.md)** - Pull requests welcome! ## 📚 Technical Details Built with: - **FastAPI** - High-performance web framework - **Pydantic** - Data validation and settings - **Python 3.12** - Latest Python for maximum performance Formulas verified against: - [EleutherAI Transformer Math](https://blog.eleuther.ai/transformer-math/) - [Microsoft DeepSpeed ZeRO](https://www.microsoft.com/en-us/research/blog/zero-deepspeed/) - [NVIDIA Megatron-LM](https://github.com/NVIDIA/Megatron-LM) ## 📊 License MIT License - Free for commercial and personal use. --- **Made with ❤️ by the AI community** [![GitHub stars](https://img.shields.io/github/stars/George614/gpu-mem-calculator?style=flat-square&logo=github&label=Star%20on%20GitHub)](https://github.com/George614/gpu-mem-calculator)