Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought Paper • 2505.15431 • Published May 21, 2025 • 1
TransMamba: Flexibly Switching between Transformer and Mamba Paper • 2503.24067 • Published Mar 31, 2025 • 21
Scaling Laws for Floating Point Quantization Training Paper • 2501.02423 • Published Jan 5, 2025 • 26