DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model Paper • 2405.04434 • Published May 7, 2024 • 24
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models Paper • 2401.06066 • Published Jan 11, 2024 • 58