Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought Paper • 2510.04230 • Published Oct 5, 2025 • 26
Apriel Collection ServiceNow Language Modeling Lab's first model family series • 5 items • Updated 22 days ago • 15
view article Article Training and Finetuning Sparse Embedding Models with Sentence Transformers v5 Jul 1, 2025 • 132
Qwen2-VL Collection Vision-language model series based on Qwen2 • 16 items • Updated 4 days ago • 226
EXAONE-Deep Collection EXAONE reasoning model series of 2.4B, 7.8B, and 32B, optimized for reasoning tasks including math and coding • 10 items • Updated Jul 7, 2025 • 95
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated 4 days ago • 374
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22, 2024 • 259
view article Article The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare +1 Apr 19, 2024 • 189
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache Paper • 2401.02669 • Published Jan 5, 2024 • 16
Masked Audio Generation using a Single Non-Autoregressive Transformer Paper • 2401.04577 • Published Jan 9, 2024 • 43
TransformerFAM: Feedback attention is working memory Paper • 2404.09173 • Published Apr 14, 2024 • 43