Article DenseOn with the LateOn: Open State-of-the-Art Single and Multi-Vector Models lightonai • 28 days ago • 38
Article DeepSeek Engram × OLMo-core: Distributed Implementation bird-of-paradise • 11 days ago • 1
Article mlinter: a linter for Transformers modeling files huggingface • 28 days ago • 10
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published May 14, 2025 • 77
Article Multimodal Embedding & Reranker Models with Sentence Transformers tomaarsen • Apr 9 • 59
Article LLM Architectures Explained: What Powers Today’s Top Models PrunaAI • Mar 4 • 11