Seedream 4.0: Toward Next-generation Multimodal Image Generation Paper • 2509.20427 • Published Sep 24, 2025 • 82
VideoPrism Collection VideoPrism is a foundational video encoder that enables state-of-the-art performance on a large variety of video understanding tasks. • 5 items • Updated Jul 16, 2025 • 15
MedGemma Release Collection Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 7 items • Updated Jul 11, 2025 • 370
VibeVoice Collection Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 8 items • Updated Dec 4, 2025 • 184
Health AI Developer Foundations (HAI-DEF) Collection Groups models released for use in health AI by Google. Read more about HAI-DEF at http://goo.gle/hai-def • 16 items • Updated 16 days ago • 140
view article Article Welcome GPT OSS, the new open-source model family from OpenAI! +10 Aug 5, 2025 • 508
view article Article The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare +1 Apr 19, 2024 • 189
Describe Anything Collection Multimodal Large Language Models for Detailed Localized Image and Video Captioning • 7 items • Updated 14 days ago • 61
view article Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM +2 Mar 12, 2025 • 480