DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22, 2025
Space The Smol Training Playbook • The secrets to building world-class LLMs
Article The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare • Apr 19, 2024
Article 🦸🏻#12: How Do Agents Learn from Their Own Mistakes? The Role of Reflection in AI • Mar 9, 2025
Article Welcome PaliGemma 2 – New vision language models by Google • Dec 5, 2024
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community • Apr 15, 2024
Post I have put together a notebook on Multimodal RAG, where we do not process the documents with hefty pipelines but natively use:
- vidore/colpali for retrieval: it doesn't need indexing with image-text pairs, just images!
- Qwen/Qwen2-VL-2B-Instruct for generation 💬: directly feed images as-is to a vision language model, with no conversion to text!
I used the ColPali implementation from the new Byaldi library by @bclavie 🤗: https://github.com/answerdotai/byaldi
Link to notebook: https://github.com/merveenoyan/smol-vision/blob/main/ColPali_%2B_Qwen2_VL.ipynb
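The two models plug together with very little glue code. Below is a minimal sketch of that pipeline, assuming byaldi's RAGMultiModalModel API and the Qwen2-VL usage pattern from the transformers and qwen-vl-utils READMEs; the docs/ folder, index name, and query are hypothetical placeholders, not values taken from the notebook.

```python
# A minimal sketch of the notebook's pipeline, assuming the byaldi and
# transformers APIs from their public READMEs. The docs/ folder, index
# name, and query below are hypothetical placeholders.
import torch
from byaldi import RAGMultiModalModel
from qwen_vl_utils import process_vision_info
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

# Retrieval: index raw document pages as images with ColPali.
# No OCR and no image-text pairs -- the page images themselves are the index.
retriever = RAGMultiModalModel.from_pretrained("vidore/colpali")
retriever.index(
    input_path="docs/",                # hypothetical folder of PDFs
    index_name="multimodal_rag",       # hypothetical index name
    store_collection_with_index=True,  # keep base64 page images for generation
    overwrite=True,
)
query = "What does the revenue chart show?"  # hypothetical query
results = retriever.search(query, k=1)

# Generation: hand the retrieved page image directly to Qwen2-VL,
# with no intermediate conversion to text.
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-2B-Instruct", torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B-Instruct")

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": f"data:image;base64,{results[0].base64}"},
        {"type": "text", "text": query},
    ],
}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, _ = process_vision_info(messages)
inputs = processor(text=[text], images=image_inputs, return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=128)
answer = processor.batch_decode(
    output_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
)[0]
print(answer)
```

Setting store_collection_with_index=True trades index size for convenience: the retrieved page comes back as base64 and can be passed straight into the VLM's chat template without re-rendering the source document.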
microsoft/Phi-3.5-vision-instruct Image-Text-to-Text • 4B • Updated Dec 10, 2025