Dahun Kim

mcahny

http://mcahny.github.io

authored 11 papers 3 months ago

Learning Image Representations by Completing Damaged Jigsaw Puzzles

Paper • 1802.01880 • Published Feb 6, 2018

Self-Supervised Video Representation Learning with Space-Time Cubic Puzzles

Paper • 1811.09795 • Published Nov 24, 2018

Deep Video Inpainting

Paper • 1905.01639 • Published May 5, 2019

Align-and-Attend Network for Globally and Locally Coherent Video Inpainting

Paper • 1905.13066 • Published May 30, 2019

DeepLab2: A TensorFlow Library for Deep Labeling

Paper • 2106.09748 • Published Jun 17, 2021

Contrastive Feature Masking Open-Vocabulary Vision Transformer

Paper • 2309.00775 • Published Sep 2, 2023 • 10

VideoComp: Advancing Fine-Grained Compositional and Temporal Alignment in Video-Text Models

Paper • 2504.03970 • Published Apr 4

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Paper • 2507.06261 • Published Jul 7 • 64

Context-Adaptive Multi-Prompt Embedding with Large Language Models for Vision-Language Alignment

Paper • 2508.02762 • Published Aug 3

EmbeddingGemma: Powerful and Lightweight Text Representations

Paper • 2509.20354 • Published Sep 24 • 41

Zero-Shot Multi-Spectral Learning: Reimagining a Generalist Multimodal Gemini 2.5 Model for Remote Sensing Applications

Paper • 2509.19087 • Published Sep 23 • 1

authored 2 papers about 2 years ago

Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities

Paper • 2311.05698 • Published Nov 9, 2023 • 13

Detection-Oriented Image-Text Pretraining for Open-Vocabulary Detection

Paper • 2310.00161 • Published Sep 29, 2023 • 1

authored a paper over 2 years ago

Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers

Paper • 2305.07011 • Published May 11, 2023 • 5