COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning
Paper: arXiv:2210.15212
How to use OpenMatch/cocodr-large-msmarco with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("fill-mask", model="OpenMatch/cocodr-large-msmarco")

# Or load the model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("OpenMatch/cocodr-large-msmarco")
model = AutoModelForMaskedLM.from_pretrained("OpenMatch/cocodr-large-msmarco")
```

This model was first pretrained on the BEIR corpus and then fine-tuned on the MS MARCO dataset, following the approach described in the paper COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning. The associated GitHub repository is available at https://github.com/OpenMatch/COCO-DR.
This model is trained with BERT-large as the backbone and has 335M parameters. See the paper https://arxiv.org/abs/2210.15212 for details.
Pre-trained models can be loaded through the HuggingFace transformers library:
```python
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("OpenMatch/cocodr-large-msmarco")
tokenizer = AutoTokenizer.from_pretrained("OpenMatch/cocodr-large-msmarco")
```
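Since this is a dense retriever rather than a masked-LM, the loaded model is typically used to map queries and passages to single vectors and score them by similarity. A minimal sketch (assuming, as in the OpenMatch codebase, that the `[CLS]` last hidden state serves as the text embedding and relevance is a dot product; the example query and passages are illustrative only):

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "OpenMatch/cocodr-large-msmarco"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

def encode(texts):
    # Tokenize a batch of texts and take the [CLS] last hidden state
    # as the embedding (assumption: CLS pooling, as in OpenMatch).
    inputs = tokenizer(
        texts, padding=True, truncation=True, max_length=128, return_tensors="pt"
    )
    with torch.no_grad():
        out = model(**inputs)
    return out.last_hidden_state[:, 0]  # shape: [batch, hidden]

query_emb = encode(["what is dense retrieval?"])
doc_embs = encode([
    "Dense retrieval encodes queries and documents into vectors.",
    "The capital of France is Paris.",
])

# Dot-product relevance scores between the query and each passage
scores = query_emb @ doc_embs.T
print(scores)
```

Higher scores indicate passages the model considers more relevant to the query; in a real system the passage embeddings would be precomputed and indexed (e.g. with FAISS) rather than encoded per query.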