How to use venkatasg/lil-bevo with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("fill-mask", model="venkatasg/lil-bevo")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("venkatasg/lil-bevo")
model = AutoModelForMaskedLM.from_pretrained("venkatasg/lil-bevo")

Lil-Bevo is UT Austin's submission to the BabyLM challenge, specifically the strict-small track.
A unigram tokenizer is trained on the 10M BabyLM tokens plus the MAESTRO dataset, with a vocabulary size of 16k.
A DeBERTa-v3-small model is trained on a mixture of MAESTRO and the 10M tokens for 5 epochs.
The model then continues training for 50 epochs on the 10M tokens with a sequence length of 128.
Finally, the model is trained for 2 epochs with targeted linguistic masking at a sequence length of 512.
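
To give a rough sense of scale for the training stages above, the following sketch works out how many sequences and total tokens each stage would cover. It is back-of-the-envelope arithmetic, not the training code: the `Stage` class and its fields are hypothetical names introduced here, the 10M figure is treated as a count of tokenizer tokens, padding and truncation are ignored, and the corpus used in the masking stage is assumed to be the same 10M tokens (the README does not say). The MAESTRO stage is left out because its size is not stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """Hypothetical description of one training stage (illustration only)."""
    name: str
    epochs: int
    seq_len: int
    tokens_per_epoch: int

    def sequences_per_epoch(self) -> int:
        # Ceiling division: a final partial sequence still counts as one.
        return -(-self.tokens_per_epoch // self.seq_len)

    def tokens_seen(self) -> int:
        # Total tokens processed over all epochs of this stage.
        return self.epochs * self.tokens_per_epoch


BABYLM_TOKENS = 10_000_000  # strict-small budget as stated in the README

continued = Stage("continued training", epochs=50, seq_len=128,
                  tokens_per_epoch=BABYLM_TOKENS)
# ASSUMPTION: the masking stage reuses the same 10M-token corpus.
masking = Stage("targeted linguistic masking", epochs=2, seq_len=512,
                tokens_per_epoch=BABYLM_TOKENS)

for stage in (continued, masking):
    print(f"{stage.name}: {stage.sequences_per_epoch():,} sequences/epoch, "
          f"{stage.tokens_seen():,} tokens seen")
```

Under these assumptions, the 50-epoch stage covers 78,125 sequences per epoch and 500M tokens in total, so most of the model's token exposure comes from repeated passes over the same small corpus.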
This README will be updated with more details soon.