Datasets: NeurIPS LLM Challenge 2023
Datasets I considered using in my submission to the 2023 NeurIPS Large Language Model Efficiency Challenge.
Note: Ultimately used in my full eval submission, with the exclusion of dolly_hhrlhf. Included only in Mistral-7B-sft-v1.
databricks/databricks-dolly-15k
Note: Used for both Mistral-7B-sft-v0 and Mistral-7B-sft-v1 in my submissions.
kaist-ai/CoT-Collection
Note: Looked promising, but I did not have time to explore it.
tasksource/icl-symbol-tuning-instruct
Note: Considered for improving in-context learning (ICL). Did not have time to explore it.
cais/mmlu
Note: Decided against training on MMLU data.
GAIR/lima
Note: Avoided due to its CC BY-NC-SA license, though it would have been allowed for the competition. Likely would have been a good resource otherwise.
grammarly/coedit
Note: The plan here was to target robustness metrics by finetuning an expert model to correct perturbations and/or clarify the input. This could also have helped with paraphrasing or other text revision tasks if they appeared in the hidden eval. Did not have time to fully explore it.
wanyu/IteraTeR_human_sent
Note: Similar use case to coedit.
allenai/social_i_qa
Note: Now knowing that the holdout tasks had ethics questions, I wish I had used this.
lighteval/siqa
Note: Same as social_i_qa.
tau/commonsense_qa
Note: Now knowing that the holdout tasks had ethics questions, I wish I had used this.
euirim/goodwiki
Note: Could have been useful for retrieval-augmented generation (RAG).
alexfabbri/multi_news
Note: The thought was that this could help with CNN/DM summarization, but quality and license concerns, combined with acceptable performance without it, led me to exclude it.
allenai/math_qa
allenai/ropes
allenai/openbookqa
allenai/ai2_arc
INK-USC/riddle_sense
allenai/qasc
nyu-mll/blimp
google/boolq
corypaik/prost
allenai/sciq
facebook/belebele
derek-thomas/ScienceQA
openlifescienceai/medmcqa
embedding-data/QQP_triplets
VMware/open-instruct