lblaoke's Collections

Draft Models

updated May 12

  • lblaoke/qwama-0.5b-skywork-pref-dpo-llama-factory-v1

    0.5B • Updated Mar 19 • 9

  • lblaoke/qwama-0.5b-skywork-pref-dpo-trl-v1

    0.5B • Updated Mar 19 • 10

  • lblaoke/qwama-0.5b-skywork-pref-dpo-trl-v2

    0.5B • Updated Mar 21 • 6

  • lblaoke/qwama-0.5b-skywork-pref-sft-rejected-trl-v3

    0.5B • Updated Mar 28 • 7

  • lblaoke/qwama-0.5b-skywork-pref-sft-chosen-trl-v3

    0.5B • Updated Mar 28 • 7

  • lblaoke/qwama-0.5b-skywork-pref-sft-rejected-chosen-trl-v3

    0.5B • Updated Mar 28 • 8

  • lblaoke/qwama-0.5b-skywork-pref-sft-chosen-dpo-trl-v3

    0.5B • Updated Mar 28 • 9

  • lblaoke/qwama-0.5b-hh-rlhf-sft-chosen-trl-v4

    0.5B • Updated Apr 8 • 6

  • lblaoke/opt-125m-hh-rlhf-dpo-trl-v5

    0.1B • Updated May 8 • 4

  • lblaoke/opt-125m-hh-rlhf-chosen-sft-trl-v5

    0.1B • Updated May 7 • 7

  • lblaoke/opt-350m-hh-rlhf-chosen-sft-trl-v5

    0.3B • Updated May 11 • 7

  • lblaoke/opt-350m-hh-rlhf-dpo-trl-v5

    0.3B • Updated May 12 • 9