Yifan's PPO Models - a lblaoke Collection


Yifan's PPO Models

Updated Mar 19, 2025

lblaoke/llama2-7b-ppo-human
7B • Updated Feb 3, 2025

lblaoke/llama2-7b-ppo-self
7B • Updated Feb 3, 2025

lblaoke/llama2-7b-ppo-self-human
7B • Updated Feb 3, 2025

lblaoke/mistral-v0.1-7b-ppo-human
7B • Updated Feb 4, 2025

lblaoke/mistral-v0.1-7b-ppo-self
7B • Updated Feb 4, 2025

lblaoke/mistral-v0.1-7b-ppo-self-human
7B • Updated Feb 4, 2025

lblaoke/llama-3.1-8b-ppo-human
8B • Updated Feb 21, 2025

lblaoke/llama-3.1-8b-ppo-self
8B • Updated Feb 22, 2025 • 1

lblaoke/llama-3.1-8b-ppo-self-human
8B • Updated Feb 24, 2025

lblaoke/qwen2.5-7b-ppo-human
8B • Updated Feb 26, 2025 • 1

lblaoke/qwen2.5-7b-ppo-self-human
8B • Updated Feb 27, 2025 • 1

lblaoke/qwen2.5-7b-ppo-self
8B • Updated Feb 27, 2025 • 1

lblaoke/mistral-v0.3-7b-ppo-human
7B • Updated Feb 28, 2025 • 1

lblaoke/mistral-v0.3-7b-ppo-self
7B • Updated Feb 28, 2025 • 1

lblaoke/mistral-v0.3-7b-ppo-self-human
7B • Updated Mar 1, 2025
