Ville Komulainen

Villekom
  • Vmjkom

AI & ML interests

NLP, text generation, semantic analysis

Organizations

  • TurkuNLP Research Group
  • HPLT
  • LumiOpen
  • Open-ψ (Open-Sci) Collective
  • OpenEuroLLM
  • MultiSynt

upvoted 2 papers 8 months ago

Got Compute, but No Data: Lessons From Post-training a Finnish LLM

Paper • 2503.09407 • Published Mar 12, 2025 • 1

An Expanded Massive Multilingual Dataset for High-Performance Language Technologies

Paper • 2503.10267 • Published Mar 13, 2025 • 2
upvoted 3 papers about 1 year ago

Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published Jan 14, 2025 • 62

Preference Leakage: A Contamination Problem in LLM-as-a-judge

Paper • 2502.01534 • Published Feb 3, 2025 • 40

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4, 2025 • 254
upvoted a paper almost 2 years ago

Poro 34B and the Blessing of Multilinguality

Paper • 2404.01856 • Published Apr 2, 2024 • 15
upvoted a paper about 2 years ago

Instruction-Following Evaluation for Large Language Models

Paper • 2311.07911 • Published Nov 14, 2023 • 22