---
title: Phoneme Detection Leaderboard
emoji: π€
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.44.1"
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Phoneme Detection Leaderboard

A simplified phoneme detection leaderboard built on the `open_asr_leaderboard` interface.

## Features

- **Clean Interface**: Uses the same interface structure as `open_asr_leaderboard`
- **Phoneme Evaluation**: Evaluates models on phoneme recognition tasks
- **Multiple Datasets**: Supports evaluation on multiple phoneme datasets
- **Model Request System**: Allows users to request evaluation of new models

## Structure

```
├── app.py                 # Main Gradio application
├── constants.py           # Constants and text definitions
├── utils_display.py       # Display utilities and column definitions
├── init.py                # Initialization and hub integration
├── phoneme_eval.py        # Core phoneme evaluation logic
├── utils/                 # Utility modules
│   ├── load_model.py      # Model loading and inference
│   ├── audio_process.py   # Audio processing and PER calculation
│   └── cmu_process.py     # CMU to IPA conversion
├── requirements.txt       # Python dependencies
└── README.md              # This file
```

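As an illustration of what a CMU to IPA conversion step involves, here is a minimal sketch. It is not the code in `utils/cmu_process.py`; the function name `cmu_to_ipa` and the choice to map unstressed `AH0` to schwa are assumptions made for this example.

```python
import re

# ARPAbet (CMU dictionary) symbols mapped to common IPA equivalents.
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ", "AY": "aɪ",
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "EH": "ɛ", "ER": "ɝ",
    "EY": "eɪ", "F": "f", "G": "ɡ", "HH": "h", "IH": "ɪ", "IY": "i",
    "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n", "NG": "ŋ",
    "OW": "oʊ", "OY": "ɔɪ", "P": "p", "R": "ɹ", "S": "s", "SH": "ʃ",
    "T": "t", "TH": "θ", "UH": "ʊ", "UW": "u", "V": "v", "W": "w",
    "Y": "j", "Z": "z", "ZH": "ʒ",
}


def cmu_to_ipa(phones):
    """Convert a list of CMU phones (with optional stress digits) to IPA.

    Hypothetical helper for illustration only.
    """
    ipa = []
    for phone in phones:
        if phone == "AH0":
            # Assumption: unstressed AH is rendered as schwa.
            ipa.append("ə")
            continue
        base = re.sub(r"\d", "", phone)  # drop the stress marker (0, 1, 2)
        ipa.append(ARPABET_TO_IPA[base])  # KeyError signals an unknown symbol
    return ipa


print(cmu_to_ipa(["K", "AE1", "T"]))
```

Raising `KeyError` on an unknown symbol keeps malformed transcriptions from silently passing through to scoring.
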
## Usage

1. Install dependencies:
```bash
pip install -r requirements.txt
```

2. Run the application:
```bash
python app.py
```

3. Run evaluation:
```bash
python phoneme_eval.py
```

## Evaluation

The leaderboard evaluates models on:
- **PER (Phoneme Error Rate)**: Lower is better
- **Average Duration**: Processing time per sample

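PER is conventionally defined as the edit distance between the predicted and reference phoneme sequences, divided by the reference length. The sketch below shows that standard definition; it is an assumption that the calculation in `utils/audio_process.py` follows the same form, and the function names here are hypothetical.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] is the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference phoneme
                curr[j - 1] + 1,     # insertion of a hypothesis phoneme
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """PER = (substitutions + deletions + insertions) / len(ref)."""
    if not ref:
        raise ValueError("reference must contain at least one phoneme")
    return edit_distance(ref, hyp) / len(ref)


reference = ["h", "ə", "l", "oʊ"]
hypothesis = ["h", "l", "oʊ", "z"]
print(phoneme_error_rate(reference, hypothesis))  # one deletion + one insertion
```

Because insertions count as errors, PER can exceed 1.0 when the hypothesis is much longer than the reference.
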
Models are ranked by Average PER across all datasets.

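A minimal sketch of this ranking, assuming Average PER is the unweighted mean of the per-dataset PER values. The model names and scores below are placeholders invented for the example, not leaderboard results, and `rank_by_average_per` is a hypothetical helper.

```python
from statistics import mean
from typing import Dict, List, Tuple


def rank_by_average_per(
    results: Dict[str, Dict[str, float]],
) -> List[Tuple[str, float]]:
    """Return (model, average PER) pairs sorted best (lowest) first."""
    averaged = {model: mean(per.values()) for model, per in results.items()}
    return sorted(averaged.items(), key=lambda item: item[1])


# Placeholder scores for illustration only.
results = {
    "model_a": {"phoneme_asr": 0.20, "kids_phoneme_md": 0.30},
    "model_b": {"phoneme_asr": 0.10, "kids_phoneme_md": 0.20},
}

for rank, (model, avg_per) in enumerate(rank_by_average_per(results), start=1):
    print(f"{rank}. {model}: {avg_per:.3f}")
```

An unweighted mean treats every dataset equally regardless of size; weighting by sample count would be an alternative design.
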
## Datasets

- `phoneme_asr`: General phoneme recognition dataset
- `kids_phoneme_md`: Children's speech phoneme dataset