---
title: Phoneme Detection Leaderboard
emoji: 🎀
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.44.1"
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Phoneme Detection Leaderboard

A clean, simplified phoneme detection leaderboard based on the open_asr_leaderboard interface.

## Features

- **Clean Interface**: Uses the same interface structure as open_asr_leaderboard
- **Phoneme Evaluation**: Evaluates models on phoneme recognition tasks
- **Multiple Datasets**: Supports evaluation on multiple phoneme datasets
- **Model Request System**: Allows users to request evaluation of new models

## Structure

```
β”œβ”€β”€ app.py                 # Main Gradio application
β”œβ”€β”€ constants.py          # Constants and text definitions
β”œβ”€β”€ utils_display.py      # Display utilities and column definitions
β”œβ”€β”€ init.py              # Initialization and hub integration
β”œβ”€β”€ phoneme_eval.py      # Core phoneme evaluation logic
β”œβ”€β”€ utils/               # Utility modules
β”‚   β”œβ”€β”€ load_model.py    # Model loading and inference
β”‚   β”œβ”€β”€ audio_process.py # Audio processing and PER calculation
β”‚   └── cmu_process.py   # CMU to IPA conversion
β”œβ”€β”€ requirements.txt     # Python dependencies
└── README.md           # This file
```
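The `utils/cmu_process.py` module converts CMU-dictionary (ARPAbet) phoneme symbols to IPA. As a rough illustration of what such a conversion involves (the actual table and function names in the repo may differ), a minimal sketch with a handful of standard ARPAbet-to-IPA pairs:

```python
# Illustrative sketch only; the real mapping in utils/cmu_process.py
# covers the full ARPAbet inventory and may use different names.
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "IY": "i", "UW": "u",
    "K": "k", "T": "t", "S": "s", "D": "d",
}

def cmu_to_ipa(phonemes):
    """Strip ARPAbet stress digits (e.g. 'AE1' -> 'AE') and map to IPA.

    Symbols not in the table are passed through unchanged.
    """
    return [ARPABET_TO_IPA.get(p.rstrip("012"), p) for p in phonemes]
```

For example, `cmu_to_ipa(["K", "AE1", "T"])` yields `["k", "æ", "t"]`.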

## Usage

1. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

2. Run the application:
   ```bash
   python app.py
   ```

3. Run evaluation:
   ```bash
   python phoneme_eval.py
   ```

## Evaluation

The leaderboard evaluates models on:
- **PER (Phoneme Error Rate)**: edit distance between the predicted and reference phoneme sequences, normalized by reference length; lower is better
- **Average Duration**: average processing time per audio sample

Models are ranked by Average PER across all datasets.
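PER is an edit-distance metric computed over phoneme sequences, analogous to WER over words. A minimal sketch of the computation (the repo's actual implementation lives in `utils/audio_process.py` and may differ in detail):

```python
# Illustrative PER sketch: Levenshtein distance over phoneme lists,
# normalized by reference length. Not the repo's actual code.
def per(reference, hypothesis):
    """Phoneme Error Rate = edit_distance(ref, hyp) / len(ref)."""
    m, n = len(reference), len(hypothesis)
    # dp[i][j] = edit distance between reference[:i] and hypothesis[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # i deletions
    for j in range(n + 1):
        dp[0][j] = j  # j insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,        # deletion
                dp[i][j - 1] + 1,        # insertion
                dp[i - 1][j - 1] + cost, # substitution (or match)
            )
    return dp[m][n] / max(m, 1)
```

For instance, `per(["k", "æ", "t"], ["k", "ɑ", "t"])` is one substitution over three reference phonemes, i.e. 1/3.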

## Datasets

- `phoneme_asr`: General phoneme recognition dataset
- `kids_phoneme_md`: Children's speech phoneme dataset