---
license: mit
tags:
- audio
- music
- source-separation
- stem-separation
- roformer
- safetensors
- maestraea
pipeline_tag: audio-to-audio
---

# RoFormer Stem Separation Models (Safetensors)

**BS-RoFormer & MelBand RoFormer: state-of-the-art music source separation**

> Pretrained weights converted to safetensors format for use with [Mæstræa AI Workstation](https://github.com/AEmotionStudio/Maestraea).

## Models

### BS-RoFormer (Band-Split RoPE Transformer)

| Variant | SDR (dB) | Task | Path |
|---------|----------|------|------|
| Vocals (viperx) | 12.97 | Vocal/instrumental separation | `bs_roformer/vocals_viperx/` |
| Multi-stem | 9.65 | 4-stem (bass/drums/vocals/other) | `bs_roformer/multistem/` |

### MelBand RoFormer (Mel-Band RoPE Transformer)

| Variant | SDR (dB) | Task | Path |
|---------|----------|------|------|
| Vocals (KimberleyJensen) | 10.98 | Best vocal isolation | `mel_band_roformer/vocals_kj/` |
| Vocals (viperx) | 11.43 | Vocal/instrumental separation | `mel_band_roformer/vocals_viperx/` |
| Dereverb (anvuew) | 19.17 | Remove reverb from audio | `mel_band_roformer/dereverb/` |
| Denoise (aufr33) | 27.99 | Remove noise from audio | `mel_band_roformer/denoise/` |

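The SDR column reports signal-to-distortion ratio in decibels, where higher is better. As a rough guide to reading those numbers, the sketch below implements the basic energy-ratio definition in plain Python. Treat it as an assumption about the metric: the upstream releases may have computed their scores with a different SDR variant or on different evaluation sets, so values from this function are not directly comparable to the tables. The name `sdr_db` is illustrative and not part of any package mentioned here.

```python
import math


def sdr_db(reference, estimate, eps=1e-8):
    """Basic SDR in dB: energy of the reference over energy of the error.

    `reference` and `estimate` are equal-length sequences of samples.
    `eps` avoids division by zero for silent or perfectly matched signals.
    """
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Toy example: the estimate is the reference scaled by 0.9,
# so the error carries 1% of the signal energy.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9 * r for r in reference]
print(round(sdr_db(reference, estimate), 2))  # about 20.0
```

Every additional 10 dB means the residual error energy is ten times smaller relative to the target signal.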
## Architecture

Both model families use the Band-Split RoPE Transformer architecture, as implemented in [lucidrains/BS-RoFormer](https://github.com/lucidrains/BS-RoFormer). They differ in how the spectrogram is split into frequency bands:

- **BS-RoFormer**: Splits the spectrogram into non-overlapping subbands with predefined widths (narrow at low frequencies, wider at high frequencies)
- **MelBand RoFormer**: Splits the spectrogram into overlapping bands spaced on the mel scale (perceptually motivated)

Both significantly outperform HTDemucs on vocal separation tasks.

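To make the difference between the two band-splitting schemes concrete, here is a minimal sketch in plain Python. It is a simplification for illustration, not the code used by `bs-roformer`: the function names, the band widths, and the way mel bands are derived from evenly spaced mel points are all assumptions, and the real band layout of each model comes from its `config.yaml`.

```python
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to the mel scale (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def fixed_width_bands(widths):
    """Non-overlapping (start, stop) bin ranges built from a list of band widths."""
    bands, start = [], 0
    for width in widths:
        bands.append((start, start + width))
        start += width
    return bands


def mel_spaced_bands(n_bins, n_bands, sample_rate):
    """Overlapping (start, stop) bin ranges placed on evenly spaced mel points."""
    nyquist = sample_rate / 2.0
    top_mel = hz_to_mel(nyquist)
    # n_bands + 2 mel points: band i runs from point i to point i + 2,
    # so each band shares bins with its neighbours.
    points = [
        round(mel_to_hz(top_mel * i / (n_bands + 1)) / nyquist * (n_bins - 1))
        for i in range(n_bands + 2)
    ]
    return [(points[i], points[i + 2] + 1) for i in range(n_bands)]


# Hypothetical example values, not taken from any config.yaml in this repo.
print(fixed_width_bands([2, 2, 4, 8]))  # [(0, 2), (2, 4), (4, 8), (8, 16)]
print(mel_spaced_bands(n_bins=1025, n_bands=6, sample_rate=44100))
```

In the fixed-width case every frequency bin belongs to exactly one band, while in the mel-spaced case neighbouring bands share bins and the bands get wider as frequency increases.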
## Usage

Each model directory contains:
- `model.safetensors`: Model weights
- `config.yaml`: Architecture configuration (required for model instantiation)

Requires the `bs-roformer` Python package: `pip install bs-roformer`

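Before instantiating a model, it can be useful to check what a downloaded `model.safetensors` actually contains. The sketch below reads only the file header using the standard library, relying on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The function names `read_safetensors_header` and `summarize` are illustrative and not part of `bs-roformer` or `safetensors`, and the path in the commented usage assumes the directory layout from the tables above.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: header length as an unsigned little-endian 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next header_len bytes: UTF-8 JSON describing every tensor in the file.
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize(header):
    """Map tensor name -> (dtype, shape), skipping the optional metadata entry."""
    return {
        name: (info["dtype"], info["shape"])
        for name, info in header.items()
        if name != "__metadata__"
    }


# Example usage (assumes the file has been downloaded to this relative path):
# header = read_safetensors_header("bs_roformer/vocals_viperx/model.safetensors")
# for name, (dtype, shape) in summarize(header).items():
#     print(name, dtype, shape)
```

Comparing the listed tensor names and shapes against the settings in `config.yaml` is a quick way to spot a mismatched weights and config pair before any weights are loaded.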
## Credits

- **Architecture**: [lucidrains/BS-RoFormer](https://github.com/lucidrains/BS-RoFormer)
- **Training framework**: [ZFTurbo/Music-Source-Separation-Training](https://github.com/ZFTurbo/Music-Source-Separation-Training)
- **BS-RoFormer vocals**: [viperx](https://github.com/playdasegunda) via [TRvlvr](https://github.com/TRvlvr/model_repo)
- **MelBand vocals**: [KimberleyJensen](https://github.com/KimberleyJensen), [viperx](https://github.com/playdasegunda)
- **MelBand dereverb**: [anvuew](https://github.com/anvuew)
- **MelBand denoise**: [aufr33](https://github.com/aufr33)
- **Conversion & Mirror by**: [AEmotionStudio](https://huggingface.co/AEmotionStudio)

## License

MIT, same as all upstream model releases.