---
license: mit
tags:
- audio
- music
- source-separation
- stem-separation
- roformer
- safetensors
- maestraea
pipeline_tag: audio-to-audio
---
# RoFormer Stem Separation Models (Safetensors)
**BS-RoFormer & MelBand RoFormer — State-of-the-art music source separation**
> Pretrained weights converted to safetensors format for use with [Mæstræa AI Workstation](https://github.com/AEmotionStudio/Maestraea).
## Models
### BS-RoFormer (Band-Split RoPE Transformer)
| Variant | SDR (dB) | Task | Path |
|---------|-----|------|------|
| Vocals (viperx) | 12.97 | Vocal/instrumental separation | `bs_roformer/vocals_viperx/` |
| Multi-stem | 9.65 | 4-stem (bass/drums/vocals/other) | `bs_roformer/multistem/` |
### MelBand RoFormer (Mel-Band RoPE Transformer)
| Variant | SDR (dB) | Task | Path |
|---------|-----|------|------|
| Vocals (KimberleyJensen) | 10.98 | Best vocal isolation | `mel_band_roformer/vocals_kj/` |
| Vocals (viperx) | 11.43 | Vocal/instrumental separation | `mel_band_roformer/vocals_viperx/` |
| Dereverb (anvuew) | 19.17 | Remove reverb from audio | `mel_band_roformer/dereverb/` |
| Denoise (aufr33) | 27.99 | Remove noise from audio | `mel_band_roformer/denoise/` |
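
The SDR columns above report signal-to-distortion ratio in decibels, where higher is better. As a rough guide to reading those numbers, here is a minimal sketch of the common energy-ratio definition. It assumes the upstream figures follow this plain form; the evaluation dataset and exact protocol are not stated in this README, and the function name `sdr_db` is hypothetical.

```python
import math


def sdr_db(reference, estimate, eps=1e-8):
    """Signal-to-distortion ratio in dB for two equal-length sample sequences.

    Assumed definition: 10 * log10(sum(ref^2) / sum((ref - est)^2)).
    `eps` only guards against division by zero and log of zero.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))
```

Under this definition, each additional 10 dB corresponds to a tenfold drop in residual error energy relative to the target signal.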
## Architecture
Both model families use RoPE Transformer architectures as implemented in [lucidrains/BS-RoFormer](https://github.com/lucidrains/BS-RoFormer):
- **BS-RoFormer**: Splits the spectrogram into non-overlapping frequency subbands of predefined widths (narrower at low frequencies, wider at high frequencies)
- **MelBand RoFormer**: Splits the spectrogram into overlapping bands spaced on the mel scale (perceptually motivated)
Both are reported to outperform HTDemucs on vocal separation tasks.
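
The two splitting strategies differ mainly in whether neighbouring bands share frequency content. The sketch below illustrates that difference only. It is not the code used by the `bs-roformer` package; the band widths and band count are made-up values, and the helper names are hypothetical. The mel conversion uses the common `2595 * log10(1 + f / 700)` formula, with each mel band taken as the support of a triangular filter.

```python
import math


def hz_to_mel(freq_hz):
    """Convert Hz to mel using the common 2595 * log10(1 + f / 700) formula."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def split_disjoint(widths):
    """Non-overlapping split: band i covers `widths[i]` consecutive frequency bins.

    Returns (start_bin, end_bin) pairs with end_bin exclusive.
    """
    bands, start = [], 0
    for width in widths:
        bands.append((start, start + width))
        start += width
    return bands


def mel_band_edges_hz(num_bands, max_freq_hz):
    """Overlapping split: band i spans mel points i..i+2, like a triangular filter.

    Returns (low_hz, high_hz) pairs; each band overlaps its neighbours.
    """
    max_mel = hz_to_mel(max_freq_hz)
    # num_bands + 2 points equally spaced on the mel axis, mapped back to Hz.
    points = [mel_to_hz(max_mel * i / (num_bands + 1)) for i in range(num_bands + 2)]
    return [(points[i], points[i + 2]) for i in range(num_bands)]
```

With `split_disjoint`, every band starts exactly where the previous one ends. With `mel_band_edges_hz`, each band starts inside the previous one and bands get wider toward high frequencies.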
## Usage
Each model directory contains:
- `model.safetensors` — Model weights
- `config.yaml` — Architecture configuration (required for model instantiation)
Loading the models requires the `bs-roformer` Python package: `pip install bs-roformer`
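
As a starting point, the sketch below locates the two files for a given variant in a local copy of this repository and reads them. It makes several assumptions: the repository has already been downloaded to a local directory, `safetensors`, `torch`, and `PyYAML` are installed in addition to the package named above, and the short variant keys are labels invented for this example. Building the network from the parsed config depends on the `bs-roformer` constructor arguments and is deliberately not shown.

```python
from pathlib import Path

# Relative paths copied from the tables in this README.
# The dictionary keys are hypothetical labels chosen for this example.
MODEL_DIRS = {
    "bs_vocals_viperx": "bs_roformer/vocals_viperx",
    "bs_multistem": "bs_roformer/multistem",
    "mel_vocals_kj": "mel_band_roformer/vocals_kj",
    "mel_vocals_viperx": "mel_band_roformer/vocals_viperx",
    "mel_dereverb": "mel_band_roformer/dereverb",
    "mel_denoise": "mel_band_roformer/denoise",
}


def model_files(repo_root, variant):
    """Return (weights_path, config_path) for a variant, checking both exist."""
    model_dir = Path(repo_root) / MODEL_DIRS[variant]
    weights = model_dir / "model.safetensors"
    config = model_dir / "config.yaml"
    missing = [p.name for p in (weights, config) if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"{model_dir} is missing: {', '.join(missing)}")
    return weights, config


def load_weights_and_config(repo_root, variant):
    """Load the tensors and the parsed YAML config for one variant."""
    # Imported lazily so model_files() works without these packages installed.
    import yaml
    from safetensors.torch import load_file

    weights, config = model_files(repo_root, variant)
    with open(config, "r", encoding="utf-8") as fh:
        cfg = yaml.safe_load(fh)
    state_dict = load_file(str(weights))  # dict of parameter name -> torch.Tensor
    return state_dict, cfg
```

For example, `load_weights_and_config("path/to/local/copy", "mel_vocals_kj")` would return the weight tensors and the parsed configuration, which can then be passed to whichever model class matches the config.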
## Credits
- **Architecture**: [lucidrains/BS-RoFormer](https://github.com/lucidrains/BS-RoFormer)
- **Training framework**: [ZFTurbo/Music-Source-Separation-Training](https://github.com/ZFTurbo/Music-Source-Separation-Training)
- **BS-RoFormer vocals**: [viperx](https://github.com/playdasegunda) via [TRvlvr](https://github.com/TRvlvr/model_repo)
- **MelBand vocals**: [KimberleyJensen](https://github.com/KimberleyJensen), [viperx](https://github.com/playdasegunda)
- **MelBand dereverb**: [anvuew](https://github.com/anvuew)
- **MelBand denoise**: [aufr33](https://github.com/aufr33)
- **Conversion & Mirror by**: [AEmotionStudio](https://huggingface.co/AEmotionStudio)
## License
MIT — same as all upstream model releases.