	---
	license: mit
	tags:
	- audio
	- music
	- source-separation
	- stem-separation
	- roformer
	- safetensors
	- maestraea
	pipeline_tag: audio-to-audio
	---

	# RoFormer Stem Separation Models (Safetensors)

	BS-RoFormer & MelBand RoFormer — State-of-the-art music source separation

	> Pretrained weights converted to safetensors format for use with [Mæstræa AI Workstation](https://github.com/AEmotionStudio/Maestraea).

	## Models

	### BS-RoFormer (Band-Split RoPE Transformer)

	| Variant | SDR (dB) | Task | Path |
	|---------|----------|------|------|
	| Vocals (viperx) | 12.97 | Vocal/instrumental separation | `bs_roformer/vocals_viperx/` |
	| Multi-stem | 9.65 | 4-stem (bass/drums/vocals/other) | `bs_roformer/multistem/` |

	### MelBand RoFormer (Mel-Band RoPE Transformer)

	| Variant | SDR (dB) | Task | Path |
	|---------|----------|------|------|
	| Vocals (KimberleyJensen) | 10.98 | Best vocal isolation | `mel_band_roformer/vocals_kj/` |
	| Vocals (viperx) | 11.43 | Vocal/instrumental separation | `mel_band_roformer/vocals_viperx/` |
	| Dereverb (anvuew) | 19.17 | Remove reverb from audio | `mel_band_roformer/dereverb/` |
	| Denoise (aufr33) | 27.99 | Remove noise from audio | `mel_band_roformer/denoise/` |
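
For readers unfamiliar with the SDR column, the sketch below shows the basic signal-to-distortion ratio in decibels, computed between a reference stem and a separated estimate. It is only an illustration: this card does not state which SDR variant or test set produced the figures in the tables, and the function name `sdr_db` and the `eps` guard are assumptions made for this example.

```python
import math


def sdr_db(reference, estimate, eps=1e-10):
    """Basic signal-to-distortion ratio in dB for two equal-length sample lists.

    SDR = 10 * log10(sum(ref^2) / sum((ref - est)^2)).
    `eps` (an assumption of this sketch) avoids division by zero on silence.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Toy example: an estimate that is the reference scaled by 0.9.
reference = [1.0, 0.0, -1.0, 0.0] * 4
estimate = [0.9 * x for x in reference]
score = sdr_db(reference, estimate)
print(f"{score:.2f} dB")
```

Higher is better: each additional 10 dB means the residual error energy is ten times smaller relative to the reference.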

	## Architecture

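	Both model families are RoPE Transformer architectures implemented in [lucidrains/BS-RoFormer](https://github.com/lucidrains/BS-RoFormer). They differ in how the spectrogram is split into bands:

	- BS-RoFormer: Splits the spectrogram into non-overlapping frequency subbands of predefined widths (narrower at low frequencies, wider at high frequencies)
	- MelBand RoFormer: Splits the spectrogram into overlapping bands spaced on the mel scale (perceptually motivated)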
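
The difference between the two band-splitting schemes can be sketched in a few lines of plain Python. This is an illustration only, not the code used in lucidrains/BS-RoFormer: the function names, the toy band widths, and the choice of the common `2595 * log10(1 + f / 700)` mel formula are all assumptions made for this example.

```python
import math


def split_by_widths(n_bins, widths):
    """Partition bins [0, n_bins) into contiguous, non-overlapping (start, stop) bands."""
    if sum(widths) != n_bins:
        raise ValueError("band widths must sum to the number of frequency bins")
    bands, start = [], 0
    for width in widths:
        bands.append((start, start + width))
        start += width
    return bands


def hz_to_mel(freq_hz):
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_bands(n_bins, n_bands, sample_rate):
    """Overlapping (start, stop) bands whose edges are evenly spaced on the mel scale.

    Band i spans edge i to edge i + 2, so each band shares bins with its neighbours.
    """
    nyquist = sample_rate / 2.0
    top_mel = hz_to_mel(nyquist)
    edges_hz = [mel_to_hz(top_mel * i / (n_bands + 1)) for i in range(n_bands + 2)]
    edges = [round(f / nyquist * (n_bins - 1)) for f in edges_hz]
    return [(edges[i], edges[i + 2] + 1) for i in range(n_bands)]


# Toy band-split: 10 bins cut into widths 2, 2 and 6.
print(split_by_widths(10, [2, 2, 6]))  # [(0, 2), (2, 4), (4, 10)]

# Mel-style bands for a 1025-bin spectrogram at 44.1 kHz.
bands = mel_bands(n_bins=1025, n_bands=60, sample_rate=44100)
print(bands[0], bands[-1])
```

In the first scheme every bin belongs to exactly one band. In the second, neighbouring bands share bins and grow wider toward high frequencies.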

	Both generally score higher SDR than HTDemucs on vocal separation tasks.

	## Usage

	Each model directory contains:
	- `model.safetensors` — Model weights
	- `config.yaml` — Architecture configuration (required for model instantiation)

	Requires the `bs-roformer` Python package: `pip install bs-roformer`
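
As a minimal sketch of fetching one variant outside Mæstræa, the snippet below downloads the two files of a model directory and loads the weights into a state dict. It assumes the `huggingface_hub`, `safetensors`, and `torch` packages are installed (this card does not list them) and that the repo id is `AEmotionStudio/roformer-models`. The helper names `variant_files` and `fetch_variant` are hypothetical. Building the model from `config.yaml` is left out because the config schema is not documented here.

```python
from pathlib import PurePosixPath

# Assumed repo id for this mirror on the Hugging Face Hub.
REPO_ID = "AEmotionStudio/roformer-models"


def variant_files(variant_dir):
    """Return the repo-relative paths of the weights and config for one variant."""
    base = PurePosixPath(variant_dir)
    return str(base / "model.safetensors"), str(base / "config.yaml")


def fetch_variant(variant_dir, repo_id=REPO_ID):
    """Download one variant and return (state_dict, local_config_path).

    Third-party imports are kept inside the function so the path helper
    above works without them.
    """
    from huggingface_hub import hf_hub_download
    from safetensors.torch import load_file

    weights_name, config_name = variant_files(variant_dir)
    weights_path = hf_hub_download(repo_id=repo_id, filename=weights_name)
    config_path = hf_hub_download(repo_id=repo_id, filename=config_name)
    state_dict = load_file(weights_path)  # dict mapping parameter names to tensors
    return state_dict, config_path


print(variant_files("bs_roformer/vocals_viperx/"))
```

Calling `fetch_variant("mel_band_roformer/denoise/")` would then give the weights to pass to `load_state_dict` on a model built from the matching `config.yaml`.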

	## Credits

	- Architecture: [lucidrains/BS-RoFormer](https://github.com/lucidrains/BS-RoFormer)
	- Training framework: [ZFTurbo/Music-Source-Separation-Training](https://github.com/ZFTurbo/Music-Source-Separation-Training)
	- BS-RoFormer vocals: [viperx](https://github.com/playdasegunda) via [TRvlvr](https://github.com/TRvlvr/model_repo)
	- MelBand vocals: [KimberleyJensen](https://github.com/KimberleyJensen), [viperx](https://github.com/playdasegunda)
	- MelBand dereverb: [anvuew](https://github.com/anvuew)
	- MelBand denoise: [aufr33](https://github.com/aufr33)
	- Conversion & Mirror by: [AEmotionStudio](https://huggingface.co/AEmotionStudio)

	## License

	MIT — same as all upstream model releases.