The result is not as expected

#2
by BoburAmirov - opened
import numpy as np
import onnxruntime as ort
import soundfile as sf
from matcha.text import text_to_sequence

# =====================
# ONNX session
# =====================
session = ort.InferenceSession(
    "model.onnx",
    providers=["CPUExecutionProvider"]
)

# =====================
# Text → token IDs
# =====================
text = "Salom, dunyo!"

tokens, _ = text_to_sequence(text, cleaner_names=["basic_cleaners"])

x = np.array(tokens, dtype=np.int64)[None, :]   # shape: (1, T)
x_lengths = np.array([x.shape[1]], dtype=np.int64)

# scales = [noise_scale, length_scale]
scales = np.array([0.667, 1.0], dtype=np.float32)

# =====================
# Inference
# =====================
audio = session.run(
    None,
    {
        "x": x,
        "x_lengths": x_lengths,
        "scales": scales,
    }
)[0]

audio = audio[0] if audio.ndim == 2 else audio

# =====================
# Save WAV
# =====================
sf.write("output.wav", audio.astype(np.float32), 22050)

print("✅ output.wav saved")

This is the example I am running, but the result is not as expected. Which cleaner and phonemizer were used?
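
A quick check that may help narrow this down is to compare the input and output names, shapes, and types the exported model declares against what the script feeds it. The sketch below is only an assumption about where the mismatch might be, not a confirmed cause: `describe_io` is a hypothetical helper, and it relies only on the `get_inputs()` and `get_outputs()` methods of an ONNX Runtime session. If the first output turns out not to be a waveform (for example, a spectrogram), writing it directly to a WAV file would not produce usable audio.

```python
def describe_io(session):
    """Return (inputs, outputs) as lists of (name, shape, type) tuples.

    `session` is expected to be an onnxruntime.InferenceSession, but any
    object exposing get_inputs() / get_outputs() works.
    """
    inputs = [(arg.name, arg.shape, arg.type) for arg in session.get_inputs()]
    outputs = [(arg.name, arg.shape, arg.type) for arg in session.get_outputs()]
    return inputs, outputs


# Usage with the session from the script above (requires model.onnx):
#
#   inputs, outputs = describe_io(session)
#   for name, shape, dtype in inputs:
#       print("input ", name, shape, dtype)
#   for name, shape, dtype in outputs:
#       print("output", name, shape, dtype)
```

If the printed names or dtypes differ from `x`, `x_lengths`, and `scales` as used in the script, the feed dictionary would need to be adjusted accordingly.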

Ovozify Labs org

Hello. We have published a repo on GitHub for running inference; you can download the model and use it that way.
