Qwen3-TTS is a multilingual TTS model with voice cloning, emotion control, and voice design. Requires QWEN_TTS_ENABLED=1. Supports 11 languages, including English, Chinese, Japanese, Korean, French, German, and Spanish.
Generate Speech
POST /qwen-tts
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text * | string | - | Text to convert to speech (1-10,000 characters) |
| voice (optional) | string | Chelsie | Voice to use (see /qwen-tts/voices for available voices) |
| language (optional) | string | English | Language for synthesis |
| format (optional) | string | wav | Output format: "wav", "opus", or "buffer" (base64 JSON). Also accepted as ?format= on this endpoint. |
cURL Example
curl -X POST https://voice.sogni.ai/qwen-tts \
-H "Content-Type: application/json" \
-d '{"text": "Hello, world!", "voice": "Chelsie"}' \
--output output.wav
Opus via query-string format override
curl -X POST 'https://voice.sogni.ai/qwen-tts?format=opus' \
-H "Content-Type: application/json" \
-d '{"text": "Hello, world!", "voice": "Chelsie"}' \
--output output.opus
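For callers working from Python rather than cURL, the request above can be sketched as follows. This is a minimal sketch using only the standard library; the helper names `build_tts_request` and `synthesize` are illustrative and not part of the API, and the length check simply mirrors the documented 1-10,000 character limit on the client side.

```python
import json
import urllib.request

BASE_URL = "https://voice.sogni.ai"
ALLOWED_FORMATS = {"wav", "opus", "buffer"}


def build_tts_request(text, voice="Chelsie", language="English", fmt="wav"):
    """Build (but do not send) a POST /qwen-tts request.

    Hypothetical helper: the defaults mirror the parameter table above.
    """
    if not 1 <= len(text) <= 10_000:
        raise ValueError("text must be 1-10,000 characters")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    payload = {"text": text, "voice": voice, "language": language, "format": fmt}
    return urllib.request.Request(
        f"{BASE_URL}/qwen-tts",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path, **options):
    """Send the request and write the returned bytes to out_path."""
    request = build_tts_request(text, **options)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

Under these assumptions, `synthesize("Hello, world!", "output.wav")` would correspond to the first cURL example, and `synthesize("Hello, world!", "output.opus", fmt="opus")` to an Opus request with the format sent in the JSON body.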
Custom Voice (Emotion/Style Control)
POST /qwen-tts/custom-voice
Generate speech with emotion and style instructions. Requires the CustomVoice model variant.
| Parameter | Type | Description |
| --- | --- | --- |
| text * | string | Text to convert to speech |
| speaker (optional) | string | Speaker voice to use (default: Chelsie) |
| instruct * | string | Emotion/style instruction (e.g., "Very happy and excited", "Speak slowly with a calm tone") |
| format (optional) | string | Output format: "wav", "opus", or "buffer". Also accepted as ?format= on this endpoint. |
cURL Example
curl -X POST https://voice.sogni.ai/qwen-tts/custom-voice \
-H "Content-Type: application/json" \
-d '{"text": "I am so excited!", "speaker": "Chelsie", "instruct": "Very happy and enthusiastic"}' \
--output excited.wav
Opus via query-string format override
curl -X POST 'https://voice.sogni.ai/qwen-tts/custom-voice?format=opus' \
-H "Content-Type: application/json" \
-d '{"text": "I am so excited!", "speaker": "Chelsie", "instruct": "Very happy and enthusiastic"}' \
--output excited.opus
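To compare several deliveries of the same line, one request per style instruction can be prepared up front. The sketch below is an illustration only: `build_custom_voice_request` and `render_styles` are hypothetical helper names, and the format is passed through the `?format=` query string in the same way as the Opus cURL example.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://voice.sogni.ai"


def build_custom_voice_request(text, instruct, speaker="Chelsie", fmt=None):
    """Build (but do not send) a POST /qwen-tts/custom-voice request."""
    if not text or not instruct:
        raise ValueError("text and instruct are both required")
    url = f"{BASE_URL}/qwen-tts/custom-voice"
    if fmt is not None:
        # Send the format as ?format=..., like the Opus cURL example.
        url += "?" + urllib.parse.urlencode({"format": fmt})
    payload = {"text": text, "speaker": speaker, "instruct": instruct}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def render_styles(text, instructs, speaker="Chelsie", fmt=None):
    """Return one prepared request per style instruction."""
    return [
        build_custom_voice_request(text, instruct, speaker, fmt)
        for instruct in instructs
    ]
```

Each prepared request can then be sent with `urllib.request.urlopen` and the response body written to its own file.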
Voice Design (Create Voice from Description)
POST /qwen-tts/voice-design
Generate speech using a voice created from a text description. Requires the VoiceDesign model variant.
| Parameter | Type | Description |
| --- | --- | --- |
| text * | string | Text to convert to speech |
| instruct * | string | Voice description (e.g., "A deep male voice with a warm, calm tone") |
| format (optional) | string | Output format: "wav", "opus", or "buffer". Also accepted as ?format= on this endpoint. |
cURL Example
curl -X POST https://voice.sogni.ai/qwen-tts/voice-design \
-H "Content-Type: application/json" \
-d '{"text": "Welcome to our service.", "instruct": "A deep male voice with calm, professional tone"}' \
--output designed_voice.wav
Opus via query-string format override
curl -X POST 'https://voice.sogni.ai/qwen-tts/voice-design?format=opus' \
-H "Content-Type: application/json" \
-d '{"text": "Welcome to our service.", "instruct": "A deep male voice with calm, professional tone"}' \
--output designed_voice.opus
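The "buffer" format is described in the Generate Speech table as base64 JSON, but this section does not name the JSON field that carries the audio. The sketch below therefore takes the field name as a parameter. `decode_buffer_response` is a hypothetical helper, and any key passed to it (such as `"audio"`) is a placeholder to be checked against a real response.

```python
import base64
import json


def decode_buffer_response(body, field):
    """Decode base64 audio from a "buffer"-format JSON response.

    `field` is the JSON key holding the base64 string. Its real name is
    not documented here, so the caller must supply it.
    """
    document = json.loads(body)
    if field not in document:
        raise KeyError(f"{field!r} not in response; keys: {sorted(document)}")
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(document[field], validate=True)
```

For example, `decode_buffer_response(response_body, "audio")` would return raw audio bytes if the response stored them under an `audio` key.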
List Voices
GET /qwen-tts/voices
Response
{
  "voices": ["Chelsie", "Ethan", "Serena", "Vivian", "Ryan", "Aiden", "Eric", "Dylan"],
  "clones": ["my_clone"],
  "default": "Chelsie",
  "defaultLanguage": "English",
  "modelVariants": {
    "base": "base-0.6b",
    "customVoice": "custom-voice"
  },
  "features": ["voice_cloning", "custom_voice"]
}
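A client can use this response to fall back to the default voice and to check capabilities before calling an endpoint. The following is a minimal sketch that operates on the parsed JSON. The helper names are hypothetical, and it assumes that the `features` list reflects which model variants are loaded, which the sample suggests but this section does not state.

```python
def resolve_voice(listing, requested=None):
    """Return `requested` if it is a listed built-in voice, else the default."""
    if requested in listing.get("voices", []):
        return requested
    return listing["default"]


def is_clone(listing, name):
    """True if `name` appears in the list of stored voice clones."""
    return name in listing.get("clones", [])


def has_feature(listing, feature):
    """True if the server reports `feature` in its features list."""
    return feature in listing.get("features", [])
```

With the sample response above, `resolve_voice(listing, "Ethan")` keeps `Ethan`, an unknown name falls back to `Chelsie`, and `has_feature(listing, "custom_voice")` is true.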
Create Voice Clone
POST /qwen-tts/voices/clone
Upload a reference audio file with its transcript to create a voice clone. Requires the Base model variant.
| Parameter | Type | Description |
| --- | --- | --- |
| audio * | File | Reference audio file (3-10 seconds, WAV/MP3/OGG) |
| transcript * | string | Exact text spoken in the reference audio |
| cloneId (optional) | string | Custom name for the clone (alphanumeric, underscore, hyphen) |
cURL Example
curl -X POST https://voice.sogni.ai/qwen-tts/voices/clone \
-F "audio=@/path/to/reference.wav" \
-F "transcript=Hello, this is my voice sample." \
-F "cloneId=my_voice"
Response
{
  "success": true,
  "cloneId": "my_voice",
  "message": "Voice clone created successfully"
}
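The constraints in the parameter table can be checked locally before uploading. The sketch below is one possible pre-flight check, not server behavior: the helper names are hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and the duration check uses the standard `wave` module, so it covers PCM WAV files only (MP3 and OGG would need another library).

```python
import re
import wave

# Assumption: "alphanumeric" means ASCII letters and digits.
CLONE_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_clone_id(clone_id):
    """True if clone_id uses only letters, digits, underscore, and hyphen."""
    return CLONE_ID_PATTERN.fullmatch(clone_id) is not None


def wav_duration_seconds(path):
    """Duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_reference(path, transcript, clone_id=None):
    """Return a list of problems; an empty list means the inputs look fine."""
    problems = []
    if not transcript.strip():
        problems.append("transcript is empty")
    if clone_id is not None and not is_valid_clone_id(clone_id):
        problems.append(f"invalid cloneId: {clone_id!r}")
    duration = wav_duration_seconds(path)
    if not 3.0 <= duration <= 10.0:
        problems.append(f"audio is {duration:.1f}s; expected 3-10s")
    return problems
```

Running `check_reference("/path/to/reference.wav", "Hello, this is my voice sample.", "my_voice")` before the upload catches an out-of-range clip or a malformed ID without a round trip to the server.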
Generate with Cloned Voice
POST /qwen-tts/voices/clone/{cloneId}/generate
| Parameter | Type | Description |
| --- | --- | --- |
| text * | string | Text to convert to speech |
| language (optional) | string | Language for synthesis (default: English) |
| format (optional) | string | Output format: "wav", "opus", or "buffer" |
cURL Example
curl -X POST https://voice.sogni.ai/qwen-tts/voices/clone/my_voice/generate \
-H "Content-Type: application/json" \
-d '{"text": "Hello from my cloned voice!"}' \
--output cloned_output.wav
Rename Voice Clone
PATCH /qwen-tts/voices/clone/{cloneId}
| Parameter | Type | Description |
| --- | --- | --- |
| newCloneId * | string | New name for the voice clone |
cURL Example
curl -X PATCH https://voice.sogni.ai/qwen-tts/voices/clone/my_voice \
-H "Content-Type: application/json" \
-d '{"newCloneId": "renamed_voice"}'
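In Python's standard library there is no dedicated PATCH helper, so the method is set explicitly on the request. The sketch below builds the rename request without sending it; `clone_url` and `build_rename_request` are hypothetical helper names, and percent-encoding the ID is a defensive choice rather than a documented requirement.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://voice.sogni.ai"


def clone_url(clone_id, suffix=""):
    """URL for a clone resource, percent-encoding the ID for the path."""
    encoded = urllib.parse.quote(clone_id, safe="")
    return f"{BASE_URL}/qwen-tts/voices/clone/{encoded}{suffix}"


def build_rename_request(clone_id, new_clone_id):
    """Build (but do not send) a PATCH request that renames a clone."""
    payload = {"newCloneId": new_clone_id}
    return urllib.request.Request(
        clone_url(clone_id),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )
```

Passing the result to `urllib.request.urlopen` would send the same request as the cURL example when called as `build_rename_request("my_voice", "renamed_voice")`.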
Delete Voice Clone
DELETE /qwen-tts/voices/clone/{cloneId}
cURL Example
curl -X DELETE https://voice.sogni.ai/qwen-tts/voices/clone/my_voice
Response
{
  "success": true,
  "cloneId": "my_voice",
  "message": "Voice clone deleted successfully"
}
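The clone responses shown in this section share the fields `success`, `cloneId`, and `message`, so a caller can handle them with one small parser. This is a sketch under an assumption: the shape of a failure response is not documented here, so the code only treats anything other than `"success": true` as an error. `CloneOperationError` and `parse_clone_result` are hypothetical names.

```python
import json


class CloneOperationError(RuntimeError):
    """Raised when a clone response does not report success."""


def parse_clone_result(body):
    """Return the cloneId from a clone response, or raise on failure."""
    data = json.loads(body)
    if data.get("success") is not True:
        # Assumption: a failure may still carry a human-readable message.
        raise CloneOperationError(
            data.get("message", "operation did not report success")
        )
    return data["cloneId"]
```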
Download Voice Clone
GET /qwen-tts/voices/clone/{cloneId}/download
Download a voice clone as a ZIP file containing the voice embedding and metadata. Useful for backup or transferring clones between servers.
Response
Returns a ZIP file (Content-Type: application/zip) containing:
{cloneId}.safetensors - The voice embedding (safe, no code execution possible)
metadata.json - Clone metadata (clone ID, creation date, service info)
cURL Example
curl https://voice.sogni.ai/qwen-tts/voices/clone/my_voice/download \
--output my_voice.zip
JavaScript Example
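// Fetch the clone archive and trigger a browser download.
const response = await fetch(
  'https://voice.sogni.ai/qwen-tts/voices/clone/my_voice/download'
);
if (!response.ok) {
  throw new Error(`Download failed: ${response.status}`);
}
const blob = await response.blob();
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = 'my_voice.zip';
a.click();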
Python Example
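import requests

# Download the clone archive and save it to disk.
response = requests.get(
    'https://voice.sogni.ai/qwen-tts/voices/clone/my_voice/download'
)
response.raise_for_status()  # fail on HTTP errors instead of saving an error body
with open('my_voice.zip', 'wb') as f:
    f.write(response.content)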
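Before storing or re-importing a downloaded archive, it can be worth confirming that it has the documented contents. The sketch below inspects the ZIP in memory with the standard library; `inspect_clone_archive` is a hypothetical helper, and it assumes `metadata.json` sits at the top level of the archive.

```python
import io
import json
import zipfile


def inspect_clone_archive(zip_bytes):
    """Return (embedding_names, metadata) for a clone ZIP.

    metadata is the parsed metadata.json, or None if the file is absent.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = archive.namelist()
        embeddings = [n for n in names if n.endswith(".safetensors")]
        if not embeddings:
            raise ValueError("archive has no .safetensors file")
        metadata = None
        if "metadata.json" in names:
            metadata = json.loads(archive.read("metadata.json"))
    return embeddings, metadata
```

With the Python download example, `inspect_clone_archive(response.content)` would list the embedding file and return the parsed metadata.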
Import Voice Clone
POST /qwen-tts/voices/clone/import
Import a previously exported voice clone from a ZIP file. The ZIP must contain a .safetensors file.
Import access: If voiceCloneImports.mode is api_key, provide a valid API key in either the X-API-Key header or an Authorization: Bearer header. If the mode is blocked, imports will not work until the server sets AUTH_API_KEY or DANGEROUSLY_ALLOW_IMPORTS=1.
| Parameter | Type | Description |
| --- | --- | --- |
| file * | File | ZIP file containing the voice clone (.safetensors + optional metadata.json) |
| cloneId (optional) | string | Custom name for the imported clone. If omitted, uses the name from metadata or filename. |
cURL Example
curl -X POST https://voice.sogni.ai/qwen-tts/voices/clone/import \
-H "X-API-Key: sk_your_secret_key_here" \
-F "file=@my_voice.zip" \
-F "cloneId=restored_voice"
Response
{
  "success": true,
  "cloneId": "restored_voice",
  "message": "Voice clone imported successfully"
}
JavaScript Example
// zipFile is a File or Blob holding the exported archive (e.g., from a file input).
const formData = new FormData();
formData.append('file', zipFile);
formData.append('cloneId', 'my_imported_voice');
const response = await fetch('https://voice.sogni.ai/qwen-tts/voices/clone/import', {
  method: 'POST',
  headers: { 'X-API-Key': 'sk_your_secret_key_here' },
  body: formData
});
const data = await response.json();
console.log('Imported clone:', data.cloneId);
Python Example
import requests

# Upload the exported ZIP as multipart form data, authenticating with an API key.
with open('my_voice.zip', 'rb') as f:
    response = requests.post(
        'https://voice.sogni.ai/qwen-tts/voices/clone/import',
        headers={'X-API-Key': 'sk_your_secret_key_here'},
        files={'file': f},
        data={'cloneId': 'restored_voice'}
    )
response.raise_for_status()
data = response.json()
print(f"Imported clone: {data['cloneId']}")
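The import-access note allows the API key to travel in either of two headers. A small helper can keep that choice in one place. This is a sketch only: `import_auth_headers` and its `scheme` values are hypothetical names, and it returns no headers when no key is given, which would only be accepted by a server configured to allow unauthenticated imports.

```python
def import_auth_headers(api_key=None, scheme="x-api-key"):
    """Return the HTTP headers that carry the API key for an import call."""
    if api_key is None:
        return {}
    if scheme == "x-api-key":
        return {"X-API-Key": api_key}
    if scheme == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    raise ValueError(f"unknown scheme: {scheme!r}")
```

The result can be passed as the `headers` argument in the Python example above, for instance `headers=import_auth_headers('sk_your_secret_key_here', scheme='bearer')`.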