Kyutai Pocket TTS is a lightweight, 100M-parameter text-to-speech (TTS) model. It runs on CPU only, supports English only, has ~200 ms latency, and supports voice cloning. Requires POCKET_TTS_ENABLED=true.
Generate Speech
POST /pocket-tts
| Parameter | Type | Default | Description |
|---|---|---|---|
| text * | string | - | Text to convert to speech (1-10,000 characters) |
| voice (optional) | string | alba | Built-in voice: alba, marius, javert, jean, fantine, cosette, eponine, azelma |
| format (optional) | string | wav | Output format: "wav", "opus" (returns audio file) or "buffer" (returns base64 JSON) |
cURL Example
curl -X POST https://voice.sogni.ai/pocket-tts \
-H "Content-Type: application/json" \
-d '{"text": "Hello, world!", "voice": "alba"}' \
--output output.wav
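The same request can be sketched in Python using only the standard library. The helper names `build_tts_request` and `synthesize_to_file` are illustrative, not part of the API, and the client-side checks simply mirror the parameter table above; the server remains the authority on what it accepts.

```python
import json
import urllib.request

BASE_URL = "https://voice.sogni.ai"

# Values copied from the parameter table; treat them as a client-side
# convenience check, not a guarantee of what the server accepts.
BUILT_IN_VOICES = {
    "alba", "marius", "javert", "jean",
    "fantine", "cosette", "eponine", "azelma",
}
FORMATS = {"wav", "opus", "buffer"}


def build_tts_request(text, voice="alba", fmt="wav"):
    """Validate the documented constraints and return an unsent Request."""
    if not 1 <= len(text) <= 10_000:
        raise ValueError("text must be 1-10,000 characters")
    if voice not in BUILT_IN_VOICES:
        raise ValueError(f"unknown built-in voice: {voice!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    body = json.dumps({"text": text, "voice": voice, "format": fmt})
    return urllib.request.Request(
        f"{BASE_URL}/pocket-tts",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text, path, voice="alba", fmt="wav"):
    """Send the request and write the returned audio bytes to path.

    Intended for "wav" and "opus"; "buffer" returns JSON rather than audio.
    """
    request = build_tts_request(text, voice, fmt)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(path, "wb") as f:
        f.write(audio)
```

Calling `synthesize_to_file("Hello, world!", "output.wav")` would perform the same request as the cURL example. Validation is kept separate from sending so that the limits can be checked without a network call.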
List Voices
GET /pocket-tts/voices
Response
{
"voices": ["alba", "marius", "javert", "jean", "fantine", "cosette", "eponine", "azelma"],
"clones": ["my_clone"],
"default": "alba"
}
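As a rough illustration of how a client might use this response, the sketch below decides whether a requested name is a built-in voice or a clone. The function name `resolve_voice` is hypothetical, and the sample data is the response shown above.

```python
import json

# The example response from above, embedded so the sketch runs offline.
SAMPLE_RESPONSE = """
{
  "voices": ["alba", "marius", "javert", "jean", "fantine", "cosette", "eponine", "azelma"],
  "clones": ["my_clone"],
  "default": "alba"
}
"""


def resolve_voice(listing, requested=None):
    """Return (kind, name), where kind is "builtin" or "clone".

    Assumes the "default" entry names a built-in voice, as in the sample.
    """
    if requested is None:
        return ("builtin", listing["default"])
    if requested in listing["voices"]:
        return ("builtin", requested)
    if requested in listing["clones"]:
        return ("clone", requested)
    raise KeyError(f"voice not found: {requested!r}")


listing = json.loads(SAMPLE_RESPONSE)
print(resolve_voice(listing))              # ('builtin', 'alba')
print(resolve_voice(listing, "my_clone"))  # ('clone', 'my_clone')
```

A "builtin" result maps to the `voice` parameter of POST /pocket-tts, while a "clone" result maps to POST /pocket-tts/voices/clone/{cloneId}/generate.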
Create Voice Clone
POST /pocket-tts/voices/clone
Upload a reference audio file to create a voice clone. No transcript needed.
| Parameter | Type | Description |
|---|---|---|
| audio * | File | Reference audio file (WAV, MP3, OGG) |
| cloneId (optional) | string | Custom name for the clone (alphanumeric, underscore, hyphen) |
cURL Example
curl -X POST https://voice.sogni.ai/pocket-tts/voices/clone \
-F "audio=@/path/to/reference.wav" \
-F "cloneId=my_voice"
Response
{
"success": true,
"cloneId": "my_voice",
"message": "Voice clone created successfully"
}
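The `cloneId` character rule can be checked on the client before uploading. The sketch below assumes the rule means ASCII letters, digits, underscores, and hyphens; the server's exact pattern and any length limit are not stated here, so `CLONE_ID_PATTERN` is an assumption.

```python
import re

# Assumed reading of "alphanumeric, underscore, hyphen". The server's real
# rules (length limits, Unicode handling) may differ.
CLONE_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_clone_id(clone_id):
    """Return True if clone_id uses only ASCII letters, digits, '_' and '-'."""
    return CLONE_ID_PATTERN.fullmatch(clone_id) is not None


print(is_valid_clone_id("my_voice"))   # True
print(is_valid_clone_id("my voice!"))  # False
```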
Generate with Cloned Voice
POST /pocket-tts/voices/clone/{cloneId}/generate
cURL Example
curl -X POST https://voice.sogni.ai/pocket-tts/voices/clone/my_voice/generate \
-H "Content-Type: application/json" \
-d '{"text": "Hello from my cloned voice!"}' \
--output output.wav
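Because the clone ID is part of the URL path, a client may want to escape it when building the request. The following is a minimal sketch with a hypothetical helper name, `build_clone_generate_request`. It sends only the `text` field, matching the cURL example, since no other body parameters are documented for this endpoint.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://voice.sogni.ai"


def build_clone_generate_request(clone_id, text):
    """Return an unsent POST Request for the clone's generate endpoint."""
    # safe="" also escapes "/", so the ID always stays a single path segment.
    segment = urllib.parse.quote(clone_id, safe="")
    body = json.dumps({"text": text})
    return urllib.request.Request(
        f"{BASE_URL}/pocket-tts/voices/clone/{segment}/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_clone_generate_request("my_voice", "Hello from my cloned voice!")
print(request.full_url)
# https://voice.sogni.ai/pocket-tts/voices/clone/my_voice/generate
```

Passing the result to `urllib.request.urlopen` and writing `response.read()` to a file would reproduce the cURL call.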
Delete Voice Clone
DELETE /pocket-tts/voices/clone/{cloneId}
cURL Example
curl -X DELETE https://voice.sogni.ai/pocket-tts/voices/clone/my_voice
Download Voice Clone
GET /pocket-tts/voices/clone/{cloneId}/download
Download a voice clone as a ZIP file containing the reference audio and metadata. Useful for backup or transferring clones between servers.
Response
Returns a ZIP file (Content-Type: application/zip) containing:
- reference.wav: The reference audio file
- metadata.json: Clone metadata (clone ID, source audio filename)
cURL Example
curl https://voice.sogni.ai/pocket-tts/voices/clone/my_voice/download \
--output my_voice.zip
JavaScript Example
// Fetch the ZIP archive for the clone
const response = await fetch(
  'https://voice.sogni.ai/pocket-tts/voices/clone/my_voice/download'
);
const blob = await response.blob();

// Trigger a browser download through a temporary object URL
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = 'my_voice.zip';
a.click();
Python Example
import requests

response = requests.get(
    'https://voice.sogni.ai/pocket-tts/voices/clone/my_voice/download'
)
# Stop here on an HTTP error instead of saving an error body as a ZIP
response.raise_for_status()

with open('my_voice.zip', 'wb') as f:
    f.write(response.content)
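Before storing a backup or importing it on another server, it can be worth confirming that the archive has the documented contents. The sketch below uses Python's standard `zipfile` module. The helper name `inspect_clone_archive` is hypothetical, it assumes both files sit at the root of the archive, and the metadata key in the demo is made up because the real field names are not listed here.

```python
import io
import json
import zipfile


def inspect_clone_archive(zip_bytes):
    """Require reference.wav and return parsed metadata.json, or None if absent."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = set(archive.namelist())
        if "reference.wav" not in names:
            raise ValueError("archive is missing reference.wav")
        if "metadata.json" not in names:
            return None
        return json.loads(archive.read("metadata.json"))


# Build a stand-in archive in memory. The "cloneId" key is hypothetical.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("reference.wav", b"RIFF")
    archive.writestr("metadata.json", json.dumps({"cloneId": "my_voice"}))

print(inspect_clone_archive(buffer.getvalue()))  # {'cloneId': 'my_voice'}
```

With a real download, the same function could be called on `response.content` before writing the file to disk.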
Import Voice Clone
POST /pocket-tts/voices/clone/import
Import a previously exported voice clone from a ZIP file. The ZIP must contain a reference.wav file.
| Parameter | Type | Description |
|---|---|---|
| file * | File | ZIP file containing the voice clone |
| cloneId (optional) | string | Custom name for the imported clone. If omitted, uses the name from metadata. |
cURL Example
curl -X POST https://voice.sogni.ai/pocket-tts/voices/clone/import \
-F "file=@my_voice.zip" \
-F "cloneId=restored_voice"
Response
{
"success": true,
"cloneId": "restored_voice",
"message": "Voice clone imported successfully"
}
JavaScript Example
// zipFile is a File or Blob holding the exported archive,
// for example the selected file from an <input type="file"> element
const formData = new FormData();
formData.append('file', zipFile);
formData.append('cloneId', 'my_imported_voice');

const response = await fetch('https://voice.sogni.ai/pocket-tts/voices/clone/import', {
  method: 'POST',
  body: formData
});
const data = await response.json();
console.log('Imported clone:', data.cloneId);
Python Example
import requests

# Upload the exported archive as multipart form data
with open('my_voice.zip', 'rb') as f:
    response = requests.post(
        'https://voice.sogni.ai/pocket-tts/voices/clone/import',
        files={'file': f},
        data={'cloneId': 'restored_voice'}
    )

data = response.json()
print(f"Imported clone: {data['cloneId']}")