Add user-configurable TTS pronunciation overrides

A JSON map (TTS_PRONUNCIATIONS_PATH, default tts_pronunciations.json)
rewrites mispronounced words (place names, initialisms, dotted
abbreviations) to phonetic spellings before synthesis. It is applied
after markdown cleanup in both /tts/speech paths.

Matching is whole-word and smartcase: lowercase keys match any casing,
uppercase keys match exactly. When several keys match, the longest
wins. The file is hot-reloaded when its mtime changes; on a parse
error the last good map stays in effect.

See tts_pronunciations.example.json.
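
The matching rules described above could be sketched roughly as follows.
This is an illustration only, not the code added by this commit: the
function name apply_overrides, the helper is_word_char, the definition of
a word boundary, and the ASCII-only case folding are all assumptions made
for the example.

```rust
use std::collections::HashMap;

/// Assumption: a "word" character is alphanumeric or an underscore.
fn is_word_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// Hypothetical stand-in for the override pass (not the commit's code):
/// whole-word, smartcase, longest-key-wins replacement over plain text.
fn apply_overrides(text: &str, overrides: &HashMap<String, String>) -> String {
    // Longest keys first, so "st. louis" is tried before "st.".
    let mut keys: Vec<&String> = overrides.keys().filter(|k| !k.is_empty()).collect();
    keys.sort_by(|a, b| b.len().cmp(&a.len()).then_with(|| a.cmp(b)));

    let mut out = String::with_capacity(text.len());
    let mut pos = 0; // byte offset into `text`
    let mut prev: Option<char> = None; // last character consumed from `text`

    while pos < text.len() {
        let rest = &text[pos..];
        // Left boundary: start of text, or the previous char is not a word char.
        let left_ok = prev.map_or(true, |c| !is_word_char(c));

        let hit = if left_ok {
            keys.iter().copied().find(|key| {
                // `get` yields None if the range is too long or splits a character.
                let candidate = match rest.get(..key.len()) {
                    Some(c) => c,
                    None => return false,
                };
                // Smartcase: any uppercase letter in the key forces an exact match.
                // Otherwise compare case-insensitively (ASCII folding only).
                let same = if key.chars().any(char::is_uppercase) {
                    candidate == key.as_str()
                } else {
                    candidate.eq_ignore_ascii_case(key.as_str())
                };
                // Right boundary: end of text, or the next char is not a word char.
                same && rest[key.len()..]
                    .chars()
                    .next()
                    .map_or(true, |c| !is_word_char(c))
            })
        } else {
            None
        };

        match hit {
            Some(key) => {
                out.push_str(&overrides[key]);
                prev = rest[..key.len()].chars().last();
                pos += key.len();
            }
            None => {
                // `rest` is non-empty here because pos < text.len().
                let c = rest.chars().next().expect("rest is non-empty");
                out.push(c);
                prev = Some(c);
                pos += c.len_utf8();
            }
        }
    }
    out
}
```

With a map containing the made-up entries "wsl" -> "W S L" and
"US" -> "U S", this sketch turns "Use WSL, not wsl2." into
"Use W S L, not wsl2." and leaves the lowercase "us" in "join us" alone.
Checking boundaries by hand rather than with a regex \b lets keys that end
in punctuation, such as "st.", still match as whole words.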

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Cameron Cordes
2026-06-11 23:06:18 -04:00
parent 3fa4fa8501
commit 2e0f78aa1b
7 changed files with 319 additions and 3 deletions
+10 -2
@@ -254,6 +254,14 @@ fn clean_for_tts(input: &str) -> String {
s.trim().to_string()
}
+/// Full text-preparation pipeline for synthesis: markdown/emoji cleanup, then
+/// the user's pronunciation overrides (see [`crate::ai::pronunciation`]) on
+/// the resulting plain text — after cleanup so word boundaries aren't
+/// obscured by `**WSL**`-style markup.
+fn prepare_for_tts(input: &str) -> String {
+    crate::ai::pronunciation::apply_pronunciations(&clean_for_tts(input))
+}
/// Decode an audio/video file to mono 24 kHz WAV via ffmpeg, returning the WAV
/// bytes. Chatterbox validates the reference clip by file *extension* and
/// rejects several formats (e.g. `.aac`, `.opus`), so we always normalize to
@@ -337,7 +345,7 @@ pub async fn tts_speech_handler(
let parent_context = extract_context_from_request(&http_request);
let mut span = global_tracer().start_with_context("http.tts.speech", &parent_context);
-let text = clean_for_tts(&req.text);
+let text = prepare_for_tts(&req.text);
if text.is_empty() {
span.set_status(Status::error("text is required"));
return HttpResponse::BadRequest().json(json!({ "error": "text is required" }));
@@ -435,7 +443,7 @@ pub async fn create_speech_job_handler(
let mut span =
global_tracer().start_with_context("http.tts.speech_job.create", &parent_context);
-let text = clean_for_tts(&req.text);
+let text = prepare_for_tts(&req.text);
if text.is_empty() {
span.set_status(Status::error("text is required"));
return HttpResponse::BadRequest().json(json!({ "error": "text is required" }));