Text-to-Speech
TTS
ELI5 — The Vibe Check
Text-to-Speech takes written words and reads them out loud with a computer voice. Old TTS sounded like a robot reading a phone book. New TTS sounds so human you'd swear it's a real person. It can even mimic different voices, emotions, and accents. The uncanny valley just keeps getting shallower.
Real Talk
Text-to-Speech (TTS) systems convert written text into spoken audio. Modern neural TTS models and services (like Bark, Tortoise, XTTS, ElevenLabs, and OpenAI's TTS) produce highly natural speech with controllable prosody, emotion, and speaker characteristics. Under the hood, they typically use transformer-based or diffusion-based architectures trained on large speech datasets.
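To make "text in, audio out" concrete, here is a minimal sketch of what calling a hosted TTS service can look like, using only the Python standard library. It assumes OpenAI's speech endpoint and its `model`, `voice`, `input`, and `response_format` fields; treat those names as assumptions and check the current API docs before relying on them. The helper names `build_speech_request` and `synthesize_to_file` are made up for this example.

```python
import json
import urllib.request

# Assumed endpoint and field names, based on OpenAI's public speech API.
# Verify against the current documentation before relying on them.
SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, api_key, voice="alloy", model="tts-1"):
    """Build (but do not send) an HTTP request asking the service to speak `text`."""
    payload = {
        "model": model,            # which TTS model to use
        "voice": voice,            # preset voice, i.e. the speaker characteristics
        "input": text,             # the written text to turn into audio
        "response_format": "mp3",  # audio format of the response body
    }
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(text, out_path, api_key):
    """Send the request and write the returned audio bytes to `out_path`."""
    request = build_speech_request(text, api_key)
    with urllib.request.urlopen(request) as response:
        audio_bytes = response.read()
    with open(out_path, "wb") as audio_file:
        audio_file.write(audio_bytes)


# Example call (needs network access and a real API key):
# synthesize_to_file("Hello from text-to-speech.", "hello.mp3", "YOUR_API_KEY")
```

Building the request is kept separate from sending it, so you can inspect the payload without spending API credits. Open-source models like Bark or XTTS run locally instead and have their own Python interfaces, which this sketch does not cover.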
When You'll Hear This
"We added TTS to the app for accessibility." / "The AI voice sounds so real it's honestly creepy."
Related Terms
Accessibility (a11y)
Accessibility (a11y) is making your website usable by everyone, including people who use screen readers, rely on keyboard-only navigation, or have low vision.
AI (Artificial Intelligence)
AI is when you teach a computer to do stuff that normally needs a human brain — like recognizing cats, translating languages, or writing code for you.
NLP (Natural Language Processing)
NLP is the branch of AI that deals with human language — reading it, writing it, translating it, summarizing it.
Whisper
Whisper is OpenAI's speech recognition model — it listens to audio and writes down what was said.