Speech & Conversational AI
Transcribe, Polly, Lex, Connect AI features
AWS speech AI services convert audio to text and text to audio, while conversational AI services enable building chatbots and intelligent contact centres. Amazon Connect integrates these capabilities for enterprise-grade customer service AI.
Key Points
- Amazon Transcribe: speech-to-text; supports real-time streaming and batch; speaker diarisation, custom vocabulary
- Transcribe Medical: clinical audio transcription — medical terms, physician-patient conversations
- Transcribe Call Analytics: analyse call centre recordings — sentiment, talk time, interruptions, action items
- Amazon Polly: text-to-speech with 60+ voices in 30+ languages; SSML for pronunciation control; Neural TTS
- Amazon Lex: conversational AI (NLU) with intents, slots, and fulfilment via Lambda; built on the same technology as Amazon Alexa (it does not power Alexa itself)
- Lex Automated Chatbot Designer: generates an initial Lex bot design (intents and slot types) from existing conversation transcripts
- Amazon Connect: cloud contact centre; integrates Lex (bots), Transcribe (call transcription), Comprehend (sentiment)
- Amazon Q in Connect: real-time agent assistance — surfaces relevant knowledge during live calls
- Amazon Chime SDK: add voice/video to applications; includes Amazon Voice Focus, an ML-based noise suppression feature
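
To make the Transcribe and Polly points above concrete, here is a minimal Python sketch that builds the request parameters for a batch transcription job (speaker diarisation plus an optional custom vocabulary) and for a Neural TTS call using SSML. The parameter names follow the boto3 `start_transcription_job` and `synthesize_speech` calls; the job name, S3 URI, vocabulary name, and the `Amy` voice are illustrative assumptions, not values from these notes. The helper functions themselves are hypothetical and need no AWS access; only `run_demo` calls AWS.

```python
from xml.sax.saxutils import escape


def build_transcription_request(job_name, media_uri, language_code="en-GB",
                                max_speakers=2, vocabulary_name=None):
    """Build keyword arguments for a batch Transcribe job."""
    settings = {
        "ShowSpeakerLabels": True,         # turn on speaker diarisation
        "MaxSpeakerLabels": max_speakers,  # needed when speaker labels are on
    }
    if vocabulary_name:
        # Name of a custom vocabulary assumed to exist already in the account
        settings["VocabularyName"] = vocabulary_name
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": media_uri},
        "Settings": settings,
    }


def build_polly_request(text, voice_id="Amy", pause_ms=300):
    """Build keyword arguments for a Polly Neural TTS call using SSML."""
    # Escape &, < and > so plain text cannot break the SSML markup
    ssml = f'<speak>{escape(text)}<break time="{pause_ms}ms"/></speak>'
    return {
        "Text": ssml,
        "TextType": "ssml",
        "VoiceId": voice_id,
        "Engine": "neural",
        "OutputFormat": "mp3",
    }


def run_demo():
    """Send both requests to AWS. Needs boto3, credentials, and real resources."""
    import boto3  # imported lazily so the builders work without the SDK

    transcribe = boto3.client("transcribe")
    transcribe.start_transcription_job(
        **build_transcription_request("demo-job", "s3://example-bucket/call.wav")
    )

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**build_polly_request("Your payment is due."))
    with open("reply.mp3", "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
```

The direction of each service shows up in the payloads: Transcribe receives a media location and returns text, while Polly receives text (here wrapped in SSML) and returns an audio stream.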
| Service | Direction | Key Feature | Exam Trap |
|---|---|---|---|
| Transcribe | Audio → Text | Speaker diarisation, custom vocabulary | Not Polly — Polly is text→audio |
| Polly | Text → Audio | Neural TTS, SSML, 60+ voices | Cannot transcribe — only synthesise |
| Lex | Text/Voice → Intent | Slots, fulfilment, multi-turn | Lex is NLU for dialogue; Kendra is ML-powered enterprise search |
| Connect | Contact centre | Integrates all speech services | Connect orchestrates; services do the work |
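
The Lex row in the table above (text or voice in, intent and slots out) can be illustrated with a short Python sketch that reduces a Lex V2 runtime response to the recognised intent, the filled slots, and the next dialogue action. The response shape follows the Lex V2 `RecognizeText` API as exposed by the boto3 `lexv2-runtime` client; the `CheckBalance` intent, the slot names, and the sample response are hand-written assumptions for illustration, and both function names are hypothetical.

```python
def summarise_lex_response(response):
    """Reduce a Lex V2 runtime response to intent, slots, and next action."""
    session_state = response.get("sessionState") or {}
    intent = session_state.get("intent") or {}

    slots = {}
    for name, slot in (intent.get("slots") or {}).items():
        # Slots that have not been filled yet come back as None
        if slot and slot.get("value"):
            slots[name] = slot["value"].get("interpretedValue")
        else:
            slots[name] = None

    dialog_action = session_state.get("dialogAction") or {}
    return {
        "intent": intent.get("name"),
        "state": intent.get("state"),
        "slots": slots,
        "next_action": dialog_action.get("type"),
    }


def send_text_to_bot(bot_id, bot_alias_id, session_id, text, locale_id="en_GB"):
    """Send one user utterance to a Lex V2 bot. Needs boto3 and a deployed bot."""
    import boto3  # imported lazily so the parser works without the SDK

    client = boto3.client("lexv2-runtime")
    response = client.recognize_text(
        botId=bot_id,
        botAliasId=bot_alias_id,
        localeId=locale_id,
        sessionId=session_id,  # reusing the same ID keeps multi-turn context
        text=text,
    )
    return summarise_lex_response(response)


# Hand-written sample in the shape Lex V2 returns mid-conversation (assumed values)
SAMPLE_RESPONSE = {
    "sessionState": {
        "dialogAction": {"type": "ElicitSlot", "slotToElicit": "AccountType"},
        "intent": {
            "name": "CheckBalance",
            "state": "InProgress",
            "slots": {
                "AccountType": None,
                "Last4Digits": {
                    "value": {
                        "originalValue": "1234",
                        "interpretedValue": "1234",
                        "resolvedValues": ["1234"],
                    }
                },
            },
        },
    }
}

summary = summarise_lex_response(SAMPLE_RESPONSE)
```

In this sample the bot has matched the intent but still needs one slot, so the next action is `ElicitSlot`; once every slot is filled, fulfilment (for example a Lambda function) would typically run.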
Real-World Example
Capital One uses Amazon Lex in its voice IVR to handle millions of calls per month: customers can check balances, make payments, and dispute charges by voice, without reaching a human agent. Transcribe Call Analytics reviews 100% of agent calls for quality assurance.