Vision & Language Services | AWS AI Practitioner | AI / ML

AWS provides pre-built vision and language AI services that require no ML expertise. You call an API with your image or text and receive structured insights. These are ideal for adding AI capabilities to existing applications without building models from scratch.
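
As a rough illustration of the "call an API, receive structured insights" pattern, here is a minimal Python sketch built around Rekognition's `detect_labels` operation. The helper `top_labels`, the stand-in `FakeRekognition` class, and the bucket and key names are hypothetical and exist only so the example runs without AWS credentials. In real use you would pass a client created with `boto3.client("rekognition")`.

```python
from __future__ import annotations

from typing import Any


def top_labels(
    client: Any, bucket: str, key: str, min_confidence: float = 80.0
) -> list[tuple[str, float]]:
    """Return (label, confidence) pairs for an image stored in S3."""
    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=10,
        MinConfidence=min_confidence,
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]


class FakeRekognition:
    """Hypothetical stand-in that returns a canned response (no AWS call)."""

    def __init__(self) -> None:
        self.last_request: dict[str, Any] = {}

    def detect_labels(self, **kwargs: Any) -> dict[str, Any]:
        self.last_request = kwargs
        return {
            "Labels": [
                {"Name": "Dog", "Confidence": 98.5},
                {"Name": "Grass", "Confidence": 91.2},
            ]
        }


if __name__ == "__main__":
    # Assumption: swap FakeRekognition() for boto3.client("rekognition")
    # once credentials and a real S3 object are available.
    for name, confidence in top_labels(FakeRekognition(), "example-bucket", "photos/dog.jpg"):
        print(f"{name}: {confidence:.1f}%")
```

The function takes the client as a parameter, so the same code works against the real service or a test double.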

Key Points

Amazon Rekognition: detect objects, scenes, faces, text, celebrities, inappropriate content in images/video
Rekognition Video: analyse streaming video in near real time (via Kinesis Video Streams), or run asynchronous batch analysis on video stored in S3
Rekognition Custom Labels: train a custom image classifier with your own labelled images (a few hundred images or fewer is often enough)
Amazon Textract: go beyond OCR — extract text AND structure (tables, forms, key-value pairs) from PDFs/images
Textract Queries: ask questions about a document in natural language ("What is the invoice total?")
Amazon Comprehend: natural language processing (NLP) that detects sentiment, entities, key phrases, dominant language, and PII, and supports custom classification models
Comprehend Medical: clinical NLP — extract diagnoses, medications, dosages, conditions from medical text
Amazon Kendra: intelligent enterprise search; index sources such as SharePoint, S3, and Salesforce through 40+ connectors and rank results using natural-language understanding
Amazon Translate: neural machine translation; Custom Terminology for brand-specific terms
Amazon Macie: uses ML and pattern matching to automatically discover and classify sensitive data, such as PII, in S3 so it can be protected
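
To make the Textract Queries point above more concrete, the following Python sketch sends a natural-language question through `analyze_document` with the `QUERIES` feature type, then pairs each `QUERY` block with its `QUERY_RESULT` block. The helper `ask_document`, the `FakeTextract` stand-in, the alias `INVOICE_TOTAL`, and the canned answer are all hypothetical. A real call would use `boto3.client("textract")`.

```python
from __future__ import annotations

from typing import Any


def ask_document(
    client: Any, bucket: str, key: str, questions: dict[str, str]
) -> dict[str, str]:
    """Ask questions about a document in S3; return {alias: answer text}."""
    response = client.analyze_document(
        Document={"S3Object": {"Bucket": bucket, "Name": key}},
        FeatureTypes=["QUERIES"],
        QueriesConfig={
            "Queries": [{"Text": text, "Alias": alias} for alias, text in questions.items()]
        },
    )
    blocks_by_id = {block["Id"]: block for block in response["Blocks"]}
    answers: dict[str, str] = {}
    for block in response["Blocks"]:
        if block["BlockType"] != "QUERY":
            continue
        # Each QUERY block points at its answer via an ANSWER relationship.
        for relationship in block.get("Relationships", []):
            if relationship["Type"] == "ANSWER":
                for answer_id in relationship["Ids"]:
                    answers[block["Query"]["Alias"]] = blocks_by_id[answer_id]["Text"]
    return answers


class FakeTextract:
    """Hypothetical stand-in that returns a canned response (no AWS call)."""

    def analyze_document(self, **kwargs: Any) -> dict[str, Any]:
        return {
            "Blocks": [
                {
                    "Id": "q1",
                    "BlockType": "QUERY",
                    "Query": {"Text": "What is the invoice total?", "Alias": "INVOICE_TOTAL"},
                    "Relationships": [{"Type": "ANSWER", "Ids": ["a1"]}],
                },
                {"Id": "a1", "BlockType": "QUERY_RESULT", "Text": "$1,250.00", "Confidence": 97.0},
            ]
        }


if __name__ == "__main__":
    questions = {"INVOICE_TOTAL": "What is the invoice total?"}
    print(ask_document(FakeTextract(), "example-bucket", "invoices/0001.png", questions))
```

A question with no `ANSWER` relationship is simply absent from the returned dictionary, so callers can distinguish "not found" from an empty string.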

Service	Input	Output	Exam Tip
Rekognition	Image / Video	Labels, faces, text, moderation flags	Use for content moderation, identity verification
Textract	PDF / Image	Structured text, tables, forms, key-value pairs	"Structured" extraction = Textract, not Rekognition
Comprehend	Text (any)	Sentiment, entities, key phrases, PII	Comprehend = NLP on text; not for images
Kendra	Enterprise docs	Relevant passages with NLP ranking	Intelligent search, not raw full-text search
Translate	Text in language A	Text in language B	Custom Terminology preserves brand names
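
The Comprehend and Translate rows of the table can be sketched together in Python: text goes in, a sentiment label and a translated string come out. The example uses the `detect_sentiment` and `translate_text` operations. The helper `analyse_and_translate`, the fake clients, the terminology name `brand-terms`, and the sample text are hypothetical. Real clients would come from `boto3.client("comprehend")` and `boto3.client("translate")`.

```python
from __future__ import annotations

from typing import Any


def analyse_and_translate(
    comprehend: Any,
    translate: Any,
    text: str,
    target_language: str = "es",
    terminology_names: list[str] | None = None,
) -> dict[str, str]:
    """Detect sentiment of English text, then translate it."""
    sentiment = comprehend.detect_sentiment(Text=text, LanguageCode="en")
    request: dict[str, Any] = {
        "Text": text,
        "SourceLanguageCode": "en",
        "TargetLanguageCode": target_language,
    }
    if terminology_names:
        # Custom Terminology: names of terminology resources already
        # uploaded to Amazon Translate (hypothetical in this sketch).
        request["TerminologyNames"] = terminology_names
    translated = translate.translate_text(**request)
    return {
        "sentiment": sentiment["Sentiment"],
        "translation": translated["TranslatedText"],
    }


class FakeComprehend:
    """Hypothetical stand-in for the Comprehend client."""

    def detect_sentiment(self, **kwargs: Any) -> dict[str, Any]:
        return {"Sentiment": "POSITIVE", "SentimentScore": {"Positive": 0.98}}


class FakeTranslate:
    """Hypothetical stand-in for the Translate client."""

    def __init__(self) -> None:
        self.last_request: dict[str, Any] = {}

    def translate_text(self, **kwargs: Any) -> dict[str, Any]:
        self.last_request = kwargs
        return {
            "TranslatedText": "Me encanta AnyCompany Cloud",
            "SourceLanguageCode": kwargs["SourceLanguageCode"],
            "TargetLanguageCode": kwargs["TargetLanguageCode"],
        }


if __name__ == "__main__":
    result = analyse_and_translate(
        FakeComprehend(), FakeTranslate(), "I love AnyCompany Cloud",
        terminology_names=["brand-terms"],
    )
    print(result)
```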

Real-World Example

The NHS uses Amazon Textract to digitise millions of paper patient records, extracting structured medical data that feeds into analytics. This replaced a manual data-entry process that took weeks and introduced transcription errors.
