AWS provides pre-built vision and language AI services that require no ML expertise. You call an API with your image or text and receive structured insights. These are ideal for adding AI capabilities to existing applications without building models from scratch.
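As a rough illustration of the call-an-API pattern described above, the sketch below sends an S3-hosted image to Rekognition's DetectLabels operation and keeps the high-confidence label names. It assumes `client` is a boto3 Rekognition client (`boto3.client("rekognition")`); the helper names, the bucket and key, and the 80% threshold are hypothetical choices for illustration, not anything the service prescribes.

```python
from typing import Any


def top_labels(response: dict[str, Any], min_confidence: float = 80.0) -> list[str]:
    """Return label names from a DetectLabels response, highest confidence first."""
    scored = [
        (label["Confidence"], label["Name"])
        for label in response.get("Labels", [])
        if label["Confidence"] >= min_confidence  # Rekognition reports 0-100
    ]
    return [name for _, name in sorted(scored, reverse=True)]


def detect_labels_in_s3(client: Any, bucket: str, key: str) -> list[str]:
    """Ask Rekognition to label an image that is already stored in S3."""
    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=10,
        MinConfidence=50.0,
    )
    return top_labels(response)


# Usage (needs AWS credentials; bucket and key are placeholders):
#   import boto3
#   labels = detect_labels_in_s3(boto3.client("rekognition"),
#                                "example-bucket", "photos/street.jpg")
```

Passing the client in as a parameter keeps the parsing logic easy to exercise with a stub, without touching AWS.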

Key Points

  • Amazon Rekognition: detect objects, scenes, faces, text, celebrities, inappropriate content in images/video
  • Rekognition Video: near-real-time analysis of streaming video (via Kinesis Video Streams), or asynchronous batch analysis of video stored in S3
  • Rekognition Custom Labels: train a custom image classifier with your own labelled images (a few hundred images is often enough)
  • Amazon Textract: go beyond OCR — extract text AND structure (tables, forms, key-value pairs) from PDFs/images
  • Textract Queries: ask questions about a document in natural language ("What is the invoice total?")
  • Amazon Comprehend: NLP service that detects sentiment, entities, key phrases, language, and PII, and supports training custom classifiers
  • Comprehend Medical: clinical NLP — extract diagnoses, medications, dosages, conditions from medical text
  • Amazon Kendra: intelligent enterprise search; connects to sources such as SharePoint, S3, and Salesforce through 40+ connectors and ranks results with NLP
  • Amazon Translate: neural machine translation; Custom Terminology for brand-specific terms
  • Amazon Macie: uses ML to automatically discover, classify, and help protect sensitive data (such as PII) in S3
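The Textract Queries item in the list above can be sketched as follows: a question is sent with the document, and the answer comes back as a QUERY_RESULT block linked to its QUERY block. This assumes `client` is a boto3 Textract client (`boto3.client("textract")`) and that the document is a single-page file in S3; the function names and the bucket and key are hypothetical.

```python
from typing import Any, Optional


def extract_query_answers(response: dict[str, Any]) -> dict[str, Optional[str]]:
    """Map each question to its first answer text (None when Textract found none)."""
    blocks = {block["Id"]: block for block in response.get("Blocks", [])}
    answers: dict[str, Optional[str]] = {}
    for block in blocks.values():
        if block["BlockType"] != "QUERY":
            continue
        # A QUERY block points at its QUERY_RESULT blocks via ANSWER relationships.
        answer_ids = [
            block_id
            for rel in block.get("Relationships", [])
            if rel["Type"] == "ANSWER"
            for block_id in rel["Ids"]
        ]
        texts = [blocks[i]["Text"] for i in answer_ids if i in blocks]
        answers[block["Query"]["Text"]] = texts[0] if texts else None
    return answers


def ask_document(client: Any, bucket: str, key: str,
                 questions: list[str]) -> dict[str, Optional[str]]:
    """Run AnalyzeDocument with the QUERIES feature on a document in S3."""
    response = client.analyze_document(
        Document={"S3Object": {"Bucket": bucket, "Name": key}},
        FeatureTypes=["QUERIES"],
        QueriesConfig={"Queries": [{"Text": q} for q in questions]},
    )
    return extract_query_answers(response)


# Usage (needs AWS credentials; bucket and key are placeholders):
#   import boto3
#   ask_document(boto3.client("textract"), "example-bucket",
#                "invoices/0001.png", ["What is the invoice total?"])
```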

Service     | Input              | Output                                          | Exam Tip
------------|--------------------|-------------------------------------------------|----------------------------------------------------
Rekognition | Image / Video      | Labels, faces, text, moderation flags           | Use for content moderation, identity verification
Textract    | PDF / Image        | Structured text, tables, forms, key-value pairs | "Structured" extraction = Textract, not Rekognition
Comprehend  | Text (any)         | Sentiment, entities, key phrases, PII           | Comprehend = NLP on text; not for images
Kendra      | Enterprise docs    | Relevant passages with NLP ranking              | Intelligent search, not raw full-text search
Translate   | Text in language A | Text in language B                              | Custom Terminology preserves brand names
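
To make the Comprehend row concrete, here is a minimal sketch that takes plain text in and returns a sentiment label plus a PII-redacted copy. It assumes `client` is a boto3 Comprehend client (`boto3.client("comprehend")`); the helper names and the 0.8 score cutoff are hypothetical choices for illustration.

```python
from typing import Any


def redact_pii(text: str, entities: list[dict[str, Any]],
               min_score: float = 0.8) -> str:
    """Replace each detected PII span with its entity type, e.g. [EMAIL]."""
    # Work from the end of the string so earlier offsets stay valid.
    for entity in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        if entity["Score"] >= min_score:  # Comprehend scores range from 0 to 1
            text = (text[: entity["BeginOffset"]]
                    + f"[{entity['Type']}]"
                    + text[entity["EndOffset"]:])
    return text


def analyse_text(client: Any, text: str, language_code: str = "en") -> dict[str, str]:
    """Return the overall sentiment and a PII-redacted copy of the text."""
    sentiment = client.detect_sentiment(Text=text, LanguageCode=language_code)
    pii = client.detect_pii_entities(Text=text, LanguageCode=language_code)
    return {
        "sentiment": sentiment["Sentiment"],
        "redacted": redact_pii(text, pii["Entities"]),
    }


# Usage (needs AWS credentials):
#   import boto3
#   analyse_text(boto3.client("comprehend"), "Contact jane@example.com now")
```

The same input could not be an image: extracting the text first (for example with Textract) and then passing it to Comprehend is the usual division of labour the table describes.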

Real-World Example

The NHS uses Amazon Textract to digitise millions of paper patient records, extracting structured medical data that feeds into analytics. This replaced a manual data-entry process that took weeks and introduced transcription errors.