[AIF-C01]What is the AWS Certified AI Practitioner Exam

The AWS Certified AI Practitioner Exam is a foundational-level certification introduced by AWS to validate a person’s understanding of Artificial Intelligence (AI), Machine Learning (ML), and Generative AI (GenAI)—specifically in the context of AWS cloud services.

It’s not meant to turn you into a hardcore ML scientist. Instead, it ensures you:

  • Understand AI/ML concepts, terminology, and responsible AI practices

  • Know how AWS AI services like Amazon SageMaker, Bedrock, Rekognition, Lex, Polly, and Comprehend are used

  • Can identify real-world AI/ML/GenAI use cases and map them to AWS solutions

  • Speak the language of AI in business and technology contexts, even if you’re not writing production ML code

Key Facts About the Exam:

  • Level: Foundational (no deep coding/math required)

  • Format: Multiple-choice, multiple-response questions

  • Duration: 100 minutes

  • Delivery: Online proctoring or at a Pearson VUE test center

  • Cost: $75 

  • Language Availability: English and Few other

[Cheet Sheet]the classic “one last look before battle”

1. AI vs ML vs DL

  • AI: Any system that mimics human intelligence.

  • ML: AI that learns from data.

  • DL: ML using neural networks with multiple layers.


2. Types of ML

  • Supervised Learning → You have labeled data (inputs + correct outputs).

    • Goal: Learn mapping from inputs → outputs.

    • Example: Predict house price, classify spam/not spam.

  • Unsupervised Learning → You only have inputs, no labels.

    • Goal: Find hidden patterns.

    • Example: Customer segmentation, anomaly detection.

  • Reinforcement Learning (RL) → Agent learns by interacting with environment (reward/penalty).

    • Example: Game AI, robots, self-driving cars.

2.1.  Supervised Learning Algorithms

  • Linear Regression → Predicts a number (continuous value).

    • Example: Predict house price.

  • Logistic Regression → Predicts binary outcomes (yes/no, spam/not spam).

  • Decision Trees → Splits data using rules (“if X then Y”).

  • Random Forest → Many trees vote → more accurate, less overfitting.

  • Support Vector Machine (SVM) → Finds best boundary between classes.

  • Naïve Bayes → Probabilistic model, good for text classification.

  • Neural Networks → Layers of nodes, can capture complex relationships.

2.2.  Unsupervised Learning Algorithms

  • Clustering (K-means, Hierarchical) → Groups similar data.

    • Example: Group customers into buyer types.

  • Dimensionality Reduction (PCA, t-SNE) → Reduces features while keeping information.

    • Example: Compressing data for visualization.

2.3.  Reinforcement Learning

  • Agent, Environment, Reward system.

  • Tries actions → gets feedback → learns best strategy.

  • Example: AlphaGo beating humans in Go.

3. Key AWS AI Services (Know their use cases!)

  • Amazon SageMaker – Build, train, deploy ML models.

  • Amazon Bedrock – Build GenAI apps with foundation models (no infra).

  • Amazon Polly – Text → Speech.

  • Amazon Transcribe – Speech → Text.

  • Amazon Comprehend – NLP: sentiment, key phrases, entities.

  • Amazon Translate – Language translation.

  • Amazon Rekognition – Image/video analysis (faces, labels, unsafe content).

  • Amazon Lex – Chatbots (voice/text).

  • Amazon Textract – Extract text & data from docs (including tables/forms).

  • Amazon Kendra – Enterprise search.

  • Amazon Personalize – Recommendations engine.

  • Amazon Forecast – Time-series forecasting.

  • Amazon CodeWhisperer – AI code assistant.


4. ML Lifecycle (SageMaker focus)

  1. Data Prep (Ground Truth, S3).

  2. Train (built-in algos, custom, marketplace).

  3. Tune (hyperparameter optimization).

  4. Deploy (endpoints, multi-model, A/B).

  5. Monitor (bias detection, drift).


5. Ethics & Responsible AI

  • Fairness (avoid bias).

  • Explainability (know why the model predicts X).

  • Privacy & Security (encrypt, IAM roles, KMS).

  • Sustainability (optimize compute).


6. GenAI Basics

  • Foundation Models (FMs): Pre-trained, huge datasets, reusable.

  • Prompt Engineering: Clear instructions → better responses.

  • RAG: Retrieval-Augmented Generation = fetch relevant data + FM.

  • Bedrock Features: Guardrails, Agents, Knowledge Bases.

Core Features

  • Foundation Models (FMs)

    • Models from providers like Anthropic (Claude), AI21, Cohere, Stability AI, Amazon Titan, Meta Llama, Mistral etc.

    • You don’t need to fine-tune from scratch—just use or customize.

  • Model Choice & Abstraction

    • Unified API → switch between models easily (e.g., from Claude to Titan).

    • Reduces vendor lock-in.

  • Model Customization

    • Fine-tuning → Adjust FM with your labeled data.

    • Retrieval-Augmented Generation (RAG) → Bring in private data without retraining.

  • Security & Compliance

    • Data not used to train models (your data is safe).

    • VPC support, encryption, IAM integration.

  • Integrations

    • Works with LangChain, LlamaIndex, Agents, AWS SDKs.

    • Integrated with SageMaker (if you need custom ML).

  • Evaluation Tools

    • Guardrails: Filter harmful content, ensure safe responses.

    • Model Evaluation: Test models for performance and bias.

Key Parameters in Bedrock and GENAI

  • modelId → Choose which FM to use (e.g., anthropic.claude-v2, amazon.titan-text).

  • inputText / prompt → Your query or instruction.

  • maxTokens → Max output length.

  • temperature → Controls creativity/randomness (low = factual, high = creative).

  • topP → Nucleus sampling (another way to control randomness).

  • stopSequences → Define where the model should stop generating.

  • streaming → Stream results back instead of waiting for completion.


7. ML Concepts

7.1.  Overfitting vs. Underfitting

  • Overfitting → Model too complex → memorizes training data, fails on new data.

  • Underfitting → Model too simple → can’t capture patterns.

  • Fixes: Regularization, more training data, cross-validation.

    Good Model= Low on trining data. Low on new data

7.2.  Bias vs. Variance

  • Bias → Error due to assumptions (model too simple).

  • Variance → Error due to sensitivity (model too complex).

  • Good model = balance bias & variance.

  • Bias vs Variance → Bias = wrong assumptions, Variance = too sensitive.

  • Confusion Matrix → Accuracy, Precision, Recall, F1.


8. Security + Deployment

  • Data in S3: Encrypt with KMS.

  • IAM: Principle of least privilege.

  • Endpoint Security: VPC, PrivateLink.

  • Data Governance: PI/PII handling, anonymization.

Quick tip: In AWS AI exams, they love asking “Which ML algorithm should be used for X?”. Always map:

  • Prediction of number → Regression

  • Yes/No classification → Logistic Regression / Decision Tree

  • Grouping without labels → Clustering (K-means)

  • Text classification → Naïve Bayes

  • Complex patterns (images, speech) → Neural Networks / Deep Learning

[algorithm]Problem → Algorithm → AWS Service

1. Classification Problems

  • Problem: Is this X or Y? (binary) / Which category does it belong to? (multi-class)

  • Algorithms: Logistic Regression, Decision Trees, Random Forest, XGBoost, Neural Networks

  • AWS Services:

    • Amazon SageMaker (built-in algorithms, Autopilot)

    • Amazon Rekognition (image classification, moderation)

    • Amazon Comprehend (text classification, sentiment analysis)

2. Regression Problems

  • Problem: Predict a number (continuous value). E.g., house price, demand forecasting.

  • Algorithms: Linear Regression, Polynomial Regression, Gradient Boosted Trees

  • AWS Services:

    • Amazon Forecast (time-series regression)

    • Amazon SageMaker (Linear Learner, XGBoost)

3. Clustering Problems

  • Problem: Group similar things when no labels exist (unsupervised). E.g., customer segmentation.

  • Algorithms: K-Means, Hierarchical Clustering

  • AWS Services:

    • Amazon SageMaker (K-Means built-in)

    • Amazon Personalize (implicit clustering for recommendations)

4. Recommendation Systems

  • Problem: Suggest what a user might like based on history/preferences.

  • Algorithms: Collaborative Filtering, Matrix Factorization, Deep Learning Embeddings

  • AWS Services:

    • Amazon Personalize

5. Natural Language Processing (NLP)

  • Problem: Understand or generate text.

  • Algorithms: RNNs, Transformers (BERT, GPT-style models)

  • AWS Services:

    • Amazon Comprehend (sentiment, key phrases, topics)

    • Amazon Translate (language translation)

    • Amazon Lex (chatbots, conversational AI)

    • Amazon Bedrock (foundation models for text generation, Q&A)

6. Computer Vision

  • Problem: Identify/understand images or video.

  • Algorithms: CNN (Convolutional Neural Networks), Object Detection (YOLO, Faster R-CNN)

  • AWS Services:

    • Amazon Rekognition (faces, labels, moderation, video analysis)

    • SageMaker (train custom vision models)

7. Anomaly Detection

  • Problem: Spot outliers/fraud/rare events.

  • Algorithms: Random Cut Forest (RCF), Isolation Forest

  • AWS Services:

    • Amazon Lookout for Metrics

    • Amazon SageMaker (RCF built-in algorithm)

    • Amazon Fraud Detector

8. Time-Series Forecasting

  • Problem: Predict future values based on historical data.

  • Algorithms: ARIMA, Prophet, DeepAR (RNN-based)

  • AWS Services:

    • Amazon Forecast

    • SageMaker DeepAR

9. Generative AI

  • Problem: Generate new text, code, or images.

  • Algorithms: Transformer-based LLMs, Diffusion Models (for images)

  • AWS Services:

    • Amazon Bedrock (Claude, Titan, Stable Diffusion, Llama 2, etc.)

    • SageMaker JumpStart (pretrained models)

[Types of Lerning in details]One-Shot, Few-Shot, Zero-Shot Learning

Zero-Shot Learning

  • Definition: Model performs a task without seeing any example of that task.

  • Relies on the model’s pretraining knowledge.

  • Example:

    • Prompt: “Translate this English sentence to French: ‘How are you?’”

    • Model translates correctly even without training examples.

  • Used when: You don’t have labeled data or the model is general-purpose.


One-Shot Learning

  • Definition: Model performs a task after seeing just one example of how it should be done.

  • Example:

    • Prompt:

      • “Translate this English sentence to French. Example: ‘Good morning’ → ‘Bonjour’”

      • “Now, translate: ‘How are you?’”

  • Used when: You want to guide the model with a single demonstration.


Few-Shot Learning

  • Definition: Model performs a task after seeing a few examples (2–5 typically).

  • Example:

    • Prompt:

      • “English → French examples:

        • ‘Good morning’ → ‘Bonjour’

        • ‘Thank you’ → ‘Merci’

        • Now translate: ‘How are you?’”*

  • Used when: You want to fine-tune behavior with small context examples.


Traditional Training (Many-Shot / Supervised)

  • Definition: Model is trained with large labeled datasets.

  • Example: Training an image classifier with 1000+ labeled dog & cat images.

TermDefinitionExample
Zero-ShotNo examples given“Summarize this text.”
One-ShotOne example providedTranslate after 1 example
Few-ShotFew examples providedTranslate with 3–5 examples
Supervised (Many-Shot)Trained with large datasetDog vs Cat classifier

[Very Important]AWS SageMaker Cheat Sheet

1. Core Concepts

  • SageMaker = End-to-end ML service (Build → Train → Deploy).

  • Focus on:

    • Data prep

    • Training

    • Deployment/Inference

    • MLOps (monitoring, pipelines)


2. Data Preparation

  • SageMaker Ground Truth → Label datasets (human + automated).

  • SageMaker Data Wrangler → Clean, transform, and visualize data (low-code).

  • SageMaker Feature Store → Central repo for ML features (real-time + batch).


3. Model Building

  • SageMaker Studio → Web-based IDE for ML (like Jupyter + AWS integration).

  • SageMaker JumpStart → Pre-trained models & solutions (ready-to-use).

  • SageMaker Autopilot → AutoML (build, train, tune models automatically).


4. Model Training

  • SageMaker Training Jobs → Managed training infrastructure.

  • SageMaker Debugger → Real-time training metrics, detects training issues.

  • SageMaker Experiments → Track model versions, experiments, metrics.

  • SageMaker Distributed Training → Train large models faster.

  • Spot Training → Cost-optimized training using spare capacity.


5. Model Deployment & Inference

  • SageMaker Endpoints → Real-time inference (deploy model as API).

  • SageMaker Batch Transform → Batch inference (large datasets).

  • SageMaker Serverless Inference → Cost-optimized inference (scale-to-zero).

  • SageMaker Asynchronous Inference → For long-running inference jobs.

  • SageMaker Multi-Model Endpoints (MME) → Host multiple models on same endpoint.


6. MLOps & Monitoring

  • SageMaker Pipelines → ML workflow automation (CI/CD for ML).

  • SageMaker Model Monitor → Detects drift, bias, and quality issues.

  • SageMaker Clarify → Detect bias, explain model predictions (interpretability).

  • SageMaker Model Registry → Store, version, approve models before deploy.