Amazon SageMaker Data Tools and Model Evaluation

SageMaker Data Wrangler

SageMaker Data Wrangler is a tool designed to make data preparation easier before building machine learning (ML) models.

With Data Wrangler, you can:

  • Prepare tabular and image data for ML
  • Perform data preparation, transformation, and feature engineering
  • Use a single interface for:
    • Data selection
    • Cleansing
    • Exploration
    • Visualization
    • Processing
  • Run SQL queries directly
  • Use the Data Quality tool to check for missing or inconsistent values
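
To make the last point concrete, here is a minimal sketch of the kind of check a data quality report performs: counting missing values per column and spotting values that differ only by letter case. It is plain Python, not Data Wrangler's API; the sample rows and the helper names missing_ratio and inconsistent_casing are assumptions made up for illustration.

```python
# Hypothetical sample rows; None and "" stand in for missing values.
rows = [
    {"song_id": "s1", "rating": 4, "country": "US"},
    {"song_id": "s2", "rating": None, "country": "us"},
    {"song_id": "s3", "rating": 5, "country": ""},
    {"song_id": "s4", "rating": 3, "country": "DE"},
]


def missing_ratio(rows, column):
    """Fraction of rows where the column is absent, None, or an empty string."""
    missing = sum(1 for r in rows if r.get(column) in (None, ""))
    return missing / len(rows)


def inconsistent_casing(rows, column):
    """Group string values that differ only by letter case, e.g. 'US' vs 'us'."""
    variants = {}
    for r in rows:
        value = r.get(column)
        if isinstance(value, str) and value:
            variants.setdefault(value.lower(), set()).add(value)
    # Keep only the groups that contain more than one spelling.
    return {key: sorted(vals) for key, vals in variants.items() if len(vals) > 1}


report = {col: missing_ratio(rows, col) for col in ("song_id", "rating", "country")}
casing_issues = inconsistent_casing(rows, "country")
print(report)
print(casing_issues)
```

In Data Wrangler itself these checks come from the built-in report rather than hand-written code; the sketch only shows what "missing or inconsistent" means in practice.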

Key Features

  • Import Data: Load from sources like Amazon S3.

  • Preview Data: Inspect column names, types, and values.

  • Visualize Data: Build charts to better understand the dataset.

  • Transform Data: Apply functions, drop or add columns.

  • Quick Model: Run a quick test to check model performance.

  • Export Data Flow: Save transformations for reuse in pipelines.

Exam Tip: If you see a question about data preparation and feature engineering in SageMaker, think of Data Wrangler.


What are ML Features?

Features are the inputs to ML models during training and inference.

Example:
For a music dataset, features might include:

  • Song ratings
  • Listening duration
  • Listener demographics

High-quality, reusable features are critical. They improve consistency across teams and projects within a company.
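
As a rough illustration of turning raw data into features, the sketch below aggregates listening events into per-listener features. The event layout, the feature names (play_count, avg_listen_seconds, avg_rating), and the function build_listener_features are all hypothetical.

```python
from collections import defaultdict

# Hypothetical raw events: (listener_id, song_id, seconds_played, rating)
events = [
    ("u1", "s1", 180, 5),
    ("u1", "s2", 60, 3),
    ("u2", "s1", 240, 4),
]


def build_listener_features(events):
    """Aggregate raw listening events into one feature row per listener."""
    totals = defaultdict(lambda: {"plays": 0, "seconds": 0, "rating_sum": 0})
    for listener_id, _song_id, seconds, rating in events:
        t = totals[listener_id]
        t["plays"] += 1
        t["seconds"] += seconds
        t["rating_sum"] += rating
    return {
        listener_id: {
            "play_count": t["plays"],
            "avg_listen_seconds": t["seconds"] / t["plays"],
            "avg_rating": t["rating_sum"] / t["plays"],
        }
        for listener_id, t in totals.items()
    }


features = build_listener_features(events)
print(features["u1"])
```

The same feature rows would be used both when training a model and when asking it for predictions, which is why computing them once and reusing them matters.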


SageMaker Feature Store

The Feature Store helps manage and reuse features.

  • Ingest features from multiple sources.
  • Define transformations to convert raw data into usable features.
  • Publish features directly from Data Wrangler into Feature Store.
  • Features are searchable and shareable within SageMaker Studio.

Exam Tip: Feature Store = centralized place to manage, discover, and reuse ML features.
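
The idea of a central, searchable place for features can be sketched with a toy in-memory model. This is not the SageMaker SDK: ToyFeatureGroup, ToyFeatureStore, and every method name below are invented for illustration. The one detail borrowed from the real service is that each record carries a record identifier and an event time, and a lookup returns the latest record for that identifier.

```python
class ToyFeatureGroup:
    """In-memory stand-in for a feature group. Not the SageMaker API."""

    def __init__(self, name, record_id_name, event_time_name, description=""):
        self.name = name
        self.record_id_name = record_id_name
        self.event_time_name = event_time_name
        self.description = description
        self._latest = {}  # record id -> most recent record seen so far

    def put_record(self, record):
        """Keep only the newest record per identifier, judged by event time."""
        key = record[self.record_id_name]
        current = self._latest.get(key)
        if current is None or record[self.event_time_name] >= current[self.event_time_name]:
            self._latest[key] = dict(record)

    def get_record(self, record_id):
        """Return a copy of the latest record for this identifier."""
        return dict(self._latest[record_id])


class ToyFeatureStore:
    """Central registry so teams can discover and reuse feature groups."""

    def __init__(self):
        self._groups = {}

    def register(self, group):
        self._groups[group.name] = group

    def search(self, text):
        """Find group names whose name or description contains the text."""
        text = text.lower()
        return sorted(
            name
            for name, g in self._groups.items()
            if text in name.lower() or text in g.description.lower()
        )

    def group(self, name):
        return self._groups[name]


store = ToyFeatureStore()
listeners = ToyFeatureGroup(
    "listener-features", "listener_id", "event_time",
    description="Per-listener play counts and ratings",
)
store.register(listeners)

listeners.put_record({"listener_id": "u1", "event_time": 1, "play_count": 2})
listeners.put_record({"listener_id": "u1", "event_time": 2, "play_count": 3})
# Older event time than the stored record, so it is ignored.
listeners.put_record({"listener_id": "u1", "event_time": 1, "play_count": 99})

found = store.search("listener")
latest = store.group(found[0]).get_record("u1")
print(found, latest)
```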

------------------------------------------------------------------------

SageMaker Clarify

SageMaker Clarify is about trust and fairness in ML models. It helps with:

  1. Model Evaluation: Compare performance of two models (e.g., Model A vs Model B).
    • Can evaluate human factors like friendliness or humor in a foundation model.
    • Use AWS-managed human reviewers or your own employees.
    • Use built-in datasets or bring your own.
    • Includes built-in metrics and algorithms.

  2. Model Explainability: Understand why a model made its predictions.
    • Example: “Why was this loan rejected?”
    • Helps debug deployed models and build trust.
    • Exam Tip: Look for keywords like “explain predictions” or “increase transparency” → Clarify.

  3. Bias Detection: Identify and measure bias in data or models using statistical metrics.
    • Example: If your dataset heavily favors one group, Clarify can flag it.
    • Types of Bias:
      • Sampling Bias: Data doesn’t fairly represent the population.
      • Measurement Bias: Errors in how data is measured.
      • Observer Bias: Human judgment skews results.
      • Confirmation Bias: Favoring information that supports preconceptions.
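
To show what "statistical metrics" can look like, the sketch below computes two of the pre-training bias metrics Clarify reports: Class Imbalance (CI) and Difference in Proportions of Labels (DPL). The formulas are written out by hand in plain Python rather than calling Clarify, and the loan data, the group names "A" and "B", and the function names are hypothetical.

```python
# Hypothetical loan applications: (group, approved)
applications = (
    [("A", True)] * 6 + [("A", False)] * 2 +
    [("B", True)] * 1 + [("B", False)] * 1
)


def class_imbalance(n_a, n_d):
    """CI = (n_a - n_d) / (n_a + n_d); values near 0 mean balanced groups."""
    return (n_a - n_d) / (n_a + n_d)


def diff_in_proportions_of_labels(pos_a, n_a, pos_d, n_d):
    """DPL = q_a - q_d, the gap in positive-label rates between the groups."""
    return pos_a / n_a - pos_d / n_d


n_a = sum(1 for group, _ in applications if group == "A")
n_d = sum(1 for group, _ in applications if group == "B")
pos_a = sum(1 for group, approved in applications if group == "A" and approved)
pos_d = sum(1 for group, approved in applications if group == "B" and approved)

ci = class_imbalance(n_a, n_d)
dpl = diff_in_proportions_of_labels(pos_a, n_a, pos_d, n_d)
print(f"CI={ci:.2f} DPL={dpl:.2f}")
```

Here group A supplies 8 of the 10 rows, so CI is well above 0 (a sampling imbalance), and A is approved more often than B, so DPL is positive as well.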

Exam Tip: If the question mentions detecting bias or explaining ML predictions, the answer is usually SageMaker Clarify.
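
For the explainability side, Clarify's feature attributions are based on Shapley values (SHAP). The sketch below computes exact Shapley values for a tiny, made-up loan-scoring function by averaging each feature's marginal contribution over every feature ordering. The model, its weights, the feature names, and the baseline are assumptions; Clarify approximates these values, because exact enumeration is only practical for a handful of features.

```python
from itertools import permutations


def loan_score(x):
    """Hypothetical linear scoring model, used only for illustration."""
    return 0.4 * x["credit"] + 0.3 * x["income"] - 0.5 * x["debt_ratio"]


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution over all orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)  # start from the baseline input
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # switch one feature to its real value
            score = predict(current)
            totals[name] += score - previous
            previous = score
    return {name: total / len(orderings) for name, total in totals.items()}


applicant = {"credit": 0.2, "income": 0.5, "debt_ratio": 0.9}
baseline = {"credit": 0.7, "income": 0.5, "debt_ratio": 0.3}

attributions = shapley_values(loan_score, applicant, baseline)
print(attributions)
```

The attributions sum to the gap between the applicant's score and the baseline score, so they answer "Why was this loan rejected?" feature by feature: in this made-up case, a high debt ratio and a low credit value pull the score down, while income contributes nothing because it matches the baseline.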


SageMaker Ground Truth

Ground Truth focuses on data labeling and human feedback.

  • Supports RLHF (Reinforcement Learning from Human Feedback).
  • Use cases:
    • Model review and evaluation
    • Aligning models to human preferences
    • Creating labeled datasets (e.g., tagging images)

How it Works

  • Humans review model outputs and provide feedback, which is incorporated into the “reward” signal the model is trained to maximize.
  • Feedback improves model accuracy and aligns it with desired behavior.
  • Reviewers can be:
    • Amazon Mechanical Turk workers
    • Your employees
    • Third-party vendors
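
One common way RLHF turns reviewer feedback into a reward is to fit a reward model on pairwise preferences ("response X is better than response Y"). The sketch below uses a Bradley-Terry style model fitted by plain gradient ascent. It is not Ground Truth's API: Ground Truth is where the feedback is collected, and the response names, preference data, and fit_rewards function are invented for illustration.

```python
import math

# Hypothetical reviewer judgments: (preferred_response, rejected_response)
preferences = [
    ("polite", "curt"),
    ("polite", "rude"),
    ("curt", "rude"),
    ("polite", "curt"),
]


def fit_rewards(preferences, steps=500, lr=0.1):
    """Fit one scalar reward per response so preferred responses score higher.

    Maximizes the Bradley-Terry log-likelihood, where
    P(winner beats loser) = sigmoid(reward[winner] - reward[loser]).
    """
    rewards = {name: 0.0 for pair in preferences for name in pair}
    for _ in range(steps):
        grads = {name: 0.0 for name in rewards}
        for winner, loser in preferences:
            p_win = 1.0 / (1.0 + math.exp(-(rewards[winner] - rewards[loser])))
            # Gradient of log(p_win) with respect to each reward.
            grads[winner] += 1.0 - p_win
            grads[loser] -= 1.0 - p_win
        for name in rewards:
            rewards[name] += lr * grads[name]
    return rewards


rewards = fit_rewards(preferences)
ranking = sorted(rewards, key=rewards.get, reverse=True)
print(ranking)
```

In a full RLHF pipeline the reward model scores new model outputs rather than a fixed set of named responses, and the model is then tuned to produce outputs that score higher.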

Ground Truth Plus

  • A managed option where AWS provides a workforce to label your data.

Exam Tip: If the exam mentions data labeling or RLHF, think Ground Truth.

Key Takeaways for the Exam

  • Data Wrangler = data preparation and feature engineering.
  • Feature Store = manage and reuse ML features across teams.
  • Clarify = bias detection and explainability of models.
  • Ground Truth = human labeling and reinforcement learning from human feedback (RLHF).