AWS Certified AI Practitioner(12) - Pricing & Model Improvement

Created: 2025-08-20 | Updated: 2025-08-21 | CERTIFICATION, AWS_AI_PRACTITIONER


📘 Amazon Bedrock – Pricing & Model Improvement

1️⃣ Pricing Options

🔹 On-Demand (Pay-as-you-go)

How it works: Pay only for what you use, like an electricity bill.
Pricing basis:
- Text Models → Input/Output token count
- Embedding Models → Input token count
- Image Models → Number of images generated
Available Models: Base Models only
✅ Pros: Flexible, good for unpredictable workloads
❌ Cons: Can become expensive if used continuously over time
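The three pricing bases above can be sketched as simple arithmetic. This is a minimal illustration, not a pricing tool: every rate below is a hypothetical placeholder, and the names `TextPrice`, `text_cost`, `embedding_cost`, and `image_cost` are invented for this example. Real rates vary by model and region.

```python
from dataclasses import dataclass

# All prices in this sketch are hypothetical placeholders, not real Bedrock rates.

@dataclass(frozen=True)
class TextPrice:
    input_per_1k: float   # USD per 1,000 input tokens
    output_per_1k: float  # USD per 1,000 output tokens

def text_cost(price: TextPrice, input_tokens: int, output_tokens: int) -> float:
    """Text models: billed on both input and output tokens."""
    return (input_tokens / 1000) * price.input_per_1k \
        + (output_tokens / 1000) * price.output_per_1k

def embedding_cost(price_per_1k_input: float, input_tokens: int) -> float:
    """Embedding models: billed on input tokens only."""
    return (input_tokens / 1000) * price_per_1k_input

def image_cost(price_per_image: float, images: int) -> float:
    """Image models: billed per generated image."""
    return price_per_image * images

if __name__ == "__main__":
    example_price = TextPrice(input_per_1k=0.001, output_per_1k=0.002)
    print(round(text_cost(example_price, input_tokens=2000, output_tokens=500), 6))
```

Because the bill scales directly with usage, a workload that runs continuously accumulates cost with every request, which is the downside noted above.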

🔹 Batch Mode (Bulk processing, up to 50% discount)

How it works: Group multiple requests together → results stored as a single file in Amazon S3
Discount: Up to 50% cheaper than On-Demand
✅ Pros: Great for large-scale processing, strong cost savings
❌ Cons: No real-time response, results are delayed
Best use case: Large batch jobs where immediate results are not required
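As a rough sketch of what "grouping requests together" can look like, the snippet below packs many prompts into a single JSONL payload, one request per line, which would then be uploaded to S3 as the job input. The `recordId` and `modelInput` field names are assumptions about the batch input format, and the `modelInput` body is a placeholder, since the real request shape is model-specific. Check the Bedrock documentation before relying on either. The helper names are hypothetical.

```python
import json

def build_batch_jsonl(prompts: list[str]) -> str:
    """Pack many prompts into one JSONL payload, one request per line."""
    lines = []
    for i, prompt in enumerate(prompts):
        record = {
            # Assumed ID format; used to match each output back to its input.
            "recordId": f"REQ{i:07d}",
            # Placeholder body; the real shape depends on the chosen model.
            "modelInput": {"prompt": prompt},
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)

def batch_cost(on_demand_cost: float, discount: float = 0.5) -> float:
    """Apply the batch discount (up to 50%) to an on-demand cost estimate."""
    return on_demand_cost * (1.0 - discount)

if __name__ == "__main__":
    payload = build_batch_jsonl(["Summarize document A.", "Summarize document B."])
    print(payload)
    print(batch_cost(10.0))
```

The trade-off is visible in the shape of the workflow: nothing comes back until the whole file has been processed, so this only suits jobs that can wait.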

🔹 Provisioned Throughput (Reserved capacity, guaranteed performance)

How it works: Reserve processing capacity for a fixed commitment term (e.g., 1 month or 6 months), much like a gym membership
Guaranteed performance: Guarantees a set throughput, expressed as the maximum number of input/output tokens processed per minute
Available Models: Base, Fine-tuned, and Custom Models
✅ Pros: Stable performance and capacity, supports custom models
❌ Cons: Not a cost-saving option; its purpose is guaranteed performance
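To see why reserved capacity is about performance rather than savings, it helps to compare a flat reserved fee against usage-based billing. The sketch below assumes capacity is billed at an hourly rate per reserved unit. The rates, the 730-hour month, and the function names are all hypothetical values chosen for illustration.

```python
# Hypothetical numbers throughout; real rates depend on model, region, and term.
HOURS_PER_MONTH = 730.0  # average hours in a month (assumption)

def provisioned_monthly_cost(hourly_rate_per_unit: float, units: int,
                             hours: float = HOURS_PER_MONTH) -> float:
    """Flat cost of reserved capacity, paid whether or not it is used."""
    return hourly_rate_per_unit * units * hours

def on_demand_monthly_cost(tokens_per_month: float, price_per_1k: float) -> float:
    """Usage-based cost at a blended (input + output) price per 1,000 tokens."""
    return tokens_per_month / 1000 * price_per_1k

def break_even_tokens(hourly_rate_per_unit: float, units: int, price_per_1k: float,
                      hours: float = HOURS_PER_MONTH) -> float:
    """Monthly token volume at which both options cost the same."""
    reserved = provisioned_monthly_cost(hourly_rate_per_unit, units, hours)
    return reserved / price_per_1k * 1000

if __name__ == "__main__":
    reserved = provisioned_monthly_cost(hourly_rate_per_unit=20.0, units=1)
    threshold = break_even_tokens(hourly_rate_per_unit=20.0, units=1, price_per_1k=0.002)
    print(f"Reserved: ${reserved:,.2f}/month, break-even at {threshold:,.0f} tokens/month")
```

Below the break-even volume the reservation costs more than paying per token, so the usual reasons to choose it are guaranteed throughput and support for custom models.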

📊 Pricing Options Comparison Table

| Option | Billing Method | Pricing Basis | Available Models | Pros | Cons | Best Use Case |
| --- | --- | --- | --- | --- | --- | --- |
| On-Demand | Pay-as-you-go | Text: input/output tokens; Embedding: input tokens; Image: generated images | Base Models only | High flexibility; great for unpredictable workloads | Expensive for long-term continuous use | Occasional use / unpredictable demand |
| Batch Mode | Bulk processing at a discount | Grouped requests, up to 50% off; results stored in Amazon S3 | Base Models only | Up to 50% discount; efficient for large-scale jobs | No real-time response; delayed results | Large requests / no need for instant results |
| Provisioned Throughput | Reserved capacity (e.g., 1 month or 6 months) | Guaranteed tokens per minute | Base, Fine-tuned, and Custom Models | Guaranteed stable performance; supports custom models | Almost no cost savings | Custom models / guaranteed performance required |

2️⃣ Model Improvement Techniques (Low → High Cost)

1. Prompt Engineering

Improve results simply by optimizing prompts
No extra computation → Lowest cost

2. Retrieval Augmented Generation (RAG)

Uses an external knowledge source, typically a vector database (Vector DB)
No model retraining → relatively low cost
Additional cost for building and maintaining the database

RAG = “Model + Search function” → lets the model look up external knowledge that was not part of its training data.
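The "model + search" idea can be sketched in a few lines: rank stored passages by similarity to the query, then paste the best matches into the prompt. This is a toy illustration under loud assumptions: the 3-dimensional vectors are hand-written stand-ins for what an embedding model would produce, the in-memory list stands in for a vector database, and all function names are hypothetical.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec: list[float],
             store: list[tuple[str, list[float]]],
             k: int = 1) -> list[str]:
    """Return the k passages whose vectors are most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question: str, passages: list[str]) -> str:
    """Augment the question with retrieved context before calling the model."""
    context = "\n".join(f"- {p}" for p in passages)
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}")

# Toy "vector database": hand-made vectors, not real embeddings.
STORE = [
    ("Refunds are accepted within 30 days.", [1.0, 0.0, 0.0]),
    ("Shipping takes about 5 business days.", [0.0, 1.0, 0.0]),
    ("Support is open 9am to 5pm.", [0.0, 0.0, 1.0]),
]

if __name__ == "__main__":
    query_vec = [0.9, 0.1, 0.0]  # pretend embedding of "What is the refund policy?"
    print(build_prompt("What is the refund policy?", retrieve(query_vec, STORE, k=1)))
```

The model itself is never retrained here, which is why RAG is cheap relative to fine-tuning. The ongoing cost sits in building and maintaining the store.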

3. Instruction-based Fine-tuning

Fine-tune the model with labeled data and specific instructions
Requires extra computation → Higher cost
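"Labeled data and specific instructions" usually means pairs of an instruction and the desired response. The sketch below writes such pairs as JSONL, one example per line. The `prompt` and `completion` keys are an assumption based on a common training-data layout, and `to_training_jsonl` is a hypothetical helper. The exact schema depends on the model being fine-tuned.

```python
import json

def to_training_jsonl(examples: list[tuple[str, str]]) -> str:
    """Serialize (instruction, expected_response) pairs as JSONL training data."""
    lines = []
    for instruction, expected in examples:
        # Every example must carry both the instruction and its label.
        if not instruction.strip() or not expected.strip():
            raise ValueError("each example needs a non-empty instruction and label")
        # Assumed key names; check the schema required by the target model.
        lines.append(json.dumps({"prompt": instruction, "completion": expected}))
    return "\n".join(lines)

if __name__ == "__main__":
    data = [
        ("Classify the sentiment: 'Great service!'", "positive"),
        ("Classify the sentiment: 'Never again.'", "negative"),
    ]
    print(to_training_jsonl(data))
```

Preparing the file is the cheap part. The extra cost noted above comes from the training run that consumes it.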

4. Domain Adaptation Fine-tuning

Retrain the model with a large domain-specific dataset
Requires extensive data preparation + heavy computation → Highest cost

3️⃣ Cost Optimization Tips

Token management → main driver of cost savings
- Keep prompts concise
- Limit output length to what’s necessary
Use Batch Mode → up to 50% cheaper
Choose smaller models → generally cheaper
Inference parameters (Temperature, Top-K, Top-P) are not a cost lever
- They change model behavior but not pricing
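The tips above can be compared side by side with a small calculation. All token counts and prices here are hypothetical, and `request_cost` is an invented helper. The point is only the relative effect of each lever, not the absolute numbers.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_price_per_1k: float, out_price_per_1k: float,
                 discount: float = 0.0) -> float:
    """Token-based cost of one request, with an optional fractional discount."""
    base = (input_tokens / 1000) * in_price_per_1k \
        + (output_tokens / 1000) * out_price_per_1k
    return base * (1.0 - discount)

# Hypothetical prices (USD per 1,000 tokens) for a larger and a smaller model.
LARGE = {"in_price_per_1k": 0.003, "out_price_per_1k": 0.015}
SMALL = {"in_price_per_1k": 0.0003, "out_price_per_1k": 0.0015}

scenarios = {
    "baseline": request_cost(3000, 1000, **LARGE),
    "concise prompt + capped output": request_cost(1500, 400, **LARGE),
    "batch mode (50% off)": request_cost(3000, 1000, discount=0.5, **LARGE),
    "smaller model": request_cost(3000, 1000, **SMALL),
}

if __name__ == "__main__":
    for name, cost in scenarios.items():
        print(f"{name}: ${cost:.4f}")
```

Temperature, Top-K, and Top-P do not appear in the formula at all, which matches the note above: they shape the output but do not change the price per token.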

📝 Final Summary (Exam/Practical Points)

On-Demand = Flexibility / Batch = Bulk & Discounts / Provisioned = Guaranteed Performance
Cost order: Prompt Engineering < RAG < Instruction Fine-tuning < Domain Adaptation
Cost-saving keys: Token management + Batch Mode

Author: Danny Ki

Link: https://kish191919.github.io/2025/08/20/AWS-Certified-AI-Practitioner-12/

Copyright Notice: All articles on this blog are licensed under CC BY-NC-SA 4.0 unless otherwise stated.
