📖 Lesson ⏱️ 60 minutes

Domain 1 — Model Selection and Capabilities

Claude model families, tiers, context windows, and how to select the right model for a use case

Domain 1 Overview

Model selection is the first architectural decision in any Claude application. Domain 1 tests whether you understand the capability differences between tiers deeply enough to make correct trade-off decisions — not just know which model exists.

Exam weight: ~15% (approximately 9 questions)


The Claude Model Tiers

Each Claude generation ships in three tiers. For the Claude 4.x generation:

| Model | Positioning | Context Window | Best For |
|---|---|---|---|
| Claude Opus 4 | Maximum intelligence | 200K tokens | Complex reasoning, research, long-doc analysis, multi-step problems |
| Claude Sonnet 4 | Balanced performance | 200K tokens | Coding, customer-facing apps, summarization, most production workloads |
| Claude Haiku 4 | Speed and efficiency | 200K tokens | High-volume classification, routing, extraction, sub-200ms tasks |

All current models share the same 200K token context window. Model selection is therefore driven by capability, cost, and latency — not context size.


The Cost and Latency Gradient

Across tiers (approximate relative values):

| | Opus | Sonnet | Haiku |
|---|---|---|---|
| Cost per token | Highest (~10x Haiku) | Medium (~4x Haiku) | Lowest (baseline) |
| Time to first token | Slowest | Medium | Fastest (~3x faster than Opus) |
| Reasoning depth | Maximum | High | Good for simple tasks |

The architectural principle the exam tests: Use the cheapest model that reliably solves the task. Sending a classification task to Opus is architecturally incorrect — it is wasteful even if it works.


The Model Selection Decision Framework

Work through these questions in order:

1. What is the task complexity?

  • Simple classification, routing, extraction → Haiku
  • Coding, summarization, customer interaction → Sonnet
  • Multi-step reasoning, research synthesis, long-doc analysis → Opus

2. What is the latency budget?

  • Sub-200ms → Haiku only
  • Sub-2s acceptable → Sonnet
  • Latency not critical → consider Opus for hard tasks

3. What is the volume?

  • High volume (thousands/hour) → minimize model tier; cost compounds
  • Low volume (occasional) → can afford Opus even for moderate tasks

4. What is the accuracy requirement?

  • Near-perfect required, complex domain → Opus + extended thinking
  • High but not perfect → Sonnet
  • Good enough for classification/routing → Haiku
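The four questions above can be sketched as a small helper. This is an illustrative sketch, not exam material: the tier names are plain labels rather than real model IDs, and the thresholds come straight from the framework text.

```python
def select_model(complexity: str, latency_budget_ms: int,
                 near_perfect_accuracy: bool) -> str:
    """Walk the framework questions in order; returns a tier label."""
    # Q1: task complexity sets the starting tier.
    tier = {"simple": "haiku", "medium": "sonnet", "complex": "opus"}[complexity]
    # Q2: a sub-200ms latency budget forces Haiku regardless of complexity.
    if latency_budget_ms < 200:
        return "haiku"
    # Q3 (volume) is already respected: the lookup above picks the cheapest
    # tier that matches the complexity, so high volume never over-provisions.
    # Q4: a near-perfect accuracy requirement on a complex task means Opus
    # (paired with extended thinking, per the next section).
    if near_perfect_accuracy and complexity == "complex":
        tier = "opus"
    return tier
```

Note that latency is checked before accuracy: per the framework, a sub-200ms budget rules out every tier except Haiku, whatever the other requirements say.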

Extended Thinking

Extended thinking allows Claude to perform internal reasoning before producing a response. It is only available on Opus models.

When extended thinking improves results:

  • Multi-step logical deduction
  • Security vulnerability analysis
  • Complex architecture review
  • Problems where auditing the reasoning chain matters

When extended thinking does NOT help (and should not be used):

  • Simple Q&A
  • Classification
  • Content generation
  • Any task where latency matters
  • Tasks Sonnet already handles reliably

Extended thinking increases both cost (additional reasoning tokens) and latency. The exam tests whether you know the cases where it’s wrong to use it, not just when it’s available.

import anthropic

client = anthropic.Anthropic()

# Extended thinking — use only for genuinely hard reasoning tasks
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000,
    },
    messages=[{
        "role": "user",
        "content": "Review this system design for security vulnerabilities and single points of failure."
    }],
)

for block in response.content:
    if block.type == "thinking":
        print("Reasoning:", block.thinking)
    elif block.type == "text":
        print("Answer:", block.text)

Model Routing Pattern

A common production pattern is routing requests to the cheapest model that can handle them:

def classify_complexity(user_message: str) -> str:
    """Use a cheap Haiku call to classify request complexity."""
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=10,
        system="Classify the complexity of the user's request. Respond with exactly one word: simple, medium, or complex.",
        messages=[{"role": "user", "content": user_message}],
    )
    return response.content[0].text.strip().lower()

def route_request(user_message: str) -> str:
    complexity = classify_complexity(user_message)
    model_map = {
        "simple": "claude-haiku-4-5-20251001",
        "medium": "claude-sonnet-4-6",
        "complex": "claude-opus-4-7",
    }
    model = model_map.get(complexity, "claude-sonnet-4-6")
    response = client.messages.create(
        model=model,
        max_tokens=2048,
        messages=[{"role": "user", "content": user_message}],
    )
    return response.content[0].text

The exam may ask you to critique this pattern — the main risk is misclassification (a complex task routed to Haiku). Always have a fallback or confidence threshold.
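One way to add that fallback is a confidence threshold on the classifier's label. A minimal sketch, assuming a hypothetical classifier that reports a confidence score alongside its label (the `classify_complexity` helper above returns only the label):

```python
def route_with_fallback(label: str, confidence: float,
                        threshold: float = 0.8) -> str:
    """Map a (label, confidence) pair to a model tier.

    Below the threshold, escalate one tier rather than risk sending a
    hard task to a weak model; unknown labels fall back to the middle tier.
    """
    model_map = {"simple": "haiku", "medium": "sonnet", "complex": "opus"}
    escalate = {"haiku": "sonnet", "sonnet": "opus", "opus": "opus"}
    tier = model_map.get(label, "sonnet")   # unknown label -> safe middle tier
    if confidence < threshold:
        tier = escalate[tier]               # low confidence -> one tier up
    return tier
```

The design choice here is asymmetric: misrouting a simple task upward wastes a few cents, while misrouting a complex task downward produces a bad answer, so uncertainty always escalates.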


Cost Estimation at Scale

The exam includes questions about cost impact of model choice. Practice this mental math:

Scenario: 50,000 requests/day, average 1,000 input tokens + 500 output tokens per request.

At approximate prices (Sonnet: $3 input / $15 output per million tokens):

  • Daily input cost: 50,000 × 1,000 × $0.000003 = $150/day
  • Daily output cost: 50,000 × 500 × $0.000015 = $375/day
  • Total: $525/day on Sonnet

Switching to Haiku (~8x cheaper input, ~10x cheaper output):

  • Input: $150 / 8 = ~$19/day
  • Output: $375 / 10 = ~$37/day
  • Total: ~$56/day

For a classification workload, Haiku saves ~$469/day. The exam expects you to recognize this magnitude of difference and recommend accordingly.
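The arithmetic above can be checked with a short helper. Prices are the approximate figures used in the scenario, not official rates:

```python
def daily_cost(requests: int, in_tokens: int, out_tokens: int,
               in_price_per_m: float, out_price_per_m: float) -> float:
    """Daily spend in dollars, given per-million-token prices."""
    input_cost = requests * in_tokens * in_price_per_m / 1_000_000
    output_cost = requests * out_tokens * out_price_per_m / 1_000_000
    return input_cost + output_cost

sonnet = daily_cost(50_000, 1_000, 500, 3.00, 15.00)            # $525.00
haiku = daily_cost(50_000, 1_000, 500, 3.00 / 8, 15.00 / 10)    # $56.25
print(f"Sonnet ${sonnet:.2f}/day, Haiku ${haiku:.2f}/day, "
      f"savings ~${sonnet - haiku:.0f}/day")
```

Being able to reproduce this calculation quickly is the skill the exam is probing: input and output costs scale independently, so a workload's input/output token ratio changes which price matters most.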


Domain 1 Key Facts to Memorize

  • All Claude 4.x models: 200K token context window
  • Haiku: fastest, cheapest; Opus: most capable, most expensive
  • Extended thinking: Opus only, increases cost and latency
  • Model selection principle: cheapest model that reliably solves the task
  • Routing pattern risk: misclassification sends hard tasks to a weak model

Continue to the Domain 1 Practice Questions to test your knowledge.