Course Content
Domain 2 — Prompt Engineering Lab
Hands-on: write and iterate on system prompts for three real-world scenarios
Lab Overview
Reading about prompt engineering is not sufficient for the exam. The exam includes scenario questions where you must identify why a prompt fails and which specific change fixes it. You learn this by breaking and fixing real prompts.
Complete all three scenarios. Each builds toward a working implementation you can reference on your path to certification.
Setup
pip install anthropic
export ANTHROPIC_API_KEY="your-key-here"

import anthropic
import json
client = anthropic.Anthropic()

Scenario 1 — Customer Support Bot System Prompt
Brief: Build a system prompt for a SaaS customer support bot. The product is a cloud data pipeline tool. Requirements:
- Answer questions about the product only
- Decline politely for off-topic requests
- Respond in under 150 words unless detail is explicitly requested
- Always suggest a documentation link when relevant (placeholder: [docs link])
- Tone: professional, concise, not robotic
Step 1 — Write your first version
Write a system prompt that meets all five requirements. Do not look at the reference below until you have written your own.
Step 2 — Test against these 5 inputs:
test_inputs = [
"How do I connect to Snowflake?",
"What's the weather in Paris?",
"My pipeline keeps failing with a timeout error",
"Can you write me a poem?",
"What are your pricing plans?",
]
for user_input in test_inputs:
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",  # Use Haiku for fast iteration
        max_tokens=300,
        system=YOUR_SYSTEM_PROMPT,
        messages=[{"role": "user", "content": user_input}],
    )
    print(f"Input: {user_input}")
    print(f"Response: {response.content[0].text}\n")

Step 3 — Evaluate: Does the bot decline off-topic requests? Stay under 150 words? Maintain consistent tone? Fix any failures.
Reference system prompt (compare after your own attempt):
You are a support agent for DataFlow, a cloud data pipeline platform.
Scope: Answer only questions about DataFlow features, integrations, troubleshooting, pricing, and documentation. For all other topics, politely decline and redirect.
Response rules:
- Keep responses under 150 words unless the user explicitly asks for detail
- When relevant, include: "For more detail, see [docs link]"
- Tone: professional and direct — helpful without being informal
- If you cannot resolve an issue, suggest: "Please contact our support team at support@dataflow.io"
Out-of-scope response: "I'm DataFlow's support assistant and can only help with DataFlow-related questions. Is there something about DataFlow I can help you with?"

Scenario 2 — Reliable JSON Extraction
Brief: Build a prompt that extracts structured data from unstructured customer feedback. Target schema:
{
"sentiment": "positive" | "neutral" | "negative",
"topics": ["string"],
"urgency": "low" | "medium" | "high",
"action_required": boolean
}

Step 1 — Write a zero-shot attempt:
ZERO_SHOT = """Extract structured data from customer feedback.
Return JSON matching this schema: {"sentiment": ..., "topics": [...], "urgency": ..., "action_required": ...}"""
feedback_samples = [
"Love the new dashboard! Much faster than before.",
"URGENT: Our production pipeline has been down for 2 hours and we're losing data.",
"The documentation is a bit confusing for the Snowflake connector setup.",
]
for feedback in feedback_samples:
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=200,
        system=ZERO_SHOT,
        messages=[{"role": "user", "content": feedback}],
    )
    print(response.content[0].text)
    # Try to parse — does it always produce valid JSON?
    try:
        parsed = json.loads(response.content[0].text)
        print("✓ Valid JSON")
    except json.JSONDecodeError as e:
        print(f"✗ Parse failed: {e}")

Step 2 — Count your parse failures. Zero-shot JSON extraction fails 5–15% of the time. Now add few-shot examples and retry logic:
FEW_SHOT_WITH_RETRY = """Extract structured data from customer feedback.
Return ONLY valid JSON — no prose, no markdown fences, no explanation.
Schema: {"sentiment": "positive"|"neutral"|"negative", "topics": [strings], "urgency": "low"|"medium"|"high", "action_required": boolean}
Examples:
Input: The onboarding was smooth and the team was super helpful!
Output: {"sentiment": "positive", "topics": ["onboarding", "support"], "urgency": "low", "action_required": false}
Input: Our ETL job failed at 3am and we have no alerts set up — need help ASAP
Output: {"sentiment": "negative", "topics": ["ETL", "alerts", "incident"], "urgency": "high", "action_required": true}
Input: The UI is fine but the docs could use more examples
Output: {"sentiment": "neutral", "topics": ["UI", "documentation"], "urgency": "low", "action_required": false}"""
def extract_feedback(feedback: str, max_retries: int = 3) -> dict:
    for attempt in range(max_retries):
        response = client.messages.create(
            model="claude-haiku-4-5-20251001",
            max_tokens=200,
            system=FEW_SHOT_WITH_RETRY,
            messages=[{"role": "user", "content": feedback}],
        )
        # Strip any stray markdown fences. Note: str.strip("```json")
        # removes *characters* from both ends, not a prefix, so use
        # removeprefix/removesuffix instead.
        text = response.content[0].text.strip()
        text = text.removeprefix("```json").removeprefix("```")
        text = text.removesuffix("```").strip()
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            if attempt == max_retries - 1:
                raise
    raise RuntimeError("Extraction failed after retries")

Step 3 — Run 20 samples and confirm a 100% parse success rate. This is the production-quality pattern.
Scenario 3 — Structured Chain-of-Thought for Architecture Review
Brief: Build a prompt that analyzes a system architecture description and produces a structured vulnerability report. The reasoning must be auditable.
Step 1 — Write the structured CoT scaffold:
ARCH_REVIEW_SYSTEM = """You are a senior cloud security architect.
When reviewing an architecture, follow this exact process:
1. List every component mentioned in the description
2. For each component, identify: (a) what it does, (b) what fails if it goes down, (c) whether it is a single point of failure
3. List all network boundaries and data flows
4. Identify security concerns: authentication gaps, unencrypted data paths, overprivileged roles
5. Produce a final report with sections: Single Points of Failure, Security Concerns, Recommendations
Always show your reasoning for steps 1–4 before writing the final report."""
test_architecture = """
Our system has a single API Gateway that routes to three Lambda functions.
All three Lambdas share one RDS Postgres instance with a single admin user credential
stored in an environment variable. The API Gateway is public-facing with no authentication.
Static assets are served from S3. Logs go to CloudWatch.
"""
response = client.messages.create(
    model="claude-sonnet-4-6",  # Use Sonnet — reasoning task
    max_tokens=2000,
    system=ARCH_REVIEW_SYSTEM,
    messages=[{"role": "user", "content": f"Review this architecture:\n\n{test_architecture}"}],
)
print(response.content[0].text)

Step 2 — Evaluate the output (a spot-check sketch follows this list):
- Does it follow steps 1–4 before the report?
- Does it identify the shared admin credential as a critical issue?
- Does it flag the public API with no authentication?
- Is the final report clearly separated from the reasoning?
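As a minimal sketch for the section-presence part of these checks: the section names below come from step 5 of ARCH_REVIEW_SYSTEM, and the substring test is a crude heuristic, not a full audit.

report = response.content[0].text
# Section names come from step 5 of ARCH_REVIEW_SYSTEM
for section in ["Single Points of Failure", "Security Concerns", "Recommendations"]:
    status = "✓" if section in report else "✗"
    print(f"{status} report section present: {section}")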
Step 3 — Compare with explicit CoT: replace the reasoning scaffold in the system prompt with just "Think step by step." Which produces more thorough output? This is the exam's CoT comparison question in practice.
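To run the comparison, swap in the one-line system prompt as in this sketch; `EXPLICIT_COT_SYSTEM` is a name introduced here for the variant Step 3 describes, and everything else reuses the Scenario 3 setup.

EXPLICIT_COT_SYSTEM = "You are a senior cloud security architect. Think step by step."

explicit_response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=2000,
    system=EXPLICIT_COT_SYSTEM,
    messages=[{"role": "user", "content": f"Review this architecture:\n\n{test_architecture}"}],
)
# Compare against the structured run: does this output miss the shared
# admin credential or the unauthenticated gateway? Is the reasoning
# organized enough to audit?
print(explicit_response.content[0].text)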
Lab Completion Checklist
- Customer support bot declines off-topic requests reliably across all 5 test inputs
- JSON extraction prompt achieves 100% parse success rate across 20 samples
- Architecture review prompt produces auditable step-by-step reasoning before the final report
- You can articulate why structured CoT outperformed explicit CoT on the architecture task
Once all four boxes are checked, proceed to the Domain 2 Practice Questions.