Multi-Agent Systems: Introduction
Orchestrator-worker patterns and when to split tasks across agents
Why One Agent Isn’t Always Enough
A single agent with access to every tool sounds like the most powerful design. In practice, it’s often the worst one.
Imagine a single agent responsible for: researching a topic, writing a draft, editing for tone, checking facts, optimizing for SEO, formatting for publication, and posting to your blog. This agent needs 10+ tools, a system prompt covering every role, and must context-switch between radically different tasks in a single conversation.
The problems compound quickly:
- Context window pressure: Long research outputs eat into the space available for writing
- Role confusion: A system prompt that says “you are both a researcher and an editor” leads to blurry, mediocre output for both
- No parallelism: Tasks that could run simultaneously are forced to run sequentially
- Hard to debug: When something fails, which “hat” was the agent wearing?
Multi-agent systems solve these problems by splitting work across specialized agents that each do one thing well.
The Content Production Pipeline
Let’s use a concrete scenario: a content production system for a tech blog.
Input: A topic (“The rise of mixture-of-experts models”)
Output: A published blog post with headline, body, metadata, and source citations
The pipeline has five parts, one orchestrator and four worker agents:
Orchestrator
|
├── Research Agent (searches web, finds sources, extracts key facts)
|
├── Writer Agent (drafts the blog post from research notes)
|
├── Editor Agent (refines tone, fixes structure, cuts fluff)
|
└── Publisher Agent (formats as markdown, adds metadata, posts to CMS)

Each agent has a focused system prompt, a minimal set of tools for its role, and a clear input/output contract. The Orchestrator coordinates the flow and handles failures.
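One way to make the "clear input/output contract" explicit is to write each agent's role down as data and check that every handoff lines up. The sketch below is illustrative only: `AgentSpec`, `contract_gaps`, the artifact names, and the tool names `web_search` and `cms_post` are assumptions made for this example, not part of the lesson's code or of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical description of one worker agent (not part of any library)."""
    name: str
    system_prompt: str
    consumes: str          # name of the artifact this agent reads
    produces: str          # name of the artifact this agent writes
    tools: tuple = ()      # keep each agent's tool list as small as possible


# Tool names below are placeholders, not real tool definitions.
PIPELINE = [
    AgentSpec("research", "You are a research specialist.", "topic", "research_notes", ("web_search",)),
    AgentSpec("writer", "You are a technical content writer.", "research_notes", "draft"),
    AgentSpec("editor", "You are a senior technical editor.", "draft", "final_draft"),
    AgentSpec("publisher", "You format content for publication.", "final_draft", "post_json", ("cms_post",)),
]


def contract_gaps(specs):
    """Return a message for every handoff where output and input do not line up."""
    gaps = []
    for upstream, downstream in zip(specs, specs[1:]):
        if upstream.produces != downstream.consumes:
            gaps.append(
                f"{upstream.name} produces {upstream.produces!r}, "
                f"but {downstream.name} expects {downstream.consumes!r}"
            )
    return gaps


print(contract_gaps(PIPELINE))  # prints [] because every handoff matches
```

Checking the contracts before any model call is cheap, and it catches wiring mistakes (for example, removing the writer from the list) before they cost an API request.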
The Orchestrator-Worker Pattern
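The orchestrator manages the other agents (the workers). It doesn’t do the work itself: it directs, monitors, and synthesizes. An orchestrator can be an LLM that decides what to delegate or, as in the code in this lesson, plain Python that calls each worker in a fixed order.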
import anthropic
from dataclasses import dataclass
from typing import Optional

client = anthropic.Anthropic()  # Reads ANTHROPIC_API_KEY from the environment


@dataclass
class AgentResult:
    agent_name: str
    success: bool
    output: str
    error: Optional[str] = None


def run_worker_agent(
    agent_name: str,
    system_prompt: str,
    task: str,
    tools: Optional[list] = None,
) -> AgentResult:
    """Run a single worker agent on a specific task."""
    try:
        kwargs = {
            "model": "claude-opus-4-5",
            "max_tokens": 2048,
            "system": system_prompt,
            "messages": [{"role": "user", "content": task}],
        }
        if tools:
            kwargs["tools"] = tools
        response = client.messages.create(**kwargs)
        # Join every text block: with tools enabled, content[0] may not be text
        text = "".join(
            block.text for block in response.content if block.type == "text"
        )
        return AgentResult(agent_name=agent_name, success=True, output=text)
    except Exception as e:
        # Report the failure to the caller instead of raising
        return AgentResult(
            agent_name=agent_name,
            success=False,
            output="",
            error=str(e),
        )

Implementing Each Worker Agent
# Agent system prompts — focused, single-purpose
RESEARCH_SYSTEM = """You are a research specialist. Your job is to find accurate,
relevant information about a topic. Focus on:
- Key technical concepts and definitions
- Recent developments and trends
- Credible sources with specific facts and data
- Common misconceptions to address
Output: structured research notes with clear headings and source attributions.
Be thorough but focus on what's most relevant for a technical blog post."""
WRITER_SYSTEM = """You are a technical content writer for a developer audience.
Given research notes, write an engaging blog post that:
- Opens with a hook that establishes why this matters
- Explains technical concepts clearly without oversimplifying
- Uses concrete examples and analogies
- Flows naturally from introduction to conclusion
Target: 800-1200 words. Write for senior engineers who are busy and skeptical.
Do not pad. Do not repeat yourself."""
EDITOR_SYSTEM = """You are a senior technical editor. Your job is to improve a draft without
changing its substance. Focus on:
- Cutting sentences that don't add information
- Strengthening the opening paragraph
- Ensuring the conclusion delivers a clear takeaway
- Fixing any awkward phrasing or unclear explanations
- Ensuring consistent technical terminology
Return the improved draft. Note any significant changes you made at the end."""
PUBLISHER_SYSTEM = """You are responsible for formatting content for publication.
Given a final draft, produce:
1. A compelling title (under 70 characters)
2. A meta description (150-160 characters)
3. 3-5 relevant tags
4. The post body formatted as clean markdown
5. A "published at" timestamp placeholder
Output only valid JSON, with no surrounding prose or code fences, using the keys: title, meta_description, tags, body, published_at."""

The Orchestrator Logic
import json


def run_content_pipeline(topic: str) -> dict:
    """
    Orchestrate the full content production pipeline.
    Returns the final published post or an error report.
    """
    print(f"\n{'='*60}")
    print(f"Content Pipeline: {topic}")
    print('='*60)

    # Step 1: Research
    print("\n[1/4] Research Agent working...")
    research_result = run_worker_agent(
        agent_name="Research Agent",
        system_prompt=RESEARCH_SYSTEM,
        task=f"Research this topic for a technical blog post: {topic}",
    )
    if not research_result.success:
        return {"error": f"Research failed: {research_result.error}"}
    print(f"Research complete ({len(research_result.output)} chars)")

    # Step 2: Write (uses research output)
    print("\n[2/4] Writer Agent working...")
    write_result = run_worker_agent(
        agent_name="Writer Agent",
        system_prompt=WRITER_SYSTEM,
        task=(
            f"Write a blog post about: {topic}\n\n"
            f"Based on these research notes:\n\n{research_result.output}"
        ),
    )
    if not write_result.success:
        return {"error": f"Writing failed: {write_result.error}"}
    print(f"Draft complete ({len(write_result.output)} chars)")

    # Step 3: Edit (uses draft)
    print("\n[3/4] Editor Agent working...")
    edit_result = run_worker_agent(
        agent_name="Editor Agent",
        system_prompt=EDITOR_SYSTEM,
        task=f"Edit and improve this draft:\n\n{write_result.output}",
    )
    if not edit_result.success:
        # Non-critical failure: fall back to the unedited draft
        print(f"Edit failed: {edit_result.error}. Using unedited draft.")
        final_draft = write_result.output
    else:
        final_draft = edit_result.output
        print(f"Edit complete ({len(final_draft)} chars)")

    # Step 4: Publish (format for CMS)
    print("\n[4/4] Publisher Agent working...")
    publish_result = run_worker_agent(
        agent_name="Publisher Agent",
        system_prompt=PUBLISHER_SYSTEM,
        task=f"Format this post for publication:\n\n{final_draft}",
    )
    if not publish_result.success:
        return {"error": f"Publishing failed: {publish_result.error}", "draft": final_draft}

    # Parse the JSON output from the publisher
    try:
        published_post = json.loads(publish_result.output)
        print("\nPipeline complete!")
        print(f"Title: {published_post.get('title', 'N/A')}")
        print(f"Tags: {published_post.get('tags', [])}")
        return published_post
    except json.JSONDecodeError:
        # Keep the draft so the caller still has something usable
        return {
            "error": "Publisher output was not valid JSON",
            "raw": publish_result.output,
            "draft": final_draft,
        }


# Run the pipeline
result = run_content_pipeline("The rise of mixture-of-experts models in LLMs")

How Agents Communicate
In this pipeline, agents communicate through shared state — each agent’s output becomes the next agent’s input, passed as text in the message. This is the simplest pattern and works well for linear pipelines.
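The handoff described above can be reduced to a small loop over a state dictionary. This is a minimal sketch, assuming stub functions in place of real model calls; `run_linear_pipeline`, the stage tuples, and the `fake_*` helpers are names invented for this illustration.

```python
def run_linear_pipeline(state, stages):
    """Run each stage in order, storing its output under a new key in `state`."""
    for input_key, output_key, agent_fn in stages:
        state[output_key] = agent_fn(state[input_key])
    return state


# Stub agents so the example runs without an API key.
def fake_research(topic):
    return f"notes on {topic}"


def fake_writer(notes):
    return f"draft from [{notes}]"


stages = [
    ("topic", "research_notes", fake_research),
    ("research_notes", "draft", fake_writer),
]

state = run_linear_pipeline({"topic": "MoE"}, stages)
print(state["draft"])  # prints: draft from [notes on MoE]
```

Keeping every intermediate artifact in `state`, rather than only the latest one, makes it easy to inspect what each agent received when a later step produces poor output.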
For more complex systems, there are two other patterns:
Message passing: Agents put messages into a queue; other agents pick them up. Good for asynchronous or parallel workflows.
from queue import Queue
import threading

research_queue = Queue()
writing_queue = Queue()


def research_worker(topic: str):
    result = run_worker_agent("Research Agent", RESEARCH_SYSTEM, topic)
    research_queue.put(result)


def writer_worker():
    research = research_queue.get()  # Blocks until research is ready
    result = run_worker_agent("Writer Agent", WRITER_SYSTEM,
                              f"Write post using: {research.output}")
    writing_queue.put(result)


topic = "The rise of mixture-of-experts models in LLMs"

# Both threads start immediately, but the writer waits on the queue until
# research is ready. Workers with no dependency on each other (for example,
# several research threads) would run truly in parallel.
threads = [
    threading.Thread(target=research_worker, args=(topic,)),
    threading.Thread(target=writer_worker),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

draft_result = writing_queue.get()

Shared database: All agents read from and write to a central store. The orchestrator coordinates access. Good for long-running pipelines where agents may run at different times.
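A shared store can be as small as a single SQLite table keyed by job and stage. The sketch below uses Python's standard `sqlite3` module; the table name `artifacts`, the helper names, and the `job_id`/`stage` scheme are assumptions chosen for this example, and a production system would likely need locking or a server-backed database.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the shared store. ':memory:' keeps the demo self-contained."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS artifacts ("
        "job_id TEXT, stage TEXT, content TEXT, "
        "PRIMARY KEY (job_id, stage))"
    )
    return conn


def save_artifact(conn, job_id, stage, content):
    """Write one agent's output; re-running a stage overwrites its old output."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO artifacts (job_id, stage, content) VALUES (?, ?, ?)",
            (job_id, stage, content),
        )


def load_artifact(conn, job_id, stage):
    """Return a stage's output, or None if that stage has not run yet."""
    row = conn.execute(
        "SELECT content FROM artifacts WHERE job_id = ? AND stage = ?",
        (job_id, stage),
    ).fetchone()
    return row[0] if row else None


store = open_store()
save_artifact(store, "job-1", "research", "notes on MoE")
print(load_artifact(store, "job-1", "research"))  # prints: notes on MoE
print(load_artifact(store, "job-1", "draft"))     # prints: None
```

Because each stage reads its input from the store, a pipeline interrupted after research can resume at the writing step without repeating earlier calls.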
When NOT to Use Multi-Agent Systems
Multi-agent is not the answer to every problem. It adds complexity — more API calls, more things that can fail, harder debugging. Use it only when the benefits are clear.
Use multi-agent when:
- Tasks are naturally separable with clear input/output boundaries
- Different tasks require very different “personas” or tool sets
- Tasks can run in parallel and time matters
- The context window of a single agent would overflow
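When tasks really are independent, the parallel case can be sketched with the standard library's `ThreadPoolExecutor`. The example assumes a stub worker in place of a real agent call; `fan_out`, `stub_research`, and the sub-topic names are hypothetical and exist only for this illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fan_out(tasks, worker, max_workers=4):
    """Run `worker` on every task concurrently and collect results by task name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(worker, task) for name, task in tasks.items()}
        # .result() blocks until that task finishes and re-raises any exception
        return {name: future.result() for name, future in futures.items()}


def stub_research(question):
    """Stand-in for a slow model call."""
    time.sleep(0.05)
    return f"findings for: {question}"


sub_topics = {
    "architecture": "How do MoE routers pick experts?",
    "cost": "How does MoE change inference cost?",
    "history": "Where did mixture-of-experts originate?",
}

findings = fan_out(sub_topics, stub_research)
print(findings["cost"])  # prints: findings for: How does MoE change inference cost?
```

Threads suit this workload because each worker spends most of its time waiting on network I/O. A step that needs another step's output, such as the writer waiting on research, gains nothing from this pattern.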
Don’t use multi-agent when:
- A single agent can handle the task cleanly in one pass
- The tasks are tightly coupled (each step depends on real-time feedback from the previous)
- You’re just starting out and want to understand the problem space first
- The overhead of coordination exceeds the benefit of specialization
The content pipeline above is a good candidate because research, writing, editing, and publishing are genuinely different skills with clear handoffs. A “summarize this document” task is not — it doesn’t benefit from splitting across agents.
Handling Failures in Multi-Agent Systems
Multi-agent systems have more failure points than single-agent systems. Build resilience in from the start:
def run_with_retry(agent_name: str, system: str, task: str, max_retries: int = 2):
    """Run an agent with retry logic."""
    for attempt in range(max_retries + 1):
        result = run_worker_agent(agent_name, system, task)
        if result.success:
            return result
        if attempt < max_retries:
            print(f"{agent_name} failed (attempt {attempt+1}). Retrying...")
    return result  # Return the failed result after exhausting retries

Also think about graceful degradation: if the editor agent fails, use the unedited draft rather than failing the whole pipeline. If the publisher agent can’t produce valid JSON, return the markdown text directly. The goal is to always produce something useful, even when individual agents fail.
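The publisher fallback might look like the sketch below. It assumes the publisher's raw text and the final markdown draft are both available; `parse_publisher_output` and the `degraded` flag are names invented for this example, and the brace-matching recovery is a heuristic, not a full JSON repair.

```python
import json


def parse_publisher_output(raw, fallback_markdown):
    """Return the publisher's JSON if usable, otherwise wrap the markdown draft."""
    candidates = [raw]
    # Models sometimes wrap JSON in extra prose, so also try the outermost {...}
    start, end = raw.find("{"), raw.rfind("}")
    if start != -1 and end > start:
        candidates.append(raw[start:end + 1])

    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict) and "body" in parsed:
            return {**parsed, "degraded": False}

    # Nothing parsed: degrade to the plain markdown so the caller still gets a post
    return {
        "title": None,
        "meta_description": None,
        "tags": [],
        "body": fallback_markdown,
        "published_at": None,
        "degraded": True,
    }


post = parse_publisher_output("Sorry, I could not format this.", "# My Post\n\nText")
print(post["degraded"], post["body"][:9])  # prints: True # My Post
```

The `degraded` flag lets downstream code decide whether to publish automatically or hold the post for human review.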
Summary
- Multi-agent systems split work across specialized agents, each with a focused role
- The orchestrator-worker pattern gives one agent (the orchestrator) responsibility for coordination
- The content pipeline (Research → Write → Edit → Publish) is a concrete example with four worker agents and one orchestrator
- Agents communicate through shared state (simplest), message queues (async), or shared databases (persistent)
- Don’t use multi-agent for tasks a single agent handles well — it adds complexity without benefit
- Build in retry logic and graceful degradation since multi-agent systems have more failure points
Next: Safety and Guardrails — because agents that take real-world actions can do real-world harm.
