Documentation
Swarm Engine runs multiple AI agents on the same task -- researchers find the relevant code, implementers write the changes, reviewers catch the mistakes. This is the full reference for configuring agents, choosing patterns, mixing backends, and getting the most out of every run.
Installation
Swarm Engine requires Node.js 20+ and at least one AI coding tool (Claude Code, Codex, or Gemini CLI).
    $ npm install -g swarm-engine
    $ swarm install    # Set up Claude Code integration
    $ swarm doctor     # Verify everything works
swarm install configures Claude Code by symlinking agents, slash commands, and hooks into ~/.claude/. swarm doctor checks your Node version, CLI tools, and integration setup.
Quickstart
Run your first multi-agent orchestration:
    # Run a full research > implement > review cycle
    $ swarm orchestrate "add rate limiting to the API"

    # Preview the plan without executing
    $ swarm orchestrate "add rate limiting" --dry-run

    # Use a different pattern
    $ swarm orchestrate "fix auth bypass" --pattern red-team

    # Run a single agent
    $ swarm run researcher "how does the auth middleware work?"
Full Walkthrough
Here's a complete session from scratch -- initializing a project, previewing a plan, running an orchestration, and inspecting the results.
    $ cd my-api
    $ swarm init --yes
    Project: my-api
    Backend: claude
    Pattern: hybrid
    Budget: $5.00
    Created .swarm/
    $ swarm plan "add rate limiting to POST /api/users"
    Execution Plan: hybrid
    Task: add rate limiting to POST /api/users

    Phase 1: research [parallel]
      researcher-code -- researcher (claude-sonnet-4-6)
      researcher-context -- researcher (claude-sonnet-4-6)
    Phase 2: implement [sequential]
      implementer -- implementer (claude-opus-4-6)
    Phase 3: review [parallel]
      reviewer-security -- security-reviewer (claude-opus-4-6)
      reviewer-convention -- reviewer (claude-sonnet-4-6)

    Est. cost: $0.18
    Est. duration: 94s
    Est. tokens: 28,400
    Quality: 87%
    $ swarm orchestrate "add rate limiting to POST /api/users"
    Swarm Engine - hybrid pattern

    Phase 1: research ━━━━━━━━━━━━━━━━━━━━ 100% 38s
      ✓ researcher-code      claude-sonnet-4-6   3.2K tokens
      ✓ researcher-context   claude-sonnet-4-6   1.8K tokens
    Phase 2: implement ━━━━━━━━━━━━━━━━━━━ 100% 1m 42s
      ✓ implementer          claude-opus-4-6     12.1K tokens
    Phase 3: review ━━━━━━━━━━━━━━━━━━━━━━ 100% 52s
      ✓ reviewer-security    claude-opus-4-6     4.8K tokens
      ✓ reviewer-convention  claude-sonnet-4-6   3.1K tokens

    ┌──────────────────────────────────────────────┐
    │ ✓ Orchestration complete                     │
    │ Pattern: hybrid (3 phases, 5 agents)         │
    │ Tokens: 24,900 | Cost: $0.21 | Time: 3m 12s  │
    └──────────────────────────────────────────────┘

    Changes:
      src/middleware/rate-limit.ts        | 48 +++
      src/routes/users.ts                 |  3 +-
      tests/middleware/rate-limit.test.ts | 62 +++

    Template name (or Enter to skip):
    $ swarm memory search "rate limiting"
    Found 1 result(s):

    Rate limiting implementation outcome [outcome]
      Completed hybrid orchestration for rate limiting on POST /api/users...
      repo: my-api
Project Setup
Initialize Swarm Engine in a project for custom agents and templates:
    $ cd my-project
    $ swarm init
    Project: my-project
    Available backends: claude ✓ codex ✓ gemini -
    Created:
      .swarm/config.yml
      .swarm/agents/
      .swarm/templates/
The agent resolution chain loads from three locations (project overrides global):
- .swarm/agents/ in your project (highest priority)
- ~/.claude/agents/ user-level agents
- Built-in agents shipped with the engine
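The lookup order above can be sketched as a first-match search over directories. This is only an illustration: the `<name>.md` file naming, the `resolve_agent` helper, and the `builtin_dir` parameter are assumptions, not the engine's actual code.

```python
from pathlib import Path
from typing import Optional, Sequence


def resolve_agent(name: str, search_dirs: Sequence[Path]) -> Optional[Path]:
    """Return the first `<name>.md` found, searching directories in priority order."""
    for directory in search_dirs:
        candidate = directory / f"{name}.md"  # assumed naming convention
        if candidate.is_file():
            return candidate
    return None


def default_search_dirs(project_root: Path, builtin_dir: Path) -> list[Path]:
    """Highest priority first, mirroring the three locations listed above."""
    return [
        project_root / ".swarm" / "agents",
        Path.home() / ".claude" / "agents",
        builtin_dir,  # hypothetical: wherever the engine ships its built-in agents
    ]
```

Because the search stops at the first hit, a project-level `reviewer.md` would shadow a user-level or built-in agent of the same name.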
Architecture
How the pieces fit together:
    ┌─────────────────────────────────────────────────────────────┐
    │ CLI / Slash Command / VS Code                               │
    │ swarm orchestrate "task" | /swarm "task" | @swarm task      │
    └────────────────────────────┬────────────────────────────────┘
                                 │
    ┌────────────────────────────▼────────────────────────────────┐
    │ Planner                                                     │
    │ Pattern selection → Cost estimation → Plan search           │
    │ Heuristic injection → Adaptive replanning                   │
    └────────────────────────────┬────────────────────────────────┘
                                 │
    ┌────────────────────────────▼────────────────────────────────┐
    │ DAG Executor                                                │
    │ Phase 1 (parallel) → Phase 2 (sequential) → Phase 3 ...     │
    │ Error retry · Cascade escalation · Agent dropout            │
    └──┬──────────┬──────────┬──────────┬─────────────────────────┘
       │          │          │          │
    ┌──▼──┐    ┌──▼──┐    ┌──▼──┐    ┌──▼──┐   Backends
    │Claude│   │Codex│    │Gemin│    │AI SD│   (pluggable)
    └──┬──┘    └──┬──┘    └──┬──┘    └──┬──┘
       │          │          │          │
    ┌──▼──────────▼──────────▼──────────▼─────────────────────────┐
    │ Memory & Learning                                           │
    │ Traces → Heuristics → Knowledge compounding                 │
    │ SQLite + FTS5 | Obsidian vault (optional)                   │
    └─────────────────────────────────────────────────────────────┘
The flow: your task enters through the CLI, slash command, or VS Code. The Planner selects a pattern, estimates cost, searches for the best plan variant, and injects lessons from past runs. The DAG Executor runs each phase in order, dispatching agents in parallel within a phase. Each agent runs on its configured Backend. After completion, the results feed back into Memory for future runs.
Orchestration
An orchestration is a DAG (directed acyclic graph) of phases. Each phase contains one or more agents that run in parallel. Phases execute sequentially, with output from earlier phases feeding into later ones.
    Phase 1: research [parallel]
      researcher-code       claude-sonnet-4-6   Explore codebase
      researcher-context    claude-sonnet-4-6   Check memory
            |
    Phase 2: implement [sequential]
      implementer           claude-opus-4-6     Build the feature
            |
    Phase 3: review [parallel]
      reviewer-security     claude-opus-4-6     OWASP, injection
      reviewer-perf         claude-sonnet-4-6   Perf regression
      reviewer-convention   claude-sonnet-4-6   Code style
The engine handles agent lifecycle, error recovery with retry, adaptive replanning, cost tracking, and output aggregation between phases.
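The phase model described above (phases in sequence, agents concurrent within a phase, earlier output feeding later phases) can be sketched in a few lines. This is a simplified illustration, not the engine's executor: agents are plain functions here, and `run_phases` and the context-concatenation rule are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# In this sketch an agent is just a function from accumulated context to output text.
Agent = Callable[[str], str]


def run_phases(task: str, phases: list[dict[str, Agent]]) -> str:
    """Run phases in order; agents inside a phase run concurrently."""
    context = task
    for phase in phases:  # each phase must contain at least one agent
        with ThreadPoolExecutor(max_workers=len(phase)) as pool:
            futures = {name: pool.submit(agent, context) for name, agent in phase.items()}
            outputs = {name: future.result() for name, future in futures.items()}
        # Aggregate this phase's outputs so the next phase can see them.
        context += "".join(f"\n[{name}] {text}" for name, text in outputs.items())
    return context
```

Retry, replanning, and cost tracking are left out on purpose; the sketch only shows the ordering guarantee between phases.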
Patterns
Patterns are predefined orchestration workflows. Each defines a sequence of phases, which agents to use, and how they connect.
| Pattern | Phases | Best For |
|---|---|---|
| hybrid | Research → Implement → Review | General features, most tasks |
| tdd | Write Tests → Implement → Verify → Review | Test-driven development |
| red-team | Build → Attack → Harden | Security-sensitive code |
| spike | Approach A + B → Judge → Integrate | Uncertain approach, need to compare |
| discover | Hypothesize → Experiment → Build | Complex, novel problems |
| review-cycle | Implement → Challenge → Fix (loop) | Quality-critical code |
| research | Parallel fan-out investigation | Understanding codebases |
Composing Patterns
Chain patterns together with the pipe operator:
    # TDD then red-team the result
    $ swarm orchestrate "add payment processing" --pattern "tdd | red-team"

    # Research first, then TDD
    $ swarm orchestrate "migrate to new ORM" --pattern "research | tdd"
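One way to picture a piped pattern spec is as a concatenation of each pattern's phases. The sketch below assumes exactly that; the real engine may merge or deduplicate phases differently. The phase names come from the Patterns table, while `PATTERN_PHASES` and `compose` are hypothetical names.

```python
# Phase names copied from the Patterns table; the concatenation rule is an assumption.
PATTERN_PHASES = {
    "hybrid": ["Research", "Implement", "Review"],
    "tdd": ["Write Tests", "Implement", "Verify", "Review"],
    "red-team": ["Build", "Attack", "Harden"],
}


def compose(spec: str) -> list[str]:
    """Split a spec like "tdd | red-team" and concatenate each pattern's phases."""
    phases: list[str] = []
    for name in (part.strip() for part in spec.split("|")):
        if name not in PATTERN_PHASES:
            raise ValueError(f"Unknown pattern: {name}")
        # Prefix each phase with its pattern so repeated names stay distinguishable.
        phases.extend(f"{name}:{phase}" for phase in PATTERN_PHASES[name])
    return phases


print(compose("tdd | red-team"))
```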
Agents
26 specialized agents ship with the engine. Each has a defined role, model preference, tool access, and prompt.
Core Agents
Specialized Reviewers
    $ swarm agents list
    $ swarm agents list --json    # Machine-readable
Model Configuration
Every agent specifies which model it runs on. The model field takes the actual model name from any provider.
    ---
    name: security-reviewer
    model: claude-opus-4-6    # The actual model this agent runs on
    backend: claude
    ---

    ---
    name: codex-reviewer
    model: o3                 # OpenAI model
    backend: codex
    ---

    ---
    name: gemini-researcher
    model: gemini-2.5-pro     # Google model
    backend: gemini
    ---
Overriding Models
| Level | How | Scope |
|---|---|---|
| Agent definition | model: claude-opus-4-6 in frontmatter | That agent everywhere |
| Project config | agents.reviewer.model: claude-sonnet-4-6 in .swarm.yml | That agent in this project |
| Template | model in agent assignment | That agent in that template |
| CLI flag | --model claude-opus-4-6 | All agents in this run |
| Adaptive replanner | Automatic | Adjusts at runtime based on task complexity |
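The static override levels in the table can be read as a precedence chain. The sketch below assumes the most specific setting wins (CLI flag, then template, then project config, then the agent's own definition); the documentation does not state this order explicitly, and runtime adjustments by the adaptive replanner are not modeled. `effective_model` is a hypothetical helper.

```python
from typing import Optional


def effective_model(
    agent_default: str,
    project_override: Optional[str] = None,
    template_override: Optional[str] = None,
    cli_override: Optional[str] = None,
) -> str:
    """Pick the first override that is set (assumed order: CLI > template > project > agent)."""
    for candidate in (cli_override, template_override, project_override):
        if candidate:
            return candidate
    return agent_default


# An agent defined with Opus, downgraded by project config, then forced by the CLI.
print(effective_model("claude-opus-4-6", project_override="claude-sonnet-4-6"))
print(effective_model("claude-opus-4-6", "claude-sonnet-4-6", cli_override="o3"))
```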
    # Force all agents to use Opus
    $ swarm orchestrate "add auth" --model claude-opus-4-6

    # Force all agents to use Haiku (cheapest)
    $ swarm orchestrate "add docs" --model claude-haiku-4-5-20251001

    # Single agent with a specific model
    $ swarm run researcher "explore codebase" -m claude-opus-4-6
Tiers
The planner uses a tier to estimate cost and decide when to upgrade or downgrade models. Tiers are auto-inferred from the model name, so you rarely need to set them manually.
| Tier | When to use | Claude | OpenAI | Gemini |
|---|---|---|---|---|
| cheap | Quick tasks, boilerplate, scanning | claude-haiku-4-5 | gpt-4o-mini, o4-mini | gemini-2.5-flash |
| mid | Most tasks, general implementation | claude-sonnet-4-6 | gpt-4o, gpt-4.1 | gemini-2.5-flash |
| expensive | Hard reasoning, security, architecture | claude-opus-4-6 | o3 | gemini-2.5-pro |
You can set tier explicitly in the frontmatter to override the auto-inference, but the default behavior covers most cases:
    ---
    name: my-agent
    model: o3
    backend: codex
    tier: expensive    # Optional. Auto-inferred from model name if omitted.
    ---
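Tier inference from a model name could work roughly like the prefix lookup below. The model-to-tier pairs come from the tier table above; the prefix-matching rule, the `mid` fallback, and the `infer_tier` helper are assumptions. `gemini-2.5-flash` is left out because the table lists it under two tiers.

```python
from typing import Optional

# Model-to-tier pairs copied from the tier table; prefix matching is an assumption.
TIER_PREFIXES = [
    ("claude-haiku", "cheap"),
    ("gpt-4o-mini", "cheap"),  # must be checked before the shorter "gpt-4o" prefix
    ("o4-mini", "cheap"),
    ("claude-sonnet", "mid"),
    ("gpt-4o", "mid"),
    ("gpt-4.1", "mid"),
    ("claude-opus", "expensive"),
    ("o3", "expensive"),
    ("gemini-2.5-pro", "expensive"),
]


def infer_tier(model: str, explicit: Optional[str] = None, default: str = "mid") -> str:
    """An explicit frontmatter tier wins; otherwise match the model name by prefix."""
    if explicit:
        return explicit
    for prefix, tier in TIER_PREFIXES:
        if model.startswith(prefix):
            return tier
    return default  # assumed fallback for unrecognized models


print(infer_tier("claude-haiku-4-5-20251001"))  # cheap
print(infer_tier("o3"))                         # expensive
```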
The adaptive replanner may swap an expensive agent for a mid one to save cost. If a task is struggling, it may escalate from mid to expensive.
Backends
Different agents can use different AI backends in the same orchestration. Swarm Engine supports four backends, each with its own CLI tool or SDK.
| Backend | CLI Tool | Models | API Key |
|---|---|---|---|
| claude | Claude Code | Opus, Sonnet, Haiku | Via claude auth login |
| codex | OpenAI Codex CLI | o4-mini, o3, GPT-4.1 | OPENAI_API_KEY |
| gemini | Google Gemini CLI | Gemini 2.5 Pro, 2.5 Flash | GEMINI_API_KEY |
| vercel-ai | Programmatic (no CLI) | 20+ providers | Per-provider env vars |
Installing Backends
    # Claude Code (required for default backend)
    $ npm install -g @anthropic-ai/claude-code

    # OpenAI Codex CLI
    $ npm install -g @openai/codex
    $ export OPENAI_API_KEY=sk-...

    # Google Gemini CLI
    $ npm install -g @google/gemini-cli
    $ export GEMINI_API_KEY=AIza...

    # Vercel AI SDK (for programmatic access to 20+ providers)
    $ npm install ai @ai-sdk/openai @ai-sdk/anthropic @ai-sdk/google

    # Check which backends are available
    $ swarm doctor
Using Different Backends
    # Run entire orchestration on Codex
    $ swarm orchestrate "add rate limiting" --backend codex

    # Run entire orchestration on Gemini
    $ swarm orchestrate "add logging" --backend gemini

    # Run a single agent on Codex
    $ swarm run researcher "explore the auth module" --backend codex
Mixing Backends Per Agent
The real power is mixing backends in the same orchestration. Set backend and model in the agent's frontmatter:
    ---
    name: codex-reviewer
    description: "Code reviewer powered by OpenAI"
    model: o3
    backend: codex
    tools: Read, Glob, Grep, Bash
    ---
    You are a code reviewer. Review for correctness, edge cases, and security.

    ---
    name: gemini-researcher
    description: "Research agent powered by Google Gemini"
    model: gemini-2.5-pro
    backend: gemini
    tools: Read, Glob, Grep
    ---
    You are a research agent. Thoroughly investigate the codebase and report findings.
Now when you run an orchestration, Claude handles implementation, Codex handles review, and Gemini handles research -- all in the same run:
    $ swarm orchestrate "add payment processing"
    Pattern: hybrid
    Phase 1: research
      gemini-researcher    gemini   gemini-2.5-pro
    Phase 2: implement
      implementer          claude   claude-sonnet-4-6
    Phase 3: review
      codex-reviewer       codex    o3
      security-reviewer    claude   claude-opus-4-6
Vercel AI SDK (Advanced)
For programmatic access without CLI tools, use the vercel-ai backend. Specify the model in provider/model format:
    ---
    name: llama-researcher
    model: groq/llama-3.3-70b-versatile
    backend: vercel-ai
    ---
Cost Guide
Every orchestration consumes API tokens billed by the underlying providers. Here's what to expect.
Typical Costs by Pattern
| Pattern | Agents | Tokens | Estimated Cost |
|---|---|---|---|
| hybrid | 5-6 | 25-40K | $0.15 - $0.40 |
| tdd | 6-8 | 35-55K | $0.25 - $0.60 |
| red-team | 6-8 | 40-60K | $0.30 - $0.70 |
| spike | 5-7 | 40-60K | $0.30 - $0.65 |
| review-cycle | 4-6 | 30-50K | $0.20 - $0.50 |
| research | 2-4 | 10-20K | $0.05 - $0.15 |
| Single agent (swarm run) | 1 | 5-15K | $0.02 - $0.10 |
These are rough estimates for typical tasks. Complex tasks with large codebases will use more tokens. Simple tasks will use less.
Tier Pricing
Cost depends heavily on which tier the agents use:
| Tier | Input (per 1M tokens) | Output (per 1M tokens) | Use for |
|---|---|---|---|
| cheap | $0.25 | $1.25 | Scanning, docs, dependencies |
| mid | $3.00 | $15.00 | Research, most implementation |
| expensive | $15.00 | $75.00 | Security review, complex reasoning |
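To see how the tier prices translate into a per-agent cost, the arithmetic can be written out directly. The prices below are the ones from the table above; the token counts in the example and the `agent_cost` helper are illustrative assumptions.

```python
# USD per 1M tokens as (input, output), copied from the tier pricing table.
TIER_PRICES = {
    "cheap": (0.25, 1.25),
    "mid": (3.00, 15.00),
    "expensive": (15.00, 75.00),
}


def agent_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one agent run at the given tier."""
    input_price, output_price = TIER_PRICES[tier]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical agent that reads 10,000 tokens and writes 2,000.
print(f"${agent_cost('mid', 10_000, 2_000):.2f}")        # $0.06
print(f"${agent_cost('expensive', 10_000, 2_000):.2f}")  # $0.30
```

The same workload costs five times more on the expensive tier than on mid, which is why reserving expensive models for the phases that need them has such a large effect on the total.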
Controlling Cost
    # Preview cost before running (free, instant)
    $ swarm plan "add auth"

    # Set a hard budget cap
    $ swarm orchestrate "add auth" --budget 1

    # Test with mock execution (free, no API calls)
    $ swarm orchestrate "add auth" --mock

    # Force cheaper models
    $ swarm orchestrate "add auth" --model claude-sonnet-4-6
swarm orchestrate
Run a full multi-phase orchestration pattern.
swarm orchestrate <task> [options]
| Option | Default | Description |
|---|---|---|
| -p, --pattern | hybrid | Orchestration pattern to use |
| -m, --model | | Override model for all agents (e.g., claude-opus-4-6, o3) |
| -b, --backend | claude | Execution backend |
| --budget | 5 | Max budget in USD |
| --dry-run | | Show plan without executing |
| --tui | | Live terminal dashboard |
| --panes | | Run agents in tmux split panes |
| --mock | | Mock execution (no API calls) |
| --verbose | | Debug-level output |
| --json | | Machine-readable JSON output |
    # Full orchestration with live TUI dashboard
    $ swarm orchestrate "add WebSocket support" --tui

    # Test-driven with $2 budget cap
    $ swarm orchestrate "add input validation" --pattern tdd --budget 2

    # Red-team with agents in visible tmux panes
    $ swarm orchestrate "harden the auth flow" --pattern red-team --panes

    # Compose patterns: TDD then adversarial review
    $ swarm orchestrate "add payment API" --pattern "tdd | red-team"

    # Dry run to preview cost and plan
    $ swarm orchestrate "refactor database layer" --dry-run
Running Modes
There are four ways to run an orchestration, each with different visual output:
| Mode | How | Agent Visibility | Terminal Support |
|---|---|---|---|
| Inline progress | swarm orchestrate "task" | Rich progress bars, phase status, live cost counter in your terminal | Any terminal |
| TUI dashboard | swarm orchestrate "task" --tui | Full-screen dashboard with agent status, token usage, phase timeline | Any terminal |
| tmux panes | swarm orchestrate "task" --panes | Each agent runs in a visible tmux split pane showing raw output | Any terminal with tmux installed |
| Claude Code panes | /swarm "task" in Claude Code | Each agent spawns as a teammate in a native split pane | iTerm2, VS Code terminal |
tmux panes require tmux: install it with brew install tmux (or your package manager). They work in any terminal.
Claude Code panes are built into Claude Code and require no extra config. They work automatically in iTerm2 and VS Code. In terminals without pane support, agents still run in the background.
swarm run
Run a single agent with a task. Useful for quick operations that don't need a full orchestration.
swarm run <agent> <task> [options]
| Option | Default | Description |
|---|---|---|
| -m, --model | claude-sonnet-4-6 | Model to use (e.g., claude-opus-4-6, o3) |
| -b, --backend | claude | Execution backend |
| --timeout | 300 | Timeout in seconds |
| --mock | | Mock execution |
    # Research a codebase question
    $ swarm run researcher "how does the caching layer work?"

    # Run the security reviewer on current changes
    $ swarm run security-reviewer "review recent changes for vulnerabilities" -m claude-opus-4-6

    # Debug a specific issue
    $ swarm run debugger "why does the login endpoint return 500?"
swarm plan
Generate an execution plan without running it. Shows phases, agents, estimated cost, duration, and quality score.
swarm plan <task> [options] # Options: -p/--pattern, --budget, --json
    $ swarm plan "add rate limiting" --pattern hybrid
    Execution Plan: hybrid
    Task: add rate limiting
    Budget: $5

    Phase 1: research [parallel]
      researcher-code -- researcher (claude-sonnet-4-6)
      researcher-context -- researcher (claude-sonnet-4-6)
    Phase 2: implement [sequential]
      implementer -- implementer (claude-opus-4-6)
    Phase 3: review [parallel]
      reviewer-security -- security-reviewer (claude-opus-4-6)
      reviewer-perf -- performance-reviewer (claude-sonnet-4-6)
      reviewer-convention -- reviewer (claude-sonnet-4-6)

    Est. cost: $0.1842
    Est. duration: 94s
    Est. tokens: 28,400
    Quality: 87%

    Optimizations applied:
      → Downgraded reviewer-convention from expensive to mid tier (low complexity)
      → Injected 2 heuristics from similar past runs
swarm template
Manage reusable orchestration templates.
    # List available templates
    $ swarm template list

    # Run a template interactively (prompts for parameters)
    $ swarm template run add-feature -i

    # Run with explicit parameters
    $ swarm template run bug-fix --param bug_description="login fails on mobile"

    # Create template from last successful orchestration
    $ swarm template create my-workflow --from-last

    # Create a blank template scaffold
    $ swarm template create my-workflow
swarm memory
Query and manage the knowledge base. The engine learns from every orchestration and stores decisions, patterns, and outcomes.
    # Search the knowledge base
    $ swarm memory search "authentication"
    $ swarm memory search "rate limiting" --type decision --repo my-api

    # Store a new entry
    $ echo "Use Redis for rate limiting, not in-memory" | swarm memory store decision "Rate limit backend choice" --repo my-api

    # List recent entries
    $ swarm memory list --type outcome --limit 5

    # View statistics
    $ swarm memory stats
Entry types: decision, pattern, learning, context, outcome. The engine automatically stores outcomes after each orchestration and injects relevant memories into future plans.
swarm convert
Export agents to work natively in other AI coding tools.
    # List conversion targets
    $ swarm convert --list

    # Convert agents for a specific tool
    $ swarm convert --to copilot     # GitHub Copilot instructions
    $ swarm convert --to cursor      # Cursor rules
    $ swarm convert --to codex       # OpenAI Codex agents
    $ swarm convert --to gemini      # Gemini CLI
    $ swarm convert --to opencode    # OpenCode
    $ swarm convert --to windsurf    # Windsurf

    # Convert only agent definitions (skip commands)
    $ swarm convert --to cursor --agents-only

    # Custom output directory
    $ swarm convert --to copilot --output ./my-copilot-config
swarm agents
Manage agent definitions. Agents are loaded from project, user, installed packs, and built-in directories.
    # List all registered agents
    $ swarm agents list
    $ swarm agents list --json

    # Show full agent definition (frontmatter + prompt)
    $ swarm agents show researcher

    # Create a new agent scaffold
    $ swarm agents new my-reviewer
    $ swarm agents new my-reviewer --dir ~/.swarm/agents

    # Install agents from a GitHub repo
    $ swarm agents install github:myorg/my-agents

    # Test an agent with mock execution
    $ swarm agents test researcher "explore the auth module"

    # Search GitHub for agent packs
    $ swarm agents search security
swarm init
Interactive project setup wizard. Creates .swarm/ directory structure with agents, templates, and a project config file.
    $ swarm init          # Interactive setup
    $ swarm init --yes    # Accept defaults, skip prompts
Detects available backends, prompts for default pattern and budget, creates .swarm.yml, and runs swarm doctor.
swarm verify
Run project verification commands. Auto-detects TypeScript, Node.js, ESLint, Python, Go, and Rust projects.
    $ swarm verify                         # Run all detected checks
    $ swarm verify --cwd ./packages/api    # Verify a subdirectory
swarm status
Show active teams, registered agents, templates, recent orchestrations, and saved checkpoints.
    $ swarm status               # Full status overview
    $ swarm status --backends    # Available execution backends
    $ swarm status --patterns    # Orchestration patterns with phase breakdown
swarm compound
Compound knowledge from orchestration outcomes into searchable solution docs, organized by problem category.
    # Extract solutions from recent traces
    $ swarm compound extract --count 10

    # Search compounded knowledge
    $ swarm compound search "rate limiting"

    # List by category
    $ swarm compound list --category security

    # Statistics and stale entries
    $ swarm compound stats
    $ swarm compound stale
Categories: bug-fix, feature, refactor, security, performance, testing, migration, integration, architecture
swarm learn
Interactive tutorial with 5 guided lessons. Progress is tracked across sessions.
    $ swarm learn      # List lessons with completion status
    $ swarm learn 1    # Lesson 1: Your First Plan
    $ swarm learn 2    # Lesson 2: Run a Mock Orchestration
    $ swarm learn 3    # Lesson 3: Explore Your Agents
    $ swarm learn 4    # Lesson 4: Templates
    $ swarm learn 5    # Lesson 5: Health Check
swarm doctor
Diagnostic tool that checks your installation.
    $ swarm doctor
    Swarm Doctor
      ✓ Node.js 22.1.0
      ✓ jq
      ✓ Claude Code
      ✓ swarm on PATH
      ✓ CLAUDE.md configured
      ✓ ~/.swarm/ exists
      ✓ Claude Code integration installed
    Optional backends:
      ✓ Codex CLI
      - Gemini not installed (npm i -g @google/gemini-cli)
    All checks passed.
Templates
Templates are parameterized orchestrations you can reuse. They define a pattern, parameters, agent customizations, and success criteria.
    name: add-feature
    description: "Add a new feature with tests and documentation"
    parameters:
      feature:
        type: string
        description: "Feature description"
        required: true
      test_framework:
        type: string
        description: "Test framework to use"
        default: "jest"
    pattern: hybrid
    agents:
      researcher:
        extra_rules:
          - "Understand how similar features are implemented"
          - "Identify affected modules and side effects"
      implementer:
        extra_rules:
          - "Follow existing code patterns"
          - "Write tests using ${test_framework}"
      reviewer:
        extra_rules:
          - "Check for edge cases and error handling"
          - "Verify test coverage"
    success_criteria:
      - "Feature works as described"
      - "Tests pass with good coverage"
      - "No regressions"
    tags: [feature, implementation]
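The `${test_framework}` placeholder in the template above suggests simple parameter substitution. A minimal sketch of how that could work, using Python's standard `string.Template`, follows; the `render_rules` helper and the way defaults and required parameters are resolved are assumptions, not the engine's documented behavior.

```python
from string import Template


def render_rules(rules: list[str], declared: dict[str, dict], supplied: dict[str, str]) -> list[str]:
    """Fill ${name} placeholders using supplied values, falling back to declared defaults."""
    values: dict[str, str] = {}
    for name, spec in declared.items():
        if name in supplied:
            values[name] = supplied[name]
        elif "default" in spec:
            values[name] = spec["default"]
        elif spec.get("required"):
            raise ValueError(f"Missing required parameter: {name}")
    return [Template(rule).substitute(values) for rule in rules]


# Parameter declarations mirroring the add-feature template.
declared = {
    "feature": {"type": "string", "required": True},
    "test_framework": {"type": "string", "default": "jest"},
}
rules = ["Follow existing code patterns", "Write tests using ${test_framework}"]
print(render_rules(rules, declared, {"feature": "rate limiting"}))
```

With no `test_framework` supplied, the second rule falls back to the declared default and reads "Write tests using jest".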
8 built-in templates ship with the engine: add-feature, add-endpoint, bug-fix, refactor, migration, security-audit, explore, and fix-pr.
Custom Agents
Agents are markdown files with YAML frontmatter. Create custom agents in .swarm/agents/ or ~/.claude/agents/.
    ---
    name: api-reviewer
    description: "Reviews API endpoints for consistency and best practices"
    model: claude-sonnet-4-6
    tools: Read, Glob, Grep, Bash
    disallowedTools: Write, Edit
    permissionProfile: safe
    maxTurns: 20
    ---
    You are an API Review Agent. Check every endpoint for:

    1. Consistent naming (REST conventions)
    2. Proper HTTP status codes
    3. Input validation on all parameters
    4. Rate limiting configuration
    5. Authentication requirements
    6. Response schema consistency

    ## Output Format
    ```
    ## API Review
    - Endpoint: [path]
    - Status: [pass/fail]
    - Issues: [list]
    ```
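An agent file like the one above splits into a frontmatter block and a prompt body. The sketch below shows one way to separate the two without a YAML library. It only handles flat `key: value` lines with no inline comments, and `parse_agent` is a hypothetical helper rather than part of the engine.

```python
def parse_agent(text: str) -> tuple[dict[str, str], str]:
    """Split an agent file into (frontmatter fields, prompt body)."""
    lines = [line.rstrip() for line in text.strip().splitlines()]
    if not lines or lines[0] != "---":
        raise ValueError("agent file must start with a '---' frontmatter fence")
    end = lines.index("---", 1)  # raises ValueError if the closing fence is missing
    fields: dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue  # tolerate blank lines inside the frontmatter
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip().strip('"')
    prompt = "\n".join(lines[end + 1:]).strip()
    return fields, prompt


sample = """---
name: api-reviewer
model: claude-sonnet-4-6
tools: Read, Glob, Grep, Bash
---
You are an API Review Agent.
"""
fields, prompt = parse_agent(sample)
print(fields["model"], "|", prompt)
```

All values stay strings in this sketch, so a field such as `maxTurns` would still need converting to an integer by the caller.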
Frontmatter fields: name, description, model (actual model name, e.g., claude-sonnet-4-6, o3, gemini-2.5-pro), tier (optional: cheap/mid/expensive), tools, disallowedTools, permissionProfile (safe/default/bypassPermissions), maxTurns, backend (claude/codex/gemini/vercel-ai).
Memory & Learning
The engine has a built-in knowledge base that stores decisions, patterns, learnings, and outcomes. It learns from every orchestration and feeds that knowledge into future plans.
Storage
Memory is backed by SQLite with FTS5 full-text search. The database is created automatically on first use -- no setup required.
| Component | Location | Setup |
|---|---|---|
| Memory database | ~/.swarm/data/memory.db | Auto-created on first use |
| Heuristic store | ~/.swarm/data/heuristics.db | Auto-created on first use |
| Trace store | ~/.swarm/data/traces.db | Auto-created on first use |
| Obsidian vault (optional) | ~/swarm-vault/ or $SWARM_VAULT | Manual setup |
All SQLite databases and the ~/.swarm/ directory are created automatically. You don't need to install SQLite separately -- it's bundled via better-sqlite3.
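For readers unfamiliar with FTS5, the sketch below shows the kind of full-text lookup it enables, using Python's built-in `sqlite3` module (which needs an SQLite build with FTS5 enabled, as most are). The table name and columns are hypothetical; the engine's real schema is not documented here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the engine's actual table layout may differ.
conn.execute(
    "CREATE VIRTUAL TABLE memory USING fts5(title, body, type UNINDEXED, repo UNINDEXED)"
)
conn.executemany(
    "INSERT INTO memory (title, body, type, repo) VALUES (?, ?, ?, ?)",
    [
        ("Rate limit backend choice", "Use Redis for rate limiting, not in-memory", "decision", "my-api"),
        ("Session backend", "Use Redis for sessions", "decision", "my-api"),
    ],
)


def search(query: str) -> list[str]:
    """Return titles of entries matching every term in the query, best match first."""
    rows = conn.execute(
        "SELECT title FROM memory WHERE memory MATCH ? ORDER BY rank", (query,)
    ).fetchall()
    return [title for (title,) in rows]


print(search("rate limiting"))  # ['Rate limit backend choice']
```

Space-separated terms are combined with an implicit AND, so "rate limiting" matches only the first entry while "redis" would match both.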
Obsidian Vault (Optional)
If you use Obsidian, the engine can write every memory entry as a markdown file to a vault directory. This gives you cross-machine sync (via Obsidian Sync or git), human-browsable knowledge, and graph visualization of related entries.
    # Option 1: Use the default location
    $ mkdir -p ~/swarm-vault

    # Option 2: Use a custom location via environment variable
    $ export SWARM_VAULT=/path/to/your/obsidian/vault

    # Then open the vault directory in Obsidian
When a vault directory exists, every swarm memory store call writes both to SQLite (primary) and to the vault as a markdown file with YAML frontmatter. The vault structure looks like:
    ~/swarm-vault/
      decisions/
        2026-04-05-rate-limit-backend-choice.md
        2026-04-03-auth-session-storage.md
      patterns/
        2026-04-04-error-handling-pattern.md
      learnings/
        2026-04-05-q1-refactor-results.md
      repos/
        my-api/
          rate-limiting.md
          auth-middleware.md
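The dated file names in the vault listing look like a date prefix plus a slug of the entry title. A small sketch of that naming scheme, together with a markdown note carrying YAML frontmatter, is shown below. The slug rule and the frontmatter keys are assumptions inferred from the example file names, and both helpers are hypothetical.

```python
import re
from datetime import date


def vault_filename(title: str, created: date) -> str:
    """Build a `YYYY-MM-DD-slug.md` name like the ones in the vault listing."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{created.isoformat()}-{slug}.md"


def vault_note(title: str, entry_type: str, repo: str, body: str) -> str:
    """Render a markdown note; the frontmatter keys are illustrative only."""
    return f"---\ntitle: {title}\ntype: {entry_type}\nrepo: {repo}\n---\n\n{body}\n"


print(vault_filename("Rate limit backend choice", date(2026, 4, 5)))
# 2026-04-05-rate-limit-backend-choice.md
```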
Heuristic Learning
After each orchestration, the engine extracts lessons from execution traces. These heuristics are injected into future plans for similar tasks.
    Run 1: "add auth" → research took 60s, implement took 120s
    Run 2: "add rate limiting" → engine injects heuristic: "Similar tasks average 90s research. Allocate accordingly."
    Run 3: "add caching" → engine now has 2 data points, refines estimate
Knowledge Compounding
Solutions are organized by problem category (auth, data, API, testing, etc.) and retrieved when similar tasks appear.
    $ swarm compound extract          # Extract knowledge from recent traces
    $ swarm compound search "auth"    # Search compounded knowledge
    $ swarm compound stats            # Show category breakdown
    $ swarm compound stale            # Find outdated entries
Execution Traces
Every orchestration is recorded as a trace with full timing, token usage, and outcome data. Traces feed the heuristic learning and plan optimization systems.
Cross-Tool Export
Export your agent definitions to work natively in 6 other AI coding tools. The converter translates frontmatter, prompts, and tool configurations into each tool's native format.
| Target | Output | What's Generated |
|---|---|---|
| copilot | .github/copilot-instructions.md | Copilot instruction files |
| cursor | .cursor/rules/ | Cursor rule files |
| codex | .codex/agents/ | Codex agent definitions |
| gemini | .gemini/agents/ | Gemini agent configs |
| opencode | .opencode/agents/ | OpenCode agent files |
| windsurf | .windsurf/ | Windsurf rule files |
Project Config
The .swarm.yml file in your project root configures defaults for all orchestrations in that project.
    project: my-api
    default_pattern: hybrid
    default_backend: claude
    cost_budget: 5.00
    agents:
      reviewer:
        extra_rules:
          - "Check for PII in log statements"
          - "Verify all endpoints have rate limiting"
    context_files:
      - docs/architecture.md
      - docs/api-spec.md
Create this file with swarm init or manually. Agent extra_rules are appended to the agent's prompt for every orchestration in this project.
Claude Code Integration
After running swarm install, Swarm Engine integrates directly into Claude Code with slash commands, agents, and hooks.
Setup
    $ swarm install
    Installed:
      ~/.claude/agents/     26 agent definitions
      ~/.claude/commands/   9 slash commands
      ~/.claude/hooks/      event hooks
      ~/.claude/CLAUDE.md   swarm snippet injected
Slash Commands
    # Full orchestration (hybrid pattern by default)
    /swarm "add rate limiting to the API"

    # Pattern-specific commands
    /research "how does the auth system work?"
    /tdd "add input validation to user endpoints"
    /red-team "harden the payment flow"
    /review-cycle "refactor the database layer"

    # Utility commands
    /diff-review                 # Review current git diff
    /fix-pr "fix CI failures"    # Fix a failing pull request
Split Panes
When you run /swarm in Claude Code, each agent spawns as a teammate in its own split pane. This is Claude Code's native team protocol, not tmux.
| Terminal | Pane Support | Notes |
|---|---|---|
| iTerm2 (macOS) | Full split panes | Each agent visible in its own pane |
| VS Code terminal | Full split panes | Works in the integrated terminal |
| Terminal.app | Background only | Agents run but no visible panes |
| Other terminals | Varies | Falls back to background if unsupported |
No extra configuration needed. Claude Code detects your terminal and uses native pane splitting when available. Agents always complete their work regardless of pane visibility.
For visible panes in any terminal, use swarm orchestrate "task" --panes (uses tmux, works everywhere).
VS Code Extension
The VS Code extension adds Swarm Engine to the Copilot Chat sidebar.
    # Use the @swarm chat participant
    @swarm add rate limiting to the API

    # Or use the command palette
    Cmd+Shift+P → Swarm: Orchestrate
Install from the VS Code Marketplace.
Environment Variables
| Variable | Purpose | Required |
|---|---|---|
| OPENAI_API_KEY | API key for Codex backend | Only if using --backend codex |
| GEMINI_API_KEY | API key for Gemini backend | Only if using --backend gemini |
| GOOGLE_API_KEY | Alternate key for Gemini | Alternative to GEMINI_API_KEY |
| SWARM_VAULT | Path to Obsidian vault directory | No (defaults to ~/swarm-vault/) |
| SWARM_DATA_DIR | Override data directory | No (defaults to ~/.swarm/data/) |
Claude Code authentication is handled through claude auth login, not environment variables.
Best Practices
Choosing a Pattern
- Start with hybrid for most tasks. It covers research, implementation, and review.
- Use research when you just want to understand code, not change it.
- Use tdd when test coverage matters. The tests-first approach catches requirements gaps early.
- Use red-team for anything touching auth, payments, or user data.
- Use spike when you're not sure which approach to take. Let two implementations compete.
Keeping Costs Down
- Always run swarm plan first to preview the cost estimate.
- Use --budget to set a hard cap. The engine will stop if the budget is exceeded.
- Use --mock to test orchestration flow without API calls.
- For simple tasks, override with a cheaper model: --model claude-sonnet-4-6.
- The adaptive replanner automatically downgrades when tasks are simpler than expected.
- Use swarm run for single-agent tasks instead of a full orchestration.
Writing Custom Agents
- Start from an existing agent and modify it. Don't write from scratch.
- Be specific in the prompt. "Review for security" is weak. "Check for SQL injection in parameterized queries, XSS in template rendering, and IDOR in resource endpoints" is strong.
- Always define an output format. Agents without structured output produce inconsistent results.
- Set disallowedTools for agents that should be read-only. This prevents accidental file writes.
- Use swarm agents test my-agent "test prompt" to dry-run before using in orchestrations.
Working with Memory
- Run swarm memory stats periodically to see what the engine has learned.
- Store important decisions manually: echo "Use Redis for sessions" | swarm memory store decision "Session backend".
- Use swarm compound stale to find outdated knowledge entries.
- Agents check memory before starting. The more you store, the better future orchestrations get.
Glossary
| Term | Definition |
|---|---|
| Agent | A specialized AI worker with a defined role, model, tools, and prompt. Runs a single task within a phase. |
| Backend | The AI tool that executes an agent: Claude Code, Codex CLI, Gemini CLI, or Vercel AI SDK. |
| DAG | Directed acyclic graph. The execution structure where phases run in dependency order. |
| Heuristic | A lesson extracted from a past execution trace. Injected into future plans to improve estimates. |
| Model | The specific AI model an agent runs on (e.g., claude-opus-4-6, o3). |
| Orchestration | A complete multi-agent run from start to finish, following a pattern through multiple phases. |
| Pattern | A predefined workflow defining which phases run, with what agents, in what order (e.g., hybrid, tdd, red-team). |
| Phase | A step in an orchestration. Contains one or more agents that run in parallel or sequentially. |
| Template | A parameterized, reusable orchestration config. Like a pattern but with custom agent rules and parameters. |
| Tier | Cost/capability classification: cheap (fast, simple tasks), mid (general work), expensive (hard reasoning). |
| Trace | A recorded execution with timing, tokens, cost, and outcome. Feeds the learning system. |
Troubleshooting
Installation
| Problem | Fix |
|---|---|
| swarm: command not found | Run npm install -g swarm-engine. If it is already installed, check that your PATH includes the npm global bin directory: $(npm prefix -g)/bin |
| swarm doctor says "Node.js < 20" | Upgrade Node: nvm install 22 or brew install node |
| swarm doctor says "CLAUDE.md missing" | Run swarm install to inject the Claude Code integration snippet |
| swarm doctor says "jq not found" | brew install jq (macOS) or apt install jq (Linux) |
Orchestration
| Problem | Fix |
|---|---|
| Orchestration stuck on "running" | Check swarm status. An agent may be waiting for input. Set --budget so a runaway run stops at a cost cap, or press Ctrl+C and check the event log at ~/.swarm/data/events.jsonl |
| "Unknown pattern: xyz" | Check available patterns: swarm status --patterns. Built-in: hybrid, tdd, red-team, spike, discover, review-cycle, research |
| "Unknown agent: xyz" | Check registered agents: swarm agents list. Custom agents go in .swarm/agents/ |
| Budget exceeded mid-run | Normal. The engine stops when the budget cap is hit. Increase with --budget 10 or use swarm plan to estimate first |
| Agent returning low confidence | The task may be too vague or the model too weak. Try a more specific prompt or --model claude-opus-4-6 |
Backends
| Problem | Fix |
|---|---|
| "Authentication error" | For Claude: claude auth login. For Codex: check OPENAI_API_KEY. For Gemini: check GEMINI_API_KEY |
| "Backend not available: codex" | npm install -g @openai/codex then verify with swarm doctor |
| "Backend not available: gemini" | npm install -g @google/gemini-cli then verify with swarm doctor |
| tmux panes not showing | Install tmux: brew install tmux. Verify: tmux -V |
Memory
| Problem | Fix |
|---|---|
| "No results found" on search | Memory builds up over time. Run a few orchestrations first. Or store entries manually with swarm memory store |
| Vault not syncing | Check the vault directory exists: ls ~/swarm-vault/. Or set export SWARM_VAULT=/your/path |
| SQLite database locked | Another swarm process may be running. Check with swarm status or ps aux \| grep swarm |