Software delivery teams

Self-Reported: All claims are the subject's own. No external evidence is on record yet. Curated.

3 proof

📐

Aider Architect/Editor

Aider (Paul Gauthier)

Two-role pipeline: a reasoning architect plus a format-specialist editor, scoring 85% on Aider's code editing benchmark.

Architect · o1-preview, Editor · o1-mini

Pipeline · 2 agents · Aider

Outcome: 85%

🔀

Anthropic Orchestrator-Workers Pattern

Anthropic

Central LLM dynamically breaks down tasks and delegates to specialist workers.

Orchestrator–Worker · 2 agents · Claude API

Orchestrator, Worker

software-delivery, research, data-extraction

💬

AutoGen Group Chat

Microsoft Research (AutoGen)

Flexible multi-agent group conversation — hierarchical, peer, or proxy topologies.

updated 2y ago

Supervisor · 3 agents · AutoGen

ConversableAgent

software-delivery, research, data-extraction

Outcome: 69.5%

💬

ChatDev Communicative Pipeline

Academic / Open-Source (ChatDev & MetaGPT)

5-role sequential pipeline — 22,949 tokens, 148s per software task.

updated 2y ago

Pipeline · 5 agents · ChatDev

CEO, CTO, Programmer, Reviewer, +1 more

Claude Code Agent Teams (Experimental)

Anthropic

Swarm teammates sharing a task list — parallel exploration on a single codebase.

Teammate A, Teammate B, Teammate C

Swarm · 3 agents · Claude Code

🔱

Claude Code Sub-agents Pattern

Anthropic

Lead agent spawns specialist subagents in isolated context windows.

Orchestrator–Worker · 2 agents · Claude Code

Lead, Subagent

🏆

Claude SWE-Bench Team

Anthropic

Single-agent software engineer achieving 49% on SWE-bench Verified.

Solo + Tools · 1 agent · Claude API

Software Engineer · Claude 3.5 Sonnet

Outcome: 49%

🕹️

Magentic-One

Microsoft Research

Microsoft Research generalist 5-agent system: GAIA 32.33%, WebArena 32.8%.

Supervisor · 5 agents · AutoGen

Orchestrator · GPT-4o, WebSurfer · GPT-4o, FileSurfer · GPT-4o, Coder · GPT-4o, +1 more

research, software-delivery, data-extraction

📊

MALBO Bayesian-Optimized Team

University of Milano-Bicocca (MALBO)

Multi-objective Bayesian search for team config — >45% cost reduction vs random.

research, software-delivery

Supervisor · 3 agents · MALBO

Optimized Team Member

🏭

MetaGPT Software Dev Pipeline

Academic / Open-Source (ChatDev & MetaGPT)

5-agent SOP-encoded pipeline — 124.3 tokens/LoC, executability 3.75/4.

updated 2y ago

Pipeline · 5 agents · MetaGPT

Product Manager, Architect, Project Manager, Engineer, +1 more

Outcome: 85.9%

🙌

OpenHands (OpenDevin)

All Hands AI (OpenHands)

Open-source AI software developer with sandboxed runtime — ICLR 2025.

Solo + Tools · 1 agent · OpenHands

AI Developer

Outcome: 26%