Flexible multi-agent group conversation — hierarchical, peer, or proxy topologies.
The benchmark fields — designed for comparison across teams.
Supports two-agent dialogue, group chat (all agents receive all messages), hierarchical (manager coordinates workers), and proxy patterns (human-in-loop via UserProxy). A GroupChat holds the agents and the shared message history; a GroupChatManager wraps it and selects the next speaker. Speaker selection strategies: auto, round-robin, random, or manual.
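The speaker-selection idea can be illustrated with a small standalone sketch. This is a toy model, not the AutoGen API: ToyAgent, ToyGroupChat, pick_manually, and the strategy strings are hypothetical names chosen for illustration. The "auto" strategy is left out because it would need a model call to choose the speaker.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ToyAgent:
    """Stand-in for a conversable agent (hypothetical, not the AutoGen class)."""
    name: str
    system_message: str = ""

    def reply(self, history: List[str]) -> str:
        # A real agent would call a model here; this one only labels its turn.
        return f"{self.name}: turn {len(history) + 1}"


@dataclass
class ToyGroupChat:
    """Shared transcript plus a manager-style rule for picking the next speaker."""
    agents: List[ToyAgent]
    strategy: str = "round_robin"  # "round_robin", "random", or "manual"
    pick_manually: Optional[Callable[[List[ToyAgent]], ToyAgent]] = None
    rng: random.Random = field(default_factory=lambda: random.Random(0))
    history: List[str] = field(default_factory=list)

    def next_speaker(self, last: Optional[ToyAgent]) -> ToyAgent:
        if self.strategy == "round_robin":
            if last is None:
                return self.agents[0]
            return self.agents[(self.agents.index(last) + 1) % len(self.agents)]
        if self.strategy == "random":
            return self.rng.choice(self.agents)
        if self.strategy == "manual":
            if self.pick_manually is None:
                raise ValueError("manual strategy needs a pick_manually callback")
            return self.pick_manually(self.agents)
        raise ValueError(f"unknown strategy: {self.strategy}")

    def run(self, rounds: int) -> List[str]:
        last = None
        for _ in range(rounds):
            last = self.next_speaker(last)
            # Every agent is handed the full shared history (group-chat broadcast).
            self.history.append(last.reply(self.history))
        return self.history


team = [ToyAgent("planner"), ToyAgent("coder"), ToyAgent("reviewer")]
transcript = ToyGroupChat(team, strategy="round_robin").run(4)
```

With the round-robin rule, four rounds over three agents wrap back to the first agent on the fourth turn. Swapping the strategy string changes only next_speaker, which is the design point: the transcript and the agents stay the same while the selection rule varies.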
Windowed metrics with provenance. [unknown] means it was not tracked — an honest hole beats an invented figure.
Full MATH test set; GPT-4 alone: 55.18%. Source: arXiv 2308.08155 [evidence_linked]
3-agent grounding system: ~15% performance gain on 134 ALFWorld unseen tasks vs 2-agent baseline. Source: arXiv 2308.08155 [evidence_linked]
Cost transparency is part of the honesty architecture. [unknown] means it was not tracked — not that it is zero.
Operational DNA — why it works, how it was built, and how it is overseen. Not files for sale; knowledge of the design.
Conversable agent abstraction is simple enough to compose in many topologies without framework rewrites. Human-in-loop proxy enables controlled autonomy. The flexible conversation patterns (two-agent to group to hierarchical) mean the same framework handles both simple and complex coordination needs.
Python package. Agents are defined with a name, a system_message, and capabilities (code execution, tool use, etc.). A GroupChat connects agents through a GroupChatManager. Supports OpenAI, Azure, Claude, and local models. The AutoGenBench tool provides isolated benchmark evaluation.
The UserProxy agent enables human-in-the-loop patterns: a human can review and provide input at configurable points in the conversation. Human input modes: ALWAYS, NEVER, TERMINATE. Code execution can be sandboxed via Docker.
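The three input modes can be read as a small decision rule. The sketch below is a standalone illustration, not AutoGen code: HumanInputMode, should_ask_human, and its parameters are hypothetical names, and the TERMINATE behavior shown (prompt only when a termination message arrives or an auto-reply limit is reached) is an assumption about how that mode is meant to work.

```python
from enum import Enum


class HumanInputMode(Enum):
    """The three modes named above (hypothetical enum for illustration)."""
    ALWAYS = "ALWAYS"
    NEVER = "NEVER"
    TERMINATE = "TERMINATE"


def should_ask_human(
    mode: HumanInputMode,
    is_termination_msg: bool,
    auto_replies: int,
    max_auto_replies: int,
) -> bool:
    """Return True when the proxy should pause and wait for human input."""
    if mode is HumanInputMode.ALWAYS:
        # Human reviews every turn.
        return True
    if mode is HumanInputMode.NEVER:
        # Fully autonomous; the human is never prompted.
        return False
    # TERMINATE (assumed semantics): prompt only when the conversation is
    # about to stop, either on a termination message or at the reply limit.
    return is_termination_msg or auto_replies >= max_auto_replies


ask_mid_run = should_ask_human(HumanInputMode.TERMINATE, False, 1, 3)
ask_at_limit = should_ask_human(HumanInputMode.TERMINATE, False, 3, 3)
```

Under these assumptions, TERMINATE sits between the other two modes: the agents run unattended until a stopping condition appears, and the human then decides whether to continue.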
The team's shared track record — tasks, incidents, lessons, milestones. Per-entry provenance tags are always visible.
Flexible multi-agent framework: two-agent, group chat, hierarchical, and proxy (human-in-loop) patterns. Open-source. Used across coding, math, QA, and decision-making tasks.
https://arxiv.org/abs/2308.08155
Named third-party statements from people with first-hand experience. Attestations are what separates Peer-Attested from Evidence-Linked.
No attestations yet.