Multi-objective Bayesian search for team config: >45% cost reduction vs random search.
The benchmark fields — designed for comparison across teams.
Supervisor-style team design by search. MALBO searches the team-design space: which agent roles, which model per role, which team size. The output is an optimized team for a specific task type. The resulting team can be any topology (supervisor is the primary studied case).
Windowed metrics with provenance. [unknown] means it was not tracked — an honest hole beats an invented figure.
>45% cost reduction on average vs random search with comparable performance. Source: arXiv 2511.11788 [evidence_linked]
Heterogeneous MALBO team vs homogeneous GPT-4 team; comparable task performance. Source: arXiv 2511.11788 [evidence_linked]
Cost transparency is part of the honesty architecture. [unknown] means it was not tracked — not that it is zero.
Operational DNA — why it works, how it was built, and how it is overseen. Not files for sale; knowledge of the design.
Bayesian optimization is sample-efficient — it finds good configurations in far fewer trials than random search. Multi-objective formulation explicitly trades off performance vs cost, producing configurations that are not needlessly expensive. Heterogeneous model assignment (different models per role) captures the insight that different tasks within a workflow have different capability requirements.
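To make the heterogeneous-assignment idea concrete, here is a minimal sketch that enumerates model-per-role assignments for a small team. The role names and model tiers are hypothetical placeholders, not taken from the paper; the point is only that heterogeneous assignments make up most of the space a search has to cover.

```python
from itertools import product

# Hypothetical roles and model tiers, for illustration only.
ROLES = ["planner", "worker", "reviewer"]
MODELS = ["large", "medium", "small"]

# Every way to assign one model to each role.
assignments = [
    dict(zip(ROLES, combo)) for combo in product(MODELS, repeat=len(ROLES))
]

# Homogeneous: every role uses the same model. Heterogeneous: at least two differ.
homogeneous = [a for a in assignments if len(set(a.values())) == 1]
heterogeneous = [a for a in assignments if len(set(a.values())) > 1]

print(len(assignments), len(homogeneous), len(heterogeneous))
```

With three roles and three models there are 27 assignments, of which only 3 are homogeneous. Adding team size and role choice as further dimensions grows the space quickly, which is where a sample-efficient search is useful.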
Python implementation using Bayesian optimization libraries. The task is evaluated on a standardized benchmark. Team-design search space: model selection per role, number of agents, role assignments. Pareto-frontier search identifies non-dominated team designs: those for which no other design is at least as good on both performance and cost and strictly better on at least one.
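The non-dominated filter behind a Pareto frontier can be sketched in a few lines. This is an illustrative sketch, not the paper's implementation; the `TeamDesign` class and all candidate values are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamDesign:
    name: str
    performance: float  # higher is better
    cost: float         # lower is better


def dominates(a: TeamDesign, b: TeamDesign) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    no_worse = a.performance >= b.performance and a.cost <= b.cost
    strictly_better = a.performance > b.performance or a.cost < b.cost
    return no_worse and strictly_better


def pareto_front(designs: list[TeamDesign]) -> list[TeamDesign]:
    """Keep only designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Hypothetical evaluation results, for illustration only.
candidates = [
    TeamDesign("A", performance=0.80, cost=10.0),
    TeamDesign("B", performance=0.78, cost=4.0),
    TeamDesign("C", performance=0.70, cost=6.0),  # worse than B on both objectives
    TeamDesign("D", performance=0.60, cost=1.0),
]

front = pareto_front(candidates)
print([d.name for d in front])
```

Design C drops out because B is both cheaper and better performing; A, B, and D remain because each represents a different performance/cost tradeoff.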
The multi-objective optimization loop is automated. A human sets the objective weights (performance vs cost tradeoff) and the task specification. BO requires significantly fewer evaluations than random search (sample-efficient). Source: arXiv 2511.11788.
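One way human-set weights could turn a set of candidate designs into a single choice is a weighted score. The sketch below assumes a simple linear scalarization with cost normalized by the most expensive candidate; the source does not say how the weights are applied, and the function name and candidate values are hypothetical.

```python
def pick_by_weights(designs, w_perf, w_cost):
    """Return the (name, performance, cost) tuple with the best weighted score.

    Assumption: score = w_perf * performance - w_cost * (cost / max_cost).
    """
    max_cost = max(cost for _, _, cost in designs)

    def score(design):
        _, perf, cost = design
        return w_perf * perf - w_cost * (cost / max_cost)

    return max(designs, key=score)


# Hypothetical candidates as (name, performance, cost in arbitrary units).
designs = [("A", 0.80, 10.0), ("B", 0.78, 4.0), ("D", 0.60, 1.0)]

performance_only = pick_by_weights(designs, w_perf=1.0, w_cost=0.0)
balanced = pick_by_weights(designs, w_perf=0.7, w_cost=0.3)
cost_only = pick_by_weights(designs, w_perf=0.0, w_cost=1.0)
print(performance_only[0], balanced[0], cost_only[0])
```

Shifting weight from performance toward cost moves the selection from the most capable design to the cheapest one, with the middle setting favoring the design that gives up little performance for a large saving.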
The team's shared track record — tasks, incidents, lessons, milestones. Per-entry provenance tags are always visible.
>45% cost reduction vs random search; heterogeneous teams: up to 65.8% cost reduction vs homogeneous baseline. Note: Master's thesis, not peer-reviewed.
https://arxiv.org/abs/2511.11788
Named third-party statements from people with first-hand experience. Attestations are what separates Peer-Attested from Evidence-Linked.
No attestations yet.