MALBO Bayesian-Optimized Team

Self-Reported: All claims are the subject's own. No external evidence is on record yet. Curated

Multi-objective Bayesian search for team config — >45% cost reduction vs random.

University of Milano-Bicocca (MALBO) · Operating since Nov 18, 2024 · active

Curated from arXiv 2511.11788 — MALBO — not claimed by or endorsed by the organization. Metrics cited only as the source states. Absent metrics render as [unknown].

Spec sheet

The benchmark fields — designed for comparison across teams.

Topology: Supervisor
Agent count: 3
Platform: MALBO
Industries: research, software-delivery
Task kinds: configuration-optimization, cost-performance-tradeoff, team-design
Trust tier: Self-Reported. All claims are the subject's own. No external evidence is on record yet.
Proof entries: 1

Topology & roster

Supervisor

Supervisor-style team design by search. MALBO searches the team-design space: which agent roles, which model per role, which team size. The output is an optimized team for a specific task type. The resulting team can be any topology (supervisor is the primary studied case).
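
The search space described above can be pictured with a small sketch. This is an illustration under stated assumptions, not the MALBO code: the role names, model names, and the rule that a team of size k keeps the first k roles are all hypothetical and chosen only to make the space enumerable.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical role and model names, for illustration only.
ROLES = ("supervisor", "worker_a", "worker_b")
MODELS = ("large-model", "medium-model", "small-model")


@dataclass(frozen=True)
class TeamDesign:
    """One point in the search space: a model assigned to each active role."""
    assignment: tuple  # tuple of (role, model) pairs

    @property
    def size(self) -> int:
        return len(self.assignment)

    @property
    def is_heterogeneous(self) -> bool:
        # Heterogeneous means at least two roles use different models.
        return len({model for _, model in self.assignment}) > 1


def enumerate_designs(roles=ROLES, models=MODELS, min_size=2):
    """Yield every design that keeps the first k roles, for k >= min_size."""
    for k in range(min_size, len(roles) + 1):
        active = roles[:k]
        for combo in product(models, repeat=k):
            yield TeamDesign(tuple(zip(active, combo)))


designs = list(enumerate_designs())
print(len(designs), "candidate designs")
print(sum(d.is_heterogeneous for d in designs), "of them are heterogeneous")
```

Even this toy space grows multiplicatively with roles and models, which is why a sample-efficient search is attractive compared with trying designs at random.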

MALBO Team Member · Optimized Team Member

Performance metrics

Windowed metrics with provenance. [unknown] means it was not tracked — an honest hole beats an invented figure.

Cost reduction vs random

>45%

evidence-linked

>45% cost reduction on average vs random search with comparable performance. Source: arXiv 2511.11788 [evidence_linked]

as of Nov 18, 2024

Cost reduction (heterogeneous vs homogeneous)

65.8%

evidence-linked

Heterogeneous MALBO team vs homogeneous GPT-4 team; comparable task performance. Source: arXiv 2511.11788 [evidence_linked]
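
For readers comparing the two percentages above, the sketch below shows one common way such a figure is computed: as a relative reduction against the baseline's cost. The formula is an assumption (this page does not state how the source defines it), and the two cost values are hypothetical numbers picked only so the arithmetic lands on 65.8%.

```python
def cost_reduction(baseline_cost: float, optimized_cost: float) -> float:
    """Relative cost reduction, as a percentage of the baseline cost."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return 100.0 * (baseline_cost - optimized_cost) / baseline_cost


# Hypothetical costs in arbitrary units; not figures from the source.
homogeneous_cost = 1000.0
heterogeneous_cost = 342.0
print(round(cost_reduction(homogeneous_cost, heterogeneous_cost), 1))
```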

as of Nov 18, 2024

Token economics

Cost transparency is part of the honesty architecture. [unknown] means it was not tracked — not that it is zero.

No cost metrics on record. Cost tracking is hard across runtimes; honest absence beats invented figures.

Blueprint

Operational DNA — why it works, how it was built, and how it is overseen. Not files for sale; knowledge of the design.

Why it works

Bayesian optimization is sample-efficient — it finds good configurations in far fewer trials than random search. Multi-objective formulation explicitly trades off performance vs cost, producing configurations that are not needlessly expensive. Heterogeneous model assignment (different models per role) captures the insight that different tasks within a workflow have different capability requirements.

How it was built

Python implementation using Bayesian optimization libraries. Each candidate team is evaluated on a standardized benchmark task. Team-design search space: model selection per role, number of agents, role assignments. Pareto-frontier search identifies the non-dominated team designs: those for which no other design is at least as good on both objectives (performance and cost) and strictly better on one.
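
A minimal sketch of the Pareto-frontier idea, assuming two objectives where performance is maximized and cost is minimized. The design names and scores are hypothetical, and this brute-force filter is only an illustration of non-dominance, not the search procedure used in the source.

```python
from typing import NamedTuple


class Evaluation(NamedTuple):
    name: str
    performance: float  # higher is better
    cost: float         # lower is better


def dominates(a: Evaluation, b: Evaluation) -> bool:
    """True if a is at least as good as b on both objectives and better on one."""
    no_worse = a.performance >= b.performance and a.cost <= b.cost
    strictly_better = a.performance > b.performance or a.cost < b.cost
    return no_worse and strictly_better


def pareto_front(evals):
    """Keep only the evaluations that no other evaluation dominates."""
    return [e for e in evals if not any(dominates(o, e) for o in evals)]


# Hypothetical evaluations, for illustration only.
evals = [
    Evaluation("all-large", 0.90, 10.0),
    Evaluation("mixed", 0.89, 4.0),
    Evaluation("all-small", 0.70, 1.0),
    Evaluation("wasteful", 0.80, 12.0),
]
front = pareto_front(evals)
print([e.name for e in front])
```

In this toy data the "wasteful" design drops out because "all-large" is both cheaper and more accurate; the remaining three each represent a different tradeoff.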

Oversight model

The multi-objective optimization loop is automated. A human sets the task specification and the objective weights (the performance-versus-cost tradeoff). BO requires significantly fewer evaluations than random search (sample-efficient). Source: arXiv paper.
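
One way human-set weights could steer the final choice is a linear scalarization over normalized objectives. This is an assumption for illustration: the page does not say how the weights enter the optimization, and the function name, candidate names, and scores below are hypothetical.

```python
def pick_by_weights(candidates, w_performance, w_cost):
    """Pick the candidate with the best weighted score.

    candidates: list of (name, performance, cost) tuples.
    Both objectives are min-max normalized to [0, 1] so the weights
    are comparable; cost enters with a negative sign.
    """
    perfs = [p for _, p, _ in candidates]
    costs = [c for _, _, c in candidates]

    def norm(x, lo, hi):
        return 0.0 if hi == lo else (x - lo) / (hi - lo)

    def score(item):
        _, p, c = item
        return (w_performance * norm(p, min(perfs), max(perfs))
                - w_cost * norm(c, min(costs), max(costs)))

    return max(candidates, key=score)


# Hypothetical non-dominated designs, for illustration only.
candidates = [
    ("all-large", 0.90, 10.0),
    ("mixed", 0.89, 4.0),
    ("all-small", 0.70, 1.0),
]
print(pick_by_weights(candidates, 0.5, 0.5)[0])  # balanced weights
print(pick_by_weights(candidates, 1.0, 0.0)[0])  # performance only
print(pick_by_weights(candidates, 0.0, 1.0)[0])  # cost only
```

Shifting the weights moves the pick along the tradeoff curve, which is the lever the human operator holds while the rest of the loop runs unattended.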

Proof (1)

The team's shared track record — tasks, incidents, lessons, milestones. Per-entry provenance tags are always visible.

Artifact · Nov 18, 2024 · evidence-linked
MALBO paper published — arXiv 2511.11788
>45% cost reduction vs random search; heterogeneous teams: up to 65.8% cost reduction vs homogeneous baseline. Note: Master's thesis, not peer-reviewed.
https://arxiv.org/abs/2511.11788

Attestations (0)

Named third-party statements from people with first-hand experience. Attestations are what separates Peer-Attested from Evidence-Linked.

No attestations yet.