⛏️

Voyager (Minecraft)

Self-Reported (all claims are the subject's own; no external evidence is on record yet) · Curated

Open-ended Minecraft agent — 3.3× items, 15.3× tech tree vs prior SOTA.

NVIDIA Research (Voyager) · Operating since May 25, 2023 · active

Curated from arXiv 2305.16291 — Voyager — not claimed by or endorsed by the organization. Metrics cited only as the source states. Absent metrics render as [unknown].

Spec sheet

The benchmark fields — designed for comparison across teams.

Topology: Solo + Tools
Agent count: 1
Platform: GPT-4 API (Mineflayer/Minecraft)
Industries: gaming, research
Task kinds: open-ended-exploration, skill-acquisition, embodied-reasoning
Trust tier: Self-Reported (all claims are the subject's own; no external evidence is on record yet)
Proof entries: 1

Topology & roster

Solo + Tools

Solo-plus-tools: a single GPT-4 agent with three internal subsystems, namely a curriculum generator, a skill library, and a code execution environment. The agent's behavior emerges from the interaction of these components, not from multi-agent coordination.

⛏️

Voyager · Explorer

GPT-4

Performance metrics

Windowed metrics with provenance. [unknown] means it was not tracked — an honest hole beats an invented figure.

Unique items vs SOTA

3.3

evidence-linked

3.3× more unique items obtained vs prior SOTA (DEPS). Source: arXiv 2305.16291 [evidence_linked]

as of May 25, 2023

Tech tree speed vs SOTA

15.3

evidence-linked

Up to 15.3× faster tech tree milestone completion vs prior SOTA (DEPS). Source: arXiv 2305.16291 [evidence_linked]

as of May 25, 2023

Exploration distance vs SOTA

2.3

evidence-linked

2.3× longer distances explored vs DEPS (prior SOTA). Source: arXiv 2305.16291 [evidence_linked]

as of May 25, 2023

Token economics

Cost transparency is part of the honesty architecture. [unknown] means it was not tracked — not that it is zero.

No cost metrics on record. Cost tracking is hard across runtimes; honest absence beats invented figures.

Blueprint

Operational DNA — why it works, how it was built, and how it is overseen. Not files for sale; knowledge of the design.

Why it works

An automatic curriculum keeps the agent in a productive challenge range: not too easy, not impossible. The skill library prevents re-learning capabilities the agent has already discovered. Iterative prompting with execution feedback creates a tight edit-run-fix loop. Together, these components enable compound skill growth over long sessions.
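The edit-run-fix loop described above can be sketched roughly as follows. This is a minimal illustration, not Voyager's actual code: `refine_until_success`, `fake_propose`, and `fake_execute` are hypothetical names, and the two fake functions stand in for the GPT-4 call and the in-game execution so that the sketch runs on its own.

```python
from typing import Callable, Optional, Tuple


def refine_until_success(
    task: str,
    propose: Callable[[str, Optional[str]], str],
    execute: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 4,
) -> Tuple[Optional[str], int]:
    """Return (working_program, rounds_used), or (None, max_rounds) on failure."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        program = propose(task, feedback)  # assumed stand-in for an LLM call
        ok, feedback = execute(program)    # assumed stand-in for running the code in the game
        if ok:
            return program, round_no
    return None, max_rounds


# Toy stand-ins (hypothetical) so the loop can run without an LLM or a game server.
def fake_propose(task: str, feedback: Optional[str]) -> str:
    # The first attempt forgets a prerequisite; the retry reacts to the error text.
    if feedback and "no pickaxe" in feedback:
        return "craft_pickaxe(); mine('stone')"
    return "mine('stone')"


def fake_execute(program: str) -> Tuple[bool, str]:
    if "craft_pickaxe" in program:
        return True, "mined 1 stone"
    return False, "error: no pickaxe in inventory"


program, rounds = refine_until_success("mine stone", fake_propose, fake_execute)
print(program, rounds)
```

The key design point is that the error text from one round becomes input to the next proposal, so each retry is informed rather than blind.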

How it was built

Built on the GPT-4 API, with the Mineflayer JavaScript API for Minecraft control. Curriculum generation prompts GPT-4 with the current exploration state as context. The skill library stores executable JavaScript programs indexed by natural-language description. Iterative prompting executes the generated code, captures errors and environment feedback, and re-prompts for a correction.
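A skill library indexed by natural-language description might look something like the sketch below. It is an illustration under stated assumptions: the `SkillLibrary` class, its methods, and the stored JavaScript stubs are hypothetical, and a simple bag-of-words cosine similarity stands in for whatever retrieval method the real system uses.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def _bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SkillLibrary:
    """Hypothetical store mapping a description to a program's source text."""

    def __init__(self) -> None:
        self._skills: Dict[str, Tuple[Counter, str]] = {}

    def add(self, description: str, program: str) -> None:
        self._skills[description] = (_bag_of_words(description), program)

    def retrieve(self, query: str, top_k: int = 1) -> List[Tuple[str, str]]:
        """Return the top_k (description, program) pairs most similar to the query."""
        q = _bag_of_words(query)
        ranked = sorted(self._skills.items(), key=lambda item: _cosine(q, item[1][0]), reverse=True)
        return [(desc, prog) for desc, (_, prog) in ranked[:top_k]]


library = SkillLibrary()
# The JavaScript bodies are placeholders; the function names are made up for illustration.
library.add("craft a wooden pickaxe from planks and sticks", "async function craftWoodenPickaxe(bot) {}")
library.add("mine iron ore with a stone pickaxe", "async function mineIronOre(bot) {}")
library.add("cook raw beef in a furnace", "async function cookBeef(bot) {}")

best_description, best_program = library.retrieve("make a pickaxe out of wooden planks")[0]
print(best_description)
```

Retrieving by description rather than by exact name lets a newly proposed task reuse an earlier program even when the wording differs.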

Oversight model

No human in the loop during evaluation. The agent operates autonomously for extended exploration sessions. The automatic curriculum is generated by GPT-4 from the current state and past discoveries.
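As a rough sketch of how a curriculum request could combine current state with past discoveries, the snippet below assembles a text prompt. The function name `build_curriculum_prompt`, the state keys, and the prompt wording are all assumptions made for illustration; the source does not specify the actual prompt.

```python
from typing import Dict, List


def build_curriculum_prompt(state: Dict[str, str], discoveries: List[str]) -> str:
    """Assemble a hypothetical task-proposal prompt from state and discoveries."""
    lines = ["Propose one next task that is new but achievable.", "Current state:"]
    for key in sorted(state):  # sorted so the prompt is deterministic
        lines.append(f"- {key}: {state[key]}")
    lines.append("Past discoveries: " + (", ".join(discoveries) if discoveries else "none"))
    return "\n".join(lines)


prompt = build_curriculum_prompt(
    {"biome": "forest", "inventory": "oak_log x4"},
    ["oak_log", "crafting_table"],
)
print(prompt)
```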

Proof (1)

The team's shared track record — tasks, incidents, lessons, milestones. Per-entry provenance tags are always visible.

Artifact · May 25, 2023 · evidence-linked
Voyager paper published — arXiv 2305.16291
3.3× more unique items, 2.3× longer distances, and up to 15.3× faster tech tree milestones vs prior SOTA (DEPS). No fine-tuning required.
https://arxiv.org/abs/2305.16291

Attestations (0)

Named third-party statements from people with first-hand experience. Attestations are what separates Peer-Attested from Evidence-Linked.

No attestations yet.
