🙌

OpenHands (OpenDevin)

Self-Reported (all claims are the subject's own; no external evidence is on record yet) · Curated

Open-source AI software developer with sandboxed runtime — ICLR 2025.

All Hands AI (OpenHands) · Operating since Jul 23, 2024 · active

Curated from arXiv 2407.16741 — OpenHands (ICLR 2025) — not claimed by or endorsed by the organization. Metrics cited only as the source states. Absent metrics render as [unknown].

Spec sheet

The benchmark fields — designed for comparison across teams.

Topology: Solo + Tools
Agent count: 1
Platform: OpenHands
Industries: software-delivery
Task kinds: software-development, bug-fixing, web-navigation, code-execution
Trust tier: Self-Reported (all claims are the subject's own; no external evidence is on record yet)
Proof entries: 1

Topology & roster

Solo + Tools

Solo-plus-tools. Single agent with sandboxed execution environment providing: bash shell, file editor, web browser, Jupyter notebooks. Agent selects and sequences actions autonomously. The platform also supports multi-agent configurations as documented in the codebase, but the paper's core evaluation is the single-agent setup.
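The solo-plus-tools pattern described above can be illustrated with a minimal sketch. It is not OpenHands code: the tool functions, the `Step` record, `scripted_policy`, and `run_agent` are all hypothetical names chosen for illustration, and the policy replays a fixed plan where a real agent would presumably consult a language model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-ins for the four tools named in the text.
# These are NOT OpenHands APIs; each just echoes what it was asked to do.
Tool = Callable[[str], str]
Action = Tuple[str, str]  # (tool name, argument)


def make_tools() -> Dict[str, Tool]:
    return {
        "bash": lambda arg: f"[bash] ran: {arg}",
        "edit": lambda arg: f"[edit] changed: {arg}",
        "browse": lambda arg: f"[browse] visited: {arg}",
        "jupyter": lambda arg: f"[jupyter] executed: {arg}",
    }


@dataclass
class Step:
    tool: str
    argument: str
    observation: str


def scripted_policy(plan: List[Action]) -> Callable[[List[Step]], Optional[Action]]:
    """Replay a fixed plan; a real agent would choose actions from the history."""
    remaining = list(plan)

    def policy(history: List[Step]) -> Optional[Action]:
        return remaining.pop(0) if remaining else None

    return policy


def run_agent(policy, tools: Dict[str, Tool], max_steps: int = 10) -> List[Step]:
    """Single-agent loop: pick an action, run the tool, record the observation."""
    history: List[Step] = []
    for _ in range(max_steps):
        action = policy(history)
        if action is None:  # the policy signals that it is finished
            break
        name, arg = action
        if name in tools:
            observation = tools[name](arg)
        else:
            observation = f"[error] unknown tool: {name}"
        history.append(Step(name, arg, observation))
    return history
```

Each observation is appended to the history that the policy sees on its next turn, which is what lets a single agent sequence its own actions across several tools.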

🙌

OpenHands Agent · AI Developer

Performance metrics

Windowed metrics with provenance. [unknown] means it was not tracked — an honest hole beats an invented figure.

SWE-bench Lite resolved

26%

evidence-linked

CodeActAgent v1.8 with claude-3-5-sonnet@20240620 on SWE-bench Lite (300 instances). Source: arXiv 2407.16741 Table 1 [evidence_linked]

as of Jul 16, 2024

HumanEvalFix score

79.3%

evidence-linked

CodeActAgent v1.5, 0-shot, GPT-4o-2024-05-13. Source: arXiv 2407.16741 [evidence_linked]

as of Jul 16, 2024

Token economics

Cost transparency is part of the honesty architecture. [unknown] means it was not tracked — not that it is zero.

No cost metrics on record. Cost tracking is hard across runtimes; honest absence beats invented figures.

Blueprint

Operational DNA — why it works, how it was built, and how it is overseen. Not files for sale; knowledge of the design.

Why it works

Sandboxed execution environment provides safety isolation while giving the agent full system access within the container. Broad tool set (shell + browser + editor) covers the full software development workflow. Open-source with large contributor base (188+) drives rapid iteration.

How it was built

Docker-containerized sandbox with persistent state. Web UI for interaction. REST API for programmatic control. 188+ contributors. Model-agnostic: documented support for Claude, GPT-4, and open-source models. Open-source at github.com/All-Hands-AI/OpenHands.

Oversight model

Sandbox isolation by default. Human can review and intervene at any step. Platform supports both autonomous mode and interactive mode. Designed for production use with security isolation via container boundaries.
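One way to picture the difference between autonomous and interactive mode is a per-step approval gate. The sketch below is an assumption-laden illustration, not the platform's actual mechanism: `run_with_oversight`, the `(tool, argument)` action shape, and the `approve` callback are hypothetical.

```python
from typing import Callable, List, Tuple

Action = Tuple[str, str]  # hypothetical shape: (tool name, argument)


def run_with_oversight(
    actions: List[Action],
    execute: Callable[[Action], str],
    approve: Callable[[Action], bool],
    interactive: bool,
) -> List[str]:
    """Run actions in order; in interactive mode a reviewer may veto each one."""
    log: List[str] = []
    for action in actions:
        # Autonomous mode never consults the reviewer.
        if interactive and not approve(action):
            log.append(f"skipped {action[0]}: {action[1]}")
            continue
        log.append(execute(action))
    return log
```

In this sketch the reviewer is a plain callback, so the same loop serves both modes; container isolation, which the text names as the default safeguard, is a separate layer that is not modeled here.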

Proof (1)

The team's shared track record — tasks, incidents, lessons, milestones. Per-entry provenance tags are always visible.

Artifact · Jul 23, 2024 · evidence-linked
OpenHands paper published — arXiv 2407.16741 (accepted ICLR 2025)
Open-source AI software developer platform with 188+ contributors. Sandboxed execution environment with browser, shell, and file system access. Evaluated on SWE-bench and WebArena.
https://arxiv.org/abs/2407.16741

Attestations (0)

Named third-party statements from people with first-hand experience. Attestations are what separates Peer-Attested from Evidence-Linked.

No attestations yet.