Skip to content

OpenHands

OpenHands (formerly OpenDevin) is an MIT-licensed, open-source autonomous coding agent with a web interface and sandboxed Docker execution. Unlike terminal-based tools like Claude Code or OpenCode, OpenHands runs as a web application where you describe a task and the agent works autonomously — reading files, running commands, and browsing the web in an isolated container. It was published at ICLR 2025 and has 65k+ GitHub stars with 440+ contributors.

OpenHands is a web-based autonomous coding agent that can:

  • Read, write, and execute code in an isolated Docker sandbox
  • Browse the web to research documentation and APIs
  • Run shell commands (bash, git, package managers)
  • Connect directly to GitHub/GitLab issues and create PRs
  • Work autonomously for extended periods without human intervention
  • Use any LLM — cloud APIs (Anthropic, OpenAI, Google) or self-hosted (vLLM, Ollama, LiteLLM)
Browser (your laptop)
OpenHands Server (FastAPI + React, port 3000)
Sandbox Runtime (isolated Docker container per session)
LLM API (cloud or self-hosted vLLM)

Each coding session gets its own Docker container. The agent can’t access your host filesystem directly — it operates in the sandbox. This makes it safer than terminal-based agents for untrusted or experimental tasks.

FeatureClaude CodeMistral VibeOpenCodeOpenHands
InterfaceTerminal CLITerminal CLITerminal TUIWeb UI
ExecutionYour filesystemYour filesystemYour filesystemSandboxed Docker
AutonomousInteractiveInteractiveInteractiveYes (batch mode)
LicenseProprietaryApache 2.0MITMIT
Model supportClaude onlyAny OpenAI-compatAny (ai-sdk)Any
GitHub integrationVia gh CLIManualManualNative (issues → PRs)
MCP supportYesYesYesYes
Self-hostedNoYesYesYes
SWE-bench~80.8% (Opus)72.2% (Devstral 2)Model-dependent46.8-61.7%
Best forComplex tasksBulk/cheap workLSP-aware editingAutonomous batch work

Verdict: OpenHands is the right choice when you want a web interface, sandbox isolation, or autonomous batch processing on GitHub issues. For interactive coding, use Claude Code or OpenCode. For bulk grunt work with unlimited tokens, use Mistral Vibe.

  • Linux server with Docker and Docker Compose
  • NVIDIA GPU(s) with sufficient VRAM if self-hosting a model (40GB+ for Devstral Small)
  • Or: an API key for a cloud LLM provider
services:
openhands-app:
image: docker.all-hands.dev/all-hands-ai/openhands:1.4
pull_policy: always
container_name: openhands-app
environment:
- SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/openhands:1.4-nikolaik
- LOG_ALL_EVENTS=true
ports:
- "3000:3000"
volumes:
- /var/run/docker.sock:/var/run/docker.sock
- ~/.openhands-state:/.openhands-state
extra_hosts:
- "host.docker.internal:host-gateway"
stdin_open: true
tty: true
restart: unless-stopped
Terminal window
docker compose up -d

Wait 1-2 minutes, then access at http://localhost:3000.

Docker Socket Required

OpenHands spawns child Docker containers for each sandbox session. It needs /var/run/docker.sock mounted. Only run on trusted infrastructure.

Devstral was co-developed by Mistral AI and All-Hands-AI specifically for the OpenHands agent harness. It outperforms models 10-28x its size on SWE-bench when used with OpenHands.

ModelParamsLicenseVRAMSWE-bench (OpenHands)
Devstral Small 250724BApache 2.0~40GB46.8%
Devstral 2123BMistral Research~180GB (FP8)Higher (est.)
Terminal window
ollama pull devstral

Then in OpenHands Settings > LLM:

  • Custom Model: ollama/devstral
  • Base URL: http://host.docker.internal:11434
  • API Key: (leave blank)
  • VRAM: 48GB combined (24GB each)
  • NVLink: Supported (RTX 4090 does NOT support NVLink)
  • Power: 1000W+ PSU recommended (2x 350W TDP)
  • Best model: Devstral Small 24B (quantized or 4-bit)
  • Great value for a large VRAM pool
  • VRAM: 80-192GB
  • Best model: Devstral Small (full precision) or Devstral 2 123B (FP8)
  • For team-shared inference servers
services:
vllm-devstral:
container_name: vllm-devstral
image: vllm/vllm-openai:v0.19.0
restart: unless-stopped
volumes:
- ./models:/root/.cache/huggingface
environment:
- HUGGING_FACE_HUB_TOKEN=<your-hf-token>
ports:
- "8000:8000"
ipc: host
command: >
mistralai/Devstral-Small-2507
--served-model-name devstral
--tool-call-parser mistral
--enable-auto-tool-choice
--tensor-parallel-size 2
--gpu-memory-utilization 0.90
--max-model-len 64000
runtime: nvidia
deploy:
resources:
reservations:
devices:
- driver: nvidia
device_ids: ['0', '1']
capabilities:
- gpu

Adjust device_ids and --tensor-parallel-size for your GPU count. Single GPU: use ['0'] and --tensor-parallel-size 1.

Terminal window
docker compose up -d
docker logs -f vllm-devstral # wait for "Uvicorn running"

In the web UI, go to Settings > LLM > Advanced:

FieldValue
Custom Modelopenai/devstral
Base URLhttp://host.docker.internal:8000/v1
API Key(any value — vLLM ignores it by default)
Enable memory condensationtrue
  1. Settings > Integrations — connect to GitHub or GitLab for direct issue-to-PR workflows
  2. Settings > Applications — set git username and email for commits

OpenHands supports MCP servers natively via Settings > MCP Settings in the web UI. Both API-key and OAuth-based MCP servers work, including FastMCP servers like Notion MCP.

The agent produces a plan and waits for your approval before executing. Enable via the Planning Mode toggle in the chat interface. This prevents the agent from charging off in the wrong direction on ambiguous tasks.

SystemSWE-bench VerifiedNotes
Claude Code (Opus 4.6)80.8%Proprietary, highest score
Mistral Vibe (Devstral 2)72.2%Self-hosted, 123B model
OpenHands + inference scaling60.6%Multiple solutions, pick best
OpenHands + fine-tuned 235B61.7%Specialist model
Devstral Small 24B + OpenHands46.8%Runs on 1x RTX 4090, Apache 2.0
OpenHands + Claude 3.7 Sonnet~43-53%Standard single-pass

Devstral Small’s 46.8% is remarkable for a 24B model — it beats GPT-4.1-mini and DeepSeek-V3-0324 (671B) on this benchmark because it was specifically trained against the OpenHands harness.

  • Use Devstral — trained specifically for OpenHands, punches far above its weight
  • Enable memory condensation for local models — prevents context overflow
  • Planning Mode prevents runaway autonomous behavior
  • Tailscale works well for remote access to your OpenHands instance
  • GitHub/GitLab integration enables fire-and-forget issue resolution
IssueWorkaroundNotes
Agent loops on ambiguous tasksProvide clearer instructions; try Planning ModeInherent to autonomous agents
High token consumptionUse memory condensation; use local modelsCost issue with cloud APIs
VLLMException - No module named 'vllm'Use openai/ prefix, not native vLLM pathConfig must use OpenAI-compat URL
Ollama connection failsUse host.docker.internal, not localhostDocker networking
native_tool_calling errorsSet to false in Advanced settings for vLLMNot all models support native tool calls
  1. OpenHands for autonomous batch work (GitHub issues, backend tasks)
  2. Claude Code or OpenCode for interactive refinement
  3. Mistral Vibe for bulk grunt work with unlimited tokens