OpenHands

OpenHands (formerly OpenDevin) is an MIT-licensed, open-source autonomous coding agent with a web interface and sandboxed Docker execution. Unlike terminal-based tools like Claude Code or OpenCode, OpenHands runs as a web application where you describe a task and the agent works autonomously — reading files, running commands, and browsing the web in an isolated container. It was published at ICLR 2025 and has 65k+ GitHub stars with 440+ contributors.

What is OpenHands?

OpenHands is a web-based autonomous coding agent that can:

Read, write, and execute code in an isolated Docker sandbox
Browse the web to research documentation and APIs
Run shell commands (bash, git, package managers)
Connect directly to GitHub/GitLab issues and create PRs
Work autonomously for extended periods without human intervention
Use any LLM — cloud APIs (Anthropic, OpenAI, Google) or self-hosted (vLLM, Ollama, LiteLLM)

Architecture

Browser (your laptop)
  ↓
OpenHands Server (FastAPI + React, port 3000)
  ↓
Sandbox Runtime (isolated Docker container per session)
  ↓
LLM API (cloud or self-hosted vLLM)

Each coding session gets its own Docker container. The agent can’t access your host filesystem directly — it operates in the sandbox. This makes it safer than terminal-based agents for untrusted or experimental tasks.

Why OpenHands? (vs. Alternatives)

Feature	Claude Code	Mistral Vibe	OpenCode	OpenHands
Interface	Terminal CLI	Terminal CLI	Terminal TUI	Web UI
Execution	Your filesystem	Your filesystem	Your filesystem	Sandboxed Docker
Autonomous	Interactive	Interactive	Interactive	Yes (batch mode)
License	Proprietary	Apache 2.0	MIT	MIT
Model support	Claude only	Any OpenAI-compat	Any (ai-sdk)	Any
GitHub integration	Via `gh` CLI	Manual	Manual	Native (issues → PRs)
MCP support	Yes	Yes	Yes	Yes
Self-hosted	No	Yes	Yes	Yes
SWE-bench	~80.8% (Opus)	72.2% (Devstral 2)	Model-dependent	46.8-61.7%
Best for	Complex tasks	Bulk/cheap work	LSP-aware editing	Autonomous batch work

Verdict: OpenHands is the right choice when you want a web interface, sandbox isolation, or autonomous batch processing on GitHub issues. For interactive coding, use Claude Code or OpenCode. For bulk grunt work with unlimited tokens, use Mistral Vibe.

Installation

Prerequisites

Linux server with Docker and Docker Compose
NVIDIA GPU(s) with sufficient VRAM if self-hosting a model (40GB+ for Devstral Small)
Or: an API key for a cloud LLM provider

Docker Compose

services:
  openhands-app:
    image: docker.all-hands.dev/all-hands-ai/openhands:1.4
    pull_policy: always
    container_name: openhands-app
    environment:
      - SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/openhands:1.4-nikolaik
      - LOG_ALL_EVENTS=true
    ports:
      - "3000:3000"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ~/.openhands-state:/.openhands-state
    extra_hosts:
      - "host.docker.internal:host-gateway"
    stdin_open: true
    tty: true
    restart: unless-stopped

docker compose up -d

Wait 1-2 minutes, then access at http://localhost:3000.

Docker Socket Required

OpenHands spawns child Docker containers for each sandbox session. It needs /var/run/docker.sock mounted. Only run on trusted infrastructure.

Devstral: Recommended Local Model

Devstral was co-developed by Mistral AI and All-Hands-AI specifically for the OpenHands agent harness. It outperforms models 10-28x its size on SWE-bench when used with OpenHands.

Model	Params	License	VRAM	SWE-bench (OpenHands)
Devstral Small 2507	24B	Apache 2.0	~40GB	46.8%
Devstral 2	123B	Mistral Research	~180GB (FP8)	Higher (est.)

Quick Start with Ollama

ollama pull devstral

Then in OpenHands Settings > LLM:

Custom Model: ollama/devstral
Base URL: http://host.docker.internal:11434
API Key: (leave blank)

Self-Hosted vLLM Setup

Hardware Considerations

Dual RTX 3090 (Budget, ~$800-1000 Used)

VRAM: 48GB combined (24GB each)
NVLink: Supported (RTX 4090 does NOT support NVLink)
Power: 1000W+ PSU recommended (2x 350W TDP)
Best model: Devstral Small 24B (quantized or 4-bit)
Great value for a large VRAM pool

Single B200 / H100 / A100 (Production)

VRAM: 80-192GB
Best model: Devstral Small (full precision) or Devstral 2 123B (FP8)
For team-shared inference servers

vLLM Docker Compose

services:
  vllm-devstral:
    container_name: vllm-devstral
    image: vllm/vllm-openai:v0.19.0
    restart: unless-stopped
    volumes:
      - ./models:/root/.cache/huggingface
    environment:
      - HUGGING_FACE_HUB_TOKEN=<your-hf-token>
    ports:
      - "8000:8000"
    ipc: host
    command: >
      mistralai/Devstral-Small-2507
      --served-model-name devstral
      --tool-call-parser mistral
      --enable-auto-tool-choice
      --tensor-parallel-size 2
      --gpu-memory-utilization 0.90
      --max-model-len 64000
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0', '1']
              capabilities:
                - gpu

Adjust device_ids and --tensor-parallel-size for your GPU count. Single GPU: use ['0'] and --tensor-parallel-size 1.

docker compose up -d
docker logs -f vllm-devstral  # wait for "Uvicorn running"

Configure OpenHands to Use vLLM

In the web UI, go to Settings > LLM > Advanced:

Field	Value
Custom Model	`openai/devstral`
Base URL	`http://host.docker.internal:8000/v1`
API Key	(any value — vLLM ignores it by default)
Enable memory condensation	`true`

Git Integration

Settings > Integrations — connect to GitHub or GitLab for direct issue-to-PR workflows
Settings > Applications — set git username and email for commits

MCP Support

OpenHands supports MCP servers natively via Settings > MCP Settings in the web UI. Both API-key and OAuth-based MCP servers work, including FastMCP servers like Notion MCP.

Planning Mode (Beta)

The agent produces a plan and waits for your approval before executing. Enable via the Planning Mode toggle in the chat interface. This prevents the agent from charging off in the wrong direction on ambiguous tasks.

SWE-bench Scores

System	SWE-bench Verified	Notes
Claude Code (Opus 4.6)	80.8%	Proprietary, highest score
Mistral Vibe (Devstral 2)	72.2%	Self-hosted, 123B model
OpenHands + inference scaling	60.6%	Multiple solutions, pick best
OpenHands + fine-tuned 235B	61.7%	Specialist model
Devstral Small 24B + OpenHands	46.8%	Runs on 1x RTX 4090, Apache 2.0
OpenHands + Claude 3.7 Sonnet	~43-53%	Standard single-pass

Devstral Small’s 46.8% is remarkable for a 24B model — it beats GPT-4.1-mini and DeepSeek-V3-0324 (671B) on this benchmark because it was specifically trained against the OpenHands harness.

Tips & Known Issues

Tips

Use Devstral — trained specifically for OpenHands, punches far above its weight
Enable memory condensation for local models — prevents context overflow
Planning Mode prevents runaway autonomous behavior
Tailscale works well for remote access to your OpenHands instance
GitHub/GitLab integration enables fire-and-forget issue resolution

Known Issues

Issue	Workaround	Notes
Agent loops on ambiguous tasks	Provide clearer instructions; try Planning Mode	Inherent to autonomous agents
High token consumption	Use memory condensation; use local models	Cost issue with cloud APIs
`VLLMException - No module named 'vllm'`	Use `openai/` prefix, not native vLLM path	Config must use OpenAI-compat URL
Ollama connection fails	Use `host.docker.internal`, not `localhost`	Docker networking
`native_tool_calling` errors	Set to `false` in Advanced settings for vLLM	Not all models support native tool calls

Recommended Workflow

OpenHands for autonomous batch work (GitHub issues, backend tasks)
Claude Code or OpenCode for interactive refinement
Mistral Vibe for bulk grunt work with unlimited tokens

Claude Code - Proprietary CLI agent (highest benchmark scores)
Mistral Vibe - Open-source CLI with self-hosted Devstral and orchestration patterns
OpenCode - Open-source TUI with LSP integration
Gemini Code - Google’s Gemini CLI
AI Agent Pricing - Cost comparison across agents

External Links

OpenHands Documentation
OpenHands GitHub - Source code (MIT)
Devstral Announcement - Mistral x All-Hands-AI collaboration
SWE-bench Live Leaderboard - Current benchmark standings
Vibe Coding Repository - Community rules and skills

OpenHands

OpenHands

What is OpenHands?

Architecture

Why OpenHands? (vs. Alternatives)

Installation

Prerequisites

Docker Compose

Devstral: Recommended Local Model

Quick Start with Ollama

Self-Hosted vLLM Setup

Hardware Considerations

Dual RTX 3090 (Budget, ~$800-1000 Used)

Single B200 / H100 / A100 (Production)

vLLM Docker Compose

Configure OpenHands to Use vLLM

Git Integration

MCP Support

Planning Mode (Beta)

SWE-bench Scores

Tips & Known Issues

Tips

Known Issues

Recommended Workflow

Related Resources

External Links