
Mistral Vibe

Mistral Vibe is Mistral AI’s open-source (Apache 2.0) command-line interface for AI-assisted software development. Like Claude Code, it provides agentic coding capabilities — reading files, executing commands, and making changes to your codebase directly from the terminal. Unlike Claude Code, Vibe can be pointed at any OpenAI-compatible backend, including self-hosted models via vLLM, LiteLLM, or Open-WebUI.

Mistral Vibe is a terminal-based AI coding assistant that can:

  • Read and understand your entire codebase
  • Execute shell commands via a stateful bash session
  • Create, edit, and delete files with search-and-replace precision
  • Run grep (ripgrep) for content search across your project
  • Delegate tasks to subagents for parallel work
  • Connect to MCP servers for extended capabilities (git, filesystem, web fetch)
  • Use custom skills and agents for specialized workflows

Vibe ships with these tools out of the box — no plugins required:

| Tool | Description | Claude Code Equivalent |
| --- | --- | --- |
| read_file | Read file contents | Read |
| write_file | Create or overwrite files | Write |
| search_replace | Diff-patch style editing | Edit |
| grep | ripgrep content search | Grep |
| bash | Stateful shell session | Bash |
| task | Spawn subagent for parallel work | Agent |
| ask_user_question | Prompt for user input | AskUserQuestion |
| webfetch | Fetch URL content | WebFetch |
| websearch | Search the web | WebSearch |

| Feature | Claude Code | Mistral Vibe |
| --- | --- | --- |
| License | Proprietary | Apache 2.0 (open source) |
| Model | Claude Opus/Sonnet (cloud only) | Any OpenAI-compatible endpoint |
| Self-hosted | No | Yes (vLLM, LiteLLM, Ollama, etc.) |
| Cost | $20-200/month subscription | Free (self-host) or pay-per-token |
| Context | 1M tokens | Depends on model (128K+ with Devstral) |
| Plugins | Marketplace + community | Skills + MCP servers |
| Subagents | Yes (Agent tool) | Yes (task tool + TOML agents) |
| Project rules | CLAUDE.md (hierarchical) | AGENTS.md (project root only) |
| SWE-bench | ~79.6% (Sonnet 4.6) | 72.2% (Devstral 2 123B) |

Verdict: Claude Code is more polished and has higher benchmark scores. Vibe wins on openness, self-hosting, and cost control. If your organization can’t send code to Anthropic’s cloud, Vibe with a self-hosted model is the answer. If you can, consider using both — Claude Code for complex tasks, Vibe for routine work on your own hardware.

| Feature | Gemini CLI | Mistral Vibe |
| --- | --- | --- |
| Context | 2M+ tokens | Model-dependent (128K-256K typical) |
| Self-hosted | No (Google cloud only) | Yes |
| Subagents | Yes (.gemini/agents/) | Yes (.vibe/agents/) |
| MCP support | Limited | Full (stdio + HTTP transports) |
| License | Proprietary | Apache 2.0 |

For more on Gemini CLI, see the Gemini Code guide.

vs. Le Chat (chat.mistral.ai — $15/month web UI)


chat.mistral.ai (“Le Chat”) is Mistral’s hosted web product — the same role claude.ai plays for Anthropic or chatgpt.com plays for OpenAI. Le Chat Pro is $14.99/month (commonly rounded to “$15/mo”), notably cheaper than Claude Pro or ChatGPT Plus at $20/mo. Pro gives you:

  • Unlimited chat with Mistral Large, Codestral, Pixtral, and other flagship Mistral models
  • Document upload and analysis, web search, image generation (Flux), Canvas (in-browser code editor)
  • Custom agents and a project workspace
  • Voice mode in supported regions

|  | Le Chat Pro (web, $15/mo) | Mistral Vibe (CLI) |
| --- | --- | --- |
| Cost | $14.99/mo flat | Free (self-host) or pay-per-token (Mistral API) |
| Interface | Browser | Terminal + your editor |
| Reads your repo | Upload files manually | Yes (full filesystem access) |
| Runs shell commands | No (Canvas sandbox only) | Yes (real shell on your machine) |
| Edits files in place | No (copy/paste out) | Yes |
| MCP / subagents / skills | No | Yes |
| Best for | Q&A, brainstorming, one-off snippets | Multi-file refactors, agentic edits, automation |

The two complement each other rather than compete. Pay $15/mo for Le Chat if you want a polished web UI for chat and brainstorming, and keep the Vibe CLI for the actual coding work where you need the model to act on your repo. Many people run both.

curl -LsSf https://mistral.ai/vibe/install.sh | bash

Or via uv (the fast Python package manager):

uv tool install mistral-vibe

Or via pip:

pip install mistral-vibe

Prerequisite: Python 3.12+. Check with python3 --version.

After installation, verify:

vibe --version

PATH Setup

If you get command not found: vibe, add ~/.local/bin to your PATH:

echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc

Vibe’s configuration lives in TOML files:

  • Global: ~/.vibe/config.toml — applies to all projects
  • Per-project: .vibe/config.toml in the project root — overrides global
  • API keys: ~/.vibe/.env — loaded automatically

Vibe works against any OpenAI-compatible chat-completions endpoint. The two paths most users pick:

|  | Option A: Official Mistral API | Option B: IrregularChat self-hosted |
| --- | --- | --- |
| Endpoint | https://api.mistral.ai/v1 | Community Open-WebUI instance |
| Auth | Mistral API key from console.mistral.ai | Open-WebUI API key (group membership required) |
| Cost | Pay-per-token (La Plateforme billing) | Free for IrregularChat members |
| Models | All Mistral models (medium, large, codestral, etc.) | Whatever the gateway exposes (currently Mistral Medium 3.5) |
| Network | Public internet → Mistral | Tailscale / public VPN to community GPU host |
| Best for | Anyone (works immediately with a credit card) | Members who want zero marginal cost |

You can keep both configured side-by-side and switch via the active_model setting or --agent.

Option A is for users without IrregularChat backend access, or anyone who wants pay-as-you-go access directly from Mistral. Pricing lives at mistral.ai/pricing; Mistral Medium is the recommended balance of cost and capability for Vibe’s agentic loop.

  1. Get a Mistral API key:

    • Sign up at console.mistral.ai
    • Workspace → API Keys → Create new key
    • The key has no fixed prefix (it is an opaque token)
  2. Create ~/.vibe/.env:

    MISTRAL_API_KEY=<your-mistral-key>
  3. Create ~/.vibe/config.toml:

active_model = "mistral-medium-3.5"
enable_telemetry = false
auto_compact_threshold = 200000
api_timeout = 720.0
[[providers]]
name = "mistral"
api_base = "https://api.mistral.ai/v1"
api_key_env_var = "MISTRAL_API_KEY"
api_style = "openai"
backend = "mistral"
[[models]]
name = "mistral-medium-latest"
provider = "mistral"
alias = "mistral-medium-3.5"
temperature = 0.2
auto_compact_threshold = 200000
thinking = "off"
# Pricing as of 2026-05 — verify at mistral.ai/pricing before relying on these for budget tracking
input_price = 0.4 # $/MTok
output_price = 2.0 # $/MTok
[[models]]
name = "codestral-latest"
provider = "mistral"
alias = "codestral"
temperature = 0.2
input_price = 0.2
output_price = 0.6
  4. Smoke test:
    vibe -p "reply with exactly: pong" --max-turns 1 --output text
    # → pong

Option B — IrregularChat self-hosted backend


Our community runs Mistral Medium 3.5 (128B) on dedicated GPU infrastructure via Open-WebUI. (We previously ran Devstral 2 123B — the alias was switched in May 2026; see the changelog at the bottom of this section.) To connect:

  1. Get an API key:

    • Log into Open-WebUI (ask your admin for the URL)
    • You must be added to the api access group by an admin
    • Go to Settings > Account > Show API Keys > Generate new API key (not a JWT)
    • Copy the key — it starts with sk-
  2. Create ~/.vibe/.env:

    DF_API_KEY=<your-api-key>
  3. Create ~/.vibe/config.toml:

active_model = "mistral"
enable_telemetry = false
# Context & UI
auto_compact_threshold = 200000
context_warnings = true
api_timeout = 720.0
# Project context injection
include_commit_signature = true
include_project_context = true
[project_context]
default_commit_count = 3
timeout_seconds = 2.0
# ── Provider ──────────────────────────────────
[[providers]]
name = "df"
api_base = "https://your-openwebui-instance.example.com/api"
api_key_env_var = "DF_API_KEY"
api_style = "openai"
backend = "generic"
# ── Model ─────────────────────────────────────
[[models]]
name = "mistral-medium" # must match a model_name in the LiteLLM gateway
provider = "df"
alias = "mistral"
temperature = 0.2
auto_compact_threshold = 200000
thinking = "off"
input_price = 0.0
output_price = 0.0
# Caution: don't use `devstral-123b`.
# The old `devstral-123b` alias was removed from the gateway in May 2026.
# If your config still says `name = "devstral-123b"` you will get:
# `400 Bad Request ... InternalServerError ... Connection error`
# Use `name = "mistral-medium"` as shown above.
# ── Tool Permissions ──────────────────────────
[tools.bash]
permission = "ask"
default_timeout = 120
max_output_bytes = 8000
allowlist = [
"git", "ls", "cat", "echo", "pwd", "which", "python", "python3",
"pip", "uv", "docker", "docker compose", "curl", "jq", "rg", "grep",
"npm", "npx", "node", "make", "cargo", "go",
]
denylist = ["rm -rf /", "dd", "mkfs"]
sensitive_patterns = ["sudo"]
[tools.read_file]
permission = "always"
[tools.grep]
permission = "always"
[tools.search_replace]
permission = "ask"
[tools.write_file]
permission = "ask"
max_write_bytes = 64000
create_parent_dirs = true
sensitive_patterns = ["**/.env", "**/.env.*"]
[tools.webfetch]
permission = "ask"
default_timeout = 30
max_content_bytes = 60000
[tools.task]
permission = "always"
# ── Session Logging ───────────────────────────
[session_logging]
enabled = true
save_dir = "~/.vibe/logs"
session_prefix = "session"

Every config key can also be set via environment variable with the VIBE_ prefix:

| Config Key | Env Variable | Default | Description |
| --- | --- | --- | --- |
| active_model | VIBE_ACTIVE_MODEL |  | Model alias to use |
| auto_compact_threshold | VIBE_AUTO_COMPACT_THRESHOLD | 200000 | Token count before auto-compaction |
| api_timeout | VIBE_API_TIMEOUT | 720.0 | HTTP timeout in seconds |
| context_warnings | VIBE_CONTEXT_WARNINGS | false | Warn when approaching context limit |
| vim_keybindings | VIBE_VIM_KEYBINDINGS | false | Enable vim bindings in TUI |
| autocopy_to_clipboard |  | false | Copy last response to clipboard |
| enable_auto_update |  | true | Auto-update Vibe |

| Permission | Behavior |
| --- | --- |
| "always" | Tool runs without confirmation |
| "ask" | Prompts for approval each time |
| "never" | Tool is disabled |

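Best practice: Set read_file and grep to "always", since both are read-only. The example config also sets task to "always", although spawning a subagent is not itself a read-only operation. Keep bash, write_file, and search_replace at "ask" to prevent unintended changes.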
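The VIBE_ naming convention from the environment-variable table above can be sketched in a few lines of Python. Two things here are assumptions rather than documented behavior: that an environment variable takes precedence over the file value, and that the string is coerced to the type of the file value. The function names are hypothetical.

```python
import os


def env_name(config_key: str) -> str:
    """Map a config key to its variable name, e.g. api_timeout -> VIBE_API_TIMEOUT."""
    return "VIBE_" + config_key.upper()


def resolve(config_key: str, file_value, environ=os.environ):
    """Return the environment override if present, else the config.toml value."""
    raw = environ.get(env_name(config_key))
    if raw is None:
        return file_value
    if isinstance(file_value, bool):          # check bool first: bool is a subclass of int
        return raw.strip().lower() in {"1", "true", "yes"}
    if isinstance(file_value, (int, float)):
        return type(file_value)(raw)          # keep the numeric type of the file value
    return raw
```

For example, `resolve("api_timeout", 720.0, {"VIBE_API_TIMEOUT": "60"})` returns `60.0` under these assumptions.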

MCP (Model Context Protocol) servers extend Vibe with additional tools. They run as local subprocesses and communicate over stdio or HTTP.

# Node.js ≥18 (for filesystem MCP)
node --version
# uv (for Python MCP servers)
which uvx || curl -LsSf https://astral.sh/uv/install.sh | sh

Add these to your config.toml:

# ── MCP Servers ───────────────────────────────
# Filesystem — directory tree, glob search, move, metadata
[[mcp_servers]]
name = "fs"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/your/projects"]
startup_timeout_sec = 15
tool_timeout_sec = 60
sampling_enabled = false
# Git — status, diff, log, branch, commit, checkout
[[mcp_servers]]
name = "git"
transport = "stdio"
command = "uvx"
args = ["mcp-server-git"]
startup_timeout_sec = 15
tool_timeout_sec = 60
sampling_enabled = false
# Web fetch — URL to Markdown
[[mcp_servers]]
name = "web"
transport = "stdio"
command = "uvx"
args = ["mcp-server-fetch"]
startup_timeout_sec = 15
tool_timeout_sec = 30
sampling_enabled = false

| Vibe Need | Built-in Tool | MCP Server | MCP Tools Added |
| --- | --- | --- | --- |
| Read files | read_file | fs (richer) | read_text_file, read_multiple_files, read_media_file |
| Write files | write_file | fs | write_file, edit_file |
| Find files by name |  | fs | search_files, directory_tree |
| Search file content | grep (ripgrep) | Already covered |  |
| Shell commands | bash (stateful) | Already covered |  |
| Git operations |  | git | git_status, git_diff, git_log, git_commit, git_add, git_branch, etc. |
| Fetch URLs | webfetch | web | fetch (HTML → Markdown) |

When sampling_enabled is true (the default), an MCP server can call back into the LLM during tool execution. Set it to false for simple servers (filesystem, git, fetch) to save tokens and reduce latency.

Skills are Markdown files that provide instructions, rules, or workflows. They live at:

  • Project-local: .vibe/skills/<name>/SKILL.md (committed to repo)
  • User-global: ~/.vibe/skills/<name>/SKILL.md

| Type | Frontmatter | Behavior |
| --- | --- | --- |
| Slash command | user-invocable: true | Triggered by typing /skill-name |
| Ambient context | user-invocable: false | Loaded automatically as background context |

Create ~/.vibe/skills/code-review/SKILL.md:

---
name: code-review
description: Structured code review on current diff
allowed-tools: read_file bash grep
user-invocable: true
---
# Code Review
Review the current git diff for:
1. Security vulnerabilities (injection, secrets, auth bypass)
2. Correctness (logic errors, edge cases, error handling)
3. Operations (config changes shipped? rollback plan?)
4. Style (conventional commits, no drive-by changes)
Process:
1. Run `git diff --staged` or `git diff HEAD~1`
2. Read each changed file in full context
3. Report by severity: CRITICAL / WARNING / NOTE

Invoke it with /code-review in a Vibe session.

Create .vibe/skills/safety-rules/SKILL.md in your project root:

---
name: safety-rules
description: Core safety rules for infrastructure operations
allowed-tools: read_file bash grep
user-invocable: false
---
# Safety Rules
- rsync --delete: ALWAYS --dry-run first
- .env files: backup before replacing
- Never force push to main without approval
- Verify database schema before writing queries
- Fail fast on missing env vars

This loads automatically whenever Vibe runs in that project directory.
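The difference between the two skill types comes down to one frontmatter field. As a rough illustration, the sketch below parses the simple `key: value` frontmatter used in the examples above and classifies a skill. It is a deliberately tiny parser rather than a full YAML reader, and treating a missing `user-invocable` field as ambient is an assumption.

```python
def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse simple `key: value` pairs between the leading '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def skill_kind(text: str) -> str:
    """Classify a skill as a slash command or ambient context."""
    meta = parse_frontmatter(text)
    # Assumption: skills without the field behave as ambient context.
    invocable = meta.get("user-invocable", "false").lower() == "true"
    return "slash command" if invocable else "ambient context"
```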

Agents are TOML files that override Vibe’s config for specific use cases. They live at ~/.vibe/agents/<name>.toml.

# ~/.vibe/agents/infra.toml
display_name = "Infrastructure"
description = "Docker, SSH, and server management with careful permissions"
safety = "destructive"
agent_type = "agent"
auto_approve = false
enabled_tools = ["read_file", "grep", "bash", "write_file", "search_replace", "task"]

# ~/.vibe/agents/readonly.toml
display_name = "Read Only"
description = "Safe exploration: read files, search, run safe commands"
safety = "safe"
agent_type = "agent"
auto_approve = true
enabled_tools = ["read_file", "grep", "bash"]

Use with: vibe --agent infra or vibe --agent readonly.

Vibe ships with several agents:

| Agent | Mode | Description |
| --- | --- | --- |
| default | Standard | Normal interactive mode |
| plan | Plan-first | Requires plan approval before execution |
| accept-edits | Edit-focused | Auto-approves file edits |
| auto-approve | Autonomous | Approves all tool calls |
| explore | Subagent | Read-only exploration subagent |
| lean | Installable | Uses Leanstral model with thinking = "high" |

AGENTS.md — Your Project’s AI Rules File


AGENTS.md is Vibe’s equivalent of Claude Code’s CLAUDE.md. Place it at the root of your project and Vibe reads it automatically as context.

# Project Name
description: Brief description of the project
## Safety Rules
- rsync --delete: ALWAYS --dry-run first
- .env files: backup before replacing/removing
- Never force push to main without approval
- Verify database schema before writing queries
## Stack
- Runtime: [your runtime]
- Database: [your database]
- Frontend: [your frontend]
## Conventions
- Conventional commits: type(scope): description
- Timezone: America/New_York (never assume UTC)
- Fail fast on missing env vars
## Common Commands
npm run dev # Local development
./deploy.sh # Production deploy (always --dry-run first)

| Feature | CLAUDE.md (Claude Code) | AGENTS.md (Vibe) |
| --- | --- | --- |
| Location | ~/.claude/CLAUDE.md (global) + project root | Project root only |
| Hierarchical | Yes (parent dirs cascade) | No (root only) |
| Format | Freeform Markdown | YAML-flavored Markdown |
| Domain rules | ~/.claude/rules/*.md with glob matching | .vibe/skills/ (ambient, no globs) |
| Per-file scoping | Glob patterns in frontmatter | Not supported |
| Supplement |  | .vibe/skills/ with user-invocable: false |

A production multi-model setup uses a gateway pattern:

Vibe CLI / Open-WebUI (browser)
LiteLLM Gateway (routes by model name)
├── mistral-medium → vLLM (GPU 6-7, FP8, TP=2) ← IrregularChat default coding model
├── irregularchat → vLLM (GPU 0, Gemma 4 31B)
└── other-model → vLLM (GPU N)

Or for direct access (simpler, recommended for tool calling with Vibe CLI):

Vibe (your machine) → vLLM (direct) → GPU(s)

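For Devstral models, tool calling only works when vLLM is started with --tool-call-parser mistral and --enable-auto-tool-choice; the tensor-parallel and quantization flags are what make the 123B model fit on two GPUs: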

vllm serve mistralai/Devstral-2-123B-Instruct-2512 \
--tool-call-parser mistral \
--enable-auto-tool-choice \
--tensor-parallel-size 2 \
--quantization fp8 \
--port 8080

| Flag | Why Required |
| --- | --- |
| --tool-call-parser mistral | Without this, vLLM rejects tool call schemas with Pydantic validation errors |
| --enable-auto-tool-choice | Lets the model decide when to use tools |
| --tensor-parallel-size N | Split model across N GPUs (123B needs 2+ GPUs) |
| --quantization fp8 | Fits 123B on 2x GPUs with ~178GB VRAM |

| Model | Size | License | Min Hardware | SWE-bench | vLLM Image |
| --- | --- | --- | --- | --- | --- |
| Devstral 2 | 123B | Mistral Research (revenue cap) | 2x H100/A100/B200 | 72.2% | vllm/vllm-openai:v0.19.0 |
| Devstral Small 2 | 24B | Apache 2.0 | 1x RTX 4090 (24GB) | ~55% | vllm/vllm-openai:v0.19.0 |
| Gemma 4 | 12B-31B | Apache 2.0 | 1x RTX 4090 (24GB-31B) |  | Custom image required (see below) |
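As a rough check on why FP8 and a tensor-parallel size of 2 go together for the 123B model, the arithmetic can be worked through directly. This sketch counts weights only, using one byte per parameter for FP8 and two for 16-bit formats; KV cache, activations, and runtime overhead are ignored, so real usage will be higher.

```python
def weight_gb_per_gpu(params_billion: float, bytes_per_param: float,
                      tensor_parallel: int) -> float:
    """Approximate weight memory per GPU in GB (1 GB = 1e9 bytes), weights only."""
    total_gb = params_billion * bytes_per_param  # billions of params * bytes each
    return total_gb / tensor_parallel


fp8 = weight_gb_per_gpu(123, 1, 2)   # FP8 stores one byte per parameter
bf16 = weight_gb_per_gpu(123, 2, 2)  # 16-bit weights need two bytes per parameter
print(f"FP8:  {fp8:.1f} GB of weights per GPU")   # 61.5
print(f"BF16: {bf16:.1f} GB of weights per GPU")  # 123.0
```

At 16 bits the weights alone would need about 123 GB per GPU, more than an 80 GB card holds, while FP8 leaves headroom for the KV cache on the same two cards.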

Devstral 2 Licensing

Devstral 2 (123B) has a revenue restriction — commercial use by organizations with >$20M monthly revenue requires a separate license from Mistral. Devstral Small 2 (24B) is fully Apache 2.0 with no restrictions.

For a production-ready self-hosted setup:

services:
  vllm:
    container_name: vllm-devstral
    image: vllm/vllm-openai:v0.19.0
    restart: unless-stopped
    volumes:
      - /path/to/models:/root/.cache/huggingface
    ipc: host
    command: >
      mistralai/Devstral-2-123B-Instruct-2512
      --served-model-name devstral
      --quantization fp8
      --tool-call-parser mistral
      --enable-auto-tool-choice
      --tensor-parallel-size 2
      --gpu-memory-utilization 0.95
      --max-num-seqs 4
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0', '1']
              capabilities: [gpu]
    ports:
      - "8080:8000"

Then point Vibe at it:

[[providers]]
name = "local"
api_base = "http://localhost:8080/v1"
api_key_env_var = "VLLM_API_KEY" # vLLM doesn't require a key, but Vibe needs the field
api_style = "openai"
backend = "generic"

Set a dummy key in ~/.vibe/.env:

VLLM_API_KEY=not-needed

For teams that want both a browser UI and CLI access, add Open-WebUI with LiteLLM as a gateway:

services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    ports:
      - "4000:4000"
    volumes:
      - ./config.yaml:/app/config.yaml:ro
    environment:
      LITELLM_MASTER_KEY: ${LITELLM_MASTER_KEY}
      # Assumes a Postgres service named litellm-db, which is not shown here
      DATABASE_URL: postgresql://litellm:${POSTGRES_PASSWORD}@litellm-db:5432/litellm
    extra_hosts:
      - "host.docker.internal:host-gateway"
    command: ["--config", "/app/config.yaml", "--port", "4000"]
    restart: unless-stopped
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    volumes:
      - open-webui-data:/app/backend/data
    environment:
      OPENAI_API_BASE_URL: http://litellm:4000/v1
      OPENAI_API_KEY: ${LITELLM_MASTER_KEY}
      ENABLE_OLLAMA_API: "false"
      WEBUI_AUTH: "true"
      WEBUI_NAME: "Your AI Instance"
      WEBUI_SECRET_KEY: ${WEBUI_SECRET_KEY}
      BYPASS_MODEL_ACCESS_CONTROL: "true"
    depends_on:
      - litellm
    restart: unless-stopped
# Named volume referenced by the open-webui service above
volumes:
  open-webui-data:

LiteLLM config (config.yaml) routes model names to vLLM backends:

model_list:
  - model_name: devstral-123b
    litellm_params:
      model: hosted_vllm/devstral # must match --served-model-name
      api_base: http://host.docker.internal:8080/v1
      api_key: none
litellm_settings:
  drop_params: true # prevents 422 from unsupported params
  request_timeout: 600 # 10 min for long coding tasks

  • The model field must use the vLLM --served-model-name, not the filesystem path. hosted_vllm/devstral works; hosted_vllm//workspace/models/Devstral-2-123B-Instruct-2512 does not.
  • drop_params: true is essential. It silently drops parameters the backend doesn’t support instead of returning 422 errors.
  • host.docker.internal resolves to the Docker host. Use it to reach vLLM containers from inside the LiteLLM container.

  • Temperature 0.2 is the community-recommended setting for coding tasks with Devstral.
  • /compact manually compresses conversation context; use it after long investigations.
  • Subagents can run tasks in parallel via the task tool, much like Claude Code’s Agent tool.
  • Per-project config: drop a .vibe/config.toml in any repo to override global settings (different model, different tools).
  • System prompts: create ~/.vibe/prompts/<name>.md and set system_prompt_id = "<name>" in config.
  • Session logs: stored at ~/.vibe/logs/ when enabled, which is useful for reviewing what Vibe did.

| Issue | Workaround | Status |
| --- | --- | --- |
| Ctrl+C breaks message alternation | Use /clear instead | Open (#255) |
| Tool calls fail through Open-WebUI proxy | Point Vibe directly at vLLM or LiteLLM | By design |
| Tool calls fail on LM Studio | Use vLLM with --tool-call-parser mistral | Confirmed (#124) |
| “Generating…” hangs indefinitely | Restart Vibe session | Open (#415) |
| TUI rendering breaks in some terminals | Use Alacritty, Ghostty, Kitty, or WezTerm | By design |
| Non-admin users see no models in Open-WebUI | Set BYPASS_MODEL_ACCESS_CONTROL=true | By design (since v0.4) |
| vLLM “model type not recognized” for new models | Build custom image with pip install --upgrade transformers | Gemma 4, etc. |
| LiteLLM 404 “model does not exist” | Use the --served-model-name value in config, not the filesystem path | Config issue |
| Sessions lost on Open-WebUI restart | Set WEBUI_SECRET_KEY in environment | Config issue |

When running your own model, “cost” is GPU time rather than API tokens:

  • --max-num-seqs 2-4 limits concurrent requests (prevents OOM on large models)
  • --gpu-memory-utilization 0.95 maximizes VRAM usage (safe when GPUs are dedicated)
  • auto_compact_threshold = 200000 prevents context from growing unbounded
  • max_output_bytes = 8000 on bash tool prevents long command outputs from bloating context
  • Run a compaction model (smaller/faster) for auto-compaction if available on the same endpoint

Claude Code + Vibe Orchestration (Best of Both Worlds)


The most powerful way to use Vibe isn’t standalone — it’s as a workhorse dispatched by Claude Code. Claude Code has superior reasoning, planning, and code review but is limited by subscription tokens. Vibe on a self-hosted model has unlimited tokens but weaker orchestration. Together: Claude’s brain + Vibe’s unlimited hands.

┌─────────────────────────────────────────────────────────────┐
│                     Claude Code (Brain)                     │
│  Plans → Dispatches → Reviews → Synthesizes → Commits       │
│                                                             │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐                   │
│  │ Vibe -p  │  │ Vibe -p  │  │ Vibe -p  │  (Parallel)       │
│  │ Agent 1  │  │ Agent 2  │  │ Agent 3  │                   │
│  │ files    │  │ tests    │  │ docs     │                   │
│  │ A-D      │  │ E-H      │  │ I-L      │                   │
│  └──────────┘  └──────────┘  └──────────┘                   │
│       ↓             ↓             ↓                         │
│  Claude Code reviews all results, fixes integration issues  │
└─────────────────────────────────────────────────────────────┘

Vibe’s -p flag runs it in headless/programmatic mode: auto-approves all tools, executes the prompt, outputs the result, and exits. This makes Vibe behave like a function call — prompt in, result out — perfect for dispatch from Claude Code.

# Single-shot execution — auto-approves all tools, runs to completion, exits
vibe -p "your prompt here" --workdir /path/to/project --output text --max-turns 25
# With tool restrictions (safer for research — can't modify files)
vibe -p "your prompt" --enabled-tools "read_file" --enabled-tools "grep" --output text
# JSON output for structured/parseable results
vibe -p "your prompt" --output json --max-turns 10

| Flag | Description |
| --- | --- |
| -p "prompt" | Headless mode. Auto-approves ALL tools. Runs and exits. |
| --workdir DIR | Set working directory (always set this) |
| --max-turns N | Limit assistant turns. 10-25 for research, 25-50 for implementation. |
| --output FORMAT | Output format: text, json, or streaming |
| --enabled-tools TOOL | Restrict available tools. Supports globs (bash*) and regex (re:.*). |
| --agent NAME | Use a custom agent profile |

| Task Type | Who | Why |
| --- | --- | --- |
| Planning & architecture | Claude Code | Superior multi-step reasoning, weighs tradeoffs |
| Code review & validation | Claude Code | Better judgment on quality, security, patterns |
| Synthesis & final decisions | Claude Code | Integrates results from multiple agents |
| Commit messages & git ops | Claude Code | Crafts precise conventional commits |
| File generation (new files) | Vibe | Unlimited tokens, follows templates well |
| Bulk edits (many files) | Vibe (parallel) | Each agent handles a subset of files |
| Research (docs, APIs, codebase) | Vibe | Reads entire docs without token pressure |
| Test writing | Vibe | Repetitive work, follows patterns |
| Documentation | Vibe | Good at following style guides with unlimited context |
| Bug investigation | Vibe (gather) → Claude (diagnose) | Vibe reads 20 files, Claude interprets |
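When an orchestrator dispatches Vibe from a script rather than a shell, the headless flags from the flag table can be assembled as an argument list. The sketch below uses only the flags documented in that table; the helper name `vibe_command` and its defaults are illustrative choices, not part of Vibe.

```python
import shlex


def vibe_command(prompt, workdir, max_turns=25, output="text", enabled_tools=()):
    """Build the argv list for one headless Vibe run."""
    argv = ["vibe", "-p", prompt, "--workdir", str(workdir),
            "--max-turns", str(max_turns), "--output", output]
    for tool in enabled_tools:
        argv += ["--enabled-tools", tool]  # the flag is repeated once per tool
    return argv


research = vibe_command("List all API endpoints", "/path/to/project",
                        max_turns=10, enabled_tools=["read_file", "grep"])
print(shlex.join(research))  # shell-quoted form, handy for logging
```

Passing the list to `subprocess.run` avoids shell quoting problems with multi-line prompts.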

When tasks touch different files, run multiple Vibe agents simultaneously. Claude Code spawns them as background bash commands or parallel subagents:

# Agent 1: Implement auth module
vibe -p "Create src/lib/auth.ts with login, logout, and session functions.
Follow patterns in src/lib/database.ts." \
--workdir /path/to/project --max-turns 30 --output text &
# Agent 2: Write tests (different files)
vibe -p "Write tests for src/lib/utils.ts at src/tests/utils.test.ts.
Use vitest. Cover all exported functions." \
--workdir /path/to/project --max-turns 25 --output text &
# Agent 3: Generate docs (different files)
vibe -p "Document all exported functions in src/lib/ to docs/api.md.
Follow the existing style in docs/README.md." \
--workdir /path/to/project --max-turns 20 --output text &
wait # All three complete in parallel

Claude Code then reviews all results, checks for integration issues, and makes targeted fixes.
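The same fan-out and wait pattern can be expressed in Python when the dispatcher is a script. This is a minimal sketch using a thread pool around `subprocess.run`; the demo commands are stand-ins so the example runs without Vibe installed, and in practice each entry would be a `vibe -p ... --workdir ...` argument list.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_all(commands: list[list[str]]) -> list[subprocess.CompletedProcess]:
    """Run every command concurrently and return results in the original order."""
    def run(argv):
        return subprocess.run(argv, capture_output=True, text=True)

    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(run, commands))  # map preserves input order


# Stand-in commands; replace each with a real Vibe argv list.
demo = [[sys.executable, "-c", f"print('agent {n} done')"] for n in (1, 2, 3)]
for result in run_all(demo):
    print(result.returncode, result.stdout.strip())
```

Checking each `returncode` before the review step makes a failed agent visible instead of silently missing from the results.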

When step 2 needs step 1’s output, chain them:

# Step 1: Research (read-only, safe)
FINDINGS=$(vibe -p "Read src/api/ and list all endpoints, their methods, and parameters." \
--workdir /path/to/project \
--enabled-tools "read_file" --enabled-tools "grep" \
--max-turns 10 --output text)
# Claude Code reads $FINDINGS, plans the implementation, then:
# Step 2: Implement (based on research)
vibe -p "Based on these existing endpoints:
$FINDINGS
Add a new POST /api/users/reset-password endpoint following the same patterns." \
--workdir /path/to/project --max-turns 35 --output text
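The chaining step above amounts to feeding one run's output into the next run's prompt. A small Python sketch makes that data flow explicit. The `run` callable is a hypothetical wrapper around a headless Vibe call (taking a prompt and a read-only switch and returning the text output); it is injected so the chaining logic can be read and exercised on its own.

```python
from typing import Callable


def research_then_implement(run: Callable[[str, bool], str],
                            research_prompt: str, task: str) -> str:
    """Run a read-only research step, then embed its findings in the next prompt."""
    findings = run(research_prompt, True)    # True: restrict to read-only tools
    implement_prompt = (
        "Based on these existing endpoints:\n"
        f"{findings}\n"
        f"{task}"
    )
    return run(implement_prompt, False)      # False: allow edits
```

Keeping the research step read-only means a bad first prompt cannot change the repo before anyone has looked at the findings.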

Restrict Vibe to read-only tools for pure investigation:

vibe -p "Read all files in src/components/ and src/lib/.
Find everywhere that calls the 'authenticate' function.
Report: which files, which line numbers, what arguments are passed." \
--workdir /path/to/project \
--enabled-tools "read_file" --enabled-tools "grep" \
--max-turns 15 --output text

With only read_file and grep enabled, Vibe has no tool that can modify files. That matters because headless mode auto-approves every tool call.

Scenario: Implement a new “Reset Password” feature across API, frontend, and tests.

  1. Claude Code plans — breaks the feature into 4 independent tasks
  2. Claude Code dispatches parallel Vibe agents:
    • Agent 1: Create src/api/reset-password.ts (backend endpoint)
    • Agent 2: Create src/components/ResetPasswordForm.tsx (frontend)
    • Agent 3: Write src/tests/reset-password.test.ts (tests)
    • Agent 4: Update docs/api.md with new endpoint docs
  3. All 4 agents run simultaneously on unlimited Vibe tokens
  4. Claude Code reviews all generated files for:
    • Do imports resolve correctly across modules?
    • Does the frontend call the right API endpoint?
    • Do tests cover the actual implementation (not just stubs)?
    • Any security issues (input validation, auth checks)?
  5. Claude Code fixes integration issues (or dispatches targeted Vibe fixes)
  6. Claude Code commits with a conventional commit message

| Approach | Token Cost | Time |
| --- | --- | --- |
| Claude Code does everything | ~500K tokens ($2-10 depending on plan) | 1 session |
| Vibe does everything | Free (self-hosted) but weaker planning | May loop/fail |
| Claude orchestrates + Vibe implements | ~50K Claude tokens + unlimited Vibe | Best of both |

Claude Code’s token spend drops by ~90% because it only handles planning, review, and synthesis — the three things it’s best at. All the heavy file reading, generation, and bulk edits happen on Vibe’s unlimited self-hosted backend.
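The ~90% figure follows directly from the approximate token counts in the comparison table, as this short calculation shows. The numbers are the table's estimates, not measurements.

```python
def reduction_percent(before_tokens: int, after_tokens: int) -> float:
    """Percentage drop in token usage between two approaches."""
    return 100 * (before_tokens - after_tokens) / before_tokens


solo = 500_000         # Claude Code does everything (table estimate)
orchestrated = 50_000  # Claude only plans, reviews, and synthesizes (table estimate)
print(f"{reduction_percent(solo, orchestrated):.0f}% fewer Claude tokens")  # 90%
```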

cc-vibe — Using Claude Code with the Mistral API


A complementary setup to the orchestration pattern above: instead of Claude Code (Anthropic cloud) → Vibe (Mistral), run Claude Code itself on Mistral for everyday work, then escalate to cloud Claude only for hard tasks. Many community members alias this as cc-vibe.

The catch: Claude Code speaks the Anthropic Messages API, but api.mistral.ai speaks Mistral’s chat-completions format. They are not wire-compatible. You need a translator in front of Mistral. The standard choice is a local LiteLLM proxy.

┌──────────────┐  Anthropic       ┌───────────────┐  Mistral chat          ┌──────────────────┐
│ Claude Code  │  Messages API    │ LiteLLM proxy │  /v1/chat/completions  │ api.mistral.ai   │
│ (cc-vibe)    │ ───────────────► │ localhost     │ ─────────────────────► │ (Official API)   │
└──────────────┘  :4000           └───────────────┘                        └──────────────────┘

1. Install LiteLLM:

uv tool install 'litellm[proxy]'
# or: pip install --user 'litellm[proxy]'

2. Configure the proxy at ~/.vibe/litellm-config.yaml:

# Mistral → Anthropic-compatible translator
# Exposes Mistral models at http://localhost:4000/v1/messages (Anthropic format).
model_list:
  - model_name: mistral-small
    litellm_params:
      model: mistral/mistral-small-latest
      api_key: os.environ/MISTRAL_API_KEY
  - model_name: mistral-medium
    litellm_params:
      model: mistral/mistral-medium-latest
      api_key: os.environ/MISTRAL_API_KEY
  - model_name: mistral-large
    litellm_params:
      model: mistral/mistral-large-latest
      api_key: os.environ/MISTRAL_API_KEY
  - model_name: codestral
    litellm_params:
      model: mistral/codestral-latest
      api_key: os.environ/MISTRAL_API_KEY
litellm_settings:
  drop_params: true # silently drop unsupported params (Mistral rejects some)
  set_verbose: false
general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY

3. Add keys to ~/.vibe/.env (the vibe CLI already reads this file):

MISTRAL_API_KEY=<your-mistral-key> # used by both Vibe and the LiteLLM proxy upstream
LITELLM_MASTER_KEY=sk-vibe-<random> # used by Claude Code to authenticate TO the proxy

The distinction matters and is the #1 setup mistake: MISTRAL_API_KEY authenticates the proxy to Mistral. LITELLM_MASTER_KEY authenticates Claude Code to the proxy. They are not interchangeable — the proxy will reject MISTRAL_API_KEY, and Mistral will reject LITELLM_MASTER_KEY.

4. Start the proxy (leave it running in the background, e.g. via launchd, systemd, or a tmux pane):

set -a # export every variable defined while sourcing, so the proxy process can read them
source ~/.vibe/.env
set +a
litellm --config ~/.vibe/litellm-config.yaml --port 4000 &

5. Point Claude Code at it. Easiest path is the community claude-switch helper, with this entry in ~/.claude-backends.env:

VIBE_API_KEY_FILE="$HOME/.vibe/.env"
VIBE_BASE_URL="http://localhost:4000"
VIBE_MODEL="mistral-medium"
VIBE_MODEL_NAME="Mistral Medium (Official API via LiteLLM)"

…and a switch_vibe() that exports ANTHROPIC_BASE_URL=$VIBE_BASE_URL and ANTHROPIC_AUTH_TOKEN=$LITELLM_MASTER_KEY. Then:

alias cc-vibe='source claude-switch vibe && claude --dangerously-skip-permissions --teammate-mode auto'
cc-vibe # Claude Code now talking to Mistral Medium via the local proxy
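To make the key handoff concrete, here is a sketch of what a switch helper has to compute: read `~/.vibe/.env` and produce the two `ANTHROPIC_*` variables named above. The parser and function names are hypothetical, and the `.env` handling is deliberately simple (KEY=value lines with optional trailing `#` comments), so it is not a drop-in replacement for the claude-switch script.

```python
def parse_env_file(text: str) -> dict[str, str]:
    """Parse KEY=value lines, ignoring blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.split(" #", 1)[0].strip()  # drop trailing inline comments
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def claude_env(env_text: str, base_url: str = "http://localhost:4000") -> dict[str, str]:
    """Variables Claude Code needs in order to talk to the local proxy."""
    keys = parse_env_file(env_text)
    return {
        "ANTHROPIC_BASE_URL": base_url,
        # The proxy's master key, not the Mistral key, authenticates Claude Code.
        "ANTHROPIC_AUTH_TOKEN": keys["LITELLM_MASTER_KEY"],
    }
```

Note that the Mistral key never appears in the result: only the proxy needs it.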

6. Smoke test the round-trip:

source claude-switch vibe
claude -p "reply with exactly: cc-vibe-ok" --output-format text
# → cc-vibe-ok

For the IrregularChat self-hosted backend (Open-WebUI / LiteLLM gateway): no local proxy needed — point ANTHROPIC_BASE_URL directly at the community gateway (which already exposes Anthropic-compatible endpoints) and use your Open-WebUI API key as ANTHROPIC_AUTH_TOKEN. See Claude Code with Self-Hosted Models → Setup with LiteLLM Gateway.

Vibe and OpenCode both work as dispatch targets, but they have different strengths:

| Capability | Vibe | OpenCode |
| --- | --- | --- |
| --workdir flag | Yes | No (must cd first) |
| LSP diagnostics | No | Yes (TS, Go, Rust, Python) |
| Session continuity | No (stateless) | Yes (--continue) |
| Cold start | 0.49s | 0.85s |
| JSONL cost/tokens | No (internal only) | Yes (per-step events) |
| File writing | Yes (writes + shows text) | Yes (writes + shows text) |
| Temp directory support | Works (--workdir) | Fails (needs project root) |
| Cost budget limit | --max-price | No |

Rule of thumb: Use Vibe for one-shot dispatch to any directory. Use OpenCode for multi-step TypeScript/Go work where LSP matters. See the OpenCode orchestration guide for the OpenCode-specific pattern.

  • vibe CLI installed and on PATH (~/.local/bin/vibe)
  • ~/.vibe/.env with your API key configured
  • ~/.vibe/config.toml with provider and model configured
  • Self-hosted backend must handle concurrent requests (--max-num-seqs on vLLM ≥ number of parallel agents)