CLI + IDE Coding Agents — Plan Pricing vs Token/Context & Usage Limits

Snapshot date: 2026-02-14

Comparing AI coding agents is messy because most vendors do not publish token caps per plan, especially for consumer-oriented CLI and IDE add‑ons. This page distills what is actually disclosed, flags what is not stated, and links to vendor primary sources so you can cite them in acquisition paperwork or budget briefings. Use it alongside the Claude Code Guide, Gemini Code Guide, OpenHands Guide, and the Full-Stack Development with AI workflow for deeper operational context.


  • Plan allowance unit: Vendors express limits as tokens, requests, credits, or raw dollars. Pay attention to the unit before comparing plans.
  • Context window vs. quota: Some vendors publish context window sizes (max tokens in a single prompt) but not recurring monthly allowances.
  • “Not published” ≠ unlimited: When a doc says limits are not published, that only means the vendor hasn’t shared the numbers publicly; internal throttles and fair-use caps still apply.
  • Shared pools: Several tools share quotas between CLI/IDE agents and their main chat apps. If you burn through Claude Code, you also reduce Claude desktop/web usage.
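Unit mismatches are the easiest mistake to make when these numbers land in a spreadsheet. A minimal Python sketch (with invented vendor entries) of tagging each allowance with its unit so cross-unit comparisons fail loudly:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Allowance:
    vendor: str
    amount: float
    unit: str  # "tokens", "requests", "credits", or "dollars"

def comparable(a: Allowance, b: Allowance) -> bool:
    """Two allowances are only directly comparable in the same unit."""
    return a.unit == b.unit

# Example: a requests/day cap vs a credits/month pool cannot be
# compared number-to-number.
gemini = Allowance("Gemini CLI (Standard)", 1500, "requests")  # per day
windsurf = Allowance("Windsurf Pro", 500, "credits")           # per month

print(comparable(gemini, windsurf))  # False
```

Anything downstream (cost models, procurement sheets) should refuse to rank plans until both sides are converted to the same unit.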

1.1 Claude Code (Anthropic)

  • Plan inclusion: Claude Code ships with Claude Pro and Claude Max subscriptions and shares usage limits with Claude web/desktop/mobile.
    Source: Anthropic Support
  • Token behavior: Anthropic confirms Claude Code consumes tokens and offers cost guidance, but does not publish a consumer-facing token cap table.
    Source: Claude Code – Costs
  • Context window (consumer): Anthropic describes automatic context management for paid Claude plans but does not state a fixed window size in the help article.
    Source: How large is the context window on paid Claude plans?
  • Enterprise/API note: The Claude API exposes beta 1,000,000 token windows on Opus 4.6 and Sonnet 4.x for certain org tiers, but this is separate from consumer Claude Code limits.
    Source: Build with Claude – Context windows
  • Pricing: Consumer pricing lives at claude.com/pricing; team/enterprise pricing details are in the Claude Code Funding guide.

Wiki-ready takeaways

  • ✅ Included with Pro/Max and shares limits with the main Claude app
  • ⚠️ No published “X tokens/day” number for consumer CLI usage
  • ✅ API/org tiers offer 1M-token beta windows (separate channel)

1.2 Codex CLI (OpenAI)

  • Plan inclusion: Codex CLI runs locally and can authenticate via ChatGPT Plus/Pro/Business/Edu/Enterprise or by API key.
    Source: Codex CLI docs
  • Pricing: OpenAI explains that some Codex modes track against ChatGPT plan entitlements, while API-key access is pay-per-token using standard GPT pricing.
    Source: Codex pricing
  • Model guidance: OpenAI recommends gpt-5.3-codex for most CLI work; Codex-tuned models appear in the API catalog.
    Source: Codex models and Prompting guide
  • Context window publication: OpenAI does not publish a CLI-specific context window number. Published limits instead focus on workspace model sizes (e.g., GPT-5.1 Thinking at 196K tokens in ChatGPT Enterprise/Edu).
    Source: ChatGPT Enterprise and Edu models & limits
  • Usage caps: ChatGPT help center articles reference weekly message allocations and higher context limits for GPT-5 Thinking, but no table maps those caps directly to Codex CLI.
    Source: ChatGPT usage limits

Wiki-ready takeaways

  • ✅ Codex CLI is bundled across the ChatGPT paid family or can run on an API key
  • ⚠️ Numeric CLI context window caps are not stated publicly
  • ✅ Enterprise/Edu workspaces document GPT model context windows (128K / 196K), but cite that data carefully: it is not branded as “Codex CLI limits”

1.3 Gemini CLI (Google Gemini Code Assist)

  • Context window: Google Cloud’s Gemini Code Assist docs state 1,000,000 token context window for “local codebase awareness.”
    Source: Gemini quotas
  • Quota pool: Gemini CLI and “agent mode” share the same quota pool; a single CLI prompt may issue multiple backend requests.
    Source: Gemini quotas
  • Requests per user:
    • Standard edition: 120 requests/user/minute, 1,500 requests/user/day
    • Enterprise edition: 120 requests/user/minute, 2,000 requests/user/day
      Source: Gemini quotas
  • Broader API rate limits: Google’s AI for Developers docs expand on RPM/TPM/RPD tiers for Gemini APIs.
    Source: Gemini API rate limits
  • Pricing context: Public pricing is fragmented across consumer “Gemini Advanced/AI Pro” bundles and Google Cloud Standard vs Enterprise editions. For CLI/agent comparisons, the Cloud quota doc is the authoritative table.
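If you script against these quotas, a client-side guard can stop a batch job before Google’s limits do. A sketch using the documented Standard-edition numbers (the `QuotaGuard` class itself is illustrative, not part of any Google SDK):

```python
import time
from collections import deque

class QuotaGuard:
    """Client-side check against per-minute and per-day request caps."""

    def __init__(self, per_minute: int = 120, per_day: int = 1500):
        self.per_minute = per_minute
        self.per_day = per_day
        self.minute_window = deque()  # timestamps of recent requests
        self.day_count = 0

    def try_acquire(self, now=None) -> bool:
        """Return True if one more request fits under both caps."""
        now = time.monotonic() if now is None else now
        # Drop timestamps older than 60 seconds from the sliding window.
        while self.minute_window and now - self.minute_window[0] >= 60:
            self.minute_window.popleft()
        if len(self.minute_window) >= self.per_minute or self.day_count >= self.per_day:
            return False
        self.minute_window.append(now)
        self.day_count += 1
        return True

guard = QuotaGuard()
allowed = sum(guard.try_acquire(now=0.0) for _ in range(130))
print(allowed)  # 120: the per-minute cap rejects the final 10
```

Because a single CLI prompt can fan out into several backend requests, budget each prompt as more than one `try_acquire` call.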

Wiki-ready takeaways

  • ✅ 1M token local context is explicitly documented
  • ✅ Requests/day and requests/minute caps are public
  • ✅ CLI + agent mode use the same quota pool, so workflows must budget for both

2.1 Cursor

  • Context windows: Cursor typically runs at 200K token context windows. “Max Mode” expands to each model’s maximum (including 1M-token Gemini 3 Pro).
    Sources: Max Mode docs and Cursor models
  • Usage allowance: Cursor Pro bundles $20/month of “frontier model usage” priced at API rates; you can top up at cost.
    Source: June 2025 pricing update
  • Higher tier: Cursor Ultra costs $200/month and offers “20× more usage than Pro,” again denominated in dollar-equivalent usage pools rather than tokens.
    Source: New tier announcement
  • Token disclosure: Cursor has not published a per-plan “N tokens/month” chart; usage depends on which model you route through Max Mode.
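Because Cursor denominates the allowance in dollars, translating the $20 pool into tokens requires the per-token API price of whichever model you route through. A rough sketch with placeholder prices (not Cursor’s actual rates; plug in current API pricing before relying on it):

```python
POOL_USD = 20.00  # Cursor Pro's included monthly usage pool

def tokens_per_pool(input_price_per_mtok: float,
                    output_price_per_mtok: float,
                    output_ratio: float = 0.25) -> int:
    """Roughly how many input tokens the pool buys, assuming
    `output_ratio` output tokens are generated per input token."""
    cost_per_tok = (input_price_per_mtok
                    + output_ratio * output_price_per_mtok) / 1_000_000
    return int(POOL_USD / cost_per_tok)

# Hypothetical $3/M input, $15/M output pricing:
print(f"{tokens_per_pool(3.0, 15.0):,} input tokens per month")
```

Routing through a cheaper model stretches the same pool by an order of magnitude, which is why Cursor’s allowance cannot be stated as a single token number.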

2.2 Windsurf

  • Credits, not tokens: Windsurf plans grant monthly prompt credits: Free 25, Pro 500, Teams 500/user, Enterprise 1,000/user (and higher upon request).
    Source: Windsurf pricing
  • Consumption model: Credits are deducted per agent interaction, with multipliers based on the chosen model.
    Source: Windsurf usage docs
  • Context window disclosure: Windsurf’s “Fast Context” feature explains retrieval behavior but does not publish a numeric token window per plan.
    Source: Fast Context
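For credit-denominated plans, monthly capacity depends on the model multiplier. A sketch using Windsurf Pro’s 500 credits and invented multipliers (the real per-model multipliers live in the usage docs and change over time):

```python
MONTHLY_CREDITS = 500  # Windsurf Pro, per the pricing above

# Multipliers below are illustrative assumptions, not Windsurf's.
HYPOTHETICAL_MULTIPLIERS = {
    "small-model": 0.5,
    "default-model": 1.0,
    "frontier-model": 2.0,
}

def interactions_left(credits: float, model: str) -> int:
    """How many agent interactions the remaining credits cover."""
    return int(credits // HYPOTHETICAL_MULTIPLIERS[model])

print(interactions_left(MONTHLY_CREDITS, "frontier-model"))  # 250
print(interactions_left(MONTHLY_CREDITS, "small-model"))     # 1000
```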

2.3 GitHub Copilot (Coding agent features)

  • Agent availability: GitHub’s coding agent capabilities are available on Copilot Pro, Pro+, Business, and Enterprise (depending on rollout).
    Source: About the coding agent
  • Billing unit: Copilot tracks requests and premium requests, not raw tokens. Organizations can allocate premium request budgets per seat.
    Sources: Copilot premium requests and Request definition
  • Token context: GitHub does not publish token window sizes. Practical planning is based on how many premium agent requests are in your monthly allocation.
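Since Copilot accounting is request-based, per-seat planning reduces to tracking premium requests against a monthly allocation. A hypothetical tracker (the budget size and the premium/standard split here are assumptions, not GitHub figures):

```python
class SeatBudget:
    """Per-seat counter: premium requests draw down a monthly budget,
    standard requests do not."""

    def __init__(self, premium_budget: int):
        self.premium_budget = premium_budget
        self.premium_used = 0

    def record(self, premium: bool) -> bool:
        """Record one request; return False if a premium request would
        exceed the seat's monthly premium allocation."""
        if not premium:
            return True
        if self.premium_used >= self.premium_budget:
            return False
        self.premium_used += 1
        return True

seat = SeatBudget(premium_budget=300)
results = [seat.record(premium=True) for _ in range(301)]
print(results.count(False))  # 1: the 301st premium request is rejected
```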

| Product | Agent type | Plan allowance unit | Published allowance | Published context window (tokens) | Notes |
| --- | --- | --- | --- | --- | --- |
| Claude Code | CLI | Shared usage limits (Claude app + CLI) | Included with Claude Pro / Max | Consumer CLI: not published; API beta 1M tokens on Opus/Sonnet tiers | Keep team/enterprise purchasing nuances in Claude Code Funding |
| Codex CLI | CLI | ChatGPT plan entitlements or API pay-per-token | Included with ChatGPT Plus/Pro/Business/Edu/Enterprise; API billed per token | Not publicly published for the CLI | Workspace docs cite GPT-5.x context sizes (128K / 196K), not branded as Codex CLI |
| Gemini CLI (Code Assist) | CLI | Requests per user (shared with agent mode) | Standard: 1,500 req/day & 120 req/min; Enterprise: 2,000 req/day & 120 req/min | 1,000,000 (local codebase awareness) | Quotas documented in Google Cloud Gemini guide |
| Cursor | IDE | Dollar-equivalent usage pool | Pro: $20/mo included; Ultra: 20× Pro | Default 200K; Max Mode up to model max (1M+) | Token allowance varies with chosen model cost |
| Windsurf | IDE | Credits per month | Free 25; Pro 500; Teams 500/user; Enterprise 1,000/user | Not published (model-dependent) | Credits deducted per agent call with multipliers |
| GitHub Copilot | IDE/Agent | Requests / premium requests | Premium request budgets per org plan | Not published | Limits measured in request budgets, not tokens |

Source: Maintained by the IrregularChat community. Submit PRs (or sheet updates) when vendors change pricing or limits.

| Product | Plan / Price (USD) | Allowance metric | Published context window | Notes | Source |
| --- | --- | --- | --- | --- | --- |
| Claude Code | Pro $20/mo, Max $35/mo (consumer) | Shared Claude usage pool across web/desktop/CLI | Consumer CLI: not published; API beta 1M tokens on Opus/Sonnet tiers | Included with Pro/Max; auto context management | Claude pricing · Context window FAQ |
| Codex CLI | ChatGPT Plus $20/mo, ChatGPT Pro $50/mo, Business/Edu/Enterprise = contract | ChatGPT plan entitlements or API pay-per-token | CLI-specific cap not published; workspace docs cite GPT-5.1 (128K) / GPT-5.1 Thinking (196K) | Runs locally with ChatGPT auth or API key | Codex pricing · Enterprise limits |
| Gemini CLI (Code Assist) | Standard & Enterprise seats billed via Google Cloud | Standard: 1,500 req/day + 120 rpm; Enterprise: 2,000 req/day + 120 rpm (shared with agent mode) | 1,000,000 tokens local codebase awareness | Requests are pooled between CLI + agent workflows | Gemini quotas |
| Cursor | Pro $20/mo (includes $20 API usage), Ultra $200/mo (20× usage) | Dollar-based pool spent at API pricing; top-ups sold at cost | Default 200K; Max Mode up to each model max (1M+ when supported) | Usage varies with chosen model mix | June 2025 pricing · New tier |
| Windsurf | Free, Pro, Teams, Enterprise (see pricing page for current USD rates) | Monthly credits: 25 / 500 / 500/user / 1,000/user | Not published (model-dependent via Fast Context) | Credits burn faster on larger models | Windsurf pricing · Usage docs |
| GitHub Copilot | Pro/Pro+/Business/Enterprise (per-seat, listed on GitHub pricing) | Requests & premium requests per seat/org budget | Not published | Coding agent availability depends on tier; request budgets throttle heavy workflows | Copilot agent · Premium requests |

  1. Budget translation: Convert each vendor’s allowance (credits, requests, dollar pools) into the workload you care about (e.g., “How many full repo refactors per month?”). Capture those assumptions in your SOP so new operators know the tradeoffs.
  2. Pair with workflow guides: Use this page as a factual reference, then hand teammates to the Full-Stack Development with AI playbook for end-to-end harness design.
  3. Document floor vs. ceiling: Vendors reserve the right to throttle during abuse spikes. Keep the authoritative links above handy when leadership needs proof of the published numbers.
  4. Cross-check quarterly: Pricing and quota disclosures change frequently. Set a reminder to refresh these notes when revisiting procurement or renewing subscriptions.
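Step 1 can be captured as a few recorded assumptions. A back-of-envelope conversion of a daily request cap into “full repo refactors per month” (all per-task numbers below are assumptions to write into your SOP, not vendor figures):

```python
REFACTOR_REQUESTS = 40    # assumed backend requests per full repo refactor
DAILY_REQUEST_CAP = 1500  # e.g. Gemini Code Assist Standard edition
WORKDAYS_PER_MONTH = 21

refactors_per_month = (DAILY_REQUEST_CAP // REFACTOR_REQUESTS) * WORKDAYS_PER_MONTH
print(refactors_per_month)  # 777: a monthly ceiling, before any throttling
```

Writing the per-task estimate down matters more than its precision: when a vendor changes a cap, you rerun one line instead of re-deriving the budget.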

Need more pricing nuance (especially for Anthropic enterprise seats)? See the Claude Code Funding & Subscriptions guide.