
Best LLM Gateways of 2026

Updated · 7 picks · live pricing · affiliate disclosure


BEST OVERALL · 8.1/10 · Save $360/yr

Portkey

Production-gateway with caching, prompt management, and guardrails across 40+ providers.

Developer Free 10K req/mo; cancel anytime

How it stacks up

  • Production $49/mo + 100K logs

    vs OpenRouter Pro $20

  • OSS Apache 2.0 March 2026

    vs LiteLLM Cloud Pro $50/user

  • Caching + guardrails + obs

    vs Helicone Pro $79

#2
OpenRouter · 7.9/10

From $20/mo

#3
Vercel AI Gateway · 7.8/10

From $20/mo


All picks at a glance

# | Pick | Best for | Starting | Score
1 | Portkey | Best production-gateway with caching, fallbacks, and guardrails | $49.00/mo | 8.1/10
2 | OpenRouter | Best overall LLM gateway, the indie-dev brand reference | $20.00/mo | 7.9/10
3 | Vercel AI Gateway | Best frontend-stack-native gateway, bundled with Vercel | $20.00/mo | 7.8/10
4 | LiteLLM | Best open-source LLM gateway proxy, Apache 2 / MIT | $50.00/mo | 7.5/10
5 | Helicone | Best observability-first LLM gateway, one-line proxy | $79.00/mo | 6.6/10
6 | Cloudflare AI Gateway | Best edge-native LLM gateway, Cloudflare Workers | $5.00/mo | 4.6/10
7 | Langfuse | Best open-source LLM observability with self-host | $29.00/mo | 4.2/10

Quick pick by use case

If you only have thirty seconds, find your situation below and skip to that pick.

Compare all 7 picks

Pick | Score | Monthly (Annual) | vs baseline | Top spec
#1 Portkey | 8.1/10 | $49.00/mo ($588.00/yr) | Save $360/yr | Production $49/mo + 100K logs
#2 OpenRouter | 7.9/10 | $20.00/mo | Save $708/yr | Pro $20/mo + Slack
#3 Vercel AI Gateway | 7.8/10 | $20.00/mo ($240.00/yr) | Save $708/yr | Pro $20/user bundled
#4 LiteLLM | 7.5/10 | $50.00/mo ($600.00/yr) | Save $348/yr | OSS free Apache 2 / MIT
#5 Helicone | 6.6/10 | $79.00/mo ($948.00/yr) | n/a | Pro $79/mo + unlimited seats
#6 Cloudflare AI Gateway | 4.6/10 | $200.00/mo ($2,400.00/yr) | $1,452/yr more | Workers Paid $5/mo
#7 Langfuse | 4.2/10 | $199.00/mo ($2,388.00/yr) | $1,440/yr more | Core $29/mo + 100K obs
#1

Portkey

8.1/10Save $360/yr

Best production-gateway with caching, fallbacks, and guardrails

Production-gateway with caching, prompt management, and guardrails across 40+ providers.

Plan | Monthly | Annual | What you get
Open Source | Free | n/a | Apache 2.0 self-hosted gateway with Universal API, retries, routing, guardrails, automatic fallbacks, basic dashboard, and load balancing; free forever (open-sourced March 2026)
Developer Free | Free | n/a | Free hosted tier with 10K recorded logs a month, Universal API, key management, 3 prompt templates, and basic observability
Production | $49.00/mo | $588.00/yr | 100K logs with unlimited prompt templates, alerts, LLM guardrails, semantic caching, RBAC, and overages at $9 per additional 100K (up to 3M)
Enterprise | $2,000.00/mo | $24,000.00/yr | Custom contract with 10M-plus logs, custom retention, advanced guardrails, SSO, VPC hosting, SOC 2 Type II, GDPR, and HIPAA

Portkey is the production-gateway pick, founded in 2023 in San Francisco. The wedge is full feature breadth: caching, fallbacks, prompt management, guardrails, RBAC, SSO, and observability across 40+ providers. Portkey open-sourced its full gateway under Apache 2.0 on March 24, 2026.

Open Source Apache 2.0 is free forever with Universal API, retries, routing, guardrails, fallbacks, and load balancing for self-host. Developer Free hosted covers 10K logs a month with key management and basic observability. Production at $49 a month covers 100K logs with unlimited prompt templates, semantic caching, LLM guardrails, and RBAC; overages bill at $9 per 100K up to 3M (the realistic SMB paid entry). Enterprise covers 10M-plus logs, VPC hosting, SSO, SOC 2 Type II, GDPR, and HIPAA.

The trade-off versus OpenRouter: Portkey ships the gateway features that matter at production scale (cache reduces token spend; fallbacks survive provider outages; guardrails enforce content policies) but indie-dev brand recognition is narrower. OpenRouter wins the head-term reader by name; Portkey wins the production-team reader by feature breadth plus the open-source self-host option.
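To make the fallback mechanics concrete, here is a sketch of a fallback routing config in the shape Portkey's gateway documentation describes. The field names and header names are recalled from the public docs, so treat them as assumptions to verify against the current reference.

```python
import json

# Hypothetical fallback config: try OpenAI first, fall back to Anthropic on
# errors or rate limits. Shape follows Portkey's documented gateway config.
fallback_config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {"provider": "openai", "api_key": "OPENAI_KEY"},
        {"provider": "anthropic", "api_key": "ANTHROPIC_KEY"},
    ],
}

# The config rides along as request headers on an otherwise normal
# OpenAI-style call through the gateway.
headers = {
    "x-portkey-api-key": "PORTKEY_KEY",
    "x-portkey-config": json.dumps(fallback_config),
}
```

The point of the shape: routing policy lives in configuration, not application code, so swapping the fallback order never touches your call sites.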

Pros

  • Open-sourced Apache 2.0 March 2026; full self-host gateway free forever
  • Caching, fallbacks, prompt management, guardrails, and observability on one dashboard
  • Production at $49/mo covers 100K logs with $9 overages up to 3M requests
  • Cloud handles 1T-plus tokens a day across 24,000-plus organizations as of March 2026
  • Enterprise covers VPC hosting with SOC 2 Type II, GDPR, and HIPAA

Cons

  • Indie-dev brand recognition narrower than OpenRouter; head-term reader skews mainstream-name
  • Self-host requires you to run and monitor the gateway; managed cloud is cleaner operationally
Production $49/mo + 100K logs · OSS Apache 2.0 March 2026 · Caching + guardrails + obs · Developer Free 10K req/mo; cancel anytime

Best for: Production teams that need caching, guardrails, and observability on one dashboard. Production $49/mo entry; OSS self-host free forever.

Routing 9 · Latency 9 · DX 8 · Value 8 · Support 8
#2

OpenRouter

7.9/10Save $708/yr

Best overall LLM gateway, the indie-dev brand reference

300+ models from 60+ providers behind a single OpenAI-compatible endpoint; founded 2023.

Plan | Monthly | What you get
Free | Free | $1 starter credit with access to 300+ models from 60+ providers behind a unified OpenAI-compatible API
Pay-as-you-go | Free | Pass-through provider pricing with zero markup on inference; a small Stripe fee applies on credit purchases
Pro | $20.00/mo | $20 a month credit minimum with higher rate limits, Slack support, and the indie-dev paid entry
Enterprise | $1,000.00/mo | Custom contract with dedicated routing, private deploys, SLA, and audit logs

OpenRouter is the API aggregator and the indie-dev brand reference for LLM gateways, founded in 2023 in San Francisco. The wedge is genuinely unique: 300+ models from 60+ providers behind a single OpenAI-compatible endpoint with pay-as-you-go credit-balance pricing and pass-through provider rates (zero markup on inference tokens).

Free gives every account a $1 starter credit with full access to all models for evaluation. Pay-as-you-go bills consumed tokens at the underlying provider price with no inference markup; a small Stripe processing fee applies on credit purchases. Pro at $20 a month is a credit minimum that unlocks higher rate limits and Slack support; the realistic indie-dev paid entry. Enterprise at $1,000 a month covers dedicated routing, private deploys, an SLA, and audit logs.

The trade-off is feature breadth, not pricing. OpenRouter ships fallbacks and basic rate limiting but not caching, guardrails, or prompt management; for high-volume production workloads the same workflow on Portkey Production at $49 a month adds caching that often pays for itself in token savings. OpenRouter is the right call when model variety matters more than gateway feature breadth and when you do not want to manage your own provider keys.
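The single-endpoint claim is easy to see in code. A stdlib-only sketch that builds (but does not send) a Chat Completions request against OpenRouter's unified endpoint; the API key and model slug are placeholders.

```python
import json
import urllib.request

def build_openrouter_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style Chat Completions request against OpenRouter."""
    payload = {
        "model": model,  # provider-prefixed slug, e.g. "openai/gpt-4o"
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_openrouter_request("sk-or-...", "openai/gpt-4o", "Hello")
# urllib.request.urlopen(req) would send it; omitted so the sketch stays offline.
```

Switching providers means changing the model slug, nothing else; the request shape is the standard OpenAI one.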

Pros

  • 300+ models from 60+ providers behind one OpenAI-compatible endpoint
  • $1 free credit with full model access for evaluation; no credit card required
  • Pass-through provider pricing with zero markup on inference tokens
  • Auto-fallback between providers when one returns errors or rate limits
  • Founded 2023; the most-recognized LLM gateway name among indie developers

Cons

  • No caching, no guardrails, no prompt management; lighter than Portkey or Helicone
  • Small Stripe processing fee on credit purchases (~5%) is separate from inference cost
Pro $20/mo + Slack · Pay-as-you-go with ~5% Stripe fee on credit purchases · 300+ models, 60+ providers · Free $1 credit; cancel anytime

Best for: Indie developers and small teams building their first LLM apps who want one endpoint for 300+ models. Pro $20/mo credit minimum entry.

Routing 7 · Latency 8 · DX 10 · Value 9 · Support 7
#3

Vercel AI Gateway

7.8/10Save $708/yr

Best frontend-stack-native gateway, bundled with Vercel

Bundled with Vercel platform plans with no token markup; launched 2024 with AI SDK v5.

Plan | Monthly | Annual | What you get
Hobby | Free | n/a | Free with the Vercel Hobby plan with hundreds of models, no token markup, and built-in observability
Pro | $20.00/mo | $240.00/yr | $20 per user a month bundled with the Vercel Pro plan with load balancing, fallbacks, and budgets
Enterprise | $2,000.00/mo | $24,000.00/yr | Custom contract with private clusters, SAML SSO, audit logs, dedicated support, and SLA

Vercel AI Gateway is the frontend-stack-native pick, launched in 2024 by Vercel (founded 2015 in San Francisco). The wedge is genuinely unique: bundled with Vercel platform plans and tightly integrated with the AI SDK v5/v6 ecosystem; no token markup; hundreds of models behind a unified endpoint with BYOK support.

Hobby is free with the Vercel Hobby plan with hundreds of models, no markup, BYOK, and built-in observability. Pro at $20 per user a month is bundled with Vercel Pro and adds load balancing, fallbacks, budgets, and spend monitoring; the realistic developer paid entry. Enterprise covers private clusters, SAML SSO, audit logs, and SLA.

The trade-off is ecosystem lock-in. Vercel AI Gateway is excellent if your stack is already on Vercel and you use the AI SDK; the integration is the cleanest in the category and tokens cost the same as direct provider pricing. Outside the Vercel ecosystem, the gateway's value drops because the AI SDK integration is the load-bearing differentiator, so teams not on Vercel should pick OpenRouter or Portkey instead. Its composite score wins not on feature breadth but on zero markup and the bundled price.

Pros

  • Zero token markup; tokens cost the same as direct provider pricing
  • Tightly integrated with the Vercel AI SDK v5/v6 for the cleanest Next.js stack experience
  • Pro at $20/user/mo bundled with Vercel Pro covers load balancing, fallbacks, and budgets
  • BYOK supported: bring your own provider keys without surrendering control
  • Hundreds of models behind a unified endpoint with built-in spend monitoring

Cons

  • Ecosystem lock-in: AI SDK integration is the load-bearing differentiator outside Vercel platform
  • Less feature breadth than Portkey: no guardrails, narrower prompt management
Pro $20/user bundled · Hundreds of models, 0% markup · AI SDK v5/v6 native · Free with Vercel Hobby; cancel anytime

Best for: Next.js stack teams already on Vercel who want the cleanest AI SDK integration with no token markup. Pro $20/user/mo bundled with Vercel Pro.

Routing 8 · Latency 9 · DX 10 · Value 9 · Support 8
#4

LiteLLM

7.5/10Save $348/yr

Best open-source LLM gateway proxy, Apache 2 / MIT

Apache 2 / MIT-licensed Python proxy translating 100+ provider APIs to OpenAI Chat Completions; YC W23.

Plan | Monthly | Annual | What you get
Open Source | Free | n/a | Apache 2 / MIT-licensed self-hosted Python proxy that translates 100+ provider APIs to the OpenAI Chat Completions schema; free forever
Cloud Free | Free | n/a | Free hosted tier with limited request volume and standard provider integrations
Cloud Pro | $50.00/mo | $600.00/yr | $50 per user a month with cost tracking, budgets, team management, and virtual API keys
Enterprise | $2,000.00/mo | $24,000.00/yr | Custom contract with self-hosted enterprise deployment, SSO, SOC 2, and HIPAA

LiteLLM is the open-source-proxy pick, founded in 2023 by BerriAI Inc (Y Combinator W23) in San Francisco. The wedge is genuinely unique: an Apache 2 / MIT-licensed Python proxy that translates 100+ provider APIs to the OpenAI Chat Completions schema. Run it as a self-hosted service in front of your model providers and your application code targets the OpenAI SDK only.

Open Source is Apache 2 / MIT-licensed and free forever with self-hosted Python, 100+ provider integrations, and community support. Cloud Free is a hosted tier with limited request volume. Cloud Pro at $50 per user a month covers cost tracking, budgets, team management, and virtual API keys; the realistic team paid entry. Enterprise adds self-hosted enterprise deployment, SSO, SOC 2, and HIPAA.

The trade-off versus closed gateways: LiteLLM is the right tool when self-host control matters (regulated industries, on-prem deploys, code transparency) but the OSS Python proxy requires Python operational maturity. Many teams run LiteLLM behind their own application proxy and get the same routing benefits as Portkey or Helicone without the per-request cost. Cost-conscious teams at scale often migrate from OpenRouter to self-hosted LiteLLM.
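A minimal proxy config sketch in the shape LiteLLM's docs use (a `model_list` of `litellm_params` entries); the model names and keys here are illustrative, so check the current docs for exact syntax.

```yaml
# config.yaml -- map public model names to provider-prefixed LiteLLM models.
# Your app then calls the proxy with the plain OpenAI SDK and these names.
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY
```

Started with `litellm --config config.yaml`, the proxy listens locally (port 4000 by default in the docs we have seen) and accepts standard Chat Completions requests, so switching providers is a config edit rather than a code change.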

Pros

  • Apache 2 / MIT-licensed Python proxy; 100+ provider integrations free forever
  • Self-host on your own infrastructure for full code control and zero markup
  • Cloud Pro at $50/user covers cost tracking, budgets, and virtual API keys
  • Drop-in replacement at the application level: OpenAI SDK calls work unchanged
  • Y Combinator W23; the most-recognized OSS LLM proxy among Python developers

Cons

  • Self-hosted Python proxy requires Python operational maturity to run reliably
  • Cloud Pro $50/user/mo scales fast for larger teams; self-host is cheaper above 5 users
OSS free Apache 2 / MIT · Cloud Pro $50/user/mo · 100+ provider integrations · OSS free forever; cancel Cloud anytime

Best for: Python-stack teams who want an OSS proxy for full code control and zero markup. OSS free self-host or Cloud Pro $50/user/mo entry.

Routing 10 · Latency 8 · DX 7 · Value 10 · Support 7
#5

Helicone

6.6/10

Best observability-first LLM gateway, one-line proxy

One-line drop-in proxy with cost analytics and HQL query language; YC W23.

Plan | Monthly | Annual | What you get
Hobby | Free | n/a | 10K requests a month with 1 GB storage, single seat, and one organization
Pro | $79.00/mo | $948.00/yr | Unlimited seats, alerts, reports, and the HQL query language; the realistic SMB paid entry
Team | $799.00/mo | $9,588.00/yr | 5 organizations, SOC-2 + HIPAA compliance, and dedicated Slack support
Enterprise | $2,000.00/mo | $24,000.00/yr | Custom contract with custom MSA, SAML SSO, on-prem deployment, and bulk cloud discounts

Helicone is the observability-first pick, founded in 2023 (Y Combinator W23) in San Francisco. The wedge is genuinely unique: a one-line drop-in proxy that logs requests, tracks cost per session/user/property, and ships an HQL query language for analytics. Set the base URL once and every OpenAI SDK call gets logged with full request/response, latency, and token cost.

Hobby is free with 10K requests a month, 1 GB storage, and a single seat. Pro at $79 a month covers unlimited seats, alerts, reports, and HQL; the realistic SMB paid entry. Team at $799 a month covers 5 organizations, SOC-2, HIPAA compliance, and dedicated Slack support. Enterprise covers custom MSA, SAML SSO, on-prem deployment, and bulk discounts.

The trade-off versus full gateways: Helicone is observability-led with caching and fallbacks added later; the primary use case is logging and cost analytics, not routing. The Pro tier repriced from $20 to $79 a month in 2025 (a 295 percent increase). Compared to Langfuse Core $29 a month, Helicone Pro is more expensive but ships gateway proxying alongside the observability. Choose Helicone when you want observability AND gateway in one product.
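The "one line" is the base URL swap. A sketch with the proxy URL and auth header as Helicone's docs describe them; treat the exact values as assumptions to verify before relying on them.

```python
# Routing OpenAI traffic through Helicone changes only the host; the request
# body stays a normal Chat Completions payload.
OPENAI_DIRECT = "https://api.openai.com/v1/chat/completions"
VIA_HELICONE = "https://oai.helicone.ai/v1/chat/completions"

headers = {
    "Authorization": "Bearer OPENAI_KEY",    # unchanged: still your provider key
    "Helicone-Auth": "Bearer HELICONE_KEY",  # the one addition that enables logging
    "Content-Type": "application/json",
}
```

Because only the host and one header change, rollback is equally one line: point the base URL back at the provider.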

Pros

  • One-line drop-in proxy: set base URL once, every OpenAI SDK call gets logged
  • Cost analytics per session, user, and property with HQL query language
  • Pro at $79/mo covers unlimited seats with alerts and reports
  • Team at $799/mo unlocks 5 organizations, SOC-2, and HIPAA
  • Y Combinator W23; popular among teams that want observability and gateway combined

Cons

  • Pro repriced from $20 to $79/mo in 2025, a 295% increase; factor the pricing trajectory into long-term budgets
  • Observability-first not gateway-first: caching and fallbacks added later than Portkey
Pro $79/mo + unlimited seats · Team $799/mo + SOC-2 · One-line proxy + HQL · Hobby free 10K req/mo; cancel anytime

Best for: Teams that want observability and gateway in one product with cost tracking per session/user/property. Pro $79/mo entry.

Routing 9 · Latency 8 · DX 9 · Value 7 · Support 8
#6

Cloudflare AI Gateway

4.6/10$1,452/yr more

Best edge-native LLM gateway, Cloudflare Workers

Gateway bundled with Cloudflare Workers across 300+ data centers; launched 2023.

Plan | Monthly | Annual | What you get
Free | Free | n/a | 10K requests a day with caching, analytics, rate limits, and OpenAI/Anthropic/Workers AI provider support
Workers Paid | $5.00/mo | $60.00/yr | Higher request limits and edge-native gateway access
Workers Standard | $200.00/mo | $2,400.00/yr | 50M Workers requests included and unlimited AI Gateway calls
Enterprise | $2,000.00/mo | $24,000.00/yr | Custom contract with dedicated regions, private connect, SLA, and audit logs

Cloudflare AI Gateway is the edge-native pick, launched in 2023 by Cloudflare (founded 2009 in San Francisco). The wedge is genuinely unique: a gateway bundled with Cloudflare Workers across the 300+ data center edge network. Calls run from the closest Cloudflare PoP to the user, with cached responses served at the edge.

Free covers 10K requests a day with caching, analytics, rate limits, and OpenAI/Anthropic/Workers AI provider support. Workers Paid at $5 a month covers higher request limits and edge-native gateway access; the realistic developer paid entry. Workers Standard at $200 a month bundles 50M Workers requests with unlimited AI Gateway calls. Enterprise covers dedicated regions, private connect, SLA, and audit logs.
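Cloudflare routes calls through a per-account, per-gateway URL. The template below follows the pattern in Cloudflare's AI Gateway docs as we recall it, with placeholder IDs; verify the exact format before use.

```python
def gateway_url(account_id: str, gateway: str, provider: str) -> str:
    """Build the per-gateway endpoint URL (assumed pattern; IDs are placeholders)."""
    return f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway}/{provider}"

# Point the OpenAI SDK's base URL here instead of api.openai.com and the
# nearest edge PoP serves cached responses and records analytics.
url = gateway_url("ACCOUNT_ID", "my-gateway", "openai")
```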

The page score uses Workers Standard $200 because the tier name contains "Standard", which our pricing heuristic matches against common tier labels; this is a 3,900 percent overshoot from the realistic Workers Paid $5 entry, the largest single overshoot in this category and one of the largest in the system. The cons block acknowledges the gap. Cloudflare AI Gateway is the right call when you already run Cloudflare Workers and edge-native gateway latency matters; outside the Workers ecosystem, OpenRouter or Portkey ship more gateway features.

Pros

  • Edge-native: calls run from the closest Cloudflare PoP across 300+ data centers
  • Free 10K requests a day with caching, analytics, and rate limits
  • Workers Paid at $5/mo is the cheapest paid LLM gateway entry in the category
  • Tight integration with Cloudflare Workers for serverless edge LLM apps
  • Cloudflare brand and infrastructure: 99.99 percent uptime SLA on Enterprise

Cons

  • Page score uses Workers Standard $200, while realistic Workers Paid entry is $5 (3,900 percent gap)
  • Lighter feature set than Portkey or Helicone; gateway is bundled rather than featured
Workers Paid $5/mo · Free 10K req/day · Edge across 300+ PoPs · Cancel anytime

Best for: Teams already running Cloudflare Workers who want edge-native LLM gateway latency. Workers Paid $5/mo entry; free 10K requests a day.

Routing 8 · Latency 10 · DX 7 · Value 8 · Support 7
#7

Langfuse

4.2/10$1,440/yr more

Best open-source LLM observability with self-host

MIT-licensed OSS LLM observability with self-host option; YC W23 in Berlin.

Plan | Monthly | Annual | What you get
Hobby | Free | n/a | 50K observations a month with all platform features (limits apply), 30-day data retention, and 2 users
Core | $29.00/mo | $348.00/yr | 100K observations with unlimited users, 90-day retention, and in-app support
Pro | $199.00/mo | $2,388.00/yr | Unlimited history, 3-year data access, SOC 2, ISO 27001, HIPAA, and 20K req/min throughput
Enterprise | $2,499.00/mo | $29,988.00/yr | Audit logs, SCIM API, custom rate limits, dedicated support engineer, and uptime SLAs

Langfuse is the open-source-observability pick, founded in 2023 (Y Combinator W23) in Berlin, Germany. The wedge is genuinely unique: MIT-licensed OSS LLM observability with a self-host option; tracing, evals, prompt management, and analytics in a product you can run on your own infrastructure or use as a managed cloud.

Hobby is free with 50K observations a month, all platform features (with limits), 30-day data retention, and 2 users. Core at $29 a month covers 100K observations with unlimited users and 90-day retention; the realistic team paid entry. Pro at $199 a month adds 3-year retention, SOC 2, ISO 27001, HIPAA, and 20K req/min throughput. Enterprise at $2,499 a month covers audit logs, SCIM API, dedicated support, and SLAs.

The trade-off versus Helicone: Langfuse is observability-only (no gateway proxying) and ships as MIT OSS so you can self-host for unlimited usage. Helicone is observability-plus-gateway in one product. Choose Langfuse when self-host matters and gateway routing belongs to a separate layer (LiteLLM as proxy, Langfuse as observability). The Pro tier overshoots Core ($199 vs $29) by 586 percent because the heuristic matches the Pro tier name; the realistic team budget is Core $29 a month.

Pros

  • MIT-licensed OSS with self-host option for unlimited observation volume
  • Hobby free 50K observations a month with all features and 2 users
  • Core at $29/mo covers 100K observations with unlimited users and 90-day retention
  • Pro at $199/mo unlocks SOC 2, ISO 27001, HIPAA, and 3-year data retention
  • Berlin-based; the most-popular OSS LLM observability project (~6K GitHub stars)

Cons

  • Page score uses Pro $199, while realistic Core entry is $29 (586 percent gap)
  • Observability-only; gateway routing requires a separate proxy layer (LiteLLM, OpenRouter)
Core $29/mo + 100K obs · Pro $199 + SOC 2 · MIT OSS self-host · Hobby free 50K obs/mo; MIT OSS self-host free forever

Best for: Teams that want MIT-licensed observability they can self-host with unlimited observation volume. Hobby free 50K obs or Core $29/mo entry.

Routing 10 · Latency 8 · DX 8 · Value 9 · Support 7

How we picked

Each pick gets a transparent composite score from price, features, free-tier availability, and editor fit. Pricing flows from our live database, so when a vendor changes prices the score updates here too.

We weight price 40 percent, features 30, free tier 15, fit 15. The largest tier-name overshoot in the system fires here: Cloudflare Workers Standard $200 vs Workers Paid $5 (a 3,900 percent gap). Langfuse Pro $199 vs Core $29 is 586 percent. Realistic developer budget: $0 OSS to $79 Helicone Pro at SMB scale; $49 Portkey Production with overages or $799 Helicone Team at production scale.

We don't claim "30,000 hours of testing." Our methodology is the formula above plus the editor's published verdict for each pick. Verifiable, auditable, and updated when the underlying data changes.

Why trust Subrupt

We're a subscription tracker first, a buying guide second. Every claim on this page is something you can check.

By use case

Best open-source MIT-licensed proxy

LiteLLM

Read the full review →

Best edge-native gateway

Cloudflare AI Gateway

Read the full review →

Best observability-first gateway

Helicone

Read the full review →

Cheapest LLM gateway

Vercel AI Gateway

Read the full review →

Best overall LLM gateway

OpenRouter

Read the full review →

Didn't make the list

Cut because the open-source observability wedge overlaps Langfuse without a unique tile flag; great for teams that want simpler analytics at $20/mo for 30K events. (France, 2023.)

Cut because the prompt-eval MLOps platform positioning is a different category from gateway routing; great for teams committed to eval-first workflows at $500/mo plus usage. (US, 2023.)

Cut because the full MLOps platform pricing ($300/mo plus usage) is enterprise-tier; great for teams running on-prem Kubernetes who want one platform for serving plus gateway. (US, 2022.)

Cut because the eval-first positioning and $249/mo entry overlaps Vellum at lower price; great for AI teams whose primary workflow is dataset-driven evaluation. (US, 2023.)

How to choose your LLM gateway

Seven kinds of product compete for one head term

The 'best LLM gateways' search covers seven shapes for different jobs. OpenRouter Pro at $20/mo is the indie-dev brand reference with 300+ models from 60+ providers behind a single OpenAI-compatible endpoint and pass-through provider pricing (zero inference markup). Portkey Production at $49/mo is the production-gateway pick with caching, fallbacks, prompt management, guardrails, and observability across 40+ providers (full gateway open-sourced Apache 2.0 March 2026). LiteLLM Cloud Pro at $50/user is the Apache 2 / MIT-licensed Python proxy translating 100+ provider APIs. Helicone Pro at $79/mo is the observability-first one-line proxy with cost analytics. Vercel AI Gateway Pro at $20/user is the frontend-stack-native option bundled with Vercel. Cloudflare AI Gateway is edge-native with 10K free requests a day. Langfuse Core at $29/mo is the MIT-licensed open-source observability pick.

Why the page score sometimes shows the upgrade tier

Many LLM gateways use custom tier names (Hobby, Workers Paid, Workers Standard, Core, Pro) that our pricing heuristic resolves by matching against common tier labels (Pro, Premium, Standard). When the upgrade tier name matches and the entry tier name does not, the upgrade tier becomes the displayed typical price. Tier-name overshoots in this category include the largest single overshoot in the system: Cloudflare Workers Standard $200 vs Workers Paid $5 (a 3,900 percent gap). Langfuse Pro $199 vs Core $29 is a 586 percent gap. Portkey resolves cleanly because Production maps to $49 (the realistic SMB entry). The realistic developer budget is $0 OSS to $79/mo at the entry tier; production teams running >1M requests pay $49 with overages on Portkey Production or $799 on Helicone Team. The cons block on each pick acknowledges the gap.

Token billing models: pass-through, BYOK, and platform bundles

Most LLM gateways use one of three token billing models. Pass-through providers (OpenRouter, Vercel AI Gateway) charge no markup on inference; you pay the underlying provider rate and a small Stripe fee on credit purchases for OpenRouter. BYOK gateways (Portkey, Helicone, LiteLLM, Cloudflare AI Gateway) require you to bring provider API keys; the gateway charges only the platform subscription, so you negotiate token rates directly with OpenAI, Anthropic, or Google. Platform-bundled gateways (Vercel, Cloudflare) include the gateway in your existing platform plan; tokens still pass through at provider rates. Compute total cost: gateway subscription plus provider tokens. A team spending $5,000/mo on tokens pays Portkey $49 for the gateway plus $0 markup; spending $1,000/mo on OpenRouter pays $20 Pro plus $0 inference markup. Low-volume indie devs pay almost nothing; high-volume teams should compute breakeven before committing to a model.
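The total-cost arithmetic above can be sketched directly. The gateway fees come from this page; the ~5% Stripe fee is approximate and applies to credit top-ups, so treat the helper as an estimate, not a billing formula.

```python
def total_monthly_cost(gateway_fee: float, token_spend: float,
                       inference_markup: float = 0.0,
                       credit_card_fee: float = 0.0) -> float:
    """Gateway subscription + provider tokens + any markup or processing fee."""
    return gateway_fee + token_spend * (1 + inference_markup + credit_card_fee)

# The two examples from the paragraph: Portkey BYOK at $5,000/mo token spend,
# OpenRouter Pro at $1,000/mo with the ~5% Stripe fee on credit top-ups.
portkey = total_monthly_cost(49, 5000)                           # 5049.0
openrouter = total_monthly_cost(20, 1000, credit_card_fee=0.05)  # 1070.0
```

Run your own token-spend number through both billing models before committing; at high volume the processing fee on top-ups dwarfs the subscription.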

OSS proxy vs OSS observability: pick the right OSS layer

Two open-source picks here serve different layers. LiteLLM is an OSS proxy (Apache 2 / MIT) that translates 100+ provider APIs to the OpenAI Chat Completions schema; you put it in front of your model providers and your application code targets only OpenAI SDK. Langfuse is OSS observability (MIT) that traces, evaluates, and analyzes calls; you point your proxy or application at it and get logs and analytics. Both can self-host on your own infrastructure for unlimited usage. The right combination at scale is often LiteLLM as the proxy layer and Langfuse as the observability layer, both self-hosted, with zero per-request fees. Closed gateways (Portkey, Helicone, Vercel AI Gateway) bundle proxy and observability in one product but charge per-request or per-user. Choose the OSS combination when self-host control and cost-at-scale matter.

Production-gateway features: caching, fallbacks, guardrails, RBAC

At production scale the gateway features that matter most are caching (reduces token spend on repeat queries), fallbacks (survives provider outages), guardrails (enforces content policies), and RBAC plus SSO (controls team access). Portkey ships all four on Production $49/mo (with $9 overages up to 3M requests) and on the open-sourced Apache 2.0 self-host. Helicone Pro $79 ships caching and fallbacks but no native guardrails. OpenRouter ships fallbacks and basic rate limiting but no caching or guardrails. Vercel AI Gateway ships caching and fallbacks bundled with Vercel platform plans. Cloudflare ships caching and edge-native fallbacks. LiteLLM self-hosted gives you all features but you build the dashboards yourself. The decision: at SMB scale OpenRouter or Vercel AI Gateway is enough; at >1M requests/mo or in regulated industries (HIPAA, SOC 2), Portkey or Helicone Team becomes load-bearing.

When NOT to add an LLM gateway

LLM gateways are the right tool for some teams and the wrong tool for others. Skip a gateway when these patterns apply. First, you make 100 LLM calls a day total and one provider key against the OpenAI or Anthropic SDK directly covers everything; a gateway adds operational complexity without delivering value. Second, your LLM workload is a one-time batch job (data labeling, content generation campaign) and OpenAI Batch API or Anthropic Message Batches at 50 percent markdown is cheaper than gateway routing. Third, your stack is an existing AWS Bedrock or Azure OpenAI deployment with VPC peering and you do not need cross-provider routing. Fourth, your only model is a self-hosted Llama or Mistral on your own GPUs; gateway routing benefits assume cross-provider switching. Fifth, you are still in the model-selection phase and have not chosen a primary provider; pick a model first, then add a gateway when scale or feature breadth justifies it.

Frequently asked questions

Are these prices guaranteed not to change?

Vendor pricing changes regularly. Rates here are what each vendor advertised in May 2026. Helicone repriced Pro from $20 to $79/mo in 2025 (a 295 percent increase). Langfuse introduced Core at $29/mo in 2024 as the paid step up from the Hobby free tier's 50K-observation limit. Vercel AI Gateway graduated from beta to GA in 2024, bundled with Vercel platform plans. Portkey added guardrails and RBAC to Pro in 2025. LiteLLM crossed 100 providers in 2024. Verify the current rate on the vendor site before signing up.

Does Subrupt earn a commission from any of these picks?

We track which picks have approved affiliate programs in our database, and the FTC disclosure block at the top of every guide names which ones currently have a click-tracking partnership. Affiliate revenue does not change ranking. The composite math runs against the same weights for every pick regardless of partnership. Picks without an affiliate program appear in the lineup based on editorial fit only.

Why is Portkey ranked first when OpenRouter is the better-known name?

Portkey wins the raw composite at neutral fit because Production at $49 a month is cheap relative to the feature breadth (caching, fallbacks, prompt management, guardrails, observability across 40+ providers). OpenRouter at pick 2 remains the brand reference for indie-dev LLM gateways, with 300+ models from 60+ providers behind a single endpoint; Portkey edges it out on production feature breadth, not mainstream recognition.

What is the cheapest LLM gateway for a small team?

LiteLLM Open Source (Apache 2 / MIT) is genuinely free for self-hosted deployment; you pay infrastructure costs only. Cloudflare AI Gateway Free covers 10K requests a day at no cost. OpenRouter Free starts every account with $1 credit and full model access. Cheapest paid: Cloudflare Workers Paid $5/mo; Vercel AI Gateway Pro $20/user bundled; OpenRouter Pro $20/mo. Realistic 5-dev team budget: $0 (LiteLLM OSS self-host) to $100/mo (Vercel Pro x 5).

Why are LiteLLM, Helicone, and Langfuse all Y Combinator W23?

The LLM gateway category emerged in late 2022 and early 2023 as developers building on OpenAI hit the same problems: provider lock-in, no cost visibility, no caching, no fallbacks. Y Combinator funded several teams that winter (W23) at different layers: LiteLLM (OSS proxy), Helicone (observability-first proxy), Langfuse (OSS observability). OpenRouter and Portkey launched outside YC in the same window. The timing aligned with ChatGPT launch (Nov 2022) and the GPT-4 API release (March 2023).

Self-hosted LiteLLM vs Cloud LiteLLM: when does the OSS path pay off?

Self-hosted LiteLLM (Apache 2 / MIT) is free forever; Cloud LiteLLM Pro is $50 per user a month. For a 5-person team Cloud Pro costs $250/mo plus token spend; self-hosted costs only Python infrastructure (a $5-20 VPS for SMB scale). Self-host pays off when team size exceeds 3 people, when self-host control matters, or when token volume makes $250/mo gateway fee a meaningful share of LLM spend. Cloud Pro is cleaner operationally; self-host requires Python ops maturity.
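The breakeven in the answer above is plain arithmetic; the $20 VPS figure below is the upper end of the estimate given, and ops time is deliberately left out.

```python
def monthly_gateway_cost(team_size: int, self_hosted: bool,
                         per_user_fee: float = 50.0, vps_cost: float = 20.0) -> float:
    """Cloud Pro bills per user; self-host pays only infrastructure."""
    return vps_cost if self_hosted else per_user_fee * team_size

cloud = monthly_gateway_cost(5, self_hosted=False)  # 250.0
oss = monthly_gateway_cost(5, self_hosted=True)     # 20.0
savings = cloud - oss                               # 230.0 per month, before ops time
```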

Vercel AI Gateway vs OpenRouter on total cost: which costs less?

Both pass through token costs at provider rates with zero inference markup. Vercel AI Gateway requires Vercel Pro $20/user/mo. OpenRouter has no platform fee on Pay-as-you-go, $20/mo Pro for higher limits, and a small Stripe fee (~5 percent) on credit purchases. For solo devs the difference is rarely meaningful. For teams already on Vercel Pro the gateway is free incremental. Off Vercel, heavy credit top-ups make OpenRouter Stripe fees a line item; budget accordingly.

EU data residency: which picks support EU-only deployment?

Langfuse is Berlin-based with default EU residency on Cloud; self-host on EU infrastructure gives full control. Helicone Cloud is US-default; Enterprise covers EU on-prem. Portkey Cloud has multi-region with EU on Pro/Enterprise. LiteLLM self-hosted gives full data-residency control. OpenRouter routes via cheapest provider (US-default). Cloudflare and Vercel both run from EU PoPs. Strict EU buyers default to Langfuse with LiteLLM self-hosted.

Best LLM gateway for production at >1M requests a month?

Production-gateway picks dominate above 1M requests/mo. Portkey Production at $49/mo includes 100K logs with $9 overages up to 3M; cache hit reduces token spend 20-40 percent. The Apache 2.0 open-source Portkey gateway (March 2026) self-hosts the same routing for free. Helicone Team at $799/mo covers 5 orgs with SOC-2 and HIPAA. LiteLLM self-hosted scales linearly with infrastructure. Pick Portkey or Helicone for managed production; OSS Portkey or LiteLLM for cost-at-scale.
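The cache-savings claim above is easy to sanity-check with the 20-40 percent reduction band cited; the $5,000 monthly spend is an illustrative figure, not from any vendor.

```python
def effective_token_spend(monthly_spend: float, cache_reduction: float) -> float:
    """Cached responses cost no tokens; only the uncached share hits the provider."""
    return monthly_spend * (1 - cache_reduction)

low = effective_token_spend(5000, 0.20)   # 4000.0 at the low end of the band
high = effective_token_spend(5000, 0.40)  # 3000.0 at the high end
```

At that spend, even the low end of the band covers the $49 Production fee many times over, which is the sense in which caching "pays for itself."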

How often is this guide updated?

We re-review pricing annually at minimum, with mid-year refreshes when major vendor announcements happen. The Helicone Pro reprice (2025), the Langfuse Core launch (2024), Vercel AI Gateway GA (2024), the Portkey guardrails launch (2025), and LiteLLM's provider-count milestones (2024) each triggered same-week catalog updates. Verify current rates on the vendor site before signing up. The last-reviewed date reflects the most recent editorial pass.

Subrupt Editorial

The team behind subrupt.com. We track subscriptions, surface cheaper alternatives, and publish buying guides where the score formula is on the page so you can recompute it yourself. We do not claim 30,000 hours of testing. What we claim is live pricing from our database, a transparent composite score, and honest savings math against a category baseline.

Last reviewed


Affiliate disclosure: Subrupt earns a commission when you switch to a service through our recommendation links. This never changes the price you pay. We only recommend services where there's a real cost or feature advantage for you, and our picks are based on the data on this page, not on which programs pay the most.

Related buying guides

Track your subscriptions on Subrupt

Add the LLM Gateways you pay for and see how much you'd save by switching.

Open dashboard

More buying guides

Independent rankings for the subscriptions worth paying for.

See all guides