
Best LLM Gateways of 2026

Updated · 7 picks · live pricing · affiliate disclosure


BEST OVERALL · 8.1/10 · Save $360/yr

Portkey

Production-gateway with caching, prompt management, and guardrails across 40+ providers.

Developer Free 10K req/mo; cancel anytime

How it stacks up

  • Production $49/mo + 100K logs

    vs OpenRouter Pro $20

  • OSS Apache 2.0 March 2026

    vs LiteLLM Cloud Pro $50/user

  • Caching + guardrails + obs

    vs Helicone Pro $79

#2
OpenRouter · 7.9/10

From $20/mo

#3
Vercel AI Gateway · 7.8/10

From $20/mo


All picks at a glance

# | Pick | Best for | Starting | Score
1 | Portkey | Best production-gateway with caching, fallbacks, and guardrails | $49.00/mo | 8.1/10
2 | OpenRouter | Best overall LLM gateway, the indie-dev brand reference | $20.00/mo | 7.9/10
3 | Vercel AI Gateway | Best frontend-stack-native gateway, bundled with Vercel | $20.00/mo | 7.8/10
4 | LiteLLM | Best open-source LLM gateway proxy, Apache 2 / MIT | $50.00/mo | 7.5/10
5 | Helicone | Best observability-first LLM gateway, one-line proxy | $79.00/mo | 6.6/10
6 | Cloudflare AI Gateway | Best edge-native LLM gateway, Cloudflare Workers | $5.00/mo | 4.6/10
7 | Langfuse | Best open-source LLM observability with self-host | $29.00/mo | 4.2/10

Quick pick by use case

If you only have thirty seconds, find your situation below and skip to that pick.

Compare all 7 picks

Pick | Score | Monthly (Annual) | vs baseline | Top spec
#1 Portkey | 8.1/10 | $49.00/mo ($588.00/yr) | Save $360/yr | Production $49/mo + 100K logs
#2 OpenRouter | 7.9/10 | $20.00/mo | Save $708/yr | Pro $20/mo + Slack
#3 Vercel AI Gateway | 7.8/10 | $20.00/mo ($240.00/yr) | Save $708/yr | Pro $20/user bundled
#4 LiteLLM | 7.5/10 | $50.00/mo ($600.00/yr) | Save $348/yr | OSS free Apache 2 / MIT
#5 Helicone | 6.6/10 | $79.00/mo ($948.00/yr) | n/a | Pro $79/mo + unlimited seats
#6 Cloudflare AI Gateway | 4.6/10 | $200.00/mo ($2,400.00/yr) | $1,452/yr more | Workers Paid $5/mo
#7 Langfuse | 4.2/10 | $199.00/mo ($2,388.00/yr) | $1,440/yr more | Core $29/mo + 100K obs
#1

Portkey

8.1/10Save $360/yr

Best production-gateway with caching, fallbacks, and guardrails

Production-gateway with caching, prompt management, and guardrails across 40+ providers.

Plan | Monthly | Annual | What you get
Open Source | Free | n/a | Apache 2.0 self-hosted gateway with Universal API, retries, routing, guardrails, automatic fallbacks, basic dashboard, and load balancing; free forever (open-sourced March 2026)
Developer Free | Free | n/a | Free hosted tier with 10K recorded logs a month, Universal API, key management, 3 prompt templates, and basic observability
Production | $49.00/mo | $588.00/yr | 100K logs with unlimited prompt templates, alerts, LLM guardrails, semantic caching, RBAC, and overages at $9 per additional 100K (up to 3M)
Enterprise | $2,000.00/mo | $24,000.00/yr | Custom contract with 10M-plus logs, custom retention, advanced guardrails, SSO, VPC hosting, SOC 2 Type II, GDPR, and HIPAA

Portkey is the production-gateway pick, founded in 2023 in San Francisco. The wedge is full feature breadth: caching, fallbacks, prompt management, guardrails, RBAC, SSO, and observability across 40+ providers. Portkey open-sourced its full gateway under Apache 2.0 on March 24, 2026.

Open Source Apache 2.0 is free forever with Universal API, retries, routing, guardrails, fallbacks, and load balancing for self-host. Developer Free hosted covers 10K logs a month with key management and basic observability. Production at $49 a month covers 100K logs with unlimited prompt templates, semantic caching, LLM guardrails, and RBAC; overages bill at $9 per 100K up to 3M (the realistic SMB paid entry). Enterprise covers 10M-plus logs, VPC hosting, SSO, SOC 2 Type II, GDPR, and HIPAA.

The trade-off versus OpenRouter: Portkey ships the gateway features that matter at production scale (cache reduces token spend; fallbacks survive provider outages; guardrails enforce content policies) but indie-dev brand recognition is narrower. OpenRouter wins the head-term reader by name; Portkey wins the production-team reader by feature breadth plus the open-source self-host option.
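To make the fallback mechanics concrete, here is a sketch of a fallback routing config in the shape Portkey's gateway documentation describes. The field names and header names are recalled from the public docs, so treat them as assumptions to verify against the current reference.

```python
import json

# Hypothetical fallback config: try OpenAI first, fall back to Anthropic on
# errors or rate limits. Shape follows Portkey's documented gateway config.
fallback_config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {"provider": "openai", "api_key": "OPENAI_KEY"},
        {"provider": "anthropic", "api_key": "ANTHROPIC_KEY"},
    ],
}

# The config rides along as request headers on an otherwise normal
# OpenAI-style call through the gateway.
headers = {
    "x-portkey-api-key": "PORTKEY_KEY",
    "x-portkey-config": json.dumps(fallback_config),
}
```

The point of the shape: routing policy lives in configuration, not application code, so swapping the fallback order never touches your call sites.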

Pros

  • Open-sourced Apache 2.0 March 2026; full self-host gateway free forever
  • Caching, fallbacks, prompt management, guardrails, and observability on one dashboard
  • Production at $49/mo covers 100K logs with $9 overages up to 3M requests
  • Cloud handles 1T-plus tokens a day across 24,000-plus organizations as of March 2026
  • Enterprise covers VPC hosting with SOC 2 Type II, GDPR, and HIPAA

Cons

  • Indie-dev brand recognition narrower than OpenRouter; head-term reader skews mainstream-name
  • Self-host requires you to run and monitor the gateway; managed cloud is cleaner operationally
Production $49/mo + 100K logs · OSS Apache 2.0 March 2026 · Caching + guardrails + obs · Developer Free 10K req/mo; cancel anytime

Best for: Production teams that need caching, guardrails, and observability on one dashboard. Production $49/mo entry; OSS self-host free forever.

Routing 9 · Latency 9 · DX 8 · Value 8 · Support 8
#2

OpenRouter

7.9/10Save $708/yr

Best overall LLM gateway, the indie-dev brand reference

300+ models from 60+ providers behind a single OpenAI-compatible endpoint; founded 2023.

Plan | Monthly | What you get
Free | Free | $1 starter credit with access to 300+ models from 60+ providers behind a unified OpenAI-compatible API
Pay-as-you-go | Free | Pass-through provider pricing with zero markup on inference; a small Stripe fee applies on credit purchases
Pro | $20.00/mo | $20 a month credit minimum with higher rate limits, Slack support, and the indie-dev paid entry
Enterprise | $1,000.00/mo | Custom contract with dedicated routing, private deploys, SLA, and audit logs

OpenRouter is the API aggregator and the indie-dev brand reference for LLM gateways, founded in 2023 in San Francisco. The wedge is genuinely unique: 300+ models from 60+ providers behind a single OpenAI-compatible endpoint with pay-as-you-go credit-balance pricing and pass-through provider rates (zero markup on inference tokens).

Free gives every account a $1 starter credit with full access to all models for evaluation. Pay-as-you-go bills consumed tokens at the underlying provider price with no inference markup; a small Stripe processing fee applies on credit purchases. Pro at $20 a month is a credit minimum that unlocks higher rate limits and Slack support; the realistic indie-dev paid entry. Enterprise at $1,000 a month covers dedicated routing, private deploys, an SLA, and audit logs.

The trade-off is feature breadth, not pricing. OpenRouter ships fallbacks and basic rate limiting but not caching, guardrails, or prompt management; for high-volume production workloads the same workflow on Portkey Production at $49 a month adds caching that often pays for itself in token savings. OpenRouter is the right call when model variety matters more than gateway feature breadth and when you do not want to manage your own provider keys.
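The single-endpoint claim is easy to see in code. A stdlib-only sketch that builds (but does not send) a Chat Completions request against OpenRouter's unified endpoint; the API key and model slug are placeholders.

```python
import json
import urllib.request

def build_openrouter_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style Chat Completions request against OpenRouter."""
    payload = {
        "model": model,  # provider-prefixed slug, e.g. "openai/gpt-4o"
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_openrouter_request("sk-or-...", "openai/gpt-4o", "Hello")
# urllib.request.urlopen(req) would send it; omitted so the sketch stays offline.
```

Switching providers means changing the model slug, nothing else; the request shape is the standard OpenAI one.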

Pros

  • 300+ models from 60+ providers behind one OpenAI-compatible endpoint
  • $1 free credit with full model access for evaluation; no credit card required
  • Pass-through provider pricing with zero markup on inference tokens
  • Auto-fallback between providers when one returns errors or rate limits
  • Founded 2023; the most-recognized LLM gateway name among indie developers

Cons

  • No caching, no guardrails, no prompt management; lighter than Portkey or Helicone
  • Small Stripe processing fee on credit purchases (~5%) is separate from inference cost
Pro $20/mo + Slack · Pay-as-you-go with ~5% Stripe fee on credit purchases · 300+ models, 60+ providers · Free $1 credit; cancel anytime

Best for: Indie developers and small teams building their first LLM apps who want one endpoint for 300+ models. Pro $20/mo credit minimum entry.

Routing 7 · Latency 8 · DX 10 · Value 9 · Support 7
#3

Vercel AI Gateway

7.8/10Save $708/yr

Best frontend-stack-native gateway, bundled with Vercel

Bundled with Vercel platform plans with no token markup; launched 2024 with AI SDK v5.

Plan | Monthly | Annual | What you get
Hobby | Free | n/a | Free with the Vercel Hobby plan with hundreds of models, no token markup, and built-in observability
Pro | $20.00/mo | $240.00/yr | $20 per user a month bundled with the Vercel Pro plan with load balancing, fallbacks, and budgets
Enterprise | $2,000.00/mo | $24,000.00/yr | Custom contract with private clusters, SAML SSO, audit logs, dedicated support, and SLA

Vercel AI Gateway is the frontend-stack-native pick, launched in 2024 by Vercel (founded 2015 in San Francisco). The wedge is genuinely unique: bundled with Vercel platform plans and tightly integrated with the AI SDK v5/v6 ecosystem; no token markup; hundreds of models behind a unified endpoint with BYOK support.

Hobby is free with the Vercel Hobby plan with hundreds of models, no markup, BYOK, and built-in observability. Pro at $20 per user a month is bundled with Vercel Pro and adds load balancing, fallbacks, budgets, and spend monitoring; the realistic developer paid entry. Enterprise covers private clusters, SAML SSO, audit logs, and SLA.

The trade-off is ecosystem lock-in. Vercel AI Gateway is excellent if your stack is already on Vercel and you use the AI SDK; the integration is the cleanest in the category and tokens cost the same as direct provider pricing. Outside the Vercel ecosystem, the gateway's value drops because the AI SDK integration is the load-bearing differentiator, so teams not on Vercel should pick OpenRouter or Portkey instead. Its composite score wins not on feature breadth but on zero markup and the bundled price.

Pros

  • Zero token markup; tokens cost the same as direct provider pricing
  • Tightly integrated with the Vercel AI SDK v5/v6 for the cleanest Next.js stack experience
  • Pro at $20/user/mo bundled with Vercel Pro covers load balancing, fallbacks, and budgets
  • BYOK supported: bring your own provider keys without surrendering control
  • Hundreds of models behind a unified endpoint with built-in spend monitoring

Cons

  • Ecosystem lock-in: AI SDK integration is the load-bearing differentiator outside Vercel platform
  • Less feature breadth than Portkey: no guardrails, narrower prompt management
Pro $20/user bundled · Hundreds of models, 0% markup · AI SDK v5/v6 native · Free with Vercel Hobby; cancel anytime

Best for: Next.js stack teams already on Vercel who want the cleanest AI SDK integration with no token markup. Pro $20/user/mo bundled with Vercel Pro.

Routing 8 · Latency 9 · DX 10 · Value 9 · Support 8
#4

LiteLLM

7.5/10Save $348/yr

Best open-source LLM gateway proxy, Apache 2 / MIT

Apache 2 / MIT-licensed Python proxy translating 100+ provider APIs to OpenAI Chat Completions; YC W23.

Plan | Monthly | Annual | What you get
Open Source | Free | n/a | Apache 2 / MIT-licensed self-hosted Python proxy that translates 100+ provider APIs to the OpenAI Chat Completions schema; free forever
Cloud Free | Free | n/a | Free hosted tier with limited request volume and standard provider integrations
Cloud Pro | $50.00/mo | $600.00/yr | $50 per user a month with cost tracking, budgets, team management, and virtual API keys
Enterprise | $2,000.00/mo | $24,000.00/yr | Custom contract with self-hosted enterprise deployment, SSO, SOC 2, and HIPAA

LiteLLM is the open-source-proxy pick, founded in 2023 by BerriAI Inc (Y Combinator W23) in San Francisco. The wedge is genuinely unique: an Apache 2 / MIT-licensed Python proxy that translates 100+ provider APIs to the OpenAI Chat Completions schema. Run it as a self-hosted service in front of your model providers and your application code targets the OpenAI SDK only.

Open Source is Apache 2 / MIT-licensed and free forever with self-hosted Python, 100+ provider integrations, and community support. Cloud Free is a hosted tier with limited request volume. Cloud Pro at $50 per user a month covers cost tracking, budgets, team management, and virtual API keys; the realistic team paid entry. Enterprise adds self-hosted enterprise deployment, SSO, SOC 2, and HIPAA.

The trade-off versus closed gateways: LiteLLM is the right tool when self-host control matters (regulated industries, on-prem deploys, code transparency) but the OSS Python proxy requires Python operational maturity. Many teams run LiteLLM behind their own application proxy and get the same routing benefits as Portkey or Helicone without the per-request cost. Cost-conscious teams at scale often migrate from OpenRouter to self-hosted LiteLLM.
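A minimal proxy config sketch in the shape LiteLLM's docs use (a `model_list` of `litellm_params` entries); the model names and keys here are illustrative, so check the current docs for exact syntax.

```yaml
# config.yaml -- map public model names to provider-prefixed LiteLLM models.
# Your app then calls the proxy with the plain OpenAI SDK and these names.
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY
```

Started with `litellm --config config.yaml`, the proxy listens locally (port 4000 by default in the docs we have seen) and accepts standard Chat Completions requests, so switching providers is a config edit rather than a code change.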

Pros

  • Apache 2 / MIT-licensed Python proxy; 100+ provider integrations free forever
  • Self-host on your own infrastructure for full code control and zero markup
  • Cloud Pro at $50/user covers cost tracking, budgets, and virtual API keys
  • Drop-in replacement at the application level: OpenAI SDK calls work unchanged
  • Y Combinator W23; the most-recognized OSS LLM proxy among Python developers

Cons

  • Self-hosted Python proxy requires Python operational maturity to run reliably
  • Cloud Pro $50/user/mo scales fast for larger teams; self-host is cheaper above 5 users
OSS free Apache 2 / MIT · Cloud Pro $50/user/mo · 100+ provider integrations · OSS free forever; cancel Cloud anytime

Best for: Python-stack teams who want an OSS proxy for full code control and zero markup. OSS free self-host or Cloud Pro $50/user/mo entry.

Routing 10 · Latency 8 · DX 7 · Value 10 · Support 7
#5

Helicone

6.6/10

Best observability-first LLM gateway, one-line proxy

One-line drop-in proxy with cost analytics and HQL query language; YC W23.

Plan | Monthly | Annual | What you get
Hobby | Free | n/a | 10K requests a month with 1 GB storage, single seat, and one organization
Pro | $79.00/mo | $948.00/yr | Unlimited seats, alerts, reports, and the HQL query language; the realistic SMB paid entry
Team | $799.00/mo | $9,588.00/yr | 5 organizations, SOC-2 + HIPAA compliance, and dedicated Slack support
Enterprise | $2,000.00/mo | $24,000.00/yr | Custom contract with custom MSA, SAML SSO, on-prem deployment, and bulk cloud discounts

Helicone is the observability-first pick, founded in 2023 (Y Combinator W23) in San Francisco. The wedge is genuinely unique: a one-line drop-in proxy that logs requests, tracks cost per session/user/property, and ships an HQL query language for analytics. Set the base URL once and every OpenAI SDK call gets logged with full request/response, latency, and token cost.

Hobby is free with 10K requests a month, 1 GB storage, and a single seat. Pro at $79 a month covers unlimited seats, alerts, reports, and HQL; the realistic SMB paid entry. Team at $799 a month covers 5 organizations, SOC-2, HIPAA compliance, and dedicated Slack support. Enterprise covers custom MSA, SAML SSO, on-prem deployment, and bulk discounts.

The trade-off versus full gateways: Helicone is observability-led with caching and fallbacks added later; the primary use case is logging and cost analytics, not routing. The Pro tier repriced from $20 to $79 a month in 2025 (a 295 percent increase). Compared to Langfuse Core $29 a month, Helicone Pro is more expensive but ships gateway proxying alongside the observability. Choose Helicone when you want observability AND gateway in one product.
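The "one line" is the base URL swap. A sketch with the proxy URL and auth header as Helicone's docs describe them; treat the exact values as assumptions to verify before relying on them.

```python
# Routing OpenAI traffic through Helicone changes only the host; the request
# body stays a normal Chat Completions payload.
OPENAI_DIRECT = "https://api.openai.com/v1/chat/completions"
VIA_HELICONE = "https://oai.helicone.ai/v1/chat/completions"

headers = {
    "Authorization": "Bearer OPENAI_KEY",    # unchanged: still your provider key
    "Helicone-Auth": "Bearer HELICONE_KEY",  # the one addition that enables logging
    "Content-Type": "application/json",
}
```

Because only the host and one header change, rollback is equally one line: point the base URL back at the provider.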

Pros

  • One-line drop-in proxy: set base URL once, every OpenAI SDK call gets logged
  • Cost analytics per session, user, and property with HQL query language
  • Pro at $79/mo covers unlimited seats with alerts and reports
  • Team at $799/mo unlocks 5 organizations, SOC-2, and HIPAA
  • Y Combinator W23; popular among teams that want observability and gateway combined

Cons

  • Pro repriced from $20 to $79/mo in 2025, a 295% increase; factor the pricing trajectory into long-term budgets
  • Observability-first not gateway-first: caching and fallbacks added later than Portkey
Pro $79/mo + unlimited seats · Team $799/mo + SOC-2 · One-line proxy + HQL · Hobby free 10K req/mo; cancel anytime

Best for: Teams that want observability and gateway in one product with cost tracking per session/user/property. Pro $79/mo entry.

Routing 9 · Latency 8 · DX 9 · Value 7 · Support 8
#6

Cloudflare AI Gateway

4.6/10$1,452/yr more

Best edge-native LLM gateway, Cloudflare Workers

Gateway bundled with Cloudflare Workers across 300+ data centers; launched 2023.

Plan | Monthly | Annual | What you get
Free | Free | n/a | 10K requests a day with caching, analytics, rate limits, and OpenAI/Anthropic/Workers AI provider support
Workers Paid | $5.00/mo | $60.00/yr | Higher request limits and edge-native gateway access
Workers Standard | $200.00/mo | $2,400.00/yr | 50M Workers requests included and unlimited AI Gateway calls
Enterprise | $2,000.00/mo | $24,000.00/yr | Custom contract with dedicated regions, private connect, SLA, and audit logs

Cloudflare AI Gateway is the edge-native pick, launched in 2023 by Cloudflare (founded 2009 in San Francisco). The wedge is genuinely unique: a gateway bundled with Cloudflare Workers across the 300+ data center edge network. Calls run from the closest Cloudflare PoP to the user, with cached responses served at the edge.

Free covers 10K requests a day with caching, analytics, rate limits, and OpenAI/Anthropic/Workers AI provider support. Workers Paid at $5 a month covers higher request limits and edge-native gateway access; the realistic developer paid entry. Workers Standard at $200 a month bundles 50M Workers requests with unlimited AI Gateway calls. Enterprise covers dedicated regions, private connect, SLA, and audit logs.
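Cloudflare routes calls through a per-account, per-gateway URL. The template below follows the pattern in Cloudflare's AI Gateway docs as we recall it, with placeholder IDs; verify the exact format before use.

```python
def gateway_url(account_id: str, gateway: str, provider: str) -> str:
    """Build the per-gateway endpoint URL (assumed pattern; IDs are placeholders)."""
    return f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway}/{provider}"

# Point the OpenAI SDK's base URL here instead of api.openai.com and the
# nearest edge PoP serves cached responses and records analytics.
url = gateway_url("ACCOUNT_ID", "my-gateway", "openai")
```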

The page score uses Workers Standard $200 because the tier name contains "Standard", which our pricing heuristic matches against common tier labels; this is a 3,900 percent overshoot from the realistic Workers Paid $5 entry, the largest single overshoot in this category and one of the largest in the system. The cons block acknowledges the gap. Cloudflare AI Gateway is the right call when you already run Cloudflare Workers and edge-native gateway latency matters; outside the Workers ecosystem, OpenRouter or Portkey ship more gateway features.

Pros

  • Edge-native: calls run from the closest Cloudflare PoP across 300+ data centers
  • Free 10K requests a day with caching, analytics, and rate limits
  • Workers Paid at $5/mo is the cheapest paid LLM gateway entry in the category
  • Tight integration with Cloudflare Workers for serverless edge LLM apps
  • Cloudflare brand and infrastructure: 99.99 percent uptime SLA on Enterprise

Cons

  • Page score uses Workers Standard $200, while realistic Workers Paid entry is $5 (3,900 percent gap)
  • Lighter feature set than Portkey or Helicone; gateway is bundled rather than featured
Workers Paid $5/mo · Free 10K req/day · Edge across 300+ PoPs · Cancel anytime

Best for: Teams already running Cloudflare Workers who want edge-native LLM gateway latency. Workers Paid $5/mo entry; free 10K requests a day.

Routing 8 · Latency 10 · DX 7 · Value 8 · Support 7
#7

Langfuse

4.2/10$1,440/yr more

Best open-source LLM observability with self-host

MIT-licensed OSS LLM observability with self-host option; YC W23 in Berlin.

Plan | Monthly | Annual | What you get
Hobby | Free | n/a | 50K observations a month with all platform features (limits apply), 30-day data retention, and 2 users
Core | $29.00/mo | $348.00/yr | 100K observations with unlimited users, 90-day retention, and in-app support
Pro | $199.00/mo | $2,388.00/yr | Unlimited history, 3-year data access, SOC 2, ISO 27001, HIPAA, and 20K req/min throughput
Enterprise | $2,499.00/mo | $29,988.00/yr | Audit logs, SCIM API, custom rate limits, dedicated support engineer, and uptime SLAs

Langfuse is the open-source-observability pick, founded in 2023 (Y Combinator W23) in Berlin, Germany. The wedge is genuinely unique: MIT-licensed OSS LLM observability with a self-host option; tracing, evals, prompt management, and analytics in a product you can run on your own infrastructure or use as a managed cloud.

Hobby is free with 50K observations a month, all platform features (with limits), 30-day data retention, and 2 users. Core at $29 a month covers 100K observations with unlimited users and 90-day retention; the realistic team paid entry. Pro at $199 a month adds 3-year retention, SOC 2, ISO 27001, HIPAA, and 20K req/min throughput. Enterprise at $2,499 a month covers audit logs, SCIM API, dedicated support, and SLAs.

The trade-off versus Helicone: Langfuse is observability-only (no gateway proxying) and ships as MIT OSS so you can self-host for unlimited usage. Helicone is observability-plus-gateway in one product. Choose Langfuse when self-host matters and gateway routing belongs to a separate layer (LiteLLM as proxy, Langfuse as observability). The Pro tier overshoots Core ($199 vs $29) by 586 percent because the heuristic matches the Pro tier name; the realistic team budget is Core $29 a month.

Pros

  • MIT-licensed OSS with self-host option for unlimited observation volume
  • Hobby free 50K observations a month with all features and 2 users
  • Core at $29/mo covers 100K observations with unlimited users and 90-day retention
  • Pro at $199/mo unlocks SOC 2, ISO 27001, HIPAA, and 3-year data retention
  • Berlin-based; the most-popular OSS LLM observability project (~6K GitHub stars)

Cons

  • Page score uses Pro $199, while realistic Core entry is $29 (586 percent gap)
  • Observability-only; gateway routing requires a separate proxy layer (LiteLLM, OpenRouter)
Core $29/mo + 100K obs · Pro $199 + SOC 2 · MIT OSS self-host · Hobby free 50K obs/mo; MIT OSS self-host free forever

Best for: Teams that want MIT-licensed observability they can self-host with unlimited observation volume. Hobby free 50K obs or Core $29/mo entry.

Routing 10 · Latency 8 · DX 8 · Value 9 · Support 7

How we picked

Each pick gets a transparent composite score from price, features, free-tier availability, and editor fit. Pricing flows from our live database, so when a vendor changes prices the score updates here too.

We weight price 40 percent, features 30, free tier 15, fit 15. The largest tier-name overshoot in the system fires here: Cloudflare Workers Standard $200 vs Workers Paid $5 (a 3,900 percent gap). Langfuse Pro $199 vs Core $29 is 586 percent. Realistic developer budget: $0 OSS to $79 Helicone Pro at SMB scale; $49 Portkey Production with overages or $799 Helicone Team at production scale.

We don't claim "30,000 hours of testing." Our methodology is the formula above plus the editor's published verdict for each pick. Verifiable, auditable, and updated when the underlying data changes.

Why trust Subrupt

We're a subscription tracker first, a buying guide second. Every claim on this page is something you can check.

By use case

Best open-source MIT-licensed proxy

LiteLLM

Read the full review →

Best edge-native gateway

Cloudflare AI Gateway

Read the full review →

Best observability-first gateway

Helicone

Read the full review →

Cheapest LLM gateway

Vercel AI Gateway

Read the full review →

Best overall LLM gateway

OpenRouter

Read the full review →

Didn't make the list

Cut because the open-source observability wedge overlaps Langfuse without a unique tile flag; great for teams that want simpler analytics at $20/mo for 30K events. (France, 2023.)

Cut because the prompt-eval MLOps platform positioning is a different category from gateway routing; great for teams committed to eval-first workflows at $500/mo plus usage. (US, 2023.)

Cut because the full MLOps platform pricing ($300/mo plus usage) is enterprise-tier; great for teams running on-prem Kubernetes who want one platform for serving plus gateway. (US, 2022.)

Cut because the eval-first positioning and $249/mo entry overlaps Vellum at lower price; great for AI teams whose primary workflow is dataset-driven evaluation. (US, 2023.)

How to choose your LLM gateway

Seven kinds of product compete for one head term

The 'best LLM gateways' search covers seven shapes for different jobs. OpenRouter Pro at $20/mo is the indie-dev brand reference with 300+ models from 60+ providers behind a single OpenAI-compatible endpoint and pass-through provider pricing (zero inference markup). Portkey Production at $49/mo is the production-gateway pick with caching, fallbacks, prompt management, guardrails, and observability across 40+ providers (full gateway open-sourced Apache 2.0 March 2026). LiteLLM Cloud Pro at $50/user is the Apache 2 / MIT-licensed Python proxy translating 100+ provider APIs. Helicone Pro at $79/mo is the observability-first one-line proxy with cost analytics. Vercel AI Gateway Pro at $20/user is the frontend-stack-native option bundled with Vercel. Cloudflare AI Gateway is edge-native with 10K free requests a day. Langfuse Core at $29/mo is the MIT-licensed open-source observability pick.

Why the page score sometimes shows the upgrade tier

Many LLM gateways use custom tier names (Hobby, Workers Paid, Workers Standard, Core, Pro) that our pricing heuristic resolves by matching against common tier labels (Pro, Premium, Standard). When the upgrade tier name matches and the entry tier name does not, the upgrade tier becomes the displayed typical price. Tier-name overshoots in this category include the largest single overshoot in the system: Cloudflare Workers Standard $200 vs Workers Paid $5 (a 3,900 percent gap). Langfuse Pro $199 vs Core $29 is a 586 percent gap. Portkey resolves cleanly because Production maps to $49 (the realistic SMB entry). The realistic developer budget is $0 OSS to $79/mo at the entry tier; production teams running >1M requests pay $49 with overages on Portkey Production or $799 on Helicone Team. The cons block on each pick acknowledges the gap.

Token billing models: pass-through, BYOK, and platform bundles

Most LLM gateways use one of three token billing models. Pass-through providers (OpenRouter, Vercel AI Gateway) charge no markup on inference; you pay the underlying provider rate and a small Stripe fee on credit purchases for OpenRouter. BYOK gateways (Portkey, Helicone, LiteLLM, Cloudflare AI Gateway) require you to bring provider API keys; the gateway charges only the platform subscription, so you negotiate token rates directly with OpenAI, Anthropic, or Google. Platform-bundled gateways (Vercel, Cloudflare) include the gateway in your existing platform plan; tokens still pass through at provider rates. Compute total cost: gateway subscription plus provider tokens. A team spending $5,000/mo on tokens pays Portkey $49 for the gateway plus $0 markup; spending $1,000/mo on OpenRouter pays $20 Pro plus $0 inference markup. Low-volume indie devs pay almost nothing; high-volume teams should compute breakeven before committing to a model.
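The total-cost arithmetic above can be sketched directly. The gateway fees come from this page; the ~5% Stripe fee is approximate and applies to credit top-ups, so treat the helper as an estimate, not a billing formula.

```python
def total_monthly_cost(gateway_fee: float, token_spend: float,
                       inference_markup: float = 0.0,
                       credit_card_fee: float = 0.0) -> float:
    """Gateway subscription + provider tokens + any markup or processing fee."""
    return gateway_fee + token_spend * (1 + inference_markup + credit_card_fee)

# The two examples from the paragraph: Portkey BYOK at $5,000/mo token spend,
# OpenRouter Pro at $1,000/mo with the ~5% Stripe fee on credit top-ups.
portkey = total_monthly_cost(49, 5000)                           # 5049.0
openrouter = total_monthly_cost(20, 1000, credit_card_fee=0.05)  # 1070.0
```

Run your own token-spend number through both billing models before committing; at high volume the processing fee on top-ups dwarfs the subscription.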

OSS proxy vs OSS observability: pick the right OSS layer

Two open-source picks here serve different layers. LiteLLM is an OSS proxy (Apache 2 / MIT) that translates 100+ provider APIs to the OpenAI Chat Completions schema; you put it in front of your model providers and your application code targets only OpenAI SDK. Langfuse is OSS observability (MIT) that traces, evaluates, and analyzes calls; you point your proxy or application at it and get logs and analytics. Both can self-host on your own infrastructure for unlimited usage. The right combination at scale is often LiteLLM as the proxy layer and Langfuse as the observability layer, both self-hosted, with zero per-request fees. Closed gateways (Portkey, Helicone, Vercel AI Gateway) bundle proxy and observability in one product but charge per-request or per-user. Choose the OSS combination when self-host control and cost-at-scale matter.

Production-gateway features: caching, fallbacks, guardrails, RBAC

At production scale the gateway features that matter most are caching (reduces token spend on repeat queries), fallbacks (survives provider outages), guardrails (enforces content policies), and RBAC plus SSO (controls team access). Portkey ships all four on Production $49/mo (with $9 overages up to 3M requests) and on the open-sourced Apache 2.0 self-host. Helicone Pro $79 ships caching and fallbacks but no native guardrails. OpenRouter ships fallbacks and basic rate limiting but no caching or guardrails. Vercel AI Gateway ships caching and fallbacks bundled with Vercel platform plans. Cloudflare ships caching and edge-native fallbacks. LiteLLM self-hosted gives you all features but you build the dashboards yourself. The decision: at SMB scale OpenRouter or Vercel AI Gateway is enough; at >1M requests/mo or in regulated industries (HIPAA, SOC 2), Portkey or Helicone Team becomes load-bearing.

When NOT to add an LLM gateway

LLM gateways are the right tool for some teams and the wrong tool for others. Skip a gateway when these patterns apply. First, you make 100 LLM calls a day total and one provider key against the OpenAI or Anthropic SDK directly covers everything; a gateway adds operational complexity without delivering value. Second, your LLM workload is a one-time batch job (data labeling, content generation campaign) and OpenAI Batch API or Anthropic Message Batches at 50 percent markdown is cheaper than gateway routing. Third, your stack is an existing AWS Bedrock or Azure OpenAI deployment with VPC peering and you do not need cross-provider routing. Fourth, your only model is a self-hosted Llama or Mistral on your own GPUs; gateway routing benefits assume cross-provider switching. Fifth, you are still in the model-selection phase and have not chosen a primary provider; pick a model first, then add a gateway when scale or feature breadth justifies it.

Frequently asked questions

Are these prices guaranteed not to change?

Vendor pricing changes regularly. Rates here are what each vendor advertised in May 2026. Helicone repriced Pro from $20 to $79/mo in 2025 (a 295 percent increase). Langfuse introduced Core at $29/mo in 2024 as the paid step up from the Hobby free tier's 50K-observation limit. Vercel AI Gateway graduated from beta to GA in 2024, bundled with Vercel platform plans. Portkey added guardrails and RBAC to Pro in 2025. LiteLLM crossed 100 providers in 2024. Verify the current rate on the vendor site before signing up.

Does Subrupt earn a commission from any of these picks?

We track which picks have approved affiliate programs in our database, and the FTC disclosure block at the top of every guide names which ones currently have a click-tracking partnership. Affiliate revenue does not change ranking. The composite math runs against the same weights for every pick regardless of partnership. Picks without an affiliate program appear in the lineup based on editorial fit only.

Why is Portkey ranked first when OpenRouter is the better-known name?

Portkey wins the raw composite at neutral fit because Production at $49 a month is cheap relative to the feature breadth (caching, fallbacks, prompt management, guardrails, observability across 40+ providers). OpenRouter at pick 2 remains the brand reference for indie-dev LLM gateways, with 300+ models from 60+ providers behind a single endpoint; Portkey edges it out on production feature breadth, not mainstream recognition.

What is the cheapest LLM gateway for a small team?

LiteLLM Open Source (Apache 2 / MIT) is genuinely free for self-hosted deployment; you pay infrastructure costs only. Cloudflare AI Gateway Free covers 10K requests a day at no cost. OpenRouter Free starts every account with $1 credit and full model access. Cheapest paid: Cloudflare Workers Paid $5/mo; Vercel AI Gateway Pro $20/user bundled; OpenRouter Pro $20/mo. Realistic 5-dev team budget: $0 (LiteLLM OSS self-host) to $100/mo (Vercel Pro x 5).

Why are LiteLLM, Helicone, and Langfuse all Y Combinator W23?

The LLM gateway category emerged in late 2022 and early 2023 as developers building on OpenAI hit the same problems: provider lock-in, no cost visibility, no caching, no fallbacks. Y Combinator funded several teams that winter (W23) at different layers: LiteLLM (OSS proxy), Helicone (observability-first proxy), Langfuse (OSS observability). OpenRouter and Portkey launched outside YC in the same window. The timing aligned with ChatGPT launch (Nov 2022) and the GPT-4 API release (March 2023).

Self-hosted LiteLLM vs Cloud LiteLLM: when does the OSS path pay off?

Self-hosted LiteLLM (Apache 2 / MIT) is free forever; Cloud LiteLLM Pro is $50 per user a month. For a 5-person team Cloud Pro costs $250/mo plus token spend; self-hosted costs only Python infrastructure (a $5-20 VPS for SMB scale). Self-host pays off when team size exceeds 3 people, when self-host control matters, or when token volume makes $250/mo gateway fee a meaningful share of LLM spend. Cloud Pro is cleaner operationally; self-host requires Python ops maturity.
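The breakeven in the answer above is plain arithmetic; the $20 VPS figure below is the upper end of the estimate given, and ops time is deliberately left out.

```python
def monthly_gateway_cost(team_size: int, self_hosted: bool,
                         per_user_fee: float = 50.0, vps_cost: float = 20.0) -> float:
    """Cloud Pro bills per user; self-host pays only infrastructure."""
    return vps_cost if self_hosted else per_user_fee * team_size

cloud = monthly_gateway_cost(5, self_hosted=False)  # 250.0
oss = monthly_gateway_cost(5, self_hosted=True)     # 20.0
savings = cloud - oss                               # 230.0 per month, before ops time
```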

Vercel AI Gateway vs OpenRouter on total cost: which costs less?

Both pass through token costs at provider rates with zero inference markup. Vercel AI Gateway requires Vercel Pro $20/user/mo. OpenRouter has no platform fee on Pay-as-you-go, $20/mo Pro for higher limits, and a small Stripe fee (~5 percent) on credit purchases. For solo devs the difference is rarely meaningful. For teams already on Vercel Pro the gateway is free incremental. Off Vercel, heavy credit top-ups make OpenRouter Stripe fees a line item; budget accordingly.

EU data residency: which picks support EU-only deployment?

Langfuse is Berlin-based with default EU residency on Cloud; self-host on EU infrastructure gives full control. Helicone Cloud is US-default; Enterprise covers EU on-prem. Portkey Cloud has multi-region with EU on Pro/Enterprise. LiteLLM self-hosted gives full data-residency control. OpenRouter routes via cheapest provider (US-default). Cloudflare and Vercel both run from EU PoPs. Strict EU buyers default to Langfuse with LiteLLM self-hosted.

Best LLM gateway for production at >1M requests a month?

Production-gateway picks dominate above 1M requests/mo. Portkey Production at $49/mo includes 100K logs with $9 overages up to 3M; cache hit reduces token spend 20-40 percent. The Apache 2.0 open-source Portkey gateway (March 2026) self-hosts the same routing for free. Helicone Team at $799/mo covers 5 orgs with SOC-2 and HIPAA. LiteLLM self-hosted scales linearly with infrastructure. Pick Portkey or Helicone for managed production; OSS Portkey or LiteLLM for cost-at-scale.
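The cache-savings claim above is easy to sanity-check with the 20-40 percent reduction band cited; the $5,000 monthly spend is an illustrative figure, not from any vendor.

```python
def effective_token_spend(monthly_spend: float, cache_reduction: float) -> float:
    """Cached responses cost no tokens; only the uncached share hits the provider."""
    return monthly_spend * (1 - cache_reduction)

low = effective_token_spend(5000, 0.20)   # 4000.0 at the low end of the band
high = effective_token_spend(5000, 0.40)  # 3000.0 at the high end
```

At that spend, even the low end of the band covers the $49 Production fee many times over, which is the sense in which caching "pays for itself."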

How often is this guide updated?

We re-review pricing annually at minimum, with mid-year refreshes when major vendor announcements happen. The Helicone Pro reprice (2025), the Langfuse Core launch (2024), Vercel AI Gateway GA (2024), the Portkey guardrails launch (2025), and LiteLLM's provider-count milestones (2024) each triggered same-week catalog updates. Verify current rates on the vendor site before signing up. The last-reviewed date reflects the most recent editorial pass.

Subrupt Editorial

The team behind subrupt.com. We track subscriptions, surface cheaper alternatives, and publish buying guides where the score formula is on the page so you can recompute it yourself. We do not claim 30,000 hours of testing. What we claim is live pricing from our database, a transparent composite score, and honest savings math against a category baseline.

Last reviewed


Affiliate disclosure: Subrupt earns a commission when you switch to a service through our recommendation links. This never changes the price you pay. We only recommend services where there's a real cost or feature advantage for you, and our picks are based on the data on this page, not on which programs pay the most.

Related buying guides

Track your subscriptions on Subrupt

Add the LLM Gateways you pay for and see how much you'd save by switching.

Open dashboard

More buying guides

Independent rankings for the subscriptions worth paying for.

See all guides