For OpenAI

Proxies for the full OpenAI API surface

Chat, Embeddings, DALL-E, Realtime, Assistants — all covered by the same header-based gateway routing. Residential for regional eval, ISP for multi-turn Assistants, datacenter for bulk Embeddings.

Updated 23 April 2026

Recommended exit classes

Recommended country anchors

Where this differs from the ChatGPT landing

The ChatGPT API landing focuses specifically on the Chat Completions surface (the primary consumer-branded surface). This page covers the broader OpenAI API platform: Embeddings, DALL-E image generation, the Realtime API for voice applications, and the Assistants API for agent-like multi-turn interactions.

Surface-by-surface routing shape

Chat Completions

See ChatGPT API landing for the detailed configuration. Residential + per-request for regional eval; ISP sticky for multi-turn agent benchmarks.

Embeddings

High-volume, latency-tolerant, cost-sensitive. Datacenter is the right class. Concurrency is the binding constraint: the OpenAI Embeddings API applies rate limits per account, not per IP, so the proxy layer distributes connection load and does not bypass rate limits.

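import os

import httpx

# Both environment variable names are placeholders; use whatever your setup defines.
API_KEY = os.environ["OPENAI_API_KEY"]
PROXY = os.environ["SQUAD_PROXY_URL"]  # gateway URL as a single string

def embed(texts: list[str]) -> dict:
    """Embed a batch of texts through the datacenter exit class."""
    response = httpx.post(
        "https://api.openai.com/v1/embeddings",
        json={"model": "text-embedding-3-large", "input": texts},
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "X-Squad-Class": "datacenter",
            "X-Squad-Country": "us",
        },
        proxy=PROXY,  # httpx >= 0.26; older releases take proxies= instead
        timeout=60,
    )
    response.raise_for_status()  # surface 4xx/5xx instead of parsing an error body
    return response.json()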
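Because concurrency is the binding constraint and limits apply per account, it can help to cap in-flight requests on the client side. The following is a minimal sketch, not a prescribed pattern: `batches` and `embed_all` are hypothetical helpers, `embed_fn` is any callable that embeds one batch (such as the `embed` function above), and the default `batch_size` and `max_workers` values are illustrative, not OpenAI or SquadProxy limits.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator


def batches(texts: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` texts."""
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


def embed_all(
    texts: list[str],
    embed_fn: Callable[[list[str]], object],
    batch_size: int = 256,  # illustrative default, tune to your account limits
    max_workers: int = 8,   # illustrative cap on concurrent requests
) -> list:
    """Run embed_fn over every batch with at most max_workers in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in input order, one per batch.
        return list(pool.map(embed_fn, batches(texts, batch_size)))
```

Calling `embed_all(corpus, embed)` would then return one response per batch, in the same order as the input.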

DALL-E (images.generations)

Evaluation of regional content policy for image generation is similar to the Chat eval pattern. Residential per-country for authentic origin; per-request rotation.

Realtime API

Low-latency voice. Datacenter via us-east-1 is the right shape; residential adds latency that hurts voice UX measurements. The Realtime API itself has carrier-origin sensitivity in some deployments, so for mobile-specific realtime eval, switch to 4G mobile routing (see 4G mobile).

Assistants API

Multi-turn by definition. ISP + sticky-30m is the right shape for eval that runs the same assistant across many turns.
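The per-surface routing shapes above can be collected into a single lookup, sketched here under several assumptions. The `X-Squad-Class` and `X-Squad-Country` headers and the `datacenter` value come from the Embeddings example; the `residential` and `isp` values, the surface keys, and the `X-Squad-Session` header used for sticky routing are hypothetical names chosen for illustration, so check the gateway documentation for the real ones.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    exit_class: str  # value sent in X-Squad-Class
    sticky: bool     # True when the same exit should be held across turns


# Surface -> routing shape, following the recommendations on this page.
# Only "datacenter" is a confirmed header value; the others are assumed.
ROUTES = {
    "chat_regional_eval": Route("residential", sticky=False),
    "chat_multi_turn": Route("isp", sticky=True),
    "embeddings": Route("datacenter", sticky=False),
    "images": Route("residential", sticky=False),
    "realtime": Route("datacenter", sticky=False),
    "assistants": Route("isp", sticky=True),
}


def routing_headers(surface: str, country: str,
                    session_id: Optional[str] = None) -> dict:
    """Build the gateway routing headers for one request."""
    route = ROUTES[surface]
    headers = {"X-Squad-Class": route.exit_class, "X-Squad-Country": country}
    if route.sticky:
        if session_id is None:
            raise ValueError(f"{surface} uses sticky routing and needs a session_id")
        # Hypothetical header name for pinning a session to one exit.
        headers["X-Squad-Session"] = session_id
    return headers
```

For example, `routing_headers("assistants", "us", "run-42")` would produce ISP headers plus the assumed session key, and the result can be merged with the `Authorization` header before sending.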

Plans that fit

See pricing. Large-scale Embeddings workloads are typically bottlenecked on OpenAI-side rate limits, not SquadProxy bandwidth, so the Solo or Team plan usually covers the proxy side. Continuous eval fleets across multiple surfaces scale into Team or Lab.

Pricing

Pricing — plans sized for OpenAI workloads

Every plan includes access to all 5 exit classes across our 10 focus countries — quotas vary by plan. The size you need scales with your eval cadence and concurrency.

Solo

For individual researchers running evaluation scripts and prototype RAG pipelines.

$149/month

or $1,430/year (save 20%)

50 GB residential · unlimited datacenter · 200 concurrent sessions

✓ Access to all 5 exit classes · 10 focus countries
✓ 50 GB residential · unlimited datacenter
✓ 5 static ISP IPs · 5 GB 4G mobile
✓ 1 seat · 200 concurrent sessions
✓ Python + Node SDK + REST API
✓ Per-request metering (not time-based)
✓ Email support (24h response, business days)
✓ Overage: $3/GB residential · $6/GB mobile

Start with Solo

Best for

Solo researchers
Evaluation scripts
Prototype RAG

Team

Lab

For academic labs, eval consortia, and frontier model companies running sustained workloads.

$2,999/month

or $28,790/year (save 20%)

2 TB residential · unlimited DC · 50 GB 4G + 20 GB 5G · 3,000 concurrent sessions

✓ Access to all 5 exit classes · 10 countries on 4 continents
✓ 2 TB residential · unlimited datacenter
✓ 100 static ISP IPs · 50 GB 4G + 20 GB 5G mobile
✓ 50 seats ($19/mo per extra seat) · 3,000 concurrent sessions
✓ Dedicated gateway lane (bypasses shared-pool queues on us-east-1 + eu-west-1)
✓ 99.95% uptime SLA
✓ Dedicated Slack channel (1h response, business hours)
✓ Custom BGP prefix on request (additional fees apply)
✓ Overage: $2.50/GB residential · $5/GB mobile

Start with Lab

Best for

Academic labs
Large eval consortia
Frontier model companies

Enterprise

Custom contracts with dedicated infrastructure, volume pricing, and research-grade SLAs.

Custom pricing

Custom (from 5 TB/mo residential) · unlimited concurrent sessions

✓ Volume pricing from 5 TB/mo residential
✓ Dedicated BGP prefix + ASN announcement
✓ Unlimited concurrent sessions · unlimited seats
✓ 99.99% uptime SLA with financial credits
✓ Named Technical Account Manager + 24/7 on-call paging
✓ Custom AUP, DPA, on-site deployment option
✓ Research / academic discount (30–50% off Team or Lab)
✓ Annual contract · wire, ACH, USDC/USDT/BTC settlement

Contact research team

Best for

Frontier labs
Eval consortia
Enterprise AI

All plans include 14-day refund, single endpoint with regional failover, HTTP(S) + SOCKS5 on every exit class, access to all 5 exit classes and all 10 focus countries, and Python + Node SDKs. Concurrent sessions = simultaneous TCP sessions through the gateway. Overage warnings fire at 80% and 100%; traffic continues only if overage billing is enabled on your account.
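To make the overage terms concrete, here is a small worked sketch using the Solo figures quoted above (50 GB residential included, $3/GB overage) and the 80% and 100% warning thresholds. The function names are hypothetical, and how the billing system rounds partial gigabytes is not stated on this page, so the sketch bills fractional GB linearly as an assumption.

```python
from typing import Optional


def overage_cost(used_gb: float, included_gb: float, rate_per_gb: float) -> float:
    """Cost of usage beyond the included quota; zero while under quota."""
    # Assumption: fractional gigabytes are billed proportionally.
    return max(0.0, used_gb - included_gb) * rate_per_gb


def warning_level(used_gb: float, included_gb: float) -> Optional[str]:
    """Return the highest warning threshold reached, or None below 80%."""
    ratio = used_gb / included_gb
    if ratio >= 1.0:
        return "100%"
    if ratio >= 0.8:
        return "80%"
    return None


# Solo plan: 62 GB of residential traffic in one month.
solo_extra = overage_cost(used_gb=62, included_gb=50, rate_per_gb=3.0)
```

Under these assumptions, 62 GB on Solo is 12 GB over quota, so `solo_extra` is 36.0 dollars on top of the monthly fee, provided overage billing is enabled on the account.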

Other API landings

Routing traffic for a different AI API?

Start routing OpenAI traffic through SquadProxy

Real ASNs, real edge capacity, and an engineer who answers your Slack the first time.

See pricing Contact sales
