GPU CLI

Hosted Proxy Quickstart

Create a shared OpenAI-compatible endpoint for your organization

This is the shortest happy path for getting an org-scoped hosted proxy online.

Use this guide when you want:

  • one shared OpenAI-compatible base URL for your team
  • API keys that look like sk-gpu-*
  • a portal-managed deployment instead of a laptop-local proxy

This flow assumes you are the org owner or an org admin.

What You Need

  • GPU CLI installed
  • gpu login completed
  • permission to create or manage an organization

If you have not created an organization yet, do that first.

Supported Client APIs

The hosted proxy gives your org one shared public URL and supports these client-facing APIs:

  • OpenAI GET /v1/models
  • OpenAI POST /v1/chat/completions
  • OpenAI POST /v1/responses
  • Anthropic POST /v1/messages

Authentication is accepted as either:

  • Authorization: Bearer sk-gpu-...
  • x-api-key: sk-gpu-...

That means the same hosted proxy can work with OpenAI-style clients and Anthropic-style clients.
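Both header styles carry the same sk-gpu-* key. A minimal sketch of the two shapes, assuming nothing beyond the two headers listed above (the `auth_headers` helper is hypothetical, not part of GPU CLI):

```python
# Sketch: the two auth header styles the hosted proxy accepts for the same
# sk-gpu-* key. `auth_headers` is a hypothetical helper for illustration.

def auth_headers(api_key: str, style: str = "openai") -> dict:
    """Build auth headers in either OpenAI or Anthropic client style."""
    if style == "openai":
        # OpenAI-style clients send the key as a Bearer token.
        return {"Authorization": f"Bearer {api_key}"}
    if style == "anthropic":
        # Anthropic-style clients send the key in x-api-key.
        return {"x-api-key": api_key}
    raise ValueError(f"unknown style: {style}")

key = "sk-gpu-xxxxxxxx"
print(auth_headers(key, "openai"))     # {'Authorization': 'Bearer sk-gpu-xxxxxxxx'}
print(auth_headers(key, "anthropic"))  # {'x-api-key': 'sk-gpu-xxxxxxxx'}
```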

1. Create or Switch to the Organization

Create a new org:

gpu org create "My Team"

Or switch to an existing org:

gpu org switch my-team

All hosted-proxy commands act on the active org context.

2. Create the Hosted Proxy

gpu proxy create --hosted

This asks the portal control plane to provision a shared hosted proxy for the active organization.

3. Wait Until It Is Ready

Check deployment status:

gpu proxy hosted status

Once it reports ready, print the connection details:

gpu proxy hosted connect-info

You should end up with an HTTPS base URL that looks like:

https://<your-org-hosted-proxy>/v1

4. Create a Team API Key

Generate an API key for yourself or a teammate:

gpu proxy keys create --name "alice"

The CLI prints the new sk-gpu-* key once. Copy it into your password manager or secret store right away.

5. Point a Client at the Hosted Proxy

Export the URL from connect-info and the key you just created:

export OPENAI_BASE_URL=https://<your-org-hosted-proxy>/v1
export OPENAI_API_KEY=sk-gpu-xxxxxxxx

OpenAI compatibility

Quick connectivity check:

curl -sS -i \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  "$OPENAI_BASE_URL/models"

If your org has not attached any models yet, the request can still succeed; it simply returns an empty model list.
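To sanity-check the response programmatically, you can pull the model IDs out of the JSON body. This sketch assumes the standard OpenAI list shape, `{"object": "list", "data": [{"id": ...}, ...]}`; the payload below is illustrative, not real output from a hosted proxy:

```python
import json

# Sketch: extracting model IDs from a GET /v1/models response, assuming the
# standard OpenAI list shape. `raw` stands in for the HTTP response body.
raw = '{"object": "list", "data": [{"id": "qwen3.5:9b", "object": "model"}]}'

body = json.loads(raw)
model_ids = [m["id"] for m in body.get("data", [])]
print(model_ids)  # ['qwen3.5:9b'] -- an org with no models attached prints []
```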

OpenAI Chat Completions example:

curl -sS \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  "$OPENAI_BASE_URL/chat/completions" \
  -d '{
    "model": "qwen3.5:9b",
    "messages": [{"role": "user", "content": "Say hello"}]
  }'
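The same call can be made from any HTTP client, not just curl. A minimal sketch of building the Chat Completions request with the Python standard library (`chat_request` is a hypothetical helper; actually sending the request requires your live hosted-proxy URL and key):

```python
import json
import urllib.request

def chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style Chat Completions request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Placeholder base URL and key; substitute your connect-info values.
req = chat_request("https://example.invalid/v1", "sk-gpu-xxxxxxxx", "qwen3.5:9b", "Say hello")
print(req.full_url)  # https://example.invalid/v1/chat/completions
# urllib.request.urlopen(req) would send it against a live hosted proxy.
```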

OpenAI Responses example:

curl -sS \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  "$OPENAI_BASE_URL/responses" \
  -d '{
    "model": "qwen3.5:9b",
    "input": "Say hello"
  }'

Anthropic compatibility

Anthropic-style clients can point at the same deployment:

export ANTHROPIC_BASE_URL=https://<your-org-hosted-proxy>/v1
export ANTHROPIC_API_KEY=sk-gpu-xxxxxxxx

Anthropic Messages example:

curl -sS \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  "$ANTHROPIC_BASE_URL/messages" \
  -d '{
    "model": "qwen3.5:9b",
    "max_tokens": 128,
    "messages": [{"role": "user", "content": "Say hello"}]
  }'

6. Share With the Team

For each teammate or automation identity:

  • create a separate API key with gpu proxy keys create
  • give them the shared OPENAI_BASE_URL
  • give them only their own sk-gpu-* key

That keeps access revocable per user or workflow.
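If you are onboarding several identities at once, the per-identity step above is easy to script. A small sketch that turns a roster into the `gpu proxy keys create` invocations from this guide (the roster names are hypothetical):

```python
# Sketch: generate one `gpu proxy keys create` command per identity, so each
# teammate or automation workflow gets its own revocable sk-gpu-* key.
# The command shape comes from this guide; the roster is hypothetical.
roster = ["alice", "bob", "ci-deploy"]

commands = [f'gpu proxy keys create --name "{name}"' for name in roster]
for cmd in commands:
    print(cmd)
```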

Common Follow-Ups

Check current deployment state:

gpu proxy hosted status --json

List existing keys:

gpu proxy keys list

Revoke a key:

gpu proxy keys revoke --name "alice"

Upgrade the hosted daemon to your installed GPU CLI version:

gpu proxy hosted upgrade --yes

Destroy the hosted proxy when you no longer need it:

gpu proxy hosted destroy
