# Hosted Proxy Quickstart

Create a shared OpenAI-compatible endpoint for your organization.
This is the shortest happy path for getting an org-scoped hosted proxy online.
Use this guide when you want:

- one shared OpenAI-compatible base URL for your team
- API keys that look like `sk-gpu-*`
- a portal-managed deployment instead of a laptop-local proxy
This flow assumes you are the org owner or an org admin.
## What You Need
- GPU CLI installed
- `gpu login` completed
- permission to create or manage an organization

If you have not created an organization yet, do that first.
## Supported Client APIs
The hosted proxy gives your org one shared public URL and supports these client-facing APIs:
- OpenAI: `GET /v1/models`
- OpenAI: `POST /v1/chat/completions`
- OpenAI: `POST /v1/responses`
- Anthropic: `POST /v1/messages`
Authentication is accepted as either:

- `Authorization: Bearer sk-gpu-...`
- `x-api-key: sk-gpu-...`
That means the same hosted proxy can work with OpenAI-style clients and Anthropic-style clients.
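As an illustration, the two accepted auth styles differ only in which header carries the key. This hypothetical helper (the function name and `style` parameter are ours, not part of the CLI or proxy API) builds each form:

```python
def auth_headers(key: str, style: str = "openai") -> dict:
    """Build request headers for the hosted proxy.

    OpenAI-style clients send `Authorization: Bearer <key>`;
    Anthropic-style clients send `x-api-key: <key>`.
    """
    if style == "anthropic":
        return {"x-api-key": key}
    return {"Authorization": f"Bearer {key}"}

print(auth_headers("sk-gpu-example"))
print(auth_headers("sk-gpu-example", style="anthropic"))
```

Both header forms carry the same `sk-gpu-*` key to the same base URL; the proxy accepts either.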
## 1. Create or Switch to the Organization
Create a new org:

```
gpu org create "My Team"
```

Or switch to an existing org:

```
gpu org switch my-team
```

All hosted-proxy commands act on the active org context.
## 2. Create the Hosted Proxy
```
gpu proxy create --hosted
```

This asks the portal control plane to provision a shared hosted proxy for the active organization.
## 3. Wait Until It Is Ready
Check deployment status:

```
gpu proxy hosted status
```

Once it reports ready, print the connection details:

```
gpu proxy hosted connect-info
```

You should end up with an HTTPS base URL that looks like:

```
https://<your-org-hosted-proxy>/v1
```

## 4. Create a Team API Key

Generate an API key for yourself or a teammate:

```
gpu proxy keys create --name "alice"
```

The CLI prints the new `sk-gpu-*` key once. Copy it into your password manager or secret store right away.
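Because the key is shown only once, clients should read it from the environment or a secret store rather than hard-coding it. A minimal client-side sanity check, sketched under our own assumptions (the helper name and the prefix check are illustrative, not CLI behavior):

```python
def looks_like_proxy_key(key: str) -> bool:
    # Hosted-proxy keys in this guide use the sk-gpu- prefix. This is a quick
    # sanity check before sending a request, not real validation.
    return key.startswith("sk-gpu-") and len(key) > len("sk-gpu-")

print(looks_like_proxy_key("sk-gpu-abc123"))   # a plausible proxy key
print(looks_like_proxy_key("sk-live-abc123"))  # some other provider's key
```

A check like this catches the common mistake of pointing a proxy client at a key from a different provider.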
## 5. Point a Client at the Hosted Proxy
Export the URL from `connect-info` and the key you just created:

```
export OPENAI_BASE_URL=https://<your-org-hosted-proxy>/v1
export OPENAI_API_KEY=sk-gpu-xxxxxxxx
```

### OpenAI compatibility
Quick connectivity check:

```
curl -sS -i \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  "$OPENAI_BASE_URL/models"
```

If your org has not attached any models yet, the request can still succeed with an empty model list.
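The `/v1/models` response follows the OpenAI list shape. A short sketch of what a client sees, using canned payloads (fields beyond `data[].id` are assumptions on our part):

```python
import json

# Canned payloads in the OpenAI list shape. A freshly created hosted proxy
# with no models attached returns an empty "data" array.
empty = json.loads('{"object": "list", "data": []}')
attached = json.loads(
    '{"object": "list", "data": [{"id": "qwen3.5:9b", "object": "model"}]}'
)

def model_ids(payload: dict) -> list:
    # Collect the model identifiers a client could route requests to.
    return [m["id"] for m in payload.get("data", [])]

print(model_ids(empty))     # no models attached yet
print(model_ids(attached))
```

An empty list with an HTTP 200 means auth and routing work; it is not an error.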
OpenAI Chat Completions example:

```
curl -sS \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  "$OPENAI_BASE_URL/chat/completions" \
  -d '{
    "model": "qwen3.5:9b",
    "messages": [{"role": "user", "content": "Say hello"}]
  }'
```

OpenAI Responses example:

```
curl -sS \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  "$OPENAI_BASE_URL/responses" \
  -d '{
    "model": "qwen3.5:9b",
    "input": "Say hello"
  }'
```

### Anthropic compatibility
Anthropic-style clients can point at the same deployment:

```
export ANTHROPIC_BASE_URL=https://<your-org-hosted-proxy>/v1
export ANTHROPIC_API_KEY=sk-gpu-xxxxxxxx
```

Anthropic Messages example:

```
curl -sS \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  "$ANTHROPIC_BASE_URL/messages" \
  -d '{
    "model": "qwen3.5:9b",
    "max_tokens": 128,
    "messages": [{"role": "user", "content": "Say hello"}]
  }'
```

## 6. Share With the Team
For each teammate or automation identity:

- create a separate API key with `gpu proxy keys create`
- give them the shared `OPENAI_BASE_URL`
- give them only their own `sk-gpu-*` key
That keeps access revocable per user or workflow.
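One way to script the fan-out is to generate one `gpu proxy keys create` invocation per identity, so every teammate or workflow gets its own revocable key. The roster below is hypothetical:

```python
# Hypothetical team roster; each entry becomes its own revocable key.
identities = ["alice", "bob", "ci-deploy"]

commands = [f'gpu proxy keys create --name "{name}"' for name in identities]
for cmd in commands:
    print(cmd)
```

Revoking one name with `gpu proxy keys revoke` then cuts off exactly one person or workflow, leaving the rest untouched.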
## Common Follow-Ups
Check current deployment state:

```
gpu proxy hosted status --json
```

List existing keys:

```
gpu proxy keys list
```

Revoke a key:

```
gpu proxy keys revoke --name "alice"
```

Upgrade the hosted daemon to your installed GPU CLI version:

```
gpu proxy hosted upgrade --yes
```

Destroy the hosted proxy when you no longer need it:

```
gpu proxy hosted destroy
```