GPU CLI

Commands Reference

Current reference for public GPU CLI commands

This page tracks the public GPU CLI surface that ships today. Use gpu --help for a quick overview, gpu --help-all for advanced and hidden commands, and see LLM Inference for the routed LLM workflow.

gpu run

Run any command on a remote GPU pod.

gpu run <COMMAND>

Common examples

gpu run python train.py
gpu run -d python long_job.py
gpu run -p 8080:8080 python server.py
gpu run --gpu-type "RTX 4090" python train.py
gpu run --attach job_abc123

Key flags

Flag | Description
--attach, -a <JOB_ID> | Reattach to an existing job
--detach, -d | Submit the job and return immediately
--status, -s | Show current pod state and recent jobs
--cancel <JOB_ID> | Cancel a running job
--tail, -n <N> | Show last N lines when attaching to a completed job
--interactive, -i | Allocate a PTY and keep stdin open
--gpu-type <TYPE> | Override GPU type for this run
--gpu-count <N> | Request multiple GPUs (1-8)
--min-vram <GB> | Set fallback VRAM floor
--rebuild | Force pod recreation when the Dockerfile has changed
--output, -o <PATH> | Override synced output paths
--no-output | Disable output syncing
--sync | Wait for output sync before exiting
--outputs | Show live daemon-managed output sync events
--output-summary | Show a brief sync summary on completion
--no-summary | Suppress the end-of-run summary banner
--show-sync | Show detailed sync progress
--force-sync | Ignore change detection and do a full sync
--remote-path <PATH> | Override the remote workspace path
--publish, -p <[LOCAL:]REMOTE> | Publish remote ports to localhost
--no-port-forward | Disable automatic port detection
--no-persistent-proxy | Stop the local proxy when the pod stops
--env, -e <KEY=VALUE> | Set environment variables
--json | JSON output after submit; only works with --detach
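
The detach, status, and attach flags combine into a common fire-and-forget workflow. A sketch using only the flags above (the job ID is a placeholder):

```shell
# Submit a long-running job and return immediately
gpu run -d python long_job.py

# Later: check pod state and recent jobs
gpu run --status

# Reattach and show the last 100 lines (job ID is a placeholder)
gpu run --attach job_abc123 --tail 100
```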

gpu attach

Reattach to a job by ID. This is a focused alias for gpu run --attach.

gpu attach job_abc123
gpu attach job_abc123 --tail 50

gpu cp

Copy files to or from the remote workspace.

gpu cp model.pt gpu:/workspace/
gpu cp gpu:/workspace/results/ ./results
gpu cp -r ./data gpu:/workspace/data/

Use the gpu: prefix, or the shorthand : prefix, for the remote side.
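
For example, the full prefix and the shorthand are interchangeable:

```shell
# Full prefix for the remote side
gpu cp model.pt gpu:/workspace/

# Equivalent shorthand prefix
gpu cp model.pt :/workspace/
```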

gpu login

Authenticate with GPU CLI in your browser.

gpu login

Flag | Description
--timeout <SECONDS> | Browser auth timeout (default: 300)

gpu logout

Remove the stored browser session.

gpu logout

Flag | Description
--yes, -y, --force | Skip confirmation

gpu auth

Manage provider and model-hub credentials.

auth login

Authenticate with your cloud provider profile.

gpu auth login
gpu auth login --profile dev
gpu auth login --generate-ssh-keys

auth logout

Remove provider credentials.

gpu auth logout

auth status

Show current auth state.

gpu auth status

auth add

Add HuggingFace or Civitai credentials for model downloads.

gpu auth add hf
gpu auth add civitai --token <VALUE>

auth remove

Remove model-hub credentials.

gpu auth remove hf

auth hubs

List configured model-hub credentials.

gpu auth hubs

gpu init

Create a gpu.jsonc file for the current project.

gpu init

Flag | Description
--gpu-type <TYPE> | Default GPU type
--max-price <PRICE> | Maximum hourly price
--profile <NAME> | Profile to use
--encryption / --no-encryption | Toggle volume encryption
--force, -f | Reinitialize an existing config

gpu inventory

List available GPUs and pricing.

gpu inventory
gpu inventory --available
gpu inventory --min-vram 24 --max-price 1.00
gpu inventory --json

gpu status

Show project status, active pods, recent jobs, and cost information.

gpu status
gpu status --json
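
The JSON form is convenient for scripting. A minimal sketch that pretty-prints the output with jq (the exact field layout is not documented here, so no particular keys are assumed):

```shell
# Pretty-print the machine-readable status (requires jq)
gpu status --json | jq .

# Or append periodic snapshots to a log for later inspection
gpu status --json >> status.log
```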

gpu dashboard

Open the TUI dashboard for pods and jobs.

gpu dashboard

Keybindings: j/k to move, Tab to switch panels, Enter to expand, a to attach, c to cancel, s to stop, e for events, ? for help, q to quit.

gpu logs

Stream or inspect unified logs for jobs, sync events, hooks, and agent output.

gpu logs
gpu logs --job job_abc123 --tail 50
gpu logs --follow --type lifecycle
gpu logs --json

gpu stop

Stop the active pod immediately.

gpu stop
gpu stop pod_abc123 --no-sync

Flag | Description
[POD_ID] | Optional pod ID
--yes, -y, --force | Skip confirmation
--no-sync | Skip final output sync
--json | Output JSON

gpu use

Run a template directly or resume a template session.

gpu use ollama
gpu use vllm
gpu use owner/repo@v1.0.0
gpu use

Flags

Flag | Description
--name <NAME> | Override the session or project name
--yes | Skip interactive prompts
--dry-run | Show what would be created
--input <KEY=VALUE> | Provide template input values

See LLM Inference for the built-in Ollama and vLLM template paths.

gpu llm

Launch the routed LLM workflow for Ollama or vLLM.

gpu llm run
gpu llm run --ollama --model deepseek-r1:8b -y
gpu llm run --vllm --url meta-llama/Llama-3.1-8B-Instruct -y
gpu llm info deepseek-r1:70b
gpu llm info --url meta-llama/Llama-3.1-8B-Instruct --json

See LLM Inference for setup, port layout, and wake-on-request routing behavior.

gpu comfyui

Run ComfyUI workflows from the curated workflow catalog.

gpu comfyui list
gpu comfyui info flux_schnell
gpu comfyui validate flux_schnell --gpu-type "RTX 4090"
gpu comfyui run flux_schnell --volume-id vol_abc123
gpu comfyui generate "a cat astronaut on mars"
gpu comfyui stop --all

gpu notebook

Run Marimo notebooks on GPU pods.

gpu notebook
gpu notebook train.py
gpu notebook train.py --run
gpu notebook --new analysis

Common flags

Flag | Description
--run, -r | Serve in read-only app mode
--new <NAME> | Create a new notebook
--gpu-type <TYPE> | Override GPU selection
--volume-id <ID> | Use a specific network volume
--no-volume | Use ephemeral storage
--yes, -y | Skip confirmation

gpu volume

Manage network volumes.

volume list

gpu volume list
gpu volume list --detailed
gpu volume list --json

volume create

gpu volume create --name my-models --size 200
gpu volume create --datacenter US-TX-3 --set-global

volume delete

gpu volume delete my-models --force

volume extend

gpu volume extend my-models --size 300

volume set-global

gpu volume set-global my-models

volume status

gpu volume status
gpu volume status --volume my-models --json

volume migrate

gpu volume migrate my-models --to US-OR-1

volume sync

gpu volume sync source-volume dest-volume
gpu volume sync source-volume dest-volume --method rsync

volume cancel-migration

gpu volume cancel-migration transfer_abc123
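
Putting the subcommands together, a typical volume lifecycle might look like this (volume names and datacenters are placeholders):

```shell
# Create a volume and make it the default for new pods
gpu volume create --name my-models --size 200
gpu volume set-global my-models

# Grow it later when checkpoints pile up
gpu volume extend my-models --size 300

# Move it to another datacenter and watch progress
gpu volume migrate my-models --to US-OR-1
gpu volume status --volume my-models
```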

gpu vault

Manage encrypted vault storage for sensitive outputs.

vault list

gpu vault list
gpu vault list --project my-project --json

vault export

gpu vault export checkpoints/model.pt ./model.pt
gpu vault export generated/image.png ~/Desktop/ --force

vault stats

gpu vault stats
gpu vault stats --project my-project --json

gpu proxy

Manage the local proxy router for OpenAI-compatible model access.

proxy start

gpu proxy start

Starts the local proxy listeners. By default the LLM proxy listens on 127.0.0.1:4000 and the diffusion proxy listens on 127.0.0.1:4001.
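
Since the router is OpenAI-compatible, any OpenAI-style client can point at the LLM listener. A hedged sketch using curl (the /v1/chat/completions path follows the OpenAI API convention and the model name is a placeholder; neither is confirmed CLI output):

```shell
# Query a registered model through the local LLM proxy
curl http://127.0.0.1:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1:8b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```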

proxy stop

gpu proxy stop

Stops all proxy listeners and health checks.

proxy status

gpu proxy status
gpu proxy status --json

Shows whether the proxy is running, the configured listen addresses, and registered model counts.

proxy models

gpu proxy models
gpu proxy models --json

Lists registered models, backend counts, service types, and healthy backend counts.

proxy register

gpu proxy register --pod-id llm-ollama-qwen3-5-9b --model qwen3.5:9b --port 11434 --type ollama
gpu proxy register --pod-id abc123 --model meta-llama/Llama-3-8B --port 8000 --type vllm

Manually registers a backend when you need to attach an existing service to the local proxy. gpu llm run --register-proxy does this automatically for normal LLM launches.

proxy deregister

gpu proxy deregister --pod-id llm-ollama-qwen3-5-9b --model qwen3.5:9b
gpu proxy deregister --pod-id abc123 --model meta-llama/Llama-3-8B

Removes a backend from the proxy router.

gpu config

Inspect or validate configuration.

gpu config show
gpu config validate
gpu config schema
gpu config get default_profile
gpu config set updates.auto_update false
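
Assuming gpu config get prints the bare value, it composes with shell scripting. A minimal sketch (the default_profile key comes from the example above; the comparison value is a placeholder):

```shell
# Switch auth profile only when the configured default differs
profile=$(gpu config get default_profile)
if [ "$profile" != "dev" ]; then
  gpu auth login --profile dev
fi
```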

gpu template

Browse official templates or clear the local template cache.

gpu template list
gpu template list --json
gpu template clear-cache -y

gpu org

Manage organizations, membership, sub-accounts, and service accounts.

gpu org list
gpu org create "My Team"
gpu org delete my-team
gpu org switch my-team
gpu org invite alice@example.com --role admin
gpu org service-account create --name ci
gpu org service-account revoke sa_abc123

See Organizations & Service Accounts for the current model, including headless automation.

gpu serverless

Deploy and manage RunPod Serverless endpoints.

gpu serverless deploy
gpu serverless list --json
gpu serverless status <ENDPOINT_ID>
gpu serverless delete
gpu serverless warm <ENDPOINT_ID> --gpu
gpu serverless logs <ENDPOINT_ID>
gpu serverless template delete

serverless status

gpu serverless status <ENDPOINT_ID>
gpu serverless status <ENDPOINT_ID> --json

serverless logs

gpu serverless logs <ENDPOINT_ID>
gpu serverless logs <ENDPOINT_ID> --tail 100
gpu serverless logs <ENDPOINT_ID> --status error

gpu serverless logs currently points you to the RunPod dashboard rather than streaming or filtering logs inside the CLI, but the flags above remain part of the public surface.

serverless warm

gpu serverless warm <ENDPOINT_ID> --gpu
gpu serverless warm <ENDPOINT_ID> --cpu
gpu serverless warm <ENDPOINT_ID> --timeout 900

Current behavior

  • gpu serverless status and gpu serverless warm should be treated as endpoint-ID-driven commands.
  • gpu serverless delete supports name lookup and interactive selection when you omit the endpoint.
  • gpu serverless warm --gpu is the supported warmup path today.
  • gpu serverless warm --cpu, deploy-time --warm, and deploy-time --write-ids are not wired through the current runtime.
  • gpu serverless logs currently points you to the RunPod dashboard rather than streaming filtered logs in the CLI.

See Serverless Endpoints for templates, config shape, and request examples.

gpu start

Start a new GPU pod explicitly instead of letting gpu run manage the pod lifecycle for you.

gpu start
gpu start --gpu-type "NVIDIA RTX 6000 Ada"
gpu start --min-vram 24 --max-price 1.50

Key flags

Flag | Description
--gpu-type <TYPE> | Pin a specific GPU type
--gpu-count <N> | Request multiple GPUs
--max-price <PRICE> | Limit hourly spend during selection
--min-vram <GB> | Set a VRAM floor
--cloud-type <TYPE> | Choose secure, community, or all
--region <REGION> | Restrict to a region or datacenter
--docker-image <IMAGE> | Override the image
--force | Skip reuse checks and create a new pod

gpu instances

Inspect or terminate provider-side instances directly.

gpu instances list
gpu instances list --project my-project
gpu instances show pod_abc123
gpu instances terminate pod_abc123 -y

gpu desktop

Manage the desktop app.

gpu desktop install
gpu desktop install --channel beta
gpu desktop uninstall -y

gpu update

Update GPU CLI to the latest version or inspect update availability.

gpu update
gpu update --check
gpu update --target-version 1.2.3
gpu update --dismiss

gpu upgrade

Open the pricing flow to upgrade your subscription.

gpu upgrade

gpu changelog

View release notes by version or version range.

gpu changelog
gpu changelog 0.8.0
gpu changelog --from 0.7.0 --to 0.8.0

gpu daemon

Manage the background daemon.

gpu daemon status
gpu daemon start
gpu daemon restart
gpu daemon logs --follow

gpu doctor

Run a setup diagnostic.

gpu doctor
gpu doctor --json

gpu issue

Submit an issue report with optional daemon logs.

gpu issue
gpu issue "sync not working"
gpu issue --no-logs -y "bug report"

gpu agent-docs

Print the agent-focused CLI reference to stdout.

gpu agent-docs
gpu agent-docs | head -50
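
A common use is saving the reference where a coding agent can read it (the file path is just an example):

```shell
# Save the agent-facing reference alongside your project
gpu agent-docs > docs/gpu-cli-reference.txt
```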

gpu support

Open the GPU CLI community support link in your browser.

gpu support

Global flags

These flags work with all commands:

Flag | Description
-v, --verbose | Increase logging verbosity (-v, -vv, -vvv)
-q, --quiet | Minimal output
--progress-style <STYLE> | Progress display style: panel, pipeline, minimal, verbose
--help-all | Show all commands including hidden ones
--no-auto-update | Disable the update check for this invocation
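
As an illustration, global flags combine with any subcommand. A sketch (exact flag placement relative to the subcommand is assumed, not confirmed by this page):

```shell
# Increase verbosity while troubleshooting a run
gpu run python train.py -vv

# Quiet, scriptable status check that skips the update check
gpu status --no-auto-update -q
```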
