Commands Reference
Current reference for public GPU CLI commands
This page tracks the public GPU CLI surface that ships today. Use gpu --help for a quick overview, gpu --help-all for advanced and hidden commands, and see LLM Inference for the routed LLM workflow.
gpu run
Run any command on a remote GPU pod.
gpu run <COMMAND>
Common examples
gpu run python train.py
gpu run -d python long_job.py
gpu run -p 8080:8080 python server.py
gpu run --gpu-type "RTX 4090" python train.py
gpu run --attach job_abc123
Key flags
| Flag | Description |
|---|---|
| --attach, -a <JOB_ID> | Reattach to an existing job |
| --detach, -d | Submit the job and return immediately |
| --status, -s | Show current pod state and recent jobs |
| --cancel <JOB_ID> | Cancel a running job |
| --tail, -n <N> | Show last N lines when attaching to a completed job |
| --interactive, -i | Allocate a PTY and keep stdin open |
| --gpu-type <TYPE> | Override GPU type for this run |
| --gpu-count <N> | Request multiple GPUs (1-8) |
| --min-vram <GB> | Set a fallback VRAM floor |
| --rebuild | Force pod recreation when the Dockerfile has changed |
| --output, -o <PATH> | Override synced output paths |
| --no-output | Disable output syncing |
| --sync | Wait for output sync before exiting |
| --outputs | Show live daemon-managed output sync events |
| --output-summary | Show a brief sync summary on completion |
| --no-summary | Suppress the end-of-run summary banner |
| --show-sync | Show detailed sync progress |
| --force-sync | Ignore change detection and do a full sync |
| --remote-path <PATH> | Override the remote workspace path |
| --publish, -p <[LOCAL:]REMOTE> | Publish remote ports to localhost |
| --no-port-forward | Disable automatic port detection |
| --no-persistent-proxy | Stop the local proxy when the pod stops |
| --env, -e <KEY=VALUE> | Set environment variables |
| --json | JSON output after submit; only works with --detach |
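The [LOCAL:]REMOTE form of --publish means the local port is optional. A minimal sketch of that parsing rule, assuming a bare port publishes to the same local port (an assumption for illustration, not the CLI's documented implementation):

```python
def parse_publish(spec: str) -> tuple[int, int]:
    """Parse a [LOCAL:]REMOTE port spec into (local, remote).

    Assumes a bare port maps to the same local port -- an
    illustrative assumption, not documented CLI behavior.
    """
    parts = spec.split(":")
    if len(parts) == 1:
        local = remote = int(parts[0])
    elif len(parts) == 2:
        local, remote = int(parts[0]), int(parts[1])
    else:
        raise ValueError(f"invalid publish spec: {spec!r}")
    return local, remote

print(parse_publish("8080:8080"))  # -> (8080, 8080)
print(parse_publish("3000:8000"))  # -> (3000, 8000)
```

So `-p 8080:8080` and `-p 8080` would publish the same listener under this reading.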
gpu attach
Reattach to a job by ID. This is a focused alias for gpu run --attach.
gpu attach job_abc123
gpu attach job_abc123 --tail 50
gpu cp
Copy files to or from the remote workspace.
gpu cp model.pt gpu:/workspace/
gpu cp gpu:/workspace/results/ ./results
gpu cp -r ./data gpu:/workspace/data/
Use the gpu: prefix, or the shorthand : prefix, for the remote side.
gpu login
Authenticate with GPU CLI in your browser.
gpu login
| Flag | Description |
|---|---|
| --timeout <SECONDS> | Browser auth timeout (default: 300) |
gpu logout
Remove the stored browser session.
gpu logout
| Flag | Description |
|---|---|
| --yes, -y, --force | Skip confirmation |
gpu auth
Manage provider and model-hub credentials.
auth login
Authenticate with your cloud provider profile.
gpu auth login
gpu auth login --profile dev
gpu auth login --generate-ssh-keys
auth logout
Remove provider credentials.
gpu auth logout
auth status
Show current auth state.
gpu auth status
auth add
Add HuggingFace or Civitai credentials for model downloads.
gpu auth add hf
gpu auth add civitai --token <VALUE>
auth remove
Remove model-hub credentials.
gpu auth remove hf
auth hubs
List configured model-hub credentials.
gpu auth hubs
gpu init
Create a gpu.jsonc file for the current project.
gpu init
| Flag | Description |
|---|---|
| --gpu-type <TYPE> | Default GPU type |
| --max-price <PRICE> | Maximum hourly price |
| --profile <NAME> | Profile to use |
| --encryption / --no-encryption | Toggle volume encryption |
| --force, -f | Reinitialize an existing config |
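Putting those flags together, a gpu.jsonc produced by gpu init might look like the sketch below. The field names are illustrative assumptions, not the documented schema; run gpu config schema for the authoritative shape.

```jsonc
{
  // Illustrative sketch only -- these field names are assumptions.
  // Use `gpu config schema` for the real schema.
  "gpu_type": "RTX 4090",
  "max_price": 1.50,
  "profile": "dev",
  "encryption": true
}
```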
gpu inventory
List available GPUs and pricing.
gpu inventory
gpu inventory --available
gpu inventory --min-vram 24 --max-price 1.00
gpu inventory --json
gpu status
Show project status, active pods, recent jobs, and cost information.
gpu status
gpu status --json
gpu dashboard
Open the TUI dashboard for pods and jobs.
gpu dashboard
Keybindings: j/k to move, Tab to switch panels, Enter to expand, a to attach, c to cancel, s to stop, e for events, ? for help, q to quit.
gpu logs
Stream or inspect unified logs for jobs, sync events, hooks, and agent output.
gpu logs
gpu logs --job job_abc123 --tail 50
gpu logs --follow --type lifecycle
gpu logs --json
gpu stop
Stop the active pod immediately.
gpu stop
gpu stop pod_abc123 --no-sync
| Flag | Description |
|---|---|
| [POD_ID] | Optional pod ID |
| --yes, -y, --force | Skip confirmation |
| --no-sync | Skip final output sync |
| --json | Output JSON |
gpu use
Run a template directly or resume a template session.
gpu use ollama
gpu use vllm
gpu use owner/repo@v1.0.0
gpu use
Flags
| Flag | Description |
|---|---|
| --name <NAME> | Override the session or project name |
| --yes | Skip interactive prompts |
| --dry-run | Show what would be created |
| --input <KEY=VALUE> | Provide template input values |
See LLM Inference for the built-in Ollama and vLLM template paths.
gpu llm
Launch the routed LLM workflow for Ollama or vLLM.
gpu llm run
gpu llm run --ollama --model deepseek-r1:8b -y
gpu llm run --vllm --url meta-llama/Llama-3.1-8B-Instruct -y
gpu llm info deepseek-r1:70b
gpu llm info --url meta-llama/Llama-3.1-8B-Instruct --json
See LLM Inference for setup, port layout, and wake-on-request routing behavior.
gpu comfyui
Run ComfyUI workflows from the curated workflow catalog.
gpu comfyui list
gpu comfyui info flux_schnell
gpu comfyui validate flux_schnell --gpu-type "RTX 4090"
gpu comfyui run flux_schnell --volume-id vol_abc123
gpu comfyui generate "a cat astronaut on mars"
gpu comfyui stop --all
gpu notebook
Run Marimo notebooks on GPU pods.
gpu notebook
gpu notebook train.py
gpu notebook train.py --run
gpu notebook --new analysis
Common flags
| Flag | Description |
|---|---|
| --run, -r | Serve in read-only app mode |
| --new <NAME> | Create a new notebook |
| --gpu-type <TYPE> | Override GPU selection |
| --volume-id <ID> | Use a specific network volume |
| --no-volume | Use ephemeral storage |
| --yes, -y | Skip confirmation |
gpu volume
Manage network volumes.
volume list
gpu volume list
gpu volume list --detailed
gpu volume list --json
volume create
gpu volume create --name my-models --size 200
gpu volume create --datacenter US-TX-3 --set-global
volume delete
gpu volume delete my-models --force
volume extend
gpu volume extend my-models --size 300
volume set-global
gpu volume set-global my-models
volume status
gpu volume status
gpu volume status --volume my-models --json
volume migrate
gpu volume migrate my-models --to US-OR-1
volume sync
gpu volume sync source-volume dest-volume
gpu volume sync source-volume dest-volume --method rsync
volume cancel-migration
gpu volume cancel-migration transfer_abc123
gpu vault
Manage encrypted vault storage for sensitive outputs.
vault list
gpu vault list
gpu vault list --project my-project --json
vault export
gpu vault export checkpoints/model.pt ./model.pt
gpu vault export generated/image.png ~/Desktop/ --force
vault stats
gpu vault stats
gpu vault stats --project my-project --json
gpu proxy
Manage the local proxy router for OpenAI-compatible model access.
proxy start
gpu proxy start
Starts the local proxy listeners. By default the LLM proxy listens on 127.0.0.1:4000 and the diffusion proxy listens on 127.0.0.1:4001.
proxy stop
gpu proxy stop
Stops all proxy listeners and health checks.
proxy status
gpu proxy status
gpu proxy status --json
Shows whether the proxy is running, the configured listen addresses, and registered model counts.
proxy models
gpu proxy models
gpu proxy models --json
Lists registered models, backend counts, service types, and healthy backend counts.
proxy register
gpu proxy register --pod-id llm-ollama-qwen3-5-9b --model qwen3.5:9b --port 11434 --type ollama
gpu proxy register --pod-id abc123 --model meta-llama/Llama-3-8B --port 8000 --type vllm
Manually registers a backend when you need to attach an existing service to the local proxy. gpu llm run --register-proxy does this automatically for normal LLM launches.
proxy deregister
gpu proxy deregister --pod-id llm-ollama-qwen3-5-9b --model qwen3.5:9b
gpu proxy deregister --pod-id abc123 --model meta-llama/Llama-3-8B
Removes a backend from the proxy router.
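Once a model is registered, the local proxy serves it over an OpenAI-compatible API on the LLM listener (127.0.0.1:4000 by default). The sketch below builds a chat request for that listener; the /v1/chat/completions route is assumed from the OpenAI API convention, since this page only states the proxy is OpenAI-compatible.

```python
import json

# Default LLM proxy listener from `gpu proxy start`.
PROXY_BASE = "http://127.0.0.1:4000"

def chat_request(model: str, prompt: str) -> tuple[str, str]:
    """Build an OpenAI-style chat completion request for the local proxy.

    The /v1/chat/completions path is an assumption based on the
    OpenAI API convention, not something this reference documents.
    """
    url = f"{PROXY_BASE}/v1/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, body

url, body = chat_request("qwen3.5:9b", "Say hello.")
print(url)  # -> http://127.0.0.1:4000/v1/chat/completions
```

Any OpenAI-compatible client pointed at the listener address should work the same way.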
gpu config
Inspect or validate configuration.
gpu config show
gpu config validate
gpu config schema
gpu config get default_profile
gpu config set updates.auto_update false
gpu template
Browse official templates or clear the local template cache.
gpu template list
gpu template list --json
gpu template clear-cache -y
gpu org
Manage organizations, membership, sub-accounts, and service accounts.
gpu org list
gpu org create "My Team"
gpu org delete my-team
gpu org switch my-team
gpu org invite alice@example.com --role admin
gpu org service-account create --name ci
gpu org service-account revoke sa_abc123
See Organizations & Service Accounts for the current model, including headless automation.
gpu serverless
Deploy and manage RunPod Serverless endpoints.
gpu serverless deploy
gpu serverless list --json
gpu serverless status <ENDPOINT_ID>
gpu serverless delete
gpu serverless warm <ENDPOINT_ID> --gpu
gpu serverless logs <ENDPOINT_ID>
gpu serverless template delete
serverless status
gpu serverless status <ENDPOINT_ID>
gpu serverless status <ENDPOINT_ID> --json
serverless logs
gpu serverless logs <ENDPOINT_ID>
gpu serverless logs <ENDPOINT_ID> --tail 100
gpu serverless logs <ENDPOINT_ID> --status error
gpu serverless logs currently points you to the RunPod dashboard rather than streaming or filtering logs inside the CLI, but these are still the public CLI flags.
serverless warm
gpu serverless warm <ENDPOINT_ID> --gpu
gpu serverless warm <ENDPOINT_ID> --cpu
gpu serverless warm <ENDPOINT_ID> --timeout 900
Current behavior
gpu serverless status and gpu serverless warm should be treated as endpoint-ID-driven commands. gpu serverless delete supports name lookup and interactive selection when you omit the endpoint. gpu serverless warm --gpu is the supported warmup path today; gpu serverless warm --cpu, deploy-time --warm, and deploy-time --write-ids are not wired through the current runtime. gpu serverless logs currently points you to the RunPod dashboard rather than streaming filtered logs in the CLI.
Use Serverless Endpoints for templates, config shape, and request examples.
gpu start
Start a new GPU pod explicitly instead of letting gpu run manage pod lifecycle for you.
gpu start
gpu start --gpu-type "NVIDIA RTX 6000 Ada"
gpu start --min-vram 24 --max-price 1.50
Key flags
| Flag | Description |
|---|---|
| --gpu-type <TYPE> | Pin a specific GPU type |
| --gpu-count <N> | Request multiple GPUs |
| --max-price <PRICE> | Limit hourly spend during selection |
| --min-vram <GB> | Set a VRAM floor |
| --cloud-type <TYPE> | Choose secure, community, or all |
| --region <REGION> | Restrict to a region or datacenter |
| --docker-image <IMAGE> | Override the image |
| --force | Skip reuse checks and create a new pod |
gpu instances
Inspect or terminate provider-side instances directly.
gpu instances list
gpu instances list --project my-project
gpu instances show pod_abc123
gpu instances terminate pod_abc123 -y
gpu desktop
Manage the desktop app.
gpu desktop install
gpu desktop install --channel beta
gpu desktop uninstall -y
gpu update
Update GPU CLI to the latest version or inspect update availability.
gpu update
gpu update --check
gpu update --target-version 1.2.3
gpu update --dismiss
gpu upgrade
Open the pricing flow to upgrade your subscription.
gpu upgrade
gpu changelog
View release notes by version or version range.
gpu changelog
gpu changelog 0.8.0
gpu changelog --from 0.7.0 --to 0.8.0
gpu daemon
Manage the background daemon.
gpu daemon status
gpu daemon start
gpu daemon restart
gpu daemon logs --follow
gpu doctor
Run a setup diagnostic.
gpu doctor
gpu doctor --json
gpu issue
Submit an issue report with optional daemon logs.
gpu issue
gpu issue "sync not working"
gpu issue --no-logs -y "bug report"
gpu agent-docs
Print the agent-focused CLI reference to stdout.
gpu agent-docs
gpu agent-docs | head -50
gpu support
Open the GPU CLI community support link in your browser.
gpu support
Global flags
These flags work with all commands:
| Flag | Description |
|---|---|
| -v, --verbose | Increase logging verbosity (-v, -vv, -vvv) |
| -q, --quiet | Minimal output |
| --progress-style <STYLE> | Progress display style: panel, pipeline, minimal, verbose |
| --help-all | Show all commands including hidden ones |
| --no-auto-update | Disable the update check for this invocation |