Quickstart
Install GPU CLI, authenticate, and run your first pod or LLM workflow
This guide walks you through installing GPU CLI, authenticating, and running your first workload on a cloud GPU.
1. Install GPU CLI
```shell
curl -fsSL https://gpu-cli.sh/install.sh | sh
```

This installs the `gpu` command to `~/.gpu-cli/bin` and adds it to your PATH.
2. Log In to GPU CLI
```shell
gpu login
```

This opens your browser to authenticate. After signing in, your CLI is connected to your GPU CLI account.
3. Connect Your GPU Provider
GPU CLI uses your provider API key to provision GPUs. The default hosted path today is RunPod, and you pay your provider directly.
- Get your API key from RunPod Settings
- Add it to GPU CLI:
```shell
gpu auth login
```

Follow the prompts to enter your API key.
4. Check Available GPUs
See what GPUs are available and their prices:
```shell
gpu inventory
```

Filter to show only available GPUs:

```shell
gpu inventory --available
```

5. Run Your First Command
Test that everything works:
```shell
gpu run python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}, Device: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else \"N/A\"}')"
```

GPU CLI will:
- Provision a GPU pod (auto-selects best available)
- Run your command
- Stream output back to your terminal
- Auto-stop the pod after 5 minutes of idle
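The one-liner above can also live in a small script. Here is a standalone sketch of the same check (not part of GPU CLI itself) that degrades gracefully when PyTorch or a GPU is missing, so you can run it locally first and then on a pod:

```python
# cuda_check.py — standalone version of the one-line CUDA check above.
# Safe to run on CPU-only machines: it simply reports CUDA as unavailable.

def cuda_report():
    try:
        import torch
    except ImportError:
        return "PyTorch not installed"
    if torch.cuda.is_available():
        return f"CUDA: True, Device: {torch.cuda.get_device_name(0)}"
    return "CUDA: False, Device: N/A"

if __name__ == "__main__":
    print(cuda_report())
```

Run it locally with `python cuda_check.py`, then on a pod with `gpu run python cuda_check.py` and compare the output.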
6. Initialize a Project
For a real project, initialize GPU CLI in your project directory:
cd my-ml-project
```shell
gpu init
```

This creates a `gpu.jsonc` configuration file. You can customize:
- GPU type
- Output patterns to sync back
- Port forwarding
- And more
See Configuration for all options.
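As an illustration of the kinds of settings `gpu.jsonc` might hold, here is a sketch. The field names below are invented for this example and may not match the actual schema; consult the Configuration reference for the real options:

```jsonc
// gpu.jsonc — illustrative only; field names are hypothetical.
{
  // which GPU type to request (hypothetical field)
  "gpu": "A100",
  // glob patterns for outputs to sync back (hypothetical field)
  "outputs": ["checkpoints/**", "logs/*.txt"],
  // pod ports to forward locally (hypothetical field)
  "ports": [8888]
}
```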
7. Run a Training Script
```shell
gpu run python train.py
```

Your code syncs to the pod, runs on the GPU, and outputs sync back automatically.
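If you don't have a training script handy, a minimal stand-in is enough to verify that code syncs up and outputs sync back. The sketch below uses only the standard library (no GPU required); replace the loop body with your real framework code:

```python
# train.py — minimal placeholder training script for testing `gpu run`.
# Fits the toy function y = 3x with SGD and writes an artifact to disk,
# so output sync has a file to pull back.
import json
import random

def train(steps=100, lr=0.1):
    w = 0.0
    for _ in range(steps):
        x = random.uniform(-1, 1)
        # gradient of the squared error (w*x - 3*x)^2 with respect to w
        grad = 2 * (w * x - 3 * x) * x
        w -= lr * grad
    return w

if __name__ == "__main__":
    w = train()
    with open("model.json", "w") as f:
        json.dump({"weight": w}, f)
    print(f"final weight: {w:.3f}")
```

After `gpu run python train.py`, you should see `model.json` appear locally once outputs sync back (assuming your output patterns include it).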
8. Try the LLM Workflow
GPU CLI also ships a routed LLM workflow for Ollama and vLLM:
```shell
gpu llm run
```

Use this for a pod-based chat UI and local API surface that auto-resumes on request. See LLM Inference for the full guide.
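Once the workflow is up, you can talk to the local API surface from code. The sketch below assumes it exposes an OpenAI-compatible `/v1/chat/completions` endpoint (the usual shape for Ollama and vLLM servers); the base URL and model name are illustrative, not confirmed by this guide:

```python
# Sketch of a chat completion request against a local OpenAI-compatible
# endpoint. Base URL and model name below are placeholders — substitute
# whatever `gpu llm run` reports for your setup.
import json
import urllib.request

def chat_request(base_url, model, prompt):
    """Build (but don't send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    req = chat_request("http://localhost:11434", "llama3", "Hello!")
    # Uncomment once the endpoint is live:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp)["choices"][0]["message"]["content"])
```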
9. Check Status
See what's running:
```shell
gpu status
```

Or open the interactive dashboard:

```shell
gpu dashboard
```

Next Steps
- LLM Inference — Run Ollama or vLLM with wake-on-request routing
- Configuration Reference — Customize your gpu.jsonc
- Commands Reference — All CLI commands
- Organizations & Service Accounts — Organizations, service accounts, and CI/CD tokens
- Hosted Proxy Quickstart — Create a shared OpenAI-compatible endpoint for your org
- Troubleshooting — Common issues and solutions