AI Agent Skill
Use GPU CLI from Claude Code, Cursor, Cline, and other AI coding assistants
The GPU CLI skill lets you create and run GPU workloads using natural language in AI coding assistants. Instead of writing configuration manually, just describe what you want to build.
Works with Claude Code, Cursor, Cline, and any agent supporting skills.sh.
Installation
Install using skills.sh:
npx skills add gpu-cli/gpu
What It Does
Once installed, Claude can:
- Create GPU projects from natural language descriptions
- Debug GPU errors with specific fixes
- Optimize costs by recommending the right GPU
- Configure gpu.jsonc files correctly (see the sketch after this list)
- Manage volumes and persistent storage
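For orientation, a gpu.jsonc might look something like the sketch below. Every field name here (name, gpu, volumes, run) is a hypothetical placeholder rather than the CLI's actual schema; the skill's config.md reference documents the real options.

```jsonc
// Illustrative sketch only — field names are hypothetical, see config.md for the real schema
{
  "name": "flux-image-gen",       // placeholder project name
  "gpu": "RTX 4090",              // placeholder GPU selection
  "volumes": ["models"],          // placeholder persistent storage
  "run": "python generate.py"     // placeholder entry command
}
```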
Usage
Automatic Invocation
Just describe what you want. The skill is automatically triggered when you mention:
- Running ML/AI on cloud GPUs
- Creating GPU projects (ComfyUI, vLLM, training)
- GPU CLI errors (OOM, sync, connection issues)
- GPU cost optimization
- Configuration help
"I want to run Llama 70B inference"
"Train a LoRA on my custom dataset"
"I'm getting CUDA out of memory errors"
"What GPU should I use for FLUX?"
Manual Invocation
Invoke directly with:
/gpu
Examples
Create an Image Generation Project
You: I want to generate images with FLUX on a cloud GPU
Claude: [Creates gpu.jsonc, requirements.txt, and generate.py]
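As a rough sketch of what a generated generate.py could look like, here is a minimal FLUX text-to-image script against the Hugging Face diffusers FluxPipeline API; the model ID, prompt, and sampler settings are assumptions, and the script the skill actually writes may differ.

```python
# generate.py — minimal FLUX.1 dev text-to-image sketch (illustrative, not the skill's exact output)
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",   # assumed model ID
    torch_dtype=torch.bfloat16,
)
# Offload idle components to CPU so the pipeline fits on a 24GB card
pipe.enable_model_cpu_offload()

image = pipe(
    "a photograph of a red fox in the snow",  # placeholder prompt
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("output.png")
```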
Debug an Error
You: CUDA out of memory when running FLUX
Claude: [Suggests batch size reduction, fp16, or bigger GPU]
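In practice those suggestions map to code changes along these lines. This is a sketch against the diffusers API (the model ID, prompt, and sizes are placeholders); the skill tailors its fix to your actual script.

```python
# Common CUDA OOM mitigations for a diffusers pipeline (illustrative sketch)
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,        # half precision instead of float32
)
pipe.enable_model_cpu_offload()        # keep idle components off the GPU
pipe.vae.enable_slicing()              # decode the image in slices to cap peak memory

image = pipe(
    "a watercolor fox",                # placeholder prompt
    height=768, width=768,             # lower resolution
    num_images_per_prompt=1,           # effectively batch size 1
).images[0]
image.save("output.png")
```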
Optimize Costs
You: What's the cheapest GPU for running SDXL?
Claude: [Compares GPU options and recommends RTX 4090 at $0.44/hr]
Skill Capabilities
The skill includes reference documentation for:
| Reference | What It Covers |
|---|---|
| create.md | Project templates for common ML tasks |
| debug.md | Error diagnostics and fixes |
| optimize.md | GPU pricing and cost optimization |
| config.md | Configuration file reference |
| volumes.md | Storage and volume management |
GPU Selection Matrix
The skill knows which GPUs work best for different workloads:
| Workload | Recommended GPU | VRAM |
|---|---|---|
| FLUX.1 dev | RTX 4090 | 24GB |
| SDXL | RTX 4090 | 24GB |
| Llama 3.1 8B | RTX 4090 | 24GB |
| Llama 3.1 70B | A100 80GB | 80GB |
| Video generation | A100 80GB | 80GB |
Ready-to-Use Templates
The skill includes templates for common use cases:
- hello-gpu — Verify GPU access works
- ollama-models — Run Ollama with any model
- vllm-models — High-performance LLM inference
- background-removal — Remove image backgrounds
- whisper-transcription — Audio transcription (see the sketch below)
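To give a sense of what a template amounts to, the whisper-transcription case reduces to something like the following sketch using the openai-whisper package; the model size and file path are placeholders, and the template's actual code may differ.

```python
# Minimal audio transcription sketch (illustrative; not the template's exact code)
import whisper

model = whisper.load_model("base")      # model size is an assumption
result = model.transcribe("audio.mp3")  # placeholder input path
print(result["text"])
```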
Requirements
- An AI coding assistant that supports skills.sh (Claude Code, Cursor, Cline, etc.)
- GPU CLI installed and authenticated
- RunPod API key configured (gpu auth login)
Source Code
The skill is open source: github.com/gpu-cli/gpu