AI Agent Skill
Use GPU CLI from Claude Code, Cursor, Cline, and other AI coding assistants
The GPU CLI skill lets you create and run GPU workloads using natural language in AI coding assistants. Instead of writing configuration manually, just describe what you want to build.
Works with Claude Code, Cursor, Cline, and any agent supporting skills.sh.
Installation
Install using skills.sh:
npx skills add gpu-cli/gpu
What It Does
Once installed, Claude can:
- Create GPU projects from natural language descriptions
- Debug GPU errors with specific fixes
- Optimize costs by recommending the right GPU
- Configure gpu.jsonc files correctly (see the sketch after this list)
- Manage volumes and persistent storage
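For orientation, a gpu.jsonc might look something like the sketch below. Every field name here (name, gpu, volumes, run) is a hypothetical placeholder rather than the CLI's actual schema; the skill's config.md reference documents the real options.

```jsonc
// Illustrative sketch only — field names are hypothetical, see config.md for the real schema
{
  "name": "flux-image-gen",       // placeholder project name
  "gpu": "RTX 4090",              // placeholder GPU selection
  "volumes": ["models"],          // placeholder persistent storage
  "run": "python generate.py"     // placeholder entry command
}
```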
Usage
Automatic Invocation
Just describe what you want. The skill is automatically triggered when you mention:
- Running ML/AI on cloud GPUs
- Creating GPU projects (ComfyUI, vLLM, training)
- GPU CLI errors (OOM, sync, connection issues)
- GPU cost optimization
- Configuration help
"I want to run Llama 70B inference"
"Train a LoRA on my custom dataset"
"I'm getting CUDA out of memory errors"
"What GPU should I use for FLUX?"
Manual Invocation
Invoke directly with:
/gpu
Examples
Create an Image Generation Project
You: I want to generate images with FLUX on a cloud GPU
Claude: [Creates gpu.jsonc, requirements.txt, and generate.py]
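As a rough sketch of what a generated generate.py could look like, here is a minimal FLUX text-to-image script against the Hugging Face diffusers FluxPipeline API; the model ID, prompt, and sampler settings are assumptions, and the script the skill actually writes may differ.

```python
# generate.py — minimal FLUX.1 dev text-to-image sketch (illustrative, not the skill's exact output)
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",   # assumed model ID
    torch_dtype=torch.bfloat16,
)
# Offload idle components to CPU so the pipeline fits on a 24GB card
pipe.enable_model_cpu_offload()

image = pipe(
    "a photograph of a red fox in the snow",  # placeholder prompt
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("output.png")
```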
Debug an Error
You: CUDA out of memory when running FLUX
Claude: [Suggests batch size reduction, fp16, or bigger GPU]
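In practice those suggestions map to code changes along these lines. This is a sketch against the diffusers API (the model ID, prompt, and sizes are placeholders); the skill tailors its fix to your actual script.

```python
# Common CUDA OOM mitigations for a diffusers pipeline (illustrative sketch)
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,        # half precision instead of float32
)
pipe.enable_model_cpu_offload()        # keep idle components off the GPU
pipe.vae.enable_slicing()              # decode the image in slices to cap peak memory

image = pipe(
    "a watercolor fox",                # placeholder prompt
    height=768, width=768,             # lower resolution
    num_images_per_prompt=1,           # effectively batch size 1
).images[0]
image.save("output.png")
```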
Optimize Costs
You: What's the cheapest GPU for running SDXL?
Claude: [Compares GPU options and recommends RTX 4090 at $0.44/hr]
Skill Capabilities
The skill includes reference documentation for:
| Reference | What It Covers |
|---|---|
| create.md | Project templates for common ML tasks |
| debug.md | Error diagnostics and fixes |
| optimize.md | GPU pricing and cost optimization |
| config.md | Configuration file reference |
| volumes.md | Storage and volume management |
GPU Selection Matrix
The skill knows which GPUs work best for different workloads:
| Workload | Recommended GPU | VRAM |
|---|---|---|
| FLUX.1 dev | RTX 4090 | 24GB |
| SDXL | RTX 4090 | 24GB |
| Llama 3.1 8B | RTX 4090 | 24GB |
| Llama 3.1 70B | A100 80GB | 80GB |
| Video generation | A100 80GB | 80GB |
Ready-to-Use Templates
The skill includes templates for common use cases:
- hello-gpu — Verify GPU access works
- ollama-models — Run Ollama with any model
- vllm-models — High-performance LLM inference
- background-removal — Remove image backgrounds
- whisper-transcription — Audio transcription (see the sketch below)
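To give a sense of what a template amounts to, the whisper-transcription case reduces to something like the following sketch using the openai-whisper package; the model size and file path are placeholders, and the template's actual code may differ.

```python
# Minimal audio transcription sketch (illustrative; not the template's exact code)
import whisper

model = whisper.load_model("base")      # model size is an assumption
result = model.transcribe("audio.mp3")  # placeholder input path
print(result["text"])
```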
Requirements
- An AI coding assistant that supports skills.sh (Claude Code, Cursor, Cline, etc.)
- GPU CLI installed and authenticated
- RunPod API key configured (gpu auth login)
Source Code
The skill is open source: github.com/gpu-cli/gpu