GPU CLI

AI Agent Skill

Use GPU CLI from Claude Code, Cursor, Cline, and other AI coding assistants

The GPU CLI skill lets you create and run GPU workloads using natural language in AI coding assistants. Instead of writing configuration manually, just describe what you want to build.

Works with Claude Code, Cursor, Cline, and any agent supporting skills.sh.

Installation

Install using skills.sh:

npx skills add gpu-cli/gpu

What It Does

Once installed, Claude can:

  • Create GPU projects from natural language descriptions
  • Debug GPU errors with specific fixes
  • Optimize costs by recommending the right GPU
  • Configure gpu.jsonc files correctly (see the sketch after this list)
  • Manage volumes and persistent storage
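
As a rough illustration of the configuration point, a gpu.jsonc might look something like the sketch below. The field names are assumptions invented for this example, not the documented GPU CLI schema; the skill's bundled config.md reference covers the real fields.

    // gpu.jsonc — illustrative sketch only; these field names are assumptions,
    // not the documented GPU CLI schema (config.md has the real reference)
    {
      "gpu": "RTX 4090",               // assumed: GPU type to request
      "pip": "requirements.txt",       // assumed: Python dependencies to install
      "command": "python generate.py"  // assumed: entry point to run remotely
    }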

Usage

Automatic Invocation

Just describe what you want. The skill is automatically triggered when you mention:

  • Running ML/AI on cloud GPUs
  • Creating GPU projects (ComfyUI, vLLM, training)
  • GPU CLI errors (OOM, sync, connection issues)
  • GPU cost optimization
  • Configuration help

Example prompts:

"I want to run Llama 70B inference"
"Train a LoRA on my custom dataset"
"I'm getting CUDA out of memory errors"
"What GPU should I use for FLUX?"

Manual Invocation

Invoke directly with:

/gpu

Examples

Create an Image Generation Project

You: I want to generate images with FLUX on a cloud GPU

Claude: [Creates gpu.jsonc, requirements.txt, and generate.py]
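
The exact files depend on the request, but a minimal generate.py along these lines is one plausible shape, assuming the Hugging Face diffusers library and the FLUX.1-dev checkpoint; this is a sketch, not the skill's guaranteed output:

    # generate.py — minimal FLUX text-to-image sketch (assumes diffusers and a CUDA GPU)
    import torch
    from diffusers import FluxPipeline

    # Load FLUX.1-dev in bfloat16 and move it onto the GPU
    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    )
    pipe.to("cuda")

    # Generate a single image and save it to disk
    image = pipe(
        "a studio photo of a vintage camera on a wooden desk",
        num_inference_steps=28,
        guidance_scale=3.5,
    ).images[0]
    image.save("output.png")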

Debug an Error

You: CUDA out of memory when running FLUX

Claude: [Suggests batch size reduction, fp16, or bigger GPU]
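
As a rough sketch of what those fixes look like in code (again assuming diffusers, not the skill's literal output), lower precision, CPU offload, and generating one modest-resolution image at a time all cut peak VRAM:

    # OOM mitigations for a diffusers FLUX pipeline (illustrative sketch)
    import torch
    from diffusers import FluxPipeline

    # 1. Half precision instead of fp32 roughly halves weight memory
    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.float16
    )

    # 2. Keep only the active sub-model on the GPU; the rest waits in CPU RAM
    pipe.enable_model_cpu_offload()

    # 3. One image at a modest resolution instead of a large batch
    image = pipe(
        "a product photo of a ceramic mug",
        height=768,
        width=768,
        num_inference_steps=28,
    ).images[0]
    image.save("mug.png")

If none of that is enough, the remaining option is the larger GPU the skill suggests (see the selection matrix below).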

Optimize Costs

You: What's the cheapest GPU for running SDXL?

Claude: [Compares GPU options and recommends RTX 4090 at $0.44/hr]

Skill Capabilities

The skill includes reference documentation for:

Reference     What It Covers
create.md     Project templates for common ML tasks
debug.md      Error diagnostics and fixes
optimize.md   GPU pricing and cost optimization
config.md     Configuration file reference
volumes.md    Storage and volume management

GPU Selection Matrix

The skill knows which GPUs work best for different workloads:

Workload            Recommended GPU    VRAM
FLUX.1 dev          RTX 4090           24GB
SDXL                RTX 4090           24GB
Llama 3.1 8B        RTX 4090           24GB
Llama 3.1 70B       A100 80GB          80GB
Video generation    A100 80GB          80GB

Ready-to-Use Templates

The skill includes templates for common use cases:

  • hello-gpu — Verify GPU access works
  • ollama-models — Run Ollama with any model
  • vllm-models — High-performance LLM inference
  • background-removal — Remove image backgrounds
  • whisper-transcription — Audio transcription

Requirements

  • An AI coding assistant that supports skills.sh (Claude Code, Cursor, Cline, etc.)
  • GPU CLI installed and authenticated
  • RunPod API key configured (gpu auth login)
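
In practice, setup boils down to the two commands already shown on this page:

    # Install the skill, then authenticate GPU CLI with your RunPod API key
    npx skills add gpu-cli/gpu
    gpu auth login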

Source Code

The skill is open source: github.com/gpu-cli/gpu
