Is GPU CLI really free?

Yes, the CLI is free. Every command, every provider integration, every workflow feature inside the CLI. No tier-gated features, no usage caps from us, no upsell paywalls inside the tool. You only pay your GPU provider for compute, at their rate, with no markup from us. Separate paid products like managed inference are on the pricing roadmap.

Will auto-stop kill my training?

No. Auto-stop only triggers after your command completes. Active GPU work is never interrupted. Your training runs to completion, then auto-stop kicks in five minutes after the last activity.

What happens when I close my laptop?

Your job keeps running. Reconnect anytime to see progress. Output files sync to your machine automatically as they are created.

How is this different from runpodctl?

runpodctl is great for RunPod infrastructure management. GPU CLI adds a development workflow layer across providers: instant relay connections, automatic file syncing, and auto-stop to keep costs down.

Can you see my code?

No. Your code syncs directly to your GPU provider over SSH or secure tunnels. We only see operational metadata such as duration, GPU type, and cost, never your actual code or outputs.

How much does a typical run cost?

It depends on your provider and the GPU you choose. On common providers, a two-hour A100 run is often around $4. You pay the provider directly at their rate. GPU CLI itself is free.
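As a rough sketch of the arithmetic: a $4 bill for two hours implies a rate of about $2 per GPU-hour. The helper below, `estimate_cost`, is a hypothetical name and not part of GPU CLI, and the rates in the comments are illustrative assumptions rather than quotes from any provider.

```bash
# Sketch: estimate what a run costs at a given hourly rate.
# estimate_cost is a hypothetical helper, not a GPU CLI command.
# Usage: estimate_cost HOURS RATE_PER_HOUR
estimate_cost() {
  # awk handles the decimal math that plain shell arithmetic cannot
  awk -v hours="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", hours * rate }'
}

# Example (illustrative rate only): a two-hour run at $2.00/hour
#   estimate_cost 2 2.00
```

Swap in the hourly rate your provider actually lists for the GPU you pick.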

What about paid services?

Paid services for teams are on the roadmap: managed model endpoints, team workspaces, hard spend caps, and priority support. They will be additions for teams that need more, never paywalls around features you already use.

announcements, gpu, agents, training

GPU CLI Pro is free now, and we added three more clouds

James Lal

May 22, 2026

I train LoRAs almost every day, and I haven't started one of those jobs by hand in weeks. Agents drive them. Today GPU CLI Pro is free for everyone, commercial use included, and we added three new GPU providers.

I train LoRAs nearly every day now for the models behind our other products. The jobs run for hours, but I haven't started one by hand in weeks, because I drive them through Claude, Codex, and Opencode using GPU CLI. We built it with agents in mind, so all an agent has to do is pick a GPU, launch the run, watch it, and tear it down when it's done. You manage results instead of keystrokes.

Built to be driven agentically

A coding agent is good at exactly what a long training run needs: launch a job, watch for it to finish, then shut it down when it's done. The trouble is that most tools assume a person is sitting there: interactive prompts, output meant for eyes instead of parsers, state you can only see by looking.

GPU CLI doesn't assume that. Every command runs without a TTY. Your workspace syncs up to the GPU automatically, and anything a run needs along the way, an env var or an API key, comes with it when you pass --env.

Take a LoRA I'm using right now as an example: a fine-tuned Gemma model trained with Unsloth on a Vast.ai RTX PRO 6000 with 96 GB of VRAM.

```bash
# set up the training env on the pod (one time)
gpu run --no-output python unsloth_env.py install

# launch the run, syncing your workspace up to the GPU
gpu run --no-output --force-sync \
  -- python unsloth_env.py run -- train_lora_unsloth.py \
  --model-id unsloth/gemma-4-E4B-it \
  --max-steps 480 --lora-r 16 --learning-rate 2.1e-4

# wait on it for a few hours, then react when it lands
until gpu logs --tail 40 | grep -qE "train_runtime|OutOfMemoryError"; do
  sleep 90
done
gpu logs --tail 40 | grep -iE "train_runtime|train_loss|eval_loss"

# tear the instance down when the run is done
gpu instances terminate <instance-id>
```

What did creating this involve? Honestly, not much. I just asked the agent to train the model, and with GPU CLI the agent could handle the rest. It fired off the run, monitored the logs for a few hours, caught the run landing (or an out-of-memory error if the GPU was too small), and shut the instance down afterward.
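The "watch it, then react" step can be sketched as a small pair of shell functions. This is a minimal sketch, not what any particular agent runs: `classify_run` and `wait_for_run` are hypothetical helper names, the six-hour default deadline is an assumption, and the only GPU CLI call is the `gpu logs --tail 40` command from the example above. The log markers are the same ones that loop greps for.

```bash
# Sketch: classify a training run from its recent log lines.
# classify_run is a hypothetical helper, not a GPU CLI command.
# It reads log text on stdin and prints: oom, done, or running.
classify_run() {
  run_logs="$(cat)"
  case "$run_logs" in
    *OutOfMemoryError*) echo "oom" ;;      # GPU was too small
    *train_runtime*)    echo "done" ;;     # trainer printed its final summary
    *)                  echo "running" ;;  # neither marker seen yet
  esac
}

# wait_for_run polls the logs until the run lands or a deadline passes.
# Usage: wait_for_run [MAX_WAIT_SECONDS] [POLL_INTERVAL_SECONDS]
# Defaults (assumptions): 6 hours, polling every 90 seconds.
wait_for_run() {
  max_wait="${1:-21600}"
  interval="${2:-90}"
  waited=0
  while [ "$waited" -lt "$max_wait" ]; do
    run_status="$(gpu logs --tail 40 | classify_run)"
    if [ "$run_status" != "running" ]; then
      echo "$run_status"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "timeout"
  return 1
}
```

Separating the outcomes lets the caller branch: on `done` it can read the loss lines and terminate the instance, on `oom` it can retry on a larger card, and on `timeout` it can stop waiting rather than poll forever.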

A real run on a Vast.ai A5000: launch it, watch it land, tear it down.

This approach has drastically reduced the tedium of training models: a six-hour run costs the same amount of attention as a six-minute one. None.

GPU availability is a real problem

I didn't pick that GPU because it was ideal. I picked it because it was the one with capacity that morning. If you've tried to get a GPU lately you know the rest. The card you want is out of stock, in the region you want it, at the price you want to pay.

The fix isn't a better single provider. It's more of them. We added three:

Vast.ai — a marketplace with 70+ GPU types and spot pricing across 20+ regions
Thunder Compute — A6000, A100, and H100 capacity
io.net — 100+ GPU types on a pre-paid model built for long-running jobs

Together with RunPod, that's four clouds and hundreds of GPU types to pull from. Pass GPU CLI a list of GPUs you'll accept and it works down the list, so when your first choice is out of stock it grabs the next compatible one instead of failing. More places to ask means your job gets a machine instead of a queue.
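The "work down the list" idea can be sketched in a few lines of shell. This illustrates the fallback logic only and is not how GPU CLI implements it: `pick_gpu` and `gpu_in_stock` are hypothetical names, and the availability check reads a comma-separated `IN_STOCK` variable as a stand-in for a real stock lookup.

```bash
# Sketch: walk an ordered list of acceptable GPUs and take the first in stock.
# gpu_in_stock is a hypothetical placeholder for a real availability check.
# Here it reads a comma-separated IN_STOCK variable so the sketch runs offline.
gpu_in_stock() {
  case ",$IN_STOCK," in
    *",$1,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# pick_gpu prints the first acceptable GPU that is in stock,
# or returns 1 if none of them are available.
# Usage: pick_gpu "first choice" "second choice" ...
pick_gpu() {
  for candidate in "$@"; do
    if gpu_in_stock "$candidate"; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# Example: with IN_STOCK="A6000,RTX PRO 6000",
#   pick_gpu "H100" "A100" "A6000"
# skips the two out-of-stock cards and prints A6000.
```

Order matters: the list is a preference ranking, so put the card you most want first and the ones you would merely tolerate last.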

Live GPU stock and pricing across RunPod and Vast.ai.

GPU CLI Pro is now free!

The commercial license, unlimited concurrent sessions, detached jobs without upgrade nags. All of it is free for everyone now, commercial use included.

So what are you waiting for? Try it and let us know what you think!

```bash
curl -fsSL https://gpu-cli.sh/install.sh | sh
```