Is GPU CLI really free?

Yes, the CLI is free. Every command, every provider integration, every workflow feature inside the CLI. No tier-gated features, no usage caps from us, no upsell paywalls inside the tool. You only pay your GPU provider for compute, at their rate, with no markup from us. Separate paid products like managed inference are on the pricing roadmap.

Will auto-stop kill my training?

No. Auto-stop only triggers after your command completes. Active GPU work is never interrupted. Your training runs to completion, then auto-stop kicks in five minutes after the last activity.
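One way to picture that rule is as a small decision function. This is a sketch of the behavior described above, not GPU CLI's actual implementation; the name `should_auto_stop`, its parameters, and the constant are illustrative assumptions.

```python
# Hypothetical sketch of the auto-stop rule described above.
# Names and structure are assumptions, not GPU CLI internals.

IDLE_GRACE_SECONDS = 5 * 60  # "five minutes after the last activity"


def should_auto_stop(command_running: bool, seconds_since_last_activity: float) -> bool:
    """Return True only when nothing is running and the pod has sat idle long enough."""
    if command_running:
        # Active GPU work is never interrupted.
        return False
    return seconds_since_last_activity >= IDLE_GRACE_SECONDS
```

Under these assumptions, a training job that runs for ten hours is never eligible for auto-stop while it runs; the five-minute clock only starts counting once the command has finished.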

What happens when I close my laptop?

Your job keeps running. Reconnect anytime to see progress. Output files sync to your machine automatically as they are created.

How is this different from runpodctl?

runpodctl is great for RunPod infrastructure management. GPU CLI adds a development workflow layer across providers: instant relay connections, automatic file syncing, and auto-stop to keep costs down.

Can you see my code?

No. Your code syncs directly to your GPU provider over SSH or secure tunnels. We see only operational metadata such as duration, GPU type, and cost, never your actual code or outputs.

How much does a typical run cost?

It depends on your provider and the GPU you choose. On common providers, a two-hour A100 run is often around $4. You pay the provider directly at their rate. GPU CLI itself is free.
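As a rough check on that figure: $4 over two hours implies about $2 per hour. The helper below is a minimal sketch for estimating other runs. The function name is hypothetical, and the rate is just the one implied by the example above, not a quoted price.

```python
def estimate_run_cost(hourly_rate_usd: float, hours: float) -> float:
    """Estimate what you pay the provider: hourly rate times run length."""
    return round(hourly_rate_usd * hours, 2)


# Rate implied by the example above: about $4 for two hours.
implied_a100_rate = 4.00 / 2

two_hour_run = estimate_run_cost(implied_a100_rate, 2)
six_hour_run = estimate_run_cost(implied_a100_rate, 6)
```

Swap in the hourly rate your provider actually shows for the card you pick.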

What about paid services?

Paid services for teams are on the roadmap: managed model endpoints, team workspaces, hard spend caps, and priority support. They will be additions for teams that need more, never paywalls around features you already use.


Tags: gpu, cloud-gpu, availability, multi-cloud

How to Get a GPU When They're Sold Out

James Lal

May 24, 2026

H100s and A100s sell out by the minute in 2026. The fix isn't a cheaper price. It's to stop betting your workflow on one provider's inventory. Here's how to provision across RunPod, Vast, Thunder, and io.net from a single command.

For most of the last decade, renting a GPU was a price question. You picked the cheapest card that fit the job and ran it. In 2026 the question changed. The card you want is often just not there.

You go to launch an H100, and the region is empty. You wait. You refresh. You settle for an A100, or you queue for hours. If you train or fine-tune on rented GPUs, you have lived this.

The instinct is to treat it as a sourcing problem: find the provider with the best price and the most stock. But stock moves by the minute, and no single provider is reliably full. Betting your whole workflow on one provider's inventory is the actual mistake.

Availability is a multi-cloud problem

There are a lot of GPU clouds now. RunPod, Vast.ai, Thunder Compute, io.net, and more. At any given moment, the H100 that is sold out on one is sitting idle on another.

The catch is that each one is its own account, its own CLI, its own SSH setup, its own quirks. Checking three providers by hand every time you need a card is not a workflow. It is a chore you will skip. So people don't shop around. They wait on the provider they already configured, because moving is annoying. That waiting is the gap.

One command, whatever's available

GPU CLI sits on top of the providers you already use. You don't pick a provider up front. You ask for a GPU, and it provisions where that GPU is actually available. Here is what it shows right now for 80GB and larger cards:

[Figure: output of gpu inventory --min-vram 80, showing live pricing and stock for 80GB+ GPUs across RunPod and Vast.ai, from one command.]

Look at what that surfaces. The H200 is cheaper and better stocked on Vast. The H100 is the opposite, cheaper and in stock on RunPod. The A100 only shows up on RunPod at all. The best provider is not fixed. It flips per card and by the hour, which is exactly why picking one provider up front and never looking again leaves both GPUs and savings on the table.
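To make the "it flips per card" point concrete, here is a minimal sketch of picking the cheapest in-stock offer for each GPU. It is not how GPU CLI routes internally. The `Offer` type and `best_offer_per_gpu` function are hypothetical, and every price and stock count is a placeholder invented to mirror the pattern described above, not real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    gpu: str
    provider: str
    usd_per_hour: float
    in_stock: int


def best_offer_per_gpu(offers):
    """Map each GPU model to its cheapest offer that has stock right now."""
    best = {}
    for offer in offers:
        if offer.in_stock <= 0:
            continue  # sold out here, so it cannot win
        current = best.get(offer.gpu)
        if current is None or offer.usd_per_hour < current.usd_per_hour:
            best[offer.gpu] = offer
    return best


# Placeholder numbers, invented for illustration only.
offers = [
    Offer("H200", "RunPod", 4.00, 1),
    Offer("H200", "Vast.ai", 3.00, 5),
    Offer("H100", "RunPod", 2.50, 3),
    Offer("H100", "Vast.ai", 3.00, 0),
    Offer("A100", "RunPod", 1.50, 2),
]

picks = best_offer_per_gpu(offers)
for gpu, offer in picks.items():
    print(f"{gpu}: {offer.provider} at ${offer.usd_per_hour:.2f}/hr")
```

With these made-up inputs the winner differs by card, which is the whole argument: a provider chosen once and never revisited is only right some of the time.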

Then you run your code the same way you would locally:

```bash
gpu run python train.py
```

No provider flag, no console, no second tool to learn. It routes to a provider that has the card, spins it up, syncs your project, and streams your output back as it is written. If you have used RunPod or Vast directly, this is the same hardware underneath. It is the picking and the plumbing that GPU CLI takes off your plate.

Your environment moves with you, not your provider

The reason switching providers is painful by hand is that everything lives on the pod. Your code, your data, your half-finished run. Move providers and you are rebuilding from scratch.

GPU CLI keeps your project synced from your machine and syncs your outputs back as they are created. The pod is disposable. Your work is not tied to one provider's box, so landing on whatever is available costs you nothing. That is what turns "just use a different cloud" from a thing people say into a thing people actually do.

A note on auto-stopping

GPU CLI can auto-stop idle pods, and for day-to-day iteration that saves real money. But the tradeoff is worth being honest about in a shortage: stopping a pod releases the GPU, and if that card is scarce, you may not get the same one back when you resume.

So we don't treat auto-stop as a blanket save-money button. For interactive dev where you are stepping away anyway, stop freely. For a card you fought to get and need again in an hour, holding it can be the cheaper move once you price in the time you would lose re-queuing. The value of being multi-cloud is that you get to make that call, instead of being stuck with one provider's answer.
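The hold-or-stop call can be roughed out as a break-even comparison. The sketch below is one way to frame it, under simplifying assumptions: holding costs the hourly rate for the idle stretch, stopping costs the value of the time you expect to lose re-queuing, and the card comes back at the same price. The function name and all inputs are hypothetical.

```python
def cheaper_to_hold(
    hourly_rate_usd: float,
    idle_hours: float,
    expected_requeue_hours: float,
    your_time_usd_per_hour: float,
) -> bool:
    """Return True when paying for idle time beats the expected cost of re-queuing."""
    hold_cost = hourly_rate_usd * idle_hours
    stop_cost = expected_requeue_hours * your_time_usd_per_hour
    return hold_cost < stop_cost


# Scarce card, back in an hour, re-queue expected to take 30 minutes of your time.
short_break = cheaper_to_hold(2.0, 1.0, 0.5, 50.0)

# Stepping away overnight, card usually easy to get back.
overnight = cheaper_to_hold(2.0, 8.0, 0.1, 50.0)
```

In the first case holding wins, in the second stopping does. The real inputs are your own estimates, which is why this stays a judgment call rather than a default.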

Complementary, not a replacement

GPU CLI is not another GPU cloud. It does not own hardware and it does not mark up anyone else's. RunPod, Vast, Thunder, and io.net are the providers underneath, and they are good at what they do. GPU CLI is the layer that makes using all of them feel like one local command, so availability stops being the thing that blocks your work.

It is free, including for commercial use.

```bash
curl -fsSL https://gpu-cli.sh/install.sh | sh
```

Then gpu run python train.py, and let it find the card.