Is GPU CLI really free?

Yes. GPU CLI is free forever, no account required. Bring your own GPU or a provider API key. You pay the provider directly at their rates, zero markup. Every command, every provider integration, every workflow feature ships in the free CLI, with no tier-gated features, no usage caps from us, and no upsell paywalls inside the tool. Separate paid products like managed inference are on the pricing roadmap.

Will auto-stop kill my training?

No. Auto-stop only triggers after your command completes. Active GPU work is never interrupted. Your training runs to completion, then auto-stop kicks in five minutes after the last activity.

What happens when I close my laptop?

Your job keeps running. Reconnect anytime to see progress. Output files sync to your machine automatically as they are created.

How is this different from runpodctl?

runpodctl is great for RunPod infrastructure management. GPU CLI adds a development workflow layer across providers: instant relay connections, automatic file syncing, and auto-stop to keep costs down.

Can GPU CLI see my code?

No. Your code syncs directly to your GPU provider over SSH or secure tunnels. We only see operational metadata such as duration, GPU type, and cost, never your actual code or outputs.

How much does a typical run cost?

It depends on your provider and the GPU you choose. On common providers, a two-hour A100 run is often around $4. You pay the provider directly at their rate. GPU CLI itself is free.

What about paid services?

Paid services for teams are on the roadmap: managed model endpoints, team workspaces, hard spend caps, and priority support. They will be additions for teams that need more, never paywalls around features you already use.

Tags: gpu, infrastructure, cost-analysis, ml-engineering

Should I rent or buy a GPU?

Angus Bezzina

June 10, 2026

An honest 3-year break-even on an RTX 5090 rig and an 8-GPU H100 server. The math lands around 45-55% sustained utilization, and most teams run below it. Rent by default. Buy only when you can prove the duty cycle and own the datacenter.

Renting or buying a GPU looks like a simple price comparison until you run the numbers. A RunPod 5090 pod is $0.99 an hour. The cheapest 5090 you can actually buy in mid-2026 runs about $4,200, well above its $1,999 launch price after a year of memory shortages, and you still need a machine around it. So the real question isn't whether renting is cheaper per hour. It's how many hours you have to run that card before owning it pays off, and whether you'll actually run it that long.

Most people never do the second half of that calculation. They buy the card, it idles five nights out of the week, and the depreciation clock runs the whole time anyway.

Our position, stated plainly so you can disagree with it, is that you should rent GPUs by default. Buy only when you can *prove* a single accelerator class will run above roughly 50% sustained utilization for at least 18 unbroken months, and you already own the power, space, and ops to host it. Anything short of that and you are pre-paying for idle time on a depreciating asset whose frontier replacement ships in about twelve months. NVIDIA went annual: Hopper in 2022, Blackwell in 2024, Blackwell Ultra in 2025, Rubin later this year. The card you buy today is competitively stale within a year, and you still owe its full depreciation.

The rule only earns trust if the numbers do, so let's do the arithmetic.

An RTX 5090 rig: the cheap-card case

Build it the cheapest sane way. About $4,200 for the 5090, plus a workstation around it (processor, motherboard, memory, power supply, case, and a solid-state drive) at roughly $1,500. That's about $5,700 out the door.

Consumer cards hold their value better than datacenter parts. Assume the card is still worth 40% of its price after three years (its residual value), since a Rubin-era successor will exist by then. That's about $1,680 back, so your net up-front cost, what accountants call capex, is about $4,020.

Now power. The card pulls 575 watts, the rest of the system adds about 150, so 725 watts at the wall. At home there is almost no overhead for cooling and power delivery, so nearly all of that electricity reaches the chip. US residential electricity runs about 17.65 cents per kilowatt-hour, which works out to roughly $0.128 for every hour the rig is actually working.

Set owned cost equal to rent:

    $4,020 = ($0.99 - $0.128) x H
    H = 4,664 active hours over 3 years
      = ~130 hours/month
      = an 18% duty cycle

So the break-even is 130 hours a month. Below that, the rig is a worse deal than RunPod, and worse still than the Vast marketplace floor, which sits lower again. An 18% duty cycle (the share of the time the card is actually working) is a low bar, which matters when we get to the honest case for buying. But notice it's a *real* bar, not a vibe.
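To rerun that arithmetic with your own prices, here is a minimal Python sketch. It uses only the figures quoted above. The function name `break_even_hours` and the 730-hour average month are illustrative assumptions, not part of any tool.

```python
# Break-even sketch for the RTX 5090 rig, using the figures quoted in the text.
HOURS_PER_MONTH = 730  # assumption: average month (8,760 hours / 12)


def break_even_hours(net_capex: float, rent_per_hour: float, power_per_hour: float) -> float:
    """Active hours at which owning has cost the same as renting."""
    saving_per_hour = rent_per_hour - power_per_hour
    if saving_per_hour <= 0:
        raise ValueError("rent is below the power bill alone; owning never breaks even")
    return net_capex / saving_per_hour


rig_price = 4200 + 1500          # card plus workstation
resale = 0.40 * 4200             # 40% residual on the card after three years
net_capex = rig_price - resale   # about $4,020
power_per_hour = 0.725 * 0.1765  # 725 W at 17.65 cents per kWh

hours = break_even_hours(net_capex, rent_per_hour=0.99, power_per_hour=power_per_hour)
per_month = hours / 36                      # three years of months
duty_cycle = per_month / HOURS_PER_MONTH

print(f"{hours:,.0f} active hours over 3 years")
print(f"{per_month:.0f} hours per month, {duty_cycle:.0%} duty cycle")
```

Swap in your own electricity rate, resale assumption, or rent benchmark to see how far the line moves. The result may differ from the text by an hour or so because the text rounds the power cost to $0.128.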

An eight-GPU H100 server: the expensive-card case

This is where the intuition "I'll run it a lot, so I should own it" goes to die.

A complete eight-GPU H100 server routinely runs north of $350,000. Resale is brutal. Used H100s collapsed roughly 85% from the 2023 peak near $40k, down to the $6k to $15k range by 2026. Call it 25% of its price left by year three, so $87,500 back. Net up-front cost is $262,500, or $32,813 per GPU.

Power works differently here, because the server lives in a colocation facility (a colo, the rented datacenter space most teams use instead of a closet). Eight cards at 700 watts each, plus about 2,000 watts of server overhead, is 7,600 watts of computing load. A datacenter spends extra electricity on cooling and power delivery, a factor called power usage effectiveness (PUE), so at a typical 1.54 you are really drawing 11,704 watts at the wall. At US commercial electricity of 14.37 cents per kilowatt-hour that is $1.68 an hour for the whole server, or about $0.21 per GPU per hour. Add the rack rental, the network connection, and on-site staff at roughly $1,500 a month per rack, another $0.26 per GPU per hour.

Spread the $32,813 up-front cost per GPU across three years (26,280 hours) of running flat out and you get $1.25 per GPU per hour. Stack it up:

    Owned all-in at full tilt: $1.25 (up-front) + $0.21 (power) + $0.26 (colo) = ~$1.72 per GPU-hour

That's at 100% utilization. Now compare against what the GPU cloud providers (the newer ones are often called neoclouds) charge on demand, meaning pay-as-you-go with no commitment, and solve for the duty cycle where owning costs the same as renting:

| Rent benchmark | $ per GPU-hour | Break-even duty | Hours/month per GPU |
|---|---|---|---|
| RunPod H100 SXM | $3.29 | 44% | ~323 |
| RunPod H100 PCIe | $2.89 | 52% | ~376 |
| Nebius reserved | $2.00 | 82% | ~597 |
| Marketplace / spot | $1.03 to $1.99 | 82% to never | - |

A note on the labels: SXM and PCIe are two versions of the same H100, with SXM the faster and pricier one. Reserved (also called committed) pricing is a prepaid discount for booking capacity ahead of time, and spot or marketplace capacity is cheap leftover supply that can be reclaimed at any moment.

Read the bottom row carefully. At $1.99 you need an 82% duty cycle before owning catches up. At $1.03 spot, the rent rate sits below your owned all-in cost of $1.72 even at 100% utilization, so owning never catches up at all. You'd be paying the up-front cost on top of power and the colo fee while the renter pays nothing the moment the card goes idle.
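The break-even column can be reproduced with a few lines of Python. This is a sketch under one stated assumption: the table's percentages come out when the up-front cost is spread over active hours only, while power and colo are treated as flat per-hour charges. That reading is inferred from the numbers, not stated in the text, and the names `break_even_duty` and `benchmarks` are illustrative.

```python
from typing import Optional

HOURS_IN_3_YEARS = 3 * 8760   # 26,280 hours
HOURS_PER_MONTH = 730         # assumption: average month

capex_per_gpu = 350_000 * (1 - 0.25) / 8             # $32,812.50 net of resale
capex_per_hour = capex_per_gpu / HOURS_IN_3_YEARS    # about $1.25 at 100% utilization
running_per_hour = 0.21 + 0.26                       # power + colo, as quoted above


def break_even_duty(rent_per_hour: float) -> Optional[float]:
    """Duty cycle where owned cost per active hour equals the rent rate.

    Returns None when renting stays cheaper even at 100% utilization.
    """
    margin = rent_per_hour - running_per_hour
    if margin <= 0:
        return None
    duty = capex_per_hour / margin
    return duty if duty <= 1.0 else None


benchmarks = {
    "RunPod H100 SXM": 3.29,
    "RunPod H100 PCIe": 2.89,
    "Nebius reserved": 2.00,
    "Marketplace high": 1.99,
    "Marketplace low": 1.03,
}

for name, rent in benchmarks.items():
    duty = break_even_duty(rent)
    if duty is None:
        print(f"{name}: ${rent:.2f}/hr, owning never catches up")
    else:
        print(f"{name}: ${rent:.2f}/hr, {duty:.0%} duty, ~{duty * HOURS_PER_MONTH:.0f} hours/month")
```

Expect the hours column to land within an hour or so of the table, since the table rounds intermediate values.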

There is one configuration where the H100 math gets comfortable. A lab that owns its datacenter pays industrial power around 8.54 cents per kilowatt-hour and no colo fee. That drops owned all-in to about $1.37 per GPU per hour and pulls break-even down to roughly 39% against $3.29 rent and 45% against $2.89. Owning the building is the single biggest lever in the whole calculation. It's also exactly the thing most teams do not have.
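Here is a short sketch of how the electricity rate feeds through to those figures. It assumes the owned datacenter runs at the same 1.54 PUE as the colo case. The text does not say which PUE it used, but that value reproduces the $1.37 figure. The helper name `power_cost_per_gpu_hour` is illustrative.

```python
def power_cost_per_gpu_hour(it_load_watts: float, pue: float,
                            cents_per_kwh: float, gpus: int = 8) -> float:
    """Electricity cost per GPU per hour, including cooling and delivery overhead."""
    wall_kw = it_load_watts * pue / 1000      # power drawn at the wall, in kW
    return wall_kw * (cents_per_kwh / 100) / gpus


IT_LOAD_WATTS = 8 * 700 + 2000                # 7,600 W of computing load
capex_per_hour = 262_500 / 8 / (3 * 8760)     # about $1.25 per GPU-hour

colo_power = power_cost_per_gpu_hour(IT_LOAD_WATTS, pue=1.54, cents_per_kwh=14.37)
owned_power = power_cost_per_gpu_hour(IT_LOAD_WATTS, pue=1.54, cents_per_kwh=8.54)  # assumed PUE

owned_all_in = capex_per_hour + owned_power   # no colo fee in this configuration

print(f"colo power:  ${colo_power:.2f} per GPU-hour")
print(f"owned power: ${owned_power:.2f} per GPU-hour")
print(f"owned all-in at full tilt: ${owned_all_in:.2f} per GPU-hour")

for rent in (3.29, 2.89):
    duty = capex_per_hour / (rent - owned_power)
    print(f"break-even against ${rent:.2f} rent: {duty:.0%}")
```

A lab with a more efficient facility would plug in a lower PUE, which pushes the break-even down a little further.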

Where the line actually falls

For the H100 server, the honest break-even sits around 45-55% sustained utilization at typical neocloud on-demand rates; the 5090 rig's roughly 18% is the low-bar exception. And here's the inconvenient part. Most organizations run 30-50% average utilization once you subtract dev cycles, idle nights, failed runs, and the gaps between experiments. The median buyer lands below their own break-even. They feel like they're using the card hard. The duty-cycle log says otherwise.
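To see what running below break-even costs, here is a minimal sketch of the owned H100's cost per hour of actual work at different duty cycles. It reuses the $1.25 up-front and $0.47 power-plus-colo figures from the server example and treats them the same way the break-even table appears to. The function name is illustrative.

```python
def owned_cost_per_active_hour(duty: float, capex_per_hour: float = 1.25,
                               running_per_hour: float = 0.47) -> float:
    """Owned cost per hour of real work when the card is busy only `duty` of the time."""
    if not 0 < duty <= 1:
        raise ValueError("duty must be in (0, 1]")
    # Idle hours still carry their share of the up-front cost.
    return capex_per_hour / duty + running_per_hour


for duty in (0.30, 0.40, 0.50, 1.00):
    cost = owned_cost_per_active_hour(duty)
    print(f"{duty:.0%} duty: ${cost:.2f} per active GPU-hour")
```

Compare each line against the $2.89 to $3.29 on-demand rates: the lower the duty cycle, the more each working hour has to carry.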

This tracks with what other people doing the math have found. Spheron puts the eight-GPU H100 break-even near 53% sustained. CloudZero's rule of thumb is rent under 40 hours a week, and buy only for around-the-clock use over 18+ months with infrastructure you already own. The exact figure moves with what you count and which rental rate you compare against. The cheaper the rent benchmark, the harder owning is to justify, and rent benchmarks in mid-2026 are cheap.

The asymmetry under the math

The numbers tell you what. The why is structural, and it's the part that holds even when prices move.

Renting is a call option. Buying is a forward commitment. When you rent, your downside is bounded: you scale to zero and pay nothing on an idle GPU, while you keep the upside of swapping to next year's silicon the day it lists. When you buy, you've written yourself a short option. You're obligated to absorb 100% of the asset's depreciation regardless of how much you compute, and you're short the refresh cadence.

Five things compound on the buyer's side.

Optionality. A renter changes card class per project. An owner is locked to one amount of video memory (VRAM) and one performance level for the asset's life.
The annual refresh. Each generation's jump in performance per dollar strands your card, and Jensen Huang's own line is that "when Blackwell starts shipping in volume, you couldn't give Hoppers away."
Depreciation risk. Used H100s fell about 85%, and the renter offloads exactly that risk to the provider.
Idle time. Depreciation runs whether the GPU computes or not, so the owner eats nights and weekends at full freight.
Scale-to-zero. The renter's floor cost is $0. The owner's floor is the full up-front cost, no matter what.

The provider, not you, is the efficient bearer of depreciation, idle, and obsolescence risk, because they pool demand across many tenants and keep an old card busy. Renting moves those risks to whoever can absorb them cheapest.

The honest case for buying

The rule isn't anti-ownership. It's a burden-of-proof rule. Buy when you can clear the bar, and there are real configurations that do.

Buy when

You can *demonstrate*, not hope for, above 50% sustained duty on one fixed accelerator class for 18+ months. That's the only regime where the H100 server math (44-52% break-even versus on-demand cloud rates) favors ownership.
You own or control the datacenter: industrial power, rack space, cooling, ops staff. This is the lever that drops H100 break-even to 39-45%.
Data gravity or regulation pins the workload: air-gapped, sovereign, classified, or petabyte training sets where egress dominates the bill. Here the financial break-even is moot, because renting is simply not permitted. Buy because you must, and be honest that you're paying a premium for the privilege. a16z's own "build a personal AI workstation" piece makes the case on control and privacy and gives no cost break-even, precisely because there isn't a favorable one.
Latency is the product: response loops under 10 milliseconds, on-device, edge, robotics. That justifies a 5090 or RTX PRO 6000 workstation at the edge, not a $350,000 server.
You're a solo developer or small lab whose real, honest use clears the cheap-card line. For a 5090 that's just about 130 hours a month, low enough that hobbyist-heavy use clears it, and the always-on, no-cold-start, no-metered-clock sandbox genuinely changes how much you experiment. On a $5,700 rig, "I'll learn more" is a defensible tiebreaker. On a six-figure server it's rounding error against the depreciation.
You can lock supply in a genuinely scarce part. Blackwell is sold out through mid-2026, with the high-bandwidth memory it needs booked out just as far. That's a capacity bet, not a cost bet, and the rule permits it as a named exception.

Rent when

Almost everywhere else. Rent when utilization is variable, bursty, or unknown, which describes every team before product-market fit. Rent when you're below your break-even duty: under ~130 hours a month for a 5090, under 45-55% sustained for an H100. Rent when you need to match the video memory to each project, so you prototype on a 4090 at $0.69 an hour, fine-tune on an H100, then do a one-off big run on a B200 at $3.95 to $5.89 an hour. Rent when you want to ride the refresh and capture each generation's gain in performance per dollar without eating the prior generation's depreciation. Rent when you lack datacenter infrastructure and would otherwise pay colo power, space, and hands.

The strongest counter, handled honestly

"Rental prices are surging, so I should lock in by buying." This is the best argument against the rule, and it earns a carve-out. By some measures GPU rental prices have roughly doubled since the start of 2026, the tightest squeeze since the original H100 shortage, with Blackwell chips and the high-bandwidth memory (HBM) they depend on both sold out well into the year. a16z's Anjney Midha calls it "the covid of compute, and all the toilet paper is gone."

Two responses. First, volatility cuts both ways. H100 on-demand also fell 64-75% from its 2024 peak of $7-10/hr, and anyone who bought at the top badly overpaid. Second, the right hedge for price risk is usually a reserved rental contract (Nebius committed around $2.00, with reserved rates on the major neoclouds running well under on-demand), which locks the price without marrying you to depreciating silicon. Only buy on this thesis if guaranteed access to a scarce part is itself strategic, in which case you are explicitly choosing to pay a depreciation premium for supply certainty. Name it for what it is.

Then there's the depreciation-isn't-that-bad counter, the Bernstein point that a five-year-old A100 still earns about $0.93/hr against about $0.28/hr cash cost. True, but notice who captures that margin: a fleet operator running the card at high pooled utilization, which is to say a rental provider. A single owner at 30-50% utilization never realizes $0.93/hr. The depreciation curve being survivable for a pooler is exactly why handing the asset to a pooler beats owning it yourself. It's the renter's thesis in disguise.

The verdict, in three real shapes

A solo developer running small fine-tuning jobs (the kind known as LoRA) around 40 hours a month: renting on RunPod is about $475 a year, roughly $1,426 over three years, versus about $4,200 all-in to own the rig (the $4,020 net up-front cost plus about $184 of electricity). Renting is roughly 3x cheaper, and he can grab an H100 for the occasional big run he could never afford to own. Rent, until his real usage climbs past about 130 hours a month.
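The solo-developer comparison is easy to rerun for your own hours. This is a minimal sketch using the 5090 figures from earlier in the piece; `three_year_costs` is an illustrative name, and the defaults are this article's numbers, not universal ones.

```python
from typing import Tuple


def three_year_costs(hours_per_month: float, rent_per_hour: float = 0.99,
                     net_capex: float = 4020.0,
                     power_per_hour: float = 0.128) -> Tuple[float, float]:
    """Return (rent_total, own_total) in dollars over 36 months."""
    active_hours = hours_per_month * 36
    rent_total = active_hours * rent_per_hour
    own_total = net_capex + active_hours * power_per_hour
    return rent_total, own_total


for hours in (40, 130, 200):
    rent_total, own_total = three_year_costs(hours)
    cheaper = "rent" if rent_total < own_total else "own"
    print(f"{hours} h/month: rent ${rent_total:,.0f} vs own ${own_total:,.0f} -> {cheaper}")
```

Around 130 hours a month the two totals are nearly equal, so small changes in the rent rate or resale value decide which side wins there.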

A startup with steady inference at ~60% duty on one H100: above the on-demand break-even, yes. But they have no datacenter, so colo costs push real owned cost up, and a committed contract around $2.00 captures most of the savings with zero depreciation or single-point-of-failure exposure. Rent on a committed contract now, and revisit buying only if duty proves out above ~55% and they acquire hosting. This is the genuine knife-edge the rule exists to discipline.

A research lab training near-continuously on an eight-GPU H100 server it owns, paying industrial power, sustaining 85-90%: owned all-in around $1.37 per GPU per hour, decisively under on-demand rent. Buy, and still rent the spiky frontier-class overflow so they never own a card they can't keep busy. This is the one archetype the rule says buy, and it's the rarest.

Rent by default. The burden of proof is on buying.

That prior gets stronger, not weaker, as the friction of renting goes away. Most of what makes a local rig feel nicer than a pod is operational rather than financial: no machine to provision, no remote-login setup to wrestle with, no files to sync by hand, no remembering to shut the thing down. GPU CLI removes most of that, so a rented GPU on RunPod behaves a lot like a machine sitting under your desk, minus the depreciation clock. For most readers, that pushes the break-even even further toward renting than the arithmetic above already puts it.
