GPU Tech Refresh: When to Upgrade Your AI Cluster

Bernie Margulies

Deciding when to refresh your GPU cluster's hardware comes down to three things: future demand, opportunity cost, and the capital required. These are tricky to judge because H100 1-year rental rates, instead of continuing to fall, have rebounded almost 40%, from $1.70/hr in October to $2.35/hr in March 2026. [1] Future demand and opportunity cost are now much harder for cluster owners to forecast.

Three options

  • Full refresh: Sell the entire fleet, and buy the next generation.
  • Partial refresh: Keep a portion of the fleet and add newer hardware alongside it.
  • Hold: Keep operating the H100 fleet.

The comparison below models all three for a 256-GPU H100 SXM5 fleet purchased in mid-2023, with the loan approaching maturity, and a potential upgrade to B200s. We use B200s as the modeling proxy because pricing and utilization data are available. In practice, the upgrade target could be B300s, GB-NVL72 racks, or whatever is shipping when you pull the trigger.

3-year net profit at 80% utilization

Scenario                   Net profit
Hold (256 H100)            $7.5M
Partial (128+128)          $8.2M
Full refresh (256 B200)    $8.8M

Scenario        3-yr revenue    3-yr costs    Equity required
Hold            $9.6M           $3.3M         $0
Partial         $14.9M          $9.0M         $844K
Full refresh    $20.2M          $14.8M        $1.7M

At the default 80% utilization, full refresh produces the highest 3-year net profit because B200 rates are roughly double H100 rates. Below ~70% utilization, hold becomes the better option: the H100 fleet requires no new capital and has lower downside risk.
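The ~70% crossover can be checked with a rough sketch. It assumes, as a simplification that is not stated in the article, that revenue scales linearly with utilization while costs and residual values stay fixed. The dollar inputs are the rounded 3-year figures at 80% utilization from the scenario P&Ls on this page; the function names are illustrative.

```python
# Rounded 3-year figures at 80% utilization, in $M, from the scenario P&Ls.
# "costs" is the 3-year total cost (net financing cost plus OpEx).
BASE_UTIL = 0.80
SCENARIOS = {
    "hold":    {"revenue": 9.6,  "costs": 3.3,  "residual": 1.2},
    "partial": {"revenue": 14.9, "costs": 9.9,  "residual": 3.2},
    "full":    {"revenue": 20.2, "costs": 16.5, "residual": 5.1},
}


def net_profit(name, util):
    """Net profit in $M, assuming only revenue changes with utilization."""
    s = SCENARIOS[name]
    return s["revenue"] * util / BASE_UTIL - s["costs"] + s["residual"]


def crossover(a, b):
    """Utilization at which scenarios a and b earn the same net profit."""
    sa, sb = SCENARIOS[a], SCENARIOS[b]
    fixed_gap = (sb["costs"] - sb["residual"]) - (sa["costs"] - sa["residual"])
    revenue_gap = (sb["revenue"] - sa["revenue"]) / BASE_UTIL
    return fixed_gap / revenue_gap


print(f"hold vs full crossover: {crossover('hold', 'full'):.0%}")
```

Under these assumptions the hold and full refresh lines cross at roughly 70% utilization. A model in which power cost falls with utilization would shift the crossover somewhat.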

Hold scenario

  • GPU: 256x H100 SXM5
  • Purchased: Mid-2023 at $39,000/GPU (average all-in cost per GPU, including server and networking)
  • Loan status: Paid off or maturing in 2026
  • Y1 / Y2 / Y3 rate: $2.35 / $1.75 / $1.25 per GPU/hr
  • Location: Texas colocation, air-cooled
  • Utilization: 80%
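As a quick check on how these inputs turn into the 3-year revenue figure, here is a minimal sketch. It assumes 8,760 hours per year and that the utilization applies uniformly to every hour; the function name is illustrative.

```python
HOURS_PER_YEAR = 8760  # assumption: non-leap year, no planned outage window


def three_year_revenue(gpus, yearly_rates, utilization):
    """Sum rental revenue over the years in the rate schedule ($/GPU/hr)."""
    return sum(gpus * HOURS_PER_YEAR * utilization * rate for rate in yearly_rates)


hold_revenue = three_year_revenue(256, [2.35, 1.75, 1.25], 0.80)
print(f"${hold_revenue / 1e6:.1f}M")  # prints $9.6M
```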

Refresh CapEx

  • B200 cost: $50,000/GPU all-in (server, networking, 3-yr warranty)
  • H100 resale: $15,000/GPU (secondary market, strong demand)
  • Tax recapture: $1.4M (100% bonus depreciation, pass-through entity at ~37%)
  • LTV / Interest: 70% / 15% annual, 3-yr fully amortizing
  • Origination fee: 3% of loan amount
  • B200 residual (3 yr): 40% of purchase price ($20K/GPU)
  • H100 residual (3 yr): ~$4,700/GPU ($1.2M for 256)
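The CapEx inputs combine into the loan, equity, fee, interest, and upfront cash figures used in the scenario P&Ls. The sketch below is one way to reproduce them. It assumes equal monthly payments on the amortizing loan and a zero tax basis on the H100s after 100% bonus depreciation; neither detail is spelled out in the article, and the function names are illustrative.

```python
def amortizing_interest(principal, annual_rate, years):
    """Total interest on a fully amortizing loan with equal monthly payments."""
    r = annual_rate / 12  # assumption: monthly compounding and payments
    n = years * 12
    payment = principal * r / (1 - (1 + r) ** -n)
    return payment * n - principal


def refresh_capex(gpus, new_cost=50_000, resale=15_000, ltv=0.70,
                  annual_rate=0.15, years=3, fee_rate=0.03, tax_rate=0.37):
    """Cash flows for selling `gpus` H100s and buying the same number of B200s."""
    purchase = gpus * new_cost
    loan = purchase * ltv
    equity = purchase - loan
    fee = loan * fee_rate
    sale = gpus * resale
    tax = sale * tax_rate  # assumption: fully depreciated, so the whole sale is gain
    interest = amortizing_interest(loan, annual_rate, years)
    return {
        "loan": loan,
        "equity": equity,
        "fee": fee,
        "sale": sale,
        "tax": tax,
        "interest": interest,
        # Cash the owner must bring after applying the H100 sale proceeds.
        "cash_needed_upfront": equity + fee + tax - sale,
        # 3-year financing cost net of the H100 sale, before OpEx and residual.
        "net_cost": loan + equity + fee + interest + tax - sale,
    }


full = refresh_capex(256)
partial = refresh_capex(128)
```

With these assumptions the full refresh comes out at about a $9.0M loan, $3.8M equity, $269K origination fee, $2.2M interest, and $1.7M of upfront cash; the 128-GPU partial refresh needs about $845K.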

Operating costs

  • H100 power: $0.12/GPU/hr (1.25 kW system draw, $0.07/kWh, 1.35 PUE)
  • B200 power: $0.14/GPU/hr (1.75 kW system draw, $0.07/kWh, 1.15 PUE liquid)
  • H100 colo: $150/GPU/month
  • B200 colo: $175/GPU/month (liquid-cooled)
  • Staff + admin: $200K/year
  • Insurance + maintenance + software: $100K/year
  • Hold annual OpEx: ~$1.1M (256 H100)
  • Full annual OpEx: ~$1.2M (256 B200)
  • B200 Y1 / Y2 / Y3 rate: $4.50 / $3.75 / $3.00 per GPU/hr
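The per-GPU power figures follow from draw, PUE, and electricity price. A minimal sketch, assuming the "system draw" is the per-GPU share of total system power (the function name is illustrative):

```python
def power_cost_per_gpu_hour(system_kw, price_per_kwh, pue):
    """Hourly electricity cost per GPU, including facility overhead via PUE."""
    return system_kw * pue * price_per_kwh


h100_power = power_cost_per_gpu_hour(1.25, 0.07, 1.35)  # about $0.12
b200_power = power_cost_per_gpu_hour(1.75, 0.07, 1.15)  # about $0.14
print(f"H100 ${h100_power:.2f}/GPU/hr, B200 ${b200_power:.2f}/GPU/hr")
```

The lower PUE of liquid cooling offsets part of the B200's higher draw, which is why the hourly power cost rises by only about two cents.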

The market flipped

Six months ago, most market observers expected H100 rental rates to keep falling as Blackwell supply ramped. The opposite happened. H100 1-year contract pricing shot up almost 40%. Firms described trying to find compute in early 2026 as “like trying to book airplane tickets on the last flight out.” [1]

Several forces drove the reversal. Demand increased with the adoption of compute-heavy agentic AI, while open-source models expanded the customer pool. Supply also came in below forecast: higher memory prices drove up the cost of new servers, deterring would-be compute suppliers, and Blackwell procurement lead times extended into mid-2026. [1]

According to SemiAnalysis, all of the Blackwell capacity that they see coming online before September 2026 is already booked, and half the providers in the SemiAnalysis index are completely sold out of previous-generation Hopper capacity. [1]

[Figure: GPU rental rates ($/GPU/hr), Q1 2024 to Q1 2026. Labeled values: B200 $4.50; H100 $1.70 and $2.35.]

Income-based valuation methods such as discounted cash flow derive hardware resale value (i.e., residual value) from the rental rates the hardware can earn. In practice the link is unreliable: neocloud providers focus on keeping rates stable to meet payback requirements, so hardware resale values don't always track rental rates.

H100 resale prices have climbed with the demand rebound, but they should eventually resume declining. That lengthens payback periods for operators and makes the H100 less desirable. A used H100 that resells for $15,000 today might fetch only $10,000 a year from now. [2]

[Figure: GPU residual value by generation (illustrative), 2017 to 2029, $0 to $30K. Series: V100, A100 80GB, H100 SXM, H200 SXM.]

Hyperscalers disagree on how fast to depreciate. Amazon shortened GPU useful life from 6 to 5 years in Q1 2025, a $700M hit. Google extended to 6 years, saving ~$3.4B in 2023. This measure is imperfect, of course: depreciation accounting implies a GPU is worth $0 at the end of its useful life, which is clearly not true in the real world. Treat it only as a reference point. [3]
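To see how much the useful-life choice moves book value, here is a small sketch. It assumes straight-line depreciation to zero, which is a simplification chosen for illustration rather than any company's stated policy, and it reuses the $39,000 all-in H100 cost from this article's model.

```python
def straight_line_book_value(cost, useful_life_years, age_years):
    """Book value under straight-line depreciation to zero salvage value."""
    remaining_years = max(0, useful_life_years - age_years)
    return cost * remaining_years / useful_life_years


# Same 3-year-old GPU under a 5-year versus a 6-year schedule.
five_year = straight_line_book_value(39_000, 5, 3)
six_year = straight_line_book_value(39_000, 6, 3)
print(five_year, six_year)  # prints 15600.0 19500.0
```

Either schedule reaches $0 at the end of the useful life, which is the point where book value and real-world resale value diverge most.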

Server useful life by company (years, from SEC filings)

  • Hyperscalers: Google 6 yr, Amazon 5-6 yr, Meta 5-6 yr, Microsoft 2-5 yr
  • Neoclouds: CoreWeave 3-5 yr, Nebius 3-5 yr

57 companies surveyed; 80% of filers use a useful life in the 3-5 year range. Source: 10-K/20-F filings (FY2024-2025). [4]

Option 1: Hold scenario

Two things to consider:

  1. The risk of rates falling and residual values slipping with them. What gives you confidence in those rates? Can you lock them in with long-term offtake agreements? Are you sure utilization of H100s will stay high?
  2. The opportunity cost of holding. Could you make more money if you switched to newer hardware?

Hold 3-year P&L (256 H100s, no new capital)

  • Revenue: $9.6M
  • OpEx: -$3.3M
  • Residual: $1.2M
  • Net profit: $7.5M

  • 3-yr net OpEx: $2.1M
  • Effective cost per GPU-hr at 80% utilization: $0.39
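The effective cost per GPU-hour appears to be the 3-year net cost spread over the hours actually rented. A minimal sketch under that reading, assuming 8,760 hours per year (the function name is illustrative):

```python
def effective_cost_per_gpu_hour(net_cost, gpus, years, utilization,
                                hours_per_year=8760):
    """Net cost divided by the GPU-hours actually rented out."""
    rented_hours = gpus * hours_per_year * years * utilization
    return net_cost / rented_hours


# Hold: $3.3M OpEx less $1.2M residual leaves $2.1M of net OpEx.
hold_cost = effective_cost_per_gpu_hour(2.1e6, gpus=256, years=3, utilization=0.80)
print(f"${hold_cost:.2f}/GPU-hr")  # prints $0.39/GPU-hr
```

Because the denominator counts only rented hours, the figure rises as utilization falls even when total cost is unchanged.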

Option 2: Partial refresh scenario

A partial refresh means selling part of the fleet and replacing only that part with newer hardware. CoreWeave is the largest neocloud example: 250,000+ GPUs across 32 data centers, mixing H100s, H200s, and GB200 systems. [5]

How refinancing works

Take $10M of hardware as an example. You pay $3M (30%) out of pocket and borrow $7M (i.e., 70% LTV). The hardware is collateral for the loan: if you can’t pay, the lender takes it.

  • Hardware value: $10M
  • Debt: $7M
  • Equity (your money): $3M

Collateral constraints

If your current hardware secures an active loan, the lender has a legal claim on the GPUs as collateral. You cannot sell any of them without lender consent. Even partial sales might break loan covenants, leading to expensive penalties. There are two options:

  1. Some lenders will allow a partial collateral release, letting you sell some GPUs, if the remaining loan balance stays below 60% of the value of the hardware you keep. [6]
  2. The more common path is to refinance: use any hardware sale to pay off the existing loan and take out a new loan that covers the remaining and new hardware as collateral. As with the first loan, there will be one-time origination fees (typically 3% of the new loan amount) plus legal and appraisal expenses.
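The partial-release condition in the first option reduces to a single comparison. The sketch below uses hypothetical loan balances, values the retained H100s at the article's $15,000 resale price, and uses an illustrative function name; actual covenant terms and appraisal methods vary by lender.

```python
def partial_release_allowed(remaining_balance, kept_hardware_value, max_ltv=0.60):
    """True if the loan left after the sale stays below the lender's LTV cap."""
    return remaining_balance < max_ltv * kept_hardware_value


# Hypothetical: keep 128 H100s, appraised at $15,000 each ($1.92M total).
kept_value = 128 * 15_000
print(partial_release_allowed(1_000_000, kept_value))  # prints True
print(partial_release_allowed(1_500_000, kept_value))  # prints False
```

With $1.92M of retained collateral the cap is $1.152M, so a loan close to maturity is the more likely candidate for a release.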

Partial refresh 3-year P&L (128 H100 + 128 B200)

  • H100 sale: $1.9M
  • Tax: -$710K
  • B200 loan: -$4.5M
  • B200 equity: -$1.9M
  • Origination fee: -$134K
  • Interest: -$1.1M
  • Net cost: -$6.4M
  • Revenue: $14.9M
  • OpEx: -$3.5M
  • Residual: $3.2M
  • Net profit: $8.2M

Section 1245 tax recapture is also a cost of selling, and it is the tax line in the P&L above. If you depreciated the H100s aggressively, their book value on your tax returns was well below what you'll now sell them for, and the IRS taxes that difference as ordinary income. [7]
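The recapture arithmetic can be sketched as follows. This is a simplified illustration, not tax advice: it assumes the H100s were fully expensed through 100% bonus depreciation, applies the article's ~37% pass-through rate, treats the $39,000 all-in cost as the depreciable basis, and uses an illustrative function name.

```python
def section_1245_tax(units, sale_price, original_cost, depreciation_taken,
                     ordinary_rate):
    """Tax on the part of the sale gain treated as ordinary income."""
    adjusted_basis = original_cost - depreciation_taken
    gain = max(0.0, sale_price - adjusted_basis)
    # Gain is recaptured as ordinary income only up to the depreciation taken.
    recaptured = min(gain, depreciation_taken)
    return units * recaptured * ordinary_rate


partial_tax = section_1245_tax(128, 15_000, 39_000, 39_000, 0.37)
full_tax = section_1245_tax(256, 15_000, 39_000, 39_000, 0.37)
print(f"${partial_tax / 1e3:.0f}K, ${full_tax / 1e6:.1f}M")  # prints $710K, $1.4M
```

An owner who depreciated more slowly would have a higher adjusted basis and therefore a smaller recaptured gain on the same sale price.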

Two things to consider:

  1. The risk of losing your market. What gives you confidence that customers will want either generation, and that utilization will stay high? If you can now offer only smaller cluster sizes because you sold some hardware, will that lose you deals?
  2. Doing the work, but not realizing the full benefits. You're going through the work and pain of procuring, refinancing, and migrating to a new generation. Sometimes, it's better to bet wholeheartedly on one generation.

  • 3-yr total cost (partial refresh): $9.9M
  • Effective cost per GPU-hr at 80% utilization: $1.84

Option 3: Full refresh scenario

Two things to consider:

  1. Demand is lower than expected. Are you sure the new generation's rates and customer demand will be high enough? The hardware is more expensive, so rates need to be higher to break even.
  2. Timing is everything. Are you sure you can get the new hardware in time, when procurement is tough? Can you find a facility that supports liquid cooling and the higher power density? Cooling is the colo provider's infrastructure, but available liquid-cooled capacity is scarce, so finding a slot is a search problem. If you're too slow, you might lose customers to competitors.

Full refresh 3-year P&L (256 GPUs, H100 → B200)

  • H100 sale: $3.8M
  • Tax: -$1.4M
  • B200 loan: -$9.0M
  • B200 equity: -$3.8M
  • Origination fee: -$269K
  • Interest: -$2.2M
  • Net cost: -$12.9M
  • Revenue: $20.2M
  • OpEx: -$3.6M
  • Residual: $5.1M
  • Net profit: $8.8M

Migration downtime is not modeled here: procurement, racking, and burn-in for B200s typically take 1-3 months, and that is a window of zero revenue. At B200 rates that could be $0.5-1.5M in lost income.

  • 3-yr total cost (full refresh): $16.5M
  • Effective cost per GPU-hr at 80% utilization: $3.06

How to decide

A few variables determine the path: loan status, current utilization, facility and equipment availability, and contracted demand.

In practice, most operators won't do a clean refresh. Selling hardware that is under contract means downtime, and you can't time it so every long-term agreement expires at once. More likely, you can only sell a portion of the fleet as contracts roll off, and if you pre-sold capacity months ago, even that portion is locked up.

The path of least resistance is to keep the current fleet running until operating costs approach revenue, and deploy new hardware in a separate facility with fresh capital. That is what is happening across the neocloud sector: more debt, more deployments, not refresh cycles. It feeds the growth narrative, but it also means the industry is accumulating hardware rather than replacing it.

Hold when

  • Lender prevents hardware sale and refinancing is not available.
  • You have high-value, long-term offtake agreements for the H100s.
  • No signed or near-signed customer demand at B200 rates.

Partial refresh when

  • Customer demand exists at B200 rates, but liquid-cooled colo space is scarce or limited.
  • Migration downtime is not acceptable.
  • Rubin volume availability is less than 18 months away.

Full refresh when

  • Loan clear or refinancing available.
  • Customers signing contracts for B200s at $3.50+/hr.
  • Liquid-cooled facility secured.
  • Downtime during migration is acceptable.
  • Rubin volume is 18+ months out.

References

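  1. SemiAnalysis, "The Great GPU Shortage: Rental Capacity, H100 1 Year Rental Price Index" (April 2026)
  2. SemiAnalysis, GPU Rental Price Dashboard and ClusterMAX (2025-2026)
  3. Princeton CITP, "Lifespan of AI Chips: The $300 Billion Question" (October 2025). https://blog.citp.princeton.edu/2025/10/15/lifespan-of-ai-chips-the-300-billion-question/
  4. American Compute analysis of 57 public company 10-K/20-F filings (FY2024-2025)
  5. CoreWeave S-1 filing and investor relations (2025-2026). https://investors.coreweave.com
  6. CoreWeave $8.5B financing facility press release (March 2026). https://investors.coreweave.com/news/news-details/2026/CoreWeave-Closes-Landmark-8-5-Billion-Financing-Facility-Achieving-First-Investment-Grade-Rated-GPU-backed-Financing/default.aspx
  7. IRS Publication 946, "How To Depreciate Property" and Section 1245 recapture (2025). https://www.irs.gov/publications/p946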

Frequently Asked Questions

What are the three options for upgrading a GPU cluster?

Full refresh: sell the entire fleet and buy the next generation. Partial refresh: keep a portion of the fleet and add newer hardware alongside it. Hold: keep operating the current fleet. The comparison models all three for a 256-GPU H100 SXM5 fleet purchased in mid-2023, with the loan approaching maturity, and a potential upgrade to B200s.

Why did H100 rental rates rebound in 2026?

H100 1-year contract pricing shot up almost 40%, from $1.70/hr in October to $2.35/hr in March 2026. Demand increased with adoption of compute-heavy agentic AI, while open-source models expanded the customer pool. Supply was lower than forecasted: increased memory prices drove up new server costs, and Blackwell procurement lead times extended into mid-2026. All Blackwell capacity coming online before September 2026 is already booked, and half the providers in the SemiAnalysis index are completely sold out of Hopper capacity.

What are the collateral constraints on selling GPUs from a financed cluster?

If your current hardware was purchased with an active loan, the lender has a legal claim on the GPUs as collateral. You cannot sell any of them without lender consent. Even partial sales might break loan covenants. Some lenders will allow a partial collateral release if the remaining loan balance stays below 60% of the value of the hardware you keep. The more common path is to refinance: use any hardware sale to pay off the existing loan and take out a new loan that covers remaining and new hardware as collateral.

How do you decide between holding, partial refresh, and full refresh?

A few variables determine the path: loan status, current utilization, facility and equipment availability, and contracted demand. Hold when the lender prevents a hardware sale and refinancing is not available, you have high-value long-term offtake agreements for the H100s, or there is no signed or near-signed customer demand at B200 rates. Partial refresh when customer demand exists at B200 rates but liquid-cooled colo space is scarce, migration downtime is not acceptable, and Rubin volume availability is less than 18 months away. Full refresh when the loan is clear or refinancing is available, customers are signing contracts for B200s at $3.50+/hr, a liquid-cooled facility is secured, downtime during migration is acceptable, and Rubin volume is 18+ months out.

GPU Tech Refresh: When to Upgrade Your AI Cluster | American Compute