GPUSCALE Benchmark Leaderboard

AssumptionsElectricity$/kWhDuty cycle24h/day

Cheapest rental per 1M tokens

—

no data

Lowest vast/runpod $ to generate a million tokens at this GPU's best run.

Most energy-efficient per 1M tokens

—

no data

If you already own the card — cheapest electricity burn per million tokens.

Cheapest $ per GB VRAM (buy)

—

no data

For VRAM-bound workloads — best dollars per usable gigabyte, upfront.

Fastest break-even vs rental

—

no data

Hours of rental equivalent to the full buy price (after subtracting your electricity).

Computing cost report…

How to read this

TDP, not measured power. Power math uses manufacturer nominal TDP. Per-run avg_power_draw_wreadings are noisy on vast/runpod hosts (SXM boards often report 0–45W regardless of load), so TDP gives a more honest “will this blow up my electric bill” answer.

Best tok/s per GPU. Each row uses the single highest tok/s observed across all engine/model/quant combinations for that GPU. Hover the cell to see which run it came from.

Rent $/hr. Cheapest of the latest vast, vast (community), and runpod prices. Hover to see which source provided it.

Break-even. upfront / (rent$/hr − electricity$/hr). If electricity alone costs more than the rental rate the row shows “—”.

Power / month = TDP × 24 h/day × 30 × $0.13/kWh. Adjust the controls above to match your usage.

kWh / Mtok = TDP × (1,000,000 / tok/s) / 3,600,000. A lower number means the GPU converts a joule into more generated tokens.