Token Economics Calculator

We created this Tokenomics calculator to help companies deploying LLMs understand the costs and energy involved, and compare different hardware and models.

Start here

Model Selection

Included models

Parameters (millions)

Quantization (model parameter precision)

Quantization (math/KV-cache precision)

Hardware Selection

Reference Hardware

Memory per Chip (GB)

Chips per Rack

Rack Power (kW)

Rack Cost (USD)

Concurrent Users (est)

Operating Parameters

Depreciation (years)

Weighted Average Cost of Capital (%)

Colocation Cost ($/kW per month)

Learn about colocation cost

Energy Cost ($/kWh)

Learn about US energy cost
Learn about EU energy cost

Colo/Datacenter PUE

Expected System Utilization (%)

Results

Hardware

Rack Comparison

Total Tokens per Second

Cost per million total tokens

Total Tokens per USD

Total Tokens per kWh

Total Expenses per Month

Contact Sales

Total Rack Cost (USD)

Racks Needed

Reference Hardware Cost Breakdown

Learn more about why Tensordyne performs so well

Contact

Contribute

MLPerf® Inference: Datacenter benchmark results for Llama 2 70B 99.9% and Llama 3.1 405B were retrieved from https://mlcommons.org/ on April 10, 2025 (v5.0 Closed) and September 15, 2025 (v5.1 Closed). Results from entries 5.0-0047, 5.0-0060, 5.1-0003, 5.1-0051, 5.1-0069, 5.1-0071, 5.1-0075. Total Tokens per Second were determined by dividing total throughput by the number of reported chips and multiplying by the number of chips for the Racks Systems Needed. Tensordyne's simulation results of MLPerf® configurations are unverified and have not been through an MLPerf® review and may use measurement methodologies and/or workload implementations that are inconsistent with the MLPerf® specification for verified results. Tensordyne plans to submit results for verification as soon as possible after system availability. The MLPerf® name and logo are registered and unregistered trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information.
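The per-chip normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function and argument names are hypothetical, the numbers in the usage example are made up rather than taken from any benchmark entry, and the sketch assumes throughput scales linearly with chip count.

```python
def scaled_tokens_per_second(total_throughput_tps, reported_chips,
                             chips_per_rack, racks_needed):
    """Scale a reported benchmark throughput to a rack deployment.

    Assumption: throughput scales linearly with the number of chips,
    which real multi-rack systems may not achieve.
    """
    if reported_chips <= 0:
        raise ValueError("reported_chips must be positive")
    # Divide total throughput by the number of reported chips ...
    per_chip_tps = total_throughput_tps / reported_chips
    # ... then multiply by the number of chips in the racks needed.
    return per_chip_tps * chips_per_rack * racks_needed


# Illustrative (made-up) numbers: 24,000 tokens/s reported on 8 chips,
# scaled to one rack of 72 chips.
print(scaled_tokens_per_second(24_000.0, 8, 72, 1))
```

Because the scaling is linear, any interconnect or scheduling overhead at rack scale is ignored, so the result is better read as an upper-bound estimate.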

Disclaimer

The GenAI Token Economics Calculator was developed to help users consider how variation in certain parameters may impact the performance, costs, and energy for deploying LLMs using different hardware and models.
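As a rough illustration of how the operating parameters could combine into the reported results, here is a minimal Python sketch. It is not the calculator's actual formula. It assumes an annuity-style capital charge at the WACC rate over the depreciation period, a 730-hour month, racks drawing their full rated power continuously, and utilization scaling only the token output. All function names, argument names, and example values are hypothetical.

```python
HOURS_PER_MONTH = 730.0  # assumption: average month length in hours


def monthly_capital_charge(capex_usd, depreciation_years, wacc_pct):
    """Level monthly payment repaying capex over the depreciation period
    at the WACC rate (assumed annuity model, not the calculator's own)."""
    months = depreciation_years * 12
    monthly_rate = wacc_pct / 100.0 / 12.0
    if monthly_rate == 0:
        return capex_usd / months
    return capex_usd * monthly_rate / (1.0 - (1.0 + monthly_rate) ** -months)


def token_economics(rack_cost_usd, racks, rack_power_kw, tokens_per_second,
                    depreciation_years, wacc_pct, colo_usd_per_kw_month,
                    energy_usd_per_kwh, pue, utilization_pct):
    """Estimate monthly expenses and per-token metrics from the inputs."""
    capital = monthly_capital_charge(rack_cost_usd * racks,
                                     depreciation_years, wacc_pct)
    it_power_kw = rack_power_kw * racks
    colocation = it_power_kw * colo_usd_per_kw_month
    # Assumption: PUE multiplies the rated rack power, drawn all month.
    energy_kwh = it_power_kw * pue * HOURS_PER_MONTH
    energy = energy_kwh * energy_usd_per_kwh
    expenses = capital + colocation + energy
    # Assumption: utilization scales delivered tokens, not power draw.
    tokens_per_month = (tokens_per_second * (utilization_pct / 100.0)
                        * HOURS_PER_MONTH * 3600.0)
    return {
        "expenses_per_month_usd": expenses,
        "cost_per_million_tokens_usd": expenses / (tokens_per_month / 1e6),
        "tokens_per_usd": tokens_per_month / expenses,
        "tokens_per_kwh": tokens_per_month / energy_kwh,
    }


# Illustrative (made-up) inputs for a single rack.
example = token_economics(
    rack_cost_usd=3_600_000, racks=1, rack_power_kw=100,
    tokens_per_second=100_000, depreciation_years=3, wacc_pct=8,
    colo_usd_per_kw_month=200, energy_usd_per_kwh=0.10, pue=1.2,
    utilization_pct=50,
)
for name, value in example.items():
    print(f"{name}: {value:,.4f}")
```

Varying one argument at a time, for example utilization_pct or energy_usd_per_kwh, gives a quick feel for which inputs dominate the cost per million tokens under these assumptions.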

Tensordyne has attempted to provide estimates for a limited number of substantially similar scenarios by compiling information from publicly available sources, including MLCommons¹, and normalizing the data. Where public information is limited or missing for certain parameters, such as concurrent users supported or quantization, Tensordyne has included its best estimates.

The calculator makes the following key assumptions:

“Racks needed” are calculated assuming that model parameters and KV-cache reside in fast memory (SRAM, HBM, GDDR) to achieve best performance.
“Rack Power” is based on the best available public sources and may be TDP (Thermal Design Power).
Where publicly available results are believed to include speculative decoding, performance has been either normalized or excluded.
Results reflect actual system performance, except for Tensordyne¹, whose results are based on our simulations.
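The first assumption, that model parameters and KV-cache must fit entirely in fast memory, can be sketched as a simple capacity calculation. This is an illustrative Python sketch under stated assumptions, not the calculator's implementation: the bytes-per-parameter table, the function name, and the KV-cache size passed in as a plain number are all hypothetical, and the example hardware figures are made up.

```python
import math

# Assumption: storage cost per parameter for a few common precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}


def racks_needed(params_millions, weight_precision, kv_cache_gb,
                 memory_per_chip_gb, chips_per_rack):
    """Smallest number of racks whose combined fast memory holds the
    model weights plus the KV-cache (ignores activations and overhead)."""
    weights_gb = params_millions * 1e6 * BYTES_PER_PARAM[weight_precision] / 1e9
    total_gb = weights_gb + kv_cache_gb
    rack_memory_gb = memory_per_chip_gb * chips_per_rack
    return max(1, math.ceil(total_gb / rack_memory_gb))


# Illustrative (made-up) hardware: 8 chips per rack with 80 GB each,
# a 405,000-million-parameter model, and a 200 GB KV-cache budget.
print(racks_needed(405_000, "fp8", 200.0, 80.0, 8))
print(racks_needed(405_000, "fp16", 200.0, 80.0, 8))
```

In practice the KV-cache size depends on the number of concurrent users, the context length, and the KV-cache precision, so it would be derived from those inputs rather than supplied directly as it is here.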

This calculator is for instructional purposes only. The calculator makes many assumptions and relies on publicly available data that may be incorrect or outdated, which may introduce significant error into results. Tensordyne does not make factual representations about any of the calculator’s inputs or outputs. Users are encouraged to observe the relative sensitivity of outputs to changes in inputs.

Use of this calculator for any purpose, including the use of any resulting output, is at the sole discretion and responsibility of the user.

Tensordyne makes no warranty regarding the accuracy of results or the suitability of use for any purpose and disclaims all responsibility and/or liability arising from its use.

See our Token Economics Calculator Explained blog for additional background.

¹MLPerf® Inference: Datacenter benchmark results for Llama 2 70B 99.9% and Llama 3.1 405B were retrieved from https://mlcommons.org/benchmarks/ on April 10, 2025 (v5.0 Closed) and September 15, 2025 (v5.1 Closed). Results from entries 5.0-0047, 5.0-0060, 5.1-0003, 5.1-0051, 5.1-0069, 5.1-0071, 5.1-0075. Total Tokens per Second were determined by dividing total throughput by the number of reported chips and multiplying by the number of chips for the Racks Systems Needed. Tensordyne simulation results are unverified and have not been through an MLPerf® review and may use measurement methodologies and/or workload implementations that are inconsistent with the MLPerf® specification for verified results. Tensordyne plans to submit results for verification as soon as possible after system availability.

The MLPerf name and logo are registered and unregistered trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information.