Compute (machine learning)

From Wikipedia, the free encyclopedia
Since at least 2025, the term "compute" has often been used to refer to the computation produced by growing numbers of GPUs.[1]

In machine learning and deep learning, compute is the amount of computing power or computational resources required to train machine learning models and large language models. More broadly, compute is the computational power or resources necessary for a computer or computer program to function.

Definition

Compute is commonly defined as the amount of computing power or computational resources required to train machine learning and large language models.[2][1] The term "compute" has also been applied more broadly to cloud computing, where it refers to the processing power, memory, networking, storage, and other resources required to run any program.[3]

Compute is measured in petaflop/s-days and is used to document AI training.[4] A petaflop/s-day (pfs-day) consists of performing 10^15 neural net operations per second for one day, or a total of about 10^20 operations. The compute-time product serves as a mental convenience, similar to the kilowatt-hour for energy. An amount of compute is meant to give an idea of the number of actual operations performed.[4]
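
The arithmetic behind the unit can be checked with a short Python sketch. The constant and function names below are illustrative and do not come from the cited source.

```python
# Illustrative sketch: convert petaflop/s-days into a total operation count.
PETA = 10**15                   # operations per second at one petaflop/s
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def pfs_days_to_operations(pfs_days: float) -> float:
    """Return the total number of operations in `pfs_days` petaflop/s-days."""
    return float(pfs_days) * PETA * SECONDS_PER_DAY


one_pfs_day = pfs_days_to_operations(1)
print(f"{one_pfs_day:.2e} operations")  # on the order of 10^20
```

One petaflop/s-day works out to 8.64 × 10^19 operations, which is the "about 10^20" figure quoted above.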

History

In a 2018 analysis titled "AI and compute", artificial intelligence company OpenAI introduced the concept of compute.[4]

OpenAI identified two eras of training AI systems in terms of compute usage. From 1959 to 2012, compute roughly followed Moore’s law.[4] Between 2012 and 2018, the amount of compute used in the largest AI training runs increased exponentially, growing more than 300,000-fold, which corresponds to a doubling roughly every 3.4 months.[4][2] By comparison, Moore’s law would have produced a doubling only every two years over the same period.[4] One of the largest models, released in 2020, used 600,000 times more computing power than the 2012 model.[2]
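
The relationship between total growth and doubling time can be worked through in a few lines of Python. This is a minimal sketch with hypothetical function names. It takes the 300,000-fold growth and the 3.4-month doubling time from the text and assumes a 24-month doubling time for Moore's law.

```python
import math


def months_for_growth(growth_factor: float, doubling_months: float) -> float:
    """Months needed to grow by `growth_factor` at a fixed doubling time."""
    return math.log2(growth_factor) * doubling_months


def growth_over(months: float, doubling_months: float) -> float:
    """Growth factor reached after `months` at a fixed doubling time."""
    return 2 ** (months / doubling_months)


# Figures from the text: 300,000-fold growth, doubling every 3.4 months.
elapsed = months_for_growth(300_000, 3.4)

# Assumption: Moore's law taken as a doubling every 24 months.
moore_growth = growth_over(elapsed, 24)

print(f"elapsed: {elapsed:.1f} months, Moore's-law growth: {moore_growth:.1f}x")
```

Under these assumptions, 300,000-fold growth takes roughly 62 months (about five years), while a two-year doubling time over the same span gives only about sixfold growth.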

After 2020, compute growth began to slow down,[2] and growth in the compute needed for the largest AI models continued to slow in 2023.[5] The term "compute" has come into increasingly common use since the mid-2020s.[6]

Compute growth and AI progress

Larger AI models trained on more data and with more computational resources tend to perform better.[7][8] This happens even if the algorithms themselves remain unchanged.[8]

As early as 2018, OpenAI noted that the exponential increase in compute played a key role in AI progress.[4] OpenAI considers that three factors drive the advance of AI: algorithmic innovation, data, and the amount of compute available for training.[4] AI models with more compute not only improve at the tasks they were trained on but can also develop emergent abilities.[9] Incremental improvements can lead to more abrupt leaps in capabilities.[7]

AI provider SpaceXAI said in 2026 that its AI progress is driven by compute and that it uses compute as a key metric for AI training on its supercomputer Colossus, which contains 1 million GPUs.[10] Anthropic has a contract worth $1.25 billion per month with SpaceXAI to buy all the compute capacity at the Colossus 1 data center.[11]

Criticism and policy

Efforts to increase, promote or constrain progress in artificial intelligence have often worked by controlling the amount of compute.[12] Policymakers have enacted policies and provided support to make compute resources more accessible to domestic AI researchers.[12]

In a January 2022 report, the Center for Security and Emerging Technology (CSET) advised institutions that increasingly powerful and generalizable AI (AGI) will likely require strategies other than maximizing compute.[2][12] Some AI researchers are also concerned that governments might focus exclusively on scaling compute instead of other strategies.[12]

CSET has reported on several bottlenecks that could explain why the growth in deep learning's compute needs has slowed down:

  • Training is expensive, and training extremely large models generates traffic jams across many processors that are difficult to manage.[2]
  • There is a limited supply of AI chips[2] (see AI chip memory shortage).
  • CSET argues that the main resource is human capital, specifically talented researchers, based on a survey of more than 400 AI researchers from academia and the private sector published in 2023.[12][5] The survey found that AI researchers are not primarily or exclusively constrained by compute access.[12] However, academic and industry AI researchers are equally concerned that insufficient compute could prevent them from contributing meaningfully to AI research in the future.[12] High-compute users are more concerned about compute access.[12] When asked which government-provided resource would be most useful to them, some AI researchers selected compute, while others preferred grant funding.[12]

To address these bottlenecks, CSET advised policymakers to ensure that even researchers with smaller budgets can contribute effectively to AI research.[2] Other proposed strategies include using contemporary AI algorithms, managing modern AI infrastructure, and focusing on interdisciplinary work between the AI field and other fields of computer science.[2]

A 2024 study on compute access found that academic-only AI research teams often work less on compute-intensive research topics, especially foundation models, than industry AI labs do.[13] As a consequence, academia is likely to play a smaller role in advancing such techniques. The researchers suggest nationally sponsored computing infrastructure and open science initiatives to boost academic compute access.[13]

Data

A 2022 study found that the large language models of the time were significantly under-trained, a consequence of focusing on scaling language models whilst keeping the amount of training data constant. After training over 400 language models with varying numbers of parameters and training tokens, the authors found that "for compute-optimal training", the model size and the number of training tokens should be scaled equally: for every doubling of model size, the number of training tokens should also be doubled.[8]
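
The equal-scaling rule can be illustrated with a small Python sketch. It assumes the widely used approximation that training cost is about 6 × parameters × tokens floating-point operations. The article does not state this approximation, and the function names and baseline numbers are hypothetical.

```python
import math


def training_flops(n_params: float, n_tokens: float) -> float:
    """Rough training cost, assuming the approximation C ~ 6 * N * D."""
    return 6.0 * n_params * n_tokens


def scale_equally(n_params: float, n_tokens: float, compute_multiplier: float):
    """Scale model size and token count by the same factor.

    Because cost is proportional to N * D under the assumed approximation,
    each quantity grows by the square root of the compute multiplier.
    """
    factor = math.sqrt(compute_multiplier)
    return n_params * factor, n_tokens * factor


# Hypothetical baseline: 1e9 parameters trained on 2e10 tokens.
base_n, base_d = 1e9, 2e10
new_n, new_d = scale_equally(base_n, base_d, compute_multiplier=4.0)
ratio = training_flops(new_n, new_d) / training_flops(base_n, base_d)

print(new_n / base_n, new_d / base_d, ratio)
```

Under this assumption, quadrupling the compute budget doubles both the model size and the number of training tokens, which matches the "doubling for doubling" rule described above.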

References

  1. ^ a b Samborska, Veronika (2025-01-20). "Scaling up: how increasing inputs has made artificial intelligence more capable". Our World in Data.
  2. ^ a b c d e f g h i "AI and Compute". Center for Security and Emerging Technology. 2022-01-01. Retrieved 2026-05-27.
  3. ^ "What is Compute? - Enterprise Cloud Computing Explained - AWS". Amazon Web Services, Inc. Retrieved 2026-05-27.
  4. ^ a b c d e f g h "AI and compute". OpenAI. 2018. Retrieved 2026-05-27.
  5. ^ a b "How Important is Compute to the Future of AI?". Center for Security and Emerging Technology. 2023-05-25. Retrieved 2026-05-27.
  6. ^ "compute". www.merriam-webster.com. 2026-01-08. Retrieved 2026-05-28.
  7. ^ a b "Artificial intelligence: Performance on knowledge tests vs. training computation". Our World in Data. Archived from the original on 2026-05-09. Retrieved 2026-05-27.
  8. ^ a b c Hoffmann, Jordan; Borgeaud, Sebastian; Mensch, Arthur; Buchatskaya, Elena; Cai, Trevor; Rutherford, Eliza; Casas, Diego de Las; Hendricks, Lisa Anne; Welbl, Johannes (2022-03-29), Training Compute-Optimal Large Language Models, arXiv, doi:10.48550/arXiv.2203.15556, arXiv:2203.15556, retrieved 2026-05-27
  9. ^ Wei, Jason; Tay, Yi; Bommasani, Rishi; Raffel, Colin; Zoph, Barret; Borgeaud, Sebastian; Yogatama, Dani; Bosma, Maarten; Zhou, Denny (2022-10-26), Emergent Abilities of Large Language Models, arXiv, doi:10.48550/arXiv.2206.07682, arXiv:2206.07682, retrieved 2026-05-27
  10. ^ "Colossus: The World's Largest AI Supercomputer | xAI". x.ai. Retrieved 2026-05-27.
  11. ^ Brandom, Russell (2026-05-06). "Is xAI a neocloud now?". TechCrunch. Retrieved 2026-05-27.
  12. ^ a b c d e f g h i "The Main Resource is the Human". Center for Security and Emerging Technology. 2023-04-01. Retrieved 2026-05-27.
  13. ^ a b Besiroglu, Tamay; Bergerson, Sage Andrus; Michael, Amelia; Heim, Lennart; Luo, Xueyun; Thompson, Neil (2024-01-08), The Compute Divide in Machine Learning: A Threat to Academic Contribution and Scrutiny?, arXiv, doi:10.48550/arXiv.2401.02452, arXiv:2401.02452, retrieved 2026-05-27