When Should AI Training Runs Trigger Oversight?

Introduction

One of the most widely discussed ideas in compute governance is the use of training-compute thresholds as a trigger for oversight. Instead of waiting until an AI system demonstrates dangerous capabilities, regulators would require organisations to report, document, or review training runs once they exceed a specified amount of computation. The underlying logic is simple: if the most capable frontier systems require unusually large amounts of compute to train, then compute can serve as an early warning signal. [Institute for Law & AI]

Within debates about AI doom and existential risk, these thresholds matter because they are intended to identify projects that might eventually create systems capable of large-scale misuse, dangerous autonomy, or loss-of-control scenarios. Yet there is no consensus on where the thresholds should be set, how often they should be updated, or whether sophisticated developers could eventually bypass them. The debate is not about whether compute can be measured; it is about whether a measurable quantity is a sufficiently reliable proxy for future danger. [arXiv]

How Compute Thresholds Are Defined

Most proposals define thresholds using the total number of computational operations used during training, typically measured in floating-point operations (FLOPs). Training runs above a specified FLOP level would automatically trigger additional obligations such as government notification, independent review, safety evaluations, or auditing. [Institute for Law & AI]

The most influential threshold to date emerged from the United States government’s 2023 AI Executive Order. It established reporting requirements for certain models trained using more than 10^26 floating-point or integer operations. A lower threshold of 10^23 operations was applied to some biological-sequence models because of concerns about biotechnology-related risks. [Federal Register] [Morrison Foerster]
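The two figures above can be turned into a simple check. The following is a minimal sketch, not an official method: the threshold constants come from the text, while the function names and the "6 operations per parameter per training token" estimate are assumptions added for illustration. That estimate is a common rule of thumb for dense transformer models and is not prescribed by the Executive Order.

```python
# Thresholds as described in the text (2023 US AI Executive Order).
GENERAL_THRESHOLD_OPS = 1e26       # most models
BIO_SEQUENCE_THRESHOLD_OPS = 1e23  # some biological-sequence models


def estimate_training_ops(parameters: float, training_tokens: float) -> float:
    """Rough training-compute estimate (assumed rule of thumb).

    Uses ~6 operations per parameter per training token, a heuristic for
    dense transformers. Real accounting rules may differ.
    """
    return 6.0 * parameters * training_tokens


def triggers_reporting(training_ops: float,
                       biological_sequence_model: bool = False) -> bool:
    """Return True if the run uses MORE than the applicable threshold."""
    if biological_sequence_model:
        threshold = BIO_SEQUENCE_THRESHOLD_OPS
    else:
        threshold = GENERAL_THRESHOLD_OPS
    return training_ops > threshold


# Hypothetical run: 70 billion parameters trained on 15 trillion tokens.
hypothetical_ops = estimate_training_ops(7e10, 1.5e13)  # about 6.3e24
```

Under these assumptions the hypothetical run sits below the general threshold but would sit above the lower biological-sequence threshold if that category applied to it.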

The same 10^26 FLOP benchmark later appeared in several frontier-model policy proposals, including California’s SB 1047. In these frameworks, crossing the threshold did not automatically imply that a model was dangerous. Instead, it created a presumption that the model was advanced enough to justify closer scrutiny. [Morgan Lewis] [Orrick]

Supporters of compute thresholds often point to several practical advantages: [Epoch AI]

Compute is measurable and can be estimated relatively consistently.
Training compute can be assessed before deployment, allowing earlier intervention.
Large training runs generally require specialised hardware, making them harder to hide than software development alone.
Training compute has historically correlated with frontier-level capability growth. [arXiv] [Institute for Law & AI]

This does not mean compute perfectly predicts capability. Rather, advocates view it as a screening tool that identifies projects deserving additional attention. [arXiv]

Why AI Doom Advocates Care About Thresholds

For people concerned about existential risk, the appeal of compute thresholds is that they operate upstream of dangerous outcomes.

A central concern in many AI doom scenarios is that by the time a system clearly demonstrates dangerous capabilities, it may already be deeply integrated into critical infrastructure, military systems, scientific research, or economic decision-making. Oversight triggered during training could create opportunities for evaluations and safety testing before deployment. [Institute for Law & AI]

The argument is partly strategic. Governments may struggle to define “dangerous AI” in advance because capabilities can emerge unexpectedly. Compute, by contrast, is a concrete quantity that can be monitored. As a result, some governance researchers argue that thresholds provide a practical mechanism for identifying frontier projects even when policymakers cannot accurately predict which specific capabilities will emerge. [Institute for Law & AI]

Critics respond that the relationship between training scale and existential risk is much less certain than proponents sometimes imply. A model trained with somewhat less compute could still prove dangerous, while a model trained with enormous compute might not create the feared risks. The connection between compute and catastrophe remains inferential rather than directly demonstrated. [arXiv]

Reporting Requirements and Audits

Most compute-threshold proposals do not advocate automatic bans on large training runs. Instead, they use thresholds to activate additional oversight requirements.

Common proposals include:

Mandatory notification to a government regulator or AI safety authority before or after a qualifying training run.
Safety evaluations designed to test for dangerous capabilities.
Risk assessments documenting foreseeable catastrophic-use concerns.
Incident reporting if significant safety failures are discovered.
Independent auditing of compute records and training procedures. [Oxford Martin AIGI] [arXiv]

This distinction is important. The strongest advocates of compute governance often present thresholds as a trigger rather than a final decision rule. Crossing the threshold does not automatically prove a model is dangerous; it merely initiates a review process. The subsequent evaluations are intended to determine whether additional safeguards are warranted. [arXiv]

This approach attempts to solve a practical governance problem. Regulators may lack the capacity to examine every AI project. Thresholds provide a filtering mechanism that concentrates attention on a relatively small number of unusually large training efforts. [Institute for Law & AI]

Where Should the Threshold Be Set?

The hardest policy question is not whether thresholds are useful, but where they should sit.

If thresholds are set too low, regulators may become overwhelmed with notifications and audits. Oversight resources could be spread across many systems that pose little existential concern. If thresholds are set too high, potentially important frontier projects may escape scrutiny altogether. [Governor of California]

The 10^26 FLOP benchmark was partly chosen because it was above the training compute used by many leading models when the rule was designed. Policymakers hoped it would focus attention on systems pushing beyond the existing frontier. [Institute for Law & AI]

However, technological progress quickly creates pressure on any fixed threshold. Models that seem exceptional today may become routine within a few years. Research forecasting published by Epoch AI projects that the number of notable models exceeding 10^26 FLOPs could rise dramatically over the second half of the decade, potentially transforming what was once a rare threshold into a relatively common one. [Epoch AI]

This creates a moving-target problem. A threshold that is effective in one year may become obsolete in the next. Many analysts therefore argue that thresholds should be regularly updated rather than permanently fixed in law. [Governor of California]
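The moving-target problem can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that the compute used by the largest training runs grows by a constant factor each year. The function name, the starting figure, and the growth factor are hypothetical and are not taken from the sources cited here.

```python
import math


def years_until_threshold(current_frontier_ops: float,
                          threshold_ops: float,
                          annual_growth_factor: float) -> float:
    """Years until the largest runs reach a fixed threshold.

    Assumes steady exponential growth, which is a simplification.
    """
    if annual_growth_factor <= 1.0:
        raise ValueError("annual_growth_factor must be greater than 1")
    if current_frontier_ops >= threshold_ops:
        return 0.0  # the threshold has already been reached
    ratio = threshold_ops / current_frontier_ops
    return math.log(ratio) / math.log(annual_growth_factor)


# Hypothetical inputs: frontier runs at 5e25 operations, growing 4x per year.
example_years = years_until_threshold(
    current_frontier_ops=5e25,
    threshold_ops=1e26,
    annual_growth_factor=4.0,
)
```

With these made-up inputs the fixed threshold is reached in about half a year, which illustrates why a number written into law can stop marking the frontier soon after it is chosen.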

Arguments Over Evasion and Effectiveness

The most significant criticism of compute thresholds is that they may become easier to evade over time.

AI researchers continuously discover techniques that improve capability without proportionally increasing training compute. Better algorithms, model reuse, fine-tuning methods, synthetic data generation, and inference-time techniques can all produce stronger systems while reducing the amount of compute needed during training. [arXiv]

This raises a challenge for threshold-based regulation. If policymakers assume that high capability always requires high training compute, developers may eventually find ways to remain below regulatory thresholds while still producing highly capable systems. Researchers have explicitly identified fine-tuning, model expansion, and reuse of existing frontier models as potential loopholes in threshold-based frameworks. [arXiv]

Another concern is that thresholds create artificial boundaries. Risk generally changes gradually, but regulation often creates a sharp distinction between systems just below and just above a numerical cutoff. A model trained at 9.9 × 10^25 FLOPs may not be meaningfully different from one trained at 1.01 × 10^26 FLOPs, yet the second might trigger extensive requirements while the first does not. [Governor of California]
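The size of that gap is easy to check. This small sketch uses the two figures from the paragraph above and the 10^26 cutoff; the variable names are chosen here for illustration.

```python
THRESHOLD_OPS = 1e26

run_below = 9.9e25   # just under the cutoff
run_above = 1.01e26  # just over the cutoff

# Relative difference in training compute between the two runs.
relative_gap = (run_above - run_below) / run_below

# Only the second run exceeds the threshold.
below_is_covered = run_below > THRESHOLD_OPS
above_is_covered = run_above > THRESHOLD_OPS
```

The two runs differ by roughly 2 percent in training compute, yet only one of them falls on the regulated side of the line.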

Supporters respond that every regulatory system requires practical thresholds somewhere. The relevant question is not whether thresholds are perfect, but whether they are more workable than alternatives. Compared with vague capability-based definitions, compute remains relatively objective, measurable, and auditable. [arXiv]

What Thresholds Can and Cannot Do

The strongest case for compute reporting thresholds is not that they solve AI doom risk on their own. Rather, they provide an administrative mechanism for identifying frontier projects before deployment and directing limited oversight resources toward the systems most likely to deserve attention. [arXiv]

The strongest criticism is that compute is only a proxy. Future breakthroughs could weaken the link between training compute and capability, while legal thresholds may struggle to keep pace with changing technology. A threshold can identify some frontier systems, but it cannot reliably determine whether a particular model is aligned, controllable, deceptive, or existentially dangerous. [arXiv]

As a result, a growing consensus among governance researchers is that compute thresholds are most useful as safety triggers rather than as complete safety standards. Their role is to determine when reporting, auditing, evaluation, and scrutiny should begin. The more difficult question, whether a given system actually poses catastrophic or existential risk, still requires direct assessment of capabilities and behaviour rather than reliance on compute alone. [arXiv]

Endnotes

Source: law-ai.org
Title: The Role of Compute Thresholds for AI Governance
Link: https://law-ai.org/the-role-of-compute-thresholds-for-ai-governance/
Published: February 20, 2025

Source: arxiv.org
Title: Training Compute Thresholds: Features and Functions in AI Regulation
Link: https://arxiv.org/abs/2405.10799

Source: arxiv.org
Title: On the Limitations of Compute Thresholds as a Governance Strategy
Link: https://arxiv.org/abs/2407.05694

Source: orrick.com
Link: https://www.orrick.com/en/Insights/2024/07/California-Looks-to-Regulate-Cutting-Edge-Frontier-AI-Models-5-Things-to-Know-About-SB1047

Source: cdn.governance.ai
Title: Computing Power and the Governance of AI
Link: https://cdn.governance.ai/Computing_Power_and_the_Governance_of_AI.pdf

Source: law-ai.org
Title: Legal Considerations for Defining “Frontier Model”
Link: https://law-ai.org/frontier-model-definitions/
Published: September 30, 2024

Source: epoch.ai
Title: How many AI models will exceed compute thresholds?
Link: https://epoch.ai/publications/model-counts-compute-thresholds
Published: May 2025

Source: arxiv.org
Title: Defending Compute Thresholds Against Legal Loopholes
Link: https://arxiv.org/abs/2502.00003
Published: January 3, 2025

Source: arxiv.org
Link: https://arxiv.org/html/2405.10799v2

Source: arxiv.org
Link: https://arxiv.org/pdf/2502.00003

Source: federalregister.gov
Link: https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence
Published: November 1, 2023

Source: mofo.com
Title: 231107 the ai executive order presidential authority
Link: https://www.mofo.com/resources/insights/231107-the-ai-executive-order-presidential-authority

Source: morganlewis.com
Link: https://www.morganlewis.com/pubs/2024/08/californias-sb-1047-would-impose-new-safety-requirements-for-developers-of-large-scale-ai-models

Source: aigi.ox.ac.uk
Title: Survey on thresholds for advanced AI systems
Link: https://aigi.ox.ac.uk/wp-content/uploads/2025/08/Survey_on_thresholds_for_advanced_AI_systems_1.pdf
Published: August 29, 2025

Source: gov.ca.gov
Title: The California Report on Frontier AI Policy
Link: https://www.gov.ca.gov/wp-content/uploads/2025/06/June-17-2025-%E2%80%93-The-California-Report-on-Frontier-AI-Policy.pdf
Published: June 17, 2025

Source: aws.amazon.com
Link: https://aws.amazon.com/what-is/compute/

Source: hpe.com
Link: https://www.hpe.com/uk/en/what-is/compute.html
