How Policy Thresholds Govern Safe Frontier AI Development

Describes how predefined risk limits trigger interventions such as pauses in training, additional safeguards, or regulatory reporting.


Introduction

In proposals for mandatory frontier AI safety evaluations, policy thresholds are the point at which concern turns into action. Rather than treating safety reviews as advisory exercises, threshold-based governance links predefined indicators of danger to specific consequences. If a model, training run, or organisation crosses a threshold, additional evaluations, reporting requirements, deployment restrictions, or even pauses to development may be triggered. The core idea is that potentially catastrophic risks should not depend entirely on the judgement of individual AI companies. Instead, agreed limits create predictable responses before systems become too powerful to manage. Supporters see this as a practical way to reduce the risk of loss of control, dangerous autonomy, or catastrophic misuse. Critics argue that the wrong thresholds could either fail to catch genuinely dangerous systems or unnecessarily slow beneficial innovation. [Frontier Model Forum] [GOV.UK]

Within AI doom debates, policy thresholds matter because many existential-risk arguments assume that warning signs may appear before a full catastrophe. The challenge is deciding which warning signs deserve intervention and what consequences should follow when they appear.

What Counts as a Threshold?

A policy threshold is a predefined condition that triggers a regulatory or governance response. Different proposals use different kinds of thresholds because no single measurement captures AI risk reliably.

Several categories appear repeatedly in frontier AI governance discussions:

  • Compute thresholds: based on the amount of computational power used during training.
  • Capability thresholds: based on demonstrated abilities, such as advanced cyber operations, biological design assistance, or autonomous AI research.
  • Risk thresholds: based on estimated probabilities of severe harm. [Frontier Model Forum]
  • Deployment thresholds: based on the context in which a model is used and the safeguards surrounding it.
  • Combined thresholds: using several indicators together rather than relying on one measure. [Frontier Model Forum] [GOV.UK]
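
As a rough illustration of a combined threshold, the sketch below flags a model when any one of several indicators reaches its own limit. The field names, the 0 to 1 evaluation scores, and every numeric limit are hypothetical placeholders chosen for the example; they are not values taken from any framework cited here.

```python
from dataclasses import dataclass, field


@dataclass
class ModelProfile:
    """Hypothetical summary of one model; all fields are illustrative."""
    training_flop: float                      # total training compute
    capability_scores: dict[str, float] = field(default_factory=dict)  # 0..1 per domain
    estimated_harm_probability: float = 0.0   # estimated probability of severe harm


# Placeholder limits chosen only for this example.
COMPUTE_LIMIT_FLOP = 1e25
CAPABILITY_LIMIT = 0.5
RISK_LIMIT = 1e-4


def crossed_indicators(profile: ModelProfile) -> list[str]:
    """Return the names of every indicator at or above its limit."""
    crossed = []
    if profile.training_flop >= COMPUTE_LIMIT_FLOP:
        crossed.append("compute")
    for domain, score in sorted(profile.capability_scores.items()):
        if score >= CAPABILITY_LIMIT:
            crossed.append(f"capability:{domain}")
    if profile.estimated_harm_probability >= RISK_LIMIT:
        crossed.append("risk")
    return crossed


# A model below the compute limit can still be flagged on capability.
small_but_capable = ModelProfile(
    training_flop=3e24,
    capability_scores={"cyber": 0.7, "bio": 0.2},
)
print(crossed_indicators(small_but_capable))  # ['capability:cyber']
```

An "any indicator" rule is only one possible design; a framework could equally require several indicators to agree before acting.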

Frontier safety frameworks increasingly rely on capability-based thresholds. The reasoning is straightforward: existential risk comes from what a system can do rather than from how much computing power was used to train it. Frameworks published by major AI developers and analysed by independent researchers commonly define capability levels that trigger additional safeguards once models approach areas such as cyber offence, biological assistance, autonomous replication, strategic deception, or AI-enabled acceleration of further AI development. [METR]

Why Compute Thresholds Remain Important

Despite their limitations, compute thresholds remain attractive because they can be measured before training begins.

A regulator cannot directly inspect capabilities that do not yet exist, but it can identify unusually large training runs. This makes compute a useful early-warning mechanism. Several governance proposals therefore treat compute thresholds as an initial screening tool that determines which projects receive enhanced oversight. [arXiv]

The most prominent example is the European Union’s framework for general-purpose AI models. Models trained above a specified computational threshold are presumed to present systemic risk and become subject to additional obligations. The threshold currently associated with systemic-risk classification is 10²⁵ floating-point operations (FLOPs), although regulators have indicated that thresholds may evolve as technology changes. [Digital Strategy] [Artificial Intelligence Act]
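
To make the arithmetic concrete, the sketch below compares a rough training-compute estimate with the 10²⁵ figure mentioned above. It assumes the widely used rule of thumb that training a dense transformer costs about 6 × parameters × training tokens in floating-point operations. That approximation, the function names, and the two example model sizes are assumptions for illustration and are not part of the EU framework.

```python
SYSTEMIC_RISK_FLOP = 1e25  # presumption threshold cited in the text


def estimate_training_flop(parameters: float, tokens: float) -> float:
    """Rule-of-thumb estimate: about 6 FLOP per parameter per training token."""
    return 6.0 * parameters * tokens


def presumed_systemic_risk(parameters: float, tokens: float) -> bool:
    """True when the estimated training compute reaches the threshold."""
    return estimate_training_flop(parameters, tokens) >= SYSTEMIC_RISK_FLOP


# Hypothetical model sizes, not real systems.
examples = {
    "7e9 params, 2e12 tokens": (7e9, 2e12),
    "4e11 params, 1.5e13 tokens": (4e11, 1.5e13),
}
for label, (n_params, n_tokens) in examples.items():
    flop = estimate_training_flop(n_params, n_tokens)
    above = presumed_systemic_risk(n_params, n_tokens)
    print(f"{label}: {flop:.1e} FLOP, above threshold: {above}")
```

Under these assumptions the first example lands around 8.4 × 10²² FLOP, roughly two orders of magnitude below the threshold, while the second lands around 3.6 × 10²⁵ and exceeds it. Because both inputs are planned before training starts, this kind of check can be run in advance, which is the property that makes compute attractive as a screening tool.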

However, compute thresholds face an important criticism. Advances in inference-time scaling (the use of large amounts of computation after training rather than during training) may weaken the connection between training compute and actual capability. A model might become significantly more capable without crossing traditional training thresholds. This is one reason many researchers view compute thresholds as useful filters rather than complete safety measures. [International AI Safety Report]

What Happens When a Threshold Is Crossed?

The defining feature of threshold-based governance is that crossing a limit has consequences.

Different frameworks propose different responses, but common interventions include:

  1. Mandatory additional evaluations before development continues.
  2. Independent external review by regulators or accredited assessors.
  3. Enhanced security requirements to reduce theft, misuse, or model leakage.
  4. Deployment restrictions limiting access to high-risk capabilities.
  5. Incident reporting obligations when dangerous behaviour is discovered.
  6. Temporary pauses in training or deployment until risks are addressed.
  7. Escalation to government oversight for systems deemed capable of causing severe harm. [CLTR] [GOV.UK]

Some frontier safety proposals use a tiered model rather than a single cut-off. Instead of dividing systems into “safe” and “unsafe”, they establish increasing levels of concern. A model approaching a threshold might require intensified monitoring, while a model crossing a higher threshold could trigger deployment restrictions or development pauses. Researchers sometimes describe these as green, yellow, and red zones, with increasingly serious interventions as risks rise. [arXiv]
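
The tiered idea can be sketched as a small lookup from a measured indicator to a zone and its interventions. The zone boundaries, the 0 to 1 score scale, and the pairing of each zone with particular interventions are hypothetical; actual frameworks define their own levels and responses.

```python
from enum import Enum


class Zone(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


# Hypothetical boundaries on a 0..1 capability-evaluation score.
YELLOW_FROM = 0.4
RED_FROM = 0.7

# Illustrative pairing of zones with interventions; not drawn from any framework.
INTERVENTIONS = {
    Zone.GREEN: ["routine evaluation"],
    Zone.YELLOW: ["intensified monitoring", "additional evaluations"],
    Zone.RED: ["deployment restrictions", "pause until risks are addressed"],
}


def classify(score: float) -> Zone:
    """Map a score to the highest zone whose lower bound it reaches."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= RED_FROM:
        return Zone.RED
    if score >= YELLOW_FROM:
        return Zone.YELLOW
    return Zone.GREEN


for score in (0.1, 0.55, 0.9):
    zone = classify(score)
    print(score, zone.value, INTERVENTIONS[zone])
```

Keeping the boundaries and the intervention table as plain data makes it easy to see, and to argue about, exactly where each escalation step begins.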

For advocates concerned about AI doom, the key purpose of escalation is to prevent situations where companies continue pushing capabilities upward despite evidence that systems are becoming difficult to control.

Regulatory Consequences in Practice

The strongest existing examples of threshold-linked obligations come from emerging AI regulations and voluntary frontier safety commitments.

Under the EU AI Act, providers of general-purpose AI models classified as presenting systemic risk face additional requirements. These include model evaluations, risk assessments, adversarial testing, cybersecurity protections, and serious-incident reporting. Significant penalties can apply for non-compliance. The framework therefore creates a direct connection between crossing a threshold and assuming new legal responsibilities. [Digital Strategy]

The 2024 Seoul Frontier AI Safety Commitments took a related approach. Participating companies agreed to define thresholds at which risks would become intolerable unless adequately mitigated. The commitments emphasised that thresholds should be measurable and linked to meaningful actions when exceeded. [GOV.UK]

In the United States, debate around California’s proposed SB 1047 illustrated how threshold-based governance can become politically contentious. The bill attempted to apply obligations to the largest frontier models and required safety testing and emergency shutdown capabilities. Supporters argued that the legislation addressed catastrophic risks from advanced systems, while opponents warned that it could burden innovation and create legal uncertainty. The bill passed the state legislature but was vetoed by the governor in September 2024, so it did not become law. It nonetheless demonstrated how disagreements often focus less on whether thresholds should exist and more on where they should be set. [Morgan Lewis] [Wikipedia]

The Hard Problem: Defining “Too Dangerous”

The most difficult question is not what happens after a threshold is crossed. It is deciding where the threshold belongs.

Capability thresholds can appear more directly connected to real-world harm than compute thresholds, but they create measurement problems. Regulators must determine exactly what level of cyber capability, biological assistance, or autonomous behaviour counts as unacceptable. Small changes in wording can have major consequences. [arXiv]

Risk thresholds attempt to solve this problem by focusing directly on harm. Instead of asking whether a model can perform a task, they ask whether it raises the probability of severe damage beyond an acceptable level. In principle, this approach is more defensible because it targets outcomes rather than proxies. In practice, estimating the probability of unprecedented harms remains extremely difficult. Researchers therefore often recommend combining risk thresholds with more measurable capability indicators. [arXiv]
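
One way to picture the combination is a rule that acts on whichever signal is available and errs on the side of caution when the probability estimate is uncertain. In the sketch below, the tolerance value, the choice to act on the upper end of an estimate range, and all names are assumptions made for illustration; the cited research does not prescribe this particular rule.

```python
from dataclasses import dataclass


@dataclass
class RiskEstimate:
    """Hypothetical estimate of the probability of severe harm, as a range."""
    low: float
    high: float

    def __post_init__(self) -> None:
        if not 0.0 <= self.low <= self.high <= 1.0:
            raise ValueError("need 0 <= low <= high <= 1")


RISK_TOLERANCE = 1e-3  # placeholder, not a recommended value


def requires_intervention(estimate: RiskEstimate, capability_flagged: bool) -> bool:
    """Trigger if the pessimistic end of the range exceeds the tolerance,
    or if a more measurable capability indicator has already been flagged."""
    return estimate.high > RISK_TOLERANCE or capability_flagged


# A wide range reflects how hard unprecedented harms are to estimate.
uncertain = RiskEstimate(low=1e-5, high=5e-3)
confident_low = RiskEstimate(low=1e-6, high=1e-5)

print(requires_intervention(uncertain, capability_flagged=False))      # True
print(requires_intervention(confident_low, capability_flagged=False))  # False
print(requires_intervention(confident_low, capability_flagged=True))   # True
```

The third case shows why the two kinds of threshold complement each other: a capability flag can trigger a response even when the probability estimate alone would not.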

This uncertainty is especially important in AI doom discussions. If existential risks arise from novel forms of misalignment, deception, or strategic planning that have never previously existed, policymakers may not know which capabilities are most predictive. Thresholds could therefore be set too low, creating unnecessary restrictions, or too high, failing to intervene before dangerous systems emerge.

Can Thresholds Reduce Existential Risk?

From an AI doom perspective, policy thresholds are best understood as a governance tool for handling uncertainty rather than a guaranteed solution.

Supporters argue that thresholds create predetermined stopping points in environments where competitive pressures might otherwise encourage continuous capability expansion. If organisations know in advance that crossing certain capability levels triggers mandatory evaluations, reporting requirements, or pauses, they may have stronger incentives to invest in safety before risks emerge. [CLTR] [GOV.UK]

Critics respond that thresholds depend on evaluators correctly identifying dangerous capabilities. If transformative risks arise from unexpected combinations of abilities, predefined limits may provide a false sense of security. Some analysts also note that many existing safety frameworks still lack clear quantitative definitions of acceptable and unacceptable risk, making enforcement difficult. [arXiv]

The central trade-off is therefore not safety versus innovation but predictability versus flexibility. Strict thresholds can create clear accountability and intervention points. Flexible approaches may adapt more easily to rapidly changing technology but risk allowing dangerous capability growth before oversight mechanisms activate.

For advocates of mandatory frontier AI safety evaluations before training, thresholds are the mechanism that transforms evaluation from observation into governance. Without consequences attached to crossing predefined limits, evaluations merely describe risk. With thresholds, they become a basis for deciding when development should continue, when additional safeguards are required, and when the potential stakes are high enough to justify regulatory intervention. [Frontier Model Forum] [CLTR]

Endnotes

  1. Source: GOV.UK
    Title: frontier ai safety commitments ai seoul summit 2024
    Link: https://www.gov.uk/government/publications/frontier-ai-safety-commitments-ai-seoul-summit-2024/frontier-ai-safety-commitments-ai-seoul-summit-2024
    Source snippet

    Frontier AI Safety Commitments, AI Seoul Summit 2024Feb 7, 2025 — Thresholds can be defined using model capabilities, estimates of risk...

  2. Source: arxiv.org
    Title: arXiv Risk thresholds for frontier AI
    Link: https://arxiv.org/abs/2406.14713

  3. Source: metr.org
    Title: common elements
    Link: https://metr.org/common-elements
    Source snippet

    MetrCommon Elements of Frontier AI Safety Policies16 Dec 2025 — The Framework is built around capability thresholds called “Critical Capa...

  4. Source: metr.org
    Title: 2025 12 09 common elements of frontier ai safety policies
    Link: https://metr.org/blog/2025-12-09-common-elements-of-frontier-ai-safety-policies/
    Source snippet

    Common Elements of Frontier AI Safety Policies...9 Dec 2025 — The policies also outline commitments to conduct model evaluations assessi...

  5. Source: arxiv.org
    Link: https://arxiv.org/pdf/2406.14713
    Source snippet

    arXivRisk thresholds for frontier AIJune 20, 2024 — by L Koessler · 2024 · Cited by 25 — Compute thresholds should thus be used as an ini...

    Published: June 20, 2024

  6. Source: longtermresilience.org
    Title: frontier ai safety frameworks need to include risk governance
    Link: https://www.longtermresilience.org/reports/frontier-ai-safety-frameworks-need-to-include-risk-governance/
    Source snippet

    CLTRFrontier AI safety frameworks need to include risk...5 Feb 2025 — These frameworks aim to set thresholds for powerful AI models, and...

  7. Source: arxiv.org
    Link: https://arxiv.org/abs/2507.16534
    Source snippet

    arXivFrontier AI Risk Management Framework in Practice: A Risk Analysis Technical ReportJuly 22, 2025...

    Published: July 22, 2025

  8. Source: arxiv.org
    Title: arXiv Intolerable Risk Threshold Recommendations for Artificial Intelligence
    Link: https://arxiv.org/abs/2503.05812

  9. Source: Wikipedia
    Title: Safe and Secure Innovation for Frontier Artificial Intelligence Models Act
    Link: https://en.wikipedia.org/wiki/Safe_and_Secure_Innovation_for_Frontier_Artificial_Intelligence_Models_Act
    Source snippet

    Safe and Secure Innovation for Frontier Artificial...The Safe and Secure Innovation for Frontier Artificial Intelligence Models Act...

  10. Source: arxiv.org
    Link: https://arxiv.org/abs/2512.01166
    Source snippet

    arXivEvaluating AI Companies' Frontier Safety Frameworks: Methodology and ResultsDecember 1, 2025...

    Published: December 1, 2025

  11. Source: metr.org
    Link: https://metr.org/notes/2026-01-29-frontier-ai-safety-regulations/
    Source snippet

    Frontier AI safety regulations: A reference for lab staff29 Jan 2026 — Signatories have been expected to comply with the Code since Augus...

    Published: August 2025

  12. Source: Wikipedia
    Title: Artificial intelligence
    Link: https://en.wikipedia.org/wiki/Artificial_intelligence
    Source snippet

    Artificial intelligenceArtificial intelligence (AI) is the capability of computational systems to perform tasks typically associated w...

  13. Source: GOV.UK
    Title: emerging processes for frontier ai safety
    Link: https://www.gov.uk/government/publications/emerging-processes-for-frontier-ai-safety/emerging-processes-for-frontier-ai-safety
    Source snippet

    processes for frontier AI safety27 Oct 2023 — This document contains the world's first overview of emerging safety processes focused on f...

  14. Source: arxiv.org
    Link: https://arxiv.org/html/2512.01166v3
    Source snippet

    Evaluating AI Providers' Frontier AI Safety Frameworks26 Mar 2026 — Capability Thresholds: Defined levels of AI system performance that...

  15. Source: frontiermodelforum.org
    Title: issue brief thresholds for frontier ai safety frameworks
    Link: https://www.frontiermodelforum.org/updates/issue-brief-thresholds-for-frontier-ai-safety-frameworks/
    Source snippet

    Frontier Model ForumIssue Brief: Thresholds for Frontier AI Safety Frameworks7 Feb 2025 — This brief elaborates on the importance of thre...

  16. Source: internationalaisafetyreport.org
    Title: international ai safety report 2026
    Link: https://internationalaisafetyreport.org/publication/international-ai-safety-report-2026
    Source snippet

    International AI Safety ReportInternational AI Safety Report 20263 Feb 2026 — For example, some current governance approaches use thresho...

  17. Source: digital-strategy.ec.europa.eu
    Link: https://digital-strategy.ec.europa.eu/en/factpages/general-purpose-ai-obligations-under-ai-act
    Source snippet

    Digital StrategyGeneral-purpose AI obligations under the AI ActAug 1, 2025 — **GPAI models are presumed to pose systemic risk if they are...

  18. Source: artificialintelligenceact.eu
    Title: gpai guidelines overview
    Link: https://artificialintelligenceact.eu/gpai-guidelines-overview/
    Source snippet

    more...

  19. Source: digital-strategy.ec.europa.eu
    Title: general purpose ai models ai act questions answers
    Link: https://digital-strategy.ec.europa.eu/en/faqs/general-purpose-ai-models-ai-act-questions-answers
    Source snippet

    Digital StrategyGeneral-Purpose AI Models in the AI Act – Questions & Answers10 Jul 2025 — The obligations for providers of general-purpo...

  20. Source: morganlewis.com
    Link: https://www.morganlewis.com/pubs/2024/08/californias-sb-1047-would-impose-new-safety-requirements-for-developers-of-large-scale-ai-models
    Source snippet

    Morgan LewisCalifornia's SB 1047 Would Impose New Safety...August 29, 2024 — 29 Aug 2024 — The bill would broadly cover any AI developer...

    Published: August 29, 2024

  21. Source: artificialintelligenceact.eu
    Link: https://artificialintelligenceact.eu/article/51/
    Source snippet

    If the AI model has high impact capabilities, determined by technical tools and...Read more...

  22. Source: digital-strategy.ec.europa.eu
    Title: EU AI Act | Shaping Europe’s digital future
    Link: https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai
    Source snippet

    Act | Shaping Europe's digital future - European UnionThe AI Act is the first-ever legal framework on AI, which addresses the risks of AI...

  23. Source: digital-strategy.ec.europa.eu
    Title: contents code gpai
    Link: https://digital-strategy.ec.europa.eu/en/policies/contents-code-gpai
    Source snippet

    General-Purpose AI Code of PracticeJul 10, 2025 — The Code of Practice helps industry comply with the AI Act legal obligations on safety...

  24. Source: frontiermodelforum.org
    Title: risk taxonomy and thresholds
    Link: https://www.frontiermodelforum.org/technical-reports/risk-taxonomy-and-thresholds/
    Source snippet

    for Frontier AI Frameworks18 Jun 2025 — This report examines the rationale for including only select risk domains within frontier AI fram...
