How Policy Thresholds Govern Safe Frontier AI Development
Describes how predefined risk limits trigger interventions like pausing training, additional safeguards, or regulatory reporting.
On this page
- Defining capability thresholds and danger metrics
- Actions triggered when limits are exceeded
- Balancing strictness to avoid over-blocking innovation
Introduction
In proposals for mandatory frontier AI safety evaluations, policy thresholds are the point at which concern turns into action. Rather than treating safety reviews as advisory exercises, threshold-based governance links predefined indicators of danger to specific consequences. If a model, training run, or organisation crosses a threshold, additional evaluations, reporting requirements, deployment restrictions, or even pauses to development may be triggered. The core idea is that potentially catastrophic risks should not depend entirely on the judgement of individual AI companies. Instead, agreed limits create predictable responses before systems become too powerful to manage. Supporters see this as a practical way to reduce the risk of loss of control, dangerous autonomy, or catastrophic misuse. Critics argue that the wrong thresholds could either fail to catch genuinely dangerous systems or unnecessarily slow beneficial innovation. [Frontier Model Forum] [GOV.UK]
Within AI doom debates, policy thresholds matter because many existential-risk arguments assume that warning signs may appear before a full catastrophe. The challenge is deciding which warning signs deserve intervention and what consequences should follow when they appear.
What Counts as a Threshold?
A policy threshold is a predefined condition that triggers a regulatory or governance response. Different proposals use different kinds of thresholds because no single measurement captures AI risk reliably.
Several categories appear repeatedly in frontier AI governance discussions:
- Compute thresholds: based on the amount of computational power used during training.
- Capability thresholds: based on demonstrated abilities, such as advanced cyber operations, biological design assistance, or autonomous AI research.
- Risk thresholds: based on estimated probabilities of severe harm. [Frontier Model Forum]
- Deployment thresholds: based on the context in which a model is used and the safeguards surrounding it.
- Combined thresholds: using several indicators together rather than relying on one measure. [Frontier Model Forum] [GOV.UK]
Frontier safety frameworks increasingly rely on capability-based thresholds. The reasoning is straightforward: existential risk comes from what a system can do rather than from how much computation was used to train it. Frameworks published by major AI developers and analysed by independent researchers commonly define capability levels that trigger additional safeguards once models approach areas such as cyber offence, biological assistance, autonomous replication, strategic deception, or AI-enabled acceleration of further AI development. [METR]
Why Compute Thresholds Remain Important
Despite their limitations, compute thresholds remain attractive because they can be measured before training begins.
A regulator cannot directly inspect capabilities that do not yet exist, but it can identify unusually large training runs. This makes compute a useful early-warning mechanism. Several governance proposals therefore treat compute thresholds as an initial screening tool that determines which projects receive enhanced oversight. [arXiv]
The most prominent example is the European Union’s framework for general-purpose AI models. Models trained above a specified computational threshold are presumed to present systemic risk and become subject to additional obligations. The threshold currently associated with systemic-risk classification is 10²⁵ floating-point operations (FLOPs), although regulators have indicated that thresholds may evolve as technology changes. [Digital Strategy] [Artificial Intelligence Act]
However, compute thresholds face an important criticism. Advances in inference-time scaling (the use of large amounts of computation after training rather than during training) may weaken the connection between training compute and actual capability. A model might become significantly more capable without crossing traditional training thresholds. This is one reason many researchers view compute thresholds as useful filters rather than complete safety measures. [International AI Safety Report]
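As a rough illustration of how a compute threshold can be checked before training begins, the sketch below estimates training compute from planned model size and dataset size and compares it with the 10²⁵ FLOP figure. It assumes the common "6 × parameters × tokens" rule of thumb for dense transformer training, which is a heuristic and not a method prescribed by the EU framework. The function names and the two example runs are hypothetical.

```python
# Hypothetical sketch: screen a planned training run against a compute threshold.
# The 6 * N * D estimate is a widely used rule of thumb for dense transformers;
# it is an assumption here, not the calculation method defined by any regulator.

EU_SYSTEMIC_RISK_FLOP = 1e25  # threshold figure cited in the text


def estimate_training_flop(n_parameters: float, n_tokens: float) -> float:
    """Approximate total training compute as 6 * parameters * tokens."""
    return 6.0 * n_parameters * n_tokens


def presumed_systemic_risk(
    n_parameters: float,
    n_tokens: float,
    threshold: float = EU_SYSTEMIC_RISK_FLOP,
) -> bool:
    """Return True when the estimated compute is greater than the threshold."""
    return estimate_training_flop(n_parameters, n_tokens) > threshold


if __name__ == "__main__":
    # Illustrative, made-up run sizes (not real models).
    planned_runs = [("run_a", 7e10, 1.5e13), ("run_b", 4e11, 1.5e13)]
    for name, params, tokens in planned_runs:
        flop = estimate_training_flop(params, tokens)
        flagged = presumed_systemic_risk(params, tokens)
        print(f"{name}: {flop:.2e} FLOP, above threshold: {flagged}")
```

Because the estimate needs only planned parameter and token counts, a check of this kind can run before any training starts, which is the screening property the text attributes to compute thresholds.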
What Happens When a Threshold Is Crossed?
The defining feature of threshold-based governance is that crossing a limit has consequences.
Different frameworks propose different responses, but common interventions include:
- Mandatory additional evaluations before development continues.
- Independent external review by regulators or accredited assessors.
- Enhanced security requirements to reduce theft, misuse, or model leakage.
- Deployment restrictions limiting access to high-risk capabilities.
- Incident reporting obligations when dangerous behaviour is discovered.
- Temporary pauses in training or deployment until risks are addressed.
Some frontier safety proposals use a tiered model rather than a single cut-off. Instead of dividing systems into “safe” and “unsafe”, they establish increasing levels of concern. A model approaching a threshold might require intensified monitoring, while a model crossing a higher threshold could trigger deployment restrictions or development pauses. Researchers sometimes describe these as green, yellow, and red zones, with increasingly serious interventions as risks rise. arXiv
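The tiered idea can be sketched as a simple mapping from an evaluation score to a zone and a set of responses. This is a minimal sketch under stated assumptions: the 0 to 1 score, the two cut-off values, and the response lists are all hypothetical placeholders, since real frameworks define their own indicators and interventions.

```python
from enum import Enum


class Zone(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


# Hypothetical cut-offs on a normalised 0-1 evaluation score.
YELLOW_CUTOFF = 0.5
RED_CUTOFF = 0.8

# Hypothetical responses, loosely following the interventions listed in the text.
RESPONSES = {
    Zone.GREEN: ["continue with routine evaluations"],
    Zone.YELLOW: ["intensify monitoring", "run additional evaluations"],
    Zone.RED: ["restrict deployment", "pause development until risks are addressed"],
}


def classify(score: float) -> Zone:
    """Map a normalised evaluation score to an escalation zone."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= RED_CUTOFF:
        return Zone.RED
    if score >= YELLOW_CUTOFF:
        return Zone.YELLOW
    return Zone.GREEN


def required_responses(score: float) -> list[str]:
    """Return the interventions attached to the zone the score falls in."""
    return RESPONSES[classify(score)]


if __name__ == "__main__":
    for score in (0.2, 0.6, 0.9):
        print(score, classify(score).value, required_responses(score))
```

Keeping the cut-offs and responses in plain data rather than in branching logic reflects the point that the consequences are fixed in advance, not decided after a result comes in.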
For advocates concerned about AI doom, the key purpose of escalation is to prevent situations where companies continue pushing capabilities upward despite evidence that systems are becoming difficult to control.
Regulatory Consequences in Practice
The strongest existing examples of threshold-linked obligations come from emerging AI regulations and voluntary frontier safety commitments.
Under the EU AI Act, providers of general-purpose AI models classified as presenting systemic risk face additional requirements. These include model evaluations, risk assessments, adversarial testing, cybersecurity protections, and serious-incident reporting. Significant penalties can apply for non-compliance. The framework therefore creates a direct connection between crossing a threshold and assuming new legal responsibilities. Digital Strategy
The 2024 Seoul Frontier AI Safety Commitments took a related approach. Participating companies agreed to define thresholds at which risks would become intolerable unless adequately mitigated. The commitments emphasised that thresholds should be measurable and linked to meaningful actions when exceeded. GOV.UK
In the United States, debate around California’s proposed SB 1047 illustrated how threshold-based governance can become politically contentious. The bill sought to impose obligations on developers of the largest frontier models, including safety testing and emergency shutdown capabilities. Supporters argued that the legislation addressed catastrophic risks from advanced systems, while opponents warned that it could burden innovation and create legal uncertainty. The bill passed the state legislature but was vetoed by Governor Gavin Newsom in September 2024, so it did not become law. The debate nonetheless showed that disagreements often focus less on whether thresholds should exist and more on where they should be set. Morgan Lewis, Wikipedia
The Hard Problem: Defining “Too Dangerous”
The most difficult question is not what happens after a threshold is crossed. It is deciding where the threshold belongs.
Capability thresholds can appear more directly connected to real-world harm than compute thresholds, but they create measurement problems. Regulators must determine exactly what level of cyber capability, biological assistance, or autonomous behaviour counts as unacceptable. Small changes in wording can have major consequences. arXiv
Risk thresholds attempt to solve this problem by focusing directly on harm. Instead of asking whether a model can perform a task, they ask whether it raises the probability of severe damage beyond an acceptable level. In principle, this approach is more defensible because it targets outcomes rather than proxies. In practice, estimating the probability of unprecedented harms remains extremely difficult. Researchers therefore often recommend combining risk thresholds with more measurable capability indicators. arXiv
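One way to picture the recommended combination is a check that fires on either signal: an uncertain risk estimate whose upper bound exceeds a tolerance, or any capability indicator that has been triggered. The sketch below is illustrative only. The `RiskEstimate` type, the tolerance value, and the indicator names are hypothetical, and comparing the upper bound against the tolerance is one cautious design choice among several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskEstimate:
    """Hypothetical interval estimate for the probability of severe harm."""
    low: float
    high: float


def threshold_crossed(
    estimate: RiskEstimate,
    tolerance: float,
    capability_flags: dict[str, bool],
) -> bool:
    """Trigger when the upper risk bound exceeds the tolerance
    or when any capability indicator has fired."""
    if not 0.0 <= estimate.low <= estimate.high <= 1.0:
        raise ValueError("expected 0 <= low <= high <= 1")
    risk_exceeded = estimate.high > tolerance
    capability_triggered = any(capability_flags.values())
    return risk_exceeded or capability_triggered


if __name__ == "__main__":
    tolerance = 1e-4  # hypothetical acceptable probability of severe harm
    flags = {"cyber_offence": False, "biological_assistance": False}

    # Wide interval whose upper bound stays under the tolerance: no trigger.
    print(threshold_crossed(RiskEstimate(1e-6, 5e-5), tolerance, flags))

    # Same risk estimate, but one measurable capability indicator fired.
    flags_fired = {**flags, "cyber_offence": True}
    print(threshold_crossed(RiskEstimate(1e-6, 5e-5), tolerance, flags_fired))
```

Using the upper bound means that a poorly constrained estimate is treated as more concerning, while the capability flags give a measurable backstop when the probability estimate itself is unreliable.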
This uncertainty is especially important in AI doom discussions. If existential risks arise from novel forms of misalignment, deception, or strategic planning that have never previously existed, policymakers may not know which capabilities are most predictive. Thresholds could therefore be set too low, creating unnecessary restrictions, or too high, failing to intervene before dangerous systems emerge.
Can Thresholds Reduce Existential Risk?
From an AI doom perspective, policy thresholds are best understood as a governance tool for handling uncertainty rather than a guaranteed solution.
Supporters argue that thresholds create predetermined stopping points in environments where competitive pressures might otherwise encourage continuous capability expansion. If organisations know in advance that crossing certain capability levels triggers mandatory evaluations, reporting requirements, or pauses, they may have stronger incentives to invest in safety before risks emerge. CLTR GOV.UK
Critics respond that thresholds depend on evaluators correctly identifying dangerous capabilities. If transformative risks arise from unexpected combinations of abilities, predefined limits may provide a false sense of security. Some analysts also note that many existing safety frameworks still lack clear quantitative definitions of acceptable and unacceptable risk, making enforcement difficult. arXiv
The central trade-off is therefore not safety versus innovation but predictability versus flexibility. Strict thresholds can create clear accountability and intervention points. Flexible approaches may adapt more easily to rapidly changing technology but risk allowing dangerous capability growth before oversight mechanisms activate.
For advocates of mandatory frontier AI safety evaluations before training, thresholds are the mechanism that transforms evaluation from observation into governance. Without consequences attached to crossing predefined limits, evaluations merely describe risk. With thresholds, they become a basis for deciding when development should continue, when additional safeguards are required, and when the potential stakes are high enough to justify regulatory intervention. Frontier Model Forum CLTR
Endnotes
-
Source: GOV.UK
Title: Frontier AI Safety Commitments, AI Seoul Summit 2024
Link: https://www.gov.uk/government/publications/frontier-ai-safety-commitments-ai-seoul-summit-2024/frontier-ai-safety-commitments-ai-seoul-summit-2024
Source snippet (Feb 7, 2025): Thresholds can be defined using model capabilities, estimates of risk...
-
Source: arxiv.org
Title: Risk thresholds for frontier AI
Link: https://arxiv.org/abs/2406.14713
-
Source: metr.org
Title: Common Elements of Frontier AI Safety Policies
Link: https://metr.org/common-elements
Source snippet (16 Dec 2025): The Framework is built around capability thresholds called “Critical Capa...
-
Source: metr.org
Title: Common Elements of Frontier AI Safety Policies
Link: https://metr.org/blog/2025-12-09-common-elements-of-frontier-ai-safety-policies/
Source snippet (9 Dec 2025): The policies also outline commitments to conduct model evaluations assessi...
-
Source: arxiv.org
Title: Risk thresholds for frontier AI
Link: https://arxiv.org/pdf/2406.14713
Source snippet: by L Koessler, 2024: Compute thresholds should thus be used as an ini...
Published: June 20, 2024
-
Source: longtermresilience.org (CLTR)
Title: Frontier AI safety frameworks need to include risk governance
Link: https://www.longtermresilience.org/reports/frontier-ai-safety-frameworks-need-to-include-risk-governance/
Source snippet (5 Feb 2025): These frameworks aim to set thresholds for powerful AI models, and...
-
Source: arxiv.org
Title: Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report
Link: https://arxiv.org/abs/2507.16534
Published: July 22, 2025
-
Source: arxiv.org
Title: Intolerable Risk Threshold Recommendations for Artificial Intelligence
Link: https://arxiv.org/abs/2503.05812
-
Source: Wikipedia
Title: Safe and Secure Innovation for Frontier Artificial Intelligence Models Act
Link: https://en.wikipedia.org/wiki/Safe_and_Secure_Innovation_for_Frontier_Artificial_Intelligence_Models_Act
Source snippet: The Safe and Secure Innovation for Frontier Artificial Intelligence Models Act...
-
Source: arxiv.org
Title: Evaluating AI Companies' Frontier Safety Frameworks: Methodology and Results
Link: https://arxiv.org/abs/2512.01166
Published: December 1, 2025
-
Source: metr.org
Title: Frontier AI safety regulations: A reference for lab staff
Link: https://metr.org/notes/2026-01-29-frontier-ai-safety-regulations/
Source snippet: Signatories have been expected to comply with the Code since Augus...
Published: 29 Jan 2026
-
Source: Wikipedia
Title: Artificial intelligence
Link: https://en.wikipedia.org/wiki/Artificial_intelligence
Source snippet: Artificial intelligence (AI) is the capability of computational systems to perform tasks typically associated w...
-
Source: GOV.UK
Title: Emerging processes for frontier AI safety
Link: https://www.gov.uk/government/publications/emerging-processes-for-frontier-ai-safety/emerging-processes-for-frontier-ai-safety
Source snippet (27 Oct 2023): This document contains the world's first overview of emerging safety processes focused on f...
-
Source: arxiv.org
Title: Evaluating AI Providers' Frontier AI Safety Frameworks
Link: https://arxiv.org/html/2512.01166v3
Source snippet (26 Mar 2026): Capability Thresholds: Defined levels of AI system performance that...
-
Source: frontiermodelforum.org
Title: Issue Brief: Thresholds for Frontier AI Safety Frameworks
Link: https://www.frontiermodelforum.org/updates/issue-brief-thresholds-for-frontier-ai-safety-frameworks/
Source snippet (7 Feb 2025): This brief elaborates on the importance of thre...
-
Source: internationalaisafetyreport.org
Title: International AI Safety Report 2026
Link: https://internationalaisafetyreport.org/publication/international-ai-safety-report-2026
Source snippet (3 Feb 2026): For example, some current governance approaches use thresho...
-
Source: digital-strategy.ec.europa.eu
Title: General-purpose AI obligations under the AI Act
Link: https://digital-strategy.ec.europa.eu/en/factpages/general-purpose-ai-obligations-under-ai-act
Source snippet (Aug 1, 2025): GPAI models are presumed to pose systemic risk if they are...
-
Source: artificialintelligenceact.eu
Title: GPAI guidelines overview
Link: https://artificialintelligenceact.eu/gpai-guidelines-overview/
-
Source: digital-strategy.ec.europa.eu
Title: General-Purpose AI Models in the AI Act – Questions & Answers
Link: https://digital-strategy.ec.europa.eu/en/faqs/general-purpose-ai-models-ai-act-questions-answers
Source snippet (10 Jul 2025): The obligations for providers of general-purpo...
-
Source: morganlewis.com
Title: California's SB 1047 Would Impose New Safety Requirements for Developers of Large-Scale AI Models
Link: https://www.morganlewis.com/pubs/2024/08/californias-sb-1047-would-impose-new-safety-requirements-for-developers-of-large-scale-ai-models
Source snippet: The bill would broadly cover any AI developer...
Published: August 29, 2024
-
Source: artificialintelligenceact.eu
Link: https://artificialintelligenceact.eu/article/51/
Source snippet: If the AI model has high impact capabilities, determined by technical tools and...
-
Source: digital-strategy.ec.europa.eu
Title: AI Act | Shaping Europe’s digital future
Link: https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai
Source snippet: The AI Act is the first-ever legal framework on AI, which addresses the risks of AI...
-
Source: digital-strategy.ec.europa.eu
Title: General-Purpose AI Code of Practice
Link: https://digital-strategy.ec.europa.eu/en/policies/contents-code-gpai
Source snippet (Jul 10, 2025): The Code of Practice helps industry comply with the AI Act legal obligations on safety...
-
Source: frontiermodelforum.org
Title: Risk Taxonomy and Thresholds for Frontier AI Frameworks
Link: https://www.frontiermodelforum.org/technical-reports/risk-taxonomy-and-thresholds/
Source snippet (18 Jun 2025): This report examines the rationale for including only select risk domains within frontier AI fram...