When should AI labs be forced to pause?

Introduction

In debates about AI doom and existential risk, a persistent governance challenge is deciding when AI systems become dangerous enough that developers should be required to change course, whether by stopping development, imposing stricter safeguards, or limiting deployment. A core response in recent policy and safety frameworks is the use of capability thresholds: defined levels of model competence that trigger stronger safety controls and, potentially, forced pauses in development or deployment. Capability thresholds aim to replace ad hoc internal judgement calls with shared, measurable triggers that make cautious behaviour a common standard, reducing the incentive for any one lab to rush ahead without safeguards. [METR]

What Counts as a Dangerous Capability Threshold

At its simplest, a capability threshold is a pre-specified point in a model’s abilities (such as performance on certain tasks, degree of autonomy, or ability to meaningfully assist harmful actors) that activates new governance obligations. These obligations can include deeper safety evaluations, enhanced security controls, restrictions on certain kinds of deployment and, in some frameworks, deliberate pauses in training until mitigations are in place. [Juncture Policy]

Across the frontier AI safety frameworks published by major developers, the notion of “dangerous” varies but generally covers abilities that could enable large-scale harm unless substantially mitigated: assisting in biological misuse, automating sophisticated cyberattacks, or autonomously executing harmful strategies. These thresholds are not based solely on broad resource proxies such as compute; they are tied to specific, identifiable capabilities that correspond to recognised routes to societal harm. [Frontier Model Forum]

A typical structure seen in these frameworks involves two linked concepts:

Enabling capability thresholds signal that a model has acquired skills that could make certain harmful outcomes plausible if left unmitigated.
Deployment or residual risk thresholds then determine whether a model that has crossed an enabling threshold can be deployed safely once specified safeguards are in place. [Frontier Model Forum]
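
The two linked checks can be read as a simple ordered decision. The sketch below is only an illustration of that ordering: the `Assessment` record, the 0-to-1 scales, and both cut-off values are assumptions made for the example and are not taken from any published framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assessment:
    """Hypothetical evaluation summary for one model in one risk domain."""
    capability_score: float  # measured capability on an illustrative 0.0-1.0 scale
    residual_risk: float     # estimated risk remaining after safeguards, 0.0-1.0


# Illustrative cut-offs only; real frameworks define thresholds qualitatively
# or with their own domain-specific evaluations.
ENABLING_THRESHOLD = 0.6
RESIDUAL_RISK_LIMIT = 0.1


def deployment_decision(a: Assessment) -> str:
    """Apply the enabling check first, then the residual risk check."""
    # Stage 1: has the model reached the enabling capability threshold?
    if a.capability_score < ENABLING_THRESHOLD:
        return "deploy under baseline controls"
    # Stage 2: with safeguards applied, is the remaining risk acceptable?
    if a.residual_risk <= RESIDUAL_RISK_LIMIT:
        return "deploy with safeguards"
    return "hold deployment until mitigations improve"


print(deployment_decision(Assessment(0.4, 0.5)))
print(deployment_decision(Assessment(0.8, 0.05)))
print(deployment_decision(Assessment(0.8, 0.3)))
```

The point of the ordering is that the residual risk question is only asked once the enabling threshold has been crossed; below it, safeguards beyond the baseline are not triggered at all.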

Because true risk, in the sense of the probability of societal harm, is very hard to estimate for novel technologies, many frameworks currently use capability thresholds as proxies for risk, balancing measurability against risk relevance. [GovAI]

How Capability Thresholds Can Reduce Competitive Race Incentives

A central driver of premature deployment in the AI race is the fear of losing advantage if one lab slows while competitors keep developing or deploying. Shared capability thresholds help realign incentives by making safety obligations predictable and common rather than matters of private policy. If all actors agree that, for example, the ability to generate detailed, actionable biological synthesis instructions, or to autonomously coordinate harmful digital operations, triggers compulsory safeguards or deployment restrictions, then no single actor can treat those capabilities as a private risk judgement. [METR]

This shared triggering reduces the strategic advantage of secrecy about risk signals. Developers know that exceeding an agreed threshold will automatically raise safety requirements, so there is less value in hiding or downplaying risky features to reach the market faster. It also helps external auditors, regulators, and governments understand when and why stricter safety controls should apply, providing a basis for consistent oversight. [Frontier Model Forum]

Importantly, capability thresholds are sometimes embedded in tiered safety regimes, much like biosafety levels in laboratories. Lower thresholds might require additional internal testing and oversight, while higher ones temporarily halt open deployment until concrete mitigations (such as containment protocols, external audits, or behavioural constraints) are proven effective. [METR]

Why Compute, Capability and Risk Triggers Remain Disputed

Although capability thresholds are gaining traction, they are not without controversy or limits.

Measurement challenges: Evaluating whether a model truly possesses a dangerous capability is difficult. Benchmark scores and task success rates are imperfect proxies for real-world harm potential, and capabilities may emerge unpredictably outside the defined tests. [METR]
Compute versus capability proxies: Some propose simpler triggers based on training compute (e.g., FLOPs thresholds) because they are easy to measure. But relying on resource use alone can miss small models with harmful behaviours or overflag benign systems, making compute thresholds a rougher tool than capability-based ones. [Frontier Model Forum]
Risk thresholds versus capability thresholds: True risk thresholds, meaning explicit limits on acceptable harm probability or impact, are arguably more principled but are currently too hard to compute with any confidence for unforeseen AI risks. Capability thresholds therefore serve as a stand-in, though experts caution against treating them as direct measures of ultimate danger. [GovAI]
Governance implementation: Even when thresholds are defined, enforcing them across labs globally is difficult. Without binding regulation or mutual verification, voluntary frameworks risk divergence and strategic non-compliance, especially under competitive pressure. A recent example is the shift in some companies’ safety policies away from explicit pause commitments, which raises concerns about the reliability of self-imposed thresholds in practice. [PC Gamer]
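
The compute-versus-capability disagreement can be made concrete with a small comparison. This is a sketch under stated assumptions: the `ModelRecord` fields, the three example models, and the 1e25 FLOP trigger are all hypothetical, and the training-compute estimate uses the common rule of thumb of roughly 6 FLOP per parameter per training token for dense transformers rather than any framework's official accounting.

```python
from dataclasses import dataclass

# Illustrative compute trigger; proposals differ on the exact figure.
COMPUTE_THRESHOLD_FLOP = 1e25


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical record for one trained model."""
    name: str
    parameters: float                 # number of trainable parameters
    training_tokens: float            # tokens seen during training
    shows_dangerous_capability: bool  # outcome of a hypothetical capability evaluation


def estimated_training_flop(m: ModelRecord) -> float:
    # Rule-of-thumb estimate: about 6 FLOP per parameter per token.
    return 6.0 * m.parameters * m.training_tokens


def compare_triggers(m: ModelRecord) -> str:
    """Report whether the compute trigger and the capability trigger agree."""
    compute_flag = estimated_training_flop(m) >= COMPUTE_THRESHOLD_FLOP
    if compute_flag == m.shows_dangerous_capability:
        return "triggers agree"
    if compute_flag:
        return "flagged by compute only (possible overflag)"
    return "flagged by capability only (missed by compute)"


models = [
    ModelRecord("small-specialist", 7e9, 2e12, True),    # about 8.4e22 FLOP
    ModelRecord("large-generalist", 1e12, 1e13, False),  # about 6e25 FLOP
    ModelRecord("large-frontier", 1e12, 1e13, True),     # about 6e25 FLOP
]
for m in models:
    print(f"{m.name}: {compare_triggers(m)}")
```

The first two records illustrate the two failure modes named above: a small model that a compute trigger would miss, and a large but benign one that it would overflag.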

Implementation and Practical Limits

Capability thresholds are becoming a staple of frontier AI safety frameworks. Many of the largest developers embed them in staged plans in which crossing a threshold escalates requirements for evaluation, mitigation, security and, in some cases, deployment constraints. These frameworks typically include:

A catalogue of hazardous capabilities identified through threat modelling;
Evaluation protocols that test models against those capabilities;
Decision processes that link test outcomes to governance steps;
Escalation rules that require stronger safeguards or pauses if thresholds are crossed. [AI Security & Safety Directory]
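
The four elements above can be sketched as one loop over a catalogue. Everything named here is hypothetical: the `CatalogueEntry` structure, the capability labels, the stubbed scores, and the escalation wording are placeholders chosen for illustration, and a real evaluation protocol would run actual tests rather than look up fixed numbers.

```python
from typing import Callable, List, NamedTuple


class CatalogueEntry(NamedTuple):
    """One hazardous capability from a (hypothetical) threat-modelling catalogue."""
    capability: str
    evaluate: Callable[[str], float]  # evaluation protocol: model id -> score in [0, 1]
    threshold: float                  # score at or above which escalation applies
    escalation: str                   # governance step required once crossed


# Stubbed scores standing in for real evaluation results.
STUB_SCORES = {"bio-uplift": 0.72, "cyber-offense": 0.31}


def stub_eval(capability: str) -> Callable[[str], float]:
    def evaluate(model_id: str) -> float:
        # A real protocol would test the named model; the stub ignores it.
        return STUB_SCORES[capability]
    return evaluate


def run_framework(model_id: str, catalogue: List[CatalogueEntry]) -> List[str]:
    """Evaluate every catalogued capability and collect the escalations triggered."""
    actions = []
    for entry in catalogue:
        score = entry.evaluate(model_id)
        # Decision process: compare the test outcome with the pre-set threshold.
        if score >= entry.threshold:
            actions.append(f"{entry.capability}: {entry.escalation}")
    return actions


catalogue = [
    CatalogueEntry("bio-uplift", stub_eval("bio-uplift"), 0.5,
                   "pause deployment pending mitigations"),
    CatalogueEntry("cyber-offense", stub_eval("cyber-offense"), 0.5,
                   "strengthen security controls"),
]
print(run_framework("example-model", catalogue))
```

Keeping the thresholds and escalation steps in the catalogue itself, rather than in the loop, mirrors the idea that they are fixed in advance and can be inspected by outsiders.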

Critically, setting thresholds in advance, rather than deciding on the fly, helps create external accountability and transparency. It also allows ecosystem actors, including regulators and civil society, to understand and critique the basis for safety escalations. But because frontier capabilities evolve rapidly, thresholds must be updated iteratively and their validity re-tested against real-world outcomes and new risk evidence. [GOV.UK]

Looking Ahead: Thresholds in Governance and Public Policy

Capability thresholds have emerged as one of the most tractable governance tools for aligning incentives toward safer deployment choices in an accelerating race. By providing shared signals about what counts as potentially dangerous, they can make safety expectations predictable and less dependent on unilateral lab judgements. However, they are not a panacea: their effectiveness depends on measurement quality, international cooperation, and integration with broader regulatory frameworks that can enforce consequences when thresholds are crossed. As AI capabilities continue to advance, refining these thresholds and their implementation will remain a key frontier in efforts to manage existential risks from advanced systems. [Frontier Model Forum]

Endnotes

Source: metr.org
Title: Common Elements of Frontier AI Safety Policies
Link: https://metr.org/common-elements
Published: December 16, 2025

Source: governance.ai
Title: Risk Thresholds for Frontier AI
Link: https://www.governance.ai/research-paper/risk-thresholds-for-frontier-ai
Published: June 20, 2024

Source: GOV.UK
Title: Emerging processes for frontier AI safety
Link: https://www.gov.uk/government/publications/emerging-processes-for-frontier-ai-safety/emerging-processes-for-frontier-ai-safety
Published: … 27, 2023

Source: governance.ai
Title: Coordinated Pausing: An Evaluation-Based Coordination Scheme for Frontier AI Developers
Link: https://www.governance.ai/research-paper/coordinated-pausing-evaluation-based-scheme
Published: September 30, 2023

Source: frontiermodelforum.org
Title: Risk Taxonomy and Thresholds for Frontier AI Frameworks
Link: https://www.frontiermodelforum.org/technical-reports/risk-taxonomy-and-thresholds/

Source: juncturepolicy.org
Title: Capability Threshold
Link: https://juncturepolicy.org/glossary/terms-c/capability-threshold/

Source: frontiermodelforum.org
Title: Frontier AI Biosafety Thresholds
Link: https://www.frontiermodelforum.org/issue-briefs/frontier-ai-biosafety-thresholds/
Published: May 12, 2025

Source: frontiermodelforum.org
Title: Issue Brief: Thresholds for Frontier AI Safety Frameworks
Link: https://www.frontiermodelforum.org/updates/issue-brief-thresholds-for-frontier-ai-safety-frameworks/
Published: February 7, 2025

Source: pcgamer.com
Link: [https://www.pcgamer.com/software/ai/anthropic
Snippet: Previously, under its Responsible Scaling Policy (RSP), Anthropic pledged to halt AI development should new systems reach dangerous capab...

Source: aisecurityandsafety.org
Title: Frontier AI Safety Framework — AI Governance Definition & Guide | AI Safety Directory
Link: https://aisecurityandsafety.org/en/glossary/frontier-ai-safety-framework/
Published: March 27, 2026

Source: aiwiki.ai
Title: Responsible Scaling Policy | AI Wiki
Link: https://aiwiki.ai/wiki/responsible_scaling_policy
Published: May 7, 2026

Source: comparativeai.org
Title: Safety Framework
Link: https://comparativeai.org/en/companies/openai/safety-framework/
Published: April 25, 2026

Source: frontiermodelforum.org
Title: Managing Advanced Cyber Risks in Frontier AI Frameworks
Link: https://www.frontiermodelforum.org/technical-reports/managing-advanced-cyber-risks-in-frontier-ai-frameworks/
Published: February 13, 2026
