What happens when a model crosses a threshold?

Introduction

When AI developers and regulators talk about severe‑risk thresholds, they mean pre‑defined decision points that determine whether a frontier AI model’s capabilities or estimated harm warrant special governance actions, including tighter safeguards, controlled deployment, or even pausing release entirely. These thresholds are not arbitrary checklist items; they are explicit, operational boundaries designed to translate technical evaluations into concrete release decisions. That’s why they are central to how frontier model governance seeks to prevent catastrophic or existential outcomes from increasingly powerful AI systems: by defining in advance what counts as too risky and linking that to what must happen next. [GOV.UK]

In practice, thresholds help bridge the gap between capability testing (what a model can do) and policy decisions (whether it should be used, deployed widely, or scaled further). They shape not only internal lab governance but also external accountability and, in some frameworks, regulatory oversight. [Frontier Model Forum]

What Severe‑Risk Thresholds Are and Why They Matter

At their core, severe‑risk thresholds serve two linked purposes in frontier AI governance: [governance.ai]

Clarifying danger zones: They define the levels of capability or risk at which a model’s potential for harm crosses from “manageable with normal safeguards” into “requires extraordinary action”. This could be because a model exhibits behaviours that meaningfully increase the potential for misuse, or because it adds to systemic risk in ways that routine controls cannot mitigate. [GOV.UK]
Triggering governance actions: Once a threshold is crossed, predefined policies kick in. These range from escalated evaluation, heightened security measures, and limited access modes to postponing broader deployment or further training. Without these anchors, decisions about releasing high‑risk models risk being ad hoc and inconsistent. [Frontier Model Forum]

Importantly, thresholds are not just technical bars on a test score. They embody value judgements about which harms are tolerable and which are not, based on both the severity of potential impacts and the uncertainties in forecasting future harms. [Frontier Model Forum]

Capability vs Risk Thresholds: Two Complementary Concepts

Frontier governance frameworks typically distinguish between two kinds of thresholds, each guiding release decisions in a different way:

Capability Thresholds

These are defined in terms of what the model can do. A capability threshold identifies abilities that matter because they enable a pathway to severe harmful outcomes. For example, a model that can reason at PhD‑level proficiency in biological design, or autonomously pursue multi‑step cyber offence strategies, may be flagged as crossing a capability threshold. Crossing such thresholds doesn’t necessarily block release on its own but signals that heightened scrutiny and stronger safeguards are now required before continuing. [Frontier Model Forum]

Risk or Acceptable Deployment Thresholds

These focus on harm outcomes and are linked to context and mitigations. Once a model crosses a capability threshold, frameworks assess whether it can still be released without exposing society to unacceptable risk after planned safeguards. If not, that model may be restricted to internal use, limited testing settings, or withheld entirely. This second threshold is more directly tied to “should we release it at all?” decisions. [cltc.berkeley.edu] [Frontier Model Forum]

In summary, capability thresholds flag when risks might be serious, and acceptable deployment thresholds determine whether, given the current mitigations, the risk is actually acceptable for release. [Frontier Model Forum]
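The two-stage logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the 0-to-1 risk scale, and the `ACCEPTABLE_RESIDUAL_RISK` value are hypothetical and are not taken from any published framework.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    STANDARD_RELEASE = "standard release process"
    RELEASE_WITH_SAFEGUARDS = "release with heightened safeguards"
    RESTRICT_OR_WITHHOLD = "restrict to internal use or withhold"


@dataclass
class Evaluation:
    # Hypothetical inputs: whether any capability threshold was crossed, and
    # the estimated risk remaining after planned safeguards (0.0 to 1.0).
    capability_threshold_crossed: bool
    residual_risk_after_safeguards: float


# Hypothetical acceptable-deployment limit. Real frameworks reach this through
# value judgements rather than a single universal number.
ACCEPTABLE_RESIDUAL_RISK = 0.05


def release_decision(ev: Evaluation) -> Decision:
    # Stage 1: the capability threshold flags that risk might be serious.
    if not ev.capability_threshold_crossed:
        return Decision.STANDARD_RELEASE
    # Stage 2: the deployment threshold asks whether the risk is acceptable
    # given the current mitigations.
    if ev.residual_risk_after_safeguards <= ACCEPTABLE_RESIDUAL_RISK:
        return Decision.RELEASE_WITH_SAFEGUARDS
    return Decision.RESTRICT_OR_WITHHOLD
```

Note that crossing the capability threshold never blocks release by itself in this sketch; it only routes the model into the second check, which mirrors the distinction drawn above.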

How Thresholds Shape Release Decisions

In applied safety frameworks used by major labs and emerging policy proposals, severe‑risk thresholds influence release decisions in several clear ways:

1. Structured Pre‑Commitments

Governance frameworks often require that risk thresholds and responses be specified before a model is trained or evaluated. This includes deciding what counts as a breach and what mitigation steps or escalations follow it. Doing so makes decisions less ad hoc and aligns internal lab practice with public accountability. [GOV.UK]

2. Escalation and Mitigation Paths

When a threshold is hit, frameworks lay out specific mitigation commitments. These could include third‑party reviews, strengthened behavioural constraints, tool access governance, pre‑release hardening, or deployment only in controlled contexts. After mitigation, a residual risk assessment checks if the model is fit to move forward. [GOV.UK]

3. Deployment Controls and Limits

If risk remains too high after mitigation, frameworks can conditionally allow narrow or monitored deployment rather than full public access, for example operational testing in secure environments or limited API releases. In some commitments, actors agree not to release models at all if mitigations cannot ensure that thresholds will not be breached. [GOV.UK]

4. Pausing Development

Many early safety commitments tied threshold breaches to a pause in further scaling or deployment until risk was sufficiently reduced. While some firms have since altered these commitments in response to competitive pressures, the underlying logic (that crossing a serious risk threshold should slow or halt progress) remains central in many frameworks and policy discussions. [PC Gamer]

5. External Oversight Triggers

Pre‑defined risk thresholds also help determine when external actors (regulators, independent evaluators, or governments) should be engaged. This can mean sharing sensitive evaluation results under NDA or entering into collaborative risk assessments before a model enters broader use. [GOV.UK]
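The pre-commitment idea running through the five points above can be pictured as a lookup table written down before any evaluation takes place. The sketch below is an assumption-laden illustration: the threshold names, the actions attached to them, and the `plan_response` helper are all hypothetical, chosen only to echo the examples mentioned in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commitment:
    mitigations: tuple      # actions promised if the threshold is crossed
    notify_external: bool   # whether outside evaluators or regulators are engaged


# Hypothetical registry, fixed before training or evaluation begins.
PRE_COMMITMENTS = {
    "bio_design_uplift": Commitment(
        mitigations=("third-party review", "controlled deployment only"),
        notify_external=True,
    ),
    "autonomous_cyber_offence": Commitment(
        mitigations=("third-party review", "tool access governance"),
        notify_external=False,
    ),
}


def plan_response(crossed_thresholds):
    """Collect the pre-agreed actions for every threshold that was crossed."""
    mitigations = []
    notify_external = False
    for name in crossed_thresholds:
        commitment = PRE_COMMITMENTS[name]  # KeyError if never pre-registered
        for action in commitment.mitigations:
            if action not in mitigations:  # keep order, drop duplicates
                mitigations.append(action)
        notify_external = notify_external or commitment.notify_external
    return {"mitigations": mitigations, "notify_external": notify_external}
```

Because the table is frozen ahead of time, the response to a breach is a lookup rather than a fresh negotiation, which is the property that makes decisions less ad hoc.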

Why Thresholds Are Hard but Important

Setting meaningful severe‑risk thresholds for frontier AI is difficult for a few reasons: [governance.ai]

Fast‑evolving capabilities: Frontier models advance so quickly that fixed thresholds risk being obsolete shortly after they are published. As a result, many frameworks treat thresholds as iterative and revise them as evidence and understanding evolve. [GOV.UK]
Complex, dual‑use harms: It can be hard to quantify “risk” in a way that captures both likelihood and severity when harms may be unprecedented and uncertain. That is why some governance proposals distinguish between probability‑based risk thresholds and capability‑based proxies that are easier to measure. [governance.ai]
Ecosystem effects: A model’s marginal risk may depend not only on its standalone capabilities but on how it interacts with other models and tools in the broader ecosystem. Thresholds therefore sometimes need to account for collective risk growth as multiple models each introduce small risk increments. [Frontier Model Forum]

Despite these challenges, having structured thresholds enables more predictable, transparent, and accountable release decisions, a core goal of frontier AI governance. [Frontier Model Forum]
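Two of the difficulties listed above have a simple arithmetic core, sketched here under explicit assumptions. Treating risk as probability multiplied by severity is one common simplification, not something the cited frameworks prescribe, and the collective-risk formula assumes the individual increments are independent, which may not hold in practice. All function names are hypothetical.

```python
import math


def expected_harm(probability: float, severity: float) -> float:
    # Assumed simplification: risk as likelihood times severity.
    return probability * severity


def exceeds_risk_threshold(probability: float, severity: float, limit: float) -> bool:
    # A probability-based threshold compares estimated risk with a fixed limit.
    return expected_harm(probability, severity) > limit


def collective_probability(increments) -> float:
    # Chance that at least one of several sources leads to harm, assuming
    # independence: 1 minus the product of the individual "no harm" chances.
    return 1.0 - math.prod(1.0 - p for p in increments)
```

Under these assumptions, five models that each add a 1% chance of harm give a combined chance of roughly 4.9%, which illustrates how small increments can accumulate even when each one looks tolerable alone.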

Why Open‑Weight Releases Raise the Stakes

One particular governance flashpoint is the release of open model weights: the underlying parameters of a model that allow anyone to run or adapt it independently. Open‑weight releases greatly expand who can use and modify a model, including actors without internal safeguards or oversight. Designating thresholds that preclude open‑weight release until risk is demonstrably lower is becoming a standard practice in many proposed frameworks precisely because open weights amplify both misuse pathways and the difficulty of containing harms. [Frontier Model Forum]

In contrast, closed or controlled deployments (e.g. via a hosted API with monitoring and usage rules) allow developers and external actors to retain some governance levers and trace misuse more easily. Decisions about open‑weight release therefore often lie at the far end of the risk threshold spectrum, reserved only for models with strong evidence they do not enable severe harms beyond manageable levels. [Frontier Model Forum]
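One way to picture open weights sitting at the far end of the spectrum is to order release modes by how much control the developer gives up and attach a stricter risk ceiling to each step. The sketch below is a hypothetical illustration; the mode names, the ceilings, and the single scalar risk estimate are assumptions made for the example, not values from any framework.

```python
from enum import IntEnum


class ReleaseMode(IntEnum):
    # Ordered from least to most permissive.
    WITHHELD = 0
    INTERNAL_ONLY = 1
    MONITORED_API = 2
    OPEN_WEIGHTS = 3


# Hypothetical ceilings: the more control is given up, the lower the
# tolerated risk estimate (0.0 to 1.0).
RISK_CEILING = {
    ReleaseMode.INTERNAL_ONLY: 0.20,
    ReleaseMode.MONITORED_API: 0.05,
    ReleaseMode.OPEN_WEIGHTS: 0.01,
}


def most_permissive_mode(risk_estimate: float) -> ReleaseMode:
    # Try the most permissive mode first and fall back step by step.
    for mode in sorted(RISK_CEILING, reverse=True):
        if risk_estimate <= RISK_CEILING[mode]:
            return mode
    return ReleaseMode.WITHHELD
```

For instance, a risk estimate of 0.03 in this sketch would permit a monitored API release but not an open-weight one, since a hosted deployment keeps monitoring and access controls in place.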

Final Takeaway

Severe‑risk thresholds are a governance tool that turns evaluation results into action. They help labs and regulators decide not just what a frontier model can do, but whether it should be released or scaled, in what form, and under what safeguards. As policies evolve, these thresholds are shaping both internal industry practices and emerging regulatory standards for how society handles one of the most consequential technologies of our time. [GOV.UK]

Endnotes

Source: GOV.UK
Title: Emerging processes for frontier AI safety
Link: https://www.gov.uk/government/publications/emerging-processes-for-frontier-ai-safety/emerging-processes-for-frontier-ai-safety

Source: GOV.UK
Title: Frontier AI Safety Commitments, AI Seoul Summit 2024
Link: https://www.gov.uk/government/publications/frontier-ai-safety-commitments-ai-seoul-summit-2024/frontier-ai-safety-commitments-ai-seoul-summit-2024
Published: February 7, 2025

Source: governance.ai
Title: Risk Thresholds for Frontier AI
Link: https://www.governance.ai/research-paper/risk-thresholds-for-frontier-ai
Published: June 20, 2024

Source: frontiermodelforum.org
Title: Frontier Model Forum Issue Brief: Thresholds for Frontier AI Safety Frameworks
Link: https://www.frontiermodelforum.org/updates/issue-brief-thresholds-for-frontier-ai-safety-frameworks/

Source: frontiermodelforum.org
Title: Frontier Model Forum Risk Taxonomy and Thresholds for Frontier AI Frameworks
Link: https://www.frontiermodelforum.org/technical-reports/risk-taxonomy-and-thresholds/
Published: June 18, 2025

Source: pcgamer.com
Link: https://www.pcgamer.com/software/ai/anthropic…
Snippet: Previously, under its Responsible Scaling Policy (RSP), Anthropic pledged to halt AI development should new systems reach dangerous capab…

Source: aiwiki.ai
Title: Responsible Scaling Policy | AI Wiki
Link: https://aiwiki.ai/wiki/responsible_scaling_policy
Published: May 7, 2026

Source: aisecurityandsafety.org
Title: Frontier AI Safety
Link: https://aisecurityandsafety.org/en/guides/frontier-ai-safety/
Published: April 3, 2026

Source: cltc.berkeley.edu
Title: Intolerable AI Risk Thresholds
Link: https://cltc.berkeley.edu/publication/intolerable-ai-risk-thresholds/
Published: February 2025
