Executive Summary
| Summary | |
|---|---|
| Title | Various GPT services are vulnerable to two systemic jailbreaks, allowing bypass of safety guardrails |
| Information | | | |
|---|---|---|---|
| Name | VU#667211 | First Vendor Publication | 2025-04-29 |
| Vendor | VU-CERT | Last Vendor Modification | 2025-04-29 |
| Severity (Vendor) | N/A | Revision | M |
Security-Database Scoring CVSS v3
| CVSS Vector: N/A | | | |
|---|---|---|---|
| Overall CVSS Score | N/A | | |
| Base Score | N/A | Environmental Score | N/A |
| Impact Subscore | N/A | Temporal Score | N/A |
| Exploitability Subscore | N/A | | |
Security-Database Scoring CVSS v2
| CVSS Vector: | | | |
|---|---|---|---|
| CVSS Base Score | N/A | Attack Range | N/A |
| CVSS Impact Score | N/A | Attack Complexity | N/A |
| CVSS Exploit Score | N/A | Authentication | N/A |
Detail
Overview

Two systemic jailbreaks, affecting a number of generative AI services, were discovered. These jailbreaks can result in the bypass of safety protocols and allow an attacker to instruct the corresponding LLM to provide illicit or dangerous content. The first jailbreak, called "Inception," is facilitated by prompting the AI to imagine a fictitious scenario. The scenario can then be adapted to another one, wherein the AI acts as though it has no safety guardrails. The second jailbreak is facilitated by asking the AI how it should not reply to a specific request. Both jailbreaks, when supplied to multiple AI models with nearly identical syntax, result in a safety guardrail bypass. This indicates a systemic weakness across many popular AI systems.

Description

Two systemic jailbreaks, affecting several generative AI services, have been discovered. These jailbreaks, when performed against AI services with the exact same syntax, result in a bypass of safety guardrails on affected systems. The first jailbreak is facilitated by prompting the AI to imagine a fictitious scenario, which can then be adapted into a second scenario nested within the first. Continued prompting within the second scenario's context can bypass safety guardrails and allow the generation of malicious content. This jailbreak, named "Inception" by the reporter, affects the following vendors:

The second jailbreak is facilitated by prompting the AI to answer a question with how it should not reply within a certain context. The AI can then be further prompted with requests to respond as normal, and the attacker can pivot back and forth between illicit questions that bypass safety guardrails and normal prompts. This jailbreak affects the following vendors:

Impact

These jailbreaks, while of low severity on their own, bypass the security and safety guidelines of all affected AI services, allowing an attacker to abuse them for instructions on various illicit topics, such as controlled substances, weapons, phishing emails, and malware code generation. A motivated threat actor could exploit these jailbreaks to achieve a variety of malicious actions. The systemic nature of the jailbreaks heightens the risk of such an attack. Additionally, the use of legitimate services such as those affected by these jailbreaks can function as a proxy, hiding a threat actor's malicious activity.

Solution

Various affected vendors have provided statements on the issue and have altered their services to prevent the jailbreaks.

Acknowledgements

Thanks to the reporters, David Kuzsmar, who reported the first jailbreak, and Jacob Liddle, who reported the second jailbreak. This document was written by Christopher Cullen.
Original Source
URL: https://kb.cert.org/vuls/id/667211
Alert History
| Date | Information |
|---|---|
| 2025-05-26 21:20:25 | |