What category does Jailbreak belong to?

Jailbreak is a AI & ML concept, typically considered intermediate difficulty for developers learning this area.

Jailbreak

Medium — good to knowAI & ML

ELI5 — The Vibe Check

A jailbreak is a sneaky prompt that tricks an AI into ignoring its safety rules. It's like convincing the strict teacher to let you skip homework by telling an elaborate story. People craft these creative prompts to make the AI do things it normally wouldn't — like generating harmful content or pretending it has no rules. AI labs play constant whack-a-mole patching them.

Real Talk

In the context of AI, a jailbreak is a prompt injection technique designed to bypass a language model's safety guardrails and content filters. Methods include role-playing scenarios, hypothetical framings, character switching, and multi-step social engineering. AI providers continuously update models to resist known jailbreak patterns while maintaining helpful behavior.