Introduction to LLM jailbreaking
LLM jailbreaking bypasses safety alignment to force models into generating restricted content. Covers DAN, roleplay, token smuggling, and adversarial suffixes.
Indirect prompt injection embeds payloads in external data that LLMs process. Covers data poisoning, web content injection, email vectors, and concealment.
Direct prompt injection targets LLMs through the user input channel. Covers system prompt extraction strategies and behaviour manipulation techniques.
LLM reconnaissance maps the attack surface of AI applications before testing. Covers model identification, architecture probing, and LLMmap fingerprinting.
Prompt injection exploits the lack of a boundary between system and user prompts in LLMs. Covers multi-turn context, multimodal vectors, and architectural causes.
Prompt engineering controls LLM output through input design. Covers best practices and maps security risks to OWASP LLM Top 10 and Google SAIF risk categories.
ML infrastructure carries every traditional security risk plus deployment-specific threats. Covers misconfigurations, DoS, resource exhaustion, and TTPs.
The application layer of ML systems inherits every traditional web vulnerability. Covers injection, authentication, XSS, and social engineering attack vectors.
How adversaries poison training data, embed backdoors, and exfiltrate ML data sets. Covers data poisoning, supply chain attacks, and federated learning risks.
A red teamer's reference for attacking model components, covering poisoning, jailbreak techniques, model extraction, and MITRE ATLAS TTP mapping with examples.