Perplexity Releases BrowseSafe to Combat Prompt Injection in AI Browsers

Perplexity Releases BrowseSafe to Combat Prompt Injection in AI Browsers
Photo by Glenn Carstens-Peters on Unsplash

Perplexity has released BrowseSafe, an open research benchmark and content detection model designed to protect users as AI agents integrate into web browsers. Prompt injection has emerged as a critical attack vector for web agents, yet its real-world impact remains insufficiently understood. This release addresses a significant gap in AI browser security by providing both a detection mechanism and a comprehensive evaluation framework for the emerging threat landscape.

BrowseSafe-Bench comprises 14,719 examples mimicking a mix of malicious and harmless samples across 11 attack types, nine injection strategies spanning hidden fields to visible paragraphs and footers, and three linguistic styles from explicit commands to indirect, camouflaged text. The benchmark emphasises injections that can influence real-world actions rather than mere text outputs, presenting attack payloads with complexity similar to what real-world agents encounter. Evaluation results reveal clear patterns: direct attacks, such as asking the agent to reveal its system prompt or exfiltrate information via URL segments, are easiest for models to detect, whilst multilingual attacks and those written as indirect or hypothetical instructions prove significantly harder because they avoid obvious keywords many detectors implicitly rely on. Attacks hidden in comments are detected relatively well, whilst versions rewritten into visible footers, table cells, or inline paragraphs are much more difficult to catch. BrowseSafe is a detection model fine-tuned to identify such malicious content, scanning full web pages in real time without slowing the browser. Large general-purpose models, though capable of reasoning well about these cases, are often too slow and expensive to run on every page.

Perplexity's multi-layered defence strategy combines both architectural and model-based defences to protect against evolving prompt injection attacks, offering a blueprint for designing practical, secure web agents through a defence-in-depth approach. The company is releasing the model and the BrowseSafe-Bench evaluation suite as a resource for evaluating and improving defence effectiveness. This release provides the AI security community with concrete tools to address a critical vulnerability as web agents become increasingly prevalent in everyday browsing.

---

Sources:

1. https://www.perplexity.ai/hub/blog/building-safer-ai-browsers-with-browsesafe

2. https://arxiv.org/abs/2511.20597