Vulnerability in OpenAI’s Deep Research AI agent exposed confidential email data

23.09.2025 12:29:00

Дата публикации

Cybersecurity company Radware has discovered a critical vulnerability in Deep Research, an experimental OpenAI agent designed to automatically analyze documents, web pages, and emails. During testing, researchers found that the agent could execute hidden commands embedded in email text — even when sent by malicious actors.

The attack relied on a prompt injection technique, where harmful instructions were disguised as regular text. The AI agent interpreted these as legitimate directives and acted upon them without verification.

Researchers demonstrated that the AI could extract confidential information from Gmail if a message contained a hidden command — such as forwarding attachments or copying email content. These actions occurred automatically, without any user confirmation.

OpenAI promptly patched the flaw and thanked Radware for the discovery. However, experts note that the issue is systemic: task-oriented AI agents tend to “obey” any text perceived as an instruction.

This poses a serious risk when AI agents are connected to email systems or other sources of sensitive data — including government service platforms. Without contextual filters and source validation, AI may become a new vector for data leaks.

Radware emphasized that such vulnerabilities require a redesign of AI agent architectures — integrating mechanisms to distinguish between content and commands, and verifying sources before executing actions.

The study also raises broader questions about AI autonomy. The more access these agents have to personal data, the higher the risk of manipulation — especially when safeguards are missing.