See How Visible Your Brand is in AI Search Get Free Report

Researchers Find 7 ChatGPT Vulnerabilities That Let Hackers Steal Your Data

  • November 6, 2025
    Updated
researchers-find-7-chatgpt-vulnerabilities-that-let-hackers-steal-your-data

Security researchers disclosed seven techniques that can trick ChatGPT into leaking user memories and chat data, with some issues already mitigated and others still under active review.

The seven identified vulnerabilities and attack techniques were discovered in OpenAI’s GPT-4o and GPT-5 models.

📌 Key Takeaways

  • Seven techniques target browsing, search, links, context, markdown, and persistent memory.
  • Some issues were addressed, yet several attack paths still deserve strong compensating controls.
  • Zero-click variants work through indexed content, turning simple queries into stealth exploits.
  • Bing allow-listing enabled masked redirects, weakening safety checks on rendered URLs.
  • Memory poisoning persists across sessions, raising the stakes for personal data exposure.


What Was Found And Why It Matters

Researchers at Tenable mapped seven attack paths that exploit how LLMs parse instructions mixed with external data. The result is context poisoning that makes a model obey attacker prompts hiding in web pages, link parameters, or prior outputs.

The most serious risk is silent exfiltration. When an assistant fetches a page or summarizes a site, embedded instructions can steer it to reveal chat contents or stored memories. Users may never see the hostile text that triggered the leak.

“Prompt injection is a known issue with the way that LLMs work, and it will probably not be fixed systematically in the near future.” — Moshe Bernstein and Liv Matan, Tenable Researchers


The Seven Techniques, In Plain Language

The list spans the toolchain where assistants read, render, and remember.

  • First, indirect prompt injection hides commands in web pages, then asks the assistant to summarize them.
  • Second, a zero-click version exploits already indexed content through natural-language queries.
  • Third, a one-click pattern abuses a chatgpt.com/?q={prompt} link that auto-executes the query.
  • Fourth, a safety bypass takes advantage of bing.com being allow-listed by masking malicious redirects behind Bing ad tracking links.
  • Fifth, conversation injection places prompts in a site that the assistant summarizes, which then pollutes follow-up answers.
  • Sixth, malicious content hiding uses a markdown rendering quirk to conceal commands within fenced code formatting.
  • Seventh, memory injection plants durable instructions that persist across sessions.

chatgpt-hack-attackers-server


What Was Fixed And What Still Needs Guardrails

The vendor has addressed some issues and is working through others, but the broader class is structural. Any assistant that reads untrusted content, follows links, or stores long-lived memory inherits exposure without layered defenses.

Teams should treat browsing and search as high-risk features. Strict URL filters, response sanitization, and provenance checks help. Durable memory needs validation and expiry by default, or small compromises become lasting problems.

“If attackers only need a small number of crafted documents, poisoning becomes far more feasible than previously believed.” — Research Consortium


How To Reduce Risk Now

Start by hardening allow-lists and stripping tracking parameters before render. Block open redirects and require fetch-then-verify flows where the model never sees raw page chrome, comments, or hidden code blocks.

Adopt a deny-by-default memory policy. Gate writes through a reviewer, expires entries fast, and surfaces memory diffs to users. For enterprise deployments, isolate browsing-enabled assistants, log every external fetch, and monitor for prompt patterns.


What To Watch Next

Expect tighter defaults around link handling, markdown parsing, and memory writes. Independent red-team work will likely expand these seven techniques into a larger taxonomy that product teams can test against pre-release.

The long-term fix is layered: input filtering, extraction sandboxes, and tool mediation that limits what an assistant can do from a single prompt. Reliable assistants will look more like systems than chat boxes.


Conclusion

These findings confirm what many teams already sense. As assistants gain browsing, search, and memory, the attack surface grows in directions traditional web filters did not anticipate. Some patches are shipping, yet the class remains live.

Treat untrusted content like code. Control what gets rendered, what gets remembered, and what gets executed. With layered defenses, assistants can stay helpful without turning every summary into a potential exploit.


For the recent AI News, visit our site.


If you liked this article, be sure to follow us on X/Twitter and also LinkedIn for more exclusive content.

Was this article helpful?
YesNo
Generic placeholder image
Articles written 940

Khurram Hanif

Reporter, AI News

Khurram Hanif, AI Reporter at AllAboutAI.com, covers model launches, safety research, regulation, and the real-world impact of AI with fast, accurate, and sourced reporting.

He’s known for turning dense papers and public filings into plain-English explainers, quick on-the-day updates, and practical takeaways. His work includes live coverage of major announcements and concise weekly briefings that track what actually matters.

Outside of work, Khurram squads up in Call of Duty and spends downtime tinkering with PCs, testing apps, and hunting for thoughtful tech gear.

Personal Quote

“Chase the facts, cut the noise, explain what counts.”

Highlights

  • Covers model releases, safety notes, and policy moves
  • Turns research papers into clear, actionable explainers
  • Publishes a weekly AI briefing for busy readers

Related Articles

Leave a Reply

Your email address will not be published. Required fields are marked *