By Oluwole Akinwale, Director of Professional Services
You might use an AI assistant to draft emails, summarize messages or help with online shopping. You probably trust it to do exactly what you asked, don’t you?
But what if someone managed to sneak in secret instructions without you knowing?
Prompt injection is a sneaky way to trick AI by hiding commands in the content it reads. When discussing cybersecurity, “injection” refers to inserting something into an existing flow. So prompt injection takes your prompt, or the input you give to an AI, and inserts hidden instructions into the AI’s input. Think of it as someone whispering fake instructions to your assistant before you talk, and the assistant follows those instead of listening to you.
How Prompt Injection Works
Prompt injection exploits how AI reads text. AI models often struggle to tell the difference between what the user asked for, what the developer intended, and what a bad actor quietly inserted.
If the AI sees harmful instructions that look like normal text, it might follow them. Imagine you’re using an AI to manage your calendar and receive an invitation via email. The email seems to be from a colleague, but it contains hidden instructions for the AI to add a meeting with incorrect details, disrupting your schedule. There are two main ways this happens:
- Direct injection: A user types instructions that directly manipulate the AI.
- Indirect injection: Someone hides commands within something the AI reads, such as an email, web page or document.
This kind of thing is already happening in real life:
Gmail’s AI Summaries Tricked by Hidden Commands
Gmail’s AI can summarize long emails for you. In 2025, researchers found that attackers could hide invisible text in an email, such as by using a white font on a white background, with secret instructions. The AI followed those hidden instructions and gave misleading summaries, even when the email was risky or deceptive.
Chatbot Tricked Into Offering a $76,000 Car for $1
In December 2023, Chris Bakke went to a Chevrolet dealership’s website and did something both funny and a bit worrying. With just a few clever messages, he got their ChatGPT-powered chatbot to agree to sell him a $76,000 Chevy Tahoe for $1.
Why It Matters
Prompt injection doesn’t break into your system like malware. Instead, it takes over the conversation between you and the AI. Since AI treats text as instructions, a harmful prompt can bypass the developer’s protections if the model can’t distinguish safe messages from an attacker’s input.
This means that even everyday tools can be compelled to act in strange ways if someone knows the right tricks.
What You Can Do as a User
While developers are constantly working to build better defenses, you can protect yourself in the meantime by:
1. Being cautious with AI input.
Avoid pasting in untrusted text or documents.
2. Watching for odd behavior.
If the AI starts acting strangely, don’t ignore it. You can also report the anomaly through appropriate channels, helping to save thousands or even millions of other users.
3. Choose trusted tools.
Use AI services and plugins from well-known providers that update frequently and prioritize security. To ensure a tool is reliable, look for recent updates or check if the provider frequently releases patches. Also, be on the lookout for security badges or certifications that indicate rigorous testing for privacy and data protection.
These steps empower you to make safer choices and maintain control over your AI interactions.
Bottom Line
Prompt injection is similar to social engineering, which involves tricking people into giving up important information. But for AI, it doesn’t need a password; it just needs the right words in the right spot.
Fortunately, the more we understand it, the easier it is to notice when something isn’t right. Being aware is your first defense and helps make sure your digital assistant works for you, not for someone else.
Infographic—Why Automation Needs Accountability
Blog—The Invisible Layer: Governing AI Inside iPaaS Platforms