Security · March 28, 2026 · 8 min read

Prompt injection in customer support: what it is, and what actually mitigates it

Support chat is a high-trust surface — and attackers know it. Here’s a plain-English breakdown of prompt injection risks for AI agents, minus the fear-mongering.

The Signorian team
Founders

In a prompt injection, someone tries to override the agent's instructions by hiding commands in user text — pasting a fake "system" message, asking the model to ignore its policies, or coaxing it into revealing hidden prompt text. Public-facing support chat is an attractive target because it's always on and often wired to internal tools.

No silver bullet exists. But you can shrink the blast radius with a few architectural habits.

Separate instructions from user content

Treat anything the customer types as untrusted data, not as part of the system prompt. Delimiters, structured message formats, and clear role tags help models distinguish “what we told the agent” from “what the user said.” It’s not perfect, but it beats stuffing everything into one blob of text.

Ground answers in retrieval, not free recall

When answers must come from your docs and policies — and the agent is instructed to refuse when sources don’t support a claim — you reduce the chance a creative user prompt turns into a creative policy. Hallucination mitigation and abuse mitigation overlap more than people think.
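A retrieve-then-answer loop with an explicit refusal path might look like this. `search_docs`, `call_model`, and the score cutoff are hypothetical stand-ins for your vector search, LLM client, and tuning:

```python
# Sketch: refuse when retrieval returns nothing relevant, rather than
# letting the model improvise an answer.

REFUSAL = "I couldn't find that in our documentation, so I can't answer."
MIN_SCORE = 0.75  # assumed relevance cutoff; tune for your embeddings

def answer(question: str, search_docs, call_model) -> str:
    """Answer only from retrieved sources; refuse when none are relevant."""
    hits = [h for h in search_docs(question) if h["score"] >= MIN_SCORE]
    if not hits:
        return REFUSAL  # no supporting source -> refuse, don't improvise
    context = "\n\n".join(h["text"] for h in hits)
    prompt = (
        "Answer ONLY from the sources below. If they don't cover the "
        f"question, reply exactly: {REFUSAL}\n\nSources:\n{context}\n\n"
        f"Question: {question}"
    )
    return call_model(prompt)
```

Note the refusal happens in code, before the model is even called — an injected prompt can't talk its way past a branch that never reaches the model.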

Limit tool access by default

If your agent can call APIs (refund, delete, export), gate those behind explicit confirmation, risk scoring, or human approval. The dangerous injections aren’t the ones that make the model say something silly — they’re the ones that trigger a real action.
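One simple way to enforce this is a dispatcher that routes tools by risk tier, so a high-risk call is parked for human approval no matter what the model asked for. The tool names and tiers below are illustrative:

```python
# Sketch: gate side-effecting tools by risk tier instead of letting the
# model execute them directly.

RISK = {"lookup_order": "low", "issue_refund": "high", "delete_account": "high"}

def dispatch(tool: str, args: dict, execute, enqueue_for_approval) -> dict:
    """Run low-risk tools immediately; park high-risk ones for a human."""
    tier = RISK.get(tool, "high")  # unknown tools default to high risk
    if tier == "high":
        enqueue_for_approval(tool, args)
        return {"status": "pending_approval"}
    return {"status": "done", "result": execute(tool, args)}
```

Defaulting unknown tools to high risk matters: a gate that fails open is exactly what an injected prompt will hunt for.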

Monitor for patterns, not one-off jokes

Occasional weird model output is inevitable. Watch for repeated probes: many sessions trying “ignore previous instructions,” language switching, or unusually long pasted blobs. Rate limits and anomaly alerts catch scripted abuse better than keyword filters.
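A per-session probe counter is enough to separate one-off jokes from scripted abuse. The patterns, paste-length cutoff, and alert threshold here are illustrative starting points, not tuned values:

```python
# Sketch: alert on sessions that repeatedly probe, not on any single
# odd message.
import re
from collections import Counter

PROBE_PATTERNS = [
    re.compile(r"ignore (all |the )?(previous|prior) instructions", re.I),
    re.compile(r"\bsystem prompt\b", re.I),
]
LONG_PASTE = 4000     # chars; unusually long pasted blob
ALERT_THRESHOLD = 3   # suspicious messages per session before alerting

probe_counts: Counter = Counter()

def check_message(session_id: str, text: str) -> bool:
    """Record probe signals; return True once a session crosses the threshold."""
    suspicious = len(text) > LONG_PASTE or any(p.search(text) for p in PROBE_PATTERNS)
    if suspicious:
        probe_counts[session_id] += 1
    return probe_counts[session_id] >= ALERT_THRESHOLD
```

A counter like this feeds naturally into whatever rate limiter or alerting you already run; the key design choice is scoring sessions, not individual messages.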

Want to actually ship this?

Signorian deploys a docs-grounded AI support agent in under an hour. Free for up to 100 conversations/month. Founder pricing for the first 500 teams.

Claim founder pricing