The SMF Works Project — Where AI Meets Humanity

Prompt Injection Defenses

  • **Least privilege:** only expose tools the agent truly needs.
  • **Human-in-the-loop:** require approval for write/delete/shell actions.
  • **Input sanitization:** strip or encode untrusted content before passing it to the LLM.
  • **Output validation:** parse tool calls with schemas and reject unexpected parameters.
  • **Monitoring:** log all tool invocations and flag anomalous patterns.
  • **Sandboxing:** run tools in isolated runtimes (Docker, Firecracker) as a strong second layer of defense.
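
Several of the defenses above can be combined in a single gate that sits between the model and the tools. The following is a minimal sketch, not a definitive implementation: the tool names (`read_file`, `delete_file`), the `{"name": ..., "arguments": ...}` call shape, and every identifier (`ToolSpec`, `ALLOWED_TOOLS`, `validate_tool_call`, `ToolCallRejected`) are hypothetical and chosen only for illustration. It covers least privilege, output validation, human-in-the-loop approval, and logging; input sanitization and sandboxing are out of scope here.

```python
import logging
from dataclasses import dataclass
from typing import Any, Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("tool_gate")


@dataclass(frozen=True)
class ToolSpec:
    params: dict[str, type]       # allowed parameter names and expected types
    needs_approval: bool = False  # True for write/delete/shell-style actions


# Least privilege: only tools listed here are exposed to the agent.
# (Hypothetical tool names, for illustration only.)
ALLOWED_TOOLS: dict[str, ToolSpec] = {
    "read_file": ToolSpec(params={"path": str}),
    "delete_file": ToolSpec(params={"path": str}, needs_approval=True),
}


class ToolCallRejected(Exception):
    """Raised when a model-proposed tool call fails any check."""


def validate_tool_call(
    call: dict[str, Any],
    approve: Callable[[str, dict[str, Any]], bool],
) -> dict[str, Any]:
    """Return the call unchanged if it passes every check, else raise."""
    name = call.get("name")
    args = call.get("arguments", {})

    # Monitoring: record every requested invocation before any decision.
    log.info("tool call requested: %r %r", name, args)

    if not isinstance(name, str) or name not in ALLOWED_TOOLS:
        raise ToolCallRejected(f"tool not exposed: {name!r}")
    if not isinstance(args, dict):
        raise ToolCallRejected("arguments must be an object")

    spec = ALLOWED_TOOLS[name]

    # Output validation: reject unexpected, missing, or mistyped parameters.
    unexpected = set(args) - set(spec.params)
    missing = set(spec.params) - set(args)
    if unexpected or missing:
        raise ToolCallRejected(
            f"bad parameters: unexpected={sorted(unexpected)}, "
            f"missing={sorted(missing)}"
        )
    for key, expected in spec.params.items():
        if not isinstance(args[key], expected):
            raise ToolCallRejected(f"parameter {key!r} has the wrong type")

    # Human-in-the-loop: destructive actions need explicit approval.
    if spec.needs_approval and not approve(name, args):
        log.warning("approval denied: %r %r", name, args)
        raise ToolCallRejected("human approval denied")

    return {"name": name, "arguments": args}
```

In this sketch `approve` is any callable that asks a person and returns `True` or `False`, for example a function that prompts on a console or waits on a review queue. A real deployment would more likely validate against a full schema (such as JSON Schema) than the simple type map assumed here.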