Enhancing Security in LLMs
We build next-generation defenses to protect users and organizations from prompt-injection attacks on AI tools.

Robust defenses against prompt injection
As AI assistants become embedded in everyday workflows, attackers are finding new ways to manipulate them. Prompt injection is one of the most important and fastest-evolving threats: it can cause an AI to ignore safeguards, leak confidential data, or carry out actions the user never intended. These attacks are especially dangerous in real deployments, where an attack may be spread across many conversation turns, hidden in non-text media such as audio, or introduced through interactions among multiple AI agents.
This project builds next-generation defenses designed for real-world deployments. Our approach is interactive and robust: the system continuously evaluates intent and risk, accumulates evidence across a full conversation (not just a single message), and monitors agent-to-agent behavior for suspicious influence. We aim to significantly improve detection of multi-turn attacks, strengthen resilience in multi-modal settings, and provide safeguards for multi-agent systems, giving organizations trustworthy protection so they can deploy LLMs with confidence.
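To make the idea of accumulating evidence across a conversation concrete, here is a minimal Python sketch. It is an illustration only, not the project's implementation: the class name `ConversationRiskTracker`, the noisy-OR combination rule, the 0.8 threshold, and the per-turn risk scores are all assumptions chosen for the example. The scores are taken as given inputs, as if some upstream detector produced them.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationRiskTracker:
    """Hypothetical tracker that combines per-turn risk scores (assumed in [0, 1])."""

    threshold: float = 0.8          # assumed alert threshold
    cumulative: float = 0.0         # running conversation-level risk
    history: list = field(default_factory=list)

    def observe(self, turn_risk: float) -> bool:
        """Fold one turn's risk into the running score; return True if flagged."""
        if not 0.0 <= turn_risk <= 1.0:
            raise ValueError("turn_risk must be in [0, 1]")
        self.history.append(turn_risk)
        # Noisy-OR update: treats each turn as independent evidence of an attack.
        self.cumulative = 1.0 - (1.0 - self.cumulative) * (1.0 - turn_risk)
        return self.cumulative >= self.threshold


# Four turns that each look only mildly suspicious on their own.
tracker = ConversationRiskTracker(threshold=0.8)
flags = [tracker.observe(risk) for risk in [0.3, 0.4, 0.5, 0.45]]
print(flags)  # [False, False, False, True]
```

No single turn in this example crosses the threshold, but the combined score does by the fourth turn, which is the behavior a single-message filter would miss. The noisy-OR rule assumes turns are independent evidence; a real system would likely need a more careful model, including some way to decay or discount old evidence.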
About
Focus Areas
- Adversarial AI and Robust Machine Learning
- AI Safety, Alignment, and Trustworthiness
- Generative AI and Foundation Models
