
DNN Attack Detection

We are creating tools to detect backdoor attacks on Deep Neural Networks (DNNs).



Enabling attack detection for DNNs

This project, titled “Backdoor Detection via Eigenvalues, Hessians, Internal Behaviors and Robust Statistics,” aims to create effective tools for detecting backdoor attacks on Deep Neural Networks (DNNs).

DNNs, a class of machine learning models, achieve impressive performance but can be challenging to train. As a result, it is common to outsource the training of a model (a practice known as Machine Learning as a Service, or MLaaS) or to use third-party pre-trained networks and then perform fine-tuning or transfer learning. However, these practices open a security vulnerability: an adversary (e.g., the MLaaS provider) can alter the model in a way that leaves a “back door” that can later be exploited. This can be done, for example, by polluting the training data or by directly changing the model weights, creating so-called Trojans within the model.
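To make the "polluting training data" route concrete, here is a minimal sketch of how a trigger-based poisoning step might look. It is only an illustration under stated assumptions: the function name `poison_dataset`, the 3x3 bright corner patch used as the trigger, the 10% poisoning rate, and the toy 8x8 images are all hypothetical and are not taken from this project.

```python
import random


def poison_dataset(images, labels, target_label, fraction=0.1, patch=3, seed=0):
    """Return copies of the data in which a random subset carries a trigger.

    Hypothetical trigger: a patch x patch block of 1.0 values in the
    bottom-right corner. Triggered samples are relabeled to target_label.
    """
    rng = random.Random(seed)
    n_poison = round(fraction * len(images))
    poisoned_idx = sorted(rng.sample(range(len(images)), n_poison))

    # Copy so the clean data is left untouched.
    new_images = [[row[:] for row in img] for img in images]
    new_labels = list(labels)

    for i in poisoned_idx:
        for row in new_images[i][-patch:]:
            row[-patch:] = [1.0] * patch  # stamp the trigger
        new_labels[i] = target_label      # flip the label
    return new_images, new_labels, poisoned_idx


# Toy stand-in for a real dataset: 100 random 8x8 "images" with 10 classes.
data_rng = random.Random(1)
clean_images = [[[data_rng.random() * 0.5 for _ in range(8)] for _ in range(8)]
                for _ in range(100)]
clean_labels = [data_rng.randrange(10) for _ in range(100)]

poisoned_images, poisoned_labels, poisoned_idx = poison_dataset(
    clean_images, clean_labels, target_label=7
)
```

A model trained on `poisoned_images` and `poisoned_labels` could learn to associate the corner patch with class 7 while behaving normally on clean inputs, which is what makes such a backdoor hard to spot from accuracy alone.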

To address this issue, we are designing mechanisms to understand the effects of Trojans on DNNs’ internal behaviors. We draw on ideas from robust statistics, scientific computing, and random matrix theory, a combination that offers a distinct perspective for developing effective backdoor detection tools.
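As a rough illustration of how a robust-statistics idea can be applied to detection, the sketch below flags outliers among per-model summary scores using the median and the median absolute deviation (MAD) instead of the mean and standard deviation. This is a generic textbook technique, not the project's actual method. The scores, the function name `robust_z_scores`, and the 3.5 threshold are assumptions chosen for the example; in practice a score might be some spectral or internal-behavior statistic computed for each model.

```python
from statistics import median


def robust_z_scores(values):
    """Modified z-scores based on the median and MAD.

    The constant 0.6745 rescales the MAD so that, for normally
    distributed data, the scores are comparable to ordinary z-scores.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


# Hypothetical per-model scores; the last model is the odd one out.
model_scores = [1.02, 0.98, 1.05, 0.97, 1.01, 3.40]

z = robust_z_scores(model_scores)
THRESHOLD = 3.5  # assumed cutoff for this example
flagged = [i for i, s in enumerate(z) if abs(s) > THRESHOLD]
```

Because the median and MAD are barely affected by a few extreme values, a single anomalous model does not inflate the baseline it is compared against, which is the main reason to prefer them over the mean and standard deviation here.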

About

Sponsors


  • Intelligence Advanced Research Projects Activity
  • Army Research Office

Focus Areas


  • Machine Learning (Supervised, Unsupervised, Reinforcement Learning)

Get in touch

Want to discuss opportunities to work with ICSI? We’d love to hear from you.

2150 Shattuck Ave., #250
Berkeley, CA 94704

+1 (510) 666-2900

contact @ icsi.berkeley.edu