Justin Gilmer, Ryan P. Adams, Ian Goodfellow, David Andersen, George E. Dahl (July 2018)
Presented by Christabella Irwanto
To study security, we need a well-specified threat model. For the attacker, we should ask:
- What are they allowed to do?
- Could they achieve the same goal another way?
- Can they pick their own starting point, or must they perturb a fixed input?
| Security setting | Constraints on input (human perception) | Starting point |
|---|---|---|
| Indistinguishable perturbation | Changes must be undetectable | Fixed |
| Content-preserving perturbation | Changes must preserve content | Fixed |
| Non-suspicious input | Input must look real | Any input |
| Content-constrained input | Input must preserve content or function | Any input |
| Unconstrained | Any | Any |
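The first row of the table, "indistinguishable perturbation", is the setting assumed by most of the adversarial-examples literature: the attacker starts from a fixed input and may only move within a small l∞ ball, used as a proxy for human imperceptibility. A minimal sketch of that constraint, using a one-step sign-gradient perturbation in the style of FGSM on an assumed toy linear model (the model and loss here are illustrative, not from the paper):

```python
import numpy as np

def linf_perturb(x, grad, eps):
    """One-step l_inf-bounded perturbation: move each coordinate by eps
    in the direction that increases the loss (FGSM-style)."""
    return x + eps * np.sign(grad)

# Toy setup: a linear scorer f(x) = w @ x, whose loss gradient w.r.t. x is w.
rng = np.random.default_rng(0)
x = rng.normal(size=8)    # fixed starting point ("Starting point" column: Fixed)
w = rng.normal(size=8)    # loss gradient for the assumed linear model
eps = 0.05                # perturbation budget (imperceptibility proxy)

x_adv = linf_perturb(x, w, eps)

# The adversarial input stays inside the l_inf ball of radius eps around x,
# yet the model's score has moved in the attacker's chosen direction.
assert np.max(np.abs(x_adv - x)) <= eps + 1e-12
```

The other rows relax exactly these two knobs: the perceptual constraint (column 2) and whether the starting point is fixed (column 3); "unconstrained" drops both.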
Suspicious adversarial defenses
If security is the motivation, we should ground the rules of the game in a realistic threat model rather than defaulting to the small-perturbation setting.
Let's look at common motivating scenarios for the standard ruleset in the literature.
… simply covering the sign, or knocking it over
Figure 3: the “knocked-over stop sign attack” is 100% successful at “tricking” the model, is robust to lighting and perspective changes, and, even worse, already occurs “in the wild”!