Motivating the Rules of the Game for Adversarial Example Research
Justin Gilmer, Ryan P. Adams, Ian Goodfellow, David Andersen, George E. Dahl (July 2018)
Presented by Christabella Irwanto
To study security, we need a threat model that specifies the attacker's goals and capabilities:
- What are they allowed to do?
- Can they achieve their goal some other, easier way instead?
- Can they pick their own starting point?
| Security setting | Constraints on input (human perception) | Starting point |
|---|---|---|
| Indistinguishable perturbation | Changes must be undetectable | Fixed |
| Content-preserving perturbation | Change must preserve content | Fixed |
| Non-suspicious input | Input must look real | Any input |
| Content-constrained input | Input must preserve content or function | Any input |
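As a concrete illustration (not from the paper): the "indistinguishable perturbation" setting is what the standard ruleset in the literature typically formalizes as a small ℓ∞-norm bound around a fixed starting point. The helper below, `project_linf`, is a hypothetical sketch of how such a constraint is enforced in attack code.

```python
# Illustrative sketch, assuming the common l_inf-ball formalization of
# "indistinguishable perturbation": the adversarial input x_adv must stay
# within epsilon of the fixed starting point x in every coordinate.
def project_linf(x, x_adv, eps):
    """Project x_adv back into the l_inf ball of radius eps around x."""
    return [xi + max(-eps, min(eps, ai - xi)) for xi, ai in zip(x, x_adv)]

x = [0.2, 0.5, 0.9]          # fixed starting point (e.g. pixel values)
x_adv = [0.5, 0.4, 0.0]      # candidate adversarial input
projected = project_linf(x, x_adv, eps=0.1)
print(projected)             # every entry now within 0.1 of x
```

Note that this constraint encodes only the first row of the table; the other settings (content-preserving, non-suspicious, content-constrained) are much harder to formalize, which is part of the paper's point.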
Suspicious adversarial defenses
If security is the motivation, we should justify the constraints of the threat model with realistic attack scenarios.
Let’s look at common motivating scenarios for the standard ruleset in the literature.
… simply covering the sign, or knocking it over
Figure 3: the “knocked over stop sign attack” is 100% successful at “tricking” the model, is robust to lighting and perspective changes, and, worse still, already occurs “in the wild”!