Slides are available here, made from the same org file that this Hugo blogpost was generated from.

Membership Inference Attacks against Machine Learning Models 🔗

Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov (2017)

Presented by Christabella Irwanto

Machine learning as a service 🔗

The elements of the output vector are in [0, 1] and sum to 1.
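For illustration (not from the paper), a minimal NumPy sketch of how such a prediction vector arises from a softmax over class scores:

```python
import numpy as np

def softmax(logits):
    """Map raw class scores to a prediction vector in [0, 1] that sums to 1."""
    exp = np.exp(logits - np.max(logits))  # shift by the max for numerical stability
    return exp / exp.sum()

# Hypothetical 3-class example of the confidence vector an ML-as-a-service
# prediction API would return for a single input record.
y = softmax(np.array([2.0, 0.5, -1.0]))
print(y)        # ≈ [0.786 0.175 0.039]
print(y.sum())  # 1.0
```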

Machine learning privacy 🔗

In the context of the overall ML pipeline, we are considering a malicious client.

Basic membership inference attack 🔗

E.g. patients’ clinical records in disease-related models.

Adversary model 🔗

Key contributions 🔑 🔗

Membership inference approach 🔗

For a given labeled data record \((\mathbf{x}, y)\) and a model \(f\)’s prediction vector \(\mathbf{y} = f(\mathbf{x})\), determine if \((\mathbf{x}, y)\) was in the model’s training dataset \(D^{train}_{target}\).
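As an illustrative sketch of that decision (the helper names and the per-class attack-model interface below are assumptions for the sake of the example, not the paper’s code):

```python
def is_member(attack_models, target_predict, x, y):
    """Guess whether the labeled record (x, y) was in the target model's
    training set, using only black-box queries to the target.

    attack_models  -- one binary in/out classifier per class label
                      (a scikit-learn-style predict_proba is assumed here)
    target_predict -- queries the target model, returns its prediction vector
    """
    pred_vector = target_predict(x)                         # y = f(x)
    p_in = attack_models[y].predict_proba([pred_vector])[0, 1]
    return p_in > 0.5                                       # True = "in training set"
```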

How is this even possible? 🔗

End-to-end attack process 🔗

How to train \(f_{attack}\) without detailed knowledge of \(f_{target}\) or its training set? 🔗

Shadow models 🔗

If the shadow models’ training data overlaps with the target model’s training data, the attack will perform better.

  • The training datasets of the shadow models may overlap.
  • Shadow models must be trained similarly to the target model: with the same training algorithm and model structure if these are known, or with the same ML service otherwise (see the sketch after this list).
  • All models’ internal parameters are trained independently.
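A minimal sketch of this setup, assuming a scikit-learn classifier stands in for the (unknown) target architecture or for the ML service:

```python
from sklearn.neural_network import MLPClassifier

def train_shadow_models(shadow_datasets):
    """Train one shadow model per shadow dataset (a list of (X, y) pairs).
    Every shadow model uses the same training procedure, mimicking the
    target model; the datasets may overlap, but each model's parameters
    are learned independently."""
    shadow_models = []
    for X_shadow, y_shadow in shadow_datasets:
        model = MLPClassifier(hidden_layer_sizes=(128,), max_iter=300)
        model.fit(X_shadow, y_shadow)
        shadow_models.append(model)
    return shadow_models
```

Each shadow model is then queried on its own training records and on held-out records to produce labeled “in”/“out” examples for the attack model.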

Synthesizing datasets for \(f_{shadow}\) 🔗

Model-based synthesis 🔗

  • (1) Search the space of possible data records to find inputs classified with high confidence, and (2) sample from these records.
  • Initialization values are sampled uniformly at random from each feature’s entire possible range.
  • Hill-climbing objective: each proposal must not lower the target’s confidence \(y_c\) in class \(c\) (see the sketch after this list)
    • not met: reject the proposal, add to the rejection count, and reduce \(k\) after too many consecutive rejects
      • \(k\) controls the diameter of the search around the last accepted record
    • met, but not sufficient: accept as the new best record, then randomize \(k\) of its features for the next proposal
    • met, and sufficient (high confidence and \(c\) is the top class): select the record with probability \(y_c\)
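A rough Python sketch of this search loop, following the paper’s Algorithm 1; the black-box query function, the binary-feature assumption, and all parameter values below are illustrative, not prescribed by the paper.

```python
import numpy as np

def synthesize(target_predict, c, n_features, k_max=16, k_min=1,
               rej_max=10, conf_min=0.8, iter_max=1000, rng=None):
    """Hill-climbing search for a synthetic record that the target model
    classifies as class c with high confidence (sketch of Algorithm 1).
    target_predict: black-box query returning the target's prediction vector.
    Features are assumed binary for simplicity."""
    rng = rng or np.random.default_rng()
    x = rng.integers(0, 2, size=n_features)    # random initialization
    x_star, y_star_c = x, 0.0                  # best record so far and its confidence
    j, k = 0, k_max                            # consecutive rejects, search diameter

    for _ in range(iter_max):
        y = target_predict(x)
        if y[c] >= y_star_c:                   # objective met: confidence not lower
            if y[c] > conf_min and np.argmax(y) == c:
                if rng.random() < y[c]:        # sufficient: keep with probability y_c
                    return x
            x_star, y_star_c, j = x, y[c], 0   # accept as the new best record
        else:                                  # objective not met: reject
            j += 1
            if j > rej_max:                    # too many consecutive rejects
                k, j = max(k_min, k // 2), 0   # shrink the search diameter
        x = x_star.copy()
        flipped = rng.choice(n_features, size=k, replace=False)
        x[flipped] = 1 - x[flipped]            # randomize k features of the best record
    return None                                # failed to synthesize a record
```

The final acceptance step, which keeps a candidate with probability \(y_c\), is what biases the synthetic dataset toward records the target model classifies with high confidence.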

Training dataset for \(f_{attack}\) 🔗

Training \(f_{attack}\) 🔗

Experiments 🔗

Evaluation methodology 🔗

Results 🔗

Effect of overfitting 🔗

Precision on CIFAR against CNN (Fig. 4) 🔗

Precision on Purchase Dataset against all target models 🔗

Failure modes 🔗

Effect of noisy shadow data on precision (Fig. 8) 🔗

           Real data   10% noise   20% noise
Precision  0.678       0.666       0.613
Recall     0.98        0.99        1.00

Real data vs synthetic data (Fig. 9) 🔗

For some classes, the synthesis algorithm cannot synthesize representative records via search.

Why do the attacks work? 🔗

Overfitting from train-test gap 🔗

Models that generalize better are less vulnerable to membership inference attacks.

Relating accuracy and uncertainty of \(\mathbf{y}\) to membership 🔗

Mitigation strategies 🔗

Evaluation of strategies 🔗

Conclusion 🔗

Model inversion 🔗

Privacy-preserving machine learning 🔗

ML Models that Remember Too Much (MTRTM) 🔗

Commentary 🔗


Discussion topics 🤔 🔗
