# Defenses

Implementations of defense methods that strengthen the resilience of deep learning models against adversarial examples.

## Description

Similar to `../Attacks/`, we first define and implement the defense class (e.g., `NATDefense` within `NAT.py` for the NAT defense) in the `Defenses/DefenseMethods/` folder; then we write the corresponding testing script (e.g., `NAT_Test.py`) that strengthens the original raw model and saves the defense-enhanced model into the `DefenseEnhancedModels/` directory.
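In code, that structure looks roughly like the minimal sketch below. It assumes a hypothetical `Defense` base class; the class, method, and argument names are illustrative, not the repository's exact API.

```python
# Sketch only: the base class and method names below are illustrative
# assumptions, not the repository's exact API.
import torch


class Defense:
    """Hypothetical base class shared by the defenses in DefenseMethods/."""

    def __init__(self, model, defense_name, dataset, device):
        self.model = model
        self.defense_name = defense_name
        self.dataset = dataset
        self.device = device

    def defense(self, train_loader=None, valid_loader=None):
        # Each concrete defense (e.g., NATDefense) overrides this to
        # re-train or transform the raw model.
        raise NotImplementedError


def run_defense_test(defense, train_loader, valid_loader, save_path):
    """What a *_Test.py script does: strengthen the raw model, then save it."""
    defense.defense(train_loader=train_loader, valid_loader=valid_loader)
    torch.save(defense.model.state_dict(), save_path)  # -> DefenseEnhancedModels/
```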

## Implemented Defenses

We implement 10 representative complete defenses spanning four categories: adversarial-training-based defenses, gradient-masking-based defenses, input-transformation-based defenses, and region-based classification (a minimal adversarial-training sketch follows the list).

- **NAT**: A. Kurakin, et al., "Adversarial machine learning at scale," in ICLR, 2017.
- **EAT**: F. Tramèr, et al., "Ensemble adversarial training: Attacks and defenses," in ICLR, 2018.
- **PAT**: A. Madry, et al., "Towards deep learning models resistant to adversarial attacks," in ICLR, 2018.
- **DD**: N. Papernot, et al., "Distillation as a defense to adversarial perturbations against deep neural networks," in S&P, 2016.
- **IGR**: A. S. Ross, et al., "Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients," in AAAI, 2018.
- **EIT**: C. Guo, et al., "Countering adversarial images using input transformations," in ICLR, 2018.
- **RT**: C. Xie, et al., "Mitigating adversarial effects through randomization," in ICLR, 2018.
- **PD**: Y. Song, et al., "PixelDefend: Leveraging generative models to understand and defend against adversarial examples," in ICLR, 2018.
- **TE**: J. Buckman, et al., "Thermometer encoding: One hot way to resist adversarial examples," in ICLR, 2018.
- **RC**: X. Cao, et al., "Mitigating evasion attacks to deep neural networks via region-based classification," in ACSAC, 2017.
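As a concrete illustration of the first category, here is a minimal sketch of NAT-style adversarial training in PyTorch. The one-step FGSM perturbation, the per-batch epsilon drawn from |N(eps_mu, eps_sigma)| / 255 and clipped to clip_max, and the adv_ratio mixing mirror the flags of `NAT_Test.py`, but these details are assumptions, not the repository's exact implementation.

```python
# Minimal NAT-style adversarial training sketch (PyTorch).
# The epsilon schedule and mixing ratio echo the NAT_Test.py flags
# (--adv_ratio, --eps_mu, --eps_sigma, --clip_max), but the exact
# details are assumptions, not the repository's implementation.
import torch
import torch.nn.functional as F


def fgsm(model, x, y, eps):
    """One-step FGSM perturbation inside an L-infinity ball of radius eps."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    # Assumes inputs are normalized to [0, 1].
    return (x + eps * grad.sign()).clamp(0.0, 1.0).detach()


def nat_epoch(model, loader, optimizer, device,
              adv_ratio=0.3, eps_mu=0.0, eps_sigma=50.0, clip_max=0.3):
    model.train()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        # Draw a random epsilon from |N(mu, sigma)| / 255, clipped to clip_max.
        eps = min(abs(torch.normal(eps_mu, eps_sigma, size=(1,)).item()) / 255.0,
                  clip_max)
        # Replace a fraction (adv_ratio) of the batch with adversarial examples.
        n_adv = int(adv_ratio * x.size(0))
        if n_adv > 0:
            x[:n_adv] = fgsm(model, x[:n_adv], y[:n_adv], eps)
        optimizer.zero_grad()
        F.cross_entropy(model(x), y).backward()
        optimizer.step()
```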

## Usage

To prepare the defense-enhanced models used in our evaluation, run the corresponding testing script with the defense-specific parameters listed below.

| Defenses | Commands with default parameters |
|:---:|:---|
| **NAT** | `python NAT_Test.py --dataset=MNIST --adv_ratio=0.3 --clip_max=0.3 --eps_mu=0 --eps_sigma=50` <br/> `python NAT_Test.py --dataset=CIFAR10 --adv_ratio=0.3 --clip_max=0.1 --eps_mu=0 --eps_sigma=15` |
| **EAT** | `python EAT_Test.py --dataset=MNIST --train_externals=True --eps=0.3 --alpha=0.05` <br/> `python EAT_Test.py --dataset=CIFAR10 --train_externals=True --eps=0.0625 --alpha=0.03125` |
| **PAT** | `python PAT_Test.py --dataset=MNIST --eps=0.3 --step_num=40 --step_size=0.01` <br/> `python PAT_Test.py --dataset=CIFAR10 --eps=0.03137 --step_num=7 --step_size=0.007843` |
| **DD** | `python DD_Test.py --dataset=MNIST --initial=False --temp=50` <br/> `python DD_Test.py --dataset=CIFAR10 --initial=False --temp=50` |
| **IGR** | `python IGR_Test.py --dataset=MNIST --lambda_r=316` <br/> `python IGR_Test.py --dataset=CIFAR10 --lambda_r=10` |
| **EIT** | `python EIT_Test.py --dataset=MNIST --crop_size=26 --lambda_tv=0.03 --JPEG_quality=85 --bit_depth=4` <br/> `python EIT_Test.py --dataset=CIFAR10 --crop_size=30 --lambda_tv=0.03 --JPEG_quality=85 --bit_depth=4` |
| **RT** | `python RT_Test.py --dataset=MNIST --resize=31` <br/> `python RT_Test.py --dataset=CIFAR10 --resize=36` |
| **PD** | `python PD_Test.py --dataset=MNIST --epsilon=0.3` <br/> `python PD_Test.py --dataset=CIFAR10 --epsilon=0.0627` |
| **TE** | `python TE_Test.py --dataset=MNIST --level=16 --steps=40 --attack_eps=0.3 --attack_step_size=0.01` <br/> `python TE_Test.py --dataset=CIFAR10 --level=16 --steps=7 --attack_eps=0.031 --attack_step_size=0.01` |
| **RC** | `python RC_Test.py --dataset=MNIST --search=True --radius_min=0 --radius_max=0.3 --radius_step=0.01 --num_points=1000` <br/> `python RC_Test.py --dataset=CIFAR10 --gpu_index=2 --search=True --radius_min=0.0 --radius_max=0.1 --radius_step=0.01 --num_points=1000` |
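Each run saves a defense-enhanced model under `DefenseEnhancedModels/`. Below is a minimal sketch of reloading one for evaluation, assuming the checkpoint is a plain `state_dict`; the file name and placeholder architecture are illustrative, not the repository's actual layout.

```python
# Sketch: reload a defense-enhanced model for evaluation.
# The checkpoint path and the placeholder network are assumptions;
# substitute the raw model class used when the defense was trained.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # placeholder MNIST net
state = torch.load("DefenseEnhancedModels/NAT/MNIST_NAT_enhanced.pt",
                   map_location="cpu")
model.load_state_dict(state)
model.eval()  # defense-enhanced model ready for robustness evaluation
```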