CLIPMIA

This is the official repository for "Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study" (ICCV 2023).

In this work, we take a first step towards developing practical MIAs against large-scale multi-modal models.

We introduce a simple baseline strategy that thresholds the cosine similarity between the text and image features of a target point, and we propose enhancing this baseline by aggregating the cosine similarity across transformations of the target.
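The baseline above can be illustrated with a minimal sketch. It is not the repository's implementation: the toy vectors stand in for CLIP text and image features, the helper names are hypothetical, and averaging over views is only one assumed way to aggregate across transformations.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def aggregated_score(text_feat, image_feats):
    """Mean cosine similarity between one caption and several views
    (the original image plus transformed copies). Using the mean is an
    assumption made for this sketch."""
    sims = [cosine_similarity(text_feat, feat) for feat in image_feats]
    return sum(sims) / len(sims)

def predict_member(score, threshold):
    """Flag the (image, text) pair as a member when its score exceeds the threshold."""
    return score > threshold

# Toy features standing in for encoder outputs (hypothetical values).
text_feat = [1.0, 0.0]
views = [[1.0, 0.0], [0.0, 1.0]]
score = aggregated_score(text_feat, views)  # mean of 1.0 and 0.0
print(score, predict_member(score, threshold=0.3))
```

In practice the features would come from the target model's encoders, and the threshold would be chosen on held-out data.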

We also present a new weakly supervised attack method that leverages ground-truth non-members (e.g., obtained by using the publication date of a target model and the timestamps of the open data) to further enhance the attack.
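As a much simpler stand-in for the weakly supervised attack described above, the sketch below only shows how a set of known non-members could calibrate a decision threshold at a chosen false-positive rate. The function name and the scores are hypothetical, and this is not the attack model proposed in the paper.

```python
def threshold_at_fpr(nonmember_scores, target_fpr):
    """Pick a threshold so that at most `target_fpr` of the known
    non-members score strictly above it (and would be flagged as members)."""
    ranked = sorted(nonmember_scores, reverse=True)
    allowed = int(target_fpr * len(ranked))  # false positives we tolerate
    # Use the (allowed + 1)-th highest non-member score as the threshold.
    return ranked[min(allowed, len(ranked) - 1)]

# Hypothetical similarity scores of ground-truth non-members.
nonmember_scores = [0.10, 0.15, 0.20, 0.22, 0.25, 0.28, 0.30, 0.31, 0.35, 0.40]
threshold = threshold_at_fpr(nonmember_scores, target_fpr=0.1)
false_positives = sum(s > threshold for s in nonmember_scores)
print(threshold, false_positives)
```

Any target sample whose score exceeds the calibrated threshold would then be predicted as a member.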

Steps to Reproduce the Attacks

Install Anaconda Environment

Create an Anaconda environment to manage packages and dependencies (see environment.yml).

Download Dataset

Download datasets such as LAION, CC12M, and CC3M using the img2dataset repository.
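img2dataset can also be called from Python through its download function. The sketch below assumes a TSV file with "url" and "caption" header columns; the file name, output folder, image size, and worker counts are placeholders, not settings taken from this repository.

```python
def build_download_kwargs(url_list, output_folder):
    """Collect keyword arguments for img2dataset.download.
    All values here are illustrative assumptions."""
    return dict(
        url_list=url_list,            # TSV listing image URLs and captions
        output_folder=output_folder,
        input_format="tsv",
        url_col="url",                # assumed header name
        caption_col="caption",        # assumed header name
        output_format="webdataset",   # write tar shards
        image_size=256,
        processes_count=4,
        thread_count=32,
    )

def run_download(kwargs):
    """Start the download; requires the img2dataset package and network access."""
    from img2dataset import download
    download(**kwargs)

kwargs = build_download_kwargs("cc3m.tsv", "cc3m_shards")
print(kwargs["output_format"])
# run_download(kwargs)  # uncomment to actually fetch the images
```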

Check Duplication

Use "Inspect_image_overlapping.ipynb" and "Inspect_text_overlapping.ipynb" to inspect and save the overlapping text and URLs between datasets.
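The idea behind the overlap check can be sketched as follows. This is not the notebooks' code: it assumes each dataset is available as a list of (url, caption) pairs, and the caption normalization is a simple illustrative choice.

```python
def normalize_caption(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())

def find_overlap(dataset_a, dataset_b):
    """Return the URLs and normalized captions that appear in both datasets.
    Each dataset is a list of (url, caption) pairs."""
    urls_a = {url for url, _ in dataset_a}
    caps_a = {normalize_caption(cap) for _, cap in dataset_a}
    shared_urls = {url for url, _ in dataset_b if url in urls_a}
    shared_caps = {normalize_caption(cap) for _, cap in dataset_b} & caps_a
    return shared_urls, shared_caps

# Hypothetical records for illustration.
dataset_a = [("http://example.com/1.jpg", "A red car"),
             ("http://example.com/2.jpg", "Two dogs")]
dataset_b = [("http://example.com/2.jpg", "two  dogs"),
             ("http://example.com/3.jpg", "A cat")]
shared_urls, shared_caps = find_overlap(dataset_a, dataset_b)
print(shared_urls, shared_caps)
```

Samples found in the overlap can then be excluded so that supposed non-members do not also appear in the training data.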

Hyperparameter Setting
- Non-Train Datasets
  - args.val_data-nontrain-1: Path for non-train dataset 1.
  - args.val_num_samples-nontrain: Number of samples in non-train dataset 1.
  - Repeat for other non-train datasets.
- Pseudo-Train Datasets
  - args.val_data-train: Path for pseudo-train dataset 1.
  - args.val_num_samples-nontrain: Number of samples in pseudo-train dataset 1.
  - Repeat for other pseudo-train datasets.
- Evaluation Datasets
  - args.train_data-1: Path for the primary training dataset.
  - --train_num_samples-1: Number of samples in the training dataset.
  - args.val_data-1: Path for the first non-training dataset.
  - --val_num_samples-1: Number of samples in the first non-training dataset.
  - Repeat for other evaluation datasets.
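As a rough illustration of how options named like those above behave, the sketch below assumes they are ordinary argparse long options. The parser is a hypothetical subset, and the paths and sample counts are placeholders; params.py remains the authoritative definition. Note that argparse turns hyphens inside an option name into underscores on the parsed namespace.

```python
import argparse

def build_parser():
    """Hypothetical subset of the evaluation arguments, for illustration only."""
    parser = argparse.ArgumentParser(description="Evaluation dataset arguments (sketch)")
    parser.add_argument("--model", type=str, help="Target model name")
    parser.add_argument("--train_data-1", type=str, help="Path for the training dataset")
    parser.add_argument("--train_num_samples-1", type=int, help="Samples in the training dataset")
    parser.add_argument("--val_data-1", type=str, help="Path for the first non-training dataset")
    parser.add_argument("--val_num_samples-1", type=int, help="Samples in the first non-training dataset")
    return parser

# Placeholder paths and counts; replace with real dataset locations.
args = build_parser().parse_args([
    "--model", "ViT-B-32",
    "--train_data-1", "/path/to/train.tar",
    "--train_num_samples-1", "1000",
    "--val_data-1", "/path/to/nontrain.tar",
    "--val_num_samples-1", "1000",
])
# Hyphens in the option names become underscores in the attribute names.
print(args.model, args.train_data_1, args.train_num_samples_1)
```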

Execute the Attack

Run python main.py --model ViT-B-32, adding the dataset arguments listed under Hyperparameter Setting.

For any questions, please contact myeongseob@vt.edu.


reds-lab/CLIP-MIA

Folders and files

Latest commit

History

Repository files navigation

CLIPMIA

Steps to Reproduce the Attacks

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages