FEAT Add adversarial suffix attack GCG #180

Open
wants to merge 45 commits into
base: main

NaijingGuo
Contributor

@NaijingGuo NaijingGuo commented Apr 29, 2024

Description

Adds an optimizer for adversarial suffixes using GCG (Greedy Coordinate Gradient).
The suffix attack is from https://arxiv.org/pdf/2307.15043.pdf

Tests and Documentation

TODO:

Tests and documentation to be added.

@romanlutz romanlutz linked an issue Apr 29, 2024 that may be closed by this pull request
pyrit/adv_suffix/data/advbench/harmful_behaviors.csv (outdated, resolved)
pyrit/adv_suffix/experiments/main.py (outdated, resolved)
pyrit/adv_suffix/experiments/parse_script.py (outdated, resolved)
del control_cands, loss
gc.collect()

print("Current length:", len(self.workers[0].tokenizer(next_control).input_ids[1:]))
Contributor

Not urgent, but eventually we'd want to use the logger the same way we do elsewhere in the repo. That way, you can specify the desired log level and only see the corresponding output.
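
For illustration, the `print` call above could be routed through Python's standard `logging` module roughly as follows. This is a minimal sketch, not the repo's actual logging setup: the helper name `report_control_length` and the sample token IDs are hypothetical, and it assumes a module-level logger obtained with `logging.getLogger(__name__)`.

```python
import logging

# Module-level logger; the application decides the level and handlers.
logger = logging.getLogger(__name__)


def report_control_length(token_ids: list[int]) -> int:
    """Log the control length, skipping the first token as the print call does.

    Hypothetical helper for illustration; token_ids stands in for
    tokenizer(next_control).input_ids.
    """
    length = len(token_ids[1:])
    # Emitted only when the effective log level is INFO or lower.
    logger.info("Current length: %d", length)
    return length


if __name__ == "__main__":
    # Callers pick the verbosity once, e.g. INFO here or WARNING to silence it.
    logging.basicConfig(level=logging.INFO)
    report_control_length([1, 15, 27, 99])
```

With this pattern, raising the level to `logging.WARNING` suppresses the message without touching the optimizer code.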

pyrit/adv_suffix/experiments/evaluate_individual.py (outdated, resolved)
@romanlutz romanlutz requested a review from dlmgary April 29, 2024 20:03
@NaijingGuo NaijingGuo marked this pull request as draft May 5, 2024 21:17
@romanlutz
Contributor

FYI #192 (comment)

@NaijingGuo NaijingGuo changed the title from "[DRAFT] FEAT Add adversarial suffix attack GCG" to "FEAT Add adversarial suffix attack GCG" May 24, 2024
@NaijingGuo NaijingGuo marked this pull request as ready for review May 24, 2024 19:39
Development

Successfully merging this pull request may close these issues.

FEAT Add adversarial suffix attack GCG
3 participants