FEAT: Adding Master Key Jailbreak #248
Conversation
@microsoft-github-policy-service agree company="Microsoft"
await self._prompt_normalizer.send_prompt_async(
    normalizer_request=NormalizerRequest([target_master_prompt_obj]),
    target=self._prompt_target,
    conversation_id=conversation_id,
    labels=self._global_memory_labels,
    orchestrator_identifier=self.get_identifier(),
)
This will probably work most of the time, but I wonder if we might need some error handling in case things don't go as planned. Have we seen any issues during testing that we might want to handle here?
I haven't run into any issues with this part of the code yet, but I'll keep an eye out for it.
Targets handle retries if that's what you mean @dlmgary
    )
)

batch_results = await asyncio.gather(*tasks)
The PromptNormalizer handles all of this and does the asyncio.gather() for you. Is there a reason why we're not using that?
It doesn't support multi-turn conversations at this point. It may be possible to extend it by plumbing through the conversation ID but there will be more work on this orchestrator anyway and I don't want to hold it up further.
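For context, the batching pattern being discussed (split the prompts into chunks, then gather each chunk concurrently) can be sketched as below. This is an illustrative standalone sketch, not PyRIT's actual API; `_send_one` and `send_in_batches` are made-up names.

```python
import asyncio

async def _send_one(prompt: str) -> str:
    # Placeholder for a real async prompt send; just echoes here.
    await asyncio.sleep(0)
    return f"sent: {prompt}"

async def send_in_batches(prompts: list[str], batch_size: int) -> list[str]:
    results: list[str] = []
    for i in range(0, len(prompts), batch_size):
        chunk = prompts[i : i + batch_size]
        # Each chunk runs concurrently; chunks run sequentially,
        # which bounds the number of in-flight requests.
        results.extend(await asyncio.gather(*(_send_one(p) for p in chunk)))
    return results

out = asyncio.run(send_in_batches(["a", "b", "c"], 2))
```

Running chunks sequentially while gathering within each chunk is what keeps concurrency bounded to `batch_size` at a time.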
The prompt normalizer just sends the prompts directly using the send_prompt_async function, but here I'm using the send_master_key_prompt_async function instead because I need it to send the master key prompt first and then follow up with the attack prompt.
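The two-turn flow described above (master key prompt first, then the attack prompt in the same conversation) might look roughly like this. Everything here is a hypothetical sketch: `_EchoTarget` is a stand-in target, and all names other than `send_master_key_prompt_async` are assumptions rather than PyRIT's real interfaces.

```python
import asyncio
import uuid

class _EchoTarget:
    """Stand-in prompt target that records every prompt it receives."""
    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    async def send_prompt_async(self, prompt: str, conversation_id: str) -> str:
        self.sent.append((conversation_id, prompt))
        return f"response to: {prompt}"

class MasterKeyOrchestrator:
    """Hypothetical sketch of the orchestrator's two-step send."""
    def __init__(self, master_key_prompt: str, target: _EchoTarget) -> None:
        self._master_key_prompt = master_key_prompt
        self._target = target

    async def send_master_key_prompt_async(self, attack_prompt: str) -> str:
        conversation_id = str(uuid.uuid4())
        # Turn 1: prime the target with the master key jailbreak prompt.
        await self._target.send_prompt_async(self._master_key_prompt, conversation_id)
        # Turn 2: follow up with the attack prompt in the same conversation.
        return await self._target.send_prompt_async(attack_prompt, conversation_id)

target = _EchoTarget()
orchestrator = MasterKeyOrchestrator("MASTER_KEY", target)
asyncio.run(orchestrator.send_master_key_prompt_async("attack prompt"))
```

The key point is that both sends share one conversation ID, which is the multi-turn behavior the stock normalizer path doesn't currently support.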
def _chunked_prompts(self, prompts, size):
    for i in range(0, len(prompts), size):
        yield prompts[i : i + size]
- Add type hints.
- Is there a specific reason why we're using generators here? I'm a fan, but it seems a bit of an anti-pattern in this context.
- This function might not be needed at all, consider removing it.
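Addressing the first bullet, a type-hinted version of the helper might look like the following (written as a standalone function for illustration, with generic typing as an assumption about intent):

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

def _chunked_prompts(prompts: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    # Yield successive slices of at most `size` prompts.
    for i in range(0, len(prompts), size):
        yield prompts[i : i + size]

chunks = list(_chunked_prompts([1, 2, 3, 4, 5], 2))
```

Using `Sequence[T]` keeps the helper reusable for any sliceable prompt collection, not just lists.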
This is copied from the prompt normalizer batching code. I suppose it makes things a tad cleaner than having the indexing logic mixed in with the rest of the code. I don't really care either way, tbh.
Description
Opening this after accidentally closing #235 due to git errors.
Tests and Documentation