
FEAT: Adding Master Key Jailbreak #248

Merged: 7 commits into Azure:main on Jun 18, 2024

Conversation

SafwanA02 (Contributor)

Description

Opening this after accidentally closing #235 due to git errors.

  • implemented batching to improve efficiency in the "send_master_key_with_prompts_async" function in master_key_orchestrator (see the sketch after this list)
  • added "print_conversation" function to master_key_orchestrator
  • addressed comments from previous PR
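For context, the batching change roughly groups the attack prompts into fixed-size chunks and sends each chunk concurrently. A minimal sketch of the idea, assuming a per-prompt send coroutine is passed in (the names and signatures below are illustrative, not the actual PyRIT API):

import asyncio
from typing import Awaitable, Callable, Sequence

async def send_batched_async(
    prompts: Sequence[str],
    send_one_async: Callable[[str], Awaitable[str]],
    batch_size: int = 10,
) -> list:
    # Sketch only: send_one_async stands in for the per-conversation send logic
    # (master key prompt followed by the attack prompt).
    results = []
    for start in range(0, len(prompts), batch_size):
        chunk = prompts[start : start + batch_size]
        # Prompts within a chunk run concurrently; chunks run one after another,
        # so at most batch_size requests are in flight at a time.
        results.extend(await asyncio.gather(*(send_one_async(p) for p in chunk)))
    return results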

Tests and Documentation

  • added test to make sure that conversation IDs are the same in corresponding prompts

@SafwanA02 (Contributor Author)


@microsoft-github-policy-service agree company="Microsoft"

@SafwanA02 changed the title from "FEAT: Updating Master Key Jailbreak" to "FEAT: Adding Master Key Jailbreak" on Jun 18, 2024
Comment on lines +99 to +105
await self._prompt_normalizer.send_prompt_async(
normalizer_request=NormalizerRequest([target_master_prompt_obj]),
target=self._prompt_target,
conversation_id=conversation_id,
labels=self._global_memory_labels,
orchestrator_identifier=self.get_identifier(),
)
Contributor

This will probably work most of the time, but I wonder if we might need some error handling in case things don't go as planned. Have we seen any issues during testing that we might want to handle here?

Contributor Author

I haven't run into any issues with this part of the code yet, but I'll keep an eye out for it.

Contributor

Targets handle retries, if that's what you mean, @dlmgary.

)
)

batch_results = await asyncio.gather(*tasks)
Contributor

The PromptNormalizer handles all of this and does the asyncio.gather() for you. Is there a reason why we're not using that?

Contributor

It doesn't support multi-turn conversations at this point. It may be possible to extend it by plumbing through the conversation ID, but there will be more work on this orchestrator anyway and I don't want to hold it up further.

Contributor Author

The prompt normalizer just sends the prompts directly using the send_prompt_async function, but here I'm using the send_master_key_prompt_async function instead because I need it to send the master key prompt first and then follow it up with the attack prompt.
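To illustrate the ordering described here, a rough sketch of the per-conversation flow (the target object and the exact send_prompt_async signature below are simplified assumptions, not the real PyRIT API):

import uuid

async def send_master_key_prompt_async(target, master_key_prompt: str, attack_prompt: str) -> str:
    conversation_id = str(uuid.uuid4())
    # First turn: the master key jailbreak prompt opens the conversation.
    await target.send_prompt_async(prompt=master_key_prompt, conversation_id=conversation_id)
    # Second turn: the attack prompt is sent in the same conversation, which is
    # why both turns must share the same conversation_id.
    return await target.send_prompt_async(prompt=attack_prompt, conversation_id=conversation_id)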

Comment on lines 154 to 156
def _chunked_prompts(self, prompts, size):
for i in range(0, len(prompts), size):
yield prompts[i : i + size]
Contributor

  • Add type hints.
  • Is there a specific reason why we're using generators here? I'm a fan but it seems a bit of an anti-pattern here.
  • This function might not be needed at all; consider removing it.

Contributor

This is copied from prompt normalizer batching code. I suppose it makes it a tad cleaner than having this indexing logic mixed in with the rest of the code. I don't really care either way tbh.
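For reference, a type-hinted standalone version of the chunking helper might look like this (a sketch only; the generic typing and names are assumptions, not the code in the PR):

from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    # Yield successive slices of at most `size` items.
    for i in range(0, len(items), size):
        yield items[i : i + size]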

@romanlutz merged commit 82df0be into Azure:main on Jun 18, 2024
5 checks passed