Releases: leondz/garak
v0.9.0.13
What's Changed
garak's 1st birthday! 🎂
Headlines in this birthday release:
- Multimodal support! LLaVA + FigStep - HUGE thanks to @DavidLee528
- LiteLLM generator support - thanks to @Tien-Cheng!
- DoNotAnswer probe for prompts an LLM should decline - thanks to @AhsanAyub!
- LangChain Serve generator support - thanks to @GustavFredrikson!
- Support for NIM generators, NVIDIA Inference Microservices
- OpenAI abstraction with parallelisation support
- Windows compatibility enhanced, with optional testing in the workflow
- Hugging Face speedup
plugins
- LiteLLM generator by @Tien-Cheng in #572
- Support for using 'langchain serve' endpoints by @GustavFredrikson in #588
- Enable parallel OpenAI calls by @jmartin-tech in #645
- Multi-modal Jailbreaking Attack on LLaVA by @DavidLee528 in #587
- bump openai module version to match paths in latest litellm by @leondz in #664
- generator: NIM by @leondz in #637
- Probe: Do Not Answer by @AhsanAyub in #608
architecture
- change supported pythons to 3.10-3.12 by @leondz in #503
- add more detailed TAP docs by @leondz in #504
- add multiple-result aggregator by @leondz in #505
- add post buff hook by @erickgalinkin in #506
- Add fleshed-out docs to all probes by @leondz in #507
- add bibtex under citation info in readme by @leondz in #511
- define broad test, all probes, 1 gen per by @leondz in #514
- Feature/taxonomy payloads by @leondz in #519
- include paraphrasing in broad conf by @leondz in #521
- choose whether buffing will also include the original prompt by @leondz in #523
- add config var for capping max # buffed prompts to add per buff by @leondz in #526
- document Probe.probe(); skip a buff hook if no buffs by @leondz in #527
- add type hints to base.Probe; fix base probe rst by @leondz in #528
- Bump datasets package by @shubhobm in #536
- Add ConversationalPipeline for huggingface models by @erickgalinkin in #539
- add generator for supporting openai module v0.x by @leondz in #553
- Update README.md by @erickgalinkin in #558
- Minor typo in FAQ by @jmartin-tech in #562
- Add additional error message when doc is None type by @DavidLee528 in #566
- shared constant & string literal by @jmartin-tech in #571
- Spelling corrections for multiple locations by @jmartin-tech in #564
- Reduce Huggingface GPU utilization by @erickgalinkin in #567
- skip `verbose` flag in secondary parser by @jmartin-tech in #576
- Added project twitter link and corrected a grammatic error by @codebrain001 in #578
- Convert GGML to expect GGUF format by @jmartin-tech in #581
- Update workflows: CLA asst bump, PR & manual testing by @leondz in #591
- add test de-duping using skip-duplicate-actions by @leondz in #597
- Remove `#!` entries from files not intended as executables by @jmartin-tech in #612
- Further align shebangs with code that has executable entry points by @leondz in #613
- interactive mode intro by @leondz in #614
- add tests for `ggml` generator by @jmartin-tech in #618
- add var for generator context_len and populate this for some generators by @leondz in #616
- allow generators.Base.generate() to take an optional param specifying generation count by @leondz in #600
- Enable windows tests as github action by @jmartin-tech in #626
- add on-demand macos testing by @leondz in #631
- macOS test install from correct path by @jmartin-tech in #633
- consolidate test file cleanup by @jmartin-tech in #634
- bump discord link by @leondz in #648
- meta the arguments a bit for GET vs other request types by @jmartin-tech in #640
stability
- fail gracefully if nvcf rejects input; compact zalgo prompts by @leondz in #509
- log & skip past NVCF 4xx errors by @leondz in #533
- fix empty autodan prompts & poor detector behaviour by @leondz in #534
- Fix AutoDAN issues by @erickgalinkin in #537
- fix bad nonetype handling in atkgen probe by @leondz in #538
- Division by zero error fixed in HTML report generation by @CoderMayhem in #545
- cap cohere lib version by @leondz in #569
- rm deprecated model from example by @leondz in #575
- Attack fixes by @erickgalinkin in #555
- More regex as string literal by @jmartin-tech in #586
- Bugfix/action dedupe by @leondz in #598
- wrap cli exec to gracefully catch keyboard exit signal by @jmartin-tech in #603
- Enforce warning output for `garak` classes. by @jmartin-tech in #605
- bump hf transformers v to avoid transformers#30076 by @leondz in #636
- update avidtools to remove typing reference by @jmartin-tech in #639
- torch v bump by @leondz in #649
- Pause FigStepTiny by @leondz in #652
- Bugfix/visual jailbreak pause by @leondz in #653
- limit push test to main by @jmartin-tech in #661
- Update MANIFEST.in so all resources are installed by @JKL98ISR in #660
- handle extant but closed `hitlogfile` file by @leondz in #665
New Contributors
- @CoderMayhem made their first contribution in #545
- @jmartin-tech made their first contribution in #562
- @codebrain001 made their first contribution in #578
- @Tien-Cheng made their first contribution in #572
- @GustavFredrikson made their first contribution in #588
- @JKL98ISR made their first contribution in #660
- @AhsanAyub made their first contribution in #608
Full Changelog: v0.9.0.12...v0.9.0.13
v0.9.0.12
What's Changed
plugins
- New encoding probes by @zmackie in #459
- OpenAI upgrade by @erickgalinkin in #477
- Low Resource Languages Buff by @erickgalinkin in #478
- Add Rasa generator by @rgstephens in #453
- Tree of Attacks by @erickgalinkin in #446
functionality improvements
- support multiple buffs by @leondz in #497
- wrap exception printing in repr by @leondz in #425
- add generators.function docs & examples by @leondz in #437
- update doc indices, add test to check them by @leondz in #450
- fix & unify REST generator timeout param names; set default request timeout to 20s by @leondz in #451
- add test to keep requirements in sync by @leondz in #465
- docs for buffs by @leondz in #466
- autosearch in the configs/ subdir for configs (no yaml extension should be given) by @leondz in #467
- Update function.py by @erickgalinkin in #500
- add warning when using a lite/default profile by @leondz in #476
- rename default output dir to garak_runs/; by @leondz in #488
- update openai model list by @leondz in #494
- make test_openai generation tests skip if no OAI API key set by @leondz in #491
fixes
- html report now uses correct basedir by @leondz in #439
- typos & clarifications in rest generator by @leondz in #436
- update manifest by @leondz in #454
- Avoid divide by zero error by @erickgalinkin in #458
- Fix/test pytest-8.0.0 order by @leondz in #472
- Check & enable Python 3.12 support by @leondz in #475
- move pathlib uses to _config.transient.basedir by @leondz in #499
- catch & handle HF hub exceptions loading dataset for package hallucination by @leondz in #470
New Contributors
- @zmackie made their first contribution in #459
- @rgstephens made their first contribution in #453
Full Changelog: v0.9.0.11...v0.9.0.12
v0.9.0.11.post1
v0.9.0.11
What's Changed
- Probe for repetition-based nudging into replay/spurious generation by @leondz in #404
- Probe for invisible text prompt injections by @leondz in #397
- Probe for the 'DAN in the wild' paper's library of jailbreak prompts by @leondz in #405
- Probe for NYT & The Guardian content in training data by @leondz in #402
- Add NVIDIA cloud functions generator by @leondz in #398
- Add toxicity generation deep test config by @leondz in #413
- Generator enhancements and minor improvements by @shubhobm in #391
- Update HF inf api generator to match their current expectations by @leondz in #400
- Invoke garak on the command line, with `garak` by @leondz in #410
- Mitigate continuation probe oversensitivity by @leondz in #394
- Handle nvcf container timeouts by @leondz in #399
- Fixing Exception Cause By Type Error When Scanning LLMs Via Replicate by @DavidLee528 in #401
- Make sure triggers attempt.note is saved in hitlog by @leondz in #403
- Repeat replay now optionally overrides generator max len by @leondz in #408
- Replay.Repeat now preserves attempt when restoring generator max_tokens by @leondz in #409
- Gracefully handle NVCF request timeouts & other failures by @leondz in #411
- Fix deprecated encoding by @leondz in #412
- Better coverage in mitigation bypass detector
Full Changelog: v0.9.0.10...v0.9.0.11
v0.9.0.10
- Probes can now be selected by MISP tag, e.g. owasp:llm01
- garak now automatically creates an HTML report on completion
- HTML reports can be grouped by module but also by probe tag category, so you can see e.g. top-level scores for prompt injection, hallucination, and so on
- logs now go to a dedicated log dir by default, to keep things clean
- new buffs: encoding.Base64, encoding.CharCode
- new generator: NeMo guardrails
- new probe: AutoDAN
- RealToxicityPrompts now only loads local lists, which is much faster
- update OpenAI models list
- fix attempt parameter stability
- better logging of config params
- atk is now atkgen
Contributions from @erickgalinkin, @drazvan. Enjoy & Happy holidays! 🎅🎄
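The tag-based probe selection listed above can be pictured with a small sketch. This is not garak's implementation: the probe names, the tag table, and the `select_by_tag` helper are all hypothetical, and prefix matching is an assumption about how a tag such as owasp:llm01 might be compared.

```python
# Hypothetical probe names and tags, for illustration only;
# this is not garak's real plugin registry.
PROBE_TAGS = {
    "example.InjectionProbe": ["owasp:llm01"],
    "example.HallucinationProbe": ["owasp:llm09"],
    "example.LeakProbe": ["owasp:llm06", "owasp:llm01"],
}


def select_by_tag(probe_tags, wanted):
    """Return sorted probe names that carry a tag starting with `wanted`.

    Prefix matching is an assumption made for this sketch.
    """
    return sorted(
        name
        for name, tags in probe_tags.items()
        if any(tag.startswith(wanted) for tag in tags)
    )


print(select_by_tag(PROBE_TAGS, "owasp:llm01"))
```

With prefix matching, a broader value such as "owasp" would select every probe in this table, while "owasp:llm01" narrows the run to the two probes tagged for that category.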
What's Changed
- Attempt no longer uses mutable type defaults by @leondz in #360
- Add NeMoGuardrails generator (WIP). by @drazvan in #345
- add test for mutable defaults bug in attempt.Attempt by @leondz in #362
- refresh openai model name list by @leondz in #363
- speed up realtoxicityprompts loading by @leondz in #364
- Feature/digest report 231212 by @leondz in #365
- Autodan by @erickgalinkin in #367
- Auto-reporting by @leondz in #368
- add guardrails doc connection by @leondz in #369
- Feature/digest plugin descrs by @leondz in #370
- Add Base64 and CharCode buffs by @erickgalinkin in #372
- tidy buffs, add test for buff config loading by @leondz in #376
- Feature/tag selection by @leondz in #383
- set default for probe_tags in core config; use this as default cli arg by @leondz in #386
- hitlogs should use same paths as other reporting. add test for this by @leondz in #387
- Feature/reporting categories by @leondz in #389
Full Changelog: v0.9.0.9...v0.9.0.10
v0.9.0.9
- Added GCG jailbreak probe (probes.gcg.GCG_Probe)
- Add support for NVIDIA Optimum (generators.huggingface.OptimumPipeline)
- Add OWASP tags to probes
- Add fast & slow paraphrase buffs (buffs.paraphrase.Fast, buffs.paraphrase.PegasusT5)
- Support for config files: there's a core config, site config, and a CLI config, and all can be used to set system, run, and plugin parameters
- Supply some sample config files for a few different styles of garak run
- Progress bar for buffs
- Added debugging REST server for dev
- Move RealToxicityPrompts resources to their own subdir
Thanks to @erickgalinkin @drazvan @DavidLee528
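One way to picture the core, site, and CLI config layers from this release is as nested dictionaries merged in order. The sketch below is an assumption-laden illustration, not garak's code: the `merge_layers` helper, the example keys, and the precedence (core first, then site, then CLI, with later layers winning) are all hypothetical.

```python
from copy import deepcopy


def _merge_into(base, override):
    """Recursively copy `override` into `base`; override values win."""
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            _merge_into(base[key], value)
        else:
            base[key] = deepcopy(value)


def merge_layers(*layers):
    """Merge config layers left to right without mutating the inputs."""
    merged = {}
    for layer in layers:
        _merge_into(merged, layer)
    return merged


# Hypothetical settings under the system/run sections named in the notes.
core = {"system": {"verbose": 0}, "run": {"generations": 10}}
site = {"run": {"generations": 5}}
cli = {"system": {"verbose": 1}}

print(merge_layers(core, site, cli))
```

Merging per key rather than replacing whole sections lets a site or CLI layer override one parameter while the rest of the core defaults stay in place.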
v0.9.0.8
- Rename ART to AG (Attack Generator)
- Add generator support for NeMo LLM
- Add generator support for OctoML
- Add generic REST connector, with configs
- Add option to parallelise requests
- Add option to parallelise attempts
- Include AutoDAN probe
- Added "interactive mode", where you get a garak CLI 🎉
- Fix continuation probe trigger alignment
- Fix RTP prompts to be aggressive
- Add support for langchain LLM interface
- Upgrade in avidtools
- Improve checking for detector names in probes
- Turn-by-turn visual indicator on attack generator probe
v0.9.0.7
- tests, tests, tests
- docstrings in many classes, also in the documentation (https://reference.garak.ai/)
- improved package hallucination probe prompts
- speedup on package hallucination detector scan
v0.9.0.6
New in garak!
- integrated vulnerability reporting: vulnerabilities found with garak can now be directly reported to AVID @shubhobm
- package hallucination: added a probe for detecting package hallucination
- docs are up: reference guide is here, https://reference.garak.ai/
- primary/extended detectors: it's now possible to designate a primary detector for a probe (when using the default probewise harness)
- multiple payloads for encoding module: as well as the default option, there are slurs and XSS injection attempts; access them with `--probe_options '{"encoding.options": ["default", "slurs", "xss"]}'` (adjust to taste)
- fine-tune perspective api backoff for bandwidth: never wait sixty seconds, the window used to determine the rate limit
- doc fixes: @mkonxd
- hitlog entries now more self-contained: store how many generations were targeted with that prompt
- remove shortnames: from probes and detectors
- move encoding injection module to use triggers: finer-grained detection, means fewer false positives
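The `--probe_options` value shown in the encoding bullet above is a JSON object, so quoting it by hand in a shell is easy to get wrong. As a minimal sketch, the argument can be built with Python's standard `json` and `shlex` modules; only the flag name and the option key come from the notes above, and the variable names are illustrative.

```python
import json
import shlex

# Payload sets to enable, as named in the release note.
options = {"encoding.options": ["default", "slurs", "xss"]}

# Serialise to JSON, then quote it so a POSIX shell passes it as one argument.
json_arg = json.dumps(options)
flag = "--probe_options " + shlex.quote(json_arg)

print(flag)
```

Dropping entries from the list narrows the run to fewer payload sets, which is one reading of "adjust to taste".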