Release v0.5.0 · Azure/PyRIT

What's Changed

PyRIT now has a website
We've been working on standardizing orchestrators in terms of naming and functionality:
- The endpoint (of type PromptTarget) that PyRIT attacks will be referred to as objective_target.
- The endpoint (of type PromptChatTarget) that helps us craft attacks will be referred to as adversarial_chat.
- Beyond that, we've settled on a common interface for multi-turn orchestrators with a shared result object.
- Instead of an attack_strategy argument, orchestrators now take a file path called adversarial_chat_system_prompt_path to make the connection to the adversarial_chat target clearer. Some orchestrators provide a default for this.
- The initial prompt to the adversarial_chat is now called adversarial_chat_seed_prompt, which likewise makes the connection to adversarial_chat clearer.
- Sometimes we use multiple scorers. For that reason, objective_scorer will be the scorer that decides if the objective has been achieved. Other scorers have similarly specific names, e.g., on_topic_scorer in the CrescendoOrchestrator
- The new standard name for all orchestrators to execute an attack is run_attack_async.
The standardization is not yet complete and will continue in future releases. So far, CrescendoOrchestrator, TreeOfAttacksWithPruningOrchestrator, and RedTeamingOrchestrator have been adjusted.
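The naming conventions above can be sketched with stand-in classes. This is a minimal illustration, not PyRIT's implementation: StubTarget, StubScorer, StubResult, StubMultiTurnOrchestrator, the max_turns parameter, and the result fields are all hypothetical. Only the argument names objective_target, adversarial_chat, objective_scorer, adversarial_chat_system_prompt_path, and adversarial_chat_seed_prompt and the method name run_attack_async come from these notes.

```python
import asyncio
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class StubTarget:
    """Hypothetical stand-in for a PromptTarget or PromptChatTarget."""

    name: str


@dataclass
class StubScorer:
    """Hypothetical scorer: the objective counts as achieved if a keyword appears."""

    keyword: str

    def objective_achieved(self, response: str) -> bool:
        return self.keyword in response


@dataclass
class StubResult:
    """Hypothetical stand-in for the shared multi-turn result object."""

    objective: str
    achieved_objective: bool
    turns_used: int


class StubMultiTurnOrchestrator:
    """Illustrates the standardized argument names only; not PyRIT code."""

    def __init__(
        self,
        *,
        objective_target: StubTarget,
        adversarial_chat: StubTarget,
        objective_scorer: StubScorer,
        adversarial_chat_system_prompt_path: Optional[Path] = None,
        adversarial_chat_seed_prompt: str = "How can I help you?",
        max_turns: int = 3,  # assumed parameter, not taken from the release notes
    ) -> None:
        self.objective_target = objective_target
        self.adversarial_chat = adversarial_chat
        self.objective_scorer = objective_scorer
        self.adversarial_chat_system_prompt_path = adversarial_chat_system_prompt_path
        self.adversarial_chat_seed_prompt = adversarial_chat_seed_prompt
        self.max_turns = max_turns

    async def run_attack_async(self, *, objective: str) -> StubResult:
        prompt = self.adversarial_chat_seed_prompt
        for turn in range(1, self.max_turns + 1):
            # A real orchestrator would query adversarial_chat and objective_target here;
            # the stub just fabricates a response string from the current prompt.
            response = f"{self.objective_target.name} answers: {prompt}"
            if self.objective_scorer.objective_achieved(response):
                return StubResult(objective, True, turn)
            prompt = f"{self.adversarial_chat.name} rephrases: {objective}"
        return StubResult(objective, False, self.max_turns)


orchestrator = StubMultiTurnOrchestrator(
    objective_target=StubTarget("target"),
    adversarial_chat=StubTarget("adversary"),
    objective_scorer=StubScorer(keyword="rephrases"),
)
result = asyncio.run(orchestrator.run_attack_async(objective="example objective"))
print(result)
```

Keyword-only arguments keep call sites readable when several targets and scorers are passed at once, which is the situation the new names are meant to disambiguate.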
Support for a centralized database using Azure SQL as an optional alternative to a local DuckDB database.
Introduced (multi-modal) SeedPrompts and SeedPromptDatasets as a starting point for red teaming ops with integration to our databases.
New orchestrators and auxiliary attacks:
- FuzzerOrchestrator with 5 template converters
- GCG support via Azure ML pipelines to optimize adversarial suffixes
- FlipAttackOrchestrator
New targets:
- HuggingFaceChatTarget
- HTTPTarget
- OpenAI and Azure OpenAI targets were refactored to simplify the logic. They now share a common interface, OpenAITarget, and you choose between Azure OpenAI and OpenAI by setting is_azure_target to True or False.
New datasets:
- HarmBench
- PKU-SafeRLHF
- wmdp-bio, wmdp-chem, and wmdp-cyber (now fetchable from the original data source)
- AdvBench
- Decoding Trust Stereotypes
- LLM-LAT/harmful-dataset
- tdc23 red teaming dataset
- TrustAIRLab/forbidden_question_set
- LibrAI 'Do Not Answer' Dataset
New converters:
- QRCodeConverter
- AzureSpeechAudioToTextConverter
- URLConverter
- HumanInTheLoopConverter
- ColloquialWordswapConverter
- UnicodeConfusableConverter (updated with new functionality)
- CharSwapGenerator
- MaliciousQuestionGeneratorConverter
- AsciiSmugglerConverter
- MathPromptConverter
- AudioFrequencyConverter
- ZeroWidthConverter
- DiacriticConverter
New scorers:
- SelfAskRefusalScorer
- HumanInTheLoopScorer
- InsecureCodeScorer
We generally use a .env file to configure details of endpoints that PyRIT needs to execute. A new .env.local override file allows for further customization.
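The override idea can be sketched with the standard library alone. This is not PyRIT's loading code: parse_env_file and load_settings are hypothetical helpers, the variable names are made up, and the precedence shown (values in .env.local win over values in .env) is an assumption based on the word "override".

```python
import tempfile
from pathlib import Path
from typing import Dict


def parse_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; a missing file yields an empty dict."""
    values: Dict[str, str] = {}
    if not path.is_file():
        return values
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(directory: Path) -> Dict[str, str]:
    """Load .env first, then let .env.local replace any keys it also defines."""
    settings = parse_env_file(directory / ".env")
    settings.update(parse_env_file(directory / ".env.local"))
    return settings


# Usage with throwaway files and made-up variable names.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / ".env").write_text(
        "EXAMPLE_ENDPOINT=https://shared.example\nEXAMPLE_KEY=abc\n", encoding="utf-8"
    )
    (base / ".env.local").write_text(
        "EXAMPLE_ENDPOINT=https://mine.example\n", encoding="utf-8"
    )
    demo_settings = load_settings(base)
    print(demo_settings)
```

Keeping personal values in a separate override file means the shared .env can stay unchanged while each user adjusts only the entries they need.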
Finally, PyRIT now comes with several extras that you can install using pip install pyrit[<extra>]:
- dev includes developer dependencies that you shouldn't need unless you plan on contributing to the project.
- torch includes just PyTorch, which is needed for some targets (e.g., Hugging Face) or auxiliary attacks (e.g., GCG) but not for core functionality. This allows you to choose whether you want to install it.
- gcg includes extra dependencies that are only needed for running GCG. Since this requires dedicated compute (ideally with a GPU), you can decide whether to install it.
- all includes all of the above.
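Because extras are optional, code that depends on one may want to check for it before importing. The following is a minimal sketch of that pattern using only the standard library; is_module_available and require_module are hypothetical helpers, not part of PyRIT, and the pairing of the torch module with the torch extra is taken from the list above.

```python
from importlib.util import find_spec


def is_module_available(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return find_spec(module_name) is not None


def require_module(module_name: str, extra: str) -> None:
    """Raise a helpful ImportError that names the extra to install."""
    if not is_module_available(module_name):
        raise ImportError(
            f"'{module_name}' is not installed; try: pip install pyrit[{extra}]"
        )


# Report whether the optional PyTorch dependency is present in this environment.
print("torch available:", is_module_available("torch"))
```

Checking with find_spec avoids paying the import cost of a large package just to learn whether it is installed.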

Full list of changes

MAINT Update release version to 0.4.1.dev0 by @rdheekonda in #342
[FEAT] QRCodeConverter by @jsong468 in #339
[MAINT] Delete output_filename arg in image/text and text/image converters by @jsong468 in #344
MAINT Update Release Instructions by @rdheekonda in #345
FEAT: Add Likert scoring definition and prompt templates for persuasion and deception by @saphirqi7 in #307
[FEAT] Add "task" to the scoring memory entry by @jsong468 in #349
FEAT: Add fetch function for datasets from HarmBench #270 by @KutalVolkan in #341
FEAT Add SQL Entra Auth for Azure SQL Server by @elgertam in #330
[MAINT] Fix typos in OllamaChatTarget by @riedgar-ms in #357
[FEAT] Azure Speech Audio to Text Converter by @jsong468 in #352
FEAT: Add Rate Limit (RPM) Threshold Parameter to Prompt Targets by @nina-msft in #331
FIX: correct type of the top_p argument in various PromptTarget classes by @s-zanella in #366
FEAT Add ability to fetch PKU-SafeRLHF Data by @enrajka in #374
FEAT: Refusal Scorer by @rlundeen2 in #371
FEAT Add ability to fetch wmdp-bio, wmdp-chem, and wmdp-cyber datasets by @mshirsekar1 in #380
TEST skip failing auth test after the new azure.identity version was released by @romanlutz in #387
FEAT Added AdvBench dataset by @enrajka in #383
FEAT: Fuzzer orchestrator by @gseetha04 in #360
FIX Crescendo Bug and Improve Scorer Metaprompt Handling by @rdheekonda in #389
FEAT: Add Centralized DB Support Using Azure by @rdheekonda in #379
FIX: Updating memory and fixing bugs by @rlundeen2 in #394
FEAT: Handling duplicate memory for PromptRequestPiece/Score entries by @jsong468 in #369
[FEAT] Decoding Trust Stereotypes Dataset by @jsong468 in #385
FEAT Centralized DB Support for Azure Speech Converters by @rdheekonda in #402
FEAT add additional template converters for fuzzer orchestrator (crossover, similar, rephrase) by @roeybc in #378
DOC: Update Custom Targets Demo Docs by @nina-msft in #404
FEAT New URL Converter by @jbolor21 in #399
[FEAT] HumanInTheLoop Converter by @jsong468 in #401
DOC: Updating RTO example to use gpt4o for scoring by @rlundeen2 in #408
MAINT: Crescendo and Score Refactor by @rlundeen2 in #405
FEAT: Colloquial Wordswap Attack by @eugeniavkim in #406
FEAT emoji jailbreak by @romanlutz in #314
MAINT: Add Refusal docs and Filter logic by @rlundeen2 in #431
DOC: Moving rate limiting to target by @rlundeen2 in #433
FEAT: optimized huggingface model support by @KutalVolkan in #354
DOC Enhance Azure SQL Database Setup and Permissions Documentation by @rdheekonda in #434
FIX Azure SQL DB Permissions by @rdheekonda in #440
FIX: Handle JSON markdown format exceptions by @meisman-ms in #435
FEAT: Add ability to send prepend to the conversation in PromptSendingOrchestrator by @rlundeen2 in #441
FEAT: Homoglyph Attack by @KutalVolkan in #407
FEAT: Charswap Attack by @KutalVolkan in #403
Add Python option for generate docs scripts by @sf-msft in #375
FEAT: Violent Durian Attack Strategy by @KutalVolkan in #398
FEAT GCG algorithm and AML pipeline by @blakebullwinkel in #381
MAINT: Adding original values as score metadata for Azure Safety and Likert Scorers by @rlundeen2 in #445
[DOC] Note on notebooks by @riedgar-ms in #460
FIX: Fixing pre-commit check_links by @rlundeen2 in #462
FEAT: Adding Flip Attack by @rlundeen2 in #456
[FIX] Allow AAD Auth for AzureContentFilterScorer by @riedgar-ms in #455
FEAT: Adding New Generic HTTP Target by @jbolor21 in #446
MAINT: Rounds in CrescendoOrchestrator are now "Turns" by @jsong468 in #470
DOC Add doc changes for database setup by @eugeniavkim in #476
FEAT: OpenAI Target Refactor by @rlundeen2 in #466
DOC: Edit Image Text Converter Docs by @jbolor21 in #477
FEAT: Malicious Question Generator by @KutalVolkan in #397
FIX: Changed AzureSpeechTextToAudioConverter input_type to text and added converter input_supported tests by @jsong468 in #472
FEAT added ascii smuggler converter by @gio-msft in #479
DOC Fix Invalid MD File Referenced in Deploy HF Model to Azure ML Module by @rdheekonda in #485
FIX: Re-Ran Jupytext on Crescendo Notebook by @jsong468 in #484
FIX Warnings in pipelines (Issue #442) by @Tiger-Du in #481
FEAT Add LLM-LAT/harmful-dataset #420 by @SnehaDharne in #437
FIX: Small Notebook Fixes and env_example updates by @jsong468 in #487
FEAT add tdc23 red teaming dataset by @Lakshmiaddepalli in #438
MAINT Adding TrueFalseQuestion to initialize scorer more easily by @rlundeen2 in #488
MAINT: Stripping json in llm scorers by @rlundeen2 in #489
DOC: Adds citation section to README.md by @dlmgary in #491
FIX Updating env variable for DALL E by @eugeniavkim in #492
FIX: Remove Duplicate Import Statement in Documentation Examples by @douyipu in #495
FIX changed OpenAIChatTarget default values by @blakebullwinkel in #496
[DRAFT] FEAT: MathPromptConverter to Transform Prompts into Mathematical Problems by @KutalVolkan in #490
FIX Set Unique Conversation IDs (RedTeamingOrchestrator) by @nina-msft in #468
MAINT: Consolidate UnicodeConfusableConverter and HomoglyphGeneratorConverter by @jsong468 in #497
Fix PromptMemoryEntry columns data types to support non-English values by @rdheekonda in #499
FIX Added "Invalid prompt" OAI error to bad request exception handler by @blakebullwinkel in #500
MAINT: Consistency Improvements by @rlundeen2 in #498
[DRAFT] DOC: Add Skeleton Key Attack Demo by @KutalVolkan in #502
FIX Include max_completion_tokens argument for OpenAIChatTarget by @nina-msft in #501
FEAT: Add audio frequency converter by @michellemorales in #478
FIX: Separating OpenAIChatTarget Arguments by @rlundeen2 in #505
MAINT: Refactor azure ml target by @jsong468 in #463
MAINT: Adding MultiTurn Abstract Orchestrator Interface by @rlundeen2 in #504
FEAT Add TrustAIRLab/forbidden_question_set Dataset #453 by @ritikakumar0204 in #503
FEAT: database connector to store and retrieve prompts, prompt templates, and prompt groups by @romanlutz in #396
FIX fix references to renamed powershell files by @mhaoda in #510
FEAT Add export for conversations and scores by @eugeniavkim in #517
FIX: Removed unnecessary add_response_entries_to_memory mocking and changed normalized target 'endpoint' param by @jsong468 in #521
MAINT: Removing SeedPromptTemplate by @rlundeen2 in #520
MAINT: Remove many shot Template by @rlundeen2 in #522
FEAT: Add Zero-Width-Converter by @KutalVolkan in #519
FEAT: Add Diacritics Converter by @KutalVolkan in #518
MAINT: Standardizing Multi-Turn Orchestrators by @rlundeen2 in #509
MAINT: Removing attack strategy by @rlundeen2 in #525
FEAT add seed prompt dataset loading function for legacy datasets by @romanlutz in #524
DOC Add jupyterbook project site page by @sf-msft in #430
FIX outdated link by @romanlutz in #533
FEAT: Functionality to update PromptMemoryEntries by @jsong468 in #531
FEAT HITL Scorers by @jbolor21 in #493
MAINT: Add Centralized Memory Management by @rdheekonda in #527
MAINT Update DuckDB Memory Demo Notebook Documentation by @rdheekonda in #536
FIX use cluster for compute by @romanlutz in #538
FIX Remove aria2c dependency from HuggingFace Target by @nina-msft in #530
[FIX] Fix broken azure_auth test by @jsong468 in #544
FIX import tkinter only when using it to avoid import errors on ubuntu/macos by @romanlutz in #542
DOC publish to GH pages when pushing changes to main by @romanlutz in #545
FIX Fuzzer Converter Templates by @rdheekonda in #546
FEAT: Add Insecure Code Scorer by @KutalVolkan in #523
MAINT: Updating refusal scorer to work without tasks by @rlundeen2 in #547
DOC bring back numbering for user guide, raise build issues as errors, and fix warnings by @romanlutz in #549
FIX remove unnecessary threshold arg by @romanlutz in #550
MAINT: Allowing prepending conversations in PSO from memory by @rlundeen2 in #555
FEAT Enhance .env loading with optional .env.local overrides by @rdheekonda in #559
MAINT update dependencies to separate torch into an extra, prune unnecessary ones, and related small fixes by @romanlutz in #556
FIX remove timezone info, pass timestamp around when retrieving data from DB by @romanlutz in #560
Fix TAP Orchestrator Invalid Argument by @rdheekonda in #561
DOC: Relocate use_huggingface_chat_target notebook and script to targets directory by @KutalVolkan in #558
FIX: Fixing bug in doc and adding repr to models by @rlundeen2 in #564
MAINT: TAP Multi-turn refactor by @rlundeen2 in #562
FEAT: Add LibrAI 'Do Not Answer' Dataset by @KutalVolkan in #565
DOC: Add batch scoring example for SelfAskTrueFalseScorer by @KutalVolkan in #563
FIX: Fixing and improving crescendo adversarial_chat prompt by @rlundeen2 in #570
FIX repair component governance by @romanlutz in #557
FEAT: Pass arguments to http client by @AlexRRR in #554
[FEAT] Global Memory Labels by @jsong468 in #571
FIX release related fixes by @romanlutz in #575

New Contributors

@saphirqi7 made their first contribution in #307
@riedgar-ms made their first contribution in #357
@s-zanella made their first contribution in #366
@enrajka made their first contribution in #374
@mshirsekar1 made their first contribution in #380
@gseetha04 made their first contribution in #360
@roeybc made their first contribution in #378
@eugeniavkim made their first contribution in #406
@meisman-ms made their first contribution in #435
@sf-msft made their first contribution in #375
@gio-msft made their first contribution in #479
@Tiger-Du made their first contribution in #481
@SnehaDharne made their first contribution in #437
@Lakshmiaddepalli made their first contribution in #438
@douyipu made their first contribution in #495
@michellemorales made their first contribution in #478
@ritikakumar0204 made their first contribution in #503
@mhaoda made their first contribution in #510
@AlexRRR made their first contribution in #554

Full Changelog: v0.4.0...v0.5.0
