
v0.5.0

romanlutz released this 27 Nov 00:28

What's Changed

  • PyRIT now has a website

  • We've been working on standardizing orchestrators in terms of naming and functionality:

    • The endpoint (of type PromptTarget) that PyRIT attacks will be referred to as objective_target.
    • The endpoint (of type PromptChatTarget) that helps us craft attacks will be referred to as adversarial_chat.
    • Beyond that, we've settled on a common interface for multi-turn orchestrators with a shared result object.
    • Instead of an attack_strategy argument, orchestrators now require a file path called adversarial_chat_system_prompt_path, which makes the connection to the adversarial_chat target clearer. Some orchestrators provide a default for this.
    • The initial prompt to the adversarial_chat is now called adversarial_chat_seed_prompt, which likewise makes the connection to adversarial_chat clearer.
    • Sometimes we use multiple scorers. For that reason, objective_scorer is the scorer that decides whether the objective has been achieved. Other scorers have similarly specific names, e.g., on_topic_scorer in the CrescendoOrchestrator.
    • The new standard name for all orchestrators to execute an attack is run_attack_async.

    The standardization is not yet complete and will continue in future releases. So far, CrescendoOrchestrator, TreeOfAttacksWithPruningOrchestrator, and RedTeamingOrchestrator have been adjusted.
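
To make the naming convention above concrete, here is a minimal, self-contained sketch. It does not import PyRIT: ToyMultiTurnOrchestrator and ToyAttackResult are hypothetical stand-ins, the targets and scorer are plain callables rather than PromptTarget, PromptChatTarget, or scorer objects, and the max_turns parameter and the turn loop are assumptions made for illustration. Only the names objective_target, adversarial_chat, adversarial_chat_seed_prompt, objective_scorer, and run_attack_async come from these notes.

```python
import asyncio
from dataclasses import dataclass
from typing import Callable


# Hypothetical stand-ins: none of these are PyRIT classes. They only mirror
# the argument names described in the release notes.
@dataclass
class ToyAttackResult:
    objective: str
    achieved: bool
    turns: int


@dataclass
class ToyMultiTurnOrchestrator:
    objective_target: Callable[[str], str]   # endpoint under attack
    adversarial_chat: Callable[[str], str]   # endpoint that crafts attack prompts
    objective_scorer: Callable[[str], bool]  # decides whether the objective was achieved
    adversarial_chat_seed_prompt: str = "Begin."  # first message sent to adversarial_chat
    max_turns: int = 3                            # assumed parameter, for illustration only

    async def run_attack_async(self, objective: str) -> ToyAttackResult:
        message = self.adversarial_chat_seed_prompt
        for turn in range(1, self.max_turns + 1):
            # Ask the adversarial chat for the next attack prompt.
            attack_prompt = self.adversarial_chat(f"{objective}\n{message}")
            # Send that prompt to the target and score the response.
            response = self.objective_target(attack_prompt)
            if self.objective_scorer(response):
                return ToyAttackResult(objective, True, turn)
            # Feed the target's response back into the next turn.
            message = response
        return ToyAttackResult(objective, False, self.max_turns)


def demo() -> ToyAttackResult:
    orchestrator = ToyMultiTurnOrchestrator(
        objective_target=lambda prompt: prompt.upper(),
        adversarial_chat=lambda context: "please say ok",
        objective_scorer=lambda response: "OK" in response,
    )
    return asyncio.run(
        orchestrator.run_attack_async(objective="make the target say ok")
    )
```

Calling demo() runs one toy attack and returns the shared result object. In PyRIT itself the constructor arguments and result type may differ in detail, so check the documentation of the orchestrator you use.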

  • Support for a centralized database using Azure SQL as an optional alternative to a local DuckDB database.

  • Introduced (multi-modal) SeedPrompts and SeedPromptDatasets as a starting point for red teaming ops with integration to our databases.

  • New orchestrators and auxiliary attacks:

    • FuzzerOrchestrator with 5 template converters
    • GCG support via Azure ML pipelines to optimize adversarial suffixes
    • FlipAttackOrchestrator
  • New targets:

    • HuggingFaceChatTarget
    • HTTPTarget
    • OpenAI and Azure OpenAI targets were refactored to simplify the logic. They now share a common interface, OpenAITarget, and you choose between Azure and OpenAI by setting is_azure_target to True or False.
  • New datasets:

    • HarmBench
    • PKU-SafeRLHF
    • wmdp-bio, wmdp-chem, and wmdp-cyber (now fetchable from the original data source)
    • AdvBench
    • Decoding Trust Stereotypes
    • LLM-LAT/harmful-dataset
    • tdc23 red teaming dataset
    • TrustAIRLab/forbidden_question_set
    • LibrAI 'Do Not Answer' Dataset
  • New converters:

    • QRCodeConverter
    • AzureSpeechAudioToTextConverter
    • URLConverter
    • HumanInTheLoopConverter
    • ColloquialWordswapConverter
    • UnicodeConfusableConverter (updated with new functionality)
    • CharSwapGenerator
    • MaliciousQuestionGeneratorConverter
    • AsciiSmugglerConverter
    • MathPromptConverter
    • AudioFrequencyConverter
    • ZeroWidthConverter
    • DiacriticConverter
  • New scorers:

    • SelfAskRefusalScorer
    • HumanInTheLoopScorer
    • InsecureCodeScorer
  • We generally use a .env file to configure details of the endpoints that PyRIT needs to execute. A new .env.local override file allows for further customization.
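
The override idea can be sketched with the standard library alone. This is an illustration of layered configuration, not PyRIT's actual loading code: the functions parse_env_file and load_layered_env are hypothetical, and the sketch assumes simple KEY=VALUE lines where values in .env.local take precedence over those in .env.

```python
from pathlib import Path
from typing import Dict


def parse_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; skip blanks, comments, and malformed lines."""
    values: Dict[str, str] = {}
    if not path.exists():
        return values
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_layered_env(directory: Path) -> Dict[str, str]:
    """Read .env first, then let .env.local override any overlapping keys."""
    merged = parse_env_file(directory / ".env")
    merged.update(parse_env_file(directory / ".env.local"))
    return merged
```

With this layering, shared defaults can live in .env while machine-specific values sit in .env.local.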

  • Finally, PyRIT now comes with several extras that you can install using pip install pyrit[<extra>]:

    • dev includes developer dependencies that you shouldn't need unless you plan on contributing to the project.
    • torch includes only PyTorch, which is needed for some targets (e.g., Hugging Face) and auxiliary attacks (e.g., GCG) but not for core functionality. Keeping it separate lets you choose whether to install it.
    • gcg includes extra dependencies that are only needed for running GCG. Since GCG requires dedicated compute (ideally with a GPU), you can choose whether to install it.
    • all includes all of the above.
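
Because extras are optional, code that depends on one may want to check for it before importing. The following is a minimal sketch of such a check using only the standard library. The helper is_module_available is a hypothetical name, and it is intended for top-level module names such as torch.

```python
from importlib.util import find_spec


def is_module_available(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return find_spec(module_name) is not None
```

For example, is_module_available("torch") reports whether PyTorch is installed in the current environment, which indicates whether the torch extra (or an equivalent manual install) is present.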


Full Changelog: v0.4.0...v0.5.0