Cipher-Prompt: Towards a Safe Diffusion Model via Learning Cryptographic Prompts

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), ISSN: 1611-3349, Vol: 14374 LNAI, Page: 322-332
2024

Conference Paper Description

Security and privacy concerns associated with large generative models have recently attracted significant attention. In particular, there is a pressing need to prevent the generation of inappropriate images, including explicit, violent, or politically sensitive content. In this work, we propose Cipher-prompt, a lightweight approach that learns cryptographic prompts to prevent diffusion models from generating undesirable images that are semantically related to protected prompts. Cipher-prompt applies an untargeted attack objective to a black-box model, optimizing perturbations that maximize the semantic distance between the protected class and the generated images. Cipher-prompt therefore requires neither retraining nor fine-tuning of the generative model, nor a training dataset of images. To evaluate Cipher-prompt, we conduct qualitative and quantitative experiments that measure the protection failure rate and the collateral impact rate. The results show that Cipher-prompt is effective at balancing risk mitigation with the utility of diffusion-based image generation models.
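
The untargeted objective described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the optimizer, the embedding model, or the perturbation budget, so everything below is an assumption chosen for illustration. In particular, `toy_black_box` is a hypothetical stand-in for "generate an image from a prompt embedding, then embed the image", cosine distance is assumed as the semantic distance, and plain random search is assumed as the query-only (black-box) optimizer.

```python
import math
import random


def cosine_distance(a, b):
    """Return 1 - cosine similarity; larger means semantically further apart."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as unrelated
    return 1.0 - dot / (norm_a * norm_b)


def toy_black_box(prompt_embedding):
    """Hypothetical stand-in for 'generate an image, then embed it'.

    A real system would query a diffusion model; here a fixed nonlinearity
    is used so the sketch runs without any model.
    """
    return [math.tanh(x) for x in prompt_embedding]


def learn_perturbation(protected, steps=200, sigma=0.1, budget=1.0, seed=0):
    """Random search for a perturbation that maximizes the distance between
    the protected embedding and the black-box output (untargeted objective).

    Only forward queries to the black box are used; no gradients and no
    model weights are touched. All hyperparameters are illustrative.
    """
    rng = random.Random(seed)
    delta = [0.0] * len(protected)

    def score(d):
        output = toy_black_box([p + x for p, x in zip(protected, d)])
        return cosine_distance(protected, output)

    best = score(delta)
    for _ in range(steps):
        # Propose a nearby perturbation, clipped to the allowed budget.
        candidate = [
            max(-budget, min(budget, x + rng.gauss(0.0, sigma))) for x in delta
        ]
        candidate_score = score(candidate)
        if candidate_score > best:  # keep only improvements
            delta, best = candidate, candidate_score
    return delta, best


protected_embedding = [1.0, 0.5, -0.3]
baseline = cosine_distance(protected_embedding, toy_black_box(protected_embedding))
delta, best = learn_perturbation(protected_embedding)
print(f"distance without perturbation: {baseline:.4f}")
print(f"distance with learned perturbation: {best:.4f}")
```

Because the search only accepts candidates that increase the distance, the final score is never lower than the unperturbed baseline. The sketch also mirrors the abstract's claim that the generative model itself needs no retraining or fine-tuning.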
