Immunizing Images from Text to Image Editing via Adversarial Cross-Attention

Explainable & Ethical AI
Published: arXiv:2509.10359v1
Authors

Matteo Trippodo, Federico Becattini, Lorenzo Seidenari

Abstract

Recent advances in text-based image editing have enabled fine-grained manipulation of visual content guided by natural language. However, such methods are susceptible to adversarial attacks. In this work, we propose a novel attack that targets the visual component of editing methods. We introduce Attention Attack, which disrupts the cross-attention between a textual prompt and the visual representation of the image by using an automatically generated caption of the source image as a proxy for the edit prompt. This breaks the alignment between the contents of the image and their textual description, without requiring knowledge of the editing method or the editing prompt. Reflecting on the reliability of existing metrics for immunization success, we propose two novel evaluation strategies: Caption Similarity, which quantifies semantic consistency between original and adversarial edits, and semantic Intersection over Union (IoU), which measures spatial layout disruption via segmentation masks. Experiments conducted on the TEDBench++ benchmark demonstrate that our attack significantly degrades editing performance while remaining imperceptible.

Paper Summary

Problem
Text-based image editing methods enable fine-grained manipulation of visual content guided by natural language, but the same capability lets malicious users perform unwanted edits on other people's images. Such edits can be difficult to distinguish from authentic content, raising concerns about intellectual property and the spread of misinformation. The problem this paper addresses is how to immunize an image against these edits before it is shared.
Key Innovation
The key innovation of this work is the Attention Attack, a novel adversarial perturbation that targets the visual component of editing methods. The attack disrupts the cross-attention between a textual prompt and the visual representation of the image, using an automatically generated caption of the source image as a proxy for the (unknown) edit prompt. This breaks the alignment between the image's contents and their textual description, so editing methods can no longer apply the requested edit, without requiring knowledge of the editing method or the edit prompt. A sketch of this optimization is given below.
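To make the mechanism concrete, here is a minimal PGD-style sketch in PyTorch. Everything in it is an illustrative assumption rather than the paper's exact formulation: `model.cross_attention_maps` is a hypothetical helper (in a latent diffusion editor it would be implemented by hooking the UNet's cross-attention layers), and the objective, step count, and perturbation budget are placeholders.

```python
import torch
import torch.nn.functional as F

def attention_attack(image, caption_emb, model, steps=100, eps=8 / 255, alpha=1 / 255):
    """Sketch of an Attention-Attack-style immunization.

    image       -- source image tensor, values in [0, 1]
    caption_emb -- token embeddings of an auto-generated caption of `image`,
                   used as a proxy for the (unknown) edit prompt
    model       -- wrapper exposing `cross_attention_maps` (hypothetical helper)
    """
    # Cross-attention maps of the clean image: the reference to move away from.
    with torch.no_grad():
        clean_maps = model.cross_attention_maps(image, caption_emb)

    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        adv_maps = model.cross_attention_maps(image + delta, caption_emb)
        # Negated MSE: minimizing this loss pushes the perturbed image's
        # attention maps away from the clean ones, breaking the alignment
        # between the visual tokens and the caption's text tokens.
        loss = -F.mse_loss(adv_maps, clean_maps)
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()                   # signed gradient descent step
            delta.clamp_(-eps, eps)                              # L-inf budget keeps the change imperceptible
            delta.copy_((image + delta).clamp(0, 1) - image)     # keep the adversarial image in a valid range
        delta.grad.zero_()
    return (image + delta).detach()
```

Because the caption describes the image's actual contents, disrupting attention for the caption should transfer to unseen edit prompts that refer to those same contents.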
Practical Impact
By crafting an effective adversarial perturbation, the authors show that images can be immunized against unwanted edits. This has direct implications for protecting visual content, and it highlights the need for editing methods that are robust to such perturbations. In addition, the two proposed evaluation strategies, Caption Similarity and semantic Intersection over Union, offer a more reliable way to quantify immunization success than existing metrics; a sketch of both appears below.
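The two metrics can be sketched as follows, assuming an external captioner and segmenter have already produced sentence embeddings and binary object masks for the edited images; these inputs and the function names are assumptions for illustration, not necessarily the exact pipeline used by the authors.

```python
import torch
import torch.nn.functional as F

def caption_similarity(clean_edit_emb: torch.Tensor, adv_edit_emb: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between sentence embeddings of captions generated
    for the edit of the clean image and the edit of the immunized image.
    Lower similarity = the immunization changed the edit's semantics more."""
    return F.cosine_similarity(clean_edit_emb, adv_edit_emb, dim=-1)

def semantic_iou(clean_edit_mask: torch.Tensor, adv_edit_mask: torch.Tensor) -> torch.Tensor:
    """Intersection over Union between binary segmentation masks of the same
    semantic class in the clean and immunized edits.
    Lower IoU = the immunization disrupted the spatial layout more."""
    intersection = (clean_edit_mask & adv_edit_mask).sum()
    union = (clean_edit_mask | adv_edit_mask).sum()
    return intersection.float() / union.float().clamp(min=1)
```

Both scores compare the edit of the original image with the edit of its immunized counterpart, so a successful attack drives both toward zero.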
Analogy / Intuitive Explanation
Imagine an editor who can only modify a photograph by first matching the objects in it to their written description. The Attention Attack subtly retouches the photograph so that this matching fails: to a human eye the image looks unchanged, but the editor can no longer line up the words of any instruction with the regions they refer to. Asked to "replace the cat with a dog", it no longer knows where the cat is, so the edit comes out garbled or lands in the wrong place, leaving the image effectively immune.
Paper Information

Categories: cs.CV
Published Date: September 2025
arXiv ID: 2509.10359v1