Attention Distillation: A Unified Approach to Visual Characteristics Transfer

Shenzhen University
CVPR 2025


Given a reference image, our approach faithfully reproduces its visual characteristics in synthesis, providing a unified framework for a wide range of example-based image synthesis applications, such as artistic style transfer, appearance transfer, style-specific text-to-image generation, and various texture synthesis tasks.

Abstract

Recent advances in generative diffusion models have shown a notable inherent understanding of image style and semantics. In this paper, we leverage the self-attention features of pretrained diffusion networks to transfer the visual characteristics of a reference image to generated images. Unlike previous work that uses these features as plug-and-play attributes, we propose a novel attention distillation loss computed between the ideal and the current stylization results, based on which we optimize the synthesized image via backpropagation in latent space. We further propose an improved Classifier Guidance that integrates the attention distillation loss into the denoising sampling process, accelerating synthesis and enabling a broad range of image generation applications. Extensive experiments demonstrate the strong performance of our approach in transferring an example's style, appearance, and texture to newly synthesized images.
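The central quantity in the abstract is a loss computed on self-attention features of a pretrained diffusion U-Net. As a hedged illustration only (not the authors' released code), the PyTorch sketch below shows one way such a loss could be formed: the synthesized image's queries attend to the reference's keys and values to produce the "ideal" stylization, and the current self-attention output is pulled toward it. The function name, tensor shapes, and the MSE distance are our own hypothetical choices.

import torch
import torch.nn.functional as F

def attention_distillation_loss(q_cur, k_cur, v_cur, k_ref, v_ref):
    # Hypothetical sketch: q/k/v are self-attention features captured
    # at one U-Net layer, shape (batch, heads, tokens, head_dim).
    # "Ideal" stylization: the synthesized image's queries attend to the
    # reference image's keys/values, injecting its visual characteristics.
    ideal = F.scaled_dot_product_attention(q_cur, k_ref, v_ref)
    # Current stylization: ordinary self-attention on the synthesized image.
    current = F.scaled_dot_product_attention(q_cur, k_cur, v_cur)
    # The target is detached so only the current synthesis receives gradients.
    return F.mse_loss(current, ideal.detach())

# Toy usage with random tensors standing in for one captured layer.
q = torch.randn(1, 8, 256, 64, requires_grad=True)
k, v = torch.randn_like(q), torch.randn_like(q)
k_ref, v_ref = torch.randn_like(q), torch.randn_like(q)
loss = attention_distillation_loss(q, k, v, k_ref, v_ref)
loss.backward()  # gradients flow back through the captured queries

In the paper, such gradients are backpropagated all the way to the latent being optimized; here plain tensors stand in for the captured features.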

Method
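The method details live in the paper itself; as a rough sketch of how the improved Classifier Guidance mentioned in the abstract could fold the loss into denoising sampling, the step below perturbs the noise prediction with the gradient of the distillation loss. It reuses attention_distillation_loss from the sketch above; denoise_fn, ref_feats, and guidance_scale are hypothetical stand-ins, and a diffusers-style scheduler API is assumed.

import torch

def guided_denoising_step(denoise_fn, scheduler, latent, t, ref_feats,
                          guidance_scale=2.0):
    # Hypothetical sketch: denoise_fn(latent, t) returns the noise
    # prediction plus the (q, k, v) self-attention features captured at
    # selected U-Net layers; ref_feats holds the reference's (k, v) pairs.
    with torch.enable_grad():
        lat = latent.detach().requires_grad_(True)
        noise_pred, cur_feats = denoise_fn(lat, t)
        loss = sum(
            attention_distillation_loss(q, k, v, k_ref, v_ref)
            for (q, k, v), (k_ref, v_ref) in zip(cur_feats, ref_feats)
        )
        # Classifier-guidance-style gradient w.r.t. the noisy latent.
        grad = torch.autograd.grad(loss, lat)[0]
    # Steer the predicted noise along the descent direction of the loss.
    guided = noise_pred.detach() + guidance_scale * grad
    # diffusers-style scheduler step (assumed API).
    return scheduler.step(guided, t, latent).prev_sample

Applying this update at every sampling step avoids a separate per-image optimization loop, which is what the abstract credits for the speedup.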

Gallery of Attention Distillation Results

Qualitative Comparisons

User Study

More Applications