dc.contributor.author | Li, Mengtian | en_US |
dc.contributor.author | Dong, Yi | en_US |
dc.contributor.author | Lin, Minxuan | en_US |
dc.contributor.author | Huang, Haibin | en_US |
dc.contributor.author | Wan, Pengfei | en_US |
dc.contributor.author | Ma, Chongyang | en_US |
dc.contributor.editor | Chaine, Raphaëlle | en_US |
dc.contributor.editor | Deng, Zhigang | en_US |
dc.contributor.editor | Kim, Min H. | en_US |
dc.date.accessioned | 2023-10-09T07:34:52Z | |
dc.date.available | 2023-10-09T07:34:52Z | |
dc.date.issued | 2023 | |
dc.identifier.issn | 1467-8659 | |
dc.identifier.uri | https://doi.org/10.1111/cgf.14952 | |
dc.identifier.uri | https://diglib.eg.org:443/handle/10.1111/cgf14952 | |
dc.description.abstract | In this work, we introduce a new approach for face stylization. Although existing methods achieve impressive results on this task, there is still room for improvement in generating high-quality artistic faces with diverse styles and accurate facial reconstruction. Our proposed framework, MMFS, supports multi-modal face stylization by leveraging the strengths of StyleGAN and integrating it into an encoder-decoder architecture. Specifically, we use the mid-resolution and high-resolution layers of StyleGAN as the decoder to generate high-quality faces, while aligning its low-resolution layer with the encoder to extract and preserve input facial details. We also introduce a two-stage training strategy: in the first stage, we train the encoder to align its feature maps with StyleGAN and enable a faithful reconstruction of input faces. In the second stage, the entire network is fine-tuned with artistic data for stylized face generation. To enable the fine-tuned model to be applied in zero-shot and one-shot stylization tasks, we train an additional mapping network from the large-scale Contrastive Language-Image Pre-training (CLIP) space to a latent w+ space of the fine-tuned StyleGAN. Qualitative and quantitative experiments show that our framework achieves superior performance in both one-shot and zero-shot face stylization tasks, outperforming state-of-the-art methods by a large margin. | en_US |
dc.publisher | The Eurographics Association and John Wiley & Sons Ltd. | en_US |
dc.subject | CCS Concepts: Computing methodologies -> Image processing | |
dc.subject | Computing methodologies | |
dc.subject | Image processing | |
dc.title | Multi-Modal Face Stylization with a Generative Prior | en_US |
dc.description.seriesinformation | Computer Graphics Forum | |
dc.description.sectionheaders | Virtual Humans | |
dc.description.volume | 42 | |
dc.description.number | 7 | |
dc.identifier.doi | 10.1111/cgf.14952 | |
dc.identifier.pages | 10 pages | |