Abstract
Deep generative models allow the synthesis of realistic human faces from freehand sketches or semantic maps. However, flexible as they are, sketches and semantic maps offer too many degrees of freedom, making them difficult for novice users to control. In this study, we present DeepFaceReshaping, a novel landmark-based deep generative framework for interactive face reshaping. To edit the shape of a face realistically by manipulating a small number of face landmarks, we employ neural shape deformation to reshape individual face components. Furthermore, we propose a novel Transformer-based partial refinement network to synthesize the reshaped face components conditioned on the edited landmarks, and fuse the components to generate the entire face using a local-to-global approach. In this manner, we limit possible reshaping effects to a feasible component-based face space. Thus, our interface is intuitive even for novice users, as confirmed by a user study. Our experiments demonstrate that our method outperforms traditional warping-based approaches and recent deep generative techniques.
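The local-to-global pipeline outlined above can be summarized in pseudocode. This is a minimal illustrative sketch, not the authors' implementation: all function and component names are hypothetical, and the "deformation", "refinement", and "fusion" stages are stand-ins for the neural networks described in the paper.

```python
# Hypothetical sketch of the DeepFaceReshaping pipeline from the abstract.
# Each stage here is a toy stand-in for a learned network.

FACE_COMPONENTS = ["left_eye", "right_eye", "nose", "mouth", "outline"]

def deform_component(landmarks, edits):
    """Stand-in for neural shape deformation: apply the user's landmark
    edits (here, simple 2D offsets) to one component's landmarks."""
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(landmarks, edits)]

def refine_component(name, edited_landmarks):
    """Stand-in for the Transformer-based partial refinement network:
    synthesize a component conditioned on its edited landmarks."""
    return {"component": name, "landmarks": edited_landmarks}

def fuse_components(refined):
    """Stand-in for local-to-global fusion: combine the refined
    components into a single face representation."""
    return {c["component"]: c["landmarks"] for c in refined}

def reshape_face(face_landmarks, user_edits):
    """Reshape a face by editing a small set of per-component landmarks."""
    refined = []
    for name in FACE_COMPONENTS:
        lm = face_landmarks[name]
        # Components the user did not touch keep zero offsets.
        edits = user_edits.get(name, [(0, 0)] * len(lm))
        refined.append(refine_component(name, deform_component(lm, edits)))
    return fuse_components(refined)
```

For example, editing only the nose landmarks leaves the other components unchanged, mirroring how the method restricts edits to a feasible component-based face space.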
Acknowledgements
This work was supported by grants from the Open Research Projects of Zhejiang Lab (No. 2021KE0AB06), the National Natural Science Foundation of China (Nos. 62061136007 and 62102403), the Beijing Municipal Natural Science Foundation for Distinguished Young Scholars (No. JQ21013), and the Open Project Program of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (No. VRLAB2022C07).
Ethics declarations
The authors declare no competing interests relevant to the content of this article.
Additional information
Shu-Yu Chen received her Ph.D. degree in computer science and technology from the University of Chinese Academy of Sciences. She is currently an assistant professor at the Institute of Computing Technology, Chinese Academy of Sciences. Her research interests include computer graphics and computer vision.
Yue-Ren Jiang received his master's degree in computer science and technology from the University of Chinese Academy of Sciences. His research interests include computer graphics and computer vision.
Hongbo Fu received his B.S. degree in information sciences from Peking University, China, in 2002 and his Ph.D. degree in computer science from the Hong Kong University of Science and Technology in 2007. He is a full professor at the School of Creative Media, City University of Hong Kong. His primary research interests fall in the fields of computer graphics and human-computer interaction. He has served as an associate editor of The Visual Computer, Computers & Graphics, and Computer Graphics Forum.
Xinyang Han is an undergraduate at the University of Chinese Academy of Sciences. His research interests include computer graphics.
Zitao Liu received his Ph.D. degree in computer science from the University of Pittsburgh in 2016. He is currently the head of Engineering, ThinkAcademy International at TAL Education Group. Before joining TAL, he was a senior research scientist at Pinterest. His research interests include AI in education, multimodal knowledge representation, and user modeling. He serves on the Executive Committee of the International AI in Education Society.
Rong Li received his Ph.D. degree from Zhejiang University, Hangzhou, China, in 2015. He is now a senior researcher at the Science and Art Research Center of Zhejiang Laboratory, Hangzhou, China. His current research interests include computer vision, computer graphics, and deep learning, specifically the theory and practice of virtual content generation.
Lin Gao received his Ph.D. degree in computer science from Tsinghua University. He is currently an associate professor at the Institute of Computing Technology, Chinese Academy of Sciences. He has been awarded the Newton Advanced Fellowship from the Royal Society and the AG Young Researcher Award. His research interests include computer graphics and geometric processing.
Supplementary Material
Supplementary material, approximately 20.6 MB.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Chen, SY., Jiang, YR., Fu, H. et al. DeepFaceReshaping: Interactive deep face reshaping via landmark manipulation. Comp. Visual Media 10, 949–963 (2024). https://doi.org/10.1007/s41095-023-0373-1