Face Forgery Detection Based on Reconstruction-Classification Learning

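Most existing face forgery detection algorithms focus on specific synthesis patterns in the input image, such as noise characteristics, local textures, and frequency-domain information, to identify forged faces. However, as forgery techniques advance, over-reliance on specific known synthesis patterns tends to leave a detector unable to recognize forged samples generated by entirely new synthesis methods. Meanwhile, perturbations introduced during image transmission, such as compression, blur, and saturation distortion, can also destroy the known synthesis patterns and thereby degrade detection accuracy. We therefore explore the face forgery detection task from a new perspective and design a "reconstruction-classification" learning framework named RECCE, which learns the common representations of real faces by reconstructing real face images and mines the essential differences between real and forged faces through the classification task. Put simply, we train a reconstruction network on real face images and use its hidden-layer features to classify real and forged faces. Because forged and real faces follow inconsistent data distributions, forged faces show larger reconstruction errors, and these errors reveal the likely forged regions. We conducted extensive experiments on commonly used face forgery detection datasets such as FF++, WildDeepfake, and DFDC, and the results confirm that our method outperforms existing methods.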

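Junyi Cao is a master's student at the Artificial Intelligence Institute of Shanghai Jiao Tong University, advised by Associate Professor Chao Ma. Research interests center on deep learning theory and methods for face security, visual recognition, and related topics. To date, Cao has published 3 papers at CCF-recommended conferences and journals, including ACM MM and CVPR.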

1 . End-to-End Reconstruction-Classification Learning for Face Forgery Detection (accepted by CVPR 2022). Junyi Cao, Chao Ma, Taiping Yao, Shen Chen, Shouhong Ding, Xiaokang Yang. April 2022.

2 . Catalogue: 1 Background, 2 Methodology, 3 Experiments, 4 Conclusion

3 . Background 01

4 . Face Forgery in our daily life Credit: https://www.youtube.com/watch?v=cQ54GDm1eL0

5 . Face Forgery in our daily life. It was reported that Tsai Ing-wen's face had been swapped by deepfake software in Oct. 2021. A Ukrainian TV station broadcast a live news program in which Zelensky called on Ukrainians to lay down their weapons. The video was eventually found to be a deepfake. Credit: https://guoxue.ifeng.com/c/8AXocEvYuFv, https://new.qq.com/omn/20220318/20220318A0D6FK00.html

6 . Background – Face Forgery Detection. A deepfake, or face forgery, is content produced by artificial intelligence that appears authentic to a human viewer. Face forgery detection has received increasing attention because of concern over the malicious abuse of digitally forged facial images. Can you identify the forged faces below? All the listed faces are fake!

7 . Background – Face Forgery Detection. Can you identify the forged faces below? These are the corresponding source faces (i.e., genuine faces).

8 . Background – Face Forgery Detection. Malicious abuse of face forgery: Impersonation, Identity Theft, Fake, Spread of Misinformation.

9 . Background – Face Forgery Detection. Therefore, it is of paramount importance to develop effective face forgery detectors to separate forged faces from real ones.

10 . Methodology 02

11 . Existing Approach – Early Attempts
Early efforts [1, 2] follow the classic image classification pipeline, which directly takes input faces and performs binary classification.
[Figure: Input Faces → CNN → Bi-classification (Real / Fake)]
[1] Huy H. Nguyen, Junichi Yamagishi, and Isao Echizen. Capsule-forensics: Using capsule networks to detect forged images and videos. In ICASSP, 2019.
[2] Andreas Rössler, Davide Cozzolino, Luisa Verdoliva, Christian Riess, Justus Thies, and Matthias Nießner. FaceForensics++: Learning to detect manipulated facial images. In ICCV, 2019.
饮水思源爱国荣校 www.sjtu.edu.cn

12 . Existing Approach – Early Attempts
Drawbacks:
• CNN backbones inherited from general image classification models fail to capture subtle forgery traces.
• Hard to generalize to new types of face forgeries.
[Figure: Input Faces → CNN → Bi-classification (Real / Fake)]

13 . Existing Approach – Recent research
Recent works resort to specific forgery patterns such as noise characteristics [1], textural information [2], and frequency statistics [3] to better detect the forgery artifacts residing in fake faces.
[Figure: Input Faces → Emphasize Specific Forgery Pattern (Texture, Frequency, Noise) → CNN → Bi-classification (Real / Fake)]
[1] Peng Zhou, Xintong Han, Vlad I. Morariu, and Larry S. Davis. Two-stream neural networks for tampered face detection. In CVPR Workshops, 2017.
[2] Hanqing Zhao, Wenbo Zhou, Dongdong Chen, Tianyi Wei, Weiming Zhang, and Nenghai Yu. Multi-attentional deepfake detection. In CVPR, 2021.
[3] Yuyang Qian, Guojun Yin, Lu Sheng, Zixuan Chen, and Jing Shao. Thinking in frequency: Face forgery detection by mining frequency-aware clues. In ECCV, 2020.

14 . Existing Approach – Recent research [CVPRW, 17]

15 . Existing Approach – Recent research [ECCV, 20]

16 . Existing Approach – Recent research [CVPR, 21]

17 . Existing Approach
Benefits
• Specific forgery patterns can reveal the subtle forgery clues residing in fake faces.
• These methods achieve high detection accuracy under within-dataset evaluation.
Challenges
• Emphasis on specific patterns specializes the learned representations to the known forgery types present in the training set, which easily incurs overfitting.
• Performance depends heavily on the quality of the patterns of interest, so these methods may fail when the patterns are corrupted.

18 . Our Idea – Focus on the distribution discrepancy
[Figure: Real Face vs. Reconstructed Face; Real vs. Fake]
1. Explore the essential characteristics of real faces so that the detector is aware of forgery patterns, even unknown ones.
2. Mine the discrepancy between real and fake faces by enhancing the network's reasoning about forgery clues.

19 . Our Method – RECCE
Methodology: To capture the essential discrepancy between real and fake faces, we propose the REConstruction-Classification lEarning (RECCE) framework, which consists of three main schemes.
Reconstruction learning
• A reconstruction network based on an encoder-decoder structure
Multi-scale Graph Reasoning (MGR)
• Reasons about forgery clues by combining encoder output and decoder features in a multi-scale way
Reconstruction Guided Attention (RGA)
• Highlights the probably forged regions on embedding features to facilitate final classification

20 . Our Method – RECCE. Schematic diagram of the proposed framework.

21 . Our Method – Reconstruction Learning
Since face forgery methods are diverse, we argue that exploring the common characteristics of genuine faces is more suitable than overfitting to the specific forgery patterns present in the training set. As such, we propose to perform reconstruction learning that restores real images only.
Reconstruction Learning
1. Add white noise: $\tilde{\mathbf{x}} = \mathbf{x} + \boldsymbol{\eta}$
2. Go through the reconstruction network: $\hat{\mathbf{x}} = \mathcal{F}(\tilde{\mathbf{x}})$
3. Compute the reconstruction loss w.r.t. real faces: $\mathcal{L}_r = \frac{1}{|R|} \sum_{i \in R} \lVert \hat{\mathbf{x}}_i - \mathbf{x}_i \rVert_1$
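
The three steps above can be sketched in plain Python. This is only an illustrative toy, not the authors' implementation: images are flattened lists of floats, `reconstruct` is a hypothetical stand-in for the reconstruction network $\mathcal{F}$, the label convention (0 = real, 1 = fake) is an assumption, and the noise level `sigma` is an arbitrary placeholder.

```python
import random


def add_white_noise(x, sigma=0.1, rng=random):
    """Step 1: x_tilde = x + eta, with eta drawn i.i.d. from N(0, sigma^2)."""
    return [v + rng.gauss(0.0, sigma) for v in x]


def l1_norm(a, b):
    """||a - b||_1 for two equally sized flattened images."""
    return sum(abs(p - q) for p, q in zip(a, b))


def reconstruction_loss(images, labels, reconstruct, sigma=0.1, rng=random):
    """L_r: mean L1 reconstruction error, averaged over real samples only.

    Assumed convention: labels[i] == 0 marks a real face, 1 a fake one.
    Fake samples are skipped, so they never contribute to L_r.
    """
    real = [x for x, y in zip(images, labels) if y == 0]
    if not real:
        return 0.0  # no real face in this batch, nothing to reconstruct
    total = 0.0
    for x in real:
        x_tilde = add_white_noise(x, sigma, rng)  # step 1: perturb the input
        x_hat = reconstruct(x_tilde)              # step 2: run the network
        total += l1_norm(x_hat, x)                # step 3: compare to clean x
    return total / len(real)


# Toy batch: two real images and one fake image (hypothetical values).
images = [[1.0, 2.0], [3.0, 4.0], [10.0, 10.0]]
labels = [0, 0, 1]
loss = reconstruction_loss(images, labels, lambda x: [0.0] * len(x))
```

Note that the loss compares the reconstruction with the clean input $\mathbf{x}$ rather than the noisy $\tilde{\mathbf{x}}$, so the network is asked to denoise as well as reconstruct.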

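22 . Our Method – Reconstruction Learning
Besides the reconstruction difference, we use a metric-learning loss to pull real images close together while pushing real and fake images far apart in the embedding space. In this way, we ensure that the discrepancy between real and fake faces is strengthened in the feature space.
Metric-learning Loss
$$\mathcal{L}_m = \frac{1}{N_{RR}} \sum_{i \in R,\, j \in R} d(\bar{\mathbf{F}}_i, \bar{\mathbf{F}}_j) - \frac{1}{N_{RF}} \sum_{i \in R,\, j \in F} d(\bar{\mathbf{F}}_i, \bar{\mathbf{F}}_j)$$
where $d(\mathbf{a}, \mathbf{b}) = \frac{1}{2}\left(1 - \frac{\mathbf{a}}{\lVert \mathbf{a} \rVert_2} \cdot \frac{\mathbf{b}}{\lVert \mathbf{b} \rVert_2}\right)$ is a pair-wise distance function based on the cosine distance, and $N_{RR}$ and $N_{RF}$ normalize by the number of (real, real) and (real, fake) pairs.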
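
As a rough illustration of the metric-learning loss, the sketch below evaluates $d$ and $\mathcal{L}_m$ on toy embedding vectors. It is not the authors' code. It assumes the label convention 0 = real and 1 = fake, and it assumes that the sums run over all ordered pairs (including $i = j$), so that $N_{RR} = |R|^2$ and $N_{RF} = |R|\,|F|$; the slide does not spell these details out.

```python
import math


def cosine_distance(a, b):
    """d(a, b) = (1 - cos(a, b)) / 2, which lies in [0, 1]."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return (1.0 - dot / (norm_a * norm_b)) / 2.0


def metric_loss(features, labels):
    """L_m: mean real-real distance minus mean real-fake distance.

    Assumptions: labels use 0 = real, 1 = fake, and every ordered pair
    (including i == j) is counted when normalizing.
    """
    real = [f for f, y in zip(features, labels) if y == 0]
    fake = [f for f, y in zip(features, labels) if y == 1]
    if not real or not fake:
        return 0.0  # the loss needs both classes in the batch
    rr = sum(cosine_distance(a, b) for a in real for b in real)
    rf = sum(cosine_distance(a, b) for a in real for b in fake)
    return rr / (len(real) ** 2) - rf / (len(real) * len(fake))


# Two identical real embeddings and one opposite fake embedding (toy values).
features = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
labels = [0, 0, 1]
loss = metric_loss(features, labels)
```

Minimizing this quantity shrinks the first term (real faces cluster) and grows the second (fakes move away from the real cluster), which matches the intent stated on the slide.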

23 . Our Method – Multi-scale Graph Reasoning. When the metric-learning loss is applied to the decoder, the information useful for separating real and fake images is embedded in the decoder as well. Since different face forgery techniques leave forged traces across various scales, the multi-scale structure encourages the learning of comprehensive forgery clues. As forgery traces usually appear in local areas, graphs are used to adaptively model the local relationship between encoder output and decoder features.

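24 . Our Method – Multi-scale Graph Reasoning
$\mathbf{F}_{enc} \triangleq V_{enc} = \{\mathbf{v}_{enc}^{i}\}_{i=1}^{h_1 \times w_1}$
$\mathbf{F}_{dec} \triangleq V_{dec} = \{\mathbf{v}_{dec}^{i}\}_{i=1}^{h_2 \times w_2}$
$\mathcal{N}(\mathbf{v}_{enc}^{i}) = \{\mathbf{v}_{dec}^{i,j}\}_{j=1}^{N}$ denotes the set of vertices in $V_{dec}$ that are linked to $\mathbf{v}_{enc}^{i}$.
Aggregated vertex: $\mathbf{v}_{agg}^{i} = \sum_{j=1}^{N} a_j \tilde{\mathbf{v}}_{dec}^{i,j} \otimes \left(1 - \psi(\mathbf{v}_{enc}^{i})\right)$
Concretely, graph reasoning aggregates the information from $\mathcal{N}(\mathbf{v}_{enc}^{i})$ to enrich the feature representation of $\mathbf{v}_{enc}^{i}$ for better reasoning about forgery clues.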

25 . Our Method – Reconstruction Guided Attention
The reconstructed forged faces differ largely from the input forged faces in visual appearance. This motivates us to use the reconstruction difference to indicate the probably manipulated traces.
[Figure: Input, Reconstruction, and Difference for a Fake face and a Real face]
Difference mask: $\mathbf{m} = |\hat{\mathbf{x}} - \mathbf{x}|$
Attended output features: $\mathbf{F}'_{enh} = \sigma(f_1(\mathbf{m})) \otimes f_2(\mathbf{F}_{enh})$, $\mathbf{F}_{att} = \mathbf{F}'_{enh} + \mathbf{F}_{enh}$
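
The attention step can be sketched as follows. This is a toy under stated assumptions, not the framework's actual layers: all tensors are flattened lists of equal length (in practice the mask would presumably have to be resized to the feature resolution), `f1` and `f2` are hypothetical callables standing in for learned mappings, $\sigma$ is taken to be the sigmoid, and the difference mask is taken as the element-wise absolute difference.

```python
import math


def sigmoid(v):
    """Logistic function, used here as the gating nonlinearity sigma."""
    return 1.0 / (1.0 + math.exp(-v))


def reconstruction_guided_attention(x, x_hat, feat, f1, f2):
    """Return F_att = sigma(f1(m)) * f2(F_enh) + F_enh, element-wise.

    x, x_hat : input and reconstructed image (flattened lists)
    feat     : enhanced features F_enh, assumed to have the same length
    f1, f2   : hypothetical stand-ins for the learned mappings
    """
    mask = [abs(p - q) for p, q in zip(x_hat, x)]    # m = |x_hat - x|
    gate = [sigmoid(v) for v in f1(mask)]            # sigma(f1(m))
    projected = f2(feat)                             # f2(F_enh)
    enhanced = [g * p for g, p in zip(gate, projected)]
    return [e + f for e, f in zip(enhanced, feat)]   # residual connection


def identity(v):
    return v


# Position 1 is reconstructed badly, so its feature is amplified more.
attended = reconstruction_guided_attention(
    x=[0.0, 0.0], x_hat=[0.0, 5.0], feat=[1.0, 1.0], f1=identity, f2=identity
)
```

The residual term keeps the original features intact, so a region with zero reconstruction error is still passed on to the classifier, only with a smaller gate value than a poorly reconstructed region.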

27 . Our Method – Loss Function
During training, we jointly optimize the reconstruction network and the classification network in an end-to-end manner. The total loss consists of the cross-entropy loss $\mathcal{L}_{cls}$ for binary classification, the reconstruction loss $\mathcal{L}_r$, and the metric-learning loss $\mathcal{L}_m$.
Total Loss: $\mathcal{L} = \mathcal{L}_{cls} + \lambda_1 \mathcal{L}_r + \lambda_2 \mathcal{L}_m$
where $\lambda_1$ and $\lambda_2$ are weights that balance the different losses.
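
A minimal sketch of how the three terms combine is given below. The function names are illustrative, the cross-entropy is the standard binary form averaged over a batch, and the weights in the example are arbitrary placeholders rather than the values used in the experiments.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-12):
    """L_cls: mean binary cross-entropy of predicted fake-probabilities."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def total_loss(l_cls, l_r, l_m, lambda1, lambda2):
    """L = L_cls + lambda1 * L_r + lambda2 * L_m."""
    return l_cls + lambda1 * l_r + lambda2 * l_m


# Placeholder loss values and weights, for illustration only.
l_cls = binary_cross_entropy([0.5, 0.5], [0, 1])
loss = total_loss(l_cls, l_r=2.0, l_m=-1.0, lambda1=0.5, lambda2=0.25)
```

Because all three terms are summed into one scalar, a single backward pass updates both the reconstruction and the classification parts of the network, which is what end-to-end training refers to here.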

28 . Experiments 03

29 . Experiments – Datasets & Evaluation Metrics
We adopt the commonly used benchmark datasets for face forgery detection, including FaceForensics++ (FF++) [1], Celeb-DF [2], WildDeepfake [3], and DFDC [4]. Following previous works, we report Accuracy (Acc), Area Under the Receiver Operating Characteristic Curve (AUC), and Equal Error Rate (EER) to evaluate the proposed method and existing competitors. A higher Acc or AUC with a lower EER indicates a better result.
[1] Andreas Rössler, Davide Cozzolino, Luisa Verdoliva, Christian Riess, Justus Thies, and Matthias Nießner. FaceForensics++: Learning to detect manipulated facial images. In ICCV, 2019.
[2] Yuezun Li, Xin Yang, Pu Sun, Honggang Qi, and Siwei Lyu. Celeb-DF: A large-scale challenging dataset for deepfake forensics. In CVPR, 2020.
[3] Bojia Zi, Minghao Chang, Jingjing Chen, Xingjun Ma, and Yu-Gang Jiang. WildDeepfake: A challenging real-world dataset for deepfake detection. In ACM MM, 2020.
[4] Brian Dolhansky, Joanna Bitton, Ben Pflaum, Jikuo Lu, Russ Howes, Menglin Wang, and Cristian Canton Ferrer. The deepfake detection challenge (DFDC) dataset. arXiv preprint arXiv:2006.07397, 2020.
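
For readers unfamiliar with the three metrics, the sketch below computes them from scratch on toy scores. It assumes that a score is the predicted probability of "fake" and that label 1 means fake; the 0.5 accuracy threshold and the threshold sweep used to approximate the EER are simplifications chosen for illustration, not the evaluation protocol of the paper.

```python
def accuracy(scores, labels, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auc(scores, labels):
    """AUC as the probability that a fake outscores a real (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def equal_error_rate(scores, labels):
    """Approximate EER: the operating point where FPR and FNR are closest."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best_gap, best_eer = None, None
    for t in sorted(set(scores)) + [float("inf")]:
        fpr = sum(n >= t for n in neg) / len(neg)  # reals flagged as fake
        fnr = sum(p < t for p in pos) / len(pos)   # fakes missed
        gap = abs(fpr - fnr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (fpr + fnr) / 2.0
    return best_eer


# Toy predictions: two fakes (label 1) and two reals (label 0).
scores = [0.9, 0.4, 0.6, 0.1]
labels = [1, 1, 0, 0]
acc, area, eer = accuracy(scores, labels), auc(scores, labels), equal_error_rate(scores, labels)
```

Acc depends on the chosen threshold, whereas AUC and EER summarize the ranking quality over all thresholds, which is one reason the three are reported together.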
