1 . End-to-End Reconstruction-Classification Learning for Face Forgery Detection (accepted by CVPR 2022). Junyi Cao, Chao Ma, Taiping Yao, Shen Chen, Shouhong Ding, Xiaokang Yang. April 2022.
2 . Catalogue: 1 Background; 2 Methodology; 3 Experiments; 4 Conclusion
3 . Background 01
4 . Face Forgery in our daily life Credit: https://www.youtube.com/watch?v=cQ54GDm1eL0
5 . Face Forgery in our daily life It was reported that Tsai Ing-wen's face had been swapped by deepfake software in Oct. 2021. A Ukrainian TV station broadcast a live news program in which Zelensky called on Ukrainians to lay down their weapons; the video was eventually found to be a deepfake. Credit: https://guoxue.ifeng.com/c/8AXocEvYuFv, https://new.qq.com/omn/20220318/20220318A0D6FK00.html
6 .Background – Face Forgery Detection A deepfake, or face forgery, is content produced by artificial intelligence that appears authentic to the human eye. Face forgery detection has received increasing attention because of concern over the malicious abuse of digitally forged facial images. Can you identify the forged faces below? All the listed faces are fake!
7 .Background – Face Forgery Detection Can you identify the forged faces below? These are the corresponding source faces (i.e., genuine faces).
8 .Background – Face Forgery Detection Malicious abuse of Face Forgery: Impersonation, Identity Theft, Fake, Spread of Misinformation
9 .Background – Face Forgery Detection Therefore, it is of paramount importance to develop effective face forgery detectors that separate forged faces from real ones.
10 . Methodology 02
11 . Existing Approach: Early Attempts
Early efforts [1, 2] follow the classic image classification pipeline, which directly takes input faces and performs binary classification.
Diagram: Input Faces → CNN → Real / Fake (binary classification)
1. Huy H. Nguyen, Junichi Yamagishi, and Isao Echizen. Capsule-forensics: Using capsule networks to detect forged images and videos. In ICASSP, 2019.
2. Andreas Rössler, Davide Cozzolino, Luisa Verdoliva, Christian Riess, Justus Thies, and Matthias Nießner. FaceForensics++: Learning to detect manipulated facial images. In ICCV, 2019.
饮水思源 爱国荣校 ("When drinking water, think of its source; love the country and honor the school") www.sjtu.edu.cn
12 . Existing Approach: Early Attempts
Drawbacks:
• CNN backbones inherited from general image classification models fail to capture subtle forgery traces.
• Such detectors generalize poorly to new types of face forgery.
13 . Existing Approach: Recent research
Recent works resort to specific forgery patterns, such as noise characteristics [1], textural information [2], and frequency statistics [3], to better detect the forgery artifacts residing in fake faces.
Diagram: Input Faces → CNN, emphasizing a specific forgery pattern (texture, frequency, noise) → Real / Fake (binary classification)
1. Peng Zhou, Xintong Han, Vlad I. Morariu, and Larry S. Davis. Two-stream neural networks for tampered face detection. In CVPR Workshops, 2017.
2. Hanqing Zhao, Wenbo Zhou, Dongdong Chen, Tianyi Wei, Weiming Zhang, and Nenghai Yu. Multi-attentional deepfake detection. In CVPR, 2021.
3. Yuyang Qian, Guojun Yin, Lu Sheng, Zixuan Chen, and Jing Shao. Thinking in frequency: Face forgery detection by mining frequency-aware clues. In ECCV, 2020.
14 . Existing Approach Recent research [CVPRW, 17]
15 . Existing Approach Recent research [ECCV, 20]
16 . Existing Approach Recent research [CVPR, 21]
17 . Existing Approach
Benefits
• Specific forgery patterns can reveal the subtle forgery clues residing in fake faces.
• These methods achieve high detection accuracy under within-dataset evaluation.
Challenges
• Emphasizing specific patterns specializes the learned representations to the known forgery types present in the training set, which easily leads to overfitting.
• Performance depends heavily on the quality of the patterns of interest, so these methods may fail when the patterns are corrupted.
18 . Our Idea: Focus on the distribution discrepancy
Figure labels: Real Face, Reconstructed Face, Real, Fake
1. Explore the essential characteristics of real faces so that the detector is sensitive even to unknown forgery patterns.
2. Mine the discrepancy between real and fake faces by strengthening the network's reasoning about forgery clues.
19 . Our Method – RECCE Methodology
To capture the essential discrepancy between real and fake faces, we propose the REConstruction-Classification lEarning (RECCE) framework, which consists of three main schemes.
Reconstruction learning
• A reconstruction network based on an encoder-decoder structure
Multi-scale Graph Reasoning (MGR)
• Reasons about forgery clues by combining the encoder output and the decoder features in a multi-scale way
Reconstruction Guided Attention (RGA)
• Highlights the probably forged regions on the embedding features to facilitate the final classification
20 . Our Method – RECCE Schematic diagram of the proposed framework.
21 . Our Method – Reconstruction Learning
Because face forgery methods are diverse, we argue that exploring the common characteristics of genuine faces is more suitable than overfitting to the specific forgery patterns present in the training set. We therefore perform reconstruction learning to restore real images only.
Reconstruction Learning
1. Add white noise: \[ \tilde{\boldsymbol{x}} = \boldsymbol{x} + \boldsymbol{\eta} \]
2. Go through the reconstruction network: \[ \hat{\boldsymbol{x}} = \mathcal{F}(\tilde{\boldsymbol{x}}) \]
3. Compute the reconstruction loss w.r.t. real faces: \[ \mathcal{L}_r = \frac{1}{|R|} \sum_{i \in R} \lVert \hat{\boldsymbol{x}}_i - \boldsymbol{x}_i \rVert_1 \]
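The three steps above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: images are flattened lists of floats, label 0 marks a real face, `reconstruct` is a hypothetical stand-in for the reconstruction network, and the noise level `sigma` is an assumed value that the slides do not specify.

```python
import random


def add_white_noise(x, sigma, rng):
    """Step 1: return x + eta, with eta drawn per pixel from N(0, sigma^2)."""
    return [xi + rng.gauss(0.0, sigma) for xi in x]


def l1_distance(a, b):
    """L1 norm of the difference between two flattened images."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def reconstruction_loss(images, labels, reconstruct, sigma=0.1, seed=0):
    """Mean L1 reconstruction error computed over real samples only.

    Assumptions (not from the slides): label 0 means real, sigma = 0.1,
    and `reconstruct` maps a noisy flattened image to its reconstruction.
    """
    rng = random.Random(seed)
    real = [x for x, y in zip(images, labels) if y == 0]
    if not real:
        return 0.0  # no real faces in the batch, nothing to reconstruct
    total = 0.0
    for x in real:
        x_noisy = add_white_noise(x, sigma, rng)   # step 1
        x_hat = reconstruct(x_noisy)               # step 2
        total += l1_distance(x_hat, x)             # step 3, against the clean x
    return total / len(real)


# Toy batch: two real images (label 0) and one fake image (label 1).
images = [[1.0, 2.0], [3.0, 4.0], [5.0, 5.0]]
labels = [0, 1, 0]
zero_net = lambda x: [0.0] * len(x)  # a "network" that outputs black images
loss = reconstruction_loss(images, labels, zero_net)
```

Fake samples never enter the loss, so the network is only ever rewarded for reproducing real faces.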
22 . Our Method – Reconstruction Learning
Besides the reconstruction difference, we use a metric-learning loss that pulls real images close together and pushes real and fake images far apart in the embedding space. In this way, the discrepancy between real and fake faces is strengthened in the feature space.
Metric-learning Loss
\[ \mathcal{L}_m = \frac{1}{N_{RR}} \sum_{i \in R,\, j \in R} d(\bar{\mathbf{F}}_i, \bar{\mathbf{F}}_j) - \frac{1}{N_{RF}} \sum_{i \in R,\, j \in F} d(\bar{\mathbf{F}}_i, \bar{\mathbf{F}}_j) \]
where \( d(\mathbf{a}, \mathbf{b}) = \frac{1}{2} \left( 1 - \frac{\mathbf{a}}{\lVert \mathbf{a} \rVert_2} \cdot \frac{\mathbf{b}}{\lVert \mathbf{b} \rVert_2} \right) \) is a pairwise distance function based on the cosine distance.
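The metric-learning loss can be written out as a short Python sketch. It assumes that each feature is a flat, non-zero vector, that label 0 means real and 1 means fake, and that N_RR and N_RF count all ordered pairs (including a real sample paired with itself); the slides do not spell out these conventions.

```python
import math


def cosine_distance(a, b):
    """d(a, b) = (1 - cos(a, b)) / 2, which lies in [0, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return (1.0 - dot / (norm_a * norm_b)) / 2.0


def metric_learning_loss(features, labels):
    """Mean real-real distance minus mean real-fake distance.

    Assumption: pairs are enumerated as ordered pairs, so N_RR = |R|^2
    and N_RF = |R| * |F|.
    """
    real = [f for f, y in zip(features, labels) if y == 0]
    fake = [f for f, y in zip(features, labels) if y == 1]
    rr = [cosine_distance(a, b) for a in real for b in real]
    rf = [cosine_distance(a, b) for a in real for b in fake]
    mean_rr = sum(rr) / len(rr) if rr else 0.0
    mean_rf = sum(rf) / len(rf) if rf else 0.0
    return mean_rr - mean_rf


# Two identical real features and one fake feature pointing the opposite way.
features = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
labels = [0, 0, 1]
loss_m = metric_learning_loss(features, labels)
```

Minimizing this value shrinks the first term (real faces cluster) and grows the second (fake faces move away from the real cluster).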
23 . Our Method – Multi-scale Graph Reasoning
When the metric-learning loss is applied to the decoder, information that is useful for separating real and fake images is embedded in the decoder as well. Since different face forgery techniques leave forged traces at various scales, the multi-scale structure encourages the learning of comprehensive forgery clues. As forgery traces usually appear in local areas, graphs are used to adaptively model the local relationship between the encoder output and the decoder features.
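24 . Our Method – Multi-scale Graph Reasoning
The encoder output and the decoder features are treated as vertex sets:
\[ \mathbf{F}_{\text{enc}} \triangleq V_{\text{enc}} = \{ \mathbf{v}_{\text{enc}}^{i} \}_{i=1}^{h_1 \times w_1}, \qquad \mathbf{F}_{\text{dec}} \triangleq V_{\text{dec}} = \{ \mathbf{v}_{\text{dec}}^{i} \}_{i=1}^{h_2 \times w_2} \]
Aggregated vertex:
\[ \mathbf{v}_{\text{agg}}^{i} = \sum_{j=1}^{N} a_j \, \mathbf{v}_{\text{dec}}^{i,j} \otimes \left( 1 - \psi(\mathbf{v}_{\text{enc}}^{i}) \right) \]
\( \mathcal{N}(\mathbf{v}_{\text{enc}}^{i}) = \{ \mathbf{v}_{\text{dec}}^{i,j} \}_{j=1}^{N} \) denotes the set of vertices in \( V_{\text{dec}} \) that are linked to \( \mathbf{v}_{\text{enc}}^{i} \).
Concretely, graph reasoning aggregates the information from \( \mathcal{N}(\mathbf{v}_{\text{enc}}^{i}) \) to enrich the feature representation of \( \mathbf{v}_{\text{enc}}^{i} \) for better reasoning about forgery clues.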
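The aggregation step for a single encoder vertex can be sketched as follows. The slide does not define the weights a_j or the function ψ, so both are passed in as arguments here; the usage example picks uniform weights and an elementwise sigmoid purely for illustration. The sketch also assumes that ⊗ is an elementwise product and that encoder and decoder vertices share the same dimensionality.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def aggregate_vertex(v_enc, neighbors, weights, psi):
    """Compute v_agg = sum_j a_j * v_dec_j (elementwise *) (1 - psi(v_enc)).

    v_enc:     encoder vertex, a list of floats
    neighbors: decoder vertices linked to v_enc (the set N(v_enc))
    weights:   one scalar a_j per neighbor (how they are obtained is assumed)
    psi:       hypothetical gating function mapping a vertex to a same-size list
    """
    gate = [1.0 - g for g in psi(v_enc)]
    v_agg = [0.0] * len(v_enc)
    for a_j, v_dec in zip(weights, neighbors):
        for k in range(len(v_agg)):
            v_agg[k] += a_j * v_dec[k] * gate[k]
    return v_agg


# Illustrative choices only: uniform weights and an elementwise sigmoid gate.
elementwise_sigmoid = lambda v: [sigmoid(z) for z in v]
v_enc = [0.0, 0.0]
neighbors = [[2.0, 4.0], [6.0, 8.0]]
weights = [0.5, 0.5]
v_agg = aggregate_vertex(v_enc, neighbors, weights, elementwise_sigmoid)
```

Repeating this for every encoder vertex, and at several decoder scales, gives the multi-scale behaviour described on the previous slide.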
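25 . Our Method – Reconstruction Guided Attention
The reconstructed forged faces differ markedly from the input forged faces in visual appearance. This motivates us to use the reconstruction difference to indicate the regions that are probably manipulated.
Figure: input, reconstruction, and difference for a fake face and a real face.
Difference mask:
\[ \mathbf{m} = \lvert \hat{\boldsymbol{x}} - \boldsymbol{x} \rvert \]
Attended output features:
\[ \mathbf{F}'_{\text{enh}} = \sigma(f_1(\mathbf{m})) \otimes f_2(\mathbf{F}_{\text{enh}}), \qquad \mathbf{F}_{\text{att}} = \mathbf{F}'_{\text{enh}} + \mathbf{F}_{\text{enh}} \]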
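A small Python sketch of the attention computation follows. It assumes that the mask is the absolute pixel-wise difference, that σ is the sigmoid, that ⊗ is an elementwise product, and that the mask and the features have already been brought to the same flattened length; `f1` and `f2` are hypothetical callables standing in for the learned layers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def difference_mask(x_hat, x):
    """m = |x_hat - x|, computed per element (absolute value is an assumption)."""
    return [abs(a - b) for a, b in zip(x_hat, x)]


def reconstruction_guided_attention(f_enh, x_hat, x, f1, f2):
    """Return F_att = sigmoid(f1(m)) * f2(F_enh) + F_enh, elementwise."""
    m = difference_mask(x_hat, x)
    attention = [sigmoid(v) for v in f1(m)]
    projected = f2(f_enh)
    f_enh_prime = [a * p for a, p in zip(attention, projected)]
    return [fp + f for fp, f in zip(f_enh_prime, f_enh)]


# Identity stand-ins for the learned layers f1 and f2 (illustration only).
identity = lambda v: list(v)
x = [0.2, 0.7]
x_hat = [0.2, 0.7]            # perfect reconstruction, so the mask is all zeros
f_enh = [2.0, 4.0]
f_att = reconstruction_guided_attention(f_enh, x_hat, x, identity, identity)
```

The residual term keeps the original features intact, so the mask can only add emphasis where the reconstruction differs from the input.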
27 . Our Method – Loss Function
During training, we jointly optimize the reconstruction network and the classification network in an end-to-end manner. The total loss consists of the cross-entropy loss \( \mathcal{L}_{\text{cls}} \) for binary classification, the reconstruction loss \( \mathcal{L}_r \), and the metric-learning loss \( \mathcal{L}_m \).
Total Loss
\[ \mathcal{L} = \mathcal{L}_{\text{cls}} + \lambda_1 \mathcal{L}_r + \lambda_2 \mathcal{L}_m \]
where \( \lambda_1 \) and \( \lambda_2 \) are weight parameters that balance the different losses.
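As a worked example of the total loss, the sketch below combines a binary cross-entropy term with given reconstruction and metric-learning values. The default weights are placeholders, since the slides do not state the values of λ1 and λ2, and the convention that label 1 means fake is likewise an assumption.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean cross-entropy for predicted fake-probabilities (label 1 = fake)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def total_loss(l_cls, l_r, l_m, lambda1=0.1, lambda2=0.1):
    """L = L_cls + lambda1 * L_r + lambda2 * L_m (default weights are placeholders)."""
    return l_cls + lambda1 * l_r + lambda2 * l_m


l_cls = binary_cross_entropy([0.5, 0.5], [0, 1])   # an undecided classifier
loss = total_loss(1.0, 2.0, -1.0, lambda1=0.5, lambda2=0.25)
```

Because all three terms are summed into one scalar, a single backward pass updates the reconstruction and classification branches together.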
28 . Experiments 03
29 . Experiments – Datasets & Evaluation Metrics
We adopt the commonly used benchmark datasets for face forgery detection: FaceForensics++ (FF++) [1], Celeb-DF [2], WildDeepfake [3], and DFDC [4].
Following previous works, we report Accuracy (Acc), Area Under the Receiver Operating Characteristic Curve (AUC), and Equal Error Rate (EER) to evaluate the proposed method and existing competitors. A higher Acc or AUC and a lower EER indicate a better result.
1. Andreas Rössler, Davide Cozzolino, Luisa Verdoliva, Christian Riess, Justus Thies, and Matthias Nießner. FaceForensics++: Learning to detect manipulated facial images. In ICCV, 2019.
2. Yuezun Li, Xin Yang, Pu Sun, Honggang Qi, and Siwei Lyu. Celeb-DF: A large-scale challenging dataset for deepfake forensics. In CVPR, 2020.
3. Bojia Zi, Minghao Chang, Jingjing Chen, Xingjun Ma, and Yu-Gang Jiang. WildDeepfake: A challenging real-world dataset for deepfake detection. In ACM MM, 2020.
4. Brian Dolhansky, Joanna Bitton, Ben Pflaum, Jikuo Lu, Russ Howes, Menglin Wang, and Cristian Canton Ferrer. The DeepFake Detection Challenge (DFDC) dataset. arXiv preprint arXiv:2006.07397, 2020.
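The three metrics can be illustrated with a dependency-free Python sketch. It assumes that scores are predicted fake-probabilities and that label 1 (fake) is the positive class; the EER routine is a simple approximation that sweeps the observed scores as thresholds rather than interpolating the ROC curve. The numbers are toy values, not results from the paper.

```python
def accuracy(scores, labels, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(scores, labels):
    """AUC as the probability that a fake outscores a real (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def equal_error_rate(scores, labels):
    """Approximate EER: error rate at the threshold where FPR and FNR are closest."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best_gap, best_eer = None, None
    for t in sorted(set(scores)):
        fnr = sum(p < t for p in pos) / len(pos)    # fakes missed
        fpr = sum(n >= t for n in neg) / len(neg)   # reals flagged
        gap = abs(fpr - fnr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (fpr + fnr) / 2.0
    return best_eer


# Toy predictions for three fake (1) and three real (0) faces.
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
acc = accuracy(scores, labels)
auc = roc_auc(scores, labels)
eer = equal_error_rate(scores, labels)
```

Acc depends on the chosen threshold, whereas AUC and EER summarize the ranking quality across thresholds, which is why they are commonly reported together.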