基于多尺度时空特征和篡改概率改善换脸检测的跨库性能

胡永健; 卓思超; 刘琲贝; 王宇飞; 李纪成

doi:10.12141/j.issn.1000-565X.230105

Journal of South China University of Technology (Natural Science Edition)

2024, Vol. 52, Issue 6: 110-119


Computer Science and Technology


^1.华南理工大学电子与信息工程学院，广东广州 510640
^2.广东警官学院刑事技术学院，广东广州 510440

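胡永健 (HU Yongjian, born 1962), male, professor and doctoral supervisor. His research covers multimedia information security, image processing, and artificial intelligence and its applications. E-mail: eeyjhu@scut.edu.cn

刘琲贝 (LIU Beibei, born 1980), female, lecturer and master's supervisor. Her research focuses on multimedia information security. E-mail: eebbliu@scut.edu.cn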

Received: 2023-03-13

Published online: 2023-11-07

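Funding

Scientific Research Capability Improvement Program for Key Discipline Construction of Guangdong Province (2021ZDJS047); International Science and Technology Cooperation Project of Guangzhou Development District (2022GH15); National Fund Cultivation Project of China People's Police University (JJPY202402); Characteristic Innovation Project of Colleges and Universities in Guangdong Province (Natural Science) (2023KTSCX093)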


Improvement of Cross-Dataset Performance of Face Forgery Detection Based on Multi-Scale Spatiotemporal Features and Tampering Probabilities

HU Yongjian, ZHUO Sichao, LIU Beibei, WANG Yufei, LI Jicheng


^1. School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China
^2. School of Criminal Science and Technology, Guangdong Police College, Guangzhou 510440, Guangdong, China

Received date: 2023-03-13

Online published: 2023-11-07

Supported by

The Scientific Research Capability Improvement Program for Key Discipline Construction of Guangdong Province (2021ZDJS047); the Characteristic Innovation Project of Colleges and Universities in Guangdong Province (Natural Science) (2023KTSCX093)


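Abstract

Most current DeepFake face-swap detection algorithms rely too heavily on local features. Although their intra-dataset detection performance is fairly good, they are prone to overfitting, which leads to unsatisfactory cross-dataset detection performance, that is, insufficient generalization. To address this, the paper proposes a face-swap video detection algorithm based on multi-scale spatiotemporal features and tampering probability. It exploits the inter-frame temporal discontinuity that is widespread in fake-face videos to counter the marked performance drop of existing detectors under cross-dataset testing, cross-forgery testing, and video compression, and thereby improve generalization. The algorithm comprises three modules: a multi-scale spatiotemporal feature extraction module, designed to detect the discontinuity traces that fake-face videos leave in the temporal domain; a three-dimensional dual-attention module, designed to adaptively compute the spatiotemporal correlation among the multi-scale spatiotemporal features; and an auxiliary supervision module, designed to predict the tampering probabilities of randomly selected pixels and construct a supervision mask. The proposed algorithm is evaluated on large public benchmark datasets, including FF++, DFD, DFDC, and CDF, and compared with a baseline algorithm and recently published algorithms of the same kind. The results show that, while maintaining good average intra-dataset detection performance, the proposed algorithm achieves the best overall performance in cross-dataset detection and under video compression, and above-average overall performance in cross-forgery detection. The experimental results verify the effectiveness of the proposed algorithm.

Keywords: face-swap detection; cross-dataset performance; multi-scale spatiotemporal features; attention mechanism; tampering probability; 3D point cloud reconstruction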
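
The auxiliary supervision idea in the abstract above (predicting tampering probabilities at randomly selected pixels and comparing them with a supervision mask) can be illustrated with a small sketch. This is not the authors' implementation: the function names `sample_pixels` and `sampled_bce`, the use of binary cross-entropy, and the uniform sampling scheme are all assumptions made for illustration, since the abstract does not specify the loss or the sampling strategy.

```python
import math
import random


def sample_pixels(height, width, k, seed=0):
    """Pick k distinct (row, col) positions uniformly at random.

    Hypothetical helper: the paper's actual sampling scheme is not given.
    """
    rng = random.Random(seed)
    coords = [(r, c) for r in range(height) for c in range(width)]
    return rng.sample(coords, k)


def sampled_bce(pred, mask, points, eps=1e-7):
    """Mean binary cross-entropy between predicted tampering probabilities
    and a binary supervision mask, evaluated only at the sampled points.

    Assumption: BCE stands in for whatever loss the paper actually uses.
    """
    total = 0.0
    for r, c in points:
        p = min(max(pred[r][c], eps), 1.0 - eps)  # clamp to avoid log(0)
        y = mask[r][c]
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(points)


# Toy 2x2 example: bottom row is tampered (1), top row is pristine (0).
mask = [[0, 0], [1, 1]]
good_pred = [[0.0, 0.0], [1.0, 1.0]]   # agrees with the mask
bad_pred = [[1.0, 1.0], [0.0, 0.0]]    # contradicts the mask
points = sample_pixels(2, 2, k=3)

print(sampled_bce(good_pred, mask, points))  # close to zero
print(sampled_bce(bad_pred, mask, points))   # large
```

Evaluating the loss on a random subset of pixels rather than the full mask is one plausible reading of "randomly selected pixels"; the paper itself should be consulted for the actual design.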

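Cite this article

HU Yongjian, ZHUO Sichao, LIU Beibei, WANG Yufei, LI Jicheng. Improvement of Cross-Dataset Performance of Face Forgery Detection Based on Multi-Scale Spatiotemporal Features and Tampering Probabilities[J]. Journal of South China University of Technology (Natural Science Edition), 2024, 52(6): 110-119. DOI: 10.12141/j.issn.1000-565X.230105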

Abstract

Most existing DeepFake face forgery detection algorithms generalize poorly even though their intra-dataset detection performance is fairly good. They rely mainly on local features, which are prone to overfitting and lead to unsatisfactory cross-dataset detection performance. To address this problem, a face forgery detection method based on multi-scale spatiotemporal features and tampering probability is proposed. By detecting the temporal inconsistency between consecutive frames that is widespread in DeepFake videos, it aims to maintain good performance under cross-dataset testing, cross-forgery testing, and video compression. The method consists of three modules: a multi-scale spatiotemporal feature extraction module that reveals the discontinuity traces of fake videos in the temporal domain; a three-dimensional dual-attention module that adaptively computes the correlation between multi-scale spatiotemporal features; and an auxiliary supervision module that predicts the tampering probabilities of randomly selected pixels to form a supervision mask. The proposed algorithm is compared with a baseline algorithm and recent related works on large-scale public benchmark datasets such as FF++, DFD, DFDC, and CDF. Experimental results show that the proposed algorithm achieves the best overall performance under cross-dataset testing and video compression and above-average performance under cross-forgery testing, while maintaining good average intra-dataset performance. These results demonstrate the effectiveness of the proposed algorithm.

Key words: face forgery detection; cross-dataset performance; multi-scale spatiotemporal feature; attention mechanism; tampering probability; 3D point cloud reconstruction
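
To make the notion of temporal inconsistency at multiple scales concrete, the following sketch measures frame-to-frame change at several temporal strides. It is only a toy illustration under stated assumptions: the paper's module is a learned network, whereas this code uses plain mean absolute differences, and the names `temporal_differences` and `multi_scale_discontinuity` as well as the strides (1, 2, 4) are hypothetical.

```python
from statistics import mean


def temporal_differences(frames, stride):
    """Mean absolute difference between frames that are `stride` steps apart.

    Each frame is a flat list of pixel intensities of equal length.
    """
    diffs = []
    for t in range(len(frames) - stride):
        a, b = frames[t], frames[t + stride]
        diffs.append(mean(abs(x - y) for x, y in zip(a, b)))
    return diffs


def multi_scale_discontinuity(frames, strides=(1, 2, 4)):
    """For each temporal stride, report the largest inter-frame jump."""
    return {s: max(temporal_differences(frames, s), default=0.0) for s in strides}


# Eight 4-pixel frames whose brightness rises by 1 per frame (smooth motion).
smooth = [[float(t)] * 4 for t in range(8)]

# Same clip with one abruptly brightened frame, mimicking a temporal glitch.
glitchy = [list(f) for f in smooth]
glitchy[4] = [14.0] * 4

print(multi_scale_discontinuity(smooth))   # {1: 1.0, 2: 2.0, 4: 4.0}
print(multi_scale_discontinuity(glitchy))  # {1: 11.0, 2: 12.0, 4: 14.0}
```

The smooth clip changes in proportion to the stride, while the glitchy clip shows a jump far above that trend at every scale, which is the kind of cue a temporal-inconsistency detector is meant to pick up.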

