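Received date: 2022-04-27
Online published: 2022-11-28
Supported by
the National Natural Science Foundation of China (52072130); the Natural Science Foundation of Guangdong Province (2020A1515010349); the Fundamental Research Funds for the Central Universities, South China University of Technology (2020ZYGXZR085)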
Traffic Data Imputation Based on Self-Supervised Learning
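ZHOU Chu-hao, LIN Pei-qun, YAN Ming-yue. Traffic data imputation based on self-supervised learning[J]. Journal of South China University of Technology (Natural Science Edition), 2023, 51(4): 101-114. DOI: 10.12141/j.issn.1000-565X.220237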
Regional highway networks contain many toll stations that generate massive volumes of toll data every day. Because of equipment and network problems, data transmission from some stations is delayed, and the data already received often cannot meet the requirements of real-time traffic flow prediction. To enable real-time traffic data imputation and dynamic traffic flow prediction, this paper first proposes a self-supervised learning method for imputing missing highway traffic flow data, built on an attention-based time series model (Seq2Seq-Att). The model is then trained in a self-supervised manner. Finally, the method is validated on 80 toll stations in the highway network of Guangdong Province. The results show that the method flexibly captures the missing patterns in traffic data and produces reasonable imputed values from the intrinsic correlations within the data. It outperforms the other methods overall and performs well across different missing rates, with an overall MAPE of about 17.7% and a WMAPE of 12.8%. At high missing rates, its advantage over the other imputation methods is pronounced. Traffic volume prediction experiments further show that predictions made with data imputed by this method are nearly as accurate as those made with complete data.
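The evaluation idea behind these figures can be illustrated with a minimal sketch. It assumes the usual definitions of MAPE (mean of per-point relative errors) and WMAPE (total absolute error divided by total actual volume); the abstract does not spell out its formulas. It also assumes that self-supervised training samples are built by hiding values that were actually observed, so that the hidden values serve as targets. The function names, the flow counts, and the mean-fill imputer are hypothetical stand-ins; the mean-fill imputer is not the Seq2Seq-Att model.

```python
import random


def make_masked_sample(flows, missing_rate, rng):
    """Hide a random subset of observed values; the hidden values become targets."""
    n_hidden = round(len(flows) * missing_rate)
    # Keep at least one hidden and at least one visible value.
    n_hidden = min(len(flows) - 1, max(1, n_hidden))
    hidden = set(rng.sample(range(len(flows)), n_hidden))
    masked = [None if i in hidden else v for i, v in enumerate(flows)]
    return masked, sorted(hidden)


def mean_fill(masked):
    """Placeholder imputer (NOT Seq2Seq-Att): fill gaps with the mean of visible values."""
    visible = [v for v in masked if v is not None]
    mean = sum(visible) / len(visible)
    return [mean if v is None else v for v in masked]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumed definition)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def wmape(actual, predicted):
    """Weighted MAPE, in percent: total absolute error over total actual volume."""
    total_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return 100.0 * total_error / sum(abs(a) for a in actual)


# Hypothetical hourly vehicle counts for one toll station.
flows = [120, 135, 150, 160, 155, 140, 130, 125]
rng = random.Random(0)
masked, hidden = make_masked_sample(flows, missing_rate=0.25, rng=rng)
filled = mean_fill(masked)

# Score only the positions that were hidden, since those are the imputed values.
actual = [flows[i] for i in hidden]
predicted = [filled[i] for i in hidden]
print(f"MAPE={mape(actual, predicted):.1f}%  WMAPE={wmape(actual, predicted):.1f}%")
```

WMAPE weights each error by traffic volume, so low-flow periods with large relative errors inflate it less than they inflate MAPE. This may be one reason the reported WMAPE is lower than the reported MAPE.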