Traffic & Transportation Engineering

Traffic Data Imputation Based on Self-Supervised Learning

  • 1. School of Civil Engineering and Transportation, South China University of Technology, Guangzhou 510640, Guangdong, China
    2. Highway Monitoring & Response Center, MOT, Beijing 100088, China
ZHOU Chuhao (b. 1994), male, Ph.D., whose research focuses on traffic big data and intelligent transportation. E-mail: 505192138@qq.com

Received date: 2022-04-27

Online published: 2022-11-28

Supported by

the National Natural Science Foundation of China (52072130); the Natural Science Foundation of Guangdong Province (2020A1515010349)

Abstract

Regional highway networks contain numerous toll stations that generate massive amounts of data every day. Because of equipment and network problems, however, data transmission from some stations may be delayed, and the data received in time may not be sufficient for real-time traffic flow prediction. To achieve real-time traffic data imputation and dynamic traffic flow prediction, this paper first proposes an imputation method for highway traffic flow data that adopts an attention-based time series model (Seq2Seq-Att). The model is then trained with a self-supervised learning method. Finally, the reliability of the method is verified using 80 toll stations in the highway network of Guangdong Province as an example. The results show that the proposed method flexibly captures the missing patterns in traffic data and produces reasonable values based on the internal correlation of the data. It is generally superior to the other methods compared and performs well under different missing rates, with an overall MAPE of about 17.7% and a WMAPE of 12.8%. Its advantage over the other methods is most evident at high missing rates. The traffic volume prediction results indicate that predictions made with data completed by this method reach an accuracy close to that obtained with complete data.
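As a rough illustration of two ideas named in the abstract, the sketch below shows how a self-supervised training pair might be built by hiding values that were actually observed, and how MAPE and WMAPE are commonly computed. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names (`mask_observed`, `mape`, `wmape`), the uniform random masking scheme, and the metric definitions are all assumptions, since the abstract does not specify them.

```python
import random


def mask_observed(series, mask_rate, seed=0):
    """Hide a random share of the observed values in one series.

    Assumption: values are hidden uniformly at random. Returns
    (corrupted, hidden), where corrupted holds None at hidden positions
    and hidden lists the indices whose true values become training targets.
    """
    rng = random.Random(seed)
    # Only values that were really observed can serve as targets.
    observed = [i for i, v in enumerate(series) if v is not None]
    n_hide = int(round(mask_rate * len(observed)))
    hidden = sorted(rng.sample(observed, n_hide))
    corrupted = list(series)
    for i in hidden:
        corrupted[i] = None
    return corrupted, hidden


def mape(actual, imputed):
    """Mean absolute percentage error, skipping zero actual values."""
    pairs = [(a, p) for a, p in zip(actual, imputed) if a != 0]
    return sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


def wmape(actual, imputed):
    """Weighted MAPE: total absolute error divided by total actual volume."""
    total_error = sum(abs(a - p) for a, p in zip(actual, imputed))
    return total_error / sum(abs(a) for a in actual)


# Hypothetical flow counts for one toll station; None marks a delayed record.
flows = [120, None, 95, 80, 130]
corrupted, hidden = mask_observed(flows, mask_rate=0.5)

# Hypothetical true and imputed values, used only to exercise the metrics.
actual = [100, 200, 50, 400]
imputed = [110, 180, 60, 400]
print(mape(actual, imputed), wmape(actual, imputed))
```

Under these definitions, WMAPE weights each error by traffic volume, so low-volume intervals with large relative errors count for less than they do in MAPE. That may be one reason a reported WMAPE can sit below the corresponding MAPE.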

Cite this article

ZHOU Chuhao, LIN Peiqun, YAN Mingyue. Traffic Data Imputation Based on Self-Supervised Learning[J]. Journal of South China University of Technology (Natural Science), 2023, 51(4): 101-114. DOI: 10.12141/j.issn.1000-565X.220237
