Traffic and Transportation Engineering

Traffic Data Imputation Based on Self-Supervised Learning

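  • 1. School of Civil Engineering and Transportation, South China University of Technology, Guangzhou 510640, Guangdong, China
    2. Highway Monitoring & Response Center, MOT, Beijing 100088, China
ZHOU Chuhao (born 1994), male, Ph.D., mainly engaged in research on traffic big data and intelligent transportation. E-mail: 505192138@qq.com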

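Received: 2022-04-27

Published online: 2022-11-28

Supported by

the National Natural Science Foundation of China (52072130); the Natural Science Foundation of Guangdong Province (2020A1515010349); the Fundamental Research Funds for the Central Universities, South China University of Technology (2020ZYGXZR085)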



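Cite this article

ZHOU Chuhao, LIN Peiqun, YAN Mingyue. Traffic data imputation based on self-supervised learning[J]. Journal of South China University of Technology (Natural Science Edition), 2023, 51(4): 101-114. DOI: 10.12141/j.issn.1000-565X.220237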

Abstract

Regional highway networks contain many toll stations that generate massive volumes of toll data every day. Because of equipment and network problems, data transmission from some stations is delayed, and the data that has already arrived is often insufficient for real-time traffic flow prediction. To support real-time traffic data imputation and dynamic traffic flow prediction, this paper first proposes a method for imputing missing highway traffic flow data based on self-supervised learning, built on an attention-based time series model (Seq2Seq-Att). The model is then trained in a self-supervised manner. Finally, the reliability of the method is verified on 80 toll stations in the highway network of Guangdong Province. The results show that the method flexibly captures the missing patterns in traffic data and produces reasonable imputed values from the internal correlations of the data. It generally outperforms the other methods compared and performs well across different missing rates, with an overall MAPE of about 17.7% and a WMAPE of 12.8%. At high missing rates, its advantage over the other imputation methods is pronounced. Traffic volume prediction results further indicate that predictions made with data imputed by this method are nearly as accurate as those made with complete data.
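
The abstract names a self-supervised training scheme and two error metrics without defining them. The sketch below illustrates one common reading, and everything in it is an assumption rather than the paper's implementation: observed values are hidden at a chosen missing rate so the originals can serve as targets, MAPE is taken as the mean of per-point relative errors, and WMAPE as total absolute error divided by total actual flow. The helper names, the toy flow series, and the carry-forward imputer (a stand-in, not Seq2Seq-Att) are all hypothetical.

```python
import random


def mask_observations(series, missing_rate, rng):
    """Hide a random subset of observed values; return (masked, hidden_indices).

    Assumed scheme: the hidden originals become the training/evaluation targets.
    """
    n_hide = max(1, round(len(series) * missing_rate))
    hidden = sorted(rng.sample(range(len(series)), n_hide))
    masked = list(series)
    for i in hidden:
        masked[i] = None
    return masked, hidden


def mape(actual, predicted):
    """Mean absolute percentage error (assumes every actual value is nonzero)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def wmape(actual, predicted):
    """Weighted MAPE (assumed form): total absolute error over total actual volume."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / sum(abs(a) for a in actual)


def fill_with_last_seen(masked):
    """Placeholder imputer, NOT the paper's model: carry the last observed value forward."""
    observed = [v for v in masked if v is not None]
    last = observed[0]  # used only if the series starts with a gap
    filled = []
    for v in masked:
        if v is not None:
            last = v
        filled.append(last)
    return filled


# Made-up flow counts for one hypothetical toll station.
flows = [120, 135, 150, 160, 155, 140, 130, 125]
rng = random.Random(0)
masked, hidden = mask_observations(flows, missing_rate=0.25, rng=rng)
filled = fill_with_last_seen(masked)

# Score only the positions that were deliberately hidden.
targets = [flows[i] for i in hidden]
estimates = [filled[i] for i in hidden]
print("hidden positions:", hidden)
print("MAPE:", mape(targets, estimates))
print("WMAPE:", wmape(targets, estimates))
```

Under these assumed definitions, a large relative error at a low-volume point raises MAPE much more than WMAPE, which is one way the two figures can diverge.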
