Journal of South China University of Technology(Natural Science Edition) ›› 2023, Vol. 51 ›› Issue (4): 101-114. doi: 10.12141/j.issn.1000-565X.220237
Special Issue: 2023 Traffic and Transportation Engineering
• Traffic & Transportation Engineering •
ZHOU Chuhao¹, LIN Peiqun¹, YAN Mingyue²
Received:
2022-04-27
Online:
2023-04-25
Published:
2022-12-02
Contact:
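LIN Peiqun (born 1980), male, professor and doctoral supervisor; research interests include traffic big data and intelligent transportation.
E-mail: pqlin@scut.edu.cn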
About author:
ZHOU Chuhao (born 1994), male, Ph.D.; research interests include traffic big data and intelligent transportation. E-mail: 505192138@qq.com
ZHOU Chuhao, LIN Peiqun, YAN Mingyue. Traffic Data Imputation Based on Self-Supervised Learning[J]. Journal of South China University of Technology(Natural Science Edition), 2023, 51(4): 101-114.
Table 2
Errors of different data imputation methods under different missing rates
Percentages in the column headings are missing rates; "Avg." is the average over the four missing rates.

Method | MAPE (20%) | MAPE (40%) | MAPE (60%) | MAPE (80%) | MAPE (Avg.) | WMAPE (20%) | WMAPE (40%) | WMAPE (60%) | WMAPE (80%) | WMAPE (Avg.) |
---|---|---|---|---|---|---|---|---|---|---|
Proposed | 0.172 | 0.174 | 0.177 | 0.184 | 0.177 | 0.125 | 0.126 | 0.128 | 0.133 | 0.128 |
Proposed (without time) | 0.179 | 0.183 | 0.188 | 0.205 | 0.189 | 0.129 | 0.131 | 0.135 | 0.142 | 0.134 |
LSTM-AutoEncoder | 0.182 | 0.187 | 0.193 | 0.208 | 0.193 | 0.131 | 0.133 | 0.135 | 0.144 | 0.136 |
GAN | 0.213 | 0.229 | 0.242 | 0.258 | 0.235 | 0.132 | 0.139 | 0.146 | 0.157 | 0.144 |
LSTM | 0.205 | 0.225 | 0.242 | 0.262 | 0.234 | 0.129 | 0.136 | 0.145 | 0.156 | 0.142 |
Interpolate | 0.166 | 0.174 | 0.193 | 0.25 | 0.196 | 0.112 | 0.118 | 0.13 | 0.168 | 0.132 |
KNN | 0.174 | 0.185 | 0.209 | 0.277 | 0.211 | 0.12 | 0.129 | 0.145 | 0.189 | 0.146 |
Ffill | 0.202 | 0.218 | 0.245 | 0.306 | 0.243 | 0.141 | 0.152 | 0.172 | 0.213 | 0.17 |
Mean | 0.511 | 0.551 | 0.602 | 0.67 | 0.583 | 0.287 | 0.309 | 0.338 | 0.377 | 0.328 |
Median | 0.48 | 0.54 | 0.605 | 0.675 | 0.575 | 0.3 | 0.331 | 0.364 | 0.399 | 0.349 |
Mode | 0.449 | 0.488 | 0.531 | 0.575 | 0.51 | 0.373 | 0.395 | 0.417 | 0.436 | 0.405 |

Method | RMSE (20%) | RMSE (40%) | RMSE (60%) | RMSE (80%) | RMSE (Avg.) | MAE (20%) | MAE (40%) | MAE (60%) | MAE (80%) | MAE (Avg.) |
---|---|---|---|---|---|---|---|---|---|---|
Proposed | 20.37 | 20.6 | 21.06 | 21.84 | 20.97 | 13.61 | 13.73 | 14.00 | 14.56 | 13.97 |
Proposed (without time) | 21.07 | 21.32 | 21.92 | 22.93 | 21.81 | 14.09 | 14.28 | 14.68 | 15.45 | 14.63 |
LSTM-AutoEncoder | 21.33 | 21.52 | 21.99 | 23.35 | 22.05 | 14.26 | 14.44 | 14.77 | 15.74 | 14.80 |
GAN | 20.97 | 21.83 | 22.96 | 24.52 | 22.57 | 14.41 | 15.11 | 15.97 | 17.11 | 15.65 |
LSTM | 20.61 | 21.51 | 22.70 | 24.40 | 22.31 | 14.08 | 14.87 | 15.79 | 17.06 | 15.45 |
Interpolate | 18.79 | 19.82 | 22.15 | 28.88 | 22.41 | 12.34 | 12.95 | 14.32 | 18.48 | 14.52 |
KNN | 20.09 | 21.53 | 24.29 | 31.68 | 24.4 | 13.23 | 14.19 | 15.96 | 20.78 | 16.04 |
Ffill | 23.49 | 25.4 | 28.8 | 35.33 | 28.25 | 15.53 | 16.72 | 18.88 | 23.35 | 18.62 |
Mean | 42.99 | 46.27 | 50.51 | 56.13 | 48.98 | 31.51 | 33.96 | 37.16 | 41.44 | 36.02 |
Median | 47.41 | 51.8 | 56.3 | 60.84 | 54.09 | 32.92 | 36.35 | 40.01 | 43.86 | 38.28 |
Mode | 64.19 | 66.72 | 68.94 | 70.73 | 67.65 | 40.98 | 43.42 | 45.77 | 47.85 | 44.5 |
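
The four error measures reported in Table 2 and three of its simple baselines can be illustrated with a small sketch. This is not the authors' code: the function names are hypothetical, the metric formulas are the commonly used definitions (the paper's exact formulas are not shown on this page), and scoring only the artificially removed positions is an assumption about the evaluation protocol. The series must contain at least one observed value.

```python
import math
import random
from statistics import mean


def mask_random(series, missing_rate, seed=0):
    """Hide a random fraction of the values; hidden entries become None."""
    rng = random.Random(seed)
    n_missing = round(len(series) * missing_rate)
    idx = sorted(rng.sample(range(len(series)), n_missing))
    masked = list(series)
    for i in idx:
        masked[i] = None
    return masked, idx


def ffill(series):
    """Forward fill; a leading gap is filled with the first observed value."""
    first = next(v for v in series if v is not None)
    out, last = [], first
    for v in series:
        if v is not None:
            last = v
        out.append(last)
    return out


def mean_fill(series):
    """Replace every gap with the mean of the observed values."""
    m = mean(v for v in series if v is not None)
    return [m if v is None else v for v in series]


def interpolate(series):
    """Linear interpolation between the nearest observed neighbours."""
    obs = [i for i, v in enumerate(series) if v is not None]
    out = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((j for j in obs if j < i), default=None)
        right = min((j for j in obs if j > i), default=None)
        if left is None:        # gap at the start: copy nearest value
            out[i] = series[right]
        elif right is None:     # gap at the end: copy nearest value
            out[i] = series[left]
        else:
            w = (i - left) / (right - left)
            out[i] = series[left] + w * (series[right] - series[left])
    return out


def errors(truth, filled, idx):
    """MAPE, WMAPE, RMSE and MAE over the masked positions only."""
    abs_err = [abs(truth[i] - filled[i]) for i in idx]
    return {
        "MAPE": mean(e / abs(truth[i]) for e, i in zip(abs_err, idx)),
        "WMAPE": sum(abs_err) / sum(abs(truth[i]) for i in idx),
        "RMSE": math.sqrt(mean(e * e for e in abs_err)),
        "MAE": mean(abs_err),
    }


if __name__ == "__main__":
    truth = [100, 120, 140, 160, 180, 200]   # toy traffic volumes
    masked, idx = mask_random(truth, missing_rate=0.4, seed=1)
    for name, fn in [("Ffill", ffill), ("Mean", mean_fill),
                     ("Interpolate", interpolate)]:
        print(name, errors(truth, fn(masked), idx))
```

MAPE averages the per-point relative errors, so it weights low-volume intervals heavily, whereas WMAPE divides the total absolute error by the total true volume. This difference is one plausible reason the MAPE and WMAPE columns can rank methods differently.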
Table 3
Comparison of traffic volume prediction errors using imputed data under different missing rates
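Percentages in the column headings are missing rates. A hyphen marks a combination that does not apply: complete data has a missing rate of 0, while missing and imputed data are reported only for missing rates of 20% to 80%.

Prediction method | Data type | MAPE (0%) | MAPE (20%) | MAPE (40%) | MAPE (60%) | MAPE (80%) | MAE (0%) | MAE (20%) | MAE (40%) | MAE (60%) | MAE (80%) | RMSE (0%) | RMSE (20%) | RMSE (40%) | RMSE (60%) | RMSE (80%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
KNN | Missing | - | 0.512 | 0.713 | 1.203 | 1.699 | - | 134.6 | 197.6 | 256.5 | 308.7 | - | 178.4 | 251.9 | 325.0 | 417.6 |
KNN | Imputed | - | 0.092 | 0.098 | 0.097 | 0.097 | - | 51.68 | 53.19 | 53.82 | 54.48 | - | 73.66 | 75.19 | 76.70 | 77.89 |
KNN | Complete | 0.086 | - | - | - | - | 46.78 | - | - | - | - | 68.10 | - | - | - | - |
GBRT | Missing | - | 0.214 | 0.308 | 0.449 | 0.737 | - | 86.4 | 107.3 | 143.5 | 202.6 | - | 113.4 | 136.2 | 176.2 | 244.5 |
GBRT | Imputed | - | 0.092 | 0.099 | 0.101 | 0.104 | - | 51.41 | 54.10 | 56.17 | 56.63 | - | 71.64 | 73.46 | 79.58 | 78.84 |
GBRT | Complete | 0.128 | - | - | - | - | 64.53 | - | - | - | - | 88.54 | - | - | - | - |
RF | Missing | - | 0.225 | 0.412 | 0.631 | 0.956 | - | 88.7 | 125.7 | 176.5 | 238.4 | - | 115.7 | 159.2 | 213.7 | 279.5 |
RF | Imputed | - | 0.088 | 0.093 | 0.094 | 0.101 | - | 50.46 | 52.39 | 52.60 | 56.43 | - | 69.09 | 73.05 | 72.01 | 79.61 |
RF | Complete | 0.111 | - | - | - | - | 58.35 | - | - | - | - | 80.67 | - | - | - | - |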