Journal of South China University of Technology (Natural Science Edition), 2025, Vol. 53, Issue (10): 14-28. doi: 10.12141/j.issn.1000-565X.240378
• Traffic Safety •
ZHANG Zhiqing, YU Xiaozheng, ZHU Leipeng, SUN Yufeng, LI Yixin
Received: 2024-07-20
Online: 2025-10-25
Published: 2025-05-06
About author: ZHANG Zhiqing (born 1965), male, Ph.D., professor, mainly engaged in road safety research. E-mail: zhangzhiqing@bjut.edu.cn
ZHANG Zhiqing, YU Xiaozheng, ZHU Leipeng, SUN Yufeng, LI Yixin. Identifying Key Causes of Accidents for Autonomous Vehicles Based on CTGAN[J]. Journal of South China University of Technology(Natural Science Edition), 2025, 53(10): 14-28.
Table 1
Statistical characteristics of variables
| Subsystem | Variable | Category | Frequency | Proportion/% |
|---|---|---|---|---|
| Dependent variable | Highest injury severity (H_I_S) | 0: No injury | 680 | 92.14 |
| | | 1: Injury | 58 | 7.86 |
| Time | Crash month (C_M) | 0: Spring | 115 | 15.58 |
| | | 1: Summer | 203 | 27.51 |
| | | 2: Autumn | 246 | 33.33 |
| | | 3: Winter | 174 | 23.58 |
| | Crash day of week (C_W) | 0: Weekend | 174 | 23.58 |
| | | 1: Weekday | 564 | 76.42 |
| | Crash time (C_H) | 0: Morning peak | 77 | 10.43 |
| | | 1: Evening peak | 102 | 13.82 |
| | | 2: Other | 559 | 75.75 |
| Human | Driver type (D_T) | 0: Consumer | 708 | 95.90 |
| | | 1: Commercial/testing | 30 | 4.00 |
| Vehicle | Mileage (M_L) | 0: ≤ 50,000 | 539 | 73.00 |
| | | 1: > 50,000 | 168 | 22.80 |
| | Collision object (C_T) | 0: Passenger car | 191 | 25.88 |
| | | 1: Truck/van | 64 | 8.67 |
| | | 2: Motorcycle | 4 | 0.54 |
| | | 3: Pedestrian | 5 | 0.68 |
| | | 4: Fixed object | 185 | 25.07 |
| | | 5: Other | 289 | 39.16 |
| | Pre-crash movement (P_M) | 0: Proceeding forward | 463 | 62.74 |
| | | 1: U-turn | 22 | 2.98 |
| | | 2: Lane departure | 49 | 6.64 |
| | | 3: Lane change | 12 | 1.63 |
| | | 4: Stopped/parked | 19 | 2.57 |
| | | 5: Other | 23 | 3.12 |
| | Airbag deployed (A_B) | 0: Yes | 116 | 15.72 |
| | | 1: No | 622 | 84.28 |
| | Pre-crash speed (P_S) | 0: ≤ 32.2 km/h | 104 | 14.09 |
| | | 1: > 32.2 km/h and ≤ 64.4 km/h | 187 | 25.34 |
| | | 2: > 64.4 km/h and < 96.6 km/h | 160 | 21.68 |
| | | 3: ≥ 96.6 km/h | 156 | 21.14 |
| Road | Road type (R_T) | 0: Highway | 432 | 58.54 |
| | | 1: Street | 92 | 12.47 |
| | | 2: Intersection | 69 | 9.35 |
| | | 3: Parking lot | 2 | 0.27 |
| | | 4: Rural road | 30 | 4.07 |
| | | 5: Other | 113 | 15.31 |
| | Road surface condition (R_S) | 0: Dry | 415 | 56.23 |
| | | 1: Snow/slush/ice | 7 | 0.95 |
| | | 2: Wet | 123 | 16.67 |
| | | 3: Other | 193 | 26.15 |
| | Crash description (R_D) | 0: No unusual conditions | 463 | 62.74 |
| | | 1: Traffic incident | 26 | 3.52 |
| | | 2: Work zone | 12 | 1.63 |
| | | 3: Missing/unclear signs and markings | 3 | 0.41 |
| | | 4: Other | 30 | 4.07 |
| Environment | Lighting condition (L_T) | 0: Daylight | 307 | 41.60 |
| | | 1: Dawn/dusk | 33 | 4.47 |
| | | 2: Dark | 204 | 27.64 |
| | Weather condition (W_T) | 0: Clear | 355 | 48.10 |
| | | 1: Snow | 7 | 0.95 |
| | | 2: Cloudy | 62 | 8.40 |
| | | 3: Fog | 1 | 0.14 |
| | | 4: Rain | 106 | 14.36 |
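
The class imbalance in Table 1 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the counts are transcribed from Table 1, while the variable names (`counts`, `listed`, `unlisted`) are hypothetical. It also shows that the listed categories of some variables do not add up to the full sample; the source does not say why, so treating the gap as missing or unreported values is an assumption.

```python
# Frequencies of the dependent variable H_I_S, transcribed from Table 1.
counts = {"no_injury": 680, "injury": 58}

total = sum(counts.values())
# Share of each class in percent, rounded as in the table.
shares = {label: round(100 * n / total, 2) for label, n in counts.items()}
# Majority-to-minority ratio: how many uninjured cases per injury case.
imbalance_ratio = counts["no_injury"] / counts["injury"]

# Listed category frequencies for a few explanatory variables (Table 1).
listed = {
    "M_L": [539, 168],
    "P_M": [463, 22, 49, 12, 19, 23],
    "P_S": [104, 187, 160, 156],
    "R_T": [432, 92, 69, 2, 30, 113],
}
# Records not covered by any listed category (assumed to be missing values).
unlisted = {name: total - sum(freqs) for name, freqs in listed.items()}

print(total, shares, round(imbalance_ratio, 2))
print(unlisted)
```

Under these counts, injury cases make up under 8% of the sample, which is the imbalance the data augmentation methods in Table 7 are meant to address.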
Table 5
Classification algorithm hyperparameter tuning results
| Classifier | Hyperparameter | Optimal value |
|---|---|---|
| LR | penalty | L1 |
| | C | 1.087 |
| | solver | liblinear |
| DT | max_depth | 44 |
| | min_samples_split | 2 |
| | min_samples_leaf | 1 |
| RF | n_estimators | 165 |
| | max_depth | 18 |
| | max_features | 4 |
| | min_samples_leaf | 1 |
| | min_samples_split | 19 |
| | criterion | gini |
| XGB | learning_rate | 0.276 |
| | n_estimators | 275 |
| | subsample | 0.857 |
| | max_depth | 32 |
| SVM | C | 3.029 |
| | kernel | rbf |
| | gamma | 0.3126 |
| | probability | True |
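
The parameter names in Table 5 match the keyword arguments of scikit-learn and xgboost estimators, so the tuned settings could plausibly be reproduced as below. This is a sketch under assumptions, not the authors' code: the source does not state which libraries were used, `TUNED_PARAMS` and `build_models` are hypothetical names, and `random_state` is added here only for reproducibility.

```python
# Tuned values transcribed from Table 5. scikit-learn expects the
# L1 penalty to be spelled in lowercase ("l1").
TUNED_PARAMS = {
    "LR": {"penalty": "l1", "C": 1.087, "solver": "liblinear"},
    "DT": {"max_depth": 44, "min_samples_split": 2, "min_samples_leaf": 1},
    "RF": {
        "n_estimators": 165, "max_depth": 18, "max_features": 4,
        "min_samples_leaf": 1, "min_samples_split": 19, "criterion": "gini",
    },
    "XGB": {
        "learning_rate": 0.276, "n_estimators": 275,
        "subsample": 0.857, "max_depth": 32,
    },
    "SVM": {"C": 3.029, "kernel": "rbf", "gamma": 0.3126, "probability": True},
}


def build_models(random_state=0):
    """Instantiate the five classifiers with the Table 5 settings.

    Imports are done lazily so that the parameter table can be inspected
    without scikit-learn or xgboost installed. `random_state` is an
    assumption of this sketch, not a value reported in the source.
    """
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier
    from xgboost import XGBClassifier

    classes = {
        "LR": LogisticRegression, "DT": DecisionTreeClassifier,
        "RF": RandomForestClassifier, "XGB": XGBClassifier, "SVM": SVC,
    }
    return {
        name: cls(random_state=random_state, **TUNED_PARAMS[name])
        for name, cls in classes.items()
    }
```

Setting `probability=True` for the SVM enables `predict_proba`, which is one way to obtain the scores needed for an AUC comparison like the one in Table 7.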
Table 7
Classification performance of each classifier under each data augmentation method
| Classifier | Data augmentation algorithm | Recall | Specificity | Weighted F1 score | AUC |
|---|---|---|---|---|---|
| LR | None | 0.00 | 1.00 | 0.89 | 0.72 |
| | CTGAN | 0.67 | 0.62 | 0.71 | 0.65 |
| | CopulaGAN | 0.41 | 0.75 | 0.78 | 0.66 |
| | SMOTE | 0.47 | 0.81 | 0.82 | 0.68 |
| | ADASYN | 0.53 | 0.81 | 0.83 | 0.68 |
| DT | None | 0.06 | 0.95 | 0.87 | 0.45 |
| | CTGAN | 0.71 | 0.75 | 0.80 | 0.73 |
| | CopulaGAN | 0.47 | 0.74 | 0.78 | 0.60 |
| | SMOTE | 0.35 | 0.88 | 0.86 | 0.62 |
| | ADASYN | 0.41 | 0.89 | 0.87 | 0.70 |
| RF | None | 0.24 | 0.97 | 0.90 | 0.81 |
| | CTGAN | 0.82 | 0.84 | 0.87 | 0.86 |
| | CopulaGAN | 0.35 | 0.77 | 0.79 | 0.70 |
| | SMOTE | 0.24 | 0.96 | 0.90 | 0.82 |
| | ADASYN | 0.29 | 0.95 | 0.90 | 0.79 |
| XGB | None | 0.18 | 0.99 | 0.91 | 0.80 |
| | CTGAN | 0.65 | 0.81 | 0.84 | 0.80 |
| | CopulaGAN | 0.29 | 0.81 | 0.81 | 0.62 |
| | SMOTE | 0.35 | 0.96 | 0.91 | 0.77 |
| | ADASYN | 0.47 | 0.94 | 0.91 | 0.79 |
| SVM | None | 0.35 | 0.92 | 0.88 | 0.50 |
| | CTGAN | 0.88 | 0.74 | 0.80 | 0.74 |
| | CopulaGAN | 0.65 | 0.72 | 0.78 | 0.70 |
| | SMOTE | 0.18 | 0.96 | 0.89 | 0.78 |
| | ADASYN | 0.11 | 0.97 | 0.89 | 0.79 |
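
One way to read Table 7 is to ask, for each classifier, which augmentation method yields the highest recall on the injury class. The sketch below transcribes the recall column and does that lookup; the names `METHODS`, `RECALL`, `best_by_recall`, and `recall_gain` are hypothetical and only serve the illustration.

```python
# Recall per classifier, transcribed from Table 7, in the order of METHODS.
METHODS = ["None", "CTGAN", "CopulaGAN", "SMOTE", "ADASYN"]
RECALL = {
    "LR": [0.00, 0.67, 0.41, 0.47, 0.53],
    "DT": [0.06, 0.71, 0.47, 0.35, 0.41],
    "RF": [0.24, 0.82, 0.35, 0.24, 0.29],
    "XGB": [0.18, 0.65, 0.29, 0.35, 0.47],
    "SVM": [0.35, 0.88, 0.65, 0.18, 0.11],
}

# Augmentation method with the highest recall for each classifier.
best_by_recall = {
    clf: METHODS[max(range(len(METHODS)), key=values.__getitem__)]
    for clf, values in RECALL.items()
}

# Recall gain of CTGAN over the unaugmented baseline ("None").
ctgan = METHODS.index("CTGAN")
baseline = METHODS.index("None")
recall_gain = {
    clf: round(values[ctgan] - values[baseline], 2)
    for clf, values in RECALL.items()
}

print(best_by_recall)
print(recall_gain)
```

In the transcribed data CTGAN gives the highest recall for all five classifiers, while the specificity column of the same table is lower for CTGAN than for the unaugmented baseline in every case, so the gain in detecting injury crashes comes with more false alarms.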
Table 8
Comparison of computational complexity and running efficiency of data augmentation algorithms
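| Data augmentation algorithm | Training complexity | Generation complexity | Running efficiency | Computing resources |
|---|---|---|---|---|
| CTGAN | $O(E\sigma h^2\tau)=O(7.3\times10^{10})$ | $O(\sigma h^2)=O(131\,072)$ | 4 min 7 s | High (GPU) |
| CopulaGAN | $O(\tau^2+d^3+\tau d)=O(1.95\times10^{5})$ | $O(\tau d+d^2)=O(6\,675)$ | 3 min 51 s | Medium (CPU) |
| SMOTE | $O(\tau kd)=O(32\,250)$ | $O(\tau kd)=O(32\,250)$ | <1 s | Low (CPU) |
| ADASYN | $O(\tau kd+\tau)=O(32\,680)$ | $O(\tau kd)=O(32\,250)$ | <1 s | Low (CPU) |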
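
The numeric values in Table 8 can be reproduced by plugging concrete sizes into the complexity expressions. The sizes below are not stated in this excerpt; they are back-solved from the table itself and should be read as assumptions. The difference between the two ADASYN entries gives τ = 430, the CopulaGAN generation cost then gives d = 15, and the SMOTE cost gives k = 5. Only the product σh² = 131 072 is fixed by the table, so σ = 2 and h = 256 are one consistent choice, and E = 1300 is the value that matches the rounded CTGAN training cost. Reading τ as the number of training samples, d as the number of features, k as the number of neighbours, h as the hidden width, σ as the number of layers, and E as the number of epochs is likewise an interpretation.

```python
# Sizes back-solved from Table 8 (assumptions, see the text above).
tau, k, d = 430, 5, 15       # samples, neighbours, features
sigma, h, E = 2, 256, 1300   # layers, hidden width, epochs

costs = {
    "CTGAN": {
        "train": E * sigma * h**2 * tau,
        "generate": sigma * h**2,
    },
    "CopulaGAN": {
        "train": tau**2 + d**3 + tau * d,
        "generate": tau * d + d**2,
    },
    "SMOTE": {
        "train": tau * k * d,
        "generate": tau * k * d,
    },
    "ADASYN": {
        "train": tau * k * d + tau,
        "generate": tau * k * d,
    },
}

for name, cost in costs.items():
    print(f"{name:10s} train={cost['train']:.3g} generate={cost['generate']:.3g}")
```

With these sizes the CTGAN training cost is about five orders of magnitude above CopulaGAN and six above SMOTE and ADASYN, yet the reported running times of CTGAN (4 min 7 s) and CopulaGAN (3 min 51 s) are close, so the operation counts are best read as rough scaling estimates rather than wall-clock predictions.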