Journal of South China University of Technology (Natural Science Edition), 2024, Vol. 52, Issue (5): 10-19. doi: 10.12141/j.issn.1000-565X.230078

• Traffic & Transportation Engineering •

Recognition Model of Highway Toll Evasion Behavior Considering Cost-Sensitivity

ZHAO Jiandong1,2, XU Huiling1, LÜ Xing1, LI Pingan3, HUANG Shiyin3

  1. School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China
  2. Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Beijing Jiaotong University, Beijing 100044, China
  3. TransChina (Beijing) Technology Co., Ltd., Beijing 100088, China
  • Received: 2023-03-03 Online: 2024-05-25 Published: 2023-06-19
  • About author: ZHAO Jiandong (born 1975), male, Ph.D., professor, whose research mainly covers transportation big data and intelligent connected transportation.
  • Supported by:
    the National Natural Science Foundation of China (72288101); the National Key Research and Development Program (2019YFB1600200)

Abstract:

To improve the efficiency of toll evasion inspection on highways, this paper proposes a vehicle toll evasion recognition model based on ETC (Electronic Toll Collection) toll data that combines KNN (K-Nearest Neighbor), the adaptive boosting (Adaboost) algorithm, and a cost-sensitive learning mechanism. Firstly, because the original ETC toll flow data are large in volume and redundant, rules for data discretization and standardization were developed to repair and standardize the data, and two types of toll evasion features were then extracted. Secondly, by analyzing the ETC data set, seven types of toll evasion, such as large vehicles with small tags, were selected as the main research objects. Thirdly, to address the inefficient classification caused by the high dimensionality of the evasion data, the best subset of features characterizing evasion was selected through Pearson and Spearman correlation analysis and ReliefF importance analysis. Fourthly, to address the overfitting caused by the class imbalance between toll evasion vehicles and normal vehicles, KNN was used as the base classifier in the Adaboost algorithm, Tomek Links undersampling was applied to reduce the ambiguity of the boundaries between classes, and a cost-sensitive learning mechanism was introduced to increase the model's emphasis on the minority class (toll evasion vehicles) and reduce its bias toward the majority class (normal vehicles). Finally, the performance of the KNN-Adaboost model with the cost-sensitive learning mechanism was verified by comparing how well different classification models recognize the various types of evasion events.
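As a rough illustration of the Tomek Links undersampling step mentioned above, the following minimal Python sketch removes majority-class samples that are mutual nearest neighbours of a sample from another class. It is not the authors' implementation: the function name `tomek_link_undersample`, the Euclidean distance, and the toy two-feature data are all assumptions made for illustration.

```python
from math import dist


def tomek_link_undersample(X, y, majority_label):
    """Drop majority-class samples that take part in a Tomek link.

    A Tomek link is a pair of samples with different labels that are
    each other's nearest neighbour (Euclidean distance assumed here).
    """
    n = len(X)
    # Index of each sample's nearest neighbour, excluding the sample itself.
    nearest = [
        min((j for j in range(n) if j != i), key=lambda j: dist(X[i], X[j]))
        for i in range(n)
    ]
    drop = set()
    for i, j in enumerate(nearest):
        if nearest[j] == i and y[i] != y[j]:
            # Remove only the majority-class member(s) of the link.
            for k in (i, j):
                if y[k] == majority_label:
                    drop.add(k)
    keep = [i for i in range(n) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: label 0 = normal vehicle (majority), 1 = evasion.
X = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.3), (1.0, 1.0), (1.1, 1.0), (3.0, 3.0)]
y = [0, 0, 0, 0, 1, 1]
X_res, y_res = tomek_link_undersample(X, y, majority_label=0)
print(y_res)
```

In this toy set, the normal-vehicle sample at (1.0, 1.0) sits right next to an evasion sample, so it is removed and the class boundary becomes cleaner. The brute-force neighbour search is quadratic in the number of samples and is only suitable for small examples.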
The results show that the proposed model achieves a precision of 0.98, a recall of 0.96, an F1-score of 0.97, and a Kappa coefficient of 0.95, indicating that it handles the class imbalance problem better than the other models and recognizes the minority class more accurately. It can serve as a reference for improving the efficiency of highway toll inspection.
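For readers unfamiliar with the four reported metrics, the sketch below shows how precision, recall, F1-score, and Cohen's Kappa coefficient can be computed from a binary confusion matrix. The binary setting, the function name `binary_metrics`, and the counts in the example are assumptions for illustration only; they are not the paper's data, and the paper may average the metrics over several evasion types.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (precision, recall, f1, kappa) for a binary confusion matrix.

    tp/fp/fn/tn are counts with the minority class treated as positive.
    """
    n = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Observed agreement between predictions and true labels.
    p_o = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    p_e = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (p_o - p_e) / (1 - p_e)
    return precision, recall, f1, kappa


# Hypothetical counts for an imbalanced test set (not taken from the paper).
precision, recall, f1, kappa = binary_metrics(tp=40, fp=10, fn=20, tn=930)
print(round(precision, 3), round(recall, 3), round(f1, 3), round(kappa, 3))
```

With these made-up counts, plain accuracy would be 0.97 even though one third of the evasion samples are missed. This is why recall, F1-score, and Kappa are more informative than accuracy on imbalanced data.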

Key words: highway transport, ensemble learning, machine learning, cost-sensitivity, feature selection
