Traffic & Transportation Engineering


Reconstruction of Urban Vehicle Path Chain Based on Deep Inverse Reinforcement Learning

  • 1. College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, Zhejiang, China
    2. Polytechnic Institute, Zhejiang University, Hangzhou 310058, Zhejiang, China
    3. Ocean College, Zhejiang University, Hangzhou 310058, Zhejiang, China
WANG Fu-jian (b. 1969), male, Ph.D., associate professor; his research covers traffic flow theory and intelligent transportation systems.

Received date: 2022-08-26

  Online published: 2023-01-19

Supported by

the Key Program of the National Natural Science Foundation of China (52131202)


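Cite this article

WANG Fu-jian, CHENG Hui-ling, MA Dong-fang, et al. Reconstruction of urban vehicle path chain based on deep inverse reinforcement learning[J]. Journal of South China University of Technology (Natural Science Edition), 2023, 51(7): 120-128. DOI: 10.12141/j.issn.1000-565X.220550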

Abstract

As urban traffic monitoring systems have improved, large volumes of license plate recognition data have been archived. Such data offer strong temporal continuity, wide spatial coverage, and diverse sample types, and so provide an information basis for urban traffic research. However, because of cost and technical constraints, the collected license plate records are discontinuous in time and space, which limits their application. To address this problem, this paper proposes a path chain extraction scheme that distinguishes complete path chains from incomplete ones for single trips, together with a reconstruction algorithm for urban vehicle travel path chains based on deep inverse reinforcement learning. The algorithm samples complete path chains to obtain expert examples, mines these examples with deep inverse reinforcement learning to fit the latent route-choice characteristics in the form of a nonlinear reward function, and uses that function to guide an agent in completing the missing path chains autonomously, thereby reconstructing them. Experimental validation on a local road network in Xiaoshan District, Hangzhou, shows that the proposed algorithm performs stably, with an average accuracy of 95%, and that its accuracy stays above 92% even when a large share of the chain is missing, a considerable advantage over traditional algorithms. The paper also analyzes how the location distribution and the number of expert examples affect performance, verifying the strong generalization ability of the proposed algorithm.
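The two steps described in the abstract, separating complete from incomplete path chains and then filling the gaps under the guidance of a reward function, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the six-intersection network `ADJ`, the fixed table `REWARD` (which stands in for the nonlinear reward function that the paper learns with deep inverse reinforcement learning), and the function names. A Dijkstra search with step cost `1 - reward` is swapped in for the agent's learned policy.

```python
import heapq

# Hypothetical toy road network: intersection -> directly connected intersections.
ADJ = {
    "A": ["B", "D"],
    "B": ["A", "C", "E"],
    "C": ["B", "F"],
    "D": ["A", "E"],
    "E": ["B", "D", "F"],
    "F": ["C", "E"],
}

# Assumed per-intersection reward in [0, 1]. In the paper this role is played by
# a learned nonlinear reward function, not by a hand-written table.
REWARD = {"A": 0.5, "B": 0.9, "C": 0.8, "D": 0.2, "E": 0.3, "F": 0.5}


def is_complete(chain, adj):
    """A chain is complete when every consecutive pair of detections is adjacent."""
    return all(b in adj[a] for a, b in zip(chain, chain[1:]))


def fill_gap(start, goal, adj, reward):
    """Dijkstra with step cost 1 - reward[next]; favors short, high-reward routes."""
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path
        if node in settled:
            continue
        settled.add(node)
        for nxt in adj[node]:
            if nxt not in settled:
                heapq.heappush(queue, (cost + 1.0 - reward[nxt], nxt, path + [nxt]))
    return None  # goal is unreachable from start


def reconstruct(chain, adj, reward):
    """Insert inferred intermediate intersections wherever the chain has a gap."""
    if not chain:
        return []
    rebuilt = [chain[0]]
    for a, b in zip(chain, chain[1:]):
        if b in adj[a]:
            rebuilt.append(b)  # already adjacent, nothing to fill
        else:
            segment = fill_gap(a, b, adj, reward)
            if segment is None:
                raise ValueError(f"no route between {a} and {b}")
            rebuilt.extend(segment[1:])  # skip the first node, already in rebuilt
    return rebuilt


observed = ["A", "C", "F"]  # the detection at B is missing
print(is_complete(observed, ADJ))
print(reconstruct(observed, ADJ, REWARD))
```

In this toy setting the gap between A and C is filled through B because B carries the highest assumed reward. In the paper the reward is fitted from expert examples sampled from complete path chains, so the preferred route reflects observed route-choice behavior and not a hand-picked table.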
