Reconstruction of Urban Vehicle Path Chain Based on Deep Inverse Reinforcement Learning

WANG Fujian, CHENG Huiling, MA Dongfang, et al.

doi:10.12141/j.issn.1000-565X.220550

Journal of South China University of Technology (Natural Science Edition), 2023, Vol. 51, Issue 7: 120-128

Traffic and Transportation Engineering

1. College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, Zhejiang, China
2. Polytechnic Institute, Zhejiang University, Hangzhou 310058, Zhejiang, China
3. Ocean College, Zhejiang University, Hangzhou 310058, Zhejiang, China

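WANG Fujian (born 1969), male, Ph.D., associate professor. His research focuses on traffic flow theory and intelligent transportation systems.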

Received date: 2022-08-26

Online published: 2023-01-19

Supported by the Key Program of the National Natural Science Foundation of China (52131202)

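Cite this article

WANG Fujian, CHENG Huiling, MA Dongfang, et al. Reconstruction of urban vehicle path chain based on deep inverse reinforcement learning[J]. Journal of South China University of Technology (Natural Science Edition), 2023, 51(7): 120-128. DOI: 10.12141/j.issn.1000-565X.220550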

Abstract

With the improvement of urban traffic monitoring systems, large volumes of license plate recognition data are now stored. Such data offer strong temporal continuity, wide spatial coverage, and diverse sample types, providing an information foundation for urban traffic research. However, owing to cost and technical constraints, the license plate data actually collected are discontinuous in time and space, which limits their application. To address this problem, this paper proposes a path chain extraction scheme that distinguishes complete path chains from missing path chains for a single trip, together with an algorithm based on deep inverse reinforcement learning for reconstructing urban vehicle travel path chains. The algorithm samples complete path chains to obtain expert examples, mines these examples with deep inverse reinforcement learning, and fits the latent route-choice characteristics in the form of a nonlinear reward function. This reward function guides an agent to fill in missing path chains autonomously, thereby reconstructing the missing path chains of vehicle trips. Experimental validation on a local road network in Xiaoshan District, Hangzhou, shows that the proposed algorithm is stable, with an average accuracy of 95%, and that its accuracy remains above 92% even when a large share of the path chain is missing, a considerable advantage over traditional algorithms. The paper also analyzes how the location distribution and number of expert examples affect algorithm performance, verifying the generalization ability of the proposed algorithm.

Key words: urban road network; license plate recognition; deep inverse reinforcement learning; data processing; path chain reconstruction
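
The distinction between complete and missing path chains, and the idea of letting a reward function guide the completion of a gap, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy network `ROAD_NETWORK`, the hand-written link rewards `REWARD` (a stand-in for the nonlinear reward function that the paper learns with deep inverse reinforcement learning), the adjacency-based completeness test, and all function names. A best-first search over summed link rewards replaces the paper's agent, so this shows the general idea only, not the authors' algorithm.

```python
import heapq

# Hypothetical toy road network: intersection -> directly reachable intersections.
ROAD_NETWORK = {
    "A": ["B", "D"],
    "B": ["C", "E"],
    "C": ["F"],
    "D": ["E"],
    "E": ["F"],
    "F": [],
}

# Hypothetical link rewards (all <= 0); a learned reward network would supply these.
REWARD = {
    ("A", "B"): -1.0, ("A", "D"): -2.0, ("B", "C"): -1.0, ("B", "E"): -1.0,
    ("C", "F"): -1.0, ("D", "E"): -1.0, ("E", "F"): -3.0,
}

def is_complete(chain, network):
    """Assumed criterion: every consecutive pair of detections is a road link."""
    return all(b in network.get(a, []) for a, b in zip(chain, chain[1:]))

def fill_gap(start, goal, network, reward):
    """Return the path with the highest total reward from start to goal.

    Runs Dijkstra's algorithm on cost = -reward, which requires rewards <= 0.
    """
    frontier = [(0.0, start, [start])]
    settled = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in settled:
            continue
        settled.add(node)
        for nxt in network.get(node, []):
            if nxt not in settled:
                heapq.heappush(frontier, (cost - reward[(node, nxt)], nxt, path + [nxt]))
    return None  # goal is unreachable from start

def reconstruct(chain, network, reward):
    """Fill every non-adjacent pair in the chain with a reward-guided path."""
    result = [chain[0]]
    for a, b in zip(chain, chain[1:]):
        if b in network.get(a, []):
            result.append(b)
        else:
            segment = fill_gap(a, b, network, reward)
            if segment is None:
                return None
            result.extend(segment[1:])
    return result
```

For example, `is_complete(["A", "F"], ROAD_NETWORK)` is false because no link joins A and F directly, and `reconstruct(["A", "F"], ROAD_NETWORK, REWARD)` fills the gap with the highest-reward route through the toy network.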
