Reconstruction of Urban Vehicle Path Chain Based on Deep Inverse Reinforcement Learning

Journal of South China University of Technology (Natural Science Edition), 2023, Vol. 51, Issue (7): 120-128. doi: 10.12141/j.issn.1000-565X.220550

Special topic: 2023 Traffic and Transportation Engineering

王福建¹ 程慧玲² 马东方³ 王殿海¹

WANG Fujian¹ CHENG Huiling² MA Dongfang³ WANG Dianhai¹

1. College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, Zhejiang, China
2. Polytechnic Institute, Zhejiang University, Hangzhou 310058, Zhejiang, China
3. Ocean College, Zhejiang University, Hangzhou 310058, Zhejiang, China

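Received: 2022-08-26  Published: 2023-07-25  Online: 2023-01-20
Contact: WANG Fujian (born 1969), male, Ph.D., associate professor; his research covers traffic flow theory and intelligent transportation systems. E-mail: ciewfj@zju.edu.cn
About author: WANG Fujian (born 1969), male, Ph.D., associate professor; his research covers traffic flow theory and intelligent transportation systems.
Supported by: the Key Program of the National Natural Science Foundation of China (52131202)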

Abstract:

With the improvement of urban traffic monitoring systems, a large volume of license plate recognition data has been stored. Such data offer strong temporal continuity, wide spatial coverage, and diverse sample types, and thus provide an information basis for urban traffic research. However, owing to cost and technical constraints, the collected license plate data are discontinuous in time and space, which limits their application. To address this problem, this paper proposes a path chain extraction scheme that distinguishes complete path chains from missing path chains for single trips, together with a reconstruction algorithm for urban vehicle travel path chains based on deep inverse reinforcement learning. The algorithm samples complete path chains to obtain expert examples, mines these examples with deep inverse reinforcement learning, and fits the latent route choice characteristics in the form of a nonlinear reward function. This reward function guides the agent to complete missing path chains autonomously, thereby reconstructing the missing path chains of vehicle trips. Experiments on a local road network in Xiaoshan District, Hangzhou, show that the proposed algorithm is stable, with an average accuracy of 95%, and that its accuracy stays above 92% even when a large part of the path chain is missing, a clear advantage over traditional algorithms. The paper also analyzes how the location distribution and number of expert examples affect performance, which verifies the generalization ability of the proposed algorithm.

Key words: urban road network, license plate recognition, deep inverse reinforcement learning, data processing, path chain reconstruction
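
To illustrate the final step described in the abstract, where a learned reward guides the completion of a missing path chain, the following minimal Python sketch fills the gap between two observed road segments by choosing the walk with the highest cumulative reward. It is not the authors' implementation: the paper trains an agent with deep inverse reinforcement learning, whereas this sketch substitutes a fixed reward lookup table and a simple dynamic program over walk length. The names `complete_gap`, `SUCCESSORS`, and `REWARD`, as well as the toy network and its reward values, are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def complete_gap(successors: Dict[str, List[str]],
                 reward: Dict[str, float],
                 start: str,
                 goal: str,
                 max_steps: int) -> Optional[List[str]]:
    """Return the highest-reward segment sequence from start to goal
    that uses at most max_steps transitions, or None if none exists."""
    # best[s] holds (total reward, path) of the best walk of the current
    # length that ends at segment s.
    best: Dict[str, Tuple[float, List[str]]] = {start: (reward[start], [start])}
    answer: Optional[Tuple[float, List[str]]] = None
    for _ in range(max_steps):
        nxt: Dict[str, Tuple[float, List[str]]] = {}
        for seg, (total, path) in best.items():
            for nxt_seg in successors.get(seg, []):
                cand_total = total + reward[nxt_seg]
                if nxt_seg not in nxt or cand_total > nxt[nxt_seg][0]:
                    nxt[nxt_seg] = (cand_total, path + [nxt_seg])
        best = nxt
        # Keep the best walk that reaches the goal at any length so far.
        if goal in best and (answer is None or best[goal][0] > answer[0]):
            answer = best[goal]
    return answer[1] if answer is not None else None


# Toy example (hypothetical network and reward values, not from the paper).
SUCCESSORS = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
REWARD = {"A": 0.0, "B": -1.0, "C": -0.2, "D": 0.0}
```

With these toy values, `complete_gap(SUCCESSORS, REWARD, "A", "D", 3)` returns `["A", "C", "D"]`, because segment C carries a higher reward than segment B. In the paper's setting the reward would come from the trained nonlinear reward function rather than from a table.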

CLC number: U491

WANG Fujian, CHENG Huiling, MA Dongfang, et al. Reconstruction of Urban Vehicle Path Chain Based on Deep Inverse Reinforcement Learning[J]. Journal of South China University of Technology (Natural Science Edition), 2023, 51(7): 120-128.

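Figures/Tables (13)

Table 3  Variables and descriptions of the Markov decision process

Variable	Description
State $s$	Road segment
State features f_et(s)	Segment length, segment grade, longitude and latitude of the segment center, travel time, standard deviation of travel time, and whether the endpoint is a signalized intersection
Action $a$	Turn left, turn right, go straight, U-turn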
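
The formulation in the table above can be sketched as a small data structure. The Python sketch below is only an illustration under stated assumptions: the names `Action`, `Segment`, `TRANSITIONS`, and `step`, the integer encoding of the segment grade, and the toy transition table are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Tuple


class Action(Enum):
    """The four actions listed in the table."""
    LEFT = "left"
    RIGHT = "right"
    STRAIGHT = "straight"
    U_TURN = "u_turn"


@dataclass(frozen=True)
class Segment:
    """A state: one road segment with the features listed in the table."""
    seg_id: str
    length_m: float
    grade: int                # road class as an integer code (assumption)
    center_lon: float
    center_lat: float
    travel_time_s: float
    travel_time_std_s: float
    ends_at_signal: bool      # endpoint is a signalized intersection

    def features(self) -> Tuple[float, ...]:
        """Flatten the state features into a numeric vector."""
        return (self.length_m, float(self.grade),
                self.center_lon, self.center_lat,
                self.travel_time_s, self.travel_time_std_s,
                float(self.ends_at_signal))


# Hypothetical deterministic transitions: (segment id, action) -> next segment id.
TRANSITIONS: Dict[Tuple[str, Action], str] = {
    ("A", Action.STRAIGHT): "B",
    ("A", Action.LEFT): "C",
    ("B", Action.RIGHT): "D",
}


def step(seg_id: str, action: Action) -> Optional[str]:
    """Return the next segment, or None if the turn is not available."""
    return TRANSITIONS.get((seg_id, action))
```

A feature vector of this kind is the sort of input a reward network could take, while `step` plays the role of the environment dynamics on the road network.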

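Anomaly type	Cause	Indicator
Invalid data	Unreadable plate; missed detection by the device	Plate field is empty
Erroneous data	External interference; device failure	Garbled data and detection errors
Duplicate data	Type 1: device failure	Multiple passing records with all fields identical
Duplicate data	Type 2: repeated detection; wide-angle mis-capture	Plate, checkpoint name, etc. are identical; record times differ slightly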
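
The anomaly indicators in the table above translate naturally into a cleaning pass. The following Python sketch is a minimal illustration, not the paper's procedure: the record keys `plate`, `checkpoint`, and `time`, the function name `clean_records`, and the 5-second window for near-duplicates are all assumptions. Checks for erroneous (garbled) data are left out because they depend on the plate format.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

Record = Dict[str, object]  # assumed keys: "plate", "checkpoint", "time"


def clean_records(records: List[Record],
                  window: timedelta = timedelta(seconds=5)) -> List[Record]:
    """Drop empty-plate records, exact duplicates, and near-duplicates."""
    # Invalid data: the plate field is empty.
    valid = [r for r in records if r.get("plate")]
    # Sort so that records of one vehicle at one checkpoint are adjacent.
    valid.sort(key=lambda r: (r["plate"], r["checkpoint"], r["time"]))
    cleaned: List[Record] = []
    last_kept: Dict[Tuple[object, object], datetime] = {}
    for r in valid:
        key = (r["plate"], r["checkpoint"])
        prev = last_kept.get(key)
        # Duplicate data: same plate and checkpoint with an identical
        # timestamp (type 1) or a slightly different one (type 2).
        if prev is not None and r["time"] - prev <= window:
            continue
        last_kept[key] = r["time"]
        cleaned.append(r)
    return cleaned
```

The result is ordered by plate, checkpoint, and time, so it would need to be re-sorted by time per vehicle before path chains are extracted.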

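Missing type	Cause	Characteristics
Completely random missing	Missed or false detection by the device; too few samples	Missing values are scattered randomly with no specific pattern
Random channel missing	A checkpoint is out of service for a period of time	Some channels are missing at random, e.g., all data of a checkpoint in a certain period are missing
Systematic channel missing	No checkpoint recognition device is installed on the road segment	The spatiotemporal data of a checkpoint are entirely missing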
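
The three missing patterns in the table above can be pictured as masks over a period-by-checkpoint grid. The Python sketch below is an illustrative reading of the table, not code from the paper: the grid layout, the function names, and the choice of a contiguous outage for the random channel case are assumptions.

```python
import random
from typing import List

# mask[t][k] is True when period t at checkpoint k is observed.
Mask = List[List[bool]]


def full_mask(n_periods: int, n_checkpoints: int) -> Mask:
    """A grid in which every cell is observed."""
    return [[True] * n_checkpoints for _ in range(n_periods)]


def completely_random(mask: Mask, rate: float, rng: random.Random) -> Mask:
    """Drop each cell independently with probability `rate`."""
    return [[cell and rng.random() >= rate for cell in row] for row in mask]


def random_channel(mask: Mask, checkpoint: int, start: int, length: int) -> Mask:
    """One checkpoint is out of service for a contiguous run of periods."""
    out = [row[:] for row in mask]
    for t in range(start, min(start + length, len(out))):
        out[t][checkpoint] = False
    return out


def systematic_channel(mask: Mask, checkpoint: int) -> Mask:
    """A location without a device: its whole column is unobserved."""
    return random_channel(mask, checkpoint, 0, len(mask))
```

Each function returns a new mask and leaves its input unchanged, so the patterns can be combined or compared on the same base grid.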

Missing rate/%	MAE/s	MAPE/%
10	5.72	9.12
20	6.08	11.71
30	6.96	12.54
40	7.61	14.63
50	8.82	16.85
60	9.66	21.02

Missing rate/%	MAE/s	MAPE/%
10	5.80	9.30
20	6.77	11.48
30	7.13	12.91
40	7.74	15.19
50	8.56	17.21
60	10.28	22.38

Number of missing segments	MAE/s	MAPE/%
1	5.44	11.31
2	6.04	13.42
3	7.26	14.74
4	8.68	16.01
5	9.45	21.14
6	10.96	25.56
7	12.59	32.59
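
The tables above report MAE in seconds and MAPE in percent. This page does not spell out the formulas, so the Python sketch below assumes the standard definitions of mean absolute error and mean absolute percentage error; the function names `mae` and `mape` are illustrative.

```python
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error, in the unit of the inputs (seconds here)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    ratios = (abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * sum(ratios) / len(actual)
```

For example, with actual values of 50 s and 100 s and estimates of 45 s and 110 s, the MAE is 7.5 s and the MAPE is 10%.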

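Related articles (3)

[1]	徐建闽, 魏鑫, 林永杰, et al. Reconstruction of urban vehicle path chain based on gradient boosting decision tree[J]. Journal of South China University of Technology (Natural Science Edition), 2020, 48(7): 55-64.
[2]	秦钟, 徐建闽, 史胜利, 黄海英. License plate location method based on character corner information[J]. Journal of South China University of Technology (Natural Science Edition), 2006, 34(4): 19-23.
[3]	骆雪超, 刘桂雄, 冯云庆, 申柏华. A license plate recognition method based on license plate feature information[J]. Journal of South China University of Technology (Natural Science Edition), 2003, 31(4): 70-73.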