Journal of South China University of Technology (Natural Science Edition) ›› 2023, Vol. 51 ›› Issue (7): 120-128. doi: 10.12141/j.issn.1000-565X.220550

Special topic: Traffic and Transportation Engineering, 2023

• Traffic and Transportation Engineering •

基于深度逆向强化学习的城市车辆路径链重构

王福建¹ 程慧玲² 马东方³ 王殿海¹

  1. 浙江大学 建筑与工程学院, 浙江 杭州 310058
  2. 浙江大学 工程师学院, 浙江 杭州 310058
  3. 浙江大学 海洋学院, 浙江 杭州 310058
  • 收稿日期:2022-08-26 出版日期:2023-07-25 发布日期:2023-01-20
  • 通信作者: 王福建(1969-),男,博士,副教授,主要从事交通流理论、智能交通系统等的研究。 E-mail:ciewfj@zju.edu.cn
  • 作者简介:王福建(1969-),男,博士,副教授,主要从事交通流理论、智能交通系统等的研究。
  • 基金资助:
    国家自然科学基金重点资助项目(52131202)

Reconstruction of Urban Vehicle Path Chain Based on Deep Inverse Reinforcement Learning

WANG Fujian¹, CHENG Huiling², MA Dongfang³, WANG Dianhai¹

  1. College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, Zhejiang, China
  2. Polytechnic Institute, Zhejiang University, Hangzhou 310058, Zhejiang, China
  3. Ocean College, Zhejiang University, Hangzhou 310058, Zhejiang, China
  • Received: 2022-08-26 Published: 2023-07-25 Online: 2023-01-20
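  • Contact: WANG Fujian (born 1969), male, Ph.D., associate professor; his research covers traffic flow theory and intelligent transportation systems. E-mail: ciewfj@zju.edu.cn
  • About author: WANG Fujian (born 1969), male, Ph.D., associate professor; his research covers traffic flow theory and intelligent transportation systems.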
  • Supported by:
    Key Program of the National Natural Science Foundation of China (52131202)

摘要:

随着城市交通监测系统的完善,大量车牌识别数据被存储下来,该类数据具有时间连续性强、空间范围广、样本类型多的优点,为研究城市交通提供了信息基础。但受制于成本和技术,信息采集过程中采集的车牌数据在时空上并不连续,导致车牌识别数据的应用受限。有鉴于此,文中提出一套路径链提取方案,对单次出行的完整路径链和缺失路径链加以区分,并提出了一种基于深度逆向强化学习的城市车辆出行路径链重构算法。该算法对完整路径链进行采样获得专家示例,利用深度逆向强化学习挖掘专家示例,以非线性回报函数的形式拟合出潜藏的路径选择特性,指导智能体自主补全缺失的路径链,实现车辆出行缺失路径链的重构。将该算法在杭州市萧山区的局部路网中进行实验验证,结果表明,所提出的重构算法具有良好的稳定性能,其平均准确度可达95%,且在缺失程度较大的情况下准确度仍能保持在92%以上,相比传统算法具有较大优势。文中还分析了专家示例的位置分布和数量对算法性能的影响,验证了所提出的重构算法的泛化能力。

关键词: 城市道路网络, 车牌识别, 深度逆向强化学习, 数据处理, 路径链重构

Abstract:

As urban traffic monitoring systems have matured, large volumes of license plate recognition data have been archived. Such data offer strong temporal continuity, wide spatial coverage, and diverse sample types, and thus provide an information base for studying urban traffic. However, because of cost and technical constraints during collection, the recorded license plate data are discontinuous in time and space, which limits their application. To address this problem, this paper proposes a path chain extraction scheme that distinguishes complete path chains from missing (incomplete) path chains for single trips, together with an algorithm for reconstructing urban vehicle travel path chains based on deep inverse reinforcement learning. The algorithm samples complete path chains to obtain expert examples, mines them with deep inverse reinforcement learning to fit the latent route choice characteristics as a nonlinear reward function, and uses that function to guide an agent in autonomously completing the missing path chains. Experiments on a local road network in Xiaoshan District, Hangzhou, show that the proposed algorithm is stable, with an average accuracy of 95%, and that its accuracy remains above 92% even when a large share of the path is missing, a considerable advantage over traditional algorithms. The paper also analyzes how the location distribution and number of expert examples affect performance, verifying the generalization ability of the proposed algorithm.
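
As a rough illustration of the two steps the abstract describes (separating complete from missing path chains, then filling the gaps under a reward signal), the sketch below uses a toy network. The network, the reward values, and the function names `is_complete`, `best_path`, and `reconstruct` are all hypothetical. A fixed reward table searched with Dijkstra's algorithm stands in for the learned nonlinear reward function and the trained agent, so this is not the authors' algorithm, only a minimal sketch of the idea.

```python
import heapq

# Hypothetical toy road network: intersection -> downstream neighbours.
ADJACENCY = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": ["E"],
    "E": [],
}

# Invented per-link rewards (higher means more preferred). In the paper the
# reward is learned by deep inverse reinforcement learning; this table is
# only a placeholder. All values must be <= 0 for the search below.
REWARD = {
    ("A", "B"): -1.0,
    ("A", "C"): -2.0,
    ("B", "D"): -1.0,
    ("C", "D"): -1.0,
    ("D", "E"): -1.0,
}


def is_complete(chain):
    """True if every consecutive pair of detections is directly connected."""
    return all(b in ADJACENCY[a] for a, b in zip(chain, chain[1:]))


def best_path(src, dst):
    """Maximum-total-reward path via Dijkstra on cost = -reward."""
    frontier = [(0.0, src, [src])]
    visited = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == dst:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt in ADJACENCY[node]:
            if nxt not in visited:
                step_cost = -REWARD[(node, nxt)]
                heapq.heappush(frontier, (cost + step_cost, nxt, path + [nxt]))
    return None  # dst is unreachable from src


def reconstruct(chain):
    """Fill each gap between non-adjacent detections with the best path."""
    full = [chain[0]]
    for a, b in zip(chain, chain[1:]):
        if b in ADJACENCY[a]:
            full.append(b)
        else:
            segment = best_path(a, b)
            if segment is None:
                raise ValueError(f"no route from {a} to {b}")
            full.extend(segment[1:])  # skip the node already recorded
    return full


observed = ["A", "D", "E"]  # the detector between A and D missed the vehicle
print(is_complete(observed))
print(reconstruct(observed))
```

Here the chain `A, D, E` is flagged as missing because A and D are not adjacent, and the gap is filled through B because the A-B-D route has the higher total reward in the placeholder table.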

Key words: urban road network, license plate recognition, deep inverse reinforcement learning, data processing, path chain reconstruction

CLC Number: