Journal of South China University of Technology (Natural Science Edition) ›› 2020, Vol. 48 ›› Issue (4): 114-122. doi: 10.12141/j.issn.1000-565X.190465

• Traffic & Transportation Engineering •

Short-Time Bus Passenger Flow Prediction by Identifying Features of Incomplete Data

FANG Xiaoping LIN Mei CHEN Weiya PAN Xin

  1. School of Traffic and Transportation Engineering//Rail Data Research and Application Key Laboratory of Hunan Province, Central South University, Changsha 410075, Hunan, China
  • Received: 2019-07-20 Revised: 2019-10-16 Online: 2020-04-25 Published: 2020-04-01
  • Contact: CHEN Weiya (b. 1981), male, Ph.D., associate professor, whose research focuses on transportation planning and management. E-mail: wychen@csu.edu.cn
  • About author: FANG Xiaoping (b. 1964), female, Ph.D., professor, whose research covers transportation planning and management, logistics economics, and related topics. E-mail: fangxp@csu.edu.cn
  • Supported by:
    the National Natural Science Foundation of China (61203162); the Natural Science Foundation of Hunan Province (2018JJ2537); the Science and Technology Progress and Innovation Program of the Hunan Provincial Department of Transportation (201244, 201723, 201949)

Abstract: Identifying the features of bus passenger flow is key to improving the quality of short-time prediction. However, because of equipment failures, data collection constraints, and other causes, the attributes of passenger flow data are often incomplete, which makes both feature identification and passenger flow prediction harder. This paper takes as its sample the smart card data of bus route No. 104 in Changsha, which lack passenger attribute data. The correlation between card number and travel time is used to identify each passenger's travel frequency, which serves as the variable for distinguishing travel features. The total passenger flow set is divided into feature subsets, the optimal value of the variable is determined from the size and variance of the subsets, and the features of the passenger flow are inferred. Instead of predicting the total passenger flow directly, a Seasonal Auto-Regressive Integrated Moving Average (SARIMA) model is built for each subset and used to predict it separately. After the subset predictions are integrated, the out-of-sample Mean Absolute Error (MAE) improves by 36.11% relative to direct prediction of the total. The prediction model built on passengers' travel features has a goodness of fit of 0.95, indicating that the approach can effectively identify the features of bus passenger flow.
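The workflow summarized in the abstract (split the flow by travel frequency, forecast each subset, sum the forecasts, score with MAE) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the `(card_id, hour)` record layout, and the fixed `threshold` are hypothetical, travel frequency is approximated by the number of boardings per card in the sample, and a seasonal-naive forecaster stands in for the SARIMA models used in the paper.

```python
from collections import Counter


def split_by_frequency(records, threshold):
    """Split boardings into two subsets by how often each card appears.

    records: iterable of (card_id, hour) tuples, one per boarding (assumed layout).
    threshold: hypothetical cut-off; cards with at least this many boardings
    are treated as frequent travellers.
    Returns {"frequent": Counter(hour -> count), "occasional": Counter(...)}.
    """
    records = list(records)
    trips_per_card = Counter(card for card, _ in records)
    subsets = {"frequent": Counter(), "occasional": Counter()}
    for card, hour in records:
        label = "frequent" if trips_per_card[card] >= threshold else "occasional"
        subsets[label][hour] += 1
    return subsets


def seasonal_naive(series, season, horizon):
    """Stand-in forecaster (NOT SARIMA): repeat the last full season."""
    last = series[-season:]
    return [last[i % season] for i in range(horizon)]


def combined_forecast(subset_series, season, horizon):
    """Forecast each subset separately, then sum the forecasts step by step."""
    forecasts = [seasonal_naive(s, season, horizon) for s in subset_series]
    return [sum(step) for step in zip(*forecasts)]


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

Because the stand-in forecaster is linear, summing its subset forecasts gives the same result as forecasting the total directly. Any accuracy gain of the kind reported in the abstract would have to come from fitting a separate seasonal model to each subset.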

Key words:

CLC Number: