Journal of South China University of Technology (Natural Science Edition) ›› 2025, Vol. 53 ›› Issue (5): 82-93. doi: 10.12141/j.issn.1000-565X.240480

• Computer Science & Technology •

A Spatiotemporal Heterogeneous Two-Stage Fusion Network for Traffic Flow Forecasting

HOU Yue  YIN Jie  ZHANG Zhihao  LU Keke   

  • Online: 2025-05-25 Published: 2024-12-04

Abstract:

Existing traffic flow prediction studies fail to fully integrate complex spatiotemporal correlations and heterogeneities. To address this, the paper proposes a traffic flow prediction network for grid data, the Spatiotemporal Heterogeneous Two-Stage Fusion Neural Network (ST_HTFNN). The network uses a staged, hierarchical architecture for spatiotemporal feature extraction in which the static and dynamic feature extraction stages run in series. In the static feature extraction stage, a Mamba-Like Linear Attention (MLLA) block is introduced as a static heterogeneous fusion unit to jointly mine spatial correlation and spatial heterogeneity. In the dynamic feature extraction stage, a simple and efficient dynamic heterogeneous fusion unit is designed that combines dilated convolution with gating mechanisms to adaptively fuse and capture global and local spatiotemporal correlations and heterogeneities. Furthermore, because deep convolution tends to smooth out road-level traffic flow features, a road feature enhancement module is designed to reconstruct and enhance road information. Finally, an external disturbance feature fusion module integrates the influence of external disturbance features on the traffic flow prediction results. Experimental results on three real-world traffic datasets (BikeNYC, TaxiCQ, and TaxiBJ) show that ST_HTFNN outperforms existing benchmark methods, with average improvements in the prediction accuracy metric MAE of 6.13%, 2.06%, and 5.23%, respectively.
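The abstract does not specify the internals of the dynamic heterogeneous fusion unit, so the following is only a minimal illustrative sketch of the general idea of combining dilated convolution with a gating mechanism. It assumes a 1D time series rather than the paper's grid data, and it assumes the tanh-filter times sigmoid-gate form that is often paired with dilated convolutions. The names `dilated_conv1d` and `gated_dilated_unit` are hypothetical and are not taken from the paper.

```python
import math


def dilated_conv1d(x, kernel, dilation):
    """Causal 1D dilated convolution with implicit zero left-padding.

    Tap k of the kernel reads the input k * dilation steps in the past,
    so a larger dilation widens the receptive field without adding weights.
    The output has the same length as x.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # positions before the start count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gated_dilated_unit(x, filter_kernel, gate_kernel, dilation):
    """Assumed gating form: tanh(filter branch) * sigmoid(gate branch).

    The gate branch yields a value in (0, 1) per time step that scales
    how much of the filter branch's response is passed on.
    """
    f = dilated_conv1d(x, filter_kernel, dilation)
    g = dilated_conv1d(x, gate_kernel, dilation)
    return [math.tanh(a) * sigmoid(b) for a, b in zip(f, g)]


# Hypothetical toy input: five time steps of flow in one grid cell.
flow = [1.0, 2.0, 3.0, 4.0, 5.0]
features = gated_dilated_unit(flow, [1.0, 1.0], [0.5, -0.5], dilation=2)
```

In this sketch the dilation controls how far back each output looks (local versus longer-range context), while the gate adaptively weights the extracted response. The paper's unit operates on spatiotemporal grid tensors and may differ substantially from this simplification.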

Key words: traffic flow prediction, grid data, spatiotemporal heterogeneity, dilated convolution, gating mechanism