Table of Contents

    25 September 2025, Volume 53 Issue 9
    Computer Science & Technology
    LI Yue, HUANG Yihan, PENG Zhengwei, et al
    2025, 53(9):  1-10.  doi:10.12141/j.issn.1000-565X.250134
    Abstract ( 118 )

    As one of China's traditional arts, Chinese opera has unique musical expressiveness. Cantonese opera is one of the main Chinese opera genres and an important carrier of Lingnan culture, and it is listed in the World Intangible Cultural Heritage List. In recent years, generative artificial intelligence has demonstrated powerful capabilities in content creation. For example, singing synthesis technology can synthesize natural singing from a specified music score, which offers a new approach to the digital protection and innovation of Cantonese opera. However, the collection and organization of opera data suffer from problems such as poor audio quality and complex dialect annotation, resulting in an extreme shortage of high-quality opera datasets. To address this, this paper applies singing synthesis technology from the field of pop music to Cantonese opera vocal synthesis and proposes the first Cantonese opera vocal synthesis dataset with phoneme-level annotation and audio-text alignment. First, the CODS dataset is constructed through a systematic process. The dataset is derived from 29 original works by four famous performers, with a total length of 3.81 hours, and provides important support for the research and digitization of Cantonese opera. Second, extensive experiments are conducted on this dataset, realizing the synthesis of Cantonese opera vocals with controllable lyrics, timbre and melody. In addition, an evaluation scheme for Cantonese opera vocal synthesis is established, and the usability of the dataset is verified through subjective and objective evaluation. The CODS dataset fills a gap in the application of artificial intelligence to Cantonese opera vocal synthesis and promotes the inheritance and innovation of this traditional art.

    LIU Huiting, LIU Shaoxiong, WANG Jiale, et al
    2025, 53(9):  11-21.  doi:10.12141/j.issn.1000-565X.240088
    Abstract ( 89 )

    Deep reinforcement learning (DRL) algorithms have been incorporated into recommendation settings to dynamically model user interests and maximize cumulative rewards. However, data sparsity poses a challenge to most DRL-based interactive recommendation methods. Contrastive learning is a promising way to address data sparsity, but most existing contrastive learning methods rely on heuristic augmentation strategies, which lose critical information and fail to make full use of heterogeneous information. To address these shortcomings, we propose multi-interest oriented contrastive learning for deep reinforcement learning-based recommendation (MOCIR). In particular, we use heterogeneous information networks (HINs) to model different aspects of a user's local interests and use the original data to model the user's global interests. The proposed method comprises a contrastive learning module and a policy network. The contrastive learning module uses metapaths in the HINs to find the neighbors of each item under different aspects, aggregates them to obtain item representations, and then uses the global and local interests as positive pairs for contrastive learning, thereby effectively capturing the user's interests. The policy network makes recommendations after the user state representation is aggregated, and the contrastive learning module and the policy network are updated jointly. Experiments on three benchmark datasets demonstrate that the proposed method outperforms its state-of-the-art counterparts.
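
    The positive-pair idea in this abstract can be illustrated with a small sketch. The abstract does not name the loss, so the InfoNCE form, the temperature value, and the choice of other users' local interests as negatives are all assumptions made here for illustration; the function name is hypothetical.

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor, one positive and several negatives.

    Assumed roles (not taken from the paper): anchor = a user's global
    interest vector, positive = the same user's local interest vector,
    negatives = local interest vectors of other users, shape (N, D).
    """
    def unit(v):
        # L2-normalise so that dot products are cosine similarities.
        return v / np.linalg.norm(v, axis=-1, keepdims=True)

    a, p, n = unit(anchor), unit(positive), unit(negatives)
    pos_logit = a @ p / temperature
    neg_logits = n @ a / temperature
    logits = np.concatenate(([pos_logit], neg_logits))
    # Numerically stable log-sum-exp over all candidates.
    m = logits.max()
    log_sum = m + np.log(np.exp(logits - m).sum())
    return float(log_sum - pos_logit)  # equals -log softmax(positive)
```

    The loss is small when the global and local views of the same user agree and grows when the positive looks more like a negative, which is the behaviour a contrastive module of this kind relies on.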

    YUE Yongheng, ZHAO Zhihao
    2025, 53(9):  22-30.  doi:10.12141/j.issn.1000-565X.240609
    Abstract ( 214 )

    To improve the lane line detection accuracy of intelligent vehicles in complex scenes, this paper proposes a PANet lane line detection algorithm that incorporates a multi-scale spatial attention mechanism. The algorithm builds on the pre-defined anchor-based UFLD lane line detection model and combines the feature pyramid enhancement module PANet with depthwise separable convolution to extract multi-scale features from images. In addition, a multi-scale spatial attention module is designed in the network framework, and the lightweight SimAM attention mechanism is introduced to strengthen the focus on target features. An adaptive feature fusion module is then designed to perform cross-scale fusion of the feature maps output by PANet by adaptively adjusting the fusion weights of feature maps at different scales, which effectively enhances the network's ability to extract complex features. Finally, tests on the TuSimple dataset show that the detection accuracy of the proposed algorithm is 96.84%, which is 1.03% higher than that of the original algorithm and superior to traditional mainstream algorithms. Tests on the nine scenarios of the CULane dataset show that the comprehensive F1 value of the proposed algorithm is 72.74%, which is 4.34% higher than that of the original algorithm and better than traditional mainstream algorithms. The improvement is especially large in extreme scenes such as strong light and shadow, which demonstrates that the proposed method has better detection ability in complex scenes. In addition, real-time tests show that the model inference speed reaches 118 FPS, which meets the real-time requirements of intelligent vehicles.
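
    As a rough illustration of the parameter-free SimAM attention mentioned above, the following NumPy sketch follows the commonly published SimAM formulation. How the paper wires it into the network is not stated in the abstract, so the input layout (C, H, W) and the regularisation constant are assumptions.

```python
import numpy as np

def simam(x, e_lambda=1e-4):
    """Apply SimAM-style attention to a feature map of shape (C, H, W).

    Each activation is weighted by a sigmoid of its inverse "energy":
    activations that stand out from their channel mean get larger weights.
    No learnable parameters are involved.
    """
    _, h, w = x.shape
    n = h * w - 1
    # Squared deviation of every activation from its channel mean.
    d = (x - x.mean(axis=(1, 2), keepdims=True)) ** 2
    # Channel-wise variance estimate.
    v = d.sum(axis=(1, 2), keepdims=True) / n
    inv_energy = d / (4.0 * (v + e_lambda)) + 0.5
    return x * (1.0 / (1.0 + np.exp(-inv_energy)))
```

    Because the weights are computed from channel statistics alone, the module adds no parameters, which is consistent with the abstract's description of it as lightweight.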

    WANG Qingrong, GAO Huanyi, ZHU Changfeng, et al
    2025, 53(9):  31-47.  doi:10.12141/j.issn.1000-565X.250003
    Abstract ( 167 )

    With the continual rise in the number of motor vehicles in urban areas, traffic congestion has worsened, adversely affecting environmental protection and urban operational efficiency. Consequently, accurately predicting traffic congestion is of critical importance for traffic management and optimization. However, existing research still faces limitations in modeling the dynamic, time-varying characteristics of traffic flow as well as the complex interactions among road segments. To address these challenges, a Gated Spatiotemporal Convolutional Network model based on Graph Neural Networks is proposed to more effectively capture and predict traffic congestion. First, an improved K-Means clustering algorithm is employed to divide the raw data into multiple congestion-state categories, which are then incorporated as auxiliary features to enhance feature representation. Next, a Gated Temporal Convolutional Network is introduced to capture the temporal properties and dynamic dependencies in traffic data, and a Dynamic Adaptive Gated Graph Convolutional Network is constructed to achieve feature fusion and dynamic weight allocation through a signal generation module and a dual-modulation mechanism, thereby facilitating effective extraction of spatiotemporal features. Finally, residual connections are incorporated to improve training stability, and skip connections are utilized to integrate multi-level and multi-scale features. Experimental results on real-world PeMS08 and PeMS04 datasets demonstrate that the proposed model achieves superior prediction accuracy compared with other baseline methods.
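
    The gating idea behind a gated temporal convolution can be sketched in a few lines. The abstract does not give the layer definition, so the tanh-filter and sigmoid-gate form below (the variant commonly used in temporal convolutional networks), the single-channel input, and the function names are assumptions for illustration only.

```python
import numpy as np

def causal_conv1d(x, kernel, dilation=1):
    """Causal dilated 1-D convolution: y[t] = sum_k kernel[k] * x[t - k*dilation].

    The input is zero-padded on the left, so y[t] never depends on future samples.
    """
    t_len, k_len = len(x), len(kernel)
    pad = (k_len - 1) * dilation
    xp = np.concatenate((np.zeros(pad), x))
    return np.array([
        sum(kernel[k] * xp[pad + t - k * dilation] for k in range(k_len))
        for t in range(t_len)
    ])

def gated_tcn(x, w_filter, w_gate, dilation=1):
    """Gated temporal unit: tanh(filter branch) * sigmoid(gate branch)."""
    f = np.tanh(causal_conv1d(x, w_filter, dilation))
    g = 1.0 / (1.0 + np.exp(-causal_conv1d(x, w_gate, dilation)))
    return f * g
```

    The sigmoid branch decides how much of the filtered signal passes at each time step, which is one way such a layer can adapt to time-varying traffic patterns.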

    LU Lu, ZHAO Rong, LIANG Zhihong, et al
    2025, 53(9):  48-58.  doi:10.12141/j.issn.1000-565X.240498
    Abstract ( 88 )

    General matrix multiplication (GEMM) is the most important operation in linear algebra, and many applications from different scientific fields have converted their key parts to use GEMM. GEMM is widely used in large models, machine learning, scientific computing, signal processing, and other fields. In particular, half-precision (FP16) batched GEMM is a core operation of many deep learning frameworks. This paper proposes a GPU optimization scheme for half-precision batched GEMM (HGEMM). For blocking, the scheme provides a blocking strategy matched to the matrix size, so that each wavefront is assigned the same workload and computation, and each thread computes multiple matrix multiplications simultaneously to improve the utilization of the computing units. For memory access, at the cost of reading some data multiple times, the same amount of memory access is allocated to each thread to facilitate compiler optimization, so that memory access and computation time overlap. For extremely small-size batched HGEMM with a matrix size of less than 16, this paper uses a 4x4x4 Matrix Core and its corresponding blocking scheme to improve memory access performance while reducing wasted computation, and provides an option to use shared memory to achieve the highest performance. The scheme is compared with two rocBLAS operators on the AMD GPU MI210 platform. The results show that the average performance of this scheme on AMD GPU MI210 is 4.14 times that of rocBLAS hgemm batched and 4.96 times that of rocBLAS gemm ex batched. In particular, for extremely small-size batched HGEMM, the average performance is 18.60 times that of rocBLAS hgemm batched and 14.02 times that of rocBLAS gemm ex batched.
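
    For readers unfamiliar with the operation being optimized, the sketch below is a plain NumPy reference for what a batched half-precision GEMM computes. It says nothing about the paper's GPU kernels; accumulating in FP32 before rounding back to FP16 is an assumption made here, and the function name is hypothetical.

```python
import numpy as np

def batched_hgemm_reference(a, b, c, alpha=1.0, beta=0.0):
    """Reference result of C[i] = alpha * A[i] @ B[i] + beta * C[i].

    a: (batch, m, k), b: (batch, k, n), c: (batch, m, n), all float16.
    Products are accumulated in float32 (assumption) and rounded to float16.
    """
    acc = np.matmul(a.astype(np.float32), b.astype(np.float32))
    out = alpha * acc + beta * c.astype(np.float32)
    return out.astype(np.float16)

# Example: 1000 independent 8x8 products, the "extremely small" regime
# (matrix size below 16) that the abstract singles out.
rng = np.random.default_rng(0)
a = rng.standard_normal((1000, 8, 8)).astype(np.float16)
b = rng.standard_normal((1000, 8, 8)).astype(np.float16)
c = np.zeros((1000, 8, 8), dtype=np.float16)
result = batched_hgemm_reference(a, b, c)
```

    Each product here involves only 8x8x8 multiply-adds, so per-problem overheads dominate, which is why distributing many small problems evenly across threads matters so much in this regime.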

    TU Xinhui, GUO Cong, ZONG Yuhang
    2025, 53(9):  59-67.  doi:10.12141/j.issn.1000-565X.240499
    Abstract ( 29 )

    With the rapid development of large language models, text matching and representation techniques in information retrieval have made significant progress. Query expansion and document expansion are two important methods for enhancing text representation in information retrieval, and current mainstream text expansion methods all rely on large language models. However, text generated by large language models differs considerably from human-written text in linguistic diversity and style, and this difference may affect the calculation of query-document relevance and ultimately reduce the accuracy of the whole retrieval process. To solve this problem, this paper proposes an information retrieval method based on bidirectional text expansion (BTE). First, zero-shot prompting is used to have the large language model generate pseudo-queries for documents and pseudo-documents for queries. Then, the semantic similarity between the pseudo-queries and pseudo-documents is computed. Finally, the original query-document similarity scores and the pseudo-query-pseudo-document similarity scores are fused by weighting to obtain the final document ranking. Experiments on two publicly available datasets, DL19 and DL20, demonstrate that the BTE method significantly outperforms the baseline model on several evaluation metrics, including NDCG@10, P@10 and MRR@10. The proposed bidirectional text expansion method therefore further enhances relevance matching between queries and documents and improves the performance of the overall information retrieval system.
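
    The weighted fusion step described in this abstract can be sketched as follows. The abstract does not specify the similarity function or the weighting scheme, so cosine similarity over embedding vectors, the linear interpolation weight `alpha`, and all function names are assumptions for illustration.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def bte_score(q, d, pseudo_q, pseudo_d, alpha=0.5):
    """Fuse the original and the expanded similarity scores.

    q, d:      embeddings of the original query and document
    pseudo_q:  embedding of the pseudo-query generated from the document
    pseudo_d:  embedding of the pseudo-document generated from the query
    alpha:     weight of the original query-document score (hypothetical)
    """
    return alpha * cosine(q, d) + (1.0 - alpha) * cosine(pseudo_q, pseudo_d)

def rank(q, pseudo_d, docs, alpha=0.5):
    """Rank documents; docs is a list of (doc_id, d_vec, pseudo_q_vec)."""
    scored = [(doc_id, bte_score(q, d, pq, pseudo_d, alpha))
              for doc_id, d, pq in docs]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

    Comparing pseudo-queries with pseudo-documents keeps both sides of the second score machine-generated, which is one plausible reading of how the method sidesteps the style gap between generated and human-written text.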

    Mechanical Engineering
    WEI Zhengjun, LIANG Zijian, ZHENG Kun, et al
    2025, 53(9):  68-75.  doi:10.12141/j.issn.1000-565X.240589
    Abstract ( 195 )

    With growing health awareness and the popularity of competitive sports, the technological and professional development of ball sports training has become a significant trend. In football training, precise simulation of shooting trajectories and the design of personalized training programs are critical issues. This study establishes an intelligent football training robot system for ballistic trajectory simulation that integrates shooting mechanisms, visual acquisition, data analysis, and motion control, with the aim of making training more scientific and effective. The system features a three-axis gimbal shooting robot with omnidirectional movement, which can flexibly adjust shooting angles and positions to meet various training needs. Using an optimized RMSProp algorithm, the robot solves the launch parameters inversely, enabling precise adjustment of the yaw and pitch angles according to the target position. Experimental results indicate that the robot maintains a shooting point error of less than 0.45 meters under various training conditions, with a root mean square error of less than 7.5 centimeters between theoretical and actual trajectories, validating the system's robustness and accuracy. Additionally, a detailed shooting dataset has been established, providing a resource for future research in data science and artificial intelligence. This research promotes the intelligent development of football training, offers athletes a more scientific training tool, and helps raise the overall level of football performance.
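
    The inverse solution of launch angles can be illustrated with a toy example. The paper's ballistic model and its RMSProp modifications are not given in the abstract, so this sketch assumes a drag-free projectile with fixed launch speed, plain RMSProp, and a linearly decaying step size; every name and constant below is hypothetical.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def landing_point(yaw, pitch, speed):
    """Landing point (x, y) of a drag-free projectile launched from the origin."""
    r = speed ** 2 * math.sin(2 * pitch) / G
    return r * math.cos(yaw), r * math.sin(yaw)

def solve_launch_angles(target, speed, yaw=0.0, pitch=0.3,
                        lr0=0.01, decay=0.9, eps=1e-8, steps=2000):
    """Minimise the squared landing error over (yaw, pitch) with RMSProp."""
    xt, yt = target
    v_yaw = v_pitch = 0.0  # running averages of squared gradients
    for t in range(steps):
        r = speed ** 2 * math.sin(2 * pitch) / G
        dr = 2 * speed ** 2 * math.cos(2 * pitch) / G  # dR/dpitch
        ex = r * math.cos(yaw) - xt
        ey = r * math.sin(yaw) - yt
        # Analytic gradients of ex^2 + ey^2 with respect to yaw and pitch.
        g_yaw = 2 * r * (-ex * math.sin(yaw) + ey * math.cos(yaw))
        g_pitch = 2 * dr * (ex * math.cos(yaw) + ey * math.sin(yaw))
        v_yaw = decay * v_yaw + (1 - decay) * g_yaw ** 2
        v_pitch = decay * v_pitch + (1 - decay) * g_pitch ** 2
        lr = lr0 * (1 - t / steps)  # linearly decaying step size (assumption)
        yaw -= lr * g_yaw / (math.sqrt(v_yaw) + eps)
        pitch -= lr * g_pitch / (math.sqrt(v_pitch) + eps)
    return yaw, pitch

yaw, pitch = solve_launch_angles(target=(15.0, 5.0), speed=20.0)
x, y = landing_point(yaw, pitch, 20.0)
```

    RMSProp rescales each gradient by a running estimate of its magnitude, so the yaw and pitch updates stay comparable in size even though the landing point is far more sensitive to pitch than to yaw in this toy model.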

    WANG Qinghui, FANG Daoxin, CHI Zipeng, et al
    2025, 53(9):  76-85.  doi:10.12141/j.issn.1000-565X.250025
    Abstract ( 110 )

    Real-time visual and precise haptic interaction algorithms are critical for achieving an accurate "tactile sensation" in virtual surgical training. Taking bone milling, a key operation in orthopedic surgery, as an example, the Tri-dexel model is first employed to represent the bone and the surgical milling tool. Real-time geometric deformation during virtual bone milling is achieved through Boolean operations and rapid surface reconstruction algorithms. Next, by integrating the geometric parameters of the surgical milling tool, a haptic interaction model based on the micro-element cutting force is proposed. This model uses the results of the Boolean operations between the bone and the surgical milling tool to quickly and accurately calculate the instantaneous undeformed chip thickness. The cutting force coefficients are identified and the haptic interaction model is validated through milling experiments to achieve haptic rendering. Finally, based on these algorithms, an orthopedic virtual surgical training system is developed, and evaluation experiments are conducted. The results show that the predicted forces agree with the experimental measurements, with an average force error of less than 7%. The visual and haptic interaction algorithm achieves a visual refresh rate of 30 Hz and a haptic refresh rate of 1 kHz. The developed orthopedic virtual surgical training system provides users with a highly immersive virtual bone milling training experience that can effectively improve their hand-eye coordination.

    WANG Zhenmin, ZHU Bin, CHI Peng, et al
    2025, 53(9):  86-97.  doi:10.12141/j.issn.1000-565X.240500
    Abstract ( 113 )

    To address the issue of adsorption instability encountered by magnetic wheel adsorption-type underwater welding robots during operation, this paper proposes a critical adsorption force calculation theory for magnetic wheels based on centroid offset and vector superposition. This theory comprehensively considers multiple failure modes, including traditional slippage failure, detachment failure, overturning failure, and the less-studied spin-slip failure, effectively resolving the adsorption instability problems caused by the low accuracy of traditional adsorption force calculations. Firstly, based on the robot chassis structure, static models corresponding to four non-instability adsorption states are established. Furthermore, a vector superposition principle is proposed, incorporating static coupling relationships. This principle fully accounts for the influence of centroid offset on adsorption stability during actuator motion, providing a theoretical foundation for the precise calculation of the critical adsorption force of magnetic wheels. Secondly, taking an existing permanent magnet adsorption chassis of an underwater welding robot as a case study, static results are obtained through Matlab simulations. The variation law of the critical adsorption force of the chassis with maximum centroid offset at different spatial angles is summarized. Finally, an experimental setup is constructed to test the adsorption stability of the robot under various operational conditions. The experimental results demonstrate that the vector superposition principle based on centroid offset can effectively improve the adsorption stability of underwater welding robots, offering novel theoretical support for the design and magnetic force optimization of subsequent magnetic adsorption chassis.

    MA Wenqi, MA Hailong, QIN Yubin, et al
    2025, 53(9):  98-105.  doi:10.12141/j.issn.1000-565X.240305
    Abstract ( 69 )

    When computational fluid dynamics (CFD) is used to simulate the flow field characteristics of a gas bearing-rotor system, the gas film thickness is one of the crucial structural parameters. However, because of shape and size errors introduced when machining the parts and deviations introduced during system assembly, the actual gas film deviates from the ideal design gas film in spatial morphology and scale, which affects the reliability and accuracy of the numerical solution. In this paper, the concept of the effective gas film thickness of the flow field is proposed, and a reasonable equivalent gas film thickness is determined by comparing bidirectional fluid-structure coupling numerical simulations with experimental results. The results show that the bidirectional fluid-structure coupling simulation method can reveal the transient characteristics of the gas film flow field and the variation of the rotor attitude, and can predict and evaluate whether the gas bearing-rotor system can operate safely, which saves test costs. The rotor inclination angle is used as the feature for comparative analysis, providing a direct reference for analyzing the deviation in system performance between the numerical simulation results and the experimental results. Establishing an equivalent gas film thickness simplifies the numerical simulation model as far as possible and improves simulation efficiency while keeping the results reasonably reliable. Taking a gas supply pressure of 0.6 MPa and a unilateral steady-state loading force of 80 N as an example, the equivalent gas film thickness predicted in the fluid-structure coupling simulation model was established and corrected through error analysis and approximation, and the relative error of the system inclination angle was ultimately kept within 5%, which greatly improved the consistency between the numerical simulation results and the performance of the actual engineering system. This provides a reliable method and basis for applying gas bearing-rotor simulation in structural design, performance prediction and evaluation.

    JI Shuting, LI Jiahao, ZHANG Yueming
    2025, 53(9):  106-116.  doi:10.12141/j.issn.1000-565X.240586
    Abstract ( 157 )

    To enhance the comprehensive transmission performance of hypoid gears with high reduction ratios, a design method for significantly inclined contact lines based on active tooth surface design technology is proposed. Initially, multiple tooth surface imprints with varying degrees of contact line inclination are preset, along with their corresponding major axes of contact ellipses and lengths of contact lines. The pinion conjugate tooth surface is then modified with a parabolic shape to achieve a tooth surface that meets the preset parameters. Subsequently, by integrating Tooth Contact Analysis (TCA) and Load Tooth Contact Analysis (LTCA) techniques, the transmission error amplitude (TE), amplitude of loaded transmission error (ALTE), tooth surface load distribution, root bending stress amplitude, and tooth surface flash temperature amplitude are obtained for each tooth surface. The impact of changes in contact line length on these performance parameters is analyzed. Based on the findings, a final target modified tooth surface is selected, and its comprehensive performance is analyzed and compared with the original tooth surface. A case study demonstrates that for a hypoid gear pair with a gear ratio of 5:75, under conditions of significantly inclined contact lines, the longer the contact line, the smaller the tooth surface contact stress, and the root bending stress and tooth surface flash temperature also decrease accordingly. The edge contact condition of the target tooth surface is mitigated, with the maximum root bending stress reduced by 12%, and the contact stress distribution becomes more uniform. The highest tooth surface flash temperature decreases by 6.3%, enhancing the scuffing load capacity. The modified target tooth surface exhibits superior contact zone performance, improved load capacity, and significantly enhanced comprehensive transmission performance.

    Power & Electrical Engineering
    CHEN Chen, WANG Miao, WANG Xinyao, et al
    2025, 53(9):  117-126.  doi:10.12141/j.issn.1000-565X.240575
    Abstract ( 75 )

    Detecting abnormal energy consumption in building lighting sockets can effectively improve building energy efficiency and is of great significance for implementing building energy optimization measures and for research on building energy conservation control. Because the energy consumption of building lighting socket systems is strongly affected by the random behavior of occupants, and the time series data of lighting sockets are noisy and their features are difficult to extract, this paper proposes a method for detecting energy consumption anomalies in building lighting sockets based on a Long Short-Term Memory Autoencoder (LSTM-AE). After the working conditions are partitioned with a decision tree, deep learning is used to automatically learn the reconstruction errors of normal and abnormal samples, achieving unsupervised identification and detection of abnormal building energy consumption samples. The results show that the number of iterations, the number of neurons, and the activation function all have a significant impact on model performance. The electricity consumption data during working hours on working days and during non-working hours on non-working days are normally distributed. The power consumption data under working-day conditions are more stable, so the anomaly detection accuracy is higher than under non-working-day conditions. The average accuracy, recall, and F1 score of the proposed method are 91.23%, 90.87%, and 90.80%, respectively, showing that it can effectively detect energy consumption anomalies in building lighting socket systems.
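
    The detection step that follows an autoencoder can be sketched without the network itself. The example below assumes reconstructions are already available and flags a window as abnormal when its reconstruction error exceeds a threshold fitted on normal data; the mean-plus-k-standard-deviations rule, the mean squared error, and the function names are assumptions, not the paper's stated procedure.

```python
import numpy as np

def reconstruction_errors(x, x_hat):
    """Mean squared error per window; x and x_hat have shape (n_windows, length)."""
    return np.mean((x - x_hat) ** 2, axis=1)

def fit_threshold(normal_errors, k=3.0):
    """Threshold = mean + k * std of errors on normal data (hypothetical rule)."""
    return float(normal_errors.mean() + k * normal_errors.std())

def detect(errors, threshold):
    """Return 1 for windows flagged as abnormal, 0 otherwise."""
    return (errors > threshold).astype(int)

def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 score for the abnormal class (label 1)."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

    Fitting a separate threshold for each working condition would be one natural way to combine this step with the decision-tree partitioning described in the abstract, although the abstract does not say how the two are connected.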

    LIU Mingbo, LAO Ziqing, DONG Ping
    2025, 53(9):  127-137.  doi:10.12141/j.issn.1000-565X.250017
    Abstract ( 86 )

    The district cooling system (DCS) is a type of centralized air-conditioning load and has frequency regulation potential. This paper proposes an auxiliary frequency regulation control strategy for DCSs based on model predictive control (MPC) with terminal constraints, which controls the power consumption of the DCS by adjusting the chilled water flow rate and the number of chillers shut down. First, a dynamic model of the DCS and traditional units is established that accounts for the relationship between the chilled water flow rate and the chilled water outlet temperature, and the state-space expression of the system is constructed. Then, a joint frequency regulation control model for DCSs and traditional units is established based on the terminal-constrained MPC method, with an objective function that minimizes the frequency deviation, the deviation of the building temperature from the human comfort temperature, the control instructions for the chilled water flow, and the control instructions for the traditional units. The terminal constraints comprise a terminal cost function and a terminal set. The terminal-constrained MPC problem is proven to be asymptotically stable by constructing a Lyapunov function of the system. Finally, simulations on a 10-unit 39-bus system and an actual power system verify that adding terminal constraints improves system stability, and that using the DCS to assist grid frequency regulation helps the system quickly restore the rated frequency and improves regulation performance. In addition, the participation of DCSs in grid frequency regulation does not have a significant impact on comfort.

    LU Zhimin, XIE Zili, LU Weiye, et al
    2025, 53(9):  138-148.  doi:10.12141/j.issn.1000-565X.240571
    Abstract ( 111 )

    The tracer gas dilution method can overcome the significant errors in flue gas flow measurement caused by the complex flow field in large-diameter power plant stacks. The method is traceable, and its measuring principle differs from that of the conventional velocity-area method, so it has potential as an on-site calibration method for flow measurement. This paper therefore uses numerical simulation to analyse the feasibility and accuracy of the tracer gas dilution method for measuring the flue gas flow rate in thermal power plant stacks. It also studies the influence of the tracer dilution ratio and the tracer injection cross-section on the measurements, and sets up different tracer sampling schemes to assess the stability of the measurements. The results demonstrate that at a height of approximately nine times the stack diameter (9D), the tracer and flue gas are fully mixed and satisfy the measurement requirements of the tracer gas dilution method. An excessively high or low tracer dilution ratio impairs the mixing of the tracer and the flue gas. Injecting the tracer at the flue can effectively reduce the flow measurement error. When the tracer is injected at the stack, the error of each sampling method fluctuates over a considerable range; however, the three-point sampling method gives stable and reliable measurements, with errors of only -3.59%, -0.69% and -1.05% at the 3D, 8D and 12D cross-sections, respectively. When the tracer is injected at the flue, the flow measurement error of each sampling method does not exceed ±10%, and the errors of the three-point sampling method are only 0.98%, -0.52% and 0.21% at the 3D, 8D and 12D cross-sections, respectively, all within ±1%. This study demonstrates the feasibility and accuracy of the tracer gas dilution method for flue gas flow measurement in thermal power plants.
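
    The principle behind the dilution method is a steady-state tracer mass balance, which the sketch below works through. It assumes constant-rate injection, complete mixing at the sampling plane, and volumetric flows and concentrations referred to the same conditions; the function names and the numbers in the usage note are hypothetical and are not data from the paper.

```python
def flue_gas_flow(q_tracer, c_inject, c_sample, c_background=0.0):
    """Flue gas flow from a tracer mass balance.

    Balance: Q * c_background + q_tracer * c_inject = (Q + q_tracer) * c_sample
    so       Q = q_tracer * (c_inject - c_sample) / (c_sample - c_background)

    q_tracer: injected tracer stream flow (same units as the returned Q)
    c_*:      tracer volume fractions (0..1)
    """
    if c_sample <= c_background:
        raise ValueError("sampled concentration must exceed the background")
    return q_tracer * (c_inject - c_sample) / (c_sample - c_background)

def multi_point_mean(samples):
    """Average the concentrations sampled at several points of one cross-section."""
    return sum(samples) / len(samples)

def relative_error_percent(measured, reference):
    """Signed relative error of the measured flow, in percent."""
    return (measured - reference) / reference * 100.0
```

    For example, injecting 0.001 m³/s of pure tracer and sampling a fully mixed volume fraction of 1e-6 gives a flow of about 1000 m³/s. Incomplete mixing makes the sampled concentration depend on position, which is why the injection location and the sampling scheme affect the error.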

    GAN Yunhua, XIE Yuheng, LIU Fengming, et al
    2025, 53(9):  149-162.  doi:10.12141/j.issn.1000-565X.240534
    Abstract ( 82 )

    For the thermal design of high-power-consumption modules in 5G communication base stations, a phase change heat transfer module with roll-bond aluminum vapor chambers is proposed, in which the evaporation chamber of the module is interconnected with the flow channels of all vapor chambers of the condenser. An experimental test platform was set up to assess the thermal performance of the module at various filling ratios. The impacts of the boiling state and the flow distribution of the working fluid on the temperature uniformity and heat dissipation efficiency of the module were analyzed, and the variation of the heat source temperature distribution under different lateral tilt angles is also discussed. The results indicate that when the input power is less than 400 W, the total thermal resistance of the module first decreases and then increases as the filling ratio increases, reaching a minimum of 0.2116 °C/W at a filling ratio of 15%. Appropriately reducing the filling ratio can induce boiling of the liquid stored at the bottom of the vapor chambers, which promotes an even distribution of vapor flow among the vapor chambers and enhances heat dissipation efficiency. At input powers of 350 W and 400 W, reducing the filling ratio from 30% to 15% decreased the standard deviation of the temperatures among the vapor chambers by 40.92% and 34.04%, respectively, significantly improving temperature uniformity. When the module is laterally tilted, the movement of the liquid level causes an uneven temperature distribution across the heat source, and this adverse effect intensifies as the tilt angle increases. At a tilt angle of 10°, the maximum temperature difference across the heat source grows to more than 11.7 times that of the horizontally placed module.
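
    The two figures of merit quoted in this abstract can be computed as in the sketch below. It assumes the usual definition of total thermal resistance (heat source temperature minus ambient temperature, divided by input power) and a population standard deviation over the vapor chamber temperatures; the paper's exact reference temperatures are not stated, and the numbers in the usage note are hypothetical.

```python
import statistics

def total_thermal_resistance(t_source_c, t_ambient_c, power_w):
    """Total thermal resistance in °C/W (assumed definition: (T_source - T_ambient) / P)."""
    return (t_source_c - t_ambient_c) / power_w

def std_reduction_percent(temps_before, temps_after):
    """Percentage drop in the standard deviation of vapor chamber temperatures."""
    s_before = statistics.pstdev(temps_before)
    s_after = statistics.pstdev(temps_after)
    return (s_before - s_after) / s_before * 100.0
```

    With a hypothetical heat source at 110 °C, ambient air at 25.36 °C and 400 W of input power, the first function returns 0.2116 °C/W. Chamber temperatures of 60, 64 and 68 °C that tighten to 62, 64 and 66 °C correspond to a 50% reduction in standard deviation.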
