Distributed Energy Cluster Scheduling Method Based on EA-RL Algorithm
Received date: 2024-05-06
Online published: 2024-06-12
Supported by
the National Natural Science Foundation of China (62173148); the Natural Science Foundation of Guangdong Province (2022A1515010150); the Guangdong Basic and Applied Basic Research Foundation (2022A1515240026)
CHENG Xiaohua, WANG Zefu, ZENG Jun, ZENG Jingyao, TAN Haojie. Distributed Energy Cluster Scheduling Method Based on EA-RL Algorithm[J]. Journal of South China University of Technology (Natural Science Edition), 2025, 53(1): 1-9. DOI: 10.12141/j.issn.1000-565X.240218
Current research on distributed energy cluster scheduling is mostly limited to a single scenario and lacks efficient, accurate algorithms. To address these problems, this paper proposes a multi-scenario scheduling method for distributed energy clusters based on evolutionary-algorithm experience-guided deep reinforcement learning (EA-RL). First, individual models are established for the power sources, energy storage, and loads in a distributed energy cluster, and on this basis a multi-scenario optimal scheduling model that includes auxiliary peak regulation and frequency regulation is built. Second, within the evolutionary reinforcement learning framework, an EA-RL algorithm is proposed that combines a genetic algorithm (GA) with the deep deterministic policy gradient (DDPG) algorithm: experience sequences serve as GA individuals for crossover, mutation, and selection, and the high-quality experiences that are selected are added to the DDPG experience pool to guide agent training, which improves the search efficiency and convergence of the algorithm. Third, the state space and action space of the multi-scenario scheduling problem are constructed from the scheduling model, and the reward function is built from four objectives: minimizing the scheduling cost, the deviation from auxiliary-service scheduling instructions, the tie-line over-limit power, and the source-load power difference. This completes the reinforcement learning model.
To validate the proposed algorithm and model, scheduling agents are trained offline on multi-scenario simulation cases, yielding agents that can adapt to multiple grid scenarios. The agents are then verified through online decision-making, and their scheduling capability is assessed from the decision results. The effectiveness of the algorithm is further verified by comparison with the DDPG algorithm. Finally, the trained agents undergo online decision-making tests over 60 consecutive days with disturbances of varying severity, which verify their long-term effectiveness and robustness.
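The experience-guidance idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy environment in `rollout`, the reward weights, the per-step quantities, and every function name below are hypothetical, and the DDPG actor-critic update that would sample from the buffer is omitted. The sketch only shows action sequences treated as GA individuals, evolved by crossover, mutation, and elitist selection, with the transitions of the best sequence pushed into a replay buffer.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

Transition = Tuple[float, float, float]  # (state, action, reward), toy scalars
Individual = List[float]                 # one action per time step


def dispatch_reward(cost, aux_dev, tie_over, gap, w=(1.0, 1.0, 1.0, 1.0)):
    """Negative weighted sum of the four terms named in the abstract.
    The weights and the exact form of each term are assumptions."""
    return -(w[0] * cost + w[1] * abs(aux_dev)
             + w[2] * max(tie_over, 0.0) + w[3] * abs(gap))


def rollout(actions: Individual, target: List[float]) -> List[Transition]:
    """Toy environment (assumption): track an auxiliary-service instruction."""
    out = []
    for t, a in enumerate(actions):
        r = dispatch_reward(cost=0.1 * abs(a), aux_dev=a - target[t],
                            tie_over=abs(a) - 0.9, gap=0.0)
        out.append((float(t), a, r))
    return out


def fitness(actions: Individual, target: List[float]) -> float:
    """Fitness of an experience sequence is its total reward."""
    return sum(r for _, _, r in rollout(actions, target))


def crossover(p1: Individual, p2: Individual, rng: random.Random) -> Individual:
    """Single-point crossover; sequences must have at least two steps."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]


def mutate(ind: Individual, rng: random.Random, rate=0.2, sigma=0.1) -> Individual:
    """Add Gaussian noise to some actions and clip to [-1, 1]."""
    return [min(1.0, max(-1.0, a + rng.gauss(0.0, sigma)))
            if rng.random() < rate else a for a in ind]


def evolve(population, target, rng, n_elite=2):
    """One generation: keep the elites, refill with mutated offspring."""
    ranked = sorted(population, key=lambda ind: fitness(ind, target), reverse=True)
    elites = ranked[:n_elite]
    parents = ranked[:max(2, len(ranked) // 2)]
    children = []
    while len(elites) + len(children) < len(population):
        p1, p2 = rng.sample(parents, 2)
        children.append(mutate(crossover(p1, p2, rng), rng))
    return elites + children


def run_demo(seed=0, generations=20):
    rng = random.Random(seed)
    target = [0.5, -0.2, 0.1, 0.8]  # hypothetical instruction profile
    population = [[rng.uniform(-1, 1) for _ in target] for _ in range(8)]
    replay_buffer: Deque[Transition] = deque(maxlen=1000)
    best_before = max(fitness(ind, target) for ind in population)
    for _ in range(generations):
        population = evolve(population, target, rng)
        # population[0] is the best sequence of the previous generation;
        # its transitions are the "high-quality experience" for the agent.
        replay_buffer.extend(rollout(population[0], target))
    best_after = max(fitness(ind, target) for ind in population)
    return best_before, best_after, replay_buffer


if __name__ == "__main__":
    before, after, buf = run_demo()
    print(f"best fitness: {before:.3f} -> {after:.3f}, buffer size: {len(buf)}")
```

Because the elites are carried over unchanged, the best fitness cannot decrease between generations. In a full EA-RL loop, the agent's own exploration transitions would presumably share the same buffer with these evolved ones.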