With the widespread adoption of deep learning in power load forecasting, its nonlinear modeling capability has substantially improved prediction accuracy. However, in pursuit of high precision, deep learning models often carry large parameter counts, which lowers computational efficiency and makes them difficult to deploy in real-time or resource-constrained environments. To address these issues, this paper proposes a lightweight power load forecasting model based on adaptive knowledge distillation, aiming to balance prediction accuracy against computational efficiency. First, in the teacher model construction phase, a deep information extraction module is designed to fully exploit the deep-level features of load data and is combined with a Long Short-Term Memory (LSTM) network to build a high-precision forecasting model. On this basis, a more compact student model is designed to perform the forecasting task with far fewer parameters. To strengthen knowledge transfer, a learning-based knowledge distillation method is proposed to transfer knowledge efficiently from the teacher model to the student model. In addition, an adaptive knowledge correction strategy is introduced during distillation to further improve the student model's learning capability. Simulation results show that the proposed method improves prediction accuracy while reducing the parameter count to only 1.36% of the teacher model's, achieving effective model compression alongside an overall performance gain. These results validate the effectiveness and superiority of the proposed method for load forecasting tasks.
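The abstract does not specify the loss formulation behind the adaptive knowledge correction strategy. As a rough illustration of the general idea only, the sketch below assumes a regression setting and a hypothetical per-sample trust factor: the student is trained against both the ground truth and the teacher's output, with the teacher term down-weighted on samples where the teacher itself is inaccurate. The function name, the exponential trust factor, and the weighting scheme are all illustrative assumptions, not the paper's method.

```python
import math

def adaptive_distillation_loss(y_true, y_teacher, y_student, base_alpha=0.5):
    """Hypothetical sketch of distillation with adaptive knowledge correction.

    Combines a hard loss (student vs. ground truth) with a soft loss
    (student vs. teacher). The soft term is scaled by a trust factor that
    decays with the teacher's own error, so mistaken teacher knowledge is
    not transferred at full strength. This is an assumed formulation for
    illustration, not the paper's actual loss.
    """
    n = len(y_true)
    hard, soft = 0.0, 0.0
    for t, yt, ys in zip(y_true, y_teacher, y_student):
        teacher_err = abs(yt - t)
        # Trust factor in (0, 1]: trust the teacher less as its error grows.
        trust = math.exp(-teacher_err)
        hard += (ys - t) ** 2
        soft += trust * (ys - yt) ** 2
    return base_alpha * hard / n + (1 - base_alpha) * soft / n
```

For example, a student that matches both the ground truth and the teacher incurs zero loss, while a large teacher error shrinks the teacher-matching term, leaving the ground-truth term dominant.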