Q-Learning-Based Network Selection Strategy for Access Control in WLAN/WIMAX (基于Q学习的WLAN/WIMAX接入控制网络选择策略)

Xu Yubin (徐玉滨), Chen Jiamei (陈佳美), Ma Lin (马琳)

doi:10.3969/j.issn.1000-565X.2013.08.007

Journal of South China University of Technology (Natural Science Edition)

2013, Vol. 41, Issue 8: 41-46, 60

Electronics, Communication and Automatic Control

Communication Technology Research Institute, Harbin Institute of Technology, Harbin 150080, Heilongjiang, China

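Xu Yubin (born 1954), male, Ph.D., professor. His research covers communication networks and dedicated mobile communication systems, navigation and positioning technology, and multimedia communication technology.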

Received: 2012-06-30

Revised: 2013-02-17

Published online: 2013-07-01

Funding

Supported by the National Natural Science Foundation of China (61071105)

Q-Learning-Based Network Selection Strategy for Access Control in WLAN/WIMAX

School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin 150080, Heilongjiang, China

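Abstract (translated from Chinese)

In wireless heterogeneous networks, the access control mechanism is one of the key factors that determine network performance. This paper proposes a Q-learning-based network selection algorithm for access control in wireless heterogeneous networks. The learner (agent) in the system draws on historical experience and runs the Q-learning algorithm iteratively, eventually obtaining the optimal network selection policy. In addition, before the system makes a decision, the algorithm analyzes the state of the WLAN/WIMAX wireless heterogeneous network in detail from a new perspective, providing accurate low-level decision input parameters for the Q-learning algorithm. Simulation results show that, compared with the traditional access control network selection scheme based on the Markov decision process (MDP), the new algorithm performs better in both call blocking probability and maximization of system reward.

Keywords: network selection; resource management; reinforcement learning; Q-learning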

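Cite this article

Xu Yubin, Chen Jiamei, Ma Lin. Q-Learning-Based Network Selection Strategy for Access Control in WLAN/WIMAX[J]. Journal of South China University of Technology (Natural Science Edition), 2013, 41(8): 41-46, 60. DOI: 10.3969/j.issn.1000-565X.2013.08.007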

Abstract

In wireless heterogeneous networks, the access control mechanism plays a vital role in ensuring network performance. This paper proposes a Q-learning-based network selection algorithm for the access control of wireless heterogeneous networks. In the algorithm, the agent takes advantage of past experience to run a Q-learning algorithm by means of value iteration, thus obtaining the optimal strategy. Moreover, before the system makes decisions, the states of the WLAN/WIMAX wireless heterogeneous network are analyzed in detail from a new perspective, thus providing accurate bottom-layer input parameters for the Q-learning algorithm. Simulation results show that the proposed algorithm outperforms the traditional Markov decision process (MDP) algorithm in terms of call-blocking probability and system reward.

Key words: network selection; resource management; reinforcement learning; Q-learning
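
The abstract does not give the state model, reward function, or learning parameters, so the following is only a minimal sketch of how tabular Q-learning could drive an admission and network selection decision. It uses the standard update Q(s,a) ← Q(s,a) + α[r + γ·max Q(s',a') − Q(s,a)]. The two-counter state, the capacities, the per-network rewards, the blocking penalty, and the departure model are all illustrative assumptions, not values or definitions taken from the paper.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
CAPACITY = {"WLAN": 3, "WIMAX": 5}    # max simultaneous calls per network
REWARD = {"WLAN": 1.0, "WIMAX": 0.8}  # reward for admitting a call
BLOCK_PENALTY = -1.0                  # reward when a call is blocked
DEPART_PROB = 0.2                     # chance that an ongoing call ends each step


def valid_actions(state):
    """Networks with spare capacity, plus the option to block the call."""
    wlan, wimax = state
    acts = []
    if wlan < CAPACITY["WLAN"]:
        acts.append("WLAN")
    if wimax < CAPACITY["WIMAX"]:
        acts.append("WIMAX")
    acts.append("BLOCK")
    return acts


def step(state, action, rng):
    """Apply the admission decision, then let ongoing calls depart at random."""
    load = {"WLAN": state[0], "WIMAX": state[1]}
    if action == "BLOCK":
        reward = BLOCK_PENALTY
    else:
        load[action] += 1
        reward = REWARD[action]
    for net in load:
        departures = sum(rng.random() < DEPART_PROB for _ in range(load[net]))
        load[net] -= departures
    return (load["WLAN"], load["WIMAX"]), reward


def q_learning(episodes=2000, steps=50, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration; one call arrives per step."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated value; missing entries count as 0.0
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(steps):
            acts = valid_actions(state)
            if rng.random() < epsilon:
                action = rng.choice(acts)
            else:
                action = max(acts, key=lambda a: q.get((state, a), 0.0))
            next_state, reward = step(state, action, rng)
            best_next = max(
                q.get((next_state, a), 0.0) for a in valid_actions(next_state)
            )
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
    return q


def greedy_policy(q, state):
    """Pick the highest-valued valid action for a state from the learned table."""
    return max(valid_actions(state), key=lambda a: q.get((state, a), 0.0))
```

After training, `greedy_policy(q, (wlan_calls, wimax_calls))` returns the learned choice for a given load. Restricting the action set to networks with spare capacity is a design choice of this sketch; a fuller model would also distinguish traffic classes and handoff calls.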
