Semi-Supervised Classification Based on Regularization of Minimum Entropy

Liu Xiaolan (刘小兰), Hao Zhifeng (郝志峰), Yang Xiaowei (杨晓伟), Ma Xianheng (马献恒)

doi:10.3969/j.issn.1000-565X.2010.01.017

Journal of South China University of Technology (Natural Science Edition)

2010, Vol. 38, Issue 1: 87-91


Computer Science and Technology


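1. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China; 2. School of Science, South China University of Technology, Guangzhou 510640, Guangdong, China; 3. Faculty of Computer, Guangdong University of Technology, Guangzhou 510090, Guangdong, China; 4. School of Software Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China

Liu Xiaolan (刘小兰, born 1979), female, doctoral candidate and lecturer, works mainly on artificial intelligence and machine learning.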

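Received: 2008-12-30

Revised: 2009-04-26

Published online: 2010-01-25

Funding

Supported by the Guangdong Province-Ministry of Education Industry-University-Research Cooperation Project (2007B090400031) and the Guangdong Provincial Science and Technology Plan Project (2008B080701005)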


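Abstract (Chinese version)

We first analyze why the conditional Havrda-Charvat's structural α-entropy is a good clustering criterion. Then, on the premise that a good clustering criterion also characterizes unlabeled data well, we propose a semi-supervised classification model based on minimum entropy regularization and solve it with a quasi-Newton method. The algorithm is both discriminative and transductive, which reduces its dependence on the model, and it can also conveniently predict the labels of examples outside the training set. Test results on UCI datasets verify the effectiveness of the algorithm.

Keywords: semi-supervised learning; conditional Havrda-Charvat's structural α-entropy; quasi-Newton method; classification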
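To make the clustering-criterion claim concrete, the sketch below evaluates the Havrda-Charvat structural α-entropy of a class posterior. It uses the standard definition from the literature, H_α(p) = (Σ_i p_i^α − 1) / (2^(1−α) − 1) for α > 0, α ≠ 1; the exact conditional form used in the paper is not given in this abstract, so the function names `hc_entropy` and `shannon_bits` and the example distributions are illustrative assumptions. The point is only that a confident posterior scores lower than an ambiguous one, which is what makes the entropy usable as a penalty on unlabeled points.

```python
import math

def hc_entropy(probs, alpha):
    """Havrda-Charvat structural alpha-entropy of a discrete distribution.

    H_alpha(p) = (sum_i p_i**alpha - 1) / (2**(1 - alpha) - 1),
    defined for alpha > 0 and alpha != 1.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    return (sum(p ** alpha for p in probs) - 1.0) / (2.0 ** (1.0 - alpha) - 1.0)

def shannon_bits(probs):
    """Shannon entropy in bits, the alpha -> 1 limit of H_alpha."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform = [0.5, 0.5]    # ambiguous posterior: the point sits between classes
confident = [0.9, 0.1]  # confident posterior: the point sits inside a cluster

# For each alpha, the confident posterior has the lower entropy.
for alpha in (0.5, 2.0):
    print(alpha, hc_entropy(uniform, alpha), hc_entropy(confident, alpha))

# Near alpha = 1 the value approaches the Shannon entropy in bits.
print(hc_entropy(confident, 1.0001), shannon_bits(confident))
```

With α = 2 the expression reduces to 2(1 − Σ_i p_i²), a quadratic entropy that is cheap to differentiate.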

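Cite this article

Liu Xiaolan, Hao Zhifeng, Yang Xiaowei, Ma Xianheng. Semi-Supervised Classification Based on Regularization of Minimum Entropy[J]. Journal of South China University of Technology (Natural Science Edition), 2010, 38(1): 87-91. DOI: 10.3969/j.issn.1000-565X.2010.01.017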

Abstract

Because generative models must model a complex joint probability density and estimate many parameters, a discriminative semi-supervised classification algorithm based on minimum entropy regularization is proposed. The algorithm uses Havrda-Charvat's structural α-entropy as the regularization term of the objective and solves the objective with a quasi-Newton method, which makes it both discriminative and inductive and reduces its dependence on the model. The algorithm can also readily predict the labels of out-of-sample data points. Test results on several UCI datasets show that the proposed algorithm achieves a low classification error even with few labeled data.

Key words: semi-supervised learning; conditional Havrda-Charvat's structural α-entropy; regularization; pattern classification; quasi-Newton method
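The abstract does not spell out the objective or the quasi-Newton variant, so the following is only a minimal sketch of the general idea under stated assumptions: a binary logistic model, an objective of the form (labeled log-loss) + λ · (mean α-entropy of the unlabeled posteriors) + a small L2 penalty, and a hand-written BFGS with finite-difference gradients. The names `prob`, `objective`, `num_grad`, `bfgs`, the values of `lam`, `alpha`, `l2`, and the toy data are all hypothetical and not taken from the paper.

```python
import math

def hc_entropy(probs, alpha=2.0):
    """Havrda-Charvat structural alpha-entropy (alpha > 0, alpha != 1)."""
    return (sum(p ** alpha for p in probs) - 1.0) / (2.0 ** (1.0 - alpha) - 1.0)

def prob(theta, x):
    """P(y=1 | x) for a logistic model; theta = weights followed by bias."""
    z = sum(w * xi for w, xi in zip(theta[:-1], x)) + theta[-1]
    p = 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))
    return min(max(p, 1e-12), 1.0 - 1e-12)  # clip so log() stays finite

def objective(theta, labeled, unlabeled, lam=0.5, alpha=2.0, l2=1e-2):
    """Assumed objective: log-loss + lam * mean unlabeled entropy + L2 penalty."""
    loss = 0.0
    for x, y in labeled:
        p = prob(theta, x)
        loss -= math.log(p if y == 1 else 1.0 - p)
    ent = sum(hc_entropy([prob(theta, x), 1.0 - prob(theta, x)], alpha)
              for x in unlabeled) / len(unlabeled)
    return loss + lam * ent + l2 * sum(w * w for w in theta[:-1])

def num_grad(f, theta, h=1e-6):
    """Central finite-difference gradient (kept simple for the sketch)."""
    grad = []
    for i in range(len(theta)):
        up, dn = theta[:], theta[:]
        up[i] += h
        dn[i] -= h
        grad.append((f(up) - f(dn)) / (2.0 * h))
    return grad

def bfgs(f, theta0, iters=100, tol=1e-6):
    """Plain BFGS with an inverse-Hessian estimate and Armijo backtracking."""
    n = len(theta0)
    theta = theta0[:]
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = num_grad(f, theta)
    for _ in range(iters):
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        d = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        slope = sum(gi * di for gi, di in zip(g, d))
        t, f0 = 1.0, f(theta)
        while t > 1e-10 and f([a + t * di for a, di in zip(theta, d)]) > f0 + 1e-4 * t * slope:
            t *= 0.5  # shrink the step until sufficient decrease holds
        new = [a + t * di for a, di in zip(theta, d)]
        g_new = num_grad(f, new)
        s = [a - b for a, b in zip(new, theta)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-12:  # skip the update when the curvature condition fails
            rho = 1.0 / sy
            Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
            yHy = sum(yi * hyi for yi, hyi in zip(y, Hy))
            for i in range(n):
                for j in range(n):
                    H[i][j] += ((1.0 + rho * yHy) * rho * s[i] * s[j]
                                - rho * (Hy[i] * s[j] + s[i] * Hy[j]))
        theta, g = new, g_new
    return theta

# Toy 1-D data: two labeled points and six unlabeled points in two clusters.
labeled = [([-2.0], 0), ([2.0], 1)]
unlabeled = [[-2.5], [-1.8], [-2.2], [1.9], [2.4], [2.1]]
f = lambda th: objective(th, labeled, unlabeled)
theta0 = [0.0, 0.0]
theta = bfgs(f, theta0)
print("objective before/after:", f(theta0), f(theta))
print("P(y=1 | x=3), P(y=1 | x=-3):", prob(theta, [3.0]), prob(theta, [-3.0]))
```

Because the fitted model is a parametric classifier, labels for points outside the training set follow from a single call to `prob`, which matches the out-of-sample prediction property the abstract describes. In practice an analytic gradient and a library optimizer would replace `num_grad` and `bfgs`.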
