Journal of South China University of Technology (Natural Science Edition) ›› 2011, Vol. 39 ›› Issue (5): 108-114. doi: 10.3969/j.issn.1000-565X.2011.05.019

• Computer Science and Technology •

Multi-Step Bridged Refinement for Transfer Learning

Qin Jiang-wei  Zheng Qi-lun  Ma Qian-li  Wei Jia  Lin Gu-li   

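  1. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China
  • Received: 2010-10-22 Revised: 2010-12-08 Online: 2011-05-25 Published: 2011-04-01
  • Contact: Qin Jiang-wei, E-mail: qjw128@gmail.com
  • About author: Qin Jiang-wei (born 1984), male, Ph.D. candidate, whose research focuses on machine learning and text mining
  • Supported by:

    Natural Science Foundation of Guangdong Province (9451064101003233); Key Science and Technology Research Project of Guangdong Province (2007B010200044); Fundamental Research Funds for the Central Universities, South China University of Technology (2009ZM0125, 2009ZM0189)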

Abstract:

Traditional machine learning methods assume that training and test data follow the same distribution, but this assumption does not hold in many practical applications. To address this problem, a non-parametric transfer learning algorithm named Multi-Step Bridged Refinement is proposed. The algorithm first constructs a series of intermediate models to bridge the different domains and then performs label propagation between neighboring models, thereby transferring discriminative information from the source domain to the target domain. Experimental results show that models with similar distributions make the transfer smooth and make the refinement results insensitive to the initial labels, and that the proposed algorithm achieves higher classification accuracy than the other algorithms compared.

Key words: Transfer Learning, Label Propagation, Text Classification, Mutual Refinement
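
The idea summarized in the abstract, a chain of intermediate models with label propagation between neighbors, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not specify how the intermediate models are built or how labels are propagated, so everything below is an assumption made for illustration. In particular, the intermediate "models" are taken to be sample pools that shift gradually from source data to target data, propagation is a simple k-nearest-neighbor score averaging, and the names `knn_weights`, `refine`, and `multi_step_refine` as well as the parameters `steps`, `k`, `alpha`, and `iters` are hypothetical.

```python
import numpy as np


def knn_weights(X, k):
    """Row-normalised kNN affinity matrix using squared Euclidean distance."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)            # a point is not its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]     # indices of the k nearest points
    W = np.zeros_like(d2)
    W[np.arange(len(X))[:, None], idx] = 1.0 / k
    return W


def refine(X, scores, k=5, alpha=0.7, iters=20):
    """Propagate class scores over the kNN graph (assumed update rule):
    F <- alpha * W @ F + (1 - alpha) * F0."""
    W = knn_weights(X, k)
    F = scores.copy()
    for _ in range(iters):
        F = alpha * (W @ F) + (1 - alpha) * scores
    return F


def multi_step_refine(Xs, ys, Xt, n_classes, steps=4, k=5, seed=0):
    """Refine scores on a chain of pools that move from source to target.

    Assumption: pool number `step` holds a shrinking share of the source
    samples and a growing share of the target samples; the last pool
    contains target samples only.
    """
    rng = np.random.default_rng(seed)
    ns, nt = len(Xs), len(Xt)
    X = np.vstack([Xs, Xt])
    F = np.zeros((ns + nt, n_classes))
    F[np.arange(ns), ys] = 1.0               # source labels are known
    src_order = rng.permutation(ns)
    tgt_order = ns + rng.permutation(nt)
    for step in range(1, steps + 1):
        frac = step / steps
        n_src = int(round(ns * (1 - frac)))
        n_tgt = int(round(nt * frac))
        pool = np.concatenate([src_order[:n_src], tgt_order[:n_tgt]])
        F[pool] = refine(X[pool], F[pool], k=k)   # scores carry over to the next pool
    return F[ns:].argmax(axis=1)             # predicted target labels
```

Calling `multi_step_refine(Xs, ys, Xt, n_classes)` with integer source labels `ys` returns one predicted label per target sample. Each pool must contain more than `k` samples. In this sketch the target scores start at zero; a variant that starts from the predictions of a base classifier would be closer to the "initial label" setting the abstract mentions.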