Journal of South China University of Technology (Natural Science Edition), 2018, Vol. 46, Issue (1): 122-130. doi: 10.3969/j.issn.1000-565X.2018.01.016

• Computer Science & Technology •

Transfer Learning for Classification on Imbalanced Data

CHEN Qiong, XU Yangyang, CHEN Linqing

  1. School of Computer Science and Engineering, South China University of Technology
  • Received: 2016-12-27 Revised: 2017-03-24 Online: 2018-01-25 Published: 2017-12-01
  • Contact: CHEN Qiong, E-mail: csqchen@scut.edu.cn
  • About author: CHEN Qiong (born 1966), female, associate professor; her research focuses on artificial intelligence, machine learning, and intelligent computing
  • Supported by:
    The National Natural Science Foundation of China (61573145) and the Natural Science Foundation of Guangdong Province of China (2015A030308018)

Abstract: Traditional classification algorithms, which assume balanced data, face challenges as the data distribution becomes increasingly imbalanced. Transfer learning can address the problem of imbalanced data distribution by using relevant auxiliary data sets to compensate for the imbalanced target data set. In this paper, we propose UnbalancedTrAdaboost (UBTA), a binary classification algorithm based on TrAdaboost, which calculates the weights of weak classifiers using the AUPRC (Area Under the Precision-Recall Curve) of the different classes and updates the weights of misclassified data of different classes with different mechanisms. Because AUC is insensitive to changes in class distribution, combining it with G-mean and BER gives a more accurate evaluation of imbalanced classification. The results on these three metrics indicate that the UBTA algorithm achieves better performance on imbalanced data, correctly classifying more minority instances while maintaining high accuracy on majority instances.
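
To illustrate why G-mean and BER are used for imbalanced data, the following is a minimal sketch in Python. The abstract does not define the two metrics, so the code assumes their commonly used forms: G-mean as the geometric mean of the true positive rate and true negative rate, and BER (balanced error rate) as the average of the per-class error rates. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FN, FP, TN for a binary problem (positive = minority class)."""
    tp = fn = fp = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, fp, tn


def g_mean(y_true, y_pred, positive=1):
    """Assumed definition: sqrt(true positive rate * true negative rate)."""
    tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return math.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))


def balanced_error_rate(y_true, y_pred, positive=1):
    """Assumed definition: mean of the error rates of the two classes."""
    tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return 0.5 * (fn / (tp + fn) + fp / (tn + fp))


# Toy data: 2 minority (1) and 8 majority (0) instances.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier that always predicts the majority class.
all_majority = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, all_majority)) / len(y_true)

# A classifier that finds one minority instance at the cost of one false alarm.
mixed = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

print(accuracy, g_mean(y_true, all_majority), balanced_error_rate(y_true, all_majority))
print(g_mean(y_true, mixed), balanced_error_rate(y_true, mixed))
```

In this toy case the always-majority classifier reaches 80% plain accuracy while its G-mean is 0 and its BER is 0.5, which is the kind of failure on minority instances that plain accuracy hides and that these metrics expose.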

Key words: Imbalanced Data, Classification, Transfer Learning, Precision-Recall Curve
