Journal of South China University of Technology (Natural Science Edition) ›› 2005, Vol. 33 ›› Issue (9): 25-29,34.

Chinese Question Classification Based on Support Vector Machine

Yu Zheng-tao1  Fan Xiao-zhong2  Guo Jian-yi1   

  1. School of Information Engineering and Automation, Kunming Univ. of Sci. and Tech., Kunming 650051, Yunnan, China; 2. Dept. of Computer Science and Engineering, Beijing Institute of Tech., Beijing 100081, China
  • Received:2004-11-22 Online:2005-09-25 Published:2005-09-25
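  • Contact: Yu Zheng-tao (born 1970), male, associate professor, part-time doctoral candidate at Beijing Institute of Technology; research interests: natural language processing, Chinese question-answering systems, and information extraction. E-mail: ztyu@bit.edu.cn
  • About author: Yu Zheng-tao (born 1970), male, associate professor, part-time doctoral candidate at Beijing Institute of Technology; research interests: natural language processing, Chinese question-answering systems, and information extraction.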
  • Supported by:

    Project supported by the Information Technology Fund of Yunnan Province (2002IT03)

Abstract:

At present, Chinese question classification is commonly based on combinatorial rules between interrogatives and their related words. Because the extraction of these rules depends greatly on linguistic knowledge, and not all combinatorial rules can be enumerated, the classification performance is unsatisfactory. As the SVM (Support Vector Machine), a machine learning method based on statistical learning theory, discriminates well when classifying small samples, this paper establishes a question classification model based on SVM after analyzing and defining Chinese question types. The feature selection process for question classification is then described in detail. Finally, a question classification experiment is carried out in which the corresponding semantic features are introduced on top of the syntactic features. A classification accuracy of 88.7% is achieved, which indicates that Chinese questions can be classified well by SVM when syntactic and semantic features are combined.

Key words: question-answering system, question classification, support vector machine, syntactic feature, semantic feature