Journal of South China University of Technology (Natural Science Edition) ›› 2017, Vol. 45 ›› Issue (3): 68-75. doi: 10.3969/j.issn.1000-565X.2017.03.010

• Computer Science & Technology •

Convolutional Network-Based Semantic Similarity Model of Sentences

HUANG Jiang-ping, JI Dong-hong

  1. Computer School, Wuhan University, Wuhan 430072, Hubei, China
  • Received: 2016-06-12 Revised: 2016-11-28 Online: 2017-03-25 Published: 2017-02-02
  • Contact: HUANG Jiang-ping (born 1985), male, Ph.D. candidate, mainly engaged in research on natural language processing and machine learning. E-mail: hjp@whu.edu.cn
  • About author: HUANG Jiang-ping (born 1985), male, Ph.D. candidate, mainly engaged in research on natural language processing and machine learning.
  • Supported by:
    Supported by the Key Program of the National Natural Science Foundation of China (61133012), the National Natural Science Foundation of China (61173062, 61373108) and the National Planning Office of Philosophy and Social Science (11&ZD189)

Abstract:

Computing the semantic similarity between two sentences is an important research issue in natural language processing, and constructing an effective semantic model of sentences is the core task for paraphrase identification, textual similarity computation, question answering and textual entailment. In this paper, a parallel convolutional neural network model is proposed to represent sentences as fixed-length vectors, and a similarity layer is used to measure the similarity of sentence pairs. Two tasks, namely paraphrase identification and textual similarity testing, are then used to evaluate the proposed model. Experimental results show that the proposed model captures the semantic information of sentences effectively: in comparison with the state-of-the-art baseline, it improves the F1-score in paraphrase identification by 7.4 percentage points, and in comparison with the logistic regression method, it improves the Pearson correlation coefficient in semantic similarity by 7.1 percentage points.
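
The general idea of encoding two sentences with convolutional branches and comparing the resulting fixed-length vectors can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the architecture details, so the shared filters, the tanh activation, max-over-time pooling, cosine similarity as the similarity layer, the random stand-in embeddings, and every name and size below (`conv_max_pool`, `cosine_similarity`, `dim`, `n_filters`, `width`) are assumptions made for illustration only.

```python
import numpy as np


def conv_max_pool(embeddings, filters, bias):
    """Map a (n_words, dim) sentence matrix to a fixed-length vector.

    `filters` has shape (n_filters, width, dim). Each filter slides over
    windows of `width` consecutive words, and max-over-time pooling keeps
    one value per filter, so the output length is n_filters regardless of
    the sentence length.
    """
    n_words, dim = embeddings.shape
    n_filters, width, _ = filters.shape
    # Pad short sentences with zero rows so that at least one window exists.
    if n_words < width:
        pad = np.zeros((width - n_words, dim))
        embeddings = np.vstack([embeddings, pad])
        n_words = width
    # windows: (n_windows, width, dim)
    windows = np.stack(
        [embeddings[i:i + width] for i in range(n_words - width + 1)]
    )
    # feature_map: (n_windows, n_filters)
    feature_map = np.tanh(np.einsum("nwd,fwd->nf", windows, filters) + bias)
    return feature_map.max(axis=0)


def cosine_similarity(u, v, eps=1e-8):
    """Assumed similarity layer: cosine of the angle between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + eps))


# Hypothetical sizes and untrained random parameters, for illustration only.
rng = np.random.default_rng(0)
dim, n_filters, width = 8, 16, 3
filters = rng.normal(size=(n_filters, width, dim))
bias = np.zeros(n_filters)

# Two "sentences" of different lengths, as random stand-in word embeddings.
sent_a = rng.normal(size=(5, dim))
sent_b = rng.normal(size=(9, dim))

# Both branches use the same filters (an assumed weight-sharing choice).
vec_a = conv_max_pool(sent_a, filters, bias)
vec_b = conv_max_pool(sent_b, filters, bias)
score = cosine_similarity(vec_a, vec_b)
```

In this sketch, sentences of 5 and 9 words both map to vectors of length `n_filters`, which is what makes a pairwise similarity score straightforward to compute. A trained model would learn the filters from labeled sentence pairs rather than drawing them at random.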

Key words: convolutional network, paraphrase identification, sentence model, semantic similarity

CLC Number: