Journal of South China University of Technology (Natural Science Edition) ›› 2021, Vol. 49 ›› Issue (1): 10-17. doi: 10.12141/j.issn.1000-565X.200506

Special Issue: Computer Science and Technology, 2021

• Computer Science & Technology •

Drug-Drug Interaction Extraction Model Combining Category Keywords with Attention Mechanism 

IKA Novita Dewi, CAI Xiaoling, LIU Xiaofeng, DONG Shoubin

  1. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China
  • Received: 2020-08-24 Revised: 2020-12-15 Online: 2021-01-25 Published: 2021-01-01
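  • Contact: DONG Shoubin (born 1967), female, professor, mainly engaged in research on information retrieval, natural language processing, and high-performance computing. E-mail: sbdong@scut.edu.cn
  • About author: IKA Novita Dewi (born 1987), female, Ph.D. candidate, mainly engaged in research on natural language processing. E-mail: sbdong@scut.edu.cn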
  • Supported by: the National Natural Science Foundation of China (61976239)

Abstract: A drug-drug interaction (DDI) extraction model, KA-BERT, which combines category keywords with an attention mechanism, is proposed to sharpen the distinction between categories of data and improve classifier performance. First, keywords for each class are selected using the chi-square test and document frequency. Then, position encodings of the keywords and of the drug pair are added to the pre-trained BERT model to make the differences between samples more salient, and an attention mechanism learns the distribution of keywords and other words in the sentence to further improve performance. To address the excess of negative samples in DDI extraction experiments, a negative-sample filtering method based on rules and patterns is proposed to reduce the imbalance between positive and negative samples. Compared with other DDI models based on CNN, LSTM, and BERT, KA-BERT achieves better performance on DDI data, which demonstrates its effectiveness. Tests on chemical-protein relation extraction show that the precision, recall, and F1 score of KA-BERT improve significantly, which further demonstrates the validity and generality of the model.

Key words: drug interaction, category keywords, attention mechanism
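
The keyword selection step described in the abstract (chi-square test plus document frequency) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `chi_square` and `select_keywords`, the `top_k` and `min_df` parameters, the class labels, and the toy sentences are all assumptions made for illustration. The sketch scores each term against each class with the standard 2x2 chi-square statistic, drops terms below a document-frequency threshold, and keeps only terms that are positively associated with the class.

```python
from collections import Counter, defaultdict


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/class contingency table.

    a: sentences in the class that contain the term
    b: sentences outside the class that contain the term
    c: sentences in the class that lack the term
    d: sentences outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def select_keywords(sentences, labels, top_k=1, min_df=2):
    """Return the top_k chi-square keywords per class (hypothetical helper).

    Terms seen in fewer than min_df sentences are ignored, and only terms
    that occur more often inside the class than expected are kept.
    """
    n_docs = len(sentences)
    class_sizes = Counter(labels)
    df = Counter()                      # document frequency per term
    df_in_class = defaultdict(Counter)  # document frequency per class and term
    for tokens, label in zip(sentences, labels):
        for term in set(tokens):
            df[term] += 1
            df_in_class[label][term] += 1

    keywords = {}
    for label, size in class_sizes.items():
        scored = []
        for term, total in df.items():
            if total < min_df:
                continue
            a = df_in_class[label][term]
            b = total - a
            c = size - a
            d = n_docs - size - b
            if a * d <= b * c:  # not positively associated with this class
                continue
            scored.append((chi_square(a, b, c, d), term))
        # Highest score first; ties broken alphabetically for determinism.
        scored.sort(key=lambda item: (-item[0], item[1]))
        keywords[label] = [term for _, term in scored[:top_k]]
    return keywords


# Toy data with made-up labels; drug mentions are masked as DRUG1/DRUG2.
sentences = [
    "DRUG1 enhances the effect of DRUG2".split(),
    "DRUG1 enhances DRUG2 toxicity".split(),
    "avoid DRUG1 with DRUG2".split(),
    "avoid combining DRUG1 and DRUG2".split(),
    "DRUG1 and DRUG2 were studied".split(),
    "DRUG1 and DRUG2 were measured".split(),
]
labels = ["effect", "effect", "advice", "advice", "none", "none"]

keywords = select_keywords(sentences, labels, top_k=1, min_df=2)
print(keywords)
```

On this toy data the masked drug tokens appear in every sentence, so they carry no class information and are filtered out, while class-specific verbs such as "enhances" and "avoid" are retained. How the paper combines the chi-square score with document frequency in detail is not stated in the abstract, so the simple threshold used here is only one plausible choice.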