Journal of South China University of Technology (Natural Science Edition), 2017, Vol. 45, Issue (11): 106-111. doi: 10.3969/j.issn.1000-565X.2017.11.015

• Biology Science •

Evaluation of Infection Possibility of Host Cell by Virus on the Basis of Sequence Alignment-Free Comparison

LIU Xue-mei(1), ZANG Xiang(1), HUANG Tian-lai(1), YANG Zhe(1,2), LI Wen(1), YE Yu-zhong(1), HU Shan(3)

  1. School of Physics and Optoelectronics, South China University of Technology, Guangzhou 510640, Guangdong, China; 2. ICBC Guangzhou Dongcheng Branch, Guangzhou 510100, Guangdong, China; 3. Department of Biomedical Engineering, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510275, Guangdong, China
  • Received: 2017-01-13  Revised: 2017-07-09  Online: 2017-11-25  Published: 2017-10-01
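  • Contact: HU Shan (born 1972), female, Ph.D., lecturer, mainly engaged in bioinformatics research. E-mail: hushan@mail.sysu.edu.cn
  • About author: LIU Xue-mei (born 1975), female, Ph.D., associate professor, mainly engaged in bioinformatics research. E-mail: liuxm@scut.edu.cn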
  • Supported by: the National Natural Science Foundation of China for Young Scientists (11205061, 11205062)

Abstract: A virus and its host cell tend to share similar word patterns (k-tuples). Similarity scores computed from these word patterns between the DNA sequence of a virus and that of its host cell are often higher than the scores between the virus and randomly chosen host cells; that is, the DNA sequence of the virus resembles that of its host. On the basis of this principle, this paper adopts two alignment-free statistics, $D_2^S$ and $D_2^*$, to score the DNA sequence of a virus against that of a host cell. The scores are then compared with a threshold to judge whether the virus can infect the host cell. Experimental results show that, when k = 5 (k is the size of the k-tuple) and the Markov order is 1, both statistics can describe the gene-level similarity between the virus and its host cell, and that the optimal thresholds of $D_2^S$ and $D_2^*$ obtained from the ROC (Receiver Operating Characteristic) curves can be used to judge whether the virus can infect the host cell.
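The scoring-and-threshold procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the definitions of the two statistics that are commonly used in the alignment-free literature (usually written D2S and D2*), in which each k-tuple count is centered by its expected count under a Markov background model fitted to the same sequence. The function names, the add-one pseudocounts, the single-strand counting, and the use of Youden's J to pick the "optimal" ROC threshold are all assumptions of this sketch; the abstract does not specify them.

```python
from collections import Counter
from itertools import product
from math import sqrt

ALPHABET = "ACGT"

def kmer_counts(seq, k):
    """Count overlapping k-tuples that consist only of A, C, G, T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= set(ALPHABET):
            counts[word] += 1
    return counts

def markov1_word_prob(seq):
    """Return a function giving P(word) under an order-1 Markov model fitted to seq."""
    mono, di = kmer_counts(seq, 1), kmer_counts(seq, 2)
    total = sum(mono.values())
    # Add-one pseudocounts keep every probability positive (assumption of this sketch).
    start = {a: (mono[a] + 1) / (total + 4) for a in ALPHABET}
    trans = {}
    for a in ALPHABET:
        row = sum(di[a + b] for b in ALPHABET)
        for b in ALPHABET:
            trans[a + b] = (di[a + b] + 1) / (row + 4)

    def prob(word):
        p = start[word[0]]
        for a, b in zip(word, word[1:]):
            p *= trans[a + b]
        return p

    return prob

def d2_statistics(seq_x, seq_y, k=5):
    """Return (D2S, D2*) for two sequences using centered k-tuple counts."""
    cx, cy = kmer_counts(seq_x, k), kmer_counts(seq_y, k)
    nx, ny = sum(cx.values()), sum(cy.values())
    if nx == 0 or ny == 0:
        raise ValueError("each sequence must contain at least one valid k-tuple")
    px, py = markov1_word_prob(seq_x), markov1_word_prob(seq_y)
    d2s = d2star = 0.0
    for letters in product(ALPHABET, repeat=k):
        w = "".join(letters)
        ex, ey = nx * px(w), ny * py(w)      # expected counts under each background
        xt, yt = cx[w] - ex, cy[w] - ey      # centered counts
        d2star += xt * yt / sqrt(ex * ey)
        norm = sqrt(xt * xt + yt * yt)
        if norm > 0:
            d2s += xt * yt / norm
    return d2s, d2star

def youden_threshold(scores, labels):
    """Pick the cutoff maximising sensitivity + specificity - 1 (Youden's J)."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= t)
        fp = sum(1 for s, y in zip(scores, labels) if not y and s >= t)
        j = tp / pos - fp / neg
        if j > best_j:
            best_t, best_j = t, j
    return best_t
```

In use, one would score each known virus-host pair and each random pair with `d2_statistics`, label them 1 and 0 respectively, and pass the scores for one statistic to `youden_threshold`. A new pair would then be predicted as infecting when its score is at or above that cutoff. Real analyses often also count k-tuples on the reverse-complement strand and normalize the scores, which this sketch omits.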

Key words: bioinformatics, virus, host cell, sequence alignment-free comparison
