Journal of South China University of Technology (Natural Science Edition) ›› 2007, Vol. 35 ›› Issue (8): 123-126.

• Mathematical Sciences •

Recognition of DNA Sequences Based on Hidden Markov Models

Luo Ze-ju  Li Yan-hui2  Song Li-hong  Zhu Si-ming2

  1. School of Computer Science and Information Engineering, Chongqing Tech. and Business Univ., Chongqing 400067, China; 2. School of Mathematics and Computational Science, Sun Yat-Sen Univ., Guangzhou 510275, Guangdong, China; 3. Center of Experiment and Practice, Chongqing Tech. and Business Univ., Chongqing 400067, China
  • Received:2006-08-26 Online:2007-08-25 Published:2007-08-25
  • Contact: Luo Ze-ju, E-mail: luozeju@126.com
  • About author: Luo Ze-ju (born 1965), male, Ph.D., mainly engaged in research on machine learning and bioinformatics
  • Supported by:

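    National Natural Science Foundation of China (10371135); Science and Technology Research Project of the Chongqing Municipal Education Commission (KJ0707022)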

Abstract:

Based on how the distribution of the L value varies with DNA sequence structure during hidden Markov model (HMM) training, and by improving the traditional multiclass voting model, a fast training algorithm superior to the traditional one is proposed to recognize the introns and exons of DNA sequences. The proposed algorithm only needs to train the HMM parameters of a single class, and its average accuracy reaches 90.8%. Compared with the support vector machine, the proposed HMM is more feasible for multiclass classification, with a lower time cost and a higher recognition rate.

Key words: hidden Markov model, DNA sequence, intron, exon, recognition, fast training algorithm
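
The idea of scoring sequences against a single trained HMM can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not define the L value, so the sketch assumes a likelihood-type score (the per-symbol log-likelihood from the forward algorithm) and a fixed decision threshold. The toy parameters in `TOY_MODEL`, the threshold, and the mapping of high scores to "exon" are all hypothetical.

```python
import math

SYMBOLS = "ACGT"

# Hypothetical two-state toy model (start, transition, emission probabilities).
# State 0 favours C/G, state 1 favours A/T; values are illustrative only.
TOY_MODEL = (
    [0.5, 0.5],
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.15, 0.35, 0.35, 0.15], [0.30, 0.20, 0.20, 0.30]],
)


def forward_log_likelihood(seq, start, trans, emit):
    """Return log P(seq | model) using the scaled forward algorithm."""
    n_states = len(start)
    first = SYMBOLS.index(seq[0])
    alpha = [start[s] * emit[s][first] for s in range(n_states)]
    scale = sum(alpha)
    log_lik = math.log(scale)
    alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    for ch in seq[1:]:
        obs = SYMBOLS.index(ch)
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs]
            for j in range(n_states)
        ]
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def classify(seq, model, threshold):
    """Assumed decision rule: compare the per-symbol score with a threshold."""
    score = forward_log_likelihood(seq, *model) / len(seq)
    return "exon" if score >= threshold else "intron"


if __name__ == "__main__":
    for s in ("GCGCGCGC", "ATATATAT"):
        print(s, classify(s, TOY_MODEL, threshold=-1.3))
```

Dividing the log-likelihood by the sequence length keeps scores comparable across sequences of different lengths. In practice the model parameters would come from training and the threshold from the observed score distributions of the two classes.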