Information Extraction from Chinese Research Papers Based on Conditional Random Fields

Journal of South China University of Technology (Natural Science Edition), 2007, Vol. 35, Issue 9: 90-94, 106.

Computer Science & Technology

Yu Jiang-de, Fan Xiao-zhong, Yin Ji-hao

School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China

Received: 2006-11-27  Online: 2007-09-25  Published: 2007-09-25
Contact: Yu Jiang-de (born 1971), male, Ph.D. candidate; his research focuses on natural language processing, information extraction, and information retrieval. E-mail: jangder@bit.edu.cn
Supported by: the Doctoral Program Fund of the Ministry of Education (Grant No. 20050007023)

Abstract

Header and citation information from research papers is necessary for many applications, such as field-based paper search, paper statistics, and citation analysis. The hidden Markov model (HMM) greatly restricts the use of context features in information extraction. To make better use of these features, a method based on conditional random fields (CRFs) is proposed to extract header and citation information from Chinese research papers. The key elements of the method are parameter estimation and feature selection: the L-BFGS algorithm is used to estimate the model parameters in the experiments, and four categories of features (location, layout, lexicon, and state transition) are selected as the feature set of the model. During extraction, format information about list separators and special labels is first used to segment the text, and CRFs are then applied to extraction in specific fields. Experimental results show that the proposed method outperforms the HMM-based method, and that the performance improvement varies with the feature set.
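
The segmentation and feature steps described in the abstract could look roughly like the following minimal sketch. It is not the authors' implementation: the separator set, the function names `segment` and `features`, and the individual feature names are all assumptions made for illustration, since the abstract names only the feature categories. State-transition features are omitted because they are defined over pairs of adjacent labels inside the CRF itself.

```python
import re

# Hypothetical separator set; the abstract does not list the actual separators.
SEPARATORS = re.compile(r"[.,;:\[\]()]")


def segment(text):
    """Split text on separator characters and drop empty pieces."""
    return [piece.strip() for piece in SEPARATORS.split(text) if piece.strip()]


def features(segments, index):
    """Build an illustrative feature dict for one segment.

    The feature names are assumptions, grouped by the categories
    named in the abstract (location, layout, lexicon).
    """
    seg = segments[index]
    last = len(segments) - 1
    return {
        # Location: where the segment sits in the sequence.
        "position": index / max(last, 1),
        "is_first": index == 0,
        "is_last": index == last,
        # Layout: surface shape of the segment.
        "all_digits": seg.isdigit(),
        "has_digit": any(ch.isdigit() for ch in seg),
        "length": len(seg),
        # Lexicon: a simple word-based cue.
        "first_token": seg.split()[0].lower(),
    }


citation = "Yu Jiang-de. Information Extraction[J]. Journal of X, 2007, 35(9): 90-94"
segments = segment(citation)
feature_rows = [features(segments, i) for i in range(len(segments))]
```

Each dictionary in `feature_rows` would then serve as the observation for one position in the sequence that a CRF toolkit labels; training such a model (for example with L-BFGS, as the abstract mentions) is outside the scope of this sketch.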

Key words: information extraction, conditional random field, citation information, paper header information

Yu Jiang-de, Fan Xiao-zhong, Yin Ji-hao. Information Extraction from Chinese Research Papers Based on Conditional Random Fields[J]. Journal of South China University of Technology (Natural Science Edition), 2007, 35(9): 90-94, 106.