Journal of South China University of Technology (Natural Science)
A Plagiarism Detection Method Based on Semantic Matching
Received date: 2013-03-10
Online published: 2013-06-01
Supported by the National Natural Science Foundation of China (61070092)
Most existing plagiarism detection methods use similarity measures to determine whether one document plagiarizes another. Unlike common duplicate detection, plagiarism detection may identify even a small segment of duplicated text without any reference as plagiarism. However, owing to the effects of document size, duplicate text length, and interference, the existing plagiarism detection methods all perform relatively poorly. To solve this problem, the relationship between text semantics and fingerprint order is analyzed, and a semantic matching method is proposed that projects the fingerprint vector into a binary sequence, reducing the dimension while retaining the position information of the fingerprints. The method is then compared with the Jaccard distance method and the Hamming distance method in tests on the PAN public corpus. The results show that the proposed method achieves the highest recall and precision.
Key words: semantic matching; plagiarism detection; fingerprint; text semantics
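The two baseline measures named in the abstract, Jaccard distance over fingerprint sets and Hamming distance over binary sequences, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the character k-gram fingerprinting, the MD5-based hash, the parameter `k`, and especially the parity-bit projection in `to_binary_sequence` are assumptions chosen only to show how an order-preserving binary sequence differs from an unordered fingerprint set. The abstract does not specify the actual projection.

```python
import hashlib


def fingerprints(text, k=5):
    """Return the ordered list of hashed character k-grams of `text`.

    Assumption: character k-grams hashed with MD5 stand in for whatever
    fingerprinting scheme the paper actually uses.
    """
    normalized = "".join(text.lower().split())
    grams = [normalized[i:i + k] for i in range(len(normalized) - k + 1)]
    return [int(hashlib.md5(g.encode("utf-8")).hexdigest()[:8], 16) for g in grams]


def jaccard_distance(fp_a, fp_b):
    """Jaccard distance between two fingerprint collections (order is ignored)."""
    set_a, set_b = set(fp_a), set(fp_b)
    if not set_a and not set_b:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(set_a | set_b)


def to_binary_sequence(fp):
    """Hypothetical projection: keep one bit (the parity) per fingerprint.

    The sequence keeps the original fingerprint order, so position
    information survives the dimension reduction.
    """
    return [value & 1 for value in fp]


def hamming_distance(bits_a, bits_b):
    """Number of positions at which two equal-length bit sequences differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(bits_a, bits_b))
```

For example, `jaccard_distance([1, 2, 3], [2, 3, 4])` evaluates to `0.5`, because the two sets share two of four distinct fingerprints, and `hamming_distance([1, 0, 1, 1], [1, 1, 1, 0])` evaluates to `2`. The Jaccard measure discards fingerprint order entirely, whereas a binary sequence compared position by position depends on it. That order dependence is the property the abstract attributes to the proposed semantic matching method.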
Zou Du, Chen Yu-qing, Zhang Ling. A Plagiarism Detection Method Based on Semantic Matching[J]. Journal of South China University of Technology (Natural Science), 2013, 41(7): 131-136. DOI: 10.3969/j.issn.1000-565X.2013.07.022