Survey on Document-Level Relation Extraction

ZHOU You-Hua; HUANG Han; LIU Hao-Long; HAO Zhi-Feng

doi:10.12141/j.issn.1000-565X.210152

Journal of South China University of Technology (Natural Science Edition)

2022, Vol. 50, Issue 4: 10-25

Computer Science and Technology

Survey on Document-Level Relation Extraction

ZHOU You-Hua, HUANG Han, LIU Hao-Long, HAO Zhi-Feng

1. School of Software Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China;
2. School of Mathematics and Big Data, Foshan University, Foshan 528225, Guangdong, China

ZHOU You-Hua (born 1986), male, Ph.D. candidate; his research focuses on big data auditing and knowledge graphs.

Received date: 2021-03-21

Revised date: 2021-08-09

Online published: 2021-08-21

Supported by

National Natural Science Foundation of China

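Abstract

Relation extraction is a fundamental research problem in natural language processing. Its results can feed downstream tasks such as knowledge graph construction, question answering, and semantic search, so it has broad application scenarios and significant research value. Relation extraction has produced a wealth of results in recent years, but the vast majority of studies are limited to sentence-level relation extraction. Research shows that a large number of relations cannot be extracted from a single sentence, and with the continued development of deep learning and natural language processing, document-level relation extraction faces a new round of opportunities and challenges. This paper classifies and organizes recent research progress on document-level relation extraction, distills a general technical roadmap for the task, analyzes the feature encoding and feature aggregation methods used in this research, introduces commonly used document-level relation extraction datasets and evaluation metrics, and discusses future research trends.

Keywords: document-level; relation extraction; feature encoding; feature aggregation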

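Cite this article

ZHOU You-Hua, HUANG Han, LIU Hao-Long, HAO Zhi-Feng. Survey on Document-Level Relation Extraction[J]. Journal of South China University of Technology (Natural Science Edition), 2022, 50(4): 10-25. DOI: 10.12141/j.issn.1000-565X.210152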

Abstract

Relation extraction (RE) is one of the most important information extraction tasks in NLP. Its results can be used in downstream tasks such as knowledge graph construction, knowledge base question answering, and semantic search, so RE has wide-ranging application scenarios and important research value. In recent years RE has achieved fruitful results, but most of them are limited to sentence-level RE, which focuses on extracting the relation between two mentions within a single sentence. Research shows that a large number of relations cannot be extracted from a single sentence, and with the development of deep learning and NLP, document-level RE faces new opportunities and challenges. This study reviews recent advances in document-level RE, summarizes a general technology roadmap for the task, and analyzes the encoding and aggregation methods used in this research. It also introduces the common datasets and evaluation metrics for the task and concludes with a forecast of future development trends.

Key words: document-level; relation extraction; encoding; aggregation
