Settings and Benchmark
Relation extraction has several settings, each targeting a different challenge in this area. The following figure illustrates the data format of each setting.
Sentence-Level Relation Extraction
This is the most classical relation extraction task. Given a sentence and two tagged entities in it, models need to classify which relation, from a predefined relation set, holds between the two entities.
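To make the input format concrete, here is a minimal sketch of one sentence-level instance. The `Instance` class, the `[H]`/`[T]` marker tokens, and the toy relation set are assumptions made for illustration only; they are not the format of any particular dataset or toolkit. The sketch shows only how the two tagged entities can be exposed to a classifier, not the classifier itself.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Toy predefined relation set (hypothetical labels, for illustration only).
RELATIONS = ["founder", "place_of_birth", "no_relation"]


@dataclass
class Instance:
    tokens: List[str]        # tokenized sentence
    head: Tuple[int, int]    # [start, end) token span of the head entity
    tail: Tuple[int, int]    # [start, end) token span of the tail entity


def mark_entities(inst: Instance) -> str:
    """Wrap both entities in marker tokens so a classifier can locate them."""
    out = []
    for i, tok in enumerate(inst.tokens):
        if i == inst.head[0]:
            out.append("[H]")
        if i == inst.tail[0]:
            out.append("[T]")
        out.append(tok)
        if i == inst.head[1] - 1:
            out.append("[/H]")
        if i == inst.tail[1] - 1:
            out.append("[/T]")
    return " ".join(out)


example = Instance("Bill Gates founded Microsoft .".split(), head=(0, 2), tail=(3, 4))
marked = mark_entities(example)
print(marked)
```

A sentence-level model would take the marked sentence as input and output exactly one label from `RELATIONS`.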
Two commonly used datasets for sentence-level relation extraction are SemEval-2010 Task 8 (paper / website) and TACRED (paper / website). We also provide a new dataset, Wiki80, which is derived from FewRel.
The statistics of the three datasets are as follows:
| Dataset | Relations | Instances |
|---|---|---|
| SemEval-2010 Task 8 | 9 | 6,647 |
| TACRED | 42 | 21,784 (excluding no_relation) |
For copyright reasons, we do not release the TACRED dataset. Please refer to the official site for details.
Bag-Level Relation Extraction
This setting comes with distant supervision. Annotating a large-scale relation extraction dataset is labor-intensive and expensive. On the other hand, existing knowledge graphs (KGs) such as Freebase and Wikidata already contain relation triples. By linking entities mentioned in the text to those in the KGs, we can use the existing KG annotations to label text automatically, which is called distant supervision.
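The labeling step can be sketched in a few lines. This is a minimal illustration that assumes entity linking has already been done; the toy `KG` dictionary, the `distant_label` helper, and the `"NA"` fallback label are hypothetical names chosen for this example, not part of any released dataset or toolkit.

```python
# Toy knowledge graph mapping (head, tail) -> relation. Illustrative entries only.
KG = {
    ("Bill Gates", "Microsoft"): "founder",
    ("Steve Jobs", "Apple"): "founder",
}

# Each sentence comes with its two entities already linked to KG entries
# (the entity linking itself is assumed to have happened upstream).
sentences = [
    ("Bill Gates founded Microsoft in 1975.", "Bill Gates", "Microsoft"),
    ("Bill Gates stepped down from the Microsoft board.", "Bill Gates", "Microsoft"),
    ("Steve Jobs visited Microsoft.", "Steve Jobs", "Microsoft"),
]


def distant_label(sentences, kg, default="NA"):
    """Assign each sentence the KG relation of its entity pair, if there is one."""
    labeled = []
    for text, head, tail in sentences:
        relation = kg.get((head, tail), default)
        labeled.append({"text": text, "h": head, "t": tail, "relation": relation})
    return labeled


labeled = distant_label(sentences, KG)
for item in labeled:
    print(item["relation"], "|", item["text"])
```

Note that the second sentence receives the label `founder` even though it does not express that relation: the label is copied from the KG for the entity pair, regardless of what the sentence says.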
Although it makes large-scale training data available, distant supervision inevitably introduces noise, because a sentence that mentions two entities does not necessarily express the relation recorded for them in the KG. To reduce this noise, the multi-instance multi-label setting was proposed. Instead of predicting a relation for each sentence, models in this setting predict labels for each entity pair, which may appear in many sentences. The sentences that share the same entity pair are called a “bag”.
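Bag construction amounts to grouping instances by entity pair. The following is a minimal sketch under assumed field names (`text`, `h`, `t`, `relation`) and toy data; `build_bags` is a hypothetical helper, not a function from any specific library.

```python
from collections import defaultdict

# Toy distantly labeled instances; field names and labels are illustrative assumptions.
instances = [
    {"text": "Bill Gates founded Microsoft in 1975.", "h": "Bill Gates", "t": "Microsoft", "relation": "founder"},
    {"text": "Bill Gates led Microsoft as CEO until 2000.", "h": "Bill Gates", "t": "Microsoft", "relation": "ceo"},
    {"text": "Steve Jobs co-founded Apple.", "h": "Steve Jobs", "t": "Apple", "relation": "founder"},
]


def build_bags(instances):
    """Group sentences by (head, tail) and collect the label set of each pair."""
    bags = defaultdict(lambda: {"sentences": [], "relations": set()})
    for inst in instances:
        key = (inst["h"], inst["t"])
        bags[key]["sentences"].append(inst["text"])
        bags[key]["relations"].add(inst["relation"])
    return dict(bags)


bags = build_bags(instances)
for pair, bag in bags.items():
    print(pair, len(bag["sentences"]), sorted(bag["relations"]))
```

A bag-level model is then trained and evaluated per entity pair: it reads all sentences in a bag and predicts the set of relations for that pair, rather than one label per sentence.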
Few-Shot Relation Extraction
Inspired by the fact that humans can grasp new knowledge from only a handful of training instances, few-shot learning was proposed to explore how models can quickly adapt to new tasks. FewRel is a large-scale few-shot relation extraction dataset (paper / website). Data sampling and model evaluation in the few-shot setting differ considerably from the other settings; please refer to the paper for details. Statistics of the dataset are shown in the following table: