IRT2 is a research benchmark for open-world knowledge graph completion and knowledge discovery in large text collections. Currently, we support discovery/ranking and inductive knowledge graph completion evaluations; the benchmark can also be used to evaluate (zero-shot) entity linking. Our goal is to provide a more realistic benchmark than the current state of the art offers, based on observations from working with real-world knowledge-acquisition use cases.
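
To make the two evaluation settings concrete, the sketch below outlines them as function signatures. The names and signatures are illustrative assumptions for this page, not the benchmark's actual API.

```python
# Hypothetical sketch of the two IRT2 evaluation tasks.
# All names here are illustrative assumptions, not the benchmark's API.

from typing import Iterable


def rank_mentions(entity: str, relation: str, mentions: Iterable[str]) -> list[str]:
    """Ranking/discovery: given a known entity and a relation, order
    open-world text mentions by how likely they complete the triple."""
    ...


def link_mention(mention: str, contexts: Iterable[str], entities: Iterable[str]) -> list[str]:
    """Inductive KGC/linking: given an open-world mention and its text
    contexts, order the known entities as candidate link targets."""
    ...
```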

Download the Benchmark

Variant   Relations   Entities   Mentions   Triples     Sentences
Tiny              5      1,174      4,945      2,928     9.1 million
Small            12      2,887      9,231      7,527    15.1 million
Medium           45      3,592     14,417     26,335    17.4 million
Large            45      9,952     22,866    102,289    18.7 million
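
A minimal loading sketch follows. The folder name, file name, and triple file format are assumptions made for illustration; consult the distribution you download for the actual layout.

```python
# Minimal sketch of loading one IRT2 variant from disk.
# Folder/file names and the file format are assumptions, not the
# benchmark's documented layout.

from pathlib import Path


def load_triples(path: Path) -> list[tuple[int, int, int]]:
    """Parse whitespace-separated (head, tail, relation) id triples."""
    triples = []
    for line in path.read_text().splitlines():
        h, t, r = line.split()
        triples.append((int(h), int(t), int(r)))
    return triples


root = Path("irt2-cde-tiny")                            # hypothetical folder
closed = load_triples(root / "closed.train-triples.txt")  # hypothetical file
print(f"{len(closed):,} closed-world training triples")
```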

Read the paper

IRT2: Inductive Linking and Ranking in Knowledge Graphs of Varying Scale

Abstract: We address the challenge of building domain-specific knowledge models for industrial use cases, where labelled data and taxonomic information are initially scarce. Our focus is on inductive link prediction models as a basis for practical tools that support knowledge engineers with exploring text collections and discovering and linking new (so-called open-world) entities to the knowledge graph. We argue that, though neural approaches to text mining have yielded impressive results in recent years, current benchmarks do not properly reflect the typical challenges encountered in the industrial wild. Therefore, our first contribution is an open benchmark coined IRT2 (inductive reasoning with text) that (1) covers knowledge graphs of varying sizes (including very small ones), (2) comes with incidental, low-quality text mentions, and (3) includes not only triple completion but also ranking, which is relevant for supporting experts with discovery tasks.

We investigate two neural models for inductive link prediction: one based on end-to-end learning and one that learns from the knowledge graph and the text data in separate steps. These models compete with a strong bag-of-words baseline. For linking, the results show a significant performance advantage for the neural approaches as soon as the available graph data decreases. For ranking, the results are promising, and the neural approaches outperform the sparse retriever by a wide margin.
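
For intuition, here is a rough sketch of what such a bag-of-words baseline can look like: each known entity is represented by its concatenated mention sentences, and candidates are ranked by TF-IDF cosine similarity against a new mention's text. This is an illustration under our own assumptions, not the paper's exact baseline; the entity documents are toy stand-ins.

```python
# Toy bag-of-words ranking baseline: represent each known entity by its
# concatenated mention sentences, then rank entities for an open-world
# mention via TF-IDF cosine similarity. Illustrative only; not the
# paper's exact baseline. Requires scikit-learn.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy stand-in data: entity id -> concatenated mention sentences.
entity_docs = {
    "entity/1": "sentences that mention the first entity in context",
    "entity/2": "sentences that mention the second entity in context",
}

ids = list(entity_docs)
vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform([entity_docs[e] for e in ids])


def rank_entities(query_text: str) -> list[str]:
    """Rank known entities by similarity to an open-world mention's text."""
    scores = cosine_similarity(vectorizer.transform([query_text]), matrix)[0]
    return [e for _, e in sorted(zip(scores, ids), reverse=True)]


print(rank_entities("new sentences about the first entity"))
```

A sparse retriever like this needs no training on the graph at all, which is what makes it a strong reference point when graph data is scarce.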

@article{hamann2022inductive,
  title   = {IRT2: Inductive Linking and Ranking in Knowledge Graphs of Varying Scale},
  author  = {Hamann, Felix and Ulges, Adrian and Falk, Maurice},
  journal = {to be announced},
  year    = {2022}
}
Two-step knowledge acquisition