Hi, my name is Felix Hamann. I currently work as a research associate at RheinMain University of Applied Sciences in the LAVIS working group. I hold an MSc in computer science and am pursuing my PhD at Trier University. My research focuses on information extraction with state-of-the-art neural architectures in industrial settings. Specifically, I aim to build and refine knowledge graphs from vast collections of natural, unstructured text.

  • Work address
  • Unter den Eichen 5
  • 65195 Wiesbaden
  • Room C243
Felix Hamann, Adrian Ulges | July 2023 IEA/AIE (IKEDMS) | Springer (Paper)

Domain Specific Knowledge Graph Adaption with Industrial Text Data

We propose an interactive approach to knowledge acquisition based on neural text mining. Our method enables experts to extend a given knowledge graph—which is general and scarce—with knowledge discovered in a text collection covering a particular customer or subdomain. Our tool suggests text passages that contain new concepts and links, which the domain expert inspects and adds to the graph. The underlying discovery model combines a transformer-based contextualized text encoder with a knowledge graph completion model. End-to-end training of both components is bootstrapped by sampling mentions of the given knowledge graph's concepts from the text collection. We evaluate our approach quantitatively based on manual annotations of model predictions, including a comparison with fastText. Furthermore, we conduct an expert annotation session using the tool and a subsequent interview. The resulting observations show the potential of the approach for knowledge engineering in the wild.
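As a rough sketch of how such a discovery model can be wired together: a transformer encodes a passage, a projection maps it into the knowledge graph's embedding space, and a knowledge graph completion scorer rates candidate links. Module names, dimensions, and the TransE-style scorer below are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class DiscoveryModel(nn.Module):
    """Sketch only: contextualized text encoder + KGC scoring head."""

    def __init__(self, encoder_name: str = "bert-base-cased", kg_dim: int = 200):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        # project contextualized text into the KG embedding space
        self.project = nn.Linear(self.encoder.config.hidden_size, kg_dim)

    def encode_mention(self, tokenizer, passage: str) -> torch.Tensor:
        batch = tokenizer(passage, return_tensors="pt", truncation=True)
        hidden = self.encoder(**batch).last_hidden_state    # (1, seq, hidden)
        return self.project(hidden.mean(dim=1)).squeeze(0)  # (kg_dim,)

    @staticmethod
    def score(head: torch.Tensor, rel: torch.Tensor, tail: torch.Tensor) -> torch.Tensor:
        # TransE-style plausibility, standing in for any KGC scorer
        return -(head + rel - tail).norm(p=1)

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = DiscoveryModel()
mention_vec = model.encode_mention(tokenizer, "the gearbox overheats under load")
```

Because the encoder and the scorer meet in the projected space, both components can in principle be trained end-to-end from the bootstrapped mention samples.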

  • Paper and Code coming soon...
Felix Hamann, Adrian Ulges, Maurice Falk | September 2022 KI2022 (TMG) | Springer (Paper)

IRT2: Inductive Linking and Ranking in Knowledge Graphs of Varying Scale

We address the challenge of building domain-specific knowledge models for industrial use cases, where labelled data and taxonomic information are initially scarce. Our focus is on inductive link prediction models as a basis for practical tools that support knowledge engineers with exploring text collections and discovering and linking new (so-called open-world) entities to the knowledge graph. We argue that, though neural approaches to text mining have yielded impressive results in the past years, current benchmarks do not properly reflect the typical challenges encountered in the industrial wild. Therefore, our first contribution is an open benchmark coined IRT2 (inductive reasoning with text) that (1) covers knowledge graphs of varying sizes (including very small ones), (2) comes with incidental, low-quality text mentions, and (3) includes not only triple completion but also ranking, which is relevant for supporting experts with discovery tasks.

We investigate two neural models for inductive link prediction, one based on end-to-end learning and one that learns from the knowledge graph and text data in separate steps. These models compete with a strong bag-of-words baseline. For linking, the results show a clear performance advantage for the neural approaches as the available graph data decreases. For ranking, the results are promising, and the neural approaches outperform the sparse retriever by a wide margin.
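For intuition, the kind of bag-of-words baseline the neural models compete with can be as small as TF-IDF retrieval over entity texts. The entities and the mention below are made-up stand-ins, not data from the benchmark.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# toy stand-ins for KG entities and an open-world text mention
entity_texts = {
    "Q1": "wind turbine gearbox lubrication maintenance",
    "Q2": "hydraulic pump pressure valve inspection",
    "Q3": "transformer station voltage regulation",
}
mention = "gearbox oil change on the turbine"

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(entity_texts.values())
query = vectorizer.transform([mention])

# rank entities by cosine similarity to the mention
scores = cosine_similarity(query, matrix).ravel()
ranking = sorted(zip(entity_texts, scores), key=lambda x: -x[1])
print(ranking)  # Q1 should rank first
```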

Felix Hamann, Adrian Ulges, Ralph Bergmann | March 2021 IEA/AIE | Springer (Paper)

IRT1: Open-World Knowledge Graph Completion Benchmarks for Knowledge Discovery

The construction and completion of knowledge graphs in industrial settings has gained traction over the past years. However, modelling a specific domain often entails significant cost. This can be alleviated by including other knowledge sources such as text, a challenge known as open-world knowledge graph completion. Although knowledge graph completion has drawn significant attention from the research community over the past years, we argue that academic benchmarks fall short at two key characteristics of industrial conditions: (1) open-world entities are drawn randomly in benchmarks, although in practice they are more volatile than closed-world entities, and (2) in practice, textual descriptions of entities are not concise.

This paper's mission is to bring academia and industry closer by proposing Inductive Reasoning with Text (IRT), an approach to create open-world evaluation benchmarks from given knowledge graphs. Two graphs, one based on Freebase and another derived from Wikidata, are created, analysed, and enhanced with textual descriptions according to the above assumptions. We evaluate a modular system that can tether any vector-space knowledge graph completion model to a transformer-based text encoder, aligning sentence and entity representations. We show how difficult learning from such scattered text is compared to the texts provided by other benchmarks, and we provide a solid baseline study for future model benchmarking.
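A hedged sketch of the alignment idea: entity embeddings from any pretrained vector-space KGC model stay frozen, while a projection learns to map sentence representations close to the embedding of the entity they describe. Dimensions, names, and the choice of MSE loss are assumptions for illustration.

```python
import torch
import torch.nn as nn

kg_dim, text_dim, n_entities = 200, 768, 5000

# frozen embeddings from some pretrained KGC model (random stand-in here)
entity_emb = nn.Embedding(n_entities, kg_dim)
entity_emb.weight.requires_grad_(False)

project = nn.Linear(text_dim, kg_dim)
optim = torch.optim.Adam(project.parameters(), lr=1e-3)

def train_step(sentence_vecs: torch.Tensor, entity_ids: torch.Tensor) -> float:
    """Pull projected sentence vectors toward their entity embeddings."""
    pred = project(sentence_vecs)    # (batch, kg_dim)
    target = entity_emb(entity_ids)  # (batch, kg_dim)
    loss = nn.functional.mse_loss(pred, target)
    optim.zero_grad()
    loss.backward()
    optim.step()
    return loss.item()

loss = train_step(torch.randn(32, text_dim), torch.randint(0, n_entities, (32,)))
```

Keeping the KGC side fixed is what makes the system modular: any completion model that produces entity vectors can be plugged in without retraining.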

Nadja Kurz, Felix Hamann, Adrian Ulges | August 2020 SDS2020 | IEEE (Paper)

Neural Entity Linking on Technical Service Tickets

Entity linking, the task of mapping textual mentions to known entities, has recently been tackled using contextualized neural networks. We address the question of whether these results, reported for large, high-quality datasets such as Wikipedia, transfer to practical business use cases, where labels are scarce, text is low-quality, and terminology is highly domain-specific. Using an entity linking model based on BERT, a popular transformer network in natural language processing, we show that a neural approach outperforms and complements hand-coded heuristics, with improvements of about 20% in top-1 accuracy. We also demonstrate the benefits of transfer learning on a large corpus, while fine-tuning proves difficult. Finally, we compare different BERT-based architectures and show that a simple sentence-wise encoding (Bi-Encoder) offers fast and efficient search in practice.
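The Bi-Encoder idea, sketched with Hugging Face building blocks (the model choice, mean pooling, and the toy texts are assumptions, not the paper's setup): mention and entity texts are encoded independently, so entity vectors can be precomputed and linking reduces to a dot product.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

@torch.no_grad()
def embed(texts: list[str]) -> torch.Tensor:
    """Mean-pool BERT's last hidden states into one vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state             # (n, seq, 768)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (n, seq, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

# entity vectors are computed once; linking a mention is then a dot product
entities = embed(["printer driver error", "network switch outage"])
mention = embed(["the office printer refuses to print"])
print((mention @ entities.T).argmax().item())  # index of best-matching entity
```

Because entity representations do not depend on the mention, they can be indexed offline, which is what makes sentence-wise encoding fast in practice.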

Felix Hamann | October 2018 (Master's Thesis)

A Neural Embedding Compressor for Scalable Document Search

In retrieval applications, binary hashes are known to offer significant improvements in terms of both memory and speed. We investigate the compression of sentence embeddings using a neural encoder-decoder architecture, which is trained by minimizing reconstruction error. Instead of employing the original real-valued embeddings, we use the latent representations in Hamming space produced by the encoder for similarity calculations. In quantitative experiments on several benchmarks for semantic similarity tasks, we show that our compressed Hamming embeddings yield performance comparable to uncompressed embeddings (Sent2Vec, InferSent, GloVe-BoW) at compression ratios of up to 256:1. We further demonstrate that our model strongly decorrelates input features, and that the compressor generalizes well when pre-trained on Wikipedia sentences. We publish the source code and all experimental results on GitHub.
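A minimal sketch of the compressor idea (the dimensions and the straight-through binarization are illustrative, not necessarily the thesis architecture): an encoder maps a real-valued sentence embedding to bits, a decoder reconstructs the input, and retrieval compares codes by Hamming distance.

```python
import torch
import torch.nn as nn

class Compressor(nn.Module):
    """Encoder-decoder mapping embeddings to binary codes and back."""

    def __init__(self, dim: int = 768, bits: int = 256):
        super().__init__()
        self.encoder = nn.Linear(dim, bits)
        self.decoder = nn.Linear(bits, dim)

    def forward(self, x: torch.Tensor):
        soft = torch.tanh(self.encoder(x))
        # straight-through estimator: binary values in the forward pass,
        # smooth gradients in the backward pass
        code = soft + (torch.sign(soft) - soft).detach()
        return code, self.decoder(code)

def hamming(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # codes are in {-1, +1}; count the positions where they differ
    return (a != b).sum(dim=-1)

model = Compressor()
x = torch.randn(4, 768)                  # stand-in sentence embeddings
code, recon = model(x)
loss = nn.functional.mse_loss(recon, x)  # reconstruction objective
```

Training only ever sees the reconstruction loss; the Hamming-space comparison is what replaces the real-valued similarity at query time.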