Named Entity Recognition for Specific Domains - Take Advantage of Transfer Learning

Sunna Torge; Waldemar Hahn; Lalith Manjunath; René Jäkel

doi:10.57675/IMIST.PRSM/ijist-v6i3.189


Sunna Torge, Technische Universität Dresden, Centre for Information Services and High Performance Computing (ZIH), 01062 Dresden
Waldemar Hahn, Technische Universität Dresden, Centre for Information Services and High Performance Computing (ZIH), 01062 Dresden
Lalith Manjunath
René Jäkel, Technische Universität Dresden, Centre for Information Services and High Performance Computing (ZIH), 01062 Dresden


Abstract

Automated text analysis tasks such as named entity recognition (NER) rely heavily on large amounts of high-quality training data. For domain-specific NER, transfer learning approaches aim to overcome the lack of domain-specific training data. In this paper, we investigate transfer learning approaches to improve domain-specific NER in low-resource domains. The first part of the paper is dedicated to information transfer from known to unknown entities using BiLSTM-CRF neural networks, also considering the influence of varying training data size. In the second part, pre-trained BERT models are instead fine-tuned for domain-specific German NER. The performance of models of both architectures is compared with respect to different hyperparameters and a set of 16 entities. The experiments are based on the revised German SmartData Corpus and a baseline model trained on this corpus.

Published

Sep 16, 2022

How to Cite

TORGE, Sunna et al. Named Entity Recognition for Specific Domains - Take Advantage of Transfer Learning. International Journal of Information Science and Technology, [S.l.], v. 6, n. 3, p. 4 - 15, sep. 2022. ISSN 2550-5114. Available at: <https://innove.org/ijist/index.php/ijist/article/view/189>. Date accessed: 11 july 2025. doi: http://dx.doi.org/10.57675/IMIST.PRSM/ijist-v6i3.189.

Issue

Vol 6 No 3 (2022)

Section

Special Issue: Machine Learning and Natural Language Processing

The submitting author warrants that the submission is original and that she/he is the author of the submission together with the named co-authors; to the extent the submission incorporates text passages, figures, data or other material from the work of others, the submitting author has obtained any necessary permission.

Articles in this journal are published under the Creative Commons Attribution Licence (CC-BY). The licence gives readers legal certainty about what they can do with published articles, which supports wider dissemination and archiving and in turn makes publishing with this journal more valuable for you, the authors.

In order for iJIST to publish and disseminate research articles, we need publishing rights. This is determined by a publishing agreement between the author and iJIST.

By submitting an article, the author grants this journal the non-exclusive right to publish it. The author retains the copyright and the publishing rights to the article without any restrictions.
