To Extend or Not to Extend? Context-Specific Corpus Enrichment

Felix Kuhr*, Tanya Braun, Magnus Bender, Ralf Möller

*Corresponding author for this work

Abstract

An agent in pursuit of a task may work with a corpus of documents with linked subjective content descriptions. Faced with a new document, the agent has to decide whether to include that document in its corpus. Basing the decision only on words, topics, or entities has been shown not to yield balanced performance across varying documents. Therefore, this paper presents an approach for an agent to decide whether a new document adds value to its existing corpus by combining texts and content descriptions. Furthermore, an agent can use the approach as a starting point for high-quality content descriptions for new documents. A case study shows the effectiveness of our approach given varying types of new documents.

Original language: English
Title of host publication: AI 2019: Advances in Artificial Intelligence
Editors: Jixue Liu, James Bailey
Number of pages: 12
Volume: 11919 LNAI
Publisher: Springer, Cham
Publication date: 25.11.2019
Pages: 357-368
ISBN (Print): 978-3-030-35287-5
ISBN (Electronic): 978-3-030-35288-2
Publication status: Published - 25.11.2019
Event: 32nd Australasian Joint Conference on Artificial Intelligence - Adelaide, Australia
Duration: 02.12.2019 - 05.12.2019
Conference number: 234489

Research Areas and Centers

  • Centers: Center for Artificial Intelligence Luebeck (ZKIL)
  • Research Area: Intelligent Systems

DFG Research Classification Scheme

  • 409-06 Information Systems, Process and Knowledge Management
