Unsupervised Estimation of Subjective Content Descriptions

Magnus Bender, Tanya Braun, Ralf Möller, Marcel Gehrke

Abstract

An agent in pursuit of a task may work with a corpus containing text documents. One possible task of the agent is to retrieve documents of similar content and highlight relevant locations in retrieved documents. To perform information retrieval on the corpus, the agent may need additional data associated with the documents. Subjective Content Descriptions (SCDs) provide additional location-specific data for text documents. However, the agent needs SCDs referencing sentences of similar content across various documents in the corpus, and most text documents are not associated with SCDs. Therefore, this paper presents UESM, an unsupervised approach to estimate SCDs for text documents, i.e., to associate any corpus with SCDs. In an evaluation, we show that the performance of UESM is on par with latent Dirichlet allocation, while UESM provides SCDs referencing sentences of similar content.
Original language: English
Pages: 266-273
Number of pages: 8
Publication status: Published - 2023