TY - CONF
T1 - Unsupervised Estimation of Subjective Content Descriptions
AU - Bender, Magnus
AU - Braun, Tanya
AU - Möller, Ralf
AU - Gehrke, Marcel
N1 - DBLP License: DBLP's bibliographic metadata records provided through http://dblp.org/ are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.
PY - 2023
Y1 - 2023
N2 - An agent in pursuit of a task may work with a corpus containing text documents. One possible task of the agent is to retrieve documents of similar content and highlight relevant locations in retrieved documents. To perform information retrieval on the corpus, the agent may need additional data associated with the documents. Subjective Content Descriptions (SCDs) provide additional location-specific data for text documents. However, the agent needs SCDs referencing sentences of similar content across various documents in the corpus and most text documents are not associated with SCDs. Therefore, this paper presents UESM, an unsupervised approach to estimate SCDs for text documents, i.e., to associate any corpus with SCDs. In an evaluation, we show that the performance of UESM is on par with latent Dirichlet allocation, while UESM provides SCDs referencing sentences of similar content.
UR - https://www.mendeley.com/catalogue/75252f38-d0bb-3ae2-a059-e2f2aad63e23/
U2 - 10.1109/ICSC56153.2023.00052
DO - 10.1109/ICSC56153.2023.00052
M3 - Conference Papers
SP - 266
EP - 273
ER -