Meta-complementing the semantics of short texts in neural topic models

Citation Data: Advances in Neural Information Processing Systems 36 (NeurIPS 2022): New Orleans, November 28-December 9

Publication Year: 2022

Conference Paper Description

Topic models infer latent topic distributions based on observed word co-occurrences in a text corpus. While typically a corpus contains documents of variable lengths, most previous topic models treat documents of different lengths uniformly, assuming that each document is sufficiently informative. However, shorter documents may have only a few word co-occurrences, resulting in inferior topic quality. Some other previous works assume that all documents are short, and leverage external auxiliary data, e.g., pretrained word embeddings and document connectivity. Orthogonal to existing works, we remedy this problem within the corpus itself by proposing a Meta-Complement Topic Model, which improves topic quality of short texts by transferring the semantic knowledge learned on long documents to complement semantically limited short texts. As a self-contained module, our framework is agnostic to auxiliary data and can be further improved by flexibly integrating them into our framework. Specifically, when incorporating document connectivity, we further extend our framework to complement documents with limited edges. Experiments demonstrate the advantage of our framework.
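The point that shorter documents offer only a few word co-occurrences can be illustrated with a minimal sketch. It is not the authors' model. It assumes whitespace tokenization and treats any two distinct words in the same document as co-occurring; the function name and the two example documents are hypothetical.

```python
from itertools import combinations


def cooccurrence_pairs(text: str) -> set:
    """Return the unordered pairs of distinct words that appear together in one document.

    Assumption: whitespace tokenization, case-insensitive, whole-document window.
    """
    words = sorted(set(text.lower().split()))
    return set(combinations(words, 2))


# Hypothetical example documents, not taken from the paper.
short_doc = "neural topic model"
long_doc = "neural topic model infers latent topics from word co-occurrences in a corpus"

print(len(cooccurrence_pairs(short_doc)))  # 3 distinct words give 3 pairs
print(len(cooccurrence_pairs(long_doc)))   # 12 distinct words give 66 pairs
```

A document with n distinct words yields n(n-1)/2 such pairs, so the co-occurrence signal available to a topic model shrinks quickly as documents get shorter.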

Bibliographic Details

REPOSITORY URL: https://ink.library.smu.edu.sg/sis_research/7609

URL ID: https://ink.library.smu.edu.sg/sis_research/7609; https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=8612&context=sis_research

AUTHOR(S)

Ce ZHANG; Hady Wirawan LAUW
