TY - GEN
T1 - Topic Taxonomy Expansion via Hierarchy-Aware Topic Phrase Generation
AU - Lee, Dongha
AU - Shen, Jiaming
AU - Lee, Seonghyeon
AU - Yoon, Susik
AU - Yu, Hwanjo
AU - Han, Jiawei
N1 - Publisher Copyright:
© 2022 Association for Computational Linguistics.
PY - 2022
Y1 - 2022
N2 - Topic taxonomies display hierarchical topic structures of a text corpus and provide topical knowledge to enhance various NLP applications. To dynamically incorporate new topic information, several recent studies have tried to expand (or complete) a topic taxonomy by inserting emerging topics identified in a set of new documents. However, existing methods focus only on frequent terms in documents and the local topic-subtopic relations in a taxonomy, which leads to limited topic term coverage and fails to model the global topic hierarchy. In this work, we propose a novel framework for topic taxonomy expansion, named TopicExpan, which directly generates topic-related terms belonging to new topics. Specifically, TopicExpan leverages the hierarchical relation structure surrounding a new topic and the textual content of an input document for topic term generation. This approach encourages newly-inserted topics to further cover important but less frequent terms as well as to keep their relation consistency within the taxonomy. Experimental results on two real-world text corpora show that TopicExpan significantly outperforms other baseline methods in terms of the quality of output taxonomies.
AB - Topic taxonomies display hierarchical topic structures of a text corpus and provide topical knowledge to enhance various NLP applications. To dynamically incorporate new topic information, several recent studies have tried to expand (or complete) a topic taxonomy by inserting emerging topics identified in a set of new documents. However, existing methods focus only on frequent terms in documents and the local topic-subtopic relations in a taxonomy, which leads to limited topic term coverage and fails to model the global topic hierarchy. In this work, we propose a novel framework for topic taxonomy expansion, named TopicExpan, which directly generates topic-related terms belonging to new topics. Specifically, TopicExpan leverages the hierarchical relation structure surrounding a new topic and the textual content of an input document for topic term generation. This approach encourages newly-inserted topics to further cover important but less frequent terms as well as to keep their relation consistency within the taxonomy. Experimental results on two real-world text corpora show that TopicExpan significantly outperforms other baseline methods in terms of the quality of output taxonomies.
UR - https://www.scopus.com/pages/publications/105001069479
U2 - 10.18653/v1/2022.findings-emnlp.546
DO - 10.18653/v1/2022.findings-emnlp.546
M3 - Conference contribution
AN - SCOPUS:105001069479
T3 - Findings of the Association for Computational Linguistics: EMNLP 2022
SP - 1687
EP - 1700
BT - Findings of the Association for Computational Linguistics
A2 - Goldberg, Yoav
A2 - Kozareva, Zornitsa
A2 - Zhang, Yue
PB - Association for Computational Linguistics (ACL)
T2 - 2022 Findings of the Association for Computational Linguistics: EMNLP 2022
Y2 - 7 December 2022 through 11 December 2022
ER -