TY - GEN
T1 - An Analysis of Research Trends on Language Model Using BERTopic
AU - Kang, Woojin
AU - Kim, Yumi
AU - Kim, Heesop
AU - Lee, Jongwook
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - Although language models have played a crucial role in various natural language processing tasks, little research has systematically analyzed and reviewed research topic trends in these models. In this paper, we conducted a comprehensive analysis of 31 years of research trends in the field of language models, using publications from Scopus, an internationally renowned academic database, to identify research topics related to language models. We applied BERTopic, a state-of-the-art topic modeling technique, to 13,754 research articles about language models. Research on language models has gradually increased since 1991, and the number of publications rose sharply with the emergence of BERT and GPT in 2018. We identified 14 main topics with meaningful keywords clustered by the BERTopic model. Among the 14 topics, speech recognition, statistical language models, and pre-trained language models were the most active research areas. Our results present a more systematic and comprehensive view of trends in language model research, which is expected to provide an important foundation for future research directions.
KW - BERT
KW - language models
KW - research trends
KW - Short Research Paper
KW - topic modeling
UR - http://www.scopus.com/inward/record.url?scp=85191194501&partnerID=8YFLogxK
U2 - 10.1109/CSCE60160.2023.00032
DO - 10.1109/CSCE60160.2023.00032
M3 - Conference contribution
AN - SCOPUS:85191194501
T3 - Proceedings - 2023 Congress in Computer Science, Computer Engineering, and Applied Computing, CSCE 2023
SP - 168
EP - 172
BT - Proceedings - 2023 Congress in Computer Science, Computer Engineering, and Applied Computing, CSCE 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 Congress in Computer Science, Computer Engineering, and Applied Computing, CSCE 2023
Y2 - 24 July 2023 through 27 July 2023
ER -