Self-semi-supervised clustering for large scale data with massive null group

Soohyun Ahn, Hyungwon Choi, Johan Lim, Kyeong Eun Lee

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

In this paper, we propose self-semi-supervised clustering, a new clustering method for large-scale data with a massive null group. Self-semi-supervised clustering is a two-stage procedure: the first stage preselects a part of the “null” group from the data, and the second stage applies semi-supervised clustering to the rest of the data, allowing the remaining observations to also be assigned to the null group. We evaluate the performance of the proposed method in a simulation study and demonstrate it in the analysis of time-course gene expression data from a longitudinal study of Influenza A virus infection.
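
The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the preselection rule or the clustering model, so the sketch assumes (hypothetically) that the rows with the smallest Euclidean norm are preselected as null, and it swaps in a constrained k-means for the semi-supervised clustering step. The function names `preselect_null` and `semi_supervised_kmeans` and the `fraction` parameter are illustrative only.

```python
import math


def norm(x):
    """Euclidean norm of one observation (e.g., a centered expression profile)."""
    return math.sqrt(sum(v * v for v in x))


def dist2(x, c):
    """Squared Euclidean distance between an observation and a centroid."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def mean(points):
    """Coordinate-wise mean of a non-empty list of observations."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def preselect_null(data, fraction):
    """Stage 1 (assumed rule): flag the `fraction` of rows with the smallest norm as null."""
    order = sorted(range(len(data)), key=lambda i: norm(data[i]))
    return set(order[: int(fraction * len(data))])


def semi_supervised_kmeans(data, null_idx, k, n_iter=50):
    """Stage 2 (assumed stand-in): k-means with k non-null clusters plus a null
    cluster (label 0). Preselected rows keep label 0; every other row may join
    any cluster, including the null one."""
    if not null_idx:
        raise ValueError("stage 1 must preselect at least one null row")
    free = [i for i in range(len(data)) if i not in null_idx]
    if len(free) < k:
        raise ValueError("not enough unlabeled rows for k non-null clusters")

    # Centroid 0 is the mean of the preselected null rows; the k others are
    # seeded by deterministic farthest-point initialization over the free rows.
    centroids = [mean([data[i] for i in null_idx])]
    for _ in range(k):
        far = max(free, key=lambda i: min(dist2(data[i], c) for c in centroids))
        centroids.append(list(data[far]))

    labels = [0 if i in null_idx else None for i in range(len(data))]
    for _ in range(n_iter):
        # Assignment step: only the free rows are reassigned.
        for i in free:
            labels[i] = min(range(k + 1), key=lambda j: dist2(data[i], centroids[j]))
        # Update step: recompute each centroid from its current members.
        updated = []
        for j in range(k + 1):
            members = [data[i] for i, lab in enumerate(labels) if lab == j]
            updated.append(mean(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids
```

Calling `preselect_null(data, 0.5)` and passing the result to `semi_supervised_kmeans(data, null_idx, k=2)` returns one label per row, with label 0 marking the null group. Fixing the preselected labels is what makes the second stage semi-supervised; a model-based variant would replace the distance-based assignment with mixture-model posterior probabilities.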

Original language: English
Pages (from-to): 161-176
Number of pages: 16
Journal: Journal of the Korean Statistical Society
Volume: 49
Issue number: 1
State: Published - 1 Mar 2020

Keywords

  • Influenza A virus
  • Massive null group
  • Model-based clustering
  • Pre-selection
  • Semi-supervised clustering
  • Time-course microarray data
