Abstract
In this paper, we propose self-semi-supervised clustering, a new clustering method for large-scale data with a massive null group. Self-semi-supervised clustering is a two-stage procedure: in the first stage, part of the “null” group is preselected from the data; in the second stage, semi-supervised clustering is applied to the remaining data, which may still be assigned to the null group. We evaluate the performance of the proposed method in a simulation study and demonstrate it in an analysis of time-course gene expression data from a longitudinal study of Influenza A virus infection.
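The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the keywords indicate a model-based clustering approach, whereas the sketch below swaps in a simple nearest-centroid scheme to keep the example short. The preselection rule (largest absolute value in a profile below a cutoff), the function name `self_semi_supervised`, the parameters `k` and `null_cut`, and the toy data are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(rows):
    """Coordinate-wise mean of a non-empty list of profiles."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def self_semi_supervised(data, k, null_cut, n_iter=50):
    """Return one label per row: 0 = null group, 1..k = non-null clusters."""
    # Stage 1 (assumed rule): preselect rows that look clearly null,
    # here those whose largest absolute value is below null_cut.
    pre_null = [i for i, row in enumerate(data)
                if max(abs(x) for x in row) < null_cut]
    pre_set = set(pre_null)
    rest = [i for i in range(len(data)) if i not in pre_set]
    if not pre_null or len(rest) < k:
        raise ValueError("need preselected null rows and at least k other rows")

    # Stage 2: cluster the remaining rows, treating the preselected rows
    # as labeled members of the null group (nearest-centroid stand-in).
    null_c = centroid([data[i] for i in pre_null])
    centers = []
    for _ in range(k):
        # Farthest-first initialization: deterministic, no random seed.
        far = max(rest, key=lambda i: min(sq_dist(data[i], c)
                                          for c in [null_c] + centers))
        centers.append(list(data[far]))

    labels = [0] * len(data)  # preselected rows keep label 0 throughout
    for _ in range(n_iter):
        all_c = [null_c] + centers
        new = list(labels)
        for i in rest:
            # A remaining row may still be assigned to the null group (j = 0).
            new[i] = min(range(k + 1), key=lambda j: sq_dist(data[i], all_c[j]))
        changed = new != labels
        labels = new
        # Update centroids; the null centroid uses labeled and assigned rows.
        null_c = centroid([data[i] for i in range(len(data)) if labels[i] == 0])
        for j in range(1, k + 1):
            members = [data[i] for i in rest if labels[i] == j]
            if members:
                centers[j - 1] = centroid(members)
        if not changed:
            break
    return labels


# Toy data (made up): 20 clearly null rows, 1 borderline null row that
# misses the Stage 1 cutoff, and two small non-null groups.
null_rows = [[0.1 * ((i % 3) - 1), 0.05 * (i % 2), 0.0] for i in range(20)]
borderline = [[0.8, 0.1, 0.0]]
up = [[5.0, 5.2, 4.8], [5.1, 4.9, 5.0], [4.9, 5.0, 5.1]]
down = [[-5.0, -5.1, -4.9], [-4.8, -5.0, -5.2], [-5.1, -4.9, -5.0]]
data = null_rows + borderline + up + down

labels = self_semi_supervised(data, k=2, null_cut=0.5)
print(labels)
```

In this toy run, the borderline row is not preselected in the first stage but can still join the null group in the second stage, which is the behavior the abstract describes for the remaining data.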
| Original language | English |
|---|---|
| Pages (from-to) | 161-176 |
| Number of pages | 16 |
| Journal | Journal of the Korean Statistical Society |
| Volume | 49 |
| Issue number | 1 |
| State | Published - 1 Mar 2020 |
Keywords
- Influenza A virus
- Massive null group
- Model-based clustering
- Pre-selection
- Semi-supervised clustering
- Time-course microarray data