DenoiseIt: denoising gene expression data using rank based isolation trees

Jaemin Jeon, Youjeong Suk, Sang Cheol Kim, Hye Yeong Jo, Kwangsoo Kim, Inuk Jung

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Selecting informative genes or eliminating uninformative ones before any downstream gene expression analysis is a standard task with great impact on the results. A carefully curated gene set significantly enhances the likelihood of identifying meaningful biomarkers.

Method: In contrast to conventional forward gene search methods, which focus on selecting highly informative genes, we propose a backward search method, DenoiseIt, that removes potential outlier genes, yielding a robust gene set with reduced noise. The gene set constructed by DenoiseIt is expected to capture biologically significant genes while pruning irrelevant ones to the greatest extent possible, which in turn enhances the quality of downstream comparative gene expression analysis. DenoiseIt uses non-negative matrix factorization in conjunction with isolation forests to identify outlier rank features and remove their associated genes.

Results: DenoiseIt was applied to both bulk and single-cell RNA-seq data collected from TCGA and a COVID-19 cohort. It proficiently identified and removed genes exhibiting expression anomalies confined to specific samples rather than to a known group. DenoiseIt was also shown to reduce the level of technical noise while preserving a higher proportion of biologically relevant genes compared to existing methods. The DenoiseIt software is publicly available on GitHub at https://github.com/cobi-git/DenoiseIt
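The combination of non-negative matrix factorization and isolation forests described in the abstract can be illustrated with a minimal sketch. This is not the DenoiseIt implementation: the function name `denoise_sketch`, its parameters, the row normalization of the rank profiles, and the rule that assigns each gene to its dominant rank are all assumptions made for illustration, built on scikit-learn's `NMF` and `IsolationForest`.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.ensemble import IsolationForest


def denoise_sketch(X, n_ranks=8, seed=0):
    """Hypothetical sketch: flag outlier NMF ranks and drop their genes.

    X is a non-negative (genes x samples) expression matrix.
    Returns (keep, outlier_ranks), where keep is a boolean mask over genes.
    """
    # Factorize X ~ W @ H: W is (genes x ranks), H is (ranks x samples).
    model = NMF(n_components=n_ranks, init="nndsvda",
                max_iter=500, random_state=seed)
    W = model.fit_transform(X)
    H = model.components_

    # Assumption: describe each rank by its normalized sample profile,
    # so ranks dominated by a few samples stand out from the rest.
    H_norm = H / (H.sum(axis=1, keepdims=True) + 1e-12)

    # Isolation forest labels each rank as inlier (1) or outlier (-1).
    forest = IsolationForest(n_estimators=200, random_state=seed)
    labels = forest.fit_predict(H_norm)
    outlier_ranks = np.where(labels == -1)[0]

    # Assumption: a gene belongs to the rank with its largest loading in W;
    # genes whose dominant rank is an outlier are removed.
    dominant_rank = W.argmax(axis=1)
    keep = ~np.isin(dominant_rank, outlier_ranks)
    return keep, outlier_ranks
```

A call such as `keep, outlier_ranks = denoise_sketch(X)` followed by `X[keep]` would give the filtered matrix. Whether any rank is flagged depends on the data and on the isolation forest's default contamination setting, so the number of retained genes will vary.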

Original language: English
Article number: 271
Journal: BMC Bioinformatics
Volume: 25
Issue number: 1
State: Published - Dec 2024

Keywords

  • Filtering
  • Gene
  • Matrix factorization
  • Noise
