Zero inflated high dimensional compositional data with DeepInsight

Research output: Contribution to journal › Article › peer-review


Abstract

Since the Human Microbiome Project, human-associated microbiomes have been studied across many fields. Sequencing techniques such as Next Generation Sequencing (NGS) and High-Throughput Sequencing (HTS) have made it possible to measure a wide range of microbiome features, and have contributed to the development of numerical proxies such as Operational Taxonomic Units (OTUs) and Amplicon Sequence Variants (ASVs). Studies that use such microbiome data often face two problems: zero inflation and high dimensionality. This study addresses both, in line with the recent emphasis on interpreting microbiome data as compositional. To handle zero inflation in compositional microbiome data, we mapped the data onto the surface of the hypersphere using a square root transformation. To handle high dimensionality, we modified DeepInsight, a method that converts non-image data into images for use with Convolutional Neural Networks (CNNs), to fit the hypersphere space. Furthermore, to resolve the common difficulty of distinguishing true zero values from fake zero values in zero-inflated images, we added a small value to the true zero values. We validated our approach using pediatric inflammatory bowel disease (IBD) fecal sample data and achieved an area under the curve (AUC) of 0.847, which is higher than the previous study’s result of 0.83.
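
The square root transformation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `closure` and `sqrt_to_sphere` and the toy count matrix are assumptions made for illustration. The idea is that a composition (non-negative parts summing to 1) has a square root vector with unit Euclidean norm, so each sample lands on the unit hypersphere, and zeros need no special handling because no logarithm is taken.

```python
import numpy as np


def closure(counts):
    """Rescale each row of a non-negative count matrix so it sums to 1."""
    counts = np.asarray(counts, dtype=float)
    if np.any(counts < 0):
        raise ValueError("counts must be non-negative")
    totals = counts.sum(axis=1, keepdims=True)
    if np.any(totals == 0):
        raise ValueError("every sample needs at least one non-zero count")
    return counts / totals


def sqrt_to_sphere(counts):
    """Map each sample to the unit hypersphere via the square root of its composition."""
    return np.sqrt(closure(counts))


# Hypothetical toy data: 2 samples x 4 taxa, with several zero counts.
counts = np.array([
    [10, 0, 30, 60],
    [0, 0, 5, 95],
])

points = sqrt_to_sphere(counts)

# Each row has unit L2 norm because the squared entries are the proportions.
norms = np.linalg.norm(points, axis=1)
print(points)
print(norms)
```

Zero counts stay exactly zero after the transformation, which is why the later step of separating true zeros from empty image pixels still matters once the data are rendered as images.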

Original language: English
Article number: e0320832
Journal: PLoS ONE
Volume: 20
Issue number: 4 (April)
State: Published - Apr 2025
