DCBT-Net: Training Deep Convolutional Neural Networks with Extremely Noisy Labels

Bekhzod Olimov, Jeonghong Kim, Anand Paul

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

Obtaining data with correct labels is crucial to attaining state-of-the-art performance with Convolutional Neural Network (CNN) models. However, labeling datasets is a time-consuming and expensive process because it requires expert knowledge in a particular domain. As a result, real-life datasets often contain incorrect labels introduced by nonexperts involved in the data-labeling process. Although the issue of poorly labeled datasets has been studied, the existing methods are complex and difficult to reproduce. In this study, we propose a simpler algorithm called 'Deep Clean Before Training Net' (DCBT-Net), which cleans wrongly labeled data points using information from the eigenvalues of the Laplacian matrix obtained from similarities between the data samples. A deep CNN (DCNN) is then trained on the cleaned data. In our experiments, the performance of DCBT-Net was tested on three publicly available datasets, namely, the Modified National Institute of Standards and Technology (MNIST) database of handwritten digits, the Canadian Institute for Advanced Research (CIFAR) dataset, and the WebVision1000 dataset. The proposed method achieved better results than the existing state-of-the-art methods when assessed using several evaluation metrics. Specifically, DCBT-Net attained an average increase in accuracy of 15%, 20%, and 3% on the MNIST database, the CIFAR-10 dataset, and the WebVision dataset, respectively. The proposed approach also demonstrated better results in the specificity, sensitivity, positive predictive value, and negative predictive value metrics.
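The abstract does not spell out the cleaning procedure, so the following is only a minimal illustrative sketch of the general idea of using the Laplacian spectrum of a similarity graph to spot suspicious labels. It is not the authors' published algorithm. Everything in it is an assumption made for illustration: the RBF similarity, the symmetric normalized Laplacian, the per-class split along the second eigenvector, and the hypothetical names `rbf_similarity`, `laplacian_spectrum`, `flag_suspect_labels`, `gamma`, and `eig_threshold`.

```python
import numpy as np


def rbf_similarity(X, gamma=1.0):
    """Pairwise RBF similarities exp(-gamma * ||xi - xj||^2) (assumed kernel)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)


def laplacian_spectrum(W):
    """Eigenvalues (ascending) and eigenvectors of L = I - D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    return np.linalg.eigh(L)


def flag_suspect_labels(X, y, gamma=1.0, eig_threshold=0.1):
    """Flag samples that look mislabeled within their assigned class.

    Assumed heuristic: if the similarity graph of one labeled class has a
    small second Laplacian eigenvalue, the class nearly splits into two
    groups, and the smaller group is treated as suspect.
    """
    suspect = np.zeros(len(y), dtype=bool)
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        if len(idx) < 3:
            continue  # too few samples to say anything
        vals, vecs = laplacian_spectrum(rbf_similarity(X[idx], gamma))
        if vals[1] >= eig_threshold:
            continue  # class graph is well connected; keep all samples
        side = vecs[:, 1] > 0  # sign of the second eigenvector splits the class
        minority = side if side.sum() < (~side).sum() else ~side
        suspect[idx[minority]] = True
    return suspect


# Toy data: two tight groups on a line, with two labels swapped in each group.
X = np.array([[0.1 * i, 0.0] for i in range(10)]
             + [[4.0 + 0.1 * i, 0.0] for i in range(10)])
y = np.array([0] * 10 + [1] * 10)
y[[0, 1]] = 1      # mislabel two samples from the first group
y[[10, 11]] = 0    # mislabel two samples from the second group

suspect = flag_suspect_labels(X, y)
X_clean, y_clean = X[~suspect], y[~suspect]  # data that would go on to training
print("flagged indices:", np.flatnonzero(suspect))
```

In this sketch the cleaned arrays `X_clean` and `y_clean` stand in for the data that would then be passed to a DCNN. For image data, the similarities would more plausibly be computed on learned feature vectors than on raw pixels, and a minority-based rule like this one would not hold up when most labels in a class are wrong.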

Original language: English
Article number: 9276394
Pages (from-to): 220482-220495
Number of pages: 14
Journal: IEEE Access
Volume: 8
State: Published - 2020

Keywords

  • Clustering
  • deep convolutional neural networks
  • eigenvalues and eigenvectors
  • image classification
  • noisy (corrupted) labels
