A study on the waveform-based end-to-end deep convolutional neural network for weakly supervised sound event detection

Seokjin Lee, Minhan Kim, Youngho Jeong

Research output: Contribution to journal › Article › peer-review


Abstract

In this paper, a deep convolutional neural network for sound event detection is studied. In particular, an end-to-end neural network, which generates detection results directly from the input audio waveform, is studied for the weakly supervised problem, which involves weakly labeled and unlabeled data. The proposed system is based on a network structure consisting of deeply stacked 1-dimensional convolutional neural networks, enhanced by skip connections and a gating mechanism. Additionally, the system is enhanced by sound event detection post-processing, and a training step using the mean-teacher model is added to handle the weakly supervised data. The proposed system was evaluated on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 Task 4 dataset, and the results show that the proposed system achieves F1-scores of 54 % (segment-based) and 32 % (event-based).
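The building block described in the abstract, a 1-D convolution combined with a gating mechanism and a skip connection, can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: the single-channel setting, the 3-tap kernels, and the sigmoid-gated residual form (output = x + conv(x) · σ(conv(x))) are assumptions for clarity.

```python
import math

def conv1d(x, kernel):
    """'Same'-padded 1-D convolution (single channel, stride 1)."""
    k = len(kernel)
    pad = k // 2
    xp = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(kernel[j] * xp[i + j] for j in range(k)) for i in range(len(x))]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gated_conv_block(x, w_feat, w_gate):
    """One gated 1-D convolution block with a residual skip connection:
    output = x + conv(x, w_feat) * sigmoid(conv(x, w_gate))."""
    feat = conv1d(x, w_feat)
    gate = [sigmoid(v) for v in conv1d(x, w_gate)]
    return [xi + f * g for xi, f, g in zip(x, feat, gate)]

# Toy waveform segment with hypothetical 3-tap kernels
x = [0.1, 0.5, -0.2, 0.3]
y = gated_conv_block(x, w_feat=[0.2, 0.5, 0.2], w_gate=[0.1, 0.1, 0.1])
print(len(y) == len(x))  # skip connection requires matching lengths
```

Stacking many such blocks yields the deep waveform-to-detection network; the skip connections keep gradients flowing through the deep stack, while the learned gates modulate which time regions pass forward.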

Original language: English
Pages (from-to): 24-31
Number of pages: 8
Journal: Journal of the Acoustical Society of Korea
Volume: 39
Issue number: 1
DOIs
State: Published - Jan 2020

Keywords

  • Deep convolutional neural network
  • End-to-end neural network
  • Sound event detection
  • Weakly supervised training

