Data Augmentation using Checked Pattern Mask for Acoustic Scene Classification

Seungjae Baek, Seokjin Lee

Research output: Contribution to journal › Conference article › peer-review

Abstract

Recently, acoustic scene classification (ASC) algorithms, which classify environmental audio signals into acoustic scenes such as shopping malls and trains, have mainly relied on machine learning. Various techniques, such as subsystem ensembles, network dropout, and data augmentation, have been employed to overcome overfitting and improve the performance of machine-learning systems. In particular, data augmentation is easy to use because it modifies only the input data, without changing the predefined model's size or structure. However, existing data augmentation methods sometimes generate meaningless data that cannot improve ASC performance. Therefore, we tried to maximize the augmentation of meaningful data using the GridMask augmentation method, which was developed for image processing systems. Because the temporal and frequency axes of an audio spectrogram represent different quantities, whereas the two axes of an image are equivalent, we propose checked pattern masking to alter the data while preventing the loss of important information. The proposed masking strategy is promising because it does not mask a large part of the data at once. The performance of the proposed method was evaluated using ASC data from the Detection and Classification of Acoustic Scenes and Events (DCASE) 2021 Task 1A dataset.
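
The abstract does not give the mask geometry, so the following is only a minimal sketch of what a checked (checkerboard-style) mask on a spectrogram might look like, not the authors' implementation. The function name `checked_pattern_mask`, the block sizes `freq_block` and `time_block`, the random offsets, and the choice to fill masked cells with zeros are all assumptions made for illustration. Separate block sizes are used for the two axes because the abstract stresses that frequency and time are not interchangeable.

```python
import numpy as np

def checked_pattern_mask(spec, freq_block=8, time_block=16, rng=None):
    """Zero out alternating rectangular cells of a (freq, time) spectrogram.

    Hypothetical sketch: block sizes and random offsets are assumptions,
    not values taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_freq, n_time = spec.shape

    # Random offsets shift the pattern so it lands differently on each call.
    f_off = int(rng.integers(0, 2 * freq_block))
    t_off = int(rng.integers(0, 2 * time_block))

    # Block index of every frequency bin and every time frame.
    f_idx = (np.arange(n_freq) + f_off) // freq_block
    t_idx = (np.arange(n_time) + t_off) // time_block

    # Keep a cell when the sum of its block indices is even; mask it otherwise.
    keep = (f_idx[:, None] + t_idx[None, :]) % 2 == 0
    return spec * keep, keep

# Usage on a dummy 64-bin, 100-frame spectrogram.
spec = np.ones((64, 100))
masked, keep = checked_pattern_mask(spec, rng=np.random.default_rng(0))
```

In this sketch no masked region is larger than one `freq_block` by `time_block` cell, which is one way to read the abstract's point that the strategy avoids masking a large part of the data at once.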

Original language: English
Journal: Proceedings of the International Congress on Acoustics
State: Published - 2022
Event: 24th International Congress on Acoustics, ICA 2022 - Gyeongju, Korea, Republic of
Duration: 24 Oct 2022 - 28 Oct 2022

Keywords

  • acoustic scene classification
  • data augmentation
  • machine learning
