Abstract
Recognizing surgical instruments in surgery videos is an essential step in describing surgeries, and it can be used in surgery navigation and evaluation systems. In this paper, we argue that class imbalance is a crucial problem when training deep neural networks to recognize surgical instruments from data collected from surgery videos, since surgical instruments do not appear uniformly in a video. To address this problem, we use a generative adversarial network (GAN)-based approach to supplement insufficient training data, so that the training set contains a balanced number of images for each class. However, conventional GANs such as CycleGAN and DiscoGAN tend to degrade when generating surgery images, and they were not effective at increasing the accuracy of surgical instrument recognition under our experimental settings. For this reason, we propose a novel GAN framework, referred to as DavinciGAN, and we demonstrate that our method outperforms conventional GANs on the surgical instrument recognition task when generated training samples are used to complement the imbalanced distribution of human-labeled data.
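To make the class-balancing idea in the abstract concrete, the following is a minimal sketch (not the DavinciGAN architecture itself): it tops up every under-represented class with GAN-generated images until all classes match the largest one. The `generator(label)` callable and the `(image, label)` dataset layout are assumptions made purely for illustration.

```python
# Illustrative sketch of class balancing with GAN-generated samples.
# `generator` is a stand-in for a trained image generator, not DavinciGAN itself.
from collections import Counter

def balance_with_gan(dataset, generator):
    """Augment `dataset` (a list of (image, label) pairs) so every class
    reaches the size of the largest class, using `generator(label)` to
    synthesize additional images for under-represented classes."""
    counts = Counter(label for _, label in dataset)
    target = max(counts.values())          # every class is topped up to this count
    augmented = list(dataset)
    for label, count in counts.items():
        for _ in range(target - count):    # deficit for this class
            augmented.append((generator(label), label))
    return augmented
```

In the paper's setting, such a generator would be an image-to-image translation model trained on surgery frames; here it is left abstract so the balancing logic stands on its own.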
Original language | English |
---|---|
Pages (from-to) | 326-336 |
Number of pages | 11 |
Journal | Proceedings of Machine Learning Research |
Volume | 102 |
State | Published - 2019 |
Event | 2nd International Conference on Medical Imaging with Deep Learning, MIDL 2019 - London, United Kingdom, 8 Jul 2019 → 10 Jul 2019 |
Keywords
- data augmentation
- Generative adversarial network (GAN)
- image-to-image translation
- self-attention