Late Fusion-Based Video Transformer for Facial Micro-Expression Recognition

Jiuk Hong, Chaehyeon Lee, Heechul Jung

Research output: Contribution to journal › Article › peer-review

13 Scopus citations

Abstract

In this article, we propose a novel model for facial micro-expression (FME) recognition. The proposed model is built on a transformer, an architecture recently adopted in computer vision that had not previously been applied to FME recognition. Because a transformer requires far more training data than a convolutional neural network, we use motion features, such as optical flow, together with late fusion to compensate for the small size of FME datasets. The proposed method was verified and evaluated on the SMIC and CASME II datasets. Our approach achieved state-of-the-art (SOTA) performance on SMIC, with an unweighted F1 score (UF1) of 0.7447 and an accuracy (Acc.) of 73.17%, which are 0.31 and 1.8% higher than the previous SOTA, respectively. Furthermore, the CASME II experiment yielded a UF1 of 0.7106 and an Acc. of 70.68%, which are comparable with the SOTA.
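The late-fusion idea described in the abstract can be illustrated with a minimal sketch: each stream (e.g., one over raw frames and one over optical-flow maps) produces its own class prediction, and the predictions are combined only at the end. The stream names, logit values, and averaging rule below are illustrative assumptions, not the paper's exact fusion scheme.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical per-stream class logits for one video clip:
# one stream fed raw frames, the other fed optical-flow maps.
raw_logits = np.array([2.0, 0.5, -1.0])
flow_logits = np.array([1.5, 1.0, -0.5])

# Late fusion: each stream is run to completion independently,
# and only the final probabilities are combined (here, averaged).
fused = (softmax(raw_logits) + softmax(flow_logits)) / 2
pred = int(np.argmax(fused))  # index of the fused predicted class
```

This contrasts with early fusion, where raw and motion inputs would be concatenated before entering a single network; late fusion lets each modality be modeled separately, which helps when training data is scarce.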

Original language: English
Article number: 1169
Journal: Applied Sciences (Switzerland)
Volume: 12
Issue number: 3
DOIs
State: Published - 1 Feb 2022

Keywords

  • Deep learning
  • Emotion recognition
  • Facial micro-expression
  • Image processing
  • Vision transformer
