TY - GEN
T1 - Enhancing User Acceptance in Autonomous Vehicles
T2 - 16th International Conference on Ubiquitous and Future Networks, ICUFN 2025
AU - Colaco, Savina Jassica
AU - Mohamud, Safaa Abdullahi Moallim Moallim
AU - Baek, Minjin
AU - Eun, Seung Woo
AU - Han, Dong Seog
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Autonomous vehicles (AVs) have the potential to transform transportation by improving road safety and enhancing mobility. However, widespread user acceptance remains a significant challenge due to persistent concerns about trust, system transparency, and the lack of human-centered communication. Existing AV systems often fail to provide real-time, interpretable explanations of their actions, particularly during unexpected or abrupt maneuvers. Furthermore, passenger discomfort is frequently overlooked, resulting in reduced user confidence and engagement. To address these challenges, this paper presents an integrated framework that combines hierarchical Vision-Language Model (VLM)-based scene understanding with passenger emotion recognition to improve AV decision transparency. A structured question-answering pipeline interprets the driving environment, while an in-cabin emotion recognition module monitors passenger reactions. When a negative emotional response is detected during a critical driving event, such as an emergency stop, the system generates a natural language explanation to clarify the AV's behavior. Experimental results from the test scenarios demonstrate that the proposed framework is feasible for generating real-time explanations and detecting passenger emotional responses during critical driving events. The findings highlight the importance of affective feedback in advancing human-centered AV systems.
AB - Autonomous vehicles (AVs) have the potential to transform transportation by improving road safety and enhancing mobility. However, widespread user acceptance remains a significant challenge due to persistent concerns about trust, system transparency, and the lack of human-centered communication. Existing AV systems often fail to provide real-time, interpretable explanations of their actions, particularly during unexpected or abrupt maneuvers. Furthermore, passenger discomfort is frequently overlooked, resulting in reduced user confidence and engagement. To address these challenges, this paper presents an integrated framework that combines hierarchical Vision-Language Model (VLM)-based scene understanding with passenger emotion recognition to improve AV decision transparency. A structured question-answering pipeline interprets the driving environment, while an in-cabin emotion recognition module monitors passenger reactions. When a negative emotional response is detected during a critical driving event, such as an emergency stop, the system generates a natural language explanation to clarify the AV's behavior. Experimental results from the test scenarios demonstrate that the proposed framework is feasible for generating real-time explanations and detecting passenger emotional responses during critical driving events. The findings highlight the importance of affective feedback in advancing human-centered AV systems.
KW - Autonomous vehicles
KW - Emotion recognition
KW - Human-centered interaction
KW - Vision-language models
UR - https://www.scopus.com/pages/publications/105018745313
U2 - 10.1109/ICUFN65838.2025.11169772
DO - 10.1109/ICUFN65838.2025.11169772
M3 - Conference contribution
AN - SCOPUS:105018745313
T3 - International Conference on Ubiquitous and Future Networks, ICUFN
SP - 65
EP - 69
BT - ICUFN 2025 - 16th International Conference on Ubiquitous and Future Networks
PB - IEEE Computer Society
Y2 - 8 July 2025 through 11 July 2025
ER -