Your blush gives you away: detecting hidden mental states with remote photoplethysmography and thermal imaging

PeerJ Comput Sci. 2024 Mar 20;10:e1912. doi: 10.7717/peerj-cs.1912. eCollection 2024.

Abstract

Multimodal emotion recognition techniques are increasingly essential for assessing mental states. Image-based methods, however, tend to focus predominantly on overt visual cues and often overlook subtler mental state changes. Psychophysiological research has demonstrated that heart rate (HR) and skin temperature are effective in detecting autonomic nervous system (ANS) activity, thereby revealing these subtle changes. However, traditional HR measurement tools are generally costly and poorly portable, while skin temperature analysis usually requires extensive manual processing. Remote photoplethysmography (r-PPG) and automatic thermal region-of-interest (ROI) detection algorithms have been developed to address these issues, yet their accuracy in practical applications remains limited. This study aims to bridge this gap by integrating r-PPG with thermal imaging to enhance prediction performance. Ninety participants completed a 20-min questionnaire designed to induce cognitive stress, followed by a film intended to elicit moral elevation. The results demonstrate that the combination of r-PPG and thermal imaging effectively detects emotional shifts. Using r-PPG alone, prediction accuracy reached 77% for cognitive stress and 61% for moral elevation with a support vector machine (SVM). Thermal imaging alone achieved 79% accuracy for cognitive stress and 78% for moral elevation with a random forest (RF) algorithm. An early fusion of the two modalities significantly improved accuracy, reaching 87% for cognitive stress and 83% for moral elevation with RF. Further analysis using statistical metrics and explainable machine learning methods, including SHapley Additive exPlanations (SHAP), highlighted key features and clarified the relationship between cardiac responses and facial temperature variations. Notably, cardiovascular features derived from r-PPG had a more pronounced influence in the fused model, despite thermal imaging's higher predictive accuracy in unimodal analysis.
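The early fusion strategy described above — concatenating r-PPG and thermal feature vectors into a single input before training a random forest — can be sketched as follows. This is a minimal illustration with synthetic data, not the authors' pipeline: the feature counts, labels, and hyperparameters are assumptions, and the built-in `feature_importances_` attribute stands in for the SHAP analysis used in the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 90  # illustrative: the study had 90 participants

# Hypothetical per-participant feature blocks (dimensions are assumptions):
rppg_features = rng.normal(size=(n, 8))     # e.g., HR / HRV metrics from r-PPG
thermal_features = rng.normal(size=(n, 6))  # e.g., mean temperatures of facial ROIs
labels = rng.integers(0, 2, size=n)         # e.g., stress vs. baseline (synthetic)

# Early fusion: concatenate modalities at the feature level before training
X = np.hstack([rppg_features, thermal_features])
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.3, random_state=0
)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))

# Impurity-based importances as a rough proxy for SHAP-style attribution:
# they indicate which modality's features the fused model leans on.
importances = clf.feature_importances_
```

With real data, replacing `feature_importances_` with a SHAP `TreeExplainer` would yield the per-feature, per-sample attributions the authors report.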

Keywords: Autonomic nervous system; Cognitive stress; Explainable machine learning; Heart rate variability; Moral elevation; Multimodal emotion recognition; Multimodal fusion; Remote photoplethysmography; SHAP analysis; Thermal imaging.

Grants and funding

This work was supported by the Beijing Normal University at Zhuhai Researcher Activation Fund (Grant No. 310432101), the Shenzhen Key Laboratory of Next Generation Interactive Media Innovative Technology (Grant No. ZDSYS20210623092001004), Shenzhen R & D Sustainable Development Funding (KCXFZ20230731093600002), the Shenzhen Key Research Base of Humanities, Social Sciences for People’s Well-being Benchmarking Study (Grant No. 202003) and the Guangdong Digital Mental Health and Intelligent Generation Laboratory (Grant No. 2023WSYS010). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.