Emotion recognition at a distance: The robustness of machine learning based on hand-crafted facial features vs deep learning models
Image and Vision Computing, ISSN: 0262-8856, Vol: 136, Page: 104724
2023
- 14 Citations
- 57 Captures
- 1 Mention
Metric Options: selecting the 1-year or 3-year option changes the metrics count to percentiles, illustrating how an article or review compares to other articles or reviews within the selected time period in the same journal. The 1-year option compares the metrics against other articles/reviews published in the same calendar year; the 3-year option compares them against other articles/reviews published in the same calendar year plus the two years prior.
Example: if you select the 1-year option for an article published in 2019 and a metric category shows 90%, that means that the article or review is performing better than 90% of the other articles/reviews published in that journal in 2019. If you select the 3-year option for the same article published in 2019 and the metric category shows 90%, that means that the article or review is performing better than 90% of the other articles/reviews published in that journal in 2019, 2018 and 2017.
Citation Benchmarking is provided by Scopus and SciVal and is different from the metrics context provided by PlumX Metrics.
Most Recent News
Recent Studies from University of Salerno Add New Data to Machine Learning (Emotion Recognition at a Distance: The Robustness of Machine Learning Based on Hand-crafted Facial Features vs Deep Learning Models)
2023 AUG 14 (NewsRx) -- By a News Reporter-Staff News Editor at Robotics & Machine Learning Daily News
Article Description
Emotion estimation from facial expression analysis is now a widely explored computer vision task. In turn, the classification of expressions relies on relevant facial features and their dynamics. Despite the promising accuracy achieved in controlled and favorable conditions, the processing of faces acquired at a distance, which entails low-quality images, still suffers a significant performance decrease. In particular, most approaches and related computational models become extremely unstable given the very small number of useful pixels typical of these conditions. Therefore, their behavior should be investigated more carefully. On the other hand, real-time emotion recognition at a distance may play a critical role in smart video surveillance, especially when monitoring particular kinds of events, e.g., political meetings, in order to prevent adverse actions. This work compares facial expression recognition at a distance by: 1) a deep learning architecture based on state-of-the-art (SOTA) proposals, which exploits whole images to autonomously learn the relevant embeddings; 2) a machine learning approach that relies on hand-crafted features, namely facial landmarks preliminarily extracted with the popular Mediapipe framework. Instead of using either the complete sequence of frames or only the final still image of the expression, as current SOTA approaches do, the two proposed methods are designed to use rich temporal information to identify three different stages of emotion. Expressions are accordingly time-split into four phases to better exploit their time-dependent dynamics. Experiments were conducted on the popular Extended Cohn-Kanade (CK+) dataset, chosen for its wide use in the related literature and because it includes videos of facial expressions rather than only still images.
The results show that the approach relying on machine learning via hand-crafted features is more suitable for classifying the initial phases of the expression and does not decay in accuracy when images are taken at a distance (only a 0.08% accuracy decay). On the contrary, deep learning not only struggles to classify the initial phases of the expressions but also suffers a substantial performance drop on images at a distance (a 52.68% accuracy decay).
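The article description mentions time-splitting each expression sequence into four phases to exploit its temporal dynamics, but does not reproduce the splitting procedure. As a minimal, hypothetical sketch (plain Python, no Mediapipe dependency; the function name and equal-length splitting policy are assumptions, not the authors' published method), one way to partition an ordered frame sequence into contiguous temporal phases is:

```python
def split_into_phases(frames, n_phases=4):
    """Split an ordered frame sequence into n_phases contiguous
    temporal segments of (near-)equal length.

    Any remainder frames are distributed one each to the earliest
    phases, so segment lengths differ by at most one frame.
    """
    n = len(frames)
    if n < n_phases:
        raise ValueError("need at least one frame per phase")
    base, rem = divmod(n, n_phases)
    phases, start = [], 0
    for i in range(n_phases):
        size = base + (1 if i < rem else 0)  # earlier phases absorb the remainder
        phases.append(frames[start:start + size])
        start += size
    return phases
```

For a 10-frame clip this yields four segments of lengths 3, 3, 2, and 2, so per-phase features (e.g., landmark trajectories) can be computed over each segment independently.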
Bibliographic Details
- http://www.sciencedirect.com/science/article/pii/S0262885623000987
- http://dx.doi.org/10.1016/j.imavis.2023.104724
- http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=85162122662&origin=inward
- https://linkinghub.elsevier.com/retrieve/pii/S0262885623000987
- https://dx.doi.org/10.1016/j.imavis.2023.104724
Elsevier BV