Enhanced Video Content Analysis for Emotion Measurement: A Potential Shift in the Heuristical Approach Towards Video-Based Emotions Analytics
CHAN-DOCUMENT-2017.pdf (32.95 MB) (embargoed until 2019-05-01)
Citation: Chan, Stephen. 2017. Enhanced Video Content Analysis for Emotion Measurement: A Potential Shift in the Heuristical Approach Towards Video-Based Emotions Analytics. Master's thesis, Harvard Extension School.
Abstract: Video Content Analysis (VCA) for emotion measurement, the effort to determine the emotional state of a person of interest from a given video stream, has several critical applications ranging from retail marketing to public security. A number of providers currently offer turnkey systems that perform this type of VCA. For this project, we conducted an experiment in which several performers/test subjects expressed a constrained range of emotions, and we demonstrated that current emotions analytics software systems, in many cases, do not correctly identify some of the more ambiguous expressions that may be used to hide the true emotional state of the person being analyzed. Accordingly, we investigated alternative open-source application programming interfaces (APIs) and software development kits (SDKs) that may be combined with each other to read additional clues and thereby improve the predictions of these VCA systems.
This project takes the example of a smile and describes the challenges of successfully performing emotion measurement on even this basic facial expression. For instance, a false smile has different indicators than a genuine smile, yet many commercial off-the-shelf (COTS) emotions analytics software systems simply analyze the smile itself rather than looking for additional, pertinent facial-expression and bodily-gesture clues. Within this document, we illuminate the importance of concentrating on these “non-obvious” facial and bodily areas, rather than simply zooming into the “obvious” area of the actual smile, when distinguishing posed, false smiles from spontaneous, genuine ones.
An accompanying video project, produced at high quality with high-definition cameras, demonstrates several key points. First, many VCA systems in the current emotion measurement ecosystem operate on footage acquired by cameras of insufficient resolution to capture micro-facial expressions or very fine facial features, such as the subtler facial wrinkles. These micro-facial expressions can serve as vital clues when determining the genuineness of a smile. Second, the field of view of the video capture is critical for capturing not only certain key facial expressions but also pertinent bodily gestures. Third, the length of the video capture itself plays a critical role: the “onset time” (the time for the smile to form) and the “fade time” (the time for it to dissipate) can provide key insight into whether the smile was real or false, and therefore have a profound impact on whether the analysis succeeds. This document sets the stage for, analyzes, and discusses the process and results of the described video project.
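As a rough illustration of the onset/fade analysis described above, the following Python sketch estimates formation and dissipation times from a sequence of per-frame smile-intensity scores, such as might be produced by an open-source facial-expression detector. The function name, thresholds, and scoring scheme are illustrative assumptions, not the method used in the thesis.

```python
def smile_timing(intensity, fps, low=0.2, high=0.8):
    """Estimate (onset_seconds, fade_seconds) for the first smile episode.

    `intensity` is a list of per-frame smile scores in [0, 1]; `fps` is
    the capture frame rate. Onset is measured from the frame where the
    score first rises above `low` to the frame where it reaches `high`;
    fade is measured from the end of the sustained-high run back down
    to `low`. Returns None if no full smile forms. (Thresholds are
    hypothetical, for illustration only.)
    """
    # Find the first frame where the smile is fully formed.
    try:
        peak_start = next(i for i, v in enumerate(intensity) if v >= high)
    except StopIteration:
        return None  # smile never fully forms

    # Walk back to where the rise out of the baseline began.
    start = peak_start
    while start > 0 and intensity[start - 1] > low:
        start -= 1

    # Find the end of the sustained-high run.
    peak_end = peak_start
    while peak_end + 1 < len(intensity) and intensity[peak_end + 1] >= high:
        peak_end += 1

    # Walk forward until the score decays back to the baseline.
    end = peak_end
    while end + 1 < len(intensity) and intensity[end + 1] > low:
        end += 1

    return (peak_start - start) / fps, (end - peak_end) / fps
```

A short onset with an abrupt fade, for example, would be one of the cues for a posed smile discussed in the document; in practice the intensity scores would come from frame-by-frame analysis of sufficiently high-resolution footage.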
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:33825951