Computers Are Getting Better Than Humans Are at Facial Recognition
Technology could chip away at freedom of expression—including our "right to lie."
Perceiving whether someone is sad, happy, or angry by the way he turns up his nose or knits his brow comes naturally to humans. Most of us are good at reading faces. Really good, it turns out.
So what happens when computers catch up to us? Recent advances in facial recognition technology could give anyone sporting a future iteration of Google Glass the ability to detect inconsistencies between what someone says (in words) and what that person says (with a facial expression). Technology is surpassing our ability to discern such nuances.
Scientists long believed humans could distinguish six basic emotions: happiness, sadness, fear, anger, surprise, and disgust. But earlier this year, researchers at Ohio State University found that humans are capable of reliably recognizing more than 20 facial expressions and corresponding emotional states, including a vast array of compound emotions like “happy surprise” or “angry fear.” Recognizing tone of voice and identifying facial expressions are tasks in the realm of perception where, traditionally, humans perform better than computers. Or rather, that used to be the case. As facial recognition software improves, computers are getting the edge. When a facial recognition program was set the same task as the Ohio State study, it achieved an accuracy rate of 96.9 percent in identifying the six basic emotions, and 76.9 percent for the compound emotions. Computers are now adept at figuring out how we feel.
Much of this kind of computation is based on the so-called Facial Action Coding System (FACS), a method developed by Paul Ekman, a specialist in facial micro-expressions, during the 1970s and 1980s. FACS decomposes emotional expressions into their distinct facial elements. In other words, it breaks down emotions to specific sets of facial muscles and movements: the widening of the eyes, the elevation of the cheeks, the dropping of the lower lip, and so on. FACS is used in the design and construction of characters in animated films. It’s also used by cognitive scientists to identify genes, chemical compounds, and neuronal circuits that regulate the production of emotions by the brain. Such mapping could be used in the diagnosis of disorders like autism or post-traumatic stress disorder, where there is difficulty in recognizing emotions from facial expressions.
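To make the idea concrete, here is a minimal sketch, in Python, of how a FACS-style coding might be represented in software: emotions as sets of numbered action units, with compound emotions built as unions of their components. The specific action-unit numbers and prototype sets below are simplified for illustration and are not the exact mappings used by Ekman’s system, the Ohio State researchers, or any commercial product.

```python
# Illustrative sketch of FACS-style decomposition: an emotion is treated as a
# set of facial "action units" (AUs), each tied to a specific muscle movement.
# AU numbers and prototype sets are simplified stand-ins for illustration.

ACTION_UNITS = {
    1: "inner brow raiser",
    4: "brow lowerer",
    5: "upper lid raiser",
    6: "cheek raiser",
    12: "lip corner puller",
    15: "lip corner depressor",
    26: "jaw drop",
}

# Toy prototypes: each basic emotion is a set of AUs; a compound emotion
# can be approximated as the union of its components.
PROTOTYPES = {
    "happiness": {6, 12},
    "sadness": {1, 4, 15},
    "surprise": {1, 5, 26},
}
PROTOTYPES["happy surprise"] = PROTOTYPES["happiness"] | PROTOTYPES["surprise"]

def closest_emotion(detected_aus: set[int]) -> str:
    """Return the prototype whose AU set best overlaps the detected AUs."""
    def score(label: str) -> float:
        proto = PROTOTYPES[label]
        # Jaccard overlap between the prototype and what was detected.
        return len(proto & detected_aus) / len(proto | detected_aus)
    return max(PROTOTYPES, key=score)

if __name__ == "__main__":
    # Raised cheeks, pulled lip corners, raised brows, and a dropped jaw.
    print(closest_emotion({1, 5, 6, 12, 26}))  # -> "happy surprise"
```

Real systems must first estimate which action units are active from video before any such matching step, but the decompose-then-match structure is the core of the approach.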
As surveillance technologies become more widespread, so do applications for sophisticated facial recognition software. These technologies appear to be ready to move from the laboratory into real life—commercialized, distributed to the masses in any number of fields, contexts, and situations. All this is happening at a time when computers are getting smarter and smarter at reading human emotion.
A team of researchers at the University of California, San Diego, founded the company Emotient, which uses machine-learning algorithms to detect emotion. The company is currently developing an app for Google Glass that, according to lead scientist Marian Bartlett, will soon be on the market. The app is designed to read, in real time, the emotional expressions of people who appear in the user’s field of vision. Though it’s still in the testing phase, it can already recognize happiness, sadness, anger, disgust, and even contempt.
Once this kind of technology is commercially available, any Google Glass user will be able to download it. And beyond recognizing specific emotions by analyzing patterns of facial movement, Bartlett’s team has tested another application of the technology: distinguishing fake emotional expressions from true ones. In other words, it can tell if you’re lying.
The app builds on the idea that true and false expressions of emotion involve different brain pathways. While true emotional expressions are executed by subcortical parts of the brain much like a reflex, fake expressions require conscious thought, which recruits regions of motor coordination in the cerebral cortex. As a result, the facial movements that accompany true and fake emotions end up different enough for a visual computation system to detect and distinguish them. And here’s the key: Computers can make these distinctions even when humans cannot.
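As a rough illustration of this reasoning, the toy sketch below fabricates intensity curves for a single action unit over time, summarizes each curve by simple dynamic features (peak intensity, onset speed, jerkiness), and trains a linear classifier to separate “genuine” from “posed” examples. The data, the features, and the classifier are all stand-ins; Bartlett’s actual system works from video with far richer inputs, and the specific dynamic differences assumed here are purely for demonstration.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def dynamics_features(au_intensity: np.ndarray) -> np.ndarray:
    """Summarize an action-unit intensity curve by its dynamics:
    peak intensity, fastest frame-to-frame rise (onset speed), and
    variance of frame-to-frame change (jerkiness)."""
    diffs = np.diff(au_intensity)
    return np.array([au_intensity.max(), diffs.max(), diffs.var()])

def toy_curve(genuine: bool, frames: int = 30) -> np.ndarray:
    """Fabricate an intensity curve: 'genuine' curves ramp up smoothly,
    'posed' curves start abruptly and are noisier (illustrative assumption)."""
    t = np.linspace(0, 1, frames)
    base = t ** 2 if genuine else (t > 0.3).astype(float)
    noise = rng.normal(0, 0.02 if genuine else 0.08, frames)
    return np.clip(base + noise, 0, 1)

# Build a small labeled set: 1 = genuine, 0 = posed.
labels = np.array([1, 0] * 100)
features = np.array([dynamics_features(toy_curve(genuine=bool(y))) for y in labels])

# Train on most of the data, evaluate on a held-out slice.
clf = SVC(kernel="linear").fit(features[:150], labels[:150])
print("held-out accuracy on toy data:", clf.score(features[150:], labels[150:]))
```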
In testing, the system developed by Bartlett managed, in real time, to identify 20 of the 46 facial movements described in FACS, according to a March report in Current Biology. Even more impressive, the system not only identifies expressions but distinguishes authentic ones from false ones, with an accuracy rate of 85 percent, at least in laboratory settings where the visual conditions are held constant. Humans weren’t nearly as skilled, logging an accuracy rate of about 55 percent.
Yes, Bartlett has incorporated a lie detector into facial recognition technology, one that promises to catch in the act anyone who tries to fake a given emotion or feeling. Facial recognition is evolving into emotional recognition, but it is computers, not just people, that are deciding what’s real. (If we add voice detection to face recognition, we end up with a complete lie-detection package.)
Technology and the Right to Lie
So we can begin to imagine a near future in which we’re equipped with glasses that not only recognize faces and voices, but also truths and lies—a scenario that would provoke a revolution in human interaction. It would also constitute a serious limitation on the individual’s autonomy. Let’s start with the implications of this technology at the level of social behavior and move on to analyze the implications at the individual level.
At the collective level, the occasional “little white lie” (the one that leads us to say “what a beautiful baby” when in reality we think it is very ugly, or that “the soup is marvelous” when in reality we know it tastes like bleach) is much more than a polite falsehood; it is a fundamental institution in the art of social survival and coexistence. Behind these small but strategic falsehoods hide critical social conventions. The near-ritual statement “we should do lunch sometime,” when we actually hope never to see the person again, or the enthusiastic interjection “I love your dress,” when we’d never be caught dead wearing it even as pajamas, are elements of cordiality and signs of respect that shape and embellish our social interactions. They are, in most cases, well-intentioned, offered by someone who only wants to be considerate, respectful, or simply nice. It is a social game we all play, and it suits us just fine.
At the individual level, the freedom to not tell the truth is an essential prerogative of our autonomy as human beings. What this technology jeopardizes is something that goes beyond the simple impossibility of lying without being caught. This technology represents an assault on our right to privacy, our right to identity, and our right to freedom of expression—which encompasses not just what we choose to say, but what we choose to keep to ourselves.
By making it possible to determine whether what we say and how we express ourselves corresponds to what we actually think and feel, this technology will usher in a new dimension of the violation of our privacy: a violation that attacks the most intimate aspect of our being, our thoughts and our feelings.
This new technology extends surveillance capabilities from monitoring actions to assessing emotions, in ways that make the individual ever more vulnerable to government authorities, marketers, employers, and any and every person with whom we interact. We can expect, in this future, a new wave of technologies that violate not only the privacy of our actions, but also the privacy of our emotions. It will be difficult to hide not just where we go, what we do, and what we buy, but also what we feel and think.
The permanent inspection of what we express and say will also condition the way we present ourselves, act, and want to be seen by others. In other words, this “lie detector” technology, by breaching the separation between a thought and its expression, by merging the behind-the-scenes of thought with the center stage of discourse, will irreversibly affect our identity.
If we cannot think one thing and say another, our identity will become monochromatic, losing a good part of its richness and diversity. Permanently compelled to express and publicize what we think and feel, under penalty of being called liars, we will lose the possibility of creating different impressions in the people with whom we interact; we will lose, for example, the ability to hide the less agreeable aspects of our nature or to inflate the most brilliant and attractive aspects of our personality.

The consequences could be catastrophic: it is enough to imagine what a job interview, or a first date, would be like with this new generation of glasses ready to decode our expressions and monitor our sensations. By making us absolutely transparent to the eyes of others, this technology will condition the process of building our identity, preventing us from wearing different masks, from adapting ourselves to different contexts, from playing different roles. Our identity will be subjugated to the pressure and tyranny of the permanent scrutiny of our thoughts and feelings.
As important as it is to defend and promote a society based on the values of truth and transparency, we must understand that—at the individual level and with regard to interpersonal relations—too much truth and transparency can be harmful. We must defend a space for non-truth, or rather, a space where we may live with our truth without having to expose and share it.
Deprived of the ability to omit or retouch the truth, under penalty of being caught out by an army of inquisitorial eyeglasses, we will find society nearly uninhabitable. The permanent confrontation with a verifiable truth will turn us into overly cautious, calculating, and suspicious people. The apparent truth of what we are and what we say will be derived not from personal perceptions, particular intuitions, and social judgments, but from complex calculations made by algorithms based on the way we use our voice, turn our nose to the right, or incline our mouth to the left.
It will be a mechanical and mechanized truth.
We run the serious risk of losing, little by little, our spontaneous humanity, and of coming to resemble, more and more, the deterministic algorithms that observe and judge us.