We count on machine learning systems for everything from creating playlists to driving cars, but like any tool, they can be bent toward dangerous and unethical purposes as well. Today’s illustration of this fact is a new paper from Stanford researchers, who have created a machine learning system that they claim can tell from a few pictures whether a person is gay or straight.

The research is as surprising as it is disconcerting. In addition to exposing an already vulnerable population to a new form of systematized abuse, it strikes directly at the egalitarian notion that we can’t (and shouldn’t) judge a person by their appearance, nor guess at something as private as sexual orientation from something as simple as a snapshot or two. But the accuracy of the system reported in the paper seems to leave no room for mistake: this is not only possible, it has been achieved.

It relies on cues apparently more subtle than most can perceive, cues many would suggest do not exist. And it demonstrates, as it is intended to, a class of threat to privacy that is entirely unique to the imminent era of ubiquitous computer vision.

Before discussing the system itself, it should be made clear that this research was by all indications done with good intentions. In an extensive set of authors’ notes that anyone commenting on the topic ought to read, Michal Kosinski and Yilun Wang address a variety of objections and questions. Most relevant are perhaps their remarks as to why the paper was released at all:

“We were really disturbed by these results and spent much time considering whether they should be made public at all. We did not want to enable the very risks that we are warning against. The ability to control when and to whom to reveal one’s sexual orientation is crucial not only for one’s well-being, but also for one’s safety.

“We felt that there is an urgent need to make policymakers and LGBTQ communities aware of the risks that they are facing. We did not create a privacy-invading tool, but rather showed that basic and widely used methods pose serious privacy threats.”

Certainly this is only one of many systematized attempts to derive secret information such as sexuality, emotional state or medical conditions. But it is a particularly concerning one, for several reasons.

The paper, due to be published in the Journal of Personality and Social Psychology, details a rather ordinary supervised-learning approach to the problem of identifying people as gay or straight from their faces alone. (Note: the paper is still in draft form.)

Using a database of facial imagery (from a dating site that makes its data public), the researchers collected 35,326 images of 14,776 people, with (self-identified) gay and straight men and women all equally represented. Their facial features were extracted and quantified: everything from nose and eyebrow shape to facial hair and expression.

A deep learning network crunched through all these features, finding which tended to be associated with individuals of a given sexual orientation. The researchers didn’t “seed” this with any preconceived notions of how gay or straight people look; the system merely correlated certain features with sexuality and identified patterns.
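To give a sense of how ordinary such a pipeline is, here is a minimal sketch of the general shape described above, using scikit-learn on random placeholder features. The feature dimensions, classifier choice and data here are illustrative assumptions, not the authors’ code; in the real system the features would come from face detection and a deep feature extractor rather than random numbers.

```python
# Sketch of a generic supervised pipeline: precomputed facial features in,
# a plain classifier on top. Placeholder data only; with random labels the
# score will hover near chance (~0.5). The point is the shape of the
# pipeline, not the result.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# 2,000 "faces", each reduced to a 128-dimensional feature vector
# (stand-ins for extracted landmarks or deep embeddings).
X = rng.normal(size=(2000, 128))
y = rng.integers(0, 2, size=2000)   # self-reported labels, 0/1

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# "Basic and widely used methods": an off-the-shelf logistic regression.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]

print("AUC on held-out faces:", roc_auc_score(y_test, scores))
```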
This diagram shows where features were found that were predictive of sexual orientation.

The system’s reported accuracy, it must be noted, applies only to its ideal situation of choosing between two people, one of whom is known to be gay. When the system evaluated a group of 1,000 faces, only 7 percent of which belonged to gay people (in order to be more representative of the actual proportion of the population), it did relatively poorly. Only its top 10 showed a 90 percent hit rate.
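Why performance drops so sharply at a realistic base rate is worth spelling out. The back-of-envelope calculation below makes the point; the 90 percent sensitivity and specificity figures are assumptions chosen for illustration, not numbers from the paper.

```python
# Base-rate illustration: a classifier that is right 90% of the time on each
# individual face still flags mostly straight people when only 7% of the
# group is gay. The 90% figures are assumptions, not the paper's results.
population = 1_000
prevalence = 0.07            # 70 gay people per 1,000, as in the evaluation
sensitivity = 0.90           # assumed true-positive rate
specificity = 0.90           # assumed true-negative rate

gay = prevalence * population                      # 70 people
straight = population - gay                        # 930 people

true_positives = sensitivity * gay                 # 63 flagged correctly
false_positives = (1 - specificity) * straight     # 93 flagged wrongly

precision = true_positives / (true_positives + false_positives)
print(f"Share of flagged faces that are actually gay: {precision:.0%}")  # ~40%
```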
There is also the very real possibility of bias in the data: for one thing, only young white Americans representing the binaries male/female and gay/straight were included.