Instagram Diagnosis: Machine Learning, Social Media and Predicting Mental Illness
A recent study in the journal EPJ Data Science used machine learning to try to identify depression among users of the photo-sharing platform Instagram by analysing the pictures they posted. The authors, Andrew Reece and Christopher Danforth, developed a model that, they argued, identified depressed users in their sample with 70% accuracy, compared with a 42% success rate for unassisted general practitioners (GPs). While promising at first glance, on closer examination the success is quite modest. Moreover, I am wary of using social media to predict depression, both because social media use can itself make people more vulnerable to depression, loneliness and anxiety, and because of other pernicious implications of these results.
Reece and Danforth examined 43,950 photos from 166 Instagram users, of whom 71 had depression as identified by standardised clinical surveys (4). The parameters they considered included the presence of people in photographs; colour, saturation and brightness; and how many ‘Likes’ or comments the images received. They also asked human participants to rate a random selection of these images for ‘happiness’, ‘sadness’, ‘likeability’ and ‘interestingness’, so that the machine learning results could be compared with human interpretations. They developed two models: a general model to distinguish users suffering from depression from those who were not, and a pre-diagnosis model to examine pictures posted before users were first diagnosed and predict which of them would later be diagnosed with depression.
They found that posts by depressed users tended to be bluer (higher hue), greyer (lower saturation) and darker (lower brightness) than those of healthy users. Depressed users were also less likely to use filters, and when they did they tended towards black-and-white filters like ‘Inkwell’ rather than the more colourful, vibrant ones like ‘Valencia’ that healthy users preferred. Depressed users were more likely to post photographs containing faces, but with fewer people per photograph on average (more selfies than group shots, for example), which the authors read as an indirect indicator of reduced social activity. All of these results were consistent with previous work on the relation between colour and mood, as well as with the association of depression with reduced social activity.
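These colour features are simple to compute: convert each image to HSV colour space and average the channels. Here is a minimal sketch of that kind of per-image statistic, using Pillow and NumPy; it is my own illustration, not the authors’ pipeline, and the filename is hypothetical.

```python
from PIL import Image
import numpy as np

def colour_features(path: str) -> dict:
    """Mean hue, saturation and brightness for one image, each scaled to 0-1."""
    hsv = np.asarray(Image.open(path).convert("HSV"), dtype=float) / 255.0
    return {
        "hue": hsv[..., 0].mean(),         # higher skews cooler/bluer
        "saturation": hsv[..., 1].mean(),  # lower reads as greyer
        "brightness": hsv[..., 2].mean(),  # lower reads as darker
    }

print(colour_features("post.jpg"))  # hypothetical filename
```

One caveat: hue is strictly an angle around the colour wheel, so ‘higher hue’ corresponds to ‘bluer’ only within the warm-to-cool range the study describes.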
The algorithm that Reece and Danforth trained on these data, they argued, identified 70% of depressed users: in a representative sample of 100 observations, it correctly flagged 37 of the 54 true cases, with 23 false positives and 17 misses. However, this is the general model, working on data from users who had already received a diagnosis of depression. Such photos differ from those posted before diagnosis, because receiving a diagnosis affects the way in which people perceive and portray themselves on social media (Reece and Danforth 2).
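Those counts are worth unpacking, since ‘accuracy’ can mean several different things. A quick worked example, treating the remaining 23 users as true negatives (my inference from the reported counts, not a figure the authors give):

```python
# Confusion-matrix arithmetic from the counts reported for the general model.
tp = 37  # depressed users correctly flagged
fn = 17  # depressed users missed
fp = 23  # healthy users wrongly flagged
tn = 100 - tp - fn - fp  # 23 true negatives, by inference; not reported directly

recall = tp / (tp + fn)        # ~0.69: the ~70% headline figure
precision = tp / (tp + fp)     # ~0.62: of those flagged, how many were depressed
specificity = tn / (tn + fp)   # 0.50: healthy users correctly cleared
accuracy = (tp + tn) / 100     # 0.60 overall

print(f"recall={recall:.2f} precision={precision:.2f} "
      f"specificity={specificity:.2f} accuracy={accuracy:.2f}")
```

Framed this way, the headline 70% is recall; overall accuracy comes out at about 60%, and half of the healthy users are wrongly flagged, which is partly why the success looks modest on closer inspection.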
The authors argue that machine learning is more robust at predicting depression than people are. The participants who rated the images were consistent with one another in the mood they assigned to an image (‘sad’ or ‘happy’), and in general depressed users’ pictures were rated sadder than healthy users’. But human ratings of sadness had no correlation with how blue, dark or grey an image was (9). This suggests that while a prolonged and excruciating feeling of low mood is a symptom of depression, sadness, an emotional state, is not the same as depression, a psychiatric disorder that often distorts the way one perceives oneself and one’s surroundings. It is, for example, possible to be sad and not depressed. However, the tasks that Reece and Danforth assigned the AI and the human participants were very different: participants were simply asked to assign subjective moods, not to predict depression from brightness, colour or saturation. The claim that machine learning is a better predictor than human observers is therefore not a fair comparison.
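The no-correlation finding is straightforward to test: pair each image’s mean colour statistics with its averaged human ratings and compute a correlation coefficient. A minimal sketch of that check, with purely illustrative values in place of the study’s data:

```python
import numpy as np
from scipy.stats import pearsonr

# Illustrative per-image values, not the study's data:
# mean hue (0-1) and mean human 'sadness' rating (0-5) for eight images.
mean_hue = np.array([0.55, 0.62, 0.48, 0.70, 0.51, 0.66, 0.58, 0.44])
sadness  = np.array([2.1, 3.4, 1.8, 2.0, 3.1, 2.6, 1.9, 2.8])

r, p = pearsonr(mean_hue, sadness)  # r near zero would mirror the paper's finding
print(f"Pearson r = {r:.2f}, p = {p:.3f}")
```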
As for pre-diagnosis cases, Reece and Danforth say only that their predictions ‘showed improvement’ over the benchmark for GPs (8); they do not report the actual prediction rates, which makes the success look very modest. What they do claim is that while ‘general practitioners discovered more true cases of depression, they were more likely than not to misdiagnose healthy individuals as depressed’, whereas their pre-diagnosis algorithm ‘was correct most of the time when it did predict a target’ as depressed (8). In other words, the GPs had higher recall but lower precision. Reece and Danforth attribute the weakness of their model to the small data set available for training their algorithm. Even so, the improvement in accuracy over GP diagnoses seems less radical when one considers that the comparison is with unassisted GPs, whereas it is common, though variable and far from mandatory, for GPs to use screening questionnaires such as the PHQ-9 to aid diagnosis.
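For context, the PHQ-9 is a nine-item self-report questionnaire that is trivial to score, which is partly why it is so widely used as a screening aid. A minimal sketch using the conventional severity bands; this illustrates the instrument itself, not anything in Reece and Danforth’s method, and is in no way a diagnostic tool.

```python
# PHQ-9 scoring: each of the nine items is answered 0-3
# (0 = "not at all", 3 = "nearly every day"), total 0-27.

def phq9_severity(responses: list[int]) -> tuple[int, str]:
    assert len(responses) == 9 and all(0 <= r <= 3 for r in responses)
    total = sum(responses)
    if total <= 4:
        band = "minimal"
    elif total <= 9:
        band = "mild"
    elif total <= 14:
        band = "moderate"
    elif total <= 19:
        band = "moderately severe"
    else:
        band = "severe"
    return total, band

print(phq9_severity([1, 2, 1, 0, 1, 2, 1, 1, 0]))  # (9, 'mild')
```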
This is a developing trend: according to a story in The Telegraph from March 2017, Facebook is already developing AI tools to identify users at risk of suicide, although only after its live-streaming service was used as a platform for a public tragedy. Whatever the utility of machine learning, however, I am uneasy about using social media to diagnose depression when reliance on social media has itself been shown to make users more vulnerable to depression, as in a study by Katerina Lup et al. There is also the controversial 2014 emotional-contagion experiment on Facebook by Adam Kramer et al., in which users were shown more negative stories to test whether they could be induced into negative emotional states. The very medium being used to predict depression, in other words, is also a contributing factor.
Finally, the fact that the pre-diagnosis model was rather modest (in that it identified fewer actual cases) suggests that its use as a diagnostic tool is limited. There is little diagnostic value in something that can only predict a diagnosis after it has occurred. Where the general model is effective, however, is in registering the self-perception and self-presentation of individuals who have already been diagnosed with depression. An algorithm like this could be used in advertising to identify and manipulate individuals with mental illness at a terrifying scale. There is a worrying precedent: in May 2017, Sam Levin reported in the Guardian on a leaked document describing how Facebook tracked users’ moods and shared psychological profiles of users with advertisers. I believe GPs can use machine learning to aid their diagnoses, but I remain unconvinced that the applications to social media suggested by Reece and Danforth are the way forward.
Works Cited:
Dean, Sam. ‘Facebook to Use Artificial Intelligence to Combat Suicide.’ The Telegraph 1 Mar. 2017. Web. 22 Aug. 2017. <http://www.telegraph.co.uk/technology/2017/03/01/facebook-use-artificial-intelligence-combat-suicides/>
Kramer, Adam D.I., et al. ‘Experimental Evidence of Massive-Scale Emotional Contagion Through Social Networks.’ PNAS 111.24 (17 Jun. 2014): 8788-90. Web. 22 Aug. 2017. DOI: 10.1073/pnas.1320040111
Levin, Sam. ‘Facebook Told Advertisers it Can Identify Teens Feeling “Insecure” and “Worthless”.’ The Guardian 1 May 2017. Web. 24 Aug. 2017. <https://www.theguardian.com/technology/2017/may/01/facebook-advertising-data-insecure-teens>
Lup, Katerina, et al. ‘Instagram #Instasad?: Exploring Associations Among Instagram Use, Depressive Symptoms, Negative Social Comparison, and Strangers Followed.’ Cyberpsychology, Behavior, and Social Networking 18.5 (2015): 247-52. Web. 22 Aug. 2017. DOI: 10.1089/cyber.2014.0560
Reece, Andrew G., and Christopher M. Danforth. ‘Instagram Photos Reveal Predictive Markers of Depression.’ EPJ Data Science 6.15 (2017). Web. 22 Aug. 2017. DOI: 10.1140/epjds/s13688-017-0110-z