Hassanpour S, Tomita N, DeLise T, Crosier B, Marsch LA
Researchers built and tested a machine learning model to identify substance use risk based on images, captions, and comments from Instagram posts.
Participants (n=2,287) were recruited through online advertising on an incentivized crowdsourcing platform and by word of mouth. Consenting participants completed a web-based survey that collected substance use information using the National Institute on Drug Abuse Modified Alcohol, Smoking, and Substance Involvement Screening Tool (ASSIST). Survey responses were used to classify each participant as “low-risk” or “high-risk” with regard to substance use. Researchers extracted anonymized Instagram data from the 2,287 consenting adult participants and randomly selected 20 posts, with their accompanying captions and comments, from each account for analysis. Data were divided into training (80%), validation (10%), and test (10%) sets. Researchers trained the model for two weeks on the training set and used the validation set to tune model parameters. To evaluate the model, the risk level it assigned on the test set was compared with the participant-reported risk level.
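The 80/10/10 division described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the summary does not say whether the split was made by participant or by post, so the example assumes a participant-level split (which keeps all posts from one account in a single set). The function name `split_participants`, the placeholder integer IDs, and the random seed are all hypothetical.

```python
import random


def split_participants(ids, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle participant IDs and split them into train/validation/test lists.

    Assumption: the split is done per participant, so every post from one
    account ends up in exactly one of the three sets.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Placeholder IDs standing in for the 2,287 anonymized accounts.
participant_ids = range(2287)
train, val, test = split_participants(participant_ids)
print(len(train), len(val), len(test))
```

Because the fractions are truncated to whole participants, the test set absorbs the remainder and the three sets together always cover every participant exactly once.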
- The resulting machine learning model was able to detect alcohol risk significantly better than chance (precision: 68.6%, recall: 76.6%, F-measure: 72.4%).
- The model was unable to reliably detect tobacco, prescription drug, or other illicit drug use.
- Participants who were younger, White, had fewer captions and comments per post, and posted more facial images had an elevated risk for alcohol use compared with their counterparts.
- Innovative deep-learning approaches can provide new insights into low-impact identification of population-level alcohol use risk using social media.
- The relative social acceptability of alcohol consumption may have resulted in a more balanced distribution of risk categories compared with other substances, contributing to the success of the model in detecting alcohol use risk.
- Future research will develop models for substance use and behavioral health risks more broadly, using other platforms (e.g., Facebook and Twitter) and targeted high-risk populations.
- Attention to privacy concerns related to use of social media for the detection of, and intervention with, behavioral health conditions is an important area for future work.
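As a check on the alcohol result reported above, the F-measure can be recomputed from the stated precision and recall. The sketch below assumes the reported "F-measure" is the balanced F1 score (the harmonic mean of precision and recall); the summary does not state the weighting, but F1 reproduces the reported 72.4%. The function name `f_measure` is hypothetical.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for alcohol risk detection.
alcohol_f = f_measure(0.686, 0.766)
print(f"{alcohol_f:.1%}")  # prints 72.4%
```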
Hassanpour S, Tomita N, DeLise T, Crosier B, Marsch LA. Identifying substance use risk based on deep neural networks and Instagram social media data. Neuropsychopharmacology. 2018. doi: 10.1038/s41386-018-0247-x