Analysis of Inter-Domain and Cross-Domain Drug Review Polarity Classification

AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:201-210. eCollection 2020.

Abstract

Individuals increasingly rely on social media to discuss health-related issues. One way to provide easier access to relevant information is through sentiment analysis: classifying text into polarity classes such as positive and negative. In this paper, we generated freely available datasets of WebMD.com drug reviews and star ratings for Common, Cancer, Depression, Diabetes, and Hypertension drugs. We explored four supervised learning models, Naive Bayes, Random Forests, Support Vector Machines (SVM), and Convolutional Neural Networks, for determining the polarity of drug reviews. We conducted inter-domain and cross-domain evaluations. We found that SVM obtained the highest F-measure on average and that cross-domain training produced results similar to or higher than those of models trained directly on their respective datasets.