JMIR Publications

JMIR Public Health and Surveillance

A multidisciplinary journal that focuses on the intersection of public health and technology, public health informatics, mass media campaigns, surveillance, participatory epidemiology, and innovation in public health practice and research.


Journal Description

JMIR Public Health & Surveillance (JPH, Editor-in-chief: Travis Sanchez, Emory University/Rollins School of Public Health) is a sister journal of the Journal of Medical Internet Research (JMIR), the top cited journal in health informatics (Impact Factor 2015: 4.532). JPH is a multidisciplinary journal with a unique focus on the intersection of innovation and technology in public health, and includes topics like health communication, public health informatics, surveillance, participatory epidemiology, infodemiology and infoveillance, digital disease detection, digital public health interventions, mass media/social media campaigns, and emerging population health analysis systems and tools.

We publish regular articles, reviews, protocols/system descriptions and viewpoint papers on all aspects of public health, with a focus on innovation and technology in public health.

Among other innovations, JPH is also dedicated to supporting rapid open data sharing and rapid open access to surveillance and outbreak data. As one of the novel features, we plan to publish rapid or even real-time surveillance reports and open data. The methods and description of the surveillance system may be peer-reviewed and published only once in detail, in a "baseline report" (in a JMIR Res Protoc or a JMIR Public Health & Surveill paper), and authors then have the possibility to publish data and reports at frequent intervals rapidly and with only minimal additional peer-review (we call this article type "Rapid Surveillance Reports"). JMIR Publications may even work with authors/researchers and developers of selected surveillance systems on APIs for semi-automated reports (e.g. weekly reports to be automatically published in JPHS and indexed in PubMed, based on data-feeds from surveillance systems and minimal narratives and abstracts).

Furthermore, during epidemics and public health emergencies, submissions with critical data will be processed with expedited peer-review to enable publication within days or even in real-time.

We also publish descriptions of open data resources and open source software. Where possible, we can and want to publish or even host the actual software or dataset on the journal website.


Recent Articles:

  • Tweeting in inclement weather. Photo by and copyright owned by authors.

    Investigating Subjective Experience and the Influence of Weather Among Individuals With Fibromyalgia: A Content Analysis of Twitter


    Background: Little is understood about the determinants of symptom expression in individuals with fibromyalgia syndrome (FMS). While individuals with FMS often report environmental influences, including weather events, on their symptom severity, a consistent effect of specific weather conditions on FMS symptoms has yet to be demonstrated. Content analysis of a large number of messages by individuals with FMS on Twitter can provide valuable insights into variation in the fibromyalgia experience from a first-person perspective. Objective: The objective of our study was to use content analysis of tweets to investigate the association between weather conditions and fibromyalgia symptoms among individuals who tweet about fibromyalgia. Our second objective was to gain insight into how Twitter is used as a form of communication and expression by individuals with fibromyalgia and to explore and uncover thematic clusters and communities related to weather. Methods: Computerized sentiment analysis was performed to measure the association between negative sentiment scores (indicative of severe symptoms such as pain) and coincident environmental variables. Date, time, and location data for each individual tweet were used to identify corresponding climate data (such as temperature). We used graph analysis to investigate the frequency and distribution of domain-related terms exchanged in Twitter and their association strengths. A community detection algorithm was applied to partition the graph and detect different communities. Results: We analyzed 140,432 tweets related to fibromyalgia from 2008 to 2014. There was a very weak positive correlation between humidity and negative sentiment scores (r=.009, P=.001). There was no significant correlation between other environmental variables and negative sentiment scores. The graph analysis showed that “pain” and “chronicpain” were the most frequently used terms. The Louvain method identified 6 communities. 
Community 1 was related to feelings and symptoms at the time (subjective experience). It also included a list of weather-related terms such as “weather,” “cold,” and “rain.” Conclusions: According to our results, a uniform causal effect of weather variation on fibromyalgia symptoms at the group level remains unlikely. Any impact of weather on fibromyalgia symptoms may vary geographically or at an individual level. Future work will further explore geographic variation and interactions focusing on individual pain trajectories over time.
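The reported correlation (r=.009, P=.001) shows how a very large sample can make a negligible association statistically significant. The following is a minimal sketch of that arithmetic, not the authors' analysis. It assumes that all 140,432 tweets entered the correlation (the abstract does not say so) and uses a normal approximation for the p-value; the function name is hypothetical.

```python
import math


def correlation_p_value(r: float, n: int) -> float:
    """Two-sided p-value for a Pearson r via a normal approximation (reasonable for large n)."""
    # Test statistic for H0: rho = 0
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    # erfc(|t| / sqrt(2)) equals 2 * (1 - Phi(|t|)) for a standard normal
    return math.erfc(abs(t) / math.sqrt(2.0))


r = 0.009
n = 140_432  # assumption: every analyzed tweet contributed to the correlation
p = correlation_p_value(r, n)
variance_explained = r * r  # share of variance in sentiment associated with humidity

print(f"p = {p:.4f}, variance explained = {variance_explained:.6f}")
```

Under these assumptions the p-value is on the order of the reported P=.001, while humidity accounts for less than 0.01% of the variance in sentiment scores, which is in line with the authors' cautious conclusion.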

  • Disability. Image source: Author: Stevepb. Copyright: CC0 Public Domain.

    Using Administrative Data to Ascertain True Cases of Muscular Dystrophy: Rare Disease Surveillance


    Background: Administrative records from insurance and hospital discharge data sources are important public health tools to conduct passive surveillance of disease in populations. Identifying rare but catastrophic conditions is a challenge since approaches for maximizing valid case detection are not firmly established. Objective: The purpose of our study was to explore a number of algorithms in which International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes and other administrative variables could be used to identify cases of muscular dystrophy (MD). Methods: We used active surveillance to identify possible cases of MD in medical practices in neurology, genetics, and orthopedics in 5 urban South Carolina counties and to identify the cases that had diagnostic support (ie, true cases). We then developed an algorithm to identify cases based on a combination of ICD-9-CM codes and administrative variables from a public (Medicaid) and private insurer claims-based system and a statewide hospital discharge dataset (passive surveillance). Cases of all types of MD and those with Duchenne or Becker MD (DBMD) that were common to both surveillance systems were examined to identify the most specific administrative variables for ascertainment of true cases. Results: Passive statewide surveillance identified 3235 possible cases with MD in the state, and active surveillance identified 2057 possible cases in 5 actively surveilled counties that included 2 large metropolitan areas where many people seek medical care. There were 537 common cases found in both the active and passive systems, and 260 (48.4%) were confirmed by active surveillance to be true cases. Of the 260 confirmed cases, 70 (26.9%) were recorded as DBMD. 
Conclusions: Accuracy of finding a true case in a passive surveillance system was improved substantially when specific diagnosis codes, number of times a code was used, age of the patient, and specialty provider variables were used.
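The percentages in the Results can be reproduced directly from the reported counts. The short sketch below only re-derives those figures; the helper name is hypothetical.

```python
def proportion_pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


common_cases = 537      # cases found in both the active and passive systems
confirmed_cases = 260   # common cases confirmed as true cases by active surveillance
dbmd_cases = 70         # confirmed cases recorded as Duchenne or Becker MD

confirmed_pct = proportion_pct(confirmed_cases, common_cases)
dbmd_pct = proportion_pct(dbmd_cases, confirmed_cases)
print(confirmed_pct, dbmd_pct)
```

The first figure can be read as the share of overlapping cases that active surveillance confirmed as true cases before any refinement by additional administrative variables.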

  • E-Cigarette. Image Source: Author: TBEC Review. Copyright: Creative Commons Attribution 2.0 Generic license.

    The Readability of Electronic Cigarette Health Information and Advice: A Quantitative Analysis of Web-Based Information


    Background: The popularity and use of electronic cigarettes (e-cigarettes) has increased across all demographic groups in recent years. However, little is currently known about the readability of health information and advice aimed at the general public regarding the use of e-cigarettes. Objective: The objective of our study was to examine the readability of publicly available health information as well as advice on e-cigarettes. We compared information and advice available from US government agencies, nongovernment organizations, English speaking government agencies outside the United States, and for-profit entities. Methods: A systematic search for health information and advice on e-cigarettes was conducted using search engines. We manually verified search results and converted to plain text for analysis. We then assessed readability of the collected documents using 4 readability metrics followed by pairwise comparisons of groups with adjustment for multiple comparisons. Results: A total of 54 documents were collected for this study. All 4 readability metrics indicate that all information and advice on e-cigarette use is written at a level higher than that recommended for the general public by National Institutes of Health (NIH) communication guidelines. However, health information and advice written by for-profit entities, many of which were promoting e-cigarettes, were significantly easier to read. Conclusions: A substantial proportion of potential and current e-cigarette users are likely to have difficulty in fully comprehending Web-based health information regarding e-cigarettes, potentially hindering effective health-seeking behaviors. To comply with NIH communication guidelines, government entities and nongovernment organizations would benefit from improving the readability of e-cigarettes information and advice.
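The abstract does not name the 4 readability metrics used. As an illustration of how such a metric works, here is a minimal sketch of one widely used formula, the Flesch-Kincaid Grade Level, which is an assumption on our part and may not be among the metrics the authors chose. The syllable counter is a crude vowel-group heuristic, and both sample sentences are invented for the example.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: number of vowel groups, with a minimum of one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


# Hypothetical sample texts, not taken from the 54 documents in the study
simple_grade = flesch_kincaid_grade("The cat sat. The dog ran.")
complex_grade = flesch_kincaid_grade(
    "Electronic cigarettes potentially deliver nicotine without combustion products."
)
print(simple_grade, complex_grade)
```

Longer sentences and more syllables per word both raise the estimated grade level, which is why dense health advice tends to score above the level recommended for the general public.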

  • Man using a laptop. Image source: Author: Startup Stock Photos. Copyright: CC0 License.

    A Test of Concept Study of At-Home, Self-Administered HIV Testing With Web-Based Peer Counseling Via Video Chat for Men Who Have Sex With Men


    Background: Men who have sex with men (MSM), particularly MSM who identify as African-American or Black (BMSM), are the sociodemographic group that is most heavily burdened by the human immunodeficiency virus (HIV) epidemic in the United States. To meet national HIV testing goals, there must be a greater emphasis on novel ways to promote and deliver HIV testing to MSM. Obstacles to standard, clinic-based HIV testing include concerns about stigmatization or recognition at in-person testing sites, as well as the inability to access a testing site due to logistical barriers. Objective: This study examined the feasibility of self-administered, at-home HIV testing with Web-based peer counseling to MSM by using an interactive video chatting method. The aims of this study were to (1) determine whether individuals would participate in at-home HIV testing with video chat–based test counseling with a peer counselor, (2) address logistical barriers to HIV testing that individuals who report risk for HIV transmission may experience, and (3) reduce anticipated HIV stigma, a primary psychosocial barrier to HIV testing. Methods: In response to the gap in HIV testing, a pilot study was developed and implemented via mailed, at-home HIV test kits, accompanied by HIV counseling with a peer counselor via video chat. A total of 20 MSM were enrolled in this test of concept study, 80% of whom identified as BMSM. Results: All participants reported that at-home HIV testing with peer counseling via video chat was a satisfying experience. The majority of participants (13/18, 72%) said they would prefer for their next HIV testing and counseling experience to be at home with Web-based video chat peer counseling, as opposed to testing in an office or clinic setting. Participants were less likely to report logistical and emotional barriers to HIV testing at the 6-week and 3-month follow-ups. 
Conclusions: The results of this study suggest that self-administered HIV testing with Web-based peer counseling is feasible and that MSM find it to be a satisfactory means by which they can access their test results. This study can serve as a general guideline for future, larger-scale studies of Web-based HIV test counseling for MSM.

  • Social Media Messages. Image created and copyright owned by authors.

    E-Cigarette Social Media Messages: A Text Mining Analysis of Marketing and Consumer Conversations on Twitter


    Background: As the use of electronic cigarettes (e-cigarettes) rises, social media likely influences public awareness and perception of this emerging tobacco product. Objective: This study examined the public conversation on Twitter to determine overarching themes and insights for trending topics from commercial and consumer users. Methods: Text mining uncovered key patterns and important topics for e-cigarettes on Twitter. SAS Text Miner 12.1 software (SAS Institute Inc) was used for descriptive text mining to reveal the primary topics from tweets collected from March 24, 2015, to July 3, 2015, using a Python script in conjunction with Twitter’s streaming application programming interface. A total of 18 keywords related to e-cigarettes were used and resulted in a total of 872,544 tweets that were sorted into overarching themes through a text topic node for tweets (126,127) and retweets (114,451) that represented more than 1% of the conversation. Results: While some of the final themes were marketing-focused, many topics represented diverse proponent and user conversations that included discussion of policies, personal experiences, and the differentiation of e-cigarettes from traditional tobacco, often by pointing to the lack of evidence for the harm or risks of e-cigarettes or taking the position that e-cigarettes should be promoted as smoking cessation devices. Conclusions: These findings reveal that unique, large-scale public conversations are occurring on Twitter alongside e-cigarette advertising and promotion. Proponents and users are turning to social media to share knowledge, experience, and questions about e-cigarette use. Future research should focus on these unique conversations to understand how they influence attitudes towards and use of e-cigarettes.

  • Performance Feedback Cycle. Image sourced and copyright owned by authors.

    Effect of Performance Feedback on Community Health Workers’ Motivation and Performance in Madhya Pradesh, India: A Randomized Controlled Trial


    Background: Small-scale community health worker (CHW) programs provide basic health services and strengthen health systems in resource-poor settings. This paper focuses on improving CHW performance by providing individual feedback to CHWs working with an mHealth program to address malnutrition in children younger than 5 years. Objective: The paper aims to evaluate the immediate and retention effects of providing performance feedback and supportive supervision on CHW motivation and performance for CHWs working with an mHealth platform to reduce malnutrition in five districts of Madhya Pradesh, India. We expected a positive impact on CHW performance for the indicator they received feedback on. Performance on indicators the CHW did not receive feedback on was not expected to change. Methods: In a randomized controlled trial, 60 CHWs were randomized into three treatment groups based on overall baseline performance ranks to achieve balanced treatment groups. Data for each treatment indicator were analyzed with the other two treatments acting as the control. In total, 10 CHWs were lost to follow-up. There were three performance indicators: case activity, form submissions, and duration of counseling. Each group received weekly calls to provide performance targets and discuss their performance on the specific indicator they were allocated to as well as any challenges or technical issues faced during the week for a 6-week period. Data were collected for a further 4 weeks to assess intertemporal sustained effects of the intervention. Results: We found positive and significant impacts on duration of counseling, whereas case activity and number of form submissions did not show significant improvements as a result of the intervention. We found a moderate to large effect (Glass’s delta=0.97, P=.004) of providing performance feedback on counseling times in the initial 6 weeks. These effects were sustained in the postintervention period (Glass’s delta=1.69, P<.001). 
The counseling times decreased slightly from the intervention to postintervention period by 2.14 minutes (P=.01). Case activity improved for all CHWs after the intervention. We also performed the analysis by replacing the CHWs lost to follow-up with those in their treatment groups with the closest ranks in baseline performance and found similar results. Conclusions: Calls providing performance feedback are effective in improving CHW motivation and performance. Providing feedback had a positive effect on performance in the case of duration of counseling. The results suggest that difficulty in achieving the performance target can affect results of performance feedback. Regardless of the performance information disclosed, calls can improve performance due to elements of supportive supervision included in the calls encouraging CHW motivation.
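Glass's delta, the effect size reported above, divides the difference in group means by the standard deviation of the control group only. The sketch below shows the calculation on invented counseling durations; the data and the function name are hypothetical and are not taken from the trial.

```python
import statistics


def glass_delta(treatment: list[float], control: list[float]) -> float:
    """Glass's delta: (mean of treatment - mean of control) / sample SD of control."""
    mean_difference = statistics.mean(treatment) - statistics.mean(control)
    return mean_difference / statistics.stdev(control)


# Hypothetical counseling durations in minutes (illustrative only)
feedback_group = [12.0, 14.0, 16.0]
comparison_group = [8.0, 10.0, 12.0]

delta = glass_delta(feedback_group, comparison_group)
print(delta)
```

Using only the control group's standard deviation is a common choice when the intervention itself may change the spread of the outcome.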

  • WebCAAFE. Image sourced and copyright owned by authors.

    Qualitative Analysis of Cognitive Interviews With School Children: A Web-Based Food Intake Questionnaire


    Background: The use of computers to administer dietary assessment questionnaires has shown potential, particularly due to the variety of interactive features that can attract and sustain children’s attention. Cognitive interviews can help researchers to gain insights into how children understand and elaborate their response processes in this type of questionnaire. Objective: To present the cognitive interview results of children who answered the WebCAAFE, a Web-based questionnaire, to obtain an in-depth understanding of children’s response processes. Methods: Cognitive interviews were conducted with children (using a pretested interview script). Analyses were carried out using thematic analysis within a grounded theory framework of inductive coding. Results: A total of 40 children participated in the study, and 4 themes were identified: (1) the meaning of words, (2) understanding instructions, (3) ways to resolve possible problems, and (4) suggestions for improving the questionnaire. Most children understood questions that assessed nutritional intake over the past 24 hours, although the structure of the questionnaire designed to facilitate recall of dietary intake was not always fully understood. Younger children (7 and 8 years old) had more difficulty relating the food images to mixed dishes and foods eaten with bread (eg, jam, cheese). Children were able to provide suggestions for improving future versions of the questionnaire. Conclusions: More attention should be paid to children aged 8 years or below, as they had the greatest difficulty completing the WebCAAFE.

  • Anti-nuclear power plant rally on 19 September 2011 at the Meiji Shrine complex in Tokyo. By 保守 - Own work, Public Domain

    Estimating the Duration of Public Concern After the Fukushima Dai-ichi Nuclear Power Station Accident From the Occurrence of Radiation Exposure-Related Terms...


    Background: After the Fukushima Dai-ichi Nuclear Power Station accident in Japan on March 11, 2011, a large number of comments, both positive and negative, were posted on social media. Objective: The objective of this study was to clarify the characteristics of the trend in the number of tweets posted on Twitter, and to estimate how long public concern regarding the accident continued. We surveyed the attenuation period of the first term occurrence related to radiation exposure as a surrogate endpoint for the duration of concern. Methods: We retrieved 18,891,284 tweets from Twitter data between March 11, 2011 and March 10, 2012, containing 143 variables in Japanese. We selected radiation, radioactive, Sievert (Sv), Becquerel (Bq), and gray (Gy) as keywords to estimate the attenuation period of public concern regarding radiation exposure. These data, formatted as comma-separated values, were transferred into a Statistical Analysis System (SAS) dataset for analysis, and survival analysis methodology was followed using the SAS LIFETEST procedure. This study was approved by the institutional review board of Hokkaido University and informed consent was waived. Results: A Kaplan-Meier curve was used to show the rate of Twitter users posting a message after the accident that included one or more of the keywords. The term Sv occurred in tweets up to one year after the first tweet. Among the Twitter users studied, 75.32% (880,108/1,168,542) tweeted the word radioactive and 9.20% (107,522/1,168,542) tweeted the term Sv. The first reduction was observed within the first 7 days after March 11, 2011. The means and standard errors (SEs) of the duration from the first tweet on March 11, 2011 were 31.9 days (SE 0.096) for radioactive and 300.6 days (SE 0.181) for Sv. These keywords were still being used at the end of the study period. The mean attenuation period for radioactive was one month, and approximately one year for radiation and radiation units. 
The difference in mean duration between the keywords was attributed to the effect of mass media. Regularly posted messages, such as daily radiation dose reports, were relatively easy to detect from their time and formatted contents. The survival estimation indicated that public concern about the nuclear power plant accident remained after one year. Conclusions: Although the simple plot of the number of tweets did not show clear results, we estimated the mean attenuation period as approximately one month for the keyword radioactive, and found that the keywords were still being used in posts at the end of the study period. Further research is required to quantify the effect of other phrases in social media data. The results of this exploratory study should advance progress in influencing and quantifying the communication of risk.
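The study used the SAS LIFETEST procedure for its survival analysis. As a language-neutral illustration of the Kaplan-Meier estimator behind such curves, here is a minimal pure-Python sketch. The toy durations are invented, and the function is a simplified product-limit estimator, not a reproduction of the authors' SAS workflow.

```python
from collections import Counter


def kaplan_meier(durations: list[int], observed: list[bool]) -> list[tuple[int, float]]:
    """Return (time, survival probability) at each time with at least one observed event."""
    events = Counter(t for t, seen in zip(durations, observed) if seen)
    exits = Counter(durations)      # events plus censored subjects leaving at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving past time t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical data: days until a user last used a keyword; False marks a censored user
curve = kaplan_meier([1, 2, 2, 3], [True, True, False, True])
print(curve)
```

Censored users (those still posting when observation ends) reduce the number at risk without counting as events, which is how the method handles keywords that were still in use at the end of the study period.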

  • Narrative graph of children and exemptions context. Image sourced and copyright owned by authors.

    “Mommy Blogs” and the Vaccination Exemption Narrative: Results From A Machine-Learning Approach for Story Aggregation on Parenting Social Media Sites


    Background: Social media offer an unprecedented opportunity to explore how people talk about health care at a very large scale. Numerous studies have shown the importance of websites with user forums for people seeking information related to health. Parents turn to some of these sites, colloquially referred to as “mommy blogs,” to share concerns about children’s health care, including vaccination. Although substantial work has considered the role of social media, particularly Twitter, in discussions of vaccination and other health care–related issues, there has been little work on describing the underlying structure of these discussions and the role of persuasive storytelling, particularly on sites with no limits on post length. Understanding the role of persuasive storytelling at Internet scale provides useful insight into how people discuss vaccinations, including exemption-seeking behavior, which has been tied to a recent diminution of herd immunity in some communities. Objective: To develop an automated and scalable machine-learning method for story aggregation on social media sites dedicated to discussions of parenting. We wanted to discover the aggregate narrative frameworks to which individuals, through their exchange of experiences and commentary, contribute over time in a particular topic domain. We also wanted to characterize temporal trends in these narrative frameworks on the sites over the study period. Methods: To ensure that our data capture long-term discussions and not short-term reactions to recent events, we developed a dataset of 1.99 million posts contributed by 40,056 users and viewed 20.12 million times indexed from 2 parenting sites over a period of 105 months. Using probabilistic methods, we determined the topics of discussion on these parenting sites. We developed a generative statistical-mechanical narrative model to automatically extract the underlying stories and story fragments from millions of posts. 
We aggregated the stories into an overarching narrative framework graph. In our model, stories were represented as network graphs with actants as nodes and their various relationships as edges. We estimated the latent stories circulating on these sites by modeling the posts as a sampling of the hidden narrative framework graph. Temporal trends were examined based on monthly user-post statistics. Results: We discovered that discussions of exemption from vaccination requirements are highly represented. We found a strong narrative framework related to exemption seeking and a culture of distrust of government and medical institutions. Various posts reinforced part of the narrative framework graph in which parents, medical professionals, and religious institutions emerged as key nodes, and exemption seeking emerged as an important edge. In the aggregate story, parents used religion or belief to acquire exemptions to protect their children from vaccines that are required by schools or government institutions, but (allegedly) cause adverse reactions such as autism, pain, compromised immunity, and even death. Although parents joined and left the discussion forums over time, discussions and stories about exemptions were persistent and robust to these membership changes. Conclusions: Analyzing parent forums about health care using an automated analytic approach, such as the one presented here, allows the detection of widespread narrative frameworks that structure and inform discussions. In most vaccination stories from the sites we analyzed, it is taken for granted that vaccines and not vaccine preventable diseases (VPDs) pose a threat to children. Because vaccines are seen as a threat, parents focus on sharing successful strategies for avoiding them, with exemption being the foremost among these strategies. 
When new parents join such sites, they may be exposed to this endemic narrative framework in the threads they read and to which they contribute, which may influence their health care decision making.
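The representation described in the Methods, with actants as nodes and relationships as edges, can be pictured as a weighted edge list built up from many posts. The sketch below only illustrates that data structure; it is not the authors' generative model, and the triples, relationship labels, and function name are hypothetical.

```python
from collections import Counter

Triple = tuple[str, str, str]  # (actant, relationship, actant)


def aggregate_stories(triples: list[Triple]) -> Counter:
    """Merge story fragments into one graph; each edge weight counts how often it recurs."""
    graph: Counter = Counter()
    for actant_a, relationship, actant_b in triples:
        graph[(actant_a, relationship, actant_b)] += 1
    return graph


# Hypothetical fragments, assumed to be already extracted from individual posts
fragments = [
    ("parents", "seek exemption from", "schools"),
    ("parents", "distrust", "medical professionals"),
    ("parents", "seek exemption from", "schools"),
]

narrative_graph = aggregate_stories(fragments)
strongest_edge, weight = narrative_graph.most_common(1)[0]
print(strongest_edge, weight)
```

Edges that many independent posts reinforce accumulate weight, which is one simple way to see why a persistent theme can stand out even as individual users join and leave.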

  • Smartphone LGBT theme. Image Source: Copyright: Lucato. Image purchased by authors under Standard license from

    Perceptions Toward a Smoking Cessation App Targeting LGBTQ+ Youth and Young Adults: A Qualitative Framework Analysis of Focus Groups


    Background: The prevalence of smoking among lesbian, gay, bisexual, trans, queer, and other sexual minority (LGBTQ+) youth and young adults (YYA) is significantly higher compared with that among non-LGBTQ+ persons. However, in the past, interventions were primarily group cessation classes that targeted LGBTQ+ persons of all ages. mHealth interventions offer an alternate and modern intervention platform for this subpopulation and may be of particular interest for young LGBTQ+ persons. Objective: This study explored LGBTQ+ YYA (the potential users’) perceptions of a culturally tailored mobile app for smoking cessation. Specifically, we sought to understand what LGBTQ+ YYA like and dislike about this potential cessation tool, along with how such interventions could be improved. Methods: We conducted 24 focus groups with 204 LGBTQ+ YYA (aged 16-29 years) in Toronto and Ottawa, Canada. Participants reflected on how an app might support LGBTQ+ persons with smoking cessation. Participants indicated their feelings, likes and dislikes, concerns, and additional ideas for culturally tailored smoking cessation apps. Framework analysis was used to code transcripts and identify the overarching themes. Results: Study findings suggested that LGBTQ+ YYA were eager about using culturally tailored mobile apps for smoking cessation. Accessibility, monitoring and tracking, connecting with community members, tailoring, connecting with social networks, and personalization were key reasons that were valued for a mobile app cessation program. However, concerns were raised about individual privacy and that not all individuals had access to a mobile phone, users might lose interest quickly, an app would need to be marketed effectively, and app users might cheat and lie about progress to themselves. Participants highlighted that the addition of distractions, rewards, notifications, and Web-based and print versions of the app would be extremely useful to mitigate some of their concerns. 
Conclusions: This study provided insight into the perspectives of LGBTQ+ YYA on a smoking cessation intervention delivered through a mobile app. The findings suggested a number of components of a mobile app that were valued and those that were concerning, as well as suggestions on how to make a mobile app cessation program successful. App development for this subpopulation should take into consideration the opinions of the intended users and involve them in the development and evaluation of mobile-based smoking cessation programs.

  • StoryPRIME. Image sourced and copyright owned by authors.

    Efficacy of Web-Based Collection of Strength-Based Testimonials for Text Message Extension of Youth Suicide Prevention Program: Randomized Controlled Experiment


    Background: Equipping members of a target population to deliver effective public health messaging to peers is an established approach in health promotion. The Sources of Strength program has demonstrated the promise of this approach for “upstream” youth suicide prevention. Text messaging is a well-established medium for promoting behavior change and is the dominant communication medium for youth. In order for peer ‘opinion leader’ programs like Sources of Strength to use scalable, wide-reaching media such as text messaging to spread peer-to-peer messages, they need techniques for assisting peer opinion leaders in creating effective testimonials to engage peers and match program goals. We developed a Web interface, called Stories of Personal Resilience in Managing Emotions (StoryPRIME), which helps peer opinion leaders write effective, short-form messages that can be delivered to the target population in youth suicide prevention programs like Sources of Strength. Objective: To determine the efficacy of StoryPRIME, a Web-based interface for remotely eliciting high school peer leaders, and helping them produce high-quality, personal testimonials for use in a text messaging extension of an evidence-based, peer-led suicide prevention program. Methods: In a double-blind randomized controlled experiment, 36 high school students wrote testimonials with or without eliciting from the StoryPRIME interface. The interface was created in the context of Sources of Strength–an evidence-based youth suicide prevention program–and 24 ninth graders rated these testimonials on relatability, usefulness/relevance, intrigue, and likability. Results: Testimonials written with the StoryPRIME interface were rated as more relatable, useful/relevant, intriguing, and likable than testimonials written without StoryPRIME, P=.054. 
Conclusions: StoryPRIME is a promising way to elicit high-quality, personal testimonials from youth for prevention programs that draw on members of a target population to spread public health messages.

  • Drug Trends. Image created and copyright owned by authors.

    “When ‘Bad’ is ‘Good’”: Identifying Personal Communication and Sentiment in Drug-Related Tweets


    Background: To harness the full potential of social media for epidemiological surveillance of drug abuse trends, the field needs a greater level of automation in processing and analyzing social media content. Objectives: This study describes the development of supervised machine-learning techniques for the eDrugTrends platform that automatically classify cannabis- and synthetic cannabinoid–related tweets by type/source of communication (personal, official/media, retail) and by sentiment (positive, negative, neutral). Methods: Tweets were collected using the Twitter streaming application programming interface (API) and filtered through the eDrugTrends platform using keywords related to cannabis, marijuana edibles, marijuana concentrates, and synthetic cannabinoids. After creating coding rules and assessing intercoder reliability, a manually labeled data set (N=4000) was developed by coding several batches of randomly selected subsets of tweets extracted from the pool of 15,623,869 collected by eDrugTrends (May-November 2015). Out of 4000 tweets, 25% (1000/4000) were used to build source classifiers and 75% (3000/4000) were used for sentiment classifiers. Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machines (SVM) were used to train the classifiers. Source classification (n=1000) compared Approach 1, which used short URLs, with Approach 2, in which URLs were expanded and included in the bag-of-words analysis. For sentiment classification, Approach 1 used all tweets, regardless of their source/type (n=3000), while Approach 2 applied sentiment classification to personal communication tweets only (2633/3000, 88%). Multiclass and binary classification tasks were examined, and machine-learning sentiment classifier performance was compared with Valence Aware Dictionary for sEntiment Reasoning (VADER), a lexicon and rule-based method. The performance of each classifier was assessed using 5-fold cross validation that calculated average F-scores. A one-tailed t test was used to determine whether differences in F-scores were statistically significant. Results: In multiclass source classification, the use of expanded URLs did not contribute to significant improvement in classifier performance (0.7972 vs 0.8102 for SVM, P=.19). In binary classification, the identification of all source categories improved significantly when unshortened URLs were used, with personal communication tweets benefiting the most (0.8736 vs 0.8200, P<.001). In multiclass sentiment classification Approach 1, SVM (0.6723) performed similarly to NB (0.6683) and LR (0.6703). In Approach 2, SVM (0.7062) did not differ from NB (0.6980, P=.13) or LR (0.6931, P=.05), but its F-score was over 40% higher than that of VADER (0.5030, P<.001). In the multiclass task, improvements in sentiment classification (Approach 2 vs Approach 1) did not reach statistical significance (eg, SVM: 0.7062 vs 0.6723, P=.052). In binary sentiment classification (positive vs negative), Approach 2 (focus on personal communication tweets only) improved classification results, compared with Approach 1, for LR (0.8752 vs 0.8516, P=.04) and SVM (0.8800 vs 0.8557, P=.045). Conclusions: The study provides an example of the use of supervised machine learning methods to categorize cannabis- and synthetic cannabinoid–related tweets with fairly high accuracy. Use of these content analysis tools along with geographic identification capabilities developed by the eDrugTrends platform will provide powerful methods for tracking changes in user opinions related to cannabis and synthetic cannabinoid use over time and across regions.
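The evaluation procedure described in the abstract (5-fold cross validation with averaged F-scores) can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the abstract does not say how the per-class F-scores were averaged, so macro-averaging is an assumption here, as are the contiguous, unshuffled folds, the function names, and the `train_majority` baseline, which merely stands in for LR, NB, or SVM.

```python
from collections import Counter

# Sentiment classes named in the abstract.
LABELS = ("positive", "negative", "neutral")


def f1_per_class(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    return sum(f1_per_class(y_true, y_pred, l) for l in labels) / len(labels)


def k_fold_indices(n, k=5):
    """Yield (train_idx, test_idx) for k contiguous folds over range(n)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def cross_validated_f1(texts, labels, train_fn, k=5):
    """Train on k-1 folds, score the held-out fold, average the k scores."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(texts), k):
        model = train_fn([texts[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        preds = [model(texts[i]) for i in test_idx]
        scores.append(macro_f1([labels[i] for i in test_idx], preds))
    return sum(scores) / len(scores)


def train_majority(train_texts, train_labels):
    """Hypothetical stand-in classifier: always predicts the majority class."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda text: majority


def relative_improvement(new, baseline):
    """Relative gain of one F-score over another."""
    return (new - baseline) / baseline


# F-scores reported in the abstract: SVM (Approach 2) vs VADER.
print(round(relative_improvement(0.7062, 0.5030), 3))  # 0.404
```

The final line reproduces the arithmetic behind the "over 40%" comparison with VADER: (0.7062 - 0.5030) / 0.5030 is roughly 0.404.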

Latest Submissions Open for Peer-Review:

  • Combining Participatory Influenza Surveillance with Modeling and Forecasting

    Date Submitted: Jan 17, 2017

    Open Peer Review Period: Jan 20, 2017 - Feb 3, 2017

    Influenza outbreaks around the world affect millions of people every year. Monitoring and forecasting the evolution of these outbreaks is important to help decision makers design effective interventions and allocate resources to mitigate their impact. Here, we show how modeling and simulation can be, and has been, combined with participatory disease surveillance to measure and address the non-response bias present in a participatory surveillance sample (using WISDM as an example) and to both nowcast and forecast influenza activity in different parts of the world (using InfluenzaNet and Flu Near You [FNY] as examples).

  • Saúde na Copa: The World’s First Application of Participatory Surveillance for a Mass Gathering: FIFA World Cup 2014, Brazil.

    Date Submitted: Jan 12, 2017

    Open Peer Review Period: Jan 17, 2017 - Jan 31, 2017

    Background: The 2005 International Health Regulations (IHR) established parameters for the assessment and notification of events that may constitute public health emergencies of international concern. These requirements and parameters opened up space for the use of non-official mechanisms (such as websites, blogs, and social networks) and technological improvements in communication that can streamline the detection, monitoring, and response to health problems and thus reduce the damage these problems cause. Specifically, the revised IHR created space for participatory surveillance to function in addition to the traditional surveillance mechanisms of detection, monitoring, and response. Participatory surveillance is based on crowdsourcing methods that collect information from society and then return the collective knowledge gained from that information back to society; the spread of digital social networks and wiki-style knowledge platforms has created a very favorable environment for this model of production and social control of information. Objective: The aim of this study was to describe the use of a participatory surveillance application, Healthy Cup, for the early detection of acute disease outbreaks during the FIFA World Cup 2014. Our focus was on three specific syndromes (respiratory, diarrheal, and rash) related to six diseases considered important in a mass gathering (MG) context (influenza, measles, rubella, cholera, acute diarrhea, and dengue). Methods: From May 12 to July 13, 2014, users from anywhere in the world were able to download the Healthy Cup application and record their health condition, reporting whether they were "good," "very good," "ill," or "very ill." For users reporting being "ill" or "very ill," a screen with a list of 10 symptoms was displayed. Participatory surveillance allows for the real-time identification of aggregates of symptoms that indicate possible cases of infectious diseases. Results: From May 12 through July 13, 2014, there were 9,434 downloads of the Healthy Cup application and 7,155 (75.8%) registered users. Among the registered users, 4,706 (65.8%) were active users who posted a total of 47,879 times during the study period. The largest number of sign-ups in a single day occurred on May 30, 2014, the day the Minister of Health officially launched the application at a press conference and announced the special government program "Health in the World Cup" on national television. On that date there were 4,183 entries, almost half (48.8%) of the total reports across the entire study duration. Conclusions: Participatory surveillance through community engagement is an innovative way to conduct epidemiological surveillance. Compared with traditional epidemiological surveillance, its advantages include lower cost of data acquisition, timeliness of the information collected and shared, platform scalability, and capacity for integration between the population being served and public health services.
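The uptake percentages in the Results can be re-derived from the reported counts with a few lines of Python. This is only an arithmetic sketch; the helper name `share` and the variable names are assumptions introduced for illustration, and the counts are copied from the abstract.

```python
def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 1)


# Counts reported in the abstract (May 12 to July 13, 2014).
downloads = 9434
registered_users = 7155
active_users = 4706

# Registered users as a share of downloads.
print(share(registered_users, downloads))     # 75.8
# Active users as a share of registered users.
print(share(active_users, registered_users))  # 65.8
```

Each percentage uses the preceding stage as its denominator: registered users are expressed relative to downloads, and active users relative to registered users.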