Constructing Accurate Confidence Intervals When Aggregating Social Media Data for Public Health Monitoring.

Ashlynn R. Daughton, Michael J. Paul

2020

Social media data are widely used to infer health-related information (e.g., the number of individuals with symptoms). A typical approach is to use a machine learning classifier to identify and count the instances of interest. However, this approach fails to account for errors made by the classifier. This paper summarizes data mining concepts that account for classifier error when counting data instances, and then extends these ideas to propose a new algorithm for constructing confidence intervals for social media estimates, which we show to be substantially more accurate than standard approaches on two influenza-related Twitter datasets.
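To illustrate what accounting for classifier error when counting can look like, here is a minimal sketch of the well-known "adjusted count" correction from the quantification literature. It is not the algorithm proposed in the paper, whose details are not given in this abstract. The function name `adjusted_count`, its parameters, and the example numbers are all hypothetical, and the true and false positive rates are assumed to have been estimated on held-out labeled data.

```python
def adjusted_count(n_predicted_positive, n_total, tpr, fpr):
    """Correct a raw classifier count using assumed-known error rates.

    Illustrative sketch only (hypothetical helper, not the paper's method).
    n_predicted_positive: number of items the classifier labeled positive
    n_total: number of items classified
    tpr: classifier true positive rate, estimated on held-out data
    fpr: classifier false positive rate, estimated on held-out data
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if tpr <= fpr:
        raise ValueError("tpr must exceed fpr for the correction to be defined")

    # Observed positive rate = p * tpr + (1 - p) * fpr, where p is the true
    # prevalence. Solving for p gives the adjusted estimate.
    observed_rate = n_predicted_positive / n_total
    corrected_rate = (observed_rate - fpr) / (tpr - fpr)

    # Sampling noise can push the estimate outside [0, 1]; clip it.
    corrected_rate = min(1.0, max(0.0, corrected_rate))
    return corrected_rate * n_total


# Hypothetical numbers: 300 of 1000 posts flagged by the classifier.
estimate = adjusted_count(300, 1000, tpr=0.8, fpr=0.1)
```

With these made-up inputs the corrected estimate is roughly 286 posts rather than the raw 300, because some flagged posts are expected to be false positives and some true cases are expected to be missed. A confidence interval around such an estimate would additionally need to reflect uncertainty in `tpr` and `fpr` themselves, which this sketch does not attempt.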

Keywords:

Confidence interval
Actuarial science
Public health
Social media
Psychology
