Download - RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Transcript
Page 1: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Photo credit: IRMA (Integrated Risk Management for Africa)

Page 2: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Taha Kass-Hout and Nicolas di Tada, Summer 2008, Washington, DC, USA.

Page 3: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

What is public health disease surveillance “Public health surveillance is the ongoing systematic

collection, analysis, and interpretation of health data essential to the planning, implementation, and evaluation of public health practice, closely integrated with the timely dissemination of these data to those who need to know. The final link in the surveillance chain is the application of these data to prevention and control. A surveillance system includes a functional capacity for data collection, analysis, and dissemination linked to public health programs.”

What is syndromic surveillance? US CDC defines syndromic surveillance as “surveillance

using health-related data that precede diagnosis and signal a sufficient probability of a case or an outbreak to warrant further public health response.”

3

Thacker, S.B., and Berkelman, R.L. "Public Health Surveillance in the United States." Epidemiology Reviews 10 (1988): 164-90.

Page 4: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Current systems design, analysis and evaluation of disease surveillance systems has been geared towards specific data sources and detection algorithms – not humans Much less has been towards interaction with responders and

domain experts across agencies and at multiple levels Often provide contradictory interpretations of ongoing events

We have disease surveillance systems in place for those threats we have been faced with before We are more vulnerable to those we know about, but have not

faced on a major scale Even more vulnerable to those that we don’t know about

4

Page 5: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

The likelihood of disasters and disease outbreaks is growing According to a recent Oxfam report, there has

been a four-fold increase in the annual number of natural disasters

30 new infectious diseases identified since 1973

Potential impact is getting greater Impact on health, economies & security Capable of spreading faster than ever before

http://www.oxfam.org/en/policy/briefingpapers/bp108_climate_change_alarm_0711

5

Page 6: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

To address these challenges by adopting a social and collaborative decision making approach in order to facilitate early characterization and identification

of potential health threats their verification, assessment and

investigation in order to recommend measures (public

health and others) to control them

6

Page 7: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Event-based - ad-hoc unstructured reports issued by formal or informal sources

Indicator-based - (number of cases, rates, proportion of strains…)

Timeliness, Representativeness, Completeness, Predictive Value, Quality, Cost, Feasibility, … 77

Page 8: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Identified risksMandatory notification

Laboratory surveillance

Emerging risksSyndromic surveillance

Mortality monitoringHealthcare activity monitoring

Prescription monitoring

Non healthcare basedVeterinary surveillanceBehavioral surveillance

Environmental surveillancePoison centers

Food safety/water supply…

DomesticMediaNGOsField Epi points

InternationalDistribution lists

ProMed (English, Chinese, Spanish, Russian, etc.)

International agencies WHOOIECDCNASA (e.g., remote sensing, weather, population migration, bird migration, population density, plant, animal)

Confidential/Limited mailing list dissemination

ProMed (e.g., MBDS)International health regulation agencies (WHO, OIE, CDC, NASA)Threat bulletin (EWARN, ECDC)

Public disseminationNews, blogs, articles, Health ministry press releases sitesWeekly releases (Eurosurveillance)

8Adopted from WHO

Page 9: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Reduce Morbidity and Mortality and Improve Health

9Adopted from WHO

Page 10: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Exposed

Clinical specimen

Symptoms

Pos. specimen

Infected

Seek medical attention

Report

1000 Shigella infections (100%)

50 Shigella notifications (5%)

Main attributes Representativeness Completeness Predictive value positive

Get as close to the bottom of the pyramid

as possible

Urge frequent reporting

Specificity / Reliability

Sensitivity / Timeliness 10

Page 11: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Exposed

Clinical specim

en

Symptom

s

Pos. specim

en

Infected

Seek medical attention

Report

Analyze and interpret Signal as early

as possible

Automated analysis/thresholds

Time

Main attributes Timeliness

11

Page 12: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Clickstream/Keyword Searching

Blogs/Chatrooms

News Sources Local National International

Curated mailing lists (ProMED)

Multi-national surveillance (Eurosurveillance)

Validated official global alerts (WHO)

12

Sensitivity / Timeliness

Specificity / Reliability

Main attributes Data quality

Page 13: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

0

10

20

30

40

50

60

70

80

90

1 4 7 10 13 16 19 22 25 28 31 34 37 40DAY

CASES

LabConfirmationDetection/

Reporting

First Case

Opportunity for control

13Adopted from WHO

Response

Page 14: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

0

10

20

30

40

50

60

70

80

90

1 4 7 10 13 16 19 22 25 28 31 34 37 40

First Case

Detection/ReportingConfirmation

Investigation

Opportunity for control

Response

DAY

CASES

14Adopted from WHO

Page 15: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

15

Nov 2002 Mar 2003

Progression of outbreak

Electronic Surveillance

Cases of atypical pneumonia FoshanNov 16th

Infected Chinese DoctorHong Kong hotelFeb 21st

305 Cases of acute respGuangdong ProvinceFeb 11th

Pharma report Guangdong ProvinceNovember 27

Media reportsGuangdong ProvinceFeb 10

Astute physician on ProMEDFeb 10

Initial WHO ReportFeb 25

Official WHO ReportMarch 10

Adopted from Brownstein, et al.

Page 16: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Figure 1. Reported cases of campylobacteriosis (n=45) in Svolvær, Norway, by

date of onset J anuary and February 1997.

10 = 1 primary case

= 1 secondary household case

5

22 23 24 25 26 27 28 29 30 31 1 2 3 4 5 6 7 8 9 10

J anuary February

8

67

12

34

5

910

8

67

12

34

5

910

16

Page 17: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

News articles

Alerts

Disease reports

17

Page 18: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

18

9/20, 15213, cough/cold, …9/21, 15207, antifever, …9/22, 15213, CC = cough, ...1,000,000 more records…

Huge mass of data Detection algorithm Too many alerts

Duplicative and uni-directional

channels

Uncoordinated response

Page 19: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Hybrid: Machine- and Human-based

Social, collaborative and cross-disciplinary

Web 2.0/3.0 platform

19

Page 20: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Better detection model Better response model

Source: http://www.pbs.org/wgbh/pages/frontline/shows/georgia/outbreak/matrix.html Source: www.sociology.columbia.edu/pdf-files/bearmanarticle.pdf

20

Page 21: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

News item 345

Field alerts

Disease report

Health News

Field alerts

News sources

Alerts

Data + Metadata

Collaboration and multi-directional communication between interested groups

Interactions beyond that allowed by original sources and with controlled visibility

Customizable, secure ‘social’ and ‘professional’ metadata around information

21

Page 22: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

22

9/20, 15213, cough/cold, …9/21, 15207, antifever, …9/22, 15213, CC = cough, ...1,000,000 more records…

Huge mass of data

Feedback loop

Fewer and more actionable alerts

Effective and coordinated response

Multi-directional communication

Page 23: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

23

Feature extraction (including geo-location)

TagsTags

CommentsComments

LocationLocation

Flags/Alerts/BookmarksFlags/Alerts/Bookmarks

Environment FactorsEnvironment Factors

Animal Health FactorsAnimal Health Factors

Remote SensingRemote Sensing

Riff

Bot

Event Classification and

Detection

Event Classification and

Detection

Previous Event Training DataPrevious Event Training Data

Previous Event Control DataPrevious Event Control Data

Metadataextraction

Other reference information

Machine learning

Show event characterizations

Social networkSocial network

Other inferred information

……

Professional network feedback

Professional feedback

Anomaly detection

Multiple data streams (multi-lingual)

User-Generated and Machine Learning Metadata

Existing Social Network (e.g., Comm. of interest)

Page 24: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

24

Page 25: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Kass-Hout and di Tada: Best Poster Award for Improving Public Health Investigation and Response at the Seventh Annual ISDS Conference, December 3-5, 2008 at the Raliegh Conference Civic Center. http://kasshout.blogspot.com/2008/12/best-poster-award-for-improving-public.html and http://www.isdsjournal.org/article/viewArticle/3308 25

Page 26: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Search: _____

{tag Cloud}Terms tagged by human collaborators or source

{Event Tag cloud}

X DiarrehaX CholeraX InfluenzaX Respiratory lllnessX Fever

[Show me unusual distributions]

26

Page 27: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

27

Page 28: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

28

Page 29: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

FiltersFilters

Item (e.g., disease report, news article, alert) summary and location (s)

Item (e.g., disease report, news article, alert) summary and location (s) Tag cloudTag cloud

29

Subscriptions Subscriptions SMS alertsSMS alerts

Ratings, comments, alerts, flagsRatings, comments, alerts, flagsTags (automatic + humans classification)Tags (automatic + humans classification)

Thread (related Items)Thread (related Items)

Page 30: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

LOCATIONS HEATMAP

30

Page 31: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

31

Page 32: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

32

Page 33: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

33

Page 34: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

34

Page 35: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

35

Page 36: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

36

Tracking the Avian

Influenza Outbreak in

Egypt (reports

started to appear late

January 2009).

Page 37: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Current classifications (automated and corrected by human experts) includes: 7 syndromes 10 transmission modes > 100 infectious diseases > 180 micro-organisms > 140 symptoms > 50 chemicals

37

HFOSS Disease Ontology Prediction Project http://2009.hfoss.org/Evolve_-_Disease_Ontology_Prediction

Page 38: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Over the summer, the Humanitarian FOSS (HFOSS) Project Summer Institute 2008 (May' 08 - July' 08) carried out an internship project mentored by InSTEDD and a number of HFOSS faculty. During this internship, Juan Pablo Mendoza and Qianqian Lin developed ALPACA Light Parsing And Classifying Application (ALPACA) to:

1. Transform raw unstructured documents (e.g., news reports, ProMED mail, etc.) into machine readable and analyzable data using a text parsing module

2. Categorize documents using a SVM classifier using libSVM for: 1. a) Classification into a predetermined (user-

defined) list of categories as described above (syndromes, symptoms, routes of transmission, diseases, etc.), and 

2. b) Suggesting additional tags and/or topics using a Naive Bayes classifier given existing topics and monitoring human input and review. This is especially helpful with new (emerging) threats or those threats that we know about but we experience them at a much bigger scale than usual (e.g., far more virulent flu virus than we’ve experienced over the past few years)

38

Page 39: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

We tested ALPACA against two widely accepted early sources of information in the public health community; Reuters news and ProMED mail. Results are shown here:            

ALPACA is extensible through a plug-in functionality that provides a simple way to add additional parsers and classifiers to the application. We are continuously adding and testing additional algorithms and we welcome your contribution to help us better calibrate existing classifiers and parsers as well as introduce additional ones (you can visit our collaborative space here.)

39

Page 40: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

To-date, we have: 480 registered users 394 collaboration spaces 694 streams of information sources (RSS, SMS,

etc.) 900.000 items [e.g., news articles, disease

reports] analyzed 443,151 geo-coded locations 700 terms [tags] ‘trained’ [accept/reject] by

human experts 12.000+ tags ‘suggested’ by human experts

40

Page 41: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Technical considerations Collaboration Workflow

Organizational considerations

Evaluation framework

41

Page 42: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Latest Progress Ontologies (e.g., BioCaster, SNOMED, ICD) Event reporting, analysis and public announcements

(e.g., Thomson Reuters Foundation’s Emergency Information Service (EIS) deployment during the Haiti Response, 2010

Planned Steps API for external extensions and interactions Full support for structured data Automatic field data collection through forms, SMS,

etc. Anomaly detections (e.g., EARS)

42

http://alertnet.org/db/blogs/1564/2010/00/24-120746-1.htm

http://ndt.instedd.org/search/label/eis

Page 43: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

43

Page 44: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

44

Page 45: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Taha Kass-Hout, MD, MS

http://kasshout.blogspot.com

Nicolás di Tada

[email protected]

Riff

http://riff.instedd.org

[Software: http://code.google.com/p/riff-evolve Code license: GNU General Public License v3, Content license: Creative Commons 3.0 BY-SA]

Cambodia, Photo taken by Taha Kass-Hout, October 2008

“this pic says it all- our kids are all the same- they deserve the same”, Comment by Robert Gregg on Facebook, October 2008

45

Page 46: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance
Page 47: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Kass-Hout and di Tada: Best Poster Award for Improving Public Health Investigation and Response at the Seventh Annual ISDS Conference, December 3-5, 2008 at the Raliegh Conference Civic Center. http://kasshout.blogspot.com/2008/12/best-poster-award-for-improving-public.html and http://www.isdsjournal.org/article/viewArticle/3308 47

Page 48: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Detection-focused visualization Individual alert listings Summary alerts Alerts in time-series graph Mapping alerts

Information-based visualization (visualizing data and information) Data query Data stratification Time-series graphs Data line listing Matrix portal Mapping

How to communicate information to users of the system. Typically there are three basic components:•Time-series graphs•Maps •Data tables

However, depending on the primary focus (detection-bases, information-based, or a hybrid of both) there can be more components as follows:

48

Page 49: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Detection-based systems (alert listings and maps of current anomalies are the two most important visualization components) What-if scenarios Automatic anomaly detection

Statistical anomaly

System believes there is an anomaly of interest to the user

Information-based systems (GIS, time-series graphs, data tables, query wizards, and real-time displays are the most important visualization components) Create new case definition Select different processing options Customize presentation to meet users needs

Additionally, we propose building a hybrid solution that combines both detection and information-based systems, which supports the following:

49

Page 50: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Early detectionSpatio-temporalBelief Networks (BNs)Simulation and modeling

Provide earlier notification of a change in the normal levels of observed counts of the desired health indicator.

Emphasis on the importance of matching the analytic process to the data type so as to achieve the performance needed for early identification of the event with minimum false alarms (Type I and Type II errors).

Performance evaluation f analytic processes using accepted metrics.

50

Page 51: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Statistical decision of the analytic data monitor include: which combinations of data sources to

test which algorithms to use with respect to

characteristics of the data background how to achieve sensitivity over many

locations within manageable false alert rate frequency

how much corroboration among data streams is required to achieve a threshold for escalating the information

A multiplicity of data sources has appeal because consistent evidence may be employed to suggest inferential accuracy. In practice, however, multiple data sources can be contradictory. Decision requirements for the prospective analytic data monitor involve when and how deeply to investigate a data anomaly as well as when to escalate the information (as an alert) for action. Unambiguous, corroborated data spikes are the exception rather than the rule. For single data streams, univariate algorithms employ data modeling and hypothesis tests to provide systematic signal escalation protocols.

51

Page 52: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Use of application-linked and hyperlinked-fields for integration of analysis and visualization tools

Commenting Capability Notification via a “publish and

subscribe” capability Shared group definitions and

calendars Shared access to key artifacts Support for Mobile devices (e.g., SMS)

and VOIP52

Page 53: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

The enforcement of the business rules for distributing and validating alerts, escalation, and the definition of tasks

Keeping the business logic encapsulated in an business engine, as opposed to “coding it into” the core applications

Modification of operations “on the fly”, and supporting different modes of operation depending on the current level of emergency

53

Page 54: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance
Page 55: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Overall measures Situation Awareness Global Assessment

Technique (SAGAT) The Situation Awareness Rating

Technique (SART)

Individual processes measures

Network parameters

55

Page 56: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Which automated systems generated the most reliable alerts, and for what types of conditions?

Which human users where the most effective in identifying conditions?

Which indicators are the most effective in identifying a health event?

What factors help to minimize or aggravate a health event?

Which elements of the biosurveillance lifecycle require the most time and/or collaboration?

The network history will provide a common point of evaluation for a variety of surveillance and response techniques

56

Page 57: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

System description Purpose (detection- and information-based) Stakeholders Operations

Health-related event detection Timeliness Validity

Validation approach

Statistical assessment of validity

Data quality

System experience System usefulness Flexibility Acceptability Portability Stability Costs

Evaluation here is primarily for the timely detection of health-related event and effectiveness of response. We have to keep in mind the flexibility of the system and how it can meet both regular and advanced users. Advanced users often want control in order to customize queries, modify graphic presentation, adjust sensitivity levels of detection algorithms, etc.

57

Page 58: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance
Page 59: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Source Type

Non-Specific Syndromic

Specific Case Definition

59

Note: All tags can follow a hierarchical construct

Ontology

Example: A subset of disease ontology, showing relationships between the various forms of pneumonia.

Pneumonia andinfluenza

Pneumonia due toStaphylococcus aureus

Other bacterialpneumoniaPneumococcal

pneumonia

Pneumonia due toHemophilus influenzae

Page 60: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Human Prodromal Clinical Morbidity and Mortality

Animal

Environmental (or Climate)

Allied Professional Source60

Page 61: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Building/vessel contamination Continuous or intermittent release of

an agent Contagious person-to-person Commercially distributed products Waterborne Vector/host borne Sexually transmitted Other

Large-scale bioaerosol Premonitory release of agent …

61

Page 62: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

CLIMATE

62

Temperature change

Precipitation change

Wind change …

PEOPLE

Die-offs observed

Sentinels tested

ANIMAL

Increased mortality rate

Increased presentations for treatment

Building/vessel contamination

Continuous or intermittent release of an agent

Contagious person-to-person

Commercially distributed products

Waterborne Vector/host borne Sexually transmitted Other

Large-scale bioaerosol

Premonitory release of agent

TRANSMISSION ROUTE

Note: All tags can follow a hierarchical construct

Page 63: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

RESPIRATORYBREATHING DIFFICULTY

63

Hemoptysis Asthma attack Croup Pneumonia Wheezing Runny or stuffy

nose Pleuritic pain Sore throat URI …

GI

Fever Weakness Anorexia Viral syndrome Faintness Malaise Body aches General illness Chills Lymphadenopa

thy Sweating …

CONSTITUTIONALIRRITABLE BABY

Abdominal pain

Diarrhea Vomiting Nausea Gastroenteritis Dehydration …

Cough Sore throat Fever Weakness Viral syndrome Body aches Bronchiolitis Pnemonia Upper respiratory

infection Malaise Chills Influenza …

INFLUENZA-LIKE ILLNESS (OR ILI)

Note: All tags can follow a hierarchical construct

Page 64: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

UNDIAGNOSED

GI Respiratory …

DIAGNOSED

Influenza Avian

influenza …

64

Can be mapped to standards, such as:• Unified Medical Language System

(UMLS) [which supports SNOMED, LOINC, ICDs, etc.] http://www.nlm.nih.gov/research/umls/

• PHIN VADS (http://www.cdc.gov/PHIN)

Case Definition:•Probable•Possible•Confirmed

Note: All tags can follow a hierarchical construct

Page 65: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

65

Page 66: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Cough [13 of 130]

If Item has:

Runny Nose [20 of 130]

Fever [23 of 130]

Then tag it with:

Flu [10 of 130]

Admin configures a new inference:

User sees a suggestion for a new item:

System will analyze the existing tagged Items and find out the probability of an item been a flu given that it has cough, runny nose and fever.

Flu [85% confidence because of cough, runny nose and fever]Influenza [55% confidence because of cough and headace]

Tags inferred

66

Page 67: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Cough

Longitude

Latitude

Fever

3 items clustered because of its proximity and similar symptoms

Note: This is actually done in a n-dimensional space, n being the number of tags available, plus the number of relevant words detected, plus a possible spatio-temporal dimension

67

Time

Page 68: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Each item gets represented by a vector of the relevant words it contains with the corresponding frequency.

Each tag classifier gets its linear classifier, which needs at least one positive and one negative sample. The classification is based on the vectors for each item, the linear classifier creates a hyperplane which divides the n-space in two for positive and negative predictions.

Whenever a user corrects or confirms a suggestion we feedback the classifier.

Any number of BNs can be created to map some “evidence” tags to a “prediction” tag. The system will measure for each item the probability of having that tag based on the existence of previous tags.

The vectors for the items can be grouped to find clusters. This will mean that those items are near in the n-space so they have similar values for their word content and tags.

68

Page 69: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

This is just an initial approach, there are a number of alternatives implementations: Automatic tagging can be done using clustering: we

create clusters for each tag and for new items we measure to which cluster centroids the item is closer.

Automatic tagging can also be done using BNs, our evidence can be the words and we can measure the probability of a certain tag based on the words contained by the item.

New Tag suggestions can be done using clustering instead of BNs: clustering all the items and suggesting tags that some of the items in the cluster have and the others don’t.

Given that we implement the algorithm abstractly enough, it should be simple to interchange them and see what works best.

69

Page 70: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

70

P(malaria) = 22% P(influenza) = 13% P(other ILI) = 33%

Page 71: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

ClassifiersClusteringBayesian StatisticsNeural NetworksGenetic Algorithms

71

Page 72: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

72

cold

fever

Page 73: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Map items to vectors (Feature extraction)

Normalize those vectorsTrain the classifierMeasure the results with new

informationFeedback the classifierSeparate classes in feature space

73

Page 74: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

74

Page 75: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Support vectors define the separator

75

Page 76: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

76

Φ: x → φ(x)

Map to higher-dimension space

Page 77: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

77

ClassifierClassifier

Document 1

Document 1

Document 2

Document 2

Document 3

Document 3

PositivesPositives

NegativesNegatives

Training DocumentTraining

DocumentTraining

DocumentTraining

Document

Page 78: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Map items to vectors (Feature extraction)

Normalization

Agglomerative or Partitional

78

Page 79: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

79

Page 80: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

80

Page 81: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

81

P(A |B) P(B | A).P(A)

P(B)

Probability of disease A (flu)

once symptom B (fever) is observed

Probability of disease A (flu)

once symptom B (fever) is observed

Probability of fever once flu is confirmed

Probability of fever once flu is confirmed

Probability of flu (prior or marginal)

Probability of flu (prior or marginal)

Probability of fever (prior or

marginal)

Probability of fever (prior or

marginal)

Page 82: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Given a set of stimuli, train a system to produce a given output…

82

Page 83: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Hidden LayerHidden Layer

Output LayerOutput Layer

Input LayerInput Layer

83

[…]

[…]

{I0,I1,……In}

{O0,O1,……On}

Weight

Weight

).(0 in

I

i in wIH

Page 84: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

84

Event?

Page 85: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Define the model that you want to optimize

Create the fitness function

Evolve the gene pool testing against the fitness function.

Select the best individual

85

Page 86: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Model the transmission process using a set of parameters (e.g., an infectious disease): Onset time between an infection and

illness Latency period Incubation period Symptomatic period Infectious period

86

(Onset, Latency, Incubation, Symptomatic , Infectious)

( 2 days, 3 days, 1 day, 4 days, 3 days)

Page 87: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

87

Fitness = 1/AreaFitness = 1/Area

Page 88: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

1. Create an initial population of candidates

2. Use operators to generate new candidates (mating and mutation)

3. Discard worst individuals or select best individuals in generation

4. Repeat from 2 until you find a candidate that satisfies the solution searched

88

Page 89: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

(4,5,6,3,5) (4,3,6,2,5)

89

(5,3,4,6,2) (2,4,6,3,5) (4,3,6,5,2)

(2,3,4,6,5) (3,4,5,2,6)

(3,5,4,6,2) (4,5,3,6,2) (5,4,2,3,6)

(4,6,3,2,5) (3,4,2,6,5) (3,6,5,1,4)

(5,3,2,6,5)

(3,4,4,6,2)

(5,3,2,6,5)

(3,4,4,6,2)

Page 90: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

90

Page 91: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance
Page 92: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

• Each "pill" is an hypothesis, it has the event tags on top, followed by the author. 

• Each hypothesis can have many responses to it, a response means someone disagrees or wants to add something to the proposed hypothesis.

• The Red area are items that have been rejected by the responder to the previous hypothesis. Something like "Yes, I agree with you, but this and this don't belong here".

• The Blue area means items that have been added to the responded hypothesis, like "Yes, you are right, but also this and this items should be included in this event.

92

Page 93: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

• A hypothesis response can include changes in the event tags: "You are right, there's definitely something going on with all those items, but I don't think its airborne, I think its waterborne"

• The list of people below the pill are all the users that have subscribed or agreed to that hypothesis: "Yes, Doc. James is right, I put my signature here."

• The size of the subscribers below is relative to their "reputation" inside Riff, the whole area below a pill gives an idea of how well supported a hypothesis is. We could also grow the pill proportionally.

93

Page 94: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

• The "bold" pill is the confirmed one, once there's a confirmation, that should increase the reputation of all the subscribers to that hypothesis. 

• Hovering each pill's area, should display a summary of the number of items there, maybe the area and the items tags. 

• Clicking should popup a small navigation of the actual items contained there.

94

Page 95: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance
Page 96: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Can trend analysis predict outbreaks? Recent studies show that Internet

search has: …considerable potential as one of the

earliest indicators for syndromic surveillance …the potential to predict population-based

events relevant to public health …the potential for a higher sensitivity

compared to other early sources (e.g., media, ProMed)

96

Page 97: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Many individuals experiencing symptoms of illness conduct Internet search prior to seeking medical attention Wilson, Kumanan, Brownstein, John S., Early detection of disease outbreaks

using the Internet, CMAJ 2009 180: 829-831 Ginsberg J, Mohebbi MH, Patel RS, et al. Detecting influenza epidemics using

search engine query data. Nature 2009;457:1012–4 Mostashari F. Can Internet searches provide useful data for public health

surveillance?. Advances in Disease Surveillance 2007;2:209 Wethington H, Bartlett P. Usage and data collection patterns for a novel

web-based foodborne-disease surveillance system. J Environ Health. 2006 Mar;68(7):25-9

Cooper CP, Mallon KP, Leadbetter S, Pollack LA, Peipins LA. Cancer Internet search activity on a major search engine, United States 2001-2003. J Med Internet Res 2005;7:e36 Li CS, Aggarwal C, Campbell M, et al. Site-Based Biosurveillance. MMWR September 24, 2004 / 53(Suppl);249

Eysenbach G, Kohler C. What is the prevalence of health-related searches on the World Wide Web? Qualitative and quantitative analysis of search engine queries on the Internet. Proc AMIA Annu Fall Symp 2003;225-9

97

Page 98: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Internet search for allergies and ragweed search terms increase in the spring, and allergy and pollen search terms increase significantly in the fall. It would also appear that Texas and Oklahoma are leading locales for ragweed.

98Source: Mostashari F. Can Internet searches provide useful data for public health surveillance?. Advances in Disease Surveillance 2007;2:209

Page 99: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

99

A search for the term “leptospirosis” in the United States finds dramatically higher search rates from Honolulu, Hawaii, consistent with the epidemiology of the illness in the United States (more than half of all national cases are reported from Hawaii).

Source: Mostashari F. Can Internet searches provide useful data for public health surveillance?. Advances in Disease Surveillance 2007;2:209

Page 100: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Internet search for “contact lens” increased in Singapore in February 2006, prior to the notification from CDC of the first US cases of contact lens-associated Fusarium keratitis in March 2006, and prior to widespread news coverage in April 2006.

100Source: Mostashari F. Can Internet searches provide useful data for public health surveillance?. Advances in Disease Surveillance 2007;2:209

Page 101: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

Following large anti-war protests on the Mall in Washington DC in late September 2005, multiple environmental sensors watching for bioterror events detected the presence of Francisella tularensis. Interestingly, queries appear to have increased prior to discovery of the sensor findings by public health officials on September 30th.

101Source: Mostashari F. Can Internet searches provide useful data for public health surveillance?. Advances in Disease Surveillance 2007;2:209

Page 102: RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance

While uncommon words like “croup” readily reveal the expected seasonal pattern of Internet search, more common words like “cough” or “throat” require logical modifiers to rule out more common search phrases.

102Source: Mostashari F. Can Internet searches provide useful data for public health surveillance?. Advances in Disease Surveillance 2007;2:209