KDD 2011 Doctoral Session



Modeling Trustworthiness of Online Content

V. G. Vinod Vydiswaran

Advisors: Prof. ChengXiang Zhai, Prof. Dan Roth

University of Illinois at Urbana-Champaign

Incorporating text in trust models

Three directions of research

Credibility assessment woes

Acknowledgments

My research is supported partially by the Multimodal Information Access and Synthesis (MIAS) Center at the University of Illinois at Urbana-Champaign, part of CCICADA, a DHS Science and Technology Center of Excellence, and grants from the Army Research Laboratory.

Contact details

vgvinodv@illinois.edu, [email protected], [email protected]

Figure: three-layer model connecting sources (web sources), evidence (evidence passages), and claims (claim sentences: Claim 1, Claim 2, ..., Claim n).

Advantages over traditional models

Incorporates semantics in trust computation using evidence.

Claims need not be structured tuples – they can be free-text sentences.

The framework does not assume that accurate Information Extraction is available.

A source can have a different trust profile for different claims – not all claims from a source get equal weight.

Traditional two-layer fact-finder models [Yin, et al., 2007; Pasternack & Roth, 2010]

Figure: two-layer model with claim nodes Claim 1, Claim 2, …, Claim n.
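To make the two-layer idea concrete, here is a minimal sketch of a generic source-claim iteration in the style of Hubs and Authorities. It is an illustration under stated assumptions, not the specific algorithm of either cited paper: the function name `two_layer_fact_finder`, the max-normalization, and the toy `(source, claim)` pairs are all hypothetical.

```python
from collections import defaultdict


def two_layer_fact_finder(assertions, iterations=20):
    """Iterate trust between sources and claims (illustrative only).

    assertions: iterable of (source, claim) pairs.
    Returns (trust, belief): dicts of source trust and claim belief in [0, 1].
    """
    claims_of = defaultdict(set)
    sources_of = defaultdict(set)
    for source, claim in assertions:
        claims_of[source].add(claim)
        sources_of[claim].add(source)
    if not claims_of:
        return {}, {}

    trust = {s: 1.0 for s in claims_of}
    belief = {c: 1.0 for c in sources_of}
    for _ in range(iterations):
        # A claim's belief is the summed trust of the sources asserting it.
        belief = {c: sum(trust[s] for s in srcs) for c, srcs in sources_of.items()}
        top = max(belief.values())
        belief = {c: b / top for c, b in belief.items()}
        # A source's trust is the summed belief of the claims it asserts.
        trust = {s: sum(belief[c] for c in cl) for s, cl in claims_of.items()}
        top = max(trust.values())
        trust = {s: t / top for s, t in trust.items()}
    return trust, belief


# Hypothetical toy input: two sources agree on claim "x", one asserts "y".
trust, belief = two_layer_fact_finder([("A", "x"), ("B", "x"), ("C", "y")])
```

In a model of this shape, every claim from a source inherits the same source-level trust, which is the limitation the three-layer framework is meant to address.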

Conclusion and future research steps

Need to determine the truth value of a claim.

Many information types are available to gauge trustworthiness:

Source credibility and the power of the information network

Evidence trustworthiness

Signals from community knowledge

Contrastive viewpoints for claims

Biases of users accessing the information

The goal is to recognize credible information by combining these features.

The next step is to understand how human biases interact with the credibility of the information people access.

Community knowledge to validate claims

Veracity of news reporting

Trustworthiness of news stories

Credibility of news sources

Building trust models over pieces of evidence

Content-driven trust propagation framework (KDD 2011)

Utilizes similarity and trustworthiness of evidence to measure trustworthiness of sources and claims.
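As a rough illustration of how evidence similarity could enter the computation, the sketch below propagates scores across sources, evidence, and claims. It is a simplified stand-in, not the update rules of the KDD 2011 framework: the `(source, claim, similarity)` tuple format, the function names, and the max-normalization are assumptions made for this example.

```python
def _normalize(scores):
    """Scale scores so the largest value is 1.0 (left unchanged if all are zero)."""
    top = max(scores.values(), default=0.0)
    if top == 0.0:
        return dict(scores)
    return {key: value / top for key, value in scores.items()}


def propagate_trust(evidence, iterations=20):
    """Illustrative three-layer propagation.

    evidence: list of (source, claim, similarity) tuples, where similarity in
    [0, 1] says how closely a passage from `source` matches `claim`.
    Returns (source_trust, claim_score).
    """
    sources = {s for s, _, _ in evidence}
    claims = {c for _, c, _ in evidence}
    source_trust = {s: 1.0 for s in sources}
    claim_score = {c: 1.0 for c in claims}

    for _ in range(iterations):
        # Claims: similarity-weighted sum of the trust of supporting evidence,
        # where each passage inherits the current trust of its source.
        raw_claims = {c: 0.0 for c in claims}
        for s, c, sim in evidence:
            raw_claims[c] += sim * source_trust[s]
        claim_score = _normalize(raw_claims)

        # Sources: average, over their passages, of similarity times claim score.
        totals = {s: 0.0 for s in sources}
        counts = {s: 0 for s in sources}
        for s, c, sim in evidence:
            totals[s] += sim * claim_score[c]
            counts[s] += 1
        source_trust = _normalize({s: totals[s] / counts[s] for s in sources})

    return source_trust, claim_score


# Hypothetical toy evidence: (source, claim, similarity).
toy_evidence = [("s1", "c1", 0.9), ("s2", "c1", 0.8), ("s3", "c2", 0.3)]
source_trust, claim_score = propagate_trust(toy_evidence)
```

Because the weight sits on each evidence passage rather than on the source as a whole, the same source can contribute strongly to one claim and weakly to another.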

Scoring claims based on community knowledge

Find treatment relations in health message boards and forums

Verify whether the perception formed from reading forums correlates with the validity of treatments (as approved by the FDA)

Squashing rumors with evidence search

Find evidence for claims from a large text collection (ACL 2009)

Find contrasting evidence (ongoing)

Even reputed sources make mistakes.

Some claims (and sources) are purposefully misleading.

Not all claims made by a source are equally trustworthy.

Often, contradictory claims are both supported by credible evidence.

How to verify free-text claims?

Figure: several claim databases (Claim DB) and an Evidence & Support DB. The steps, numbered 1 to 3 in the original figure, include: extract relevant claims and evidence; match up claims to evidence; rate sites based on matching claims and their support.

Contrastive evidence retrieval

Traditional search: look up pieces of evidence based only on relevance.

Evidence search: look up pieces of evidence supporting and opposing the claim.
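The contrast between the two retrieval modes can be sketched as follows. This is a minimal illustration that assumes each passage already carries a relevance score and a stance label; how stance is determined is outside the sketch, and the `Passage` type, its fields, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    text: str
    relevance: float  # assumed retrieval score; higher means more relevant
    stance: str       # assumed label: "support" or "oppose"


def traditional_search(passages, k):
    """Return the top-k passages ranked by relevance alone."""
    return sorted(passages, key=lambda p: p.relevance, reverse=True)[:k]


def evidence_search(passages, k):
    """Return the top-k supporting and top-k opposing passages separately."""
    ranked = sorted(passages, key=lambda p: p.relevance, reverse=True)
    return {
        "support": [p for p in ranked if p.stance == "support"][:k],
        "oppose": [p for p in ranked if p.stance == "oppose"][:k],
    }


# Hypothetical placeholder passages for a single claim.
passages = [
    Passage("passage A", 0.9, "support"),
    Passage("passage B", 0.8, "support"),
    Passage("passage C", 0.4, "oppose"),
]
top_two = traditional_search(passages, 2)
both_sides = evidence_search(passages, 1)
```

With this toy input, a relevance-only top-2 list contains just the two supporting passages, while the evidence search also surfaces the lower-ranked opposing passage.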

Figure: Scalable Entailed Relation Recognizer, with components Text Corpus, Indexes, Hypothesis (Claim) Relation, Expanded Lexical Retrieval, and Entailment Recognition. [Initial work at ACL 2009]


[KDD-DMH 2011]

[KDD 2011]