Vaclav Belak PhD Viva

A Structural Approach to Community-level Social Influence Analysis. Ph.D. Viva, Václav Belák

Transcript of Vaclav Belak PhD Viva

  • 1. A Structural Approach to Community-level Social Influence Analysis. Ph.D. Viva, Václav Belák
  • 2. Context and Motivation I: Our earlier study suggested that communities influence each other.
  • 3. Context and Motivation II: Network represents flow between actors. Actor-level social influence: healthcare, innovations, marketing, etc. Actors are embedded in communities. No suitable model of community-level influence. [Figure label: high in-degree]
  • 4. Research Problem and Questions. Problem: measurement, analysis, and explanation of influence between various types of social communities. Questions: 1. How can we model influence between communities? 2. How do we detect communities acting as global authorities/hubs? 3. Can we exploit the model to maximise information diffusion?
  • 5. Q1: How can we model influence between communities?
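  • 6. Methodology: COIN. [Diagram: the impact matrix (communities × communities) is derived from the membership and centrality matrices (both actors × communities), with one factor transposed (T); remaining diagram labels: What, How, impacts, depends on]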
  • 7. Impact and Its Aggregates. Impact matrix: rows are the communities that impact, columns are the communities that depend on them. Row: impact of a community on others. Column: impact of others on a community. Diagonal: independence. Importance = total impact of a community on others. Dependence = total impact of others on a community. Heterogeneity of importance/dependence is measured by entropy.
  • 8. Experiments
  • 9. Influence Over Time. Questions: Which communities influenced a given community over time? How do we measure that by COIN? Hypothesis: frequent impact higher than independence indicates influence. Experiments: segment data by time window; find impact higher than the independence of the influenced community. Data: discussion fora; links represent replies; forum as a proxy of community.
  • 10. Personal Issues vs Moderators. [Plot: impact on Personal Issues over time, by impacting forum (Moderators, PI Mods); strong impact emphasised.] Personal Issues was influenced first by Moderators, later by a specific moderating community, PI Mods.
  • 11. Q2: How do we detect communities acting as global authorities/hubs?
  • 12. Global Authorities: Widespread High Importance. [Plot: importance vs importance entropy (low to widespread); regions: local authorities, global authorities, low importance]
  • 13. Moderators: Authority of [name missing in transcript]. [Plot: importance vs importance entropy; Moderators highlighted]
  • 14. Global Hubs: Widespread High Dependence. [Plot: dependence vs dependence entropy (low to widespread); regions: hubs, driven, low dependence]
  • 15. After Hours: Hub of [name missing in transcript]. [Plot: dependence vs dependence entropy; After Hours highlighted]
  • 16. Core: Hub of [name missing in transcript]. COIN integrated into SAP PULSAR. [Plot: dependence vs dependence entropy; highlighted community: SAP Business One: Core]
  • 17. Cross-Community Dynamics in Science. Questions: How can we measure and explain influence between scientific communities? How does the influence relate to a community's performance? How do we adapt COIN? Data: scientists linked by citations; AI communities defined as conferences.
  • 18. COIN for Scientific Communities: citations as a proxy of impact and information flow. [Diagram labels: citation, information flow.] Aggregate measures: importance (how much information flows out of the community); independence (how introspective the community is).
  • 19. Exporters and Isolated AI Communities. Hypothesis: importance indicates exporters; independence and importance indicate isolated islands. [Plot: independence (0.00 to 0.75) vs importance (0.0 to 1.5); regions: islands, exporters, loose exporters, mainstream; labelled communities: CBR, COLT, IJCAI]
  • 20. Q3: Can we exploit the model to maximise information diffusion?
  • 21. Influence and Information Diffusion. Actor-level diffusion maximisation problem: Which actors to target? Cross-community diffusion maximisation problem: Which communities to target? [Figure label: high in-degree]
  • 22. Information Diffusion Experiments. Hypothesis: the product of importance and entropy identifies seed communities that induce high overall adoption. Overall adoption estimated by a diffusion model on [not preserved in transcript]. Four targeting strategies: 1. Impact Focus (IF), based on COIN; 2. Greedy (GR); 3. Group In-degree (GI); 4. Random (RA). IF = importance × entropy. Selection vs Prediction.
  • 23. COIN Optimises Information Diffusion. [Two plots, Selection and Prediction: user activation fraction (a) vs number of seed communities (q), from 1 to 5, for strategies IF, GI, GR, RA.] Greedy overfits; Impact Focus is more robust.
  • 24. Summary and Future Work. Summary: COIN, a computational model for community influence; communities influencing a particular community; roles of communities: authorities vs hubs; isolated communities losing influence; seed communities for information diffusion; general (3 systems) and extensible; tensor-based extension of COIN captures topics. Future work: may be applicable to e.g. email networks; Impact Focus may be improved by discounting overlap; sentiment-informed community influence.
  • 25. Contributions: proposes a solution to the problem of measurement, analysis, and explanation of influence between communities; purely structural approach, extended to capture topics; empirical analysis of 3 systems, common/different phenomena; first approach to the novel problem of cross-community information diffusion. Dissemination: 1 journal, 3 conference, and 1 workshop papers; best poster at NUIG research day 2013; complete results, software, data, thesis, etc. at: http://belak.net/doc/2014/thesis.html
  • 26. Personal Issues and Moderators. [Plots comparing groups PI and PIM; panel labels: membership, indegree, ld]
  • 27. CBR community: isolated. [Plots for CBR and JELIA: inflow, outflow, and introspection per year, 1996 to 2008]
  • 28. CBR: isolated and shrinking. Decreasing size; rigid member base; rising impact factor driven by self-citations. [Plots: group indegree, size, and impact factor per year, 1996 to 2008.] CBR was unable to attract new members and decayed. Cannot be revealed by introspective analysis.
  • 29. Greedy Strategy
  • 30. Group In-Degree: GI = number of links from outside
  • 31. COIN extended to capture topics: based on tensor algebra; better interpretability and sensitivity; consistent with purely structural COIN. Topical Dimensions of Influence [diagram labels: actors, communities]. Example: V-TFL Admin vs V-TFL Discussion
  • 32. Rise of Hubs and Authorities in Boards
  • 33. Exporters and Introspective Communities
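
The aggregates on slide 7 and the Impact Focus score on slide 22 can be sketched in a few lines of Python. This is a minimal illustration, not the thesis implementation. The function names and the toy impact matrix are hypothetical. The entropy used here (Shannon entropy of a community's off-diagonal row, normalised to [0, 1]) is an assumption: the slides only say that heterogeneity is measured by entropy and that IF is the product of importance and entropy.

```python
import math


def independence(impact):
    """Diagonal of the impact matrix: how self-contained each community is."""
    return [impact[i][i] for i in range(len(impact))]


def importance(impact):
    """Row sums without the diagonal: total impact of a community on others."""
    return [sum(v for j, v in enumerate(row) if j != i)
            for i, row in enumerate(impact)]


def dependence(impact):
    """Column sums without the diagonal: total impact of others on a community."""
    n = len(impact)
    return [sum(impact[i][j] for i in range(n) if i != j) for j in range(n)]


def normalised_entropy(values):
    """Shannon entropy of values rescaled to sum to 1, divided by log(len).

    ASSUMPTION: this normalisation is chosen for illustration only.
    """
    total = sum(values)
    if total <= 0 or len(values) < 2:
        return 0.0
    h = -sum((v / total) * math.log(v / total) for v in values if v > 0)
    return h / math.log(len(values))


def importance_entropy(impact):
    """How widely each community spreads its impact over the other communities."""
    return [normalised_entropy([v for j, v in enumerate(row) if j != i])
            for i, row in enumerate(impact)]


def impact_focus(impact):
    """IF = importance x entropy, as stated on slide 22."""
    return [imp * ent
            for imp, ent in zip(importance(impact), importance_entropy(impact))]


def top_seed_communities(impact, q):
    """Indices of the q communities with the highest Impact Focus score."""
    scores = impact_focus(impact)
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:q]


if __name__ == "__main__":
    # Hypothetical 3-community impact matrix; row i holds the impact of
    # community i on each community, the diagonal is its independence.
    toy_impact = [
        [0.5, 0.25, 0.25],
        [0.0, 1.0, 0.0],
        [0.5, 0.0, 0.5],
    ]
    print("importance:", importance(toy_impact))
    print("dependence:", dependence(toy_impact))
    print("independence:", independence(toy_impact))
    print("seed communities (q=1):", top_seed_communities(toy_impact, 1))
```

In the toy matrix, communities 0 and 2 have the same importance, but only community 0 spreads its impact over several communities, so only it receives a non-zero Impact Focus score. This mirrors the distinction between global and local authorities on slide 12.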