A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

A QUESTION OF COMPLEXITY − MEASURING THE MATURITY OF ONLINE ENQUIRY COMMUNITIES. GRÉGOIRE BUREL (1) AND YULAN HE (2). (1) Knowledge Media Institute, The Open University, Milton Keynes, UK. (2) School of Engineering & Applied Science, Aston University, UK. HT2013, Paris, France. 2013

Online enquiry communities such as Question Answering (Q&A) websites allow people to seek answers to all kinds of questions. With the growing popularity of such platforms, it is important for community managers to constantly monitor the performance of their communities. Although different metrics have been proposed for tracking the evolution of such communities, maturity, the process in which communities become more topic proficient over time, has been largely ignored despite its potential to help in identifying robust communities. In this paper, we interpret community maturity as the proportion of complex questions in a community at a given time. We use the Server Fault (SF) community, a Q&A community of system administrators, as our case study and analyse question complexity, the level of expertise required to answer a question. We show that question complexity depends on both the length of involvement and the level of contributions of the users who post questions within their community. We extract features relating to askers, answerers, questions and answers, and analyse which features are strongly correlated with question complexity. Although our findings highlight the difficulty of automatically identifying question complexity, we found that complexity is most influenced by both the topical focus and the length of community involvement of askers. Following the identification of question complexity, we define a measure of maturity and analyse the evolution of different topical communities. Our results show that different topical communities show different maturity patterns. Some communities show a high maturity at the beginning while others exhibit a slow maturity rate.

Transcript of A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

Page 1: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

A QUESTION OF COMPLEXITY − MEASURING THE MATURITY OF ONLINE ENQUIRY COMMUNITIES

GRÉGOIRE BUREL (1) AND YULAN HE (2)

(1) Knowledge Media Institute, The Open University, Milton Keynes, UK.
(2) School of Engineering & Applied Science, Aston University, UK.

HT2013, Paris, France. 2013

Page 2: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

OUTLINE

- Question Complexity and Community Maturity
  - Enquiry Communities
  - Server Fault
  - Needs and Motivations
  - Contributions
- Hypotheses and Validation
  - Two Definitions
  - Five Hypotheses
  - Validation
- Computing and Mapping Features
  - Predictors
  - Feature Computation: Users, Content and Threads.
- Measuring Content Complexity and Community Maturity
  - Prediction Results
  - Feature Ranking
  - Community Maturity
- Future Work
- Conclusion

Page 3: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

ENQUIRY COMMUNITIES

“Enquiry Communities are communities composed of askers and answerers looking for solutions to particular issues.”

Page 6: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

- Server Fault (SF):
  - A web-based IT enquiry community specialised in server-related issues.
  - Factual questions rather than conversational questions.
- Dataset (data up to April 2011):
  - 71,962 Questions
  - 162,401 Answers
  - 51,727 Users
  - 4,999 Topics (Tags)

http://serverfault.com

Page 7: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

ENQUIRY COMMUNITIES

- Enquiry Communities Needs (Rowe et al. 2011, Burel et al. 2012):
  - Community Managers:
    - Make sure that the community is “happy” (questions are solved).
    - Make sure that the community becomes more knowledgeable over time (users gain expertise and experience).
    - Identify and implement features that help users achieve their goals.
  - Askers:
    - Get answers related to a particular issue.
    - Make sure that a community can fulfil their needs before asking a question.
  - Answerers:
    - Find which questions they can answer.
    - Find questions that are challenging.

Page 8: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

ISSUES AND MOTIVATION

- Enquiry Communities Needs:
  - Questions have uneven complexity:
    - It is difficult to identify how hard particular questions are and who can answer them.
  - Communities have different answering abilities:
    - Some communities can answer simple questions about a topic while other communities can also answer complex questions.
    - How do we determine if a community is able to answer complex questions?
  - Some communities are more knowledgeable and experienced than others:
    - How do we measure experience and expertise?
  - Features can support the identification of mature communities and complex content, but which ones?
    - What features help to measure community maturity and content complexity?

Page 9: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

IDENTIFYING COMPLEX QUESTIONS AND MATURE COMMUNITIES

How do user, content, thread and platform features affect content complexity identification? How can we measure maturity based on content complexity?

1. Identifying Complex Questions:
   – Helping answerers to find relevant and challenging questions.
2. Analysis of Complexity Predictors:
   – Helping community managers to identify important complexity factors.
3. Measuring Community Maturity:
   – Helping users to decide if their question will be answered, and helping community managers to understand their community’s abilities.

Page 10: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

CONTRIBUTIONS

How do user, content, thread and platform features affect content complexity? How can we use content complexity for measuring the maturity of communities?

- Introduce a definition of question complexity and validate the hypothesis that question complexity increases with askers’ community involvement.
- Study the influence of features relating to askers, answerers, questions and answers on question complexity prediction.
- Introduce the concept of community maturity, a measure of community knowledge and specialisation.
- Investigate the evolution of community maturity in Server Fault and demonstrate that community maturity is influenced by topical dynamics.

Page 11: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

LITERATURE

How do user, content, thread and platform features affect content complexity? How can we use content complexity for measuring the maturity of communities?

- No empirical study of the relation between content complexity and community involvement.

- No free-form model of content complexity. Existing models are typically very domain dependent (Wu, 2009; Bachrach et al., 2012).

- Community health metrics (Welinder et al., 2010; Toral et al., 2009; Rowe et al., 2011) tend to neglect skill building as a key health indicator despite the importance of this factor in user participation (Pal et al., 2012; Nam et al., 2009).

Page 12: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

QUESTION COMPLEXITY AND MATURITY

- Definition 1 (Question Complexity):
  - Question complexity is a value representing the difficulty and level of expertise required for answering a question.
- Definition 2 (Community Maturity):
  - Community maturity is a value representing the level of knowledge and specialisation achieved by a community. A more mature community focuses on more complex questions whereas a less mature community has simpler and less focused questions.

Page 14: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

QUESTION COMPLEXITY AND MATURITY

- Hypothesis 1 (Temporality):
  - For a given user, question complexity increases as a function of time and participation. The longer a user is actively involved in a community, the more complex her questions are.
- Hypothesis 2 (Enquiry):
  - For a given user, question complexity increases with the number of questions asked. The more questions a user asks, the more likely her questions are to become more complex.
- Hypothesis 3 (Commitment):
  - For a given user, question complexity increases with her activity level. The more frequently a user is involved in a community, the more complex her questions are.
- Hypothesis 4 (Accomplishment):
  - For a given user, question complexity increases with the number of questions for which she has previously found answers. The more answers a user finds to her questions, the more likely she is to improve her knowledge and skills and thus ask more complex questions in the future.
- Hypothesis 5 (Focus):
  - For a given user, question complexity increases with her topical focus. The more a user concentrates on particular topics, the more likely her questions are to be complex.

[Figure: plot with axes labelled Participation and Complexity.]

Page 17: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

HYPOTHESES VALIDATION

- Methodology:
  1. Select 510 question pairs based on the previous hypotheses:
     - Questions from early and late user contributions.
  2. Annotate the question pairs by selecting which question is the more complex:
     - Due to low inter-annotator agreement (for 3 annotators, κ = 0.146), we focus on pairs that have more than 75% agreement (220 pairs, 440 questions).
  3. Calculate the statistical significance of the hypotheses:
     - Concentration on Hypothesis 1: Temporality.
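The two statistical steps of the methodology above can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say which κ statistic or which significance test was used, so Fleiss' kappa (which handles three annotators) and an exact one-sided sign test are assumed here. The function names `fleiss_kappa` and `sign_test_p_value` are hypothetical.

```python
from math import comb


def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = annotators who put item i in category j.

    Assumption: every item is rated by the same number of annotators.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    # Observed agreement, averaged over items.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items
    # Agreement expected by chance, from the overall category proportions.
    p_e = sum(
        (sum(row[j] for row in counts) / (n_items * n_raters)) ** 2
        for j in range(len(counts[0]))
    )
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1 - p_e)


def sign_test_p_value(successes, trials):
    """Exact one-sided p-value P(X >= successes) for X ~ Binomial(trials, 0.5).

    Here a "success" would be a pair whose later question was judged the
    more complex one (hypothetical framing of Hypothesis 1).
    """
    tail = sum(comb(trials, i) for i in range(successes, trials + 1))
    return tail / 2 ** trials
```

For a pairwise annotation task, each row of `counts` would hold two numbers: how many annotators chose the first question and how many chose the second.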

- Results (Hypothesis 1):

Page 19: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

FEATURES

1. User Features (Askers and Answerers):
   – Represent the characteristics and reputation of askers and answerers (e.g. reputation, number of best answers, normalised topic entropy…).
2. Questions and Answers Features:
   – Questions and answers features (e.g. readability, ratings, number of views…).
   – Represent the relation between answers within a particular thread (e.g. topic reputation, elapsed days…).
   – Content-based features (e.g. term entropy, readability…).

Page 20: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

FEATURES

Features by type:

- Askers: Community Age (Experience), Community Age Difference, Number of Questions (Enquiry), Number of Answers, Asking Rate (Asker Commitment), Answering Rate, Ratio of Successfully-Answered Questions (Accomplishment), Ratio of Questions Successfully Answered by Others, Normalised Question Topic Entropy (Focus), Normalised Answer Topic Entropy, Average Number of Replies per Question, Average Number of Question Views, Z-score, Reputation.
- Answerers: Askers features + mean and standard deviation forms.
- Questions: Number of Views, Number of Words, Readability with Gunning Fog, Readability with Flesch-Kincaid Grade, Existing Value, Status, Number of Answers, Favourites, Score, Informativeness, Cumulative Term Entropy.
- Answers: Questions features + mean and standard deviation forms + Elapsed Days, Elapsed Days First, Elapsed Days Last, Number of Comments Mean, Score.
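Two of the asker features listed above can be sketched in a few lines. The slides do not give the formulas, so both definitions are assumptions. Topic entropy is normalised by the logarithm of the number of distinct tags the user has used, and the Z-score is taken to be the (answers − questions) / √(answers + questions) expertise score that is common in the Q&A literature. The function names are hypothetical.

```python
from collections import Counter
from math import log, sqrt


def normalised_topic_entropy(tags):
    """Entropy of a user's tag distribution, scaled to [0, 1].

    Values near 0 mean the user concentrates on few topics (high focus);
    values near 1 mean posts are spread evenly over the tags used.
    """
    counts = Counter(tags)
    if len(counts) < 2:
        return 0.0  # a single topic (or no posts) carries no uncertainty
    total = sum(counts.values())
    entropy = -sum((c / total) * log(c / total) for c in counts.values())
    return entropy / log(len(counts))


def z_score(n_answers, n_questions):
    """Assumed expertise Z-score: positive for users who mostly answer."""
    total = n_answers + n_questions
    if total == 0:
        return 0.0
    return (n_answers - n_questions) / sqrt(total)
```

Under this reading, a focused asker has a low normalised question topic entropy, which is how the Focus hypothesis can be turned into a numeric feature.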

Page 21: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

QUESTION COMPLEXITY PREDICTION

- Experimental Setting:
  1. Split the annotated questions into complex and non-complex questions (440 questions).
  2. Compute features.
  3. Use the Logistic Regression algorithm and validate results using 10-fold cross-validation.
  4. Compute Precision (P), Recall (R), F-Measure (F1) and area under the Receiver Operating Characteristic (ROC) curve for different feature groups.
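The evaluation part of this setting can be illustrated with a small, library-free sketch. It only covers the fold construction and the P/R/F1 computation, not the logistic regression model itself. Whether the reported scores are for the "complex" class or averaged over both classes is not stated on the slide, so scoring a single positive class is an assumption, as are the function names.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

With 440 annotated questions and k = 10, each fold holds out 44 questions and trains on the remaining 396.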

Page 22: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

COMPLEXITY PREDICTION RESULTS

Page 23: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

COMPLEXITY PREDICTION RESULTS

- Question Complexity Identification (F1 0.60):
  – Baseline Models:
    - Asker’s age in a community correlates better with complexity than question length.
    - Question length is not correlated with complex questions.
  – Feature Types Models and Complete Model:
    - Askers’ and answerers’ features are the best: question complexity is mostly related to askers’ features.
    - The full model performs better than the feature type models.

Page 24: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

FEATURES RANKING

- Features Ranking:
  1. For each feature, Information Gain Ratio (IGR), Correlation Feature Selection (CFS) and F1 Feature Drop (FD) are computed.
  2. The features are then sorted by their respective importance.
  3. The best features are then selected for computing a new question complexity model, keeping the feature set that gives the best F1.
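As an illustration of the first ranking criterion, the sketch below computes the Information Gain Ratio of one feature against the complex/non-complex labels. It assumes the feature has already been discretised into a small number of values (continuous features such as community age would need binning first). The function names are hypothetical and this is not necessarily the exact procedure used for the results.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())


def information_gain_ratio(feature_values, labels):
    """Information gain of the feature divided by its split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Label entropy that remains once the feature value is known.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    return gain / split_info if split_info > 0 else 0.0
```

Sorting features by this score, highest first, gives one of the rankings described in step 2.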

Page 25: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

FEATURES RANKING RESULTS

Page 26: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

FEATURES RANKING RESULTS

- Features Impact Comparison:
  – Asker’s community age and topical focus are the most important features.
  – User features are the most significant (73.3% of the top ten features).
  – Answer features are low ranked.
  – Focused users are more likely to ask complex questions.
  – Questions with low value (Pal et al., 2010) are more likely to be complex (complements findings on the question selection behaviour of experts (Pal et al., 2010)).

Page 27: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

BEST MODEL RESULTS

- Best Model (F1 0.64):
  – The best model is obtained when using CFS. The selected features are:
    1. Asker’s question topical focus.
    2. Asker’s ratio of successfully-answered questions.
    3. Askers’ community age.
    4. Questions’ existing value (Pal et al., 2010).
    5. Questions’ views.

Page 28: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

COMMUNITY MATURITY

- Maturity Measure:
- Experimental Setting:
  1. Calculate maturity as the proportion of complex questions asked per month.
  2. Compute maturity on different user sets depending on their age in the community.
  3. Compute maturity for the most discussed topics (tags) and users that have been active for more than a day.
  4. Observe the evolution of maturity for the most discussed topics and the different user groups.
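The monthly computation in this setting can be sketched as below. It assumes that each question already carries a complex/non-complex prediction and a "YYYY-MM" month key, and that maturity is simply the share of complex questions in that month, following the interpretation given in the abstract. The function name and input layout are hypothetical.

```python
from collections import defaultdict


def monthly_maturity(questions):
    """Map each month to the proportion of complex questions asked in it.

    `questions` is an iterable of (month, is_complex) pairs, for example
    ("2010-11", True). Months with no questions are simply absent.
    """
    totals = defaultdict(int)
    complex_counts = defaultdict(int)
    for month, is_complex in questions:
        totals[month] += 1
        complex_counts[month] += int(is_complex)
    return {month: complex_counts[month] / totals[month] for month in sorted(totals)}
```

Filtering `questions` by tag or by the asker's community age before calling the function gives the per-topic and per-user-group curves described in steps 2 to 4.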

Page 29: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

COMMUNITY MATURITY RESULTS

[Figure: two panels titled Users and Topics/Communities.]

Page 30: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

COMMUNITY MATURITY RESULTS

- User Evolution:
  - Maturity increases over time.
  - The maturity drop can be explained by the drop in average community age at the end of 2010 (229 to 185 days).
  - Committed users are more likely to become more mature (0.64 > 0.4).
- Community Evolution and Topics:
  - Maturity increases over time.
  - Different topics have different growth rates. For example:
    - Linux: Slow but sustained → Linux users become more knowledgeable over time.
    - Windows-server-2008: Initially high, then low → Users migrating to Windows-server-2008-r2.

Page 31: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

FUTURE WORK

- Perform similar analysis on other Enquiry Communities:
  - Confirm our results on additional datasets.
- Derive a complexity metric that can be applied to any online community based on the 5 factors of complexity:
  - Create a measure that does not require annotations.

Page 32: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

CONCLUSION

- We showed that current health measures do not help in identifying communities that become more topic proficient over time.

- We introduced the concepts of question complexity and community maturity and provided a complexity model (F1 ≈ 0.65) and a maturity measure.

- We showed that question complexity depends on user activity and commitment as well as other factors (hypotheses testing).

- We found that complex questions depend on five key factors: 1) asker’s question topical focus; 2) asker’s ratio of successfully-answered questions; 3) askers’ community age; 4) questions’ existing value (Pal et al., 2010); and 5) questions’ views.

- We showed that SF is a mature community and that maturity has topical dynamics.

Page 33: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

QUESTIONS?

Web: http://evhart.online.fr
Email: [email protected]
@evhart

Page 34: A Question of Complexity - Measuring the Maturity of Online Enquiry Communities

REFERENCES

- Rowe, M., Alani, H., Angeletou, S., and Burel, G. Report on social, technical and corporate needs in online communities. Tech. Rep. 3.1, ROBUST, 2011.
- Burel, G., He, Y., and Alani, H. Automatic identification of best answers in online enquiry communities. In Proceedings of ESWC2012 (Heraklion, Greece, 2012).
- Wu, M. The community health index. In Proceedings of the 4th International Conference on Persuasive Technology (New York, NY, USA, 2009), Persuasive ’09, ACM, pp. 24:1–24:2.
- Bachrach, Y., Graepel, T., Minka, T., and Guiver, J. How to grade a test without knowing the answers - A Bayesian graphical model for adaptive crowdsourcing and aptitude testing. arXiv preprint arXiv:1206.6386 (2012).
- Welinder, P., Branson, S., Belongie, S., and Perona, P. The multidimensional wisdom of crowds. In Proc. of NIPS (2010), pp. 2424–2432.
- Toral, S. L., Martínez-Torres, M. R., Barrero, F., and Cortés, F. An empirical study of the driving forces behind online communities. Internet Research 19, 4 (2009), 378–392.
- Pal, A., Chang, S., and Konstan, J. Evolution of experts in question answering communities. In Proceedings of the International AAAI Conference on Weblogs and Social Media (2012), pp. 274–281.
- Nam, K., Ackerman, M., and Adamic, L. Questions in, knowledge in?: a study of Naver’s question answering community. In Proceedings of the 27th International Conference on Human Factors in Computing Systems (2009), pp. 779–788.