Editorial Collaboration Networks of Wikipedia Articles in Various Languages

44
Editorial Collaboration Networks of Wikipedia Articles in Various Languages Faculty of Policy Management, Keio University COINs2011 Takashi Iba Ko Matsuzuka Daiki Muramatsu

description

Our presentation about open-collaboration given at the International Conference on Collaborative Innovation Networks (COINs2011) in Basel, Switzerland, Sep. 9, 2011. The video of this presentation is available at the Livestream site http://www.livestream.com/coinsconference

Transcript of Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Page 1: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Faculty of Policy Management, Keio University

COINs2011

Takashi Iba Ko Matsuzuka

Daiki Muramatsu

Page 2: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

• The characteristics of collaboration patterns of all articles in a certain language.

• The commonality and differences of collaboration patterns among Wikipedias written in various languages.

Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Page 3: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Editorial Collaboration Networks of Wikipedia Articles in Various Languages

  Method: Sequential collaboration network

  Analysis 1: Comparison of 12 different languages

  Analysis 2: Distribution of account and IP users

  Analysis 3: Distribution of Featured Articles

Page 4: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Editorial Collaboration Networks of Wikipedia Articles in Various Languages

  Method: Sequential collaboration network

  Analysis 1: Comparison of 12 different languages

  Analysis 2: Distribution of account and IP users

  Analysis 3: Distribution of Featured Articles

Page 5: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Editor A �

Editor B �

Editor A �

Editor C�

an article 1 �

2 �

3 �

4 �

5 �

Editor A �

Editor B �

Editor C�

order�

Building a sequential collaboration network, connecting a relation from editor A to editor B, if editor B follows on work done by editor A.

  Method: Sequential collaboration network

Page 6: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Sequential Collaboration Network of Article “Collaborative Innovation Networks” in English Wikipedia

The number of Nodes = 51 Average path length = 6.399

Page 7: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Sequential Collaboration Network of Article “Basel” in English Wikipedia

The number of Nodes = 594 Average path length = 6.577

Page 8: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Sequential Collaboration Network of Article “Switzerland” in English Wikipedia

The number of Nodes = 3998 Average path length = 5.468

Page 9: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Sequential Collaboration Network of Article “Fondue” in English Wikipedia

The number of Nodes = 457 Average path length = 10.485

Page 10: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Editor A �

Editor B �

Editor A �

Editor C�

an article 1 �

2 �

3 �

4 �

5 �

Editor A �

Editor B �

Editor C�

order�

Building a sequential collaboration network, connecting a relation from editor A to editor B, if editor B follows on work done by editor A.

  Method: Sequential collaboration network

Page 11: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Our Previous Study: Featured Articles in English Wikipedia

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k Linear graph

T. Iba, K. Nemoto, B. Peters & P. Gloor, "Analyzing the Creative Editing Behavior of Wikipedia Editors Through Dynamical Social Network Analysis", COINs2011, 2009 T. Iba and S. Itoh, "Sequential Collaboration Network of Open Collaboration", NetSci'09, 2009

2,545 articles [Jun 27 2009]

Page 12: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Editorial Collaboration Networks of Wikipedia Articles in Various Languages

  Method: Sequential collaboration network

  Analysis 1: Comparison of 12 different languages

  Analysis 2: Distribution of account and IP users

  Analysis 3: Distribution of Featured Articles

Page 13: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Rank 1: English Rank 2: German Rank 3: French Rank 4: Polish Rank 5: Italian Rank 6: Japanese Rank 7: Spanish Rank 8: Dutch Rank 9: Portuguese Rank 10: Russian … Rank 15: Finnish … Rank 20: Turkish

Target Languages

Analyzing ALL articles as of January 1st, 2011 in each language.

The ranking based on the data as of January 6th, 2011.

 Analysis 1: Comparison of 12 different languages

Page 14: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

English Rank 1 3,490,325 articles

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Double logarithmic graph

Page 15: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

English Rank 1 3,490,325 articles

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Double logarithmic graph

Page 16: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

German Rank 2 1,155,210 articles

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Double logarithmic graph

Page 17: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

French Rank 3 1,039,251 articles

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Double logarithmic graph

Page 18: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Polish Rank 4 752,734 articles

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Double logarithmic graph

Page 19: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Italian Rank 5 750,634 articles

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Double logarithmic graph

Page 20: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Japanese Rank 6 718,974 articles

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Double logarithmic graph

Page 21: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Spanish Rank 7 676,866 articles

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Double logarithmic graph

Page 22: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Dutch Rank 8 656,079 articles

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Double logarithmic graph

Page 23: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Portuguese Rank 9 638,747 articles

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Double logarithmic graph

Page 24: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Russian Rank 10 627,139 articles

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Double logarithmic graph

Page 25: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Finnish Rank 15 255,712 articles

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Double logarithmic graph

Page 26: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Turkish Rank 20 152,262 articles

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Double logarithmic graph

Page 27: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

English German French Polish

Italian Japanese Spanish Dutch

Portuguese Russian Finnish Turkish

Page 28: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Result of Analysis 1: Comparison of 12 different languages

• Scatter plot of all articles exhibits a tilted triangle in all languages.

• The height of triangle gets shorter as the number of articles decreases.

Page 29: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Editorial Collaboration Networks of Wikipedia Articles in Various Languages

  Method: Sequential collaboration network

  Analysis 1: Comparison of 12 different languages

  Analysis 2: Distribution of account and IP users

  Analysis 3: Distribution of Featured Articles

Page 30: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

 Analysis 2: Distribution of account and IP users

IP users

Account users

Page 31: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Scatter plot of articles in English Wikipedia

Double logarithmic graph

Page 32: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Scatter plot of articles with number of IP users / number of total editors

Double logarithmic graph

0.0 PIP 1.0

Page 33: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

PIP = 0.0 PIP = 0.1 PIP = 0.2

PIP = 0.3 PIP = 0.4 PIP = 0.5

PIP = 0.6 PIP = 0.7 PIP = 0.8

Scatter plot of articles with number of IP users / number of total editors

Page 34: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Scatter plot of articles with number of IP users / number of total editors

Double logarithmic graph

0.0 PIP 1.0

Page 35: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Result of Analysis 2: Distribution of account and IP users

• Top and right area of the “triangle” in scatter plot consist of articles which ratios of users is high.

• As a result, both the average path length and order of network can be large in these areas.

PIP = 0.0 PIP = 0.6

Page 36: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Editorial Collaboration Networks of Wikipedia Articles in Various Languages

  Method: Sequential collaboration network

  Analysis 1: Comparison of 12 different languages

  Analysis 2: Distribution of account and IP users

  Analysis 3: Distribution of Featured Articles

Page 37: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

 Analysis 3: Distribution of Featured Articles

3,372 featured articles / 3,732,033 articles In English Wikipedia

Page 38: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Scatter plot of all articles in English Wikipedia

Double logarithmic graph

Page 39: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Scatter plot of featured articles on the all articles in English Wikipedia

Double logarithmic graph

Page 40: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

The order of each sequential collaboration network (The number of editors in each article)

The

aver

age

path

leng

th o

f e

ach

sequ

entia

l col

labo

ratio

n ne

twor

k

Scatter plot of featured articles on the all articles in English Wikipedia

Double logarithmic graph

Page 41: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Result of Analysis 3: Distribution of Featured Articles

• Features articles are located at a certain area in the scatter plot.

• It implies that there would be characteristic patterns of collaboration producing good results.

Page 42: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Editorial Collaboration Networks of Wikipedia Articles in Various Languages

  Method: Sequential collaboration network

  Analysis 1: Comparison of 12 different languages

  Analysis 2: Distribution of account and IP users

  Analysis 3: Distribution of Featured Articles

Page 43: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Editorial Collaboration Networks of Wikipedia Articles in Various Languages

• Scatter plot of all articles commonly exhibits a tilted triangle in all languages, but the height of triangle gets shorter as the number of articles decreases.

• Top and right area of the “triangle” in scatter plot consist of articles which the ratios of IP users are high.

• Features articles are located at a certain area in the scatter plot.

Page 44: Editorial Collaboration Networks of Wikipedia Articles in Various Languages

Collaborators

Natsumi Yotsumoto

Bui Hong Ha Daiki Muramatsu

Takashi Iba

Ko Matsuzuka

Associate Professor, Faculty of Policy Management,

Keio University Ph.D. in media and governance

Iba Lab. Faculty of Policy Management,

Keio University

Iba Lab. Faculty of Policy Management,

Keio University

Former student of Iba Lab. Faculty of Policy Management,

Keio University

Former student of Iba Lab. Faculty of Policy Management,

Keio University

“Editorial Collaboration Networks of Wikipedia Articles in Various Languages”

Contact us: e-mail to [email protected]