CS224W: Social and Information Network Analysis Lada Adamic
http://cs224w.stanford.edu
Stanford Social Web (ca. 1999)
network of personal homepages at Stanford
![Page 7: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/7.jpg)
different notions of centrality
In each of the following networks, X has higher centrality than Y according to a particular measure: indegree, outdegree, betweenness, closeness.
[Figure: four small networks, each with nodes X and Y, one per measure]
![Page 8: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/8.jpg)
review: indegree
[Figure: network with nodes X and Y]
![Page 9: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/9.jpg)
trade in petroleum and petroleum products, 1998, source: NBER-United Nations Trade Data
![Page 10: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/10.jpg)
- Which countries have high indegree (import petroleum and petroleum products from many others)?
  - Saudi Arabia
  - Japan
  - Iraq
  - USA
  - Venezuela
![Page 11: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/11.jpg)
review: outdegree
[Figure: network with nodes X and Y]
![Page 12: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/12.jpg)
[Figure: network of trade in petroleum and petroleum products, 1998; nodes are labeled with country names. Source: NBER-United Nations Trade Data]
![Page 13: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/13.jpg)
- Which country has low outdegree but exports a significant quantity of petroleum products (thickness of the edges represents the dollar value of exports)?
  - Saudi Arabia
  - Japan
  - Iraq
  - USA
  - Venezuela
[Figure: the same country-labeled petroleum trade network as the previous figure]
![Page 14: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/14.jpg)
[Figure: network of trade in crude petroleum and petroleum products, 1998; nodes are labeled with country names. Source: NBER-United Nations Trade Data]
![Page 15: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/15.jpg)
Undirected degree, e.g. nodes with more friends are more central.
Assumption: the connections that your friend has don't matter, it is what they can do directly that does (e.g. go have a beer with you, help you build a deck...)
putting numbers to it
![Page 16: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/16.jpg)
divide degree by the max. possible, i.e. (N-1)
normalization
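The normalization step can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: the function name and the 4-node star graph are made up for the example, and the graph is stored as a dictionary mapping each node to its set of neighbors.

```python
def normalized_degree_centrality(adjacency):
    """Degree of each node divided by the maximum possible degree, N - 1."""
    n = len(adjacency)
    return {node: len(neighbors) / (n - 1) for node, neighbors in adjacency.items()}


# Hypothetical 4-node star: the hub is connected to every other node.
star = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
centrality = normalized_degree_centrality(star)
print(centrality)
```

The hub reaches the maximum score of 1 because it is adjacent to all N - 1 other nodes, while each leaf scores 1/3.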
![Page 17: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/17.jpg)
centralization: skew in distribution
How much variation is there in the centrality scores among the nodes?
Freeman's general formula for centralization (can use other metrics, e.g. Gini coefficient or standard deviation):

$$C_D = \frac{\sum_{i=1}^{g} \left[ C_D(n^*) - C_D(i) \right]}{(N-1)(N-2)}$$

where $C_D(n^*)$ is the maximum value in the network and the sum runs over all $g = N$ nodes.
![Page 18: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/18.jpg)
CD = 0.167
CD = 0.167
CD = 1.0
degree centralization examples
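A short sketch of Freeman's degree centralization, using raw (unnormalized) degrees in the numerator. The two test graphs are assumptions chosen for illustration, since the graphs drawn on the slide are not recoverable from the text: a 5-node star and a 5-node path. Under this formula they give 1.0 and about 0.167, the same values that appear on the slide.

```python
def degree_centralization(adjacency):
    """Sum of (max degree - degree) over all nodes, divided by (N-1)(N-2)."""
    degrees = [len(neighbors) for neighbors in adjacency.values()]
    n = len(degrees)
    max_degree = max(degrees)
    return sum(max_degree - d for d in degrees) / ((n - 1) * (n - 2))


# Hypothetical examples: a star (one hub, four leaves) and a path of five nodes.
star5 = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
path5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}

star_score = degree_centralization(star5)
path_score = degree_centralization(path5)
print(star_score, round(path_score, 3))
```

The star is the most centralized graph possible for its size, which is why the denominator (N-1)(N-2) is exactly the numerator obtained for a star.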
![Page 19: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/19.jpg)
example financial trading networks
high in-centralization: one node buying from many others
low in-centralization: buying is more evenly distributed
real-world examples
![Page 20: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/20.jpg)
In what ways does degree fail to capture centrality in the following graphs?
![Page 21: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/21.jpg)
Stanford Social Web (ca. 1999)
network of personal homepages at Stanford
![Page 22: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/22.jpg)
![Page 25: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/25.jpg)
- intuition: how many pairs of individuals would have to go through you in order to reach one another in the minimum number of hops?
[Figure: network with nodes X and Y]
![Page 26: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/26.jpg)
Betweenness: definition

$$C_B(i) = \sum_{j<k} g_{jk}(i) / g_{jk}$$

where $g_{jk}$ = the number of shortest paths connecting $j$ and $k$, and $g_{jk}(i)$ = the number of those paths that actor $i$ is on.

Usually normalized by:

$$C_B'(i) = C_B(i) / [(n-1)(n-2)/2]$$

where $(n-1)(n-2)/2$ is the number of pairs of vertices excluding the vertex itself.
![Page 27: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/27.jpg)
- non-normalized version (figure: vertices A, B, C, D, E connected in a line):
  - A lies between no two other vertices
  - B lies between A and 3 other vertices: C, D, and E
  - C lies between 4 pairs of vertices: (A,D), (A,E), (B,D), (B,E)
  - note that there are no alternate paths for these pairs to take, so C gets full credit
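The counts above can be reproduced with a small betweenness routine. The sketch below uses breadth-first search with shortest-path counting and back-propagation of credit (the approach usually attributed to Brandes); it is an illustration rather than the course's own code, and the path A - B - C - D - E is assumed as the example graph.

```python
from collections import deque


def betweenness(adjacency):
    """Unnormalized betweenness for an undirected, unweighted graph."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # BFS from source, counting the number of shortest paths (sigma) to each node.
        dist = {source: 0}
        sigma = {v: 0 for v in adjacency}
        sigma[source] = 1
        preds = {v: [] for v in adjacency}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, passing fractional credit to predecessors.
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Every unordered pair was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}


# Assumed example graph: the path A - B - C - D - E.
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C", "E"}, "E": {"D"}}
scores = betweenness(path)

# Normalize by the number of pairs that exclude the vertex itself.
n = len(path)
normalized = {v: s / ((n - 1) * (n - 2) / 2) for v, s in scores.items()}
print(scores)
```

On this path, A and E score 0, B and D score 3, and C scores 4, matching the counts listed above.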
![Page 29: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/29.jpg)
- non-normalized version (figure: vertices A, B, C, D, E):
  - why do C and D each have betweenness 1?
  - They are both on shortest paths for the pairs (A,E) and (B,E), and so must share credit: ½ + ½ = 1
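Shared credit can be checked by listing every shortest path for every pair and splitting one unit of credit evenly among them. The graph below is an assumption: the slide's drawing is not recoverable from the text, so the edges A-B, B-C, B-D, C-E, D-E were chosen because they are consistent with the statement that C and D split the pairs (A,E) and (B,E). The brute-force path enumeration is only suitable for toy graphs.

```python
from collections import deque
from itertools import combinations


def shortest_paths(adjacency, source, target):
    """Return every shortest path from source to target as a list of nodes."""
    best, paths = None, []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        if best is not None and len(path) > best:
            break  # all remaining candidates are longer than the shortest path
        node = path[-1]
        if node == target:
            best = len(path)
            paths.append(path)
            continue
        for nxt in adjacency[node]:
            if nxt not in path:
                queue.append(path + [nxt])
    return paths


def betweenness_by_enumeration(adjacency):
    """Each pair hands out one unit of credit, split evenly over its shortest paths."""
    score = {v: 0.0 for v in adjacency}
    for j, k in combinations(sorted(adjacency), 2):
        paths = shortest_paths(adjacency, j, k)
        for path in paths:
            for v in path[1:-1]:  # interior vertices only
                score[v] += 1 / len(paths)
    return score


# Hypothetical graph consistent with the slide's description.
graph = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B", "E"},
    "D": {"B", "E"},
    "E": {"C", "D"},
}
scores = betweenness_by_enumeration(graph)
print(scores)
```

In this assumed graph, the pairs (A,E) and (B,E) each have two shortest paths, one through C and one through D, so each of C and D collects ½ + ½ = 1.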
![Page 31: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/31.jpg)
- What is the betweenness of node E?
![Page 32: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/32.jpg)
Lada’s old Facebook network: nodes are sized by degree, and colored by betweenness.
betweenness: example
![Page 33: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/33.jpg)
Q: high betweenness, low degree
- Find a node that has high betweenness but low degree
![Page 34: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/34.jpg)
Q: low betweenness, high degree
- Find a node that has low betweenness but high degree
![Page 35: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/35.jpg)
- What if it's not so important to have many direct friends?
- Or be "between" others
- But one still wants to be in the "middle" of things, not too far from the center
![Page 36: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/36.jpg)
need not be in a brokerage position
[Figure: four small networks, each with nodes X and Y]
![Page 37: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/37.jpg)
closeness: definition
Closeness is based on the length of the average shortest path between a node and all other nodes in the network.

Closeness Centrality:

$$C_C(i) = \left[ \sum_{j=1}^{N} d(i,j) \right]^{-1}$$

Normalized Closeness Centrality (the inverse of the average distance to the other $N-1$ nodes):

$$C_C'(i) = C_C(i) \cdot (N-1)$$
![Page 38: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/38.jpg)
Closeness: toy example (figure: vertices A, B, C, D, E connected in a line)

$$C_C'(A) = \left[ \frac{\sum_{j=1}^{N} d(A,j)}{N-1} \right]^{-1} = \left[ \frac{1+2+3+4}{4} \right]^{-1} = \left[ \frac{10}{4} \right]^{-1} = 0.4$$
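The toy calculation can be reproduced with a breadth-first search. This is a minimal sketch that assumes a connected, unweighted, undirected graph; the function name and the dictionary representation of the path A - B - C - D - E are choices made for the example.

```python
from collections import deque


def normalized_closeness(adjacency, node):
    """(N - 1) divided by the sum of shortest-path distances from node."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    # Assumes every node is reachable, so dist covers the whole graph.
    return (len(adjacency) - 1) / sum(dist.values())


# The path A - B - C - D - E from the toy example.
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C", "E"}, "E": {"D"}}
closeness_a = normalized_closeness(path, "A")
closeness_c = normalized_closeness(path, "C")
print(closeness_a, closeness_c)
```

The end node A has distances 1, 2, 3, 4 and scores 4/10 = 0.4, while the middle node C has distances 2, 1, 1, 2 and scores 4/6, so the middle of the path is closer to everyone else.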
![Page 39: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/39.jpg)
Closeness: more toy examples
![Page 40: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/40.jpg)
Q: high degree, low closeness
Which node has relatively high degree but low closeness?
![Page 41: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/41.jpg)
- How central you are depends on how central your neighbors are
![Page 42: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/42.jpg)
Bonacich eigenvector centrality

$$c_i(\beta) = \sum_j (\alpha + \beta c_j) A_{ji}$$

$$c(\beta) = \alpha (I - \beta A)^{-1} A \mathbf{1}$$

- α is a normalization constant
- β determines how important the centrality of your neighbors is
- A is the adjacency matrix (can be weighted)
- I is the identity matrix (1s down the diagonal, 0 off-diagonal)
- 1 is a column vector of all ones
![Page 43: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/43.jpg)
Bonacich Power Centrality: attenuation factor β
- small β ⇒ high attenuation: only your immediate friends matter, and their importance is factored in only a bit
- high β ⇒ low attenuation: global network structure matters (your friends, the friends of your friends, etc.)
- β = 0 yields simple degree centrality

$$c_i(\beta) = \sum_j (\alpha + \beta c_j) A_{ji}$$
![Page 44: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/44.jpg)
If β > 0, nodes have higher centrality when they have edges to other central nodes. If β < 0, nodes have higher centrality when they have edges to less central nodes.
Bonacich Power Centrality: attenuation factor β
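The effect of the sign of β can be explored numerically by solving the linear system (I − βA) c = α A 1. The sketch below is an illustration under assumptions: it uses a hand-written Gaussian elimination to stay dependency-free, sets α = 1 instead of choosing a normalizing value, and uses a hypothetical 5-node path as the test graph.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def bonacich(adj, beta, alpha=1.0):
    """c(beta) = alpha * (I - beta*A)^-1 * A * 1 for a symmetric adjacency matrix."""
    n = len(adj)
    degree = [float(sum(row)) for row in adj]  # A * 1 is the degree vector
    system = [
        [(1.0 if i == j else 0.0) - beta * adj[i][j] for j in range(n)]
        for i in range(n)
    ]
    return [alpha * x for x in solve(system, degree)]


# Hypothetical test graph: a path of five nodes, 0 - 1 - 2 - 3 - 4.
path5 = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
zero = bonacich(path5, 0.0)
positive = bonacich(path5, 0.25)
negative = bonacich(path5, -0.25)
print(zero, positive, negative)
```

On this path, β = 0 returns the degrees, β = 0.25 makes the middle node the most central, and β = −0.25 pushes the middle node below its two neighbors, because those neighbors are each attached to a weak end node while the middle node is attached only to comparatively strong ones.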
![Page 45: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/45.jpg)
β=.25
β=-.25
Why does the middle node have lower centrality than its neighbors when β is negative?
Bonacich Power Centrality: examples
![Page 46: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/46.jpg)
- WWW
- food webs
- population dynamics
- influence
- hereditary
- citation
- transcription regulation networks
- neural networks
![Page 47: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/47.jpg)
- We now consider the fraction of all directed shortest paths between any two vertices that pass through a node:

$$C_B(i) = \sum_{j,k} g_{jk}(i) / g_{jk}$$

where $C_B(i)$ is the betweenness of vertex $i$, $g_{jk}(i)$ is the number of paths between $j$ and $k$ that pass through $i$, and $g_{jk}$ is the number of all paths between $j$ and $k$.

- Only modification: when normalizing, we have (N-1)*(N-2) instead of (N-1)*(N-2)/2, because we have twice as many ordered pairs as unordered pairs:

$$C_B'(i) = C_B(i) / [(N-1)(N-2)]$$
![Page 48: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/48.jpg)
- A node does not necessarily lie on a geodesic (shortest path) from j to k if it lies on a geodesic from k to j
[Figure: directed example with vertices j and k]
![Page 49: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/49.jpg)
- choose a direction
  - in-closeness (e.g. prestige in citation networks)
  - out-closeness
- usually consider only vertices from which the node i in question can be reached
![Page 50: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/50.jpg)
- PageRank (centrality) brings order to the Web:
  - it's not just the pages that point to you, but how many pages point to those pages, etc.
  - more difficult to artificially inflate centrality with a recursive definition
[Figure labels: many webpages scattered across the web; an important page, e.g. slashdot; if a web page is slashdotted, it gains attention]
![Page 51: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/51.jpg)
- A random walker following edges in a network for a very long time will spend a proportion of time at each node which can be used as a measure of importance
![Page 52: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/52.jpg)
- Problem with pure random walk metric:
  - Drunk can be "trapped" and end up going in circles
![Page 53: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/53.jpg)
- Allow drunk to teleport with some probability
  - e.g. random websurfer follows links for a while, but with some probability teleports to a "random" page (bookmarked page or uses a search engine to start anew)
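The random surfer with teleportation can be sketched as a power iteration. This is an illustrative implementation, not the one used in the course demos: the 4-page link graph is made up, teleportation is uniform over all pages, and a page with no outgoing links is assumed to jump to a uniformly random page.

```python
def pagerank(out_links, teleport=0.2, iterations=100):
    """Power iteration for PageRank with uniform teleportation."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every page receives an equal share of the teleporting surfers.
        new_rank = {v: teleport / n for v in nodes}
        for v in nodes:
            targets = out_links[v] or nodes  # a dead end jumps anywhere
            share = (1 - teleport) * rank[v] / len(targets)
            for w in targets:
                new_rank[w] += share
        rank = new_rank
    return rank


# Hypothetical link graph: page "d" links out but nobody links to it.
links = {"a": ["b"], "b": ["c"], "c": ["a", "b"], "d": ["c"]}

low_teleport = pagerank(links, teleport=0.2)
high_teleport = pagerank(links, teleport=0.8)


def spread(rank):
    """Gap between the highest and lowest score."""
    return max(rank.values()) - min(rank.values())


print(low_teleport, spread(low_teleport), spread(high_teleport))
```

In this sketch the scores always sum to 1, and raising the teleportation probability pulls them toward the uniform value 1/N, since a larger share of each step ignores the link structure.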
![Page 54: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/54.jpg)
[Figure: example network with nodes 1 to 8 and bar charts of PageRank (0 to 1) per node at t=0 and t=1; 20% teleportation probability. Slide adapted from: Dragomir Radev]
![Page 55: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/55.jpg)
[Figure: the same network with nodes 1 to 8 and bar charts of PageRank (0 to 1) per node at t=0, t=1, and t=10. Slide from: Dragomir Radev]
![Page 56: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/56.jpg)
GUESS PageRank demo
- What happens to the relative PageRank scores of the nodes as you increase the teleportation probability (decrease the damping factor)?
  - they equalize
  - they diverge
  - they are unchanged
PageRank.nlogo is part of the built-in suite of network models for NetLogo.
![Page 57: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/57.jpg)
- Centrality
  - many measures: degree, betweenness, closeness, eigenvector
  - may be unevenly distributed
    - measure via distributions and centralization
  - in directed networks
    - indegree, outdegree, PageRank
  - consequences:
    - benefits & risks (Baker & Faulkner)
    - information flow & productivity (Aral & Van Alstyne)
![Page 58: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/58.jpg)
(time permitting)
9/23/15 Jure Leskovec and Lada Adamic, Stanford CS224W: Social and Information Network Analysis
![Page 61: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/61.jpg)
- The Response Time Gap
[Figure: box plot of waiting time (min), axis 0 to 10000, by expertise rating (low vs. high); N = 49 and 39]
- The Expertise Gap
- Difficult to infer reliability of answers
Automatically ranking expertise may be helpful.
Zhang, Ackerman, Adamic, WWW'07
![Page 62: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/62.jpg)
- 87 sub-forums
- 1,438,053 messages
- community expertise network constructed:
  - 196,191 users
  - 796,270 edges
![Page 63: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/63.jpg)
Thread 1: Large Data, binary search or hashtable? (user A)
- Re: Large... (user B)
- Re: Large... (user C)

Thread 2: Binary file with ASCII data (user A)
- Re: File with... (user C)

[Figure: four versions of the resulting network on users A, B, C, one per edge-weighting scheme, with the edge weights shown in each]
- unweighted: 1, 1
- weighted by # threads: 1, 2
- weighted by shared credit: 1/2, 1+1/2
- weighted with backflow: 0.9, 0.1
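The first three weighting schemes can be reproduced from the two example threads. The sketch below rests on assumptions that the slide text does not spell out: each edge runs from the asker to a replier, and "shared credit" means every thread hands out one unit split evenly among its repliers. The backflow scheme is left out because its definition is not given here.

```python
from collections import defaultdict

# Each thread is (asker, [repliers]); this encodes the two example threads.
threads = [("A", ["B", "C"]), ("A", ["C"])]


def build_edges(threads, scheme):
    """Return {(asker, replier): weight} under one of three weighting schemes."""
    weights = defaultdict(float)
    for asker, repliers in threads:
        distinct = set(repliers)
        for replier in distinct:
            if scheme == "unweighted":
                weights[(asker, replier)] = 1.0  # an edge exists or it does not
            elif scheme == "threads":
                weights[(asker, replier)] += 1.0  # one unit per thread answered
            elif scheme == "shared":
                weights[(asker, replier)] += 1.0 / len(distinct)  # split the unit
            else:
                raise ValueError(f"unknown scheme: {scheme}")
    return dict(weights)


unweighted = build_edges(threads, "unweighted")
by_threads = build_edges(threads, "threads")
shared = build_edges(threads, "shared")
print(unweighted, by_threads, shared)
```

Under these assumptions the edge to B carries 1, 1, and 1/2, and the edge to C carries 1, 2, and 1 + 1/2, which agrees with the weights shown for the first three schemes.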
![Page 64: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/64.jpg)
[Figure: cumulative probability (10^-4 to 10^0) vs. degree k (10^0 to 10^3) on log-log axes, for two series: number of people one received replies from, and number of people one replied to; power-law fit α = 1.87, R² = 0.9730]
- 'answer people' may reply to thousands of others
- 'question people' are also uneven in the number of repliers to their posts, but to a lesser extent
![Page 65: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/65.jpg)
The Web is a bow tie. The Java Forum network is an uneven bow tie.
- Core: a strongly connected component, in which everyone asks and answers
- IN: mostly askers
- OUT: mostly helpers
![Page 66: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/66.jpg)
![Page 67: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/67.jpg)
- Human-rated expertise levels
  - 2 raters
  - 135 JavaForum users with >= 10 posts
  - inter-rater agreement (τ = 0.74, ρ = 0.83)
  - for evaluation of algorithms, omit users where raters disagreed by more than 1 level (τ = 0.80, ρ = 0.83)

| L | Category | Description |
|---|---|---|
| 5 | Top Java expert | Knows the core Java theory and related advanced topics deeply. |
| 4 | Java professional | Can answer all or most Java concept questions. Also knows one or more subtopics very well. |
| 3 | Java user | Knows advanced Java concepts. Can program relatively well. |
| 2 | Java learner | Knows basic concepts and can program, but is not good at advanced topics of Java. |
| 1 | Newbie | Just starting to learn Java. |
![Page 68: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/68.jpg)
simple local measures do as well as (and better than) measures incorporating the wider network topology
[Figure: bar chart, scale 0 to 0.9, of Top K, Kendall's τ, and Spearman's ρ for each ranking measure: # answers, z-score # answers, indegree, z-score indegree, PageRank, HITS authority]
![Page 69: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/69.jpg)
[Figure: six box plots of automated ranking (vertical axis) against human rating (horizontal axis, LEVCOM), one per measure: # answers, z # answers, indegree, z indegree, PageRank, HITS authority]
![Page 70: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/70.jpg)
ExpertiseNet Simulator
Control Parameters:
- Distribution of expertise
- Who asks questions most often?
- Who answers questions most often?
  - best expert most likely
  - someone a bit more expert
![Page 71: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/71.jpg)
suppose:
- expertise is uniformly distributed
- probability of posing a question is inversely proportional to expertise
- $p_{ij}$ = probability a user with expertise $j$ replies to a user with expertise $i$

2 models:
- 'best' preferred: $p_{ij} \sim e^{\beta (j-i)} / i$
- 'just better' preferred: $p_{ij} \sim e^{\gamma (i-j)} / i$, $j > i$

[Figure: two heat maps of $p_{ij}$, asker expertise vs. replier expertise (axes 0 to 5), one per model]
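The two reply models can be tabulated to see how they differ. The sketch below takes one reading of the slide's formulas, namely p_ij proportional to e^(β(j−i))/i for 'best' preferred and e^(γ(i−j))/i with j > i for 'just better' preferred; that reading, the rate value of 1.0, the five expertise levels, and the normalization to a total of 1 are all assumptions made for illustration.

```python
import math


def reply_matrix(model, rate=1.0, levels=5):
    """Joint weight that level i asks and level j replies, normalized to sum to 1."""
    weights = {}
    for i in range(1, levels + 1):  # asker expertise
        for j in range(1, levels + 1):  # replier expertise
            if model == "best":
                # Higher-expertise repliers are always more likely.
                w = math.exp(rate * (j - i)) / i
            elif model == "just_better":
                # Only better users reply, and smaller gaps are more likely.
                w = math.exp(rate * (i - j)) / i if j > i else 0.0
            else:
                raise ValueError(f"unknown model: {model}")
            weights[(i, j)] = w
    total = sum(weights.values())
    return {pair: w / total for pair, w in weights.items()}


best = reply_matrix("best")
just_better = reply_matrix("just_better")


def most_likely_replier(matrix, asker, levels=5):
    """Expertise level j with the largest weight for a given asker level."""
    return max(range(1, levels + 1), key=lambda j: matrix[(asker, j)])


print(most_likely_replier(best, 1), most_likely_replier(just_better, 1))
```

For a level-1 asker, the 'best' model favors the level-5 expert, while the 'just better' model favors level 2, the smallest step up. The division by i makes low-expertise users the most frequent askers in both models.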
![Page 72: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/72.jpg)
Best “preferred” just better
![Page 73: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/73.jpg)
[Figure: asker indegree plots for three networks: best preferred (simulation), just better (simulation), and the Java Forum Network]
![Page 74: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/74.jpg)
Preferred Helper: ‘just better’
Preferred Helper: ‘best available’
![Page 75: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/75.jpg)
In the ‘just better’ model, a node is correctly ranked by PageRank but not by HITS
![Page 76: CS224W:’Social’and’Information’Network’Analysis ...](https://reader036.fdocuments.net/reader036/viewer/2022062301/62a5a1352dd8ba21776af20f/html5/thumbnails/76.jpg)
- Node centrality can reveal the relative importance of nodes within the network
- Choose a measure appropriate to the question you are asking