COMS 6998-06 Network Theory Week 4: September 29, 2010
COMS 6998-06 Network Theory
Week 4: September 29, 2010
Dragomir R. Radev
Wednesdays, 6:10-8 PM
325 Pupin Terrace
Fall 2010
(27) Self-similarity
Similarity and self-similarity
Sierpinski Gasket
See also Koch’s snowflake: http://en.wikipedia.org/wiki/Koch_snowflake http://www.arcytech.org/java/fractals/koch.shtml
The Cantor set
Measuring a fractal’s dimension
• In the Sierpinski gasket example, the first step needs 1 triangle of side 1, the second step needs 3 triangles of side ½, and the third step needs 9 triangles of side ¼.
• Let N(ε) be the number of triangles with side ε. Then the fractal dimension is:
\[ D = \lim_{\epsilon \to 0} \frac{\ln N(\epsilon)}{\ln (1/\epsilon)} \]
Box counting
N(1) = 1
N(1/2) = 3
N(1/4) = N((1/2)^2) = 9 = 3^2
N(1/8) = N((1/2)^3) = 27 = 3^3
…
N((1/2)^n) = 3^n
http://classes.yale.edu/fractals/FracAndDim/BoxDim/GasketBoxDim/GasketLogLog.html
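The counts above can be checked numerically. The following is a minimal sketch, not part of the original slides: it builds the gasket by repeated subdivision, counts the covering triangles of side (1/2)^n, and evaluates ln N(ε) / ln(1/ε). The function names (`midpoint`, `gasket_triangles`, `dimension_estimate`) are illustrative choices.

```python
import math

def midpoint(a, b):
    """Midpoint of two 2D points."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

def gasket_triangles(depth):
    """Triangles of side (1/2)**depth that cover the Sierpinski gasket."""
    triangles = [((0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2))]
    for _ in range(depth):
        refined = []
        for a, b, c in triangles:
            ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
            # Keep the three corner triangles; the middle one is removed.
            refined.extend([(a, ab, ca), (ab, b, bc), (ca, bc, c)])
        triangles = refined
    return triangles

def dimension_estimate(depth):
    """ln N(eps) / ln(1/eps) for eps = (1/2)**depth, with depth >= 1."""
    if depth < 1:
        raise ValueError("depth must be at least 1 (ln(1/eps) is 0 at depth 0)")
    n_covering = len(gasket_triangles(depth))
    epsilon = 0.5 ** depth
    return math.log(n_covering) / math.log(1 / epsilon)

for n in range(1, 6):
    print(n, len(gasket_triangles(n)), dimension_estimate(n))
```

Because N((1/2)^n) = 3^n exactly, the ratio equals ln 3 / ln 2 at every depth, so no limit needs to be taken for this particular fractal.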
Effective fractal dimension
• For a compact triangle:
– At the beginning, D = ln4/ln2 = 2
– After one iteration, D = ln16/ln4 = 2
• For the Sierpinski gasket: D = ln3/ln2 = 1.5850
• For the Koch curve: D = ln4/ln3 = 1.2619
• For the Cantor set: D = ln2/ln3 = 0.6309
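Each value above has the form ln(number of copies) / ln(scaling factor). A short sketch, with the helper name `similarity_dimension` and the dictionary layout as illustrative assumptions:

```python
import math

def similarity_dimension(copies, scale):
    """D = ln(copies) / ln(scale) for a shape made of `copies` pieces,
    each shrunk by a factor `scale`."""
    return math.log(copies) / math.log(scale)

# (copies, scale) pairs read off the bullets above.
fractals = {
    "compact triangle": (4, 2),
    "Sierpinski gasket": (3, 2),
    "Koch curve": (4, 3),
    "Cantor set": (2, 3),
}

for name, (copies, scale) in fractals.items():
    print(f"{name}: D = {similarity_dimension(copies, scale):.4f}")
```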
A self-similar fern
(7) Small world networks
The idea of a small world
• Milgram’s experiment (1960s)
• Send a package to a stockbroker in Boston
• 296 senders
• 20% reached target
• Chain length (avg) = 6.5
• Recent reenactment by Dodds et al. (2003) with 18 targets, 13 countries, 60K participants; only 384 reached the target, with a path length of 4.
The Watts-Strogatz model
• How to keep the diameter of a growing random graph small?
• Simple model: starts with a regular lattice.
• Two parameters:
– Coordination number z: how many neighbors each node has
– Shortcut probability p: for each existing edge, the probability of drawing a shortcut between two random nodes
– The expected total number of shortcuts is mp = nzp/2, where m = nz/2 is the number of lattice edges
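The construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the lattice is taken to be a one-dimensional ring, z is assumed even, and shortcuts are added on top of the lattice rather than rewired. The names `ring_lattice` and `add_shortcuts` are hypothetical.

```python
import random

def ring_lattice(n, z):
    """Adjacency sets for a ring of n nodes, each linked to z/2 neighbors per side."""
    if z % 2 != 0 or z >= n:
        raise ValueError("this sketch assumes an even z smaller than n")
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, z // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj

def add_shortcuts(adj, p, rng):
    """For each lattice edge, with probability p link two random distinct nodes.

    Returns the number of shortcut draws. A draw that repeats an existing
    edge is counted but does not change the graph.
    """
    n = len(adj)
    n_edges = sum(len(neighbors) for neighbors in adj.values()) // 2
    drawn = 0
    for _ in range(n_edges):
        if rng.random() < p:
            u, v = rng.sample(range(n), 2)
            adj[u].add(v)
            adj[v].add(u)
            drawn += 1
    return drawn

rng = random.Random(0)
graph = ring_lattice(1000, 10)
shortcuts = add_shortcuts(graph, 0.25, rng)
print("shortcuts drawn:", shortcuts, "expected about:", 1000 * 10 * 0.25 / 2)
```

The count from a single run fluctuates around nzp/2; only its average over many runs equals that value.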
Diameter
• Example (Amaral and Barthelemy, 1999): for lattice dimension d=1, N=1000, z=10, p=0.25, the diameter is 3.6
• If p=0.016 (≈1/64), the diameter is 7.6
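The effect of p on distances can be explored with a small simulation. The sketch below is an assumption-laden illustration rather than the cited authors' procedure: it uses a ring lattice with added shortcuts, and it estimates the mean shortest-path length from a sample of source nodes, which may differ from the exact quantity the slide calls the diameter. All function names are hypothetical, and a single run should not be expected to reproduce the quoted figures exactly.

```python
import random
from collections import deque

def small_world(n, z, p, rng):
    """Ring lattice (z/2 neighbors per side) plus random shortcuts."""
    adj = [set() for _ in range(n)]
    for v in range(n):
        for step in range(1, z // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    for _ in range(n * z // 2):  # one trial per lattice edge
        if rng.random() < p:
            u, v = rng.sample(range(n), 2)
            adj[u].add(v)
            adj[v].add(u)
    return adj

def mean_distance_from(adj, source):
    """Average BFS distance from `source` to every other reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return sum(dist.values()) / (len(dist) - 1)

def mean_path_length(adj, rng, n_sources=50):
    """Estimate the mean shortest-path length from a random sample of sources."""
    sources = rng.sample(range(len(adj)), n_sources)
    return sum(mean_distance_from(adj, s) for s in sources) / n_sources

rng = random.Random(1)
for p in (0.0, 1 / 64, 0.25):
    graph = small_world(1000, 10, p, rng)
    print(f"p = {p:.4f}: mean path length ~ {mean_path_length(graph, rng):.2f}")
```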
Clustering coefficient
• It mirrors the underlying lattice structure.
• According to (Barrat and Weigt, 2000):
\[ C = \frac{3(z-1)}{2(2z-1)} (1-p)^3 \]
• In the limit of large z (with p = 0), C = 3/4
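To get a feel for the numbers, the expression can be evaluated directly. This sketch takes the formula as C = 3(z-1) / (2(2z-1)) · (1-p)^3, which is a reading of the slide's garbled equation; the function name `clustering_estimate` is an illustrative choice, and symbol conventions for z differ between papers.

```python
def clustering_estimate(z, p):
    """C = 3(z-1) / (2(2z-1)) * (1-p)**3, as read from the slide."""
    return 3 * (z - 1) / (2 * (2 * z - 1)) * (1 - p) ** 3

# Clustering falls off as (1-p)^3 when shortcuts are introduced.
for p in (0.0, 0.016, 0.25, 1.0):
    print(f"z = 10, p = {p}: C = {clustering_estimate(10, p):.4f}")

# With p = 0 the value approaches 3/4 as z grows.
for z in (2, 10, 100, 10_000):
    print(f"z = {z}, p = 0: C = {clustering_estimate(z, 0.0):.4f}")
```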
Properties
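For lattices:
\[ l \approx \frac{N}{2K} \ \text{(large)}, \qquad C \approx \frac{3}{4} \ \text{(large)} \]
For random graphs:
\[ l \approx \frac{\ln N}{\ln K}, \qquad C \approx \frac{K}{N} \]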
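A quick numerical comparison makes the contrast concrete. The sketch below assumes the estimates l ≈ N/(2K), C ≈ 3/4 for lattices and l ≈ ln N / ln K, C ≈ K/N for random graphs, with N the number of nodes and K the number of neighbors per node; the function names are hypothetical.

```python
import math

def lattice_estimates(n, k):
    """Rough path length and clustering for a regular ring lattice."""
    return {"l": n / (2 * k), "C": 3 / 4}

def random_graph_estimates(n, k):
    """Rough path length and clustering for a random graph of mean degree k."""
    return {"l": math.log(n) / math.log(k), "C": k / n}

n, k = 1000, 10
print("lattice:     ", lattice_estimates(n, k))
print("random graph:", random_graph_estimates(n, k))
```

For N = 1000 and K = 10 the lattice estimate of l is 50, while the random-graph estimate is about 3; the clustering estimates are 0.75 and 0.01 respectively.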
Degree distribution
From (Barrat and Weigt, 2000)
Kleinberg model
• Use geographical distance (e.g., p ~ 1/d^2)
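A distance-dependent shortcut rule of this kind can be sketched as follows. The setup is an assumption for illustration only: nodes sit on a square grid, d is the Manhattan distance between them, and each node draws one long-range contact with probability proportional to 1/d^2. The names `lattice_distance` and `long_range_contact` are hypothetical.

```python
import random

def lattice_distance(u, v):
    """Manhattan distance between two grid points."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

def long_range_contact(u, side, rng, exponent=2.0):
    """Pick a contact for u on a side x side grid with P(v) ~ d(u, v)**-exponent."""
    candidates = [(x, y) for x in range(side) for y in range(side) if (x, y) != u]
    weights = [lattice_distance(u, v) ** -exponent for v in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

rng = random.Random(0)
center = (10, 10)
samples = [long_range_contact(center, 21, rng) for _ in range(2000)]
near = sum(1 for v in samples if lattice_distance(center, v) == 1)
far = sum(1 for v in samples if lattice_distance(center, v) == 10)
print("contacts at distance 1:", near, "at distance 10:", far)
```

Nearby nodes are chosen far more often than distant ones, but distant contacts still occur, which is what distinguishes this rule from uniformly random shortcuts.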
HW 1
• Analyze a network data set
• Submit a PR-style 6 page paper
• Check class home page for examples and instructions
• Model papers
– How to become a superhero, P. M. Gleiser, J. Stat. Mech. (2007) P09020 http://arxiv.org/abs/0708.2410
– The Political Blogosphere and the 2004 U.S. Election: Divided They Blog (2005) http://www.blogpulse.com/papers/2005/AdamicGlanceBlogWWW.pdf
– Patterns in syntactic dependency networks, Ramon Ferrer Cancho, Ricard V. Solé, and Reinhard Köhler, Phys. Rev. E 69, 051915 (2004) http://complex.upf.es/~ricard/syntaxPRE51915.pdf
– Network properties of written human language, A. P. Masucci and G. J. Rodgers, Phys. Rev. E 74, 026102 (2006) http://arxiv.org/abs/physics/0605071
– An evaluation of human protein-protein interaction data in the public domain, BMC Bioinformatics 2006, 7(Suppl 5):S19 http://www.biomedcentral.com/1471-2105/7/S5/S19/abstract
Database: This database is hand-curated. There are around 25,000 proteins and 35,000 interactions http://www.hprd.org/download
Examples
• program committees of conferences in NLP/CL or IR or ML
• Skitter (http://www.caida.org/tools/measurement/skitter/)
• syntactic dependencies
• mentions of named entities in text
• wikipedia
• social networking sites such as myspace, facebook, linkedin, etc.
• product recommendations for sites such as amazon, ebay, clothing sites, etc.
• youtube related videos
• adjective/noun network
• Two words are connected if one appears in the dictionary definition of another.
• analyze the AAN author network, collaboration network, or title network (two paper titles are connected if they share a non-stop word)
• people or locations that are mentioned in the same news story
• collocation networks (Dorogovtsev and Mendes)
• co-occurrence or other sentence graphs
• concept, thesaurus, and association graphs
• citation
• Web-related
• similarity-based (e.g., cosine)
• http://www.nd.edu/~networks/resources.htm
• http://deim.urv.cat/~aarenas/data/welcome.htm
• http://www-personal.umich.edu/~mejn/netdata/
• http://www.sciencemag.org/cgi/content/full/302/5651/1727