Social Network Theory


1. Social Network Theory

  • Networked Life, CSE 112, Spring 2006
  • Prof. Michael Kearns

2. Natural Networks and Universality

  • Consider the many kinds of networks we have examined:
    • social, technological, business, economic, content, …
  • These networks tend to share certain informal properties:
    • large scale; continual growth
    • distributed, organic growth: vertices decide who to link to
    • interaction (largely) restricted to links
    • mixture of local and long-distance connections
    • abstract notions of distance: geographical, content, social, …
  • Do natural networks share more quantitative universals?
  • What would these universals be?
  • How can we make them precise and measure them?
  • How can we explain their universality?
  • This is the domain of social network theory
  • Sometimes also referred to as link analysis

3. Some Interesting Quantities

  • Connected components:
    • how many, and how large?
  • Network diameter:
    • maximum (worst-case) or average?
    • exclude infinite distances? (disconnected components)
    • the small-world phenomenon
  • Clustering:
    • to what extent do links tend to cluster locally?
    • what is the balance between local and long-distance connections?
    • what roles do the two types of links play?
  • Degree distribution:
    • what is the typical degree in the network?
    • what is the overall distribution?
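
The quantities listed above can be made concrete on a toy graph. The following is a minimal sketch, not the course's own code: the adjacency-dictionary representation, the function names, and the toy graph are all assumptions made for illustration. Distances are computed by breadth-first search, and infinite distances (pairs in different components) are excluded.

```python
from collections import Counter, deque

def bfs_distances(adj, source):
    """Return {vertex: hop distance} for every vertex reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def connected_components(adj):
    """Return a list of vertex sets, one per connected component."""
    seen, components = set(), []
    for v in adj:
        if v not in seen:
            comp = set(bfs_distances(adj, v))
            seen |= comp
            components.append(comp)
    return components

def diameter_stats(adj):
    """Worst-case and average distance over ordered pairs at finite distance.

    Assumes the graph has at least one edge.
    """
    finite = [d for s in adj for t, d in bfs_distances(adj, s).items() if t != s]
    return max(finite), sum(finite) / len(finite)

def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbors that are themselves linked."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return links / (k * (k - 1) / 2)

def degree_distribution(adj):
    """Map each degree to the number of vertices having that degree."""
    return Counter(len(nbrs) for nbrs in adj.values())

# Hypothetical toy graph: triangle a-b-c with a tail c-d, plus a separate edge e-f.
toy = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"},
    "e": {"f"}, "f": {"e"},
}

print(len(connected_components(toy)))     # 2 components
print(diameter_stats(toy))                # worst case 2, average 18/14
print(clustering_coefficient(toy, "c"))   # 1 of 3 neighbor pairs is linked
print(degree_distribution(toy))
```

On this toy graph the worst-case and average diameters differ (2 versus about 1.29), which is why the slide asks which one is meant.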

4. A Canonical Natural Network has

  • Few connected components:
    • often only 1 or a small number independent of network size
  • Small diameter:
    • often a constant independent of network size (like 6)
    • or perhaps growing only logarithmically with network size
    • typically exclude infinite distances
  • A high degree of clustering:
    • considerably more so than for a random network
    • in tension with small diameter
  • A heavy-tailed degree distribution:
    • a small but reliable number of high-degree vertices
    • quantifies Gladwell's "connectors"
    • often of power law form

5. Some Models of Network Generation

  • Random graphs (Erdős-Rényi models):
    • give few components and small diameter
    • do not give high clustering or heavy-tailed degree distributions
    • are the most thoroughly studied and best understood models mathematically
  • Watts-Strogatz and related models:
    • give few components, small diameter and high clustering
    • do not give heavy-tailed degree distributions
  • Preferential attachment:
    • gives few components, small diameter and heavy-tailed distribution
    • does not give high clustering
  • Hierarchical networks:
    • few components, small diameter, high clustering, heavy-tailed
  • Affiliation networks:
    • models group-actor formation
  • Nothing magic about any of the measures or models
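
Two of the models above are simple enough to sketch side by side. This is an illustrative sketch under stated assumptions, not the course's definition of either model: the random graph is taken to be G(n, p) with independent edges, and preferential attachment is simplified so that each new vertex adds exactly one link, chosen with probability proportional to current degree. The function names and parameter values are hypothetical.

```python
import random

def erdos_renyi(n, p, rng):
    """G(n, p): each of the n*(n-1)/2 possible edges appears independently with probability p."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def preferential_attachment(n, rng):
    """Each new vertex links to one existing vertex, chosen proportionally to its degree."""
    adj = {0: {1}, 1: {0}}
    endpoints = [0, 1]  # every vertex appears once per incident edge
    for new in range(2, n):
        target = rng.choice(endpoints)  # uniform over endpoints = proportional to degree
        adj[new] = {target}
        adj[target].add(new)
        endpoints += [new, target]
    return adj

def max_degree(adj):
    return max(len(nbrs) for nbrs in adj.values())

rng = random.Random(0)  # fixed seed only so the run is repeatable
n = 1000
er = erdos_renyi(n, 2 / n, rng)        # expected degree close to 2
pa = preferential_attachment(n, rng)   # average degree close to 2 as well
print("max degree, random graph:           ", max_degree(er))
print("max degree, preferential attachment:", max_degree(pa))
```

Both networks have roughly the same average degree, so a much larger maximum degree in the preferential attachment network is a rough sign of the heavy tail that the random graph model lacks.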

6. Approximate Roadmap

  • Examine a series of models of network generation
    • macroscopic properties they do and do not entail
    • pros and cons of each model
  • Examine some real life case studies
  • Study some dynamics issues (e.g. navigation)
  • Move into in-depth study of the Web as a network

7. Probabilistic Models of Networks

  • All of the network generation models we will study are probabilistic or statistical in nature
  • They can generate networks of any size
  • They often have various parameters that can be set:
    • size of network generated
    • average degree of a vertex
    • fraction of long-distance connections
  • The models generate a distribution over networks
  • Statements are always statistical in nature:
    • with high probability, diameter is small
    • on average, degree distribution has heavy tail
  • Thus, we're going to need some basic statistics and probability theory
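
To see what a statistical statement about a model looks like in practice, one can draw many networks from the model and measure how often a property holds. The sketch below assumes the independent-edge random graph as the model, with network size and average degree as its two parameters. The property checked (connectivity), the function names, and the parameter values are illustrative choices, not taken from the slides.

```python
import random

def sample_network(n, avg_degree, rng):
    """Sample a random graph whose edge probability gives expected degree avg_degree."""
    p = avg_degree / (n - 1)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def is_connected(adj):
    """Depth-first search from an arbitrary vertex; connected if it reaches everything."""
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for v in adj[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj)

def mean_degree(adj):
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

rng = random.Random(2006)  # fixed seed only so the run is repeatable
samples = [sample_network(50, 10, rng) for _ in range(200)]
frac_connected = sum(is_connected(g) for g in samples) / len(samples)
avg_mean_degree = sum(mean_degree(g) for g in samples) / len(samples)
print(f"fraction of sampled networks that are connected: {frac_connected:.2f}")
print(f"mean degree, averaged over samples: {avg_mean_degree:.2f}")
```

No single sampled network proves anything about the model. The claim is about the fraction of samples with the property, which is what "with high probability" refers to.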

8. Statistics and Probability Theory: The Absolute, Bare-Minimum Essentials

9. Probability and Random Variables

  • A random variable X is simply a variable that probabilistically assumes values in some set
    • set of possible values sometimes called the sample space S of X
    • sample space may be small and simple or large and complex
      • S = {Heads, Tails}, X is outcome of a coin flip
      • S = {0, 1, …, U.S. population size}, X is number voting Democratic
      • S = all networks of size N, X is generated by preferential attachment
  • Behavior of X determined by its distribution (or density)
    • for each value x in S, specify Pr[X = x]
    • these probabilities sum to exactly 1 (mutually exclusive outcomes)
    • complex sample spaces (such as large networks):
      • distribution often defined implicitly by simpler components
      • might specify the probability that each edge appears independently
      • this induces a probability distribution over networks
      • may be difficult to compute induced distribution
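
For a very small network the induced distribution can be written out in full. The sketch below assumes each possible edge appears independently with the same probability p. It enumerates every network on 3 vertices, checks that the probabilities sum to 1, and adds up the probability of being connected. The function names are hypothetical.

```python
from itertools import combinations, product

def induced_distribution(n, p):
    """Map each edge set (a frozenset of vertex pairs) to its probability under independent edges."""
    possible = list(combinations(range(n), 2))
    dist = {}
    for present in product([False, True], repeat=len(possible)):
        edges = frozenset(e for e, keep in zip(possible, present) if keep)
        k = len(edges)
        dist[edges] = p ** k * (1 - p) ** (len(possible) - k)
    return dist

def is_connected(n, edges):
    """Search outward from vertex 0 along the given edges."""
    reached, frontier = {0}, [0]
    while frontier:
        u = frontier.pop()
        for a, b in edges:
            if u in (a, b):
                v = b if u == a else a
                if v not in reached:
                    reached.add(v)
                    frontier.append(v)
    return len(reached) == n

dist = induced_distribution(3, 0.5)
total = sum(dist.values())
pr_connected = sum(pr for edges, pr in dist.items() if is_connected(3, edges))
print(len(dist), "networks; probabilities sum to", total)
print("Pr[network is connected] =", pr_connected)
```

With 3 vertices there are only 8 networks, but the count is 2^(n(n-1)/2) in general, so this brute-force enumeration stops being feasible almost immediately. That is one concrete sense in which the induced distribution is difficult to compute.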

10. Some Basic Notions and Laws

  • Independence:
    • let X and Y be random variables
    • independence: for any x and y, Pr[X = x & Y = y] = Pr[X = x] Pr[Y = y]
    • intuition: value of X does not influence value of Y, and vice versa
    • dependence:
      • e.g. X, Y coin flips, but Y is always opposite of X
  • Expected (mean) value of X:
    • only makes sense for numeric random variables
    • average value of X according to its distribution
    • formally, E[X] = Σ (Pr[X = x] * x), sum is over all x in S
    • often denoted by μ
    • always true: E[X + Y] = E[X] + E[Y]
    • for independent random variables: E[XY] = E[X]E[Y]
  • Variance of X:
    • Var(X) = E[(X - μ)^2]; often denoted by σ^2
    • standard deviation is sqrt(Var(X)) = σ
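
These laws can be checked exactly on the coin-flip example. The sketch below assumes a fair coin coded as tails = 0 and heads = 1, and represents a distribution as a dictionary from outcomes to probabilities. The coding and the helper names are illustrative assumptions. Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)
coin = {0: half, 1: half}  # Pr[X = x] for a fair coin, tails = 0, heads = 1

def expectation(dist, f=lambda outcome: outcome):
    """E[f(X)] = sum over outcomes of Pr[outcome] * f(outcome)."""
    return sum(pr * f(outcome) for outcome, pr in dist.items())

def variance(dist):
    """Var(X) = E[(X - mu)^2] with mu = E[X]."""
    mu = expectation(dist)
    return expectation(dist, lambda x: (x - mu) ** 2)

# Joint distributions over pairs (x, y).
independent = {(x, y): coin[x] * coin[y] for x, y in product(coin, coin)}
opposite = {(x, 1 - x): coin[x] for x in coin}  # Y is always the opposite of X

for name, joint in [("independent", independent), ("opposite", opposite)]:
    e_sum = expectation(joint, lambda xy: xy[0] + xy[1])
    e_prod = expectation(joint, lambda xy: xy[0] * xy[1])
    print(f"{name}: E[X + Y] = {e_sum}, E[XY] = {e_prod}")

print("E[X] =", expectation(coin), " Var(X) =", variance(coin))
```

E[X + Y] equals 1 in both cases, as linearity of expectation promises. E[XY] equals E[X]E[Y] = 1/4 only for the independent pair; for the opposite pair it is 0.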

11. Convergence to Expectations

  • Let X1, X2, …, Xn be:
    • independent random variables
    • with the same distribution Pr[X = x]
    • expectation μ = E[X] and variance σ^2
    • independent and identically distributed (i.i.d.)
    • essentially n repeated trials of the same experiment
    • natural to examine r.v. Z = (1/n) Σ Xi, where sum is over i = 1, …, n
    • example: fraction of heads in a sequence of n coin flips
    • example: degree of a vertex in the random graph model (rescaled by 1/n)
    • E[Z] = E[X]; what can we say about the distribution of Z?
  • Central Limit Theorem:
    • as n becomes large, Z becomes approximately normally distributed
      • with expectation μ and variance σ^2/n
    • here's a demo
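
A small simulation can stand in for a demo of this convergence. The sketch below assumes fair coin flips (heads = 1, tails = 0), so a single flip has mean 0.5 and variance 0.25. The number of flips, the number of trials, and the seed are arbitrary illustrative choices.

```python
import random
import statistics

def sample_mean(n, rng):
    """Z = (1/n) * sum of n i.i.d. fair coin flips (heads = 1, tails = 0)."""
    return sum(rng.random() < 0.5 for _ in range(n)) / n

rng = random.Random(112)  # fixed seed only so the run is repeatable
n, trials = 100, 5000
zs = [sample_mean(n, rng) for _ in range(trials)]

mu, sigma2 = 0.5, 0.25  # mean and variance of a single fair coin flip
print("empirical E[Z]:  ", statistics.mean(zs), " theory:", mu)
print("empirical Var(Z):", statistics.pvariance(zs), " theory:", sigma2 / n)

# Crude text histogram of Z, one row per observed value of Z.
counts = {}
for z in zs:
    bucket = round(z, 2)
    counts[bucket] = counts.get(bucket, 0) + 1
for bucket in sorted(counts):
    print(f"{bucket:.2f} {'#' * (counts[bucket] // 20)}")
```

The empirical mean and variance of Z should land close to the theoretical values, and the histogram should look roughly bell-shaped around 0.5.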

12. The Normal Distribution

  • The normal or Gaussian density:
    • applies to continuous, real-valued random variables
    • characterized by mean (average) μ and standard deviation σ
    • density at x is defined as
      • (1/(σ sqrt(2π))) exp(-(x - μ)^2 / (2σ^2))
      • special case μ = 0, σ = 1: a exp(-x^2/b) for some constants a, b > 0
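
The density formula translates directly into code. The sketch below is illustrative: the function name `normal_density` and the test points are assumptions. It cross-checks the formula against `statistics.NormalDist` from the Python standard library, confirms numerically that the density integrates to about 1, and spells out the constants of the special case as a = 1/sqrt(2π) and b = 2.

```python
import math
from statistics import NormalDist

def normal_density(x, mu=0.0, sigma=1.0):
    """(1 / (sigma * sqrt(2*pi))) * exp(-(x - mu)^2 / (2 * sigma^2))"""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Special case mu = 0, sigma = 1: density is a * exp(-x^2 / b).
a = 1 / math.sqrt(2 * math.pi)
b = 2.0
print(normal_density(0.7), a * math.exp(-0.7 ** 2 / b))

# Cross-check against the standard library for a non-standard mean and deviation.
print(normal_density(1.5, mu=1.0, sigma=2.0), NormalDist(1.0, 2.0).pdf(1.5))

# Riemann sum over [-8, 8]: the total area under the density should be about 1.
step = 0.001
area = sum(normal_density(-8 + i * step) * step for i in range(16000))
print("area under the standard normal density:", area)
```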