ICPSR - Complex Systems Models in the Social Sciences - Lecture 3 - Professor Daniel Martin Katz

Post on 21-Jan-2015


Daniel Martin Katz

Michigan State University College of Law

Complex Systems Models in the Social Sciences

(Lecture 3)

Back to Where We Ended Our Last Class

Stanley Milgram’s Other Experiment

Milgram was interested in the structure of society

Including the social distance between individuals

While the term “six degrees” is often attributed to Milgram, it can be traced to ideas from Hungarian author Frigyes Karinthy

What is the average distance between two individuals in society?

Six Degrees of Separation?

[Figure: NE (Nebraska) to MA (Massachusetts)]

Target person worked in Boston as a stockbroker

296 senders from Boston and Omaha.

20% of senders reached target.

Average chain length = 6.5.

And So the term ... “Six degrees of Separation”

Six Degrees

Six Degrees is the claim that the “average path length” between two individuals in society is ~6

The idea of ‘Six Degrees’ was popularized through plays, movies, and the Kevin Bacon game

http://oracleofbacon.org/

Six Degrees of Kevin Bacon

Visualization Source: Duncan J. Watts, Six Degrees

But What is Wrong with Milgram’s Logic?

$150^2 = 22,500$

$150^3 = 3,375,000$

$150^4 = 506,250,000$

$150^5 = 75,937,500,000$
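The arithmetic above can be reproduced in a few lines. This is only a sketch of the naive reasoning, assuming every person has exactly 150 acquaintances and that no two people ever share an acquaintance; the function name `naive_reach` is made up for illustration.

```python
def naive_reach(friends_per_person: int, steps: int) -> int:
    """People reached after `steps` hops if acquaintance circles never overlap."""
    return friends_per_person ** steps


# Reproduce the figures on the slide (2 through 5 hops).
for steps in range(2, 6):
    print(steps, f"{naive_reach(150, steps):,}")
```

The no-overlap assumption is the weak point: if my friends' friends are largely my own friends, each hop reaches far fewer new people than 150 times the previous count.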

The Strength of ‘Weak’ Ties

Does Milgram get it right? (Mark Granovetter)

Visualization Source: Early Friendster – MIT Network

www.visualcomplexity.com

Strong and Weak Ties (Clustered v. Spanning)

Clustering: my friends’ friends are also likely to be friends

So Was Milgram Correct?

Small Worlds (i.e. Six Degrees) was both a theoretical and an empirical claim

The Theoretical Account Was Incorrect

The Empirical Claim was still intact

Question: how could real social networks display both small worlds and clustering?

At the same time, the Strength of Weak Ties was also a theoretical and empirical proposition

Watts and Strogatz (1998)

A few random links in an otherwise clustered graph yield the types of small world properties found by Milgram

“Randomness” is the key bridge between the small world result and the clustering that is commonly observed in real social networks

Watts and Strogatz (1998)

A small amount of random rewiring, or something akin to weak ties, allows for both clustering and small worlds

[Figure labels: Locally Clustered, Random Graph]
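The rewiring idea can be tried out with a small pure-Python sketch. It is not the authors' code: it builds a ring lattice, rewires each edge with probability `p`, and compares average clustering and average shortest path length. The sizes (200 nodes, 4 neighbors), the rewiring probability (0.1), the seed, and all function names are illustrative assumptions.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbors (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def rewire(adj, p, rng):
    """Return a copy in which each original edge is rewired with probability p."""
    new = {i: set(nbrs) for i, nbrs in adj.items()}
    for i in sorted(adj):
        for j in sorted(adj[i]):
            if i < j and rng.random() < p:
                # Candidate targets: not i itself, not already linked to i.
                choices = [t for t in new if t != i and t not in new[i]]
                if choices:
                    t = rng.choice(choices)
                    new[i].discard(j)
                    new[j].discard(i)
                    new[i].add(t)
                    new[t].add(i)
    return new


def average_clustering(adj):
    """Mean local clustering; nodes with fewer than 2 neighbors count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


rng = random.Random(42)
lattice = ring_lattice(200, 4)
small_world = rewire(lattice, 0.1, rng)
for name, graph in [("lattice", lattice), ("rewired", small_world)]:
    print(name, round(average_clustering(graph), 3),
          round(average_path_length(graph), 2))
```

With these settings the rewired graph should keep much of the lattice's clustering while its average path length falls sharply, which is the small-world combination described above.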

Different Forms of Network Representation

1 mode

2 mode

2 mode: Actors and Movies

Different Forms of Network Representation

1 mode

Actor to Actor

Could be Binary (0,1)

Did they Co-Appear?

Different Forms of Network Representation

1 mode

Actor to Actor

Could also be Weighted

(i.e., edge weights by number of co-appearances)
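To make the 2-mode versus 1-mode distinction concrete, here is a small sketch that projects a movie-to-cast table onto an actor-to-actor network, in both the weighted and the binary form. The movie titles and actor names are placeholders, not data from the lecture.

```python
from collections import Counter
from itertools import combinations

# Hypothetical 2-mode data: each movie maps to its cast.
casts = {
    "Movie 1": ["Actor A", "Actor B", "Actor C"],
    "Movie 2": ["Actor A", "Actor B"],
    "Movie 3": ["Actor C", "Actor D"],
}

# Weighted 1-mode projection: edge weight = number of co-appearances.
weighted = Counter()
for cast in casts.values():
    for pair in combinations(sorted(set(cast)), 2):
        weighted[pair] += 1

# Binary 1-mode projection: 1 if the pair ever co-appeared.
binary = {pair: 1 for pair in weighted}

for pair, weight in sorted(weighted.items()):
    print(pair, "weight:", weight, "binary:", binary[pair])
```

Pairs that never co-appear are simply absent from both dictionaries, which corresponds to a 0 entry in the actor-to-actor matrix.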

Features of Networks

Mesoscopic: Community Structures (we will discuss these next week)

Macroscopic: Graph Level Properties (we will discuss these today)

Microscopic: Node Level Properties (we will discuss these next week)

Macroscopic Graph Level Properties

Degree Distributions (Outdegree & Indegree)

Clustering Coefficients

Connected Components

Shortest Paths

Density

Shortest Paths

The shortest set of links connecting two nodes

Also known as the geodesic path

In many graphs, there are multiple shortest paths

Shortest Paths

A and C are connected by 2 shortest paths

A – E – B - C

A – E – D - C

Diameter: the largest geodesic distance in the graph

The distance between A and C is the maximum for the graph: 3
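The example can be checked with a breadth-first search. The slide's figure is not reproduced in this transcript, so the edge list below is an assumption: it contains only the five edges implied by the two paths listed above.

```python
from collections import deque

# Assumed edge list, inferred from the two shortest paths named on the slide.
edges = [("A", "E"), ("E", "B"), ("E", "D"), ("B", "C"), ("D", "C")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def distances_from(source):
    """Geodesic distance from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def all_shortest_paths(source, target):
    """Every geodesic path from source to target, as lists of nodes."""
    dist = distances_from(source)
    if target not in dist:
        return []
    paths = []

    def walk(node, path):
        if node == source:
            paths.append(path[::-1])
            return
        # Step back only through nodes one hop closer to the source.
        for prev in adj[node]:
            if dist.get(prev) == dist[node] - 1:
                walk(prev, path + [prev])

    walk(target, [target])
    return sorted(paths)


diameter = max(max(distances_from(node).values()) for node in adj)
print(all_shortest_paths("A", "C"))
print("diameter:", diameter)
```

On this assumed graph the search finds the two A to C paths named above and a diameter of 3.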

Shortest Paths

In the Watts-Strogatz model, shortest paths get shorter as the level of random rewiring increases

Clustering Coefficients

Measure of the tendency of nodes in a graph to cluster

There is a graph-level version: an average clustering value for the whole graph

There is also a local version, which measures the cliquishness of a single node's neighborhood
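A minimal sketch of both versions follows, assuming the common definition in which a node's local coefficient is the fraction of pairs of its neighbors that are themselves linked, and the graph-level value is the mean of the local values. Treating nodes with fewer than two neighbors as 0 is one convention among several. The four-node graph is a made-up example.

```python
# Hypothetical undirected graph: a triangle A-B-C plus a pendant node D on C.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}


def local_clustering(adj, node):
    """Fraction of pairs of node's neighbors that are linked to each other."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention assumed here for degree-0 and degree-1 nodes
    links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
    return 2 * links / (k * (k - 1))


def average_clustering(adj):
    """Graph-level value: mean of the local coefficients."""
    return sum(local_clustering(adj, node) for node in adj) / len(adj)


for node in sorted(adj):
    print(node, round(local_clustering(adj, node), 3))
print("average:", round(average_clustering(adj), 3))
```

Here A and B sit in a closed triangle (coefficient 1), C has three neighbors of which only one pair is linked (1/3), and D has a single neighbor (0).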

Density

Density: of the connections that could exist between n nodes, what fraction are present?

directed graph: $e_{max} = n(n-1)$ (each of the n nodes can connect to (n-1) other nodes)

undirected graph: $e_{max} = n(n-1)/2$ (since edges are undirected, count each one only once)

density = $e / e_{max}$

For example, out of 12 possible connections, this graph has 7, giving it a density of 7/12 ≈ 0.58

A “fully connected” graph has a density = 1
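The density formulas translate directly into code. In this sketch the function names are illustrative, and reading the 12-connection example as a directed graph on 4 nodes is an inference from 4 × 3 = 12, since the figure itself is not in the transcript.

```python
def max_edges(n: int, directed: bool) -> int:
    """Largest possible number of edges among n nodes (no self-loops)."""
    return n * (n - 1) if directed else n * (n - 1) // 2


def density(e: int, n: int, directed: bool) -> float:
    """Fraction of possible edges that are present."""
    return e / max_edges(n, directed)


# The slide's example, read as a directed graph on 4 nodes with 7 arcs.
print(max_edges(4, directed=True))
print(round(density(7, 4, directed=True), 2))

# A fully connected undirected graph on 4 nodes has all 6 edges.
print(density(6, 4, directed=False))
```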

Connected Components

We are often interested in whether the graph has a single or multiple connected components

Strong Components

Giant Component

Weak Components

“Largest Weakly Connected Component” in the SCOTUS Citation Network

There exist cases that are not in this visual as they are disconnected as of the year 1830

However, by 2009, 99% of SCOTUS Decisions are in the Largest Weakly Connected Component
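Weak components can be found by ignoring arc direction and running a breadth-first search. The sketch below uses a tiny made-up citation list (the `case1` to `case8` labels are placeholders, not SCOTUS data) and reports the share of nodes in the largest weakly connected component.

```python
from collections import deque


def weak_components(nodes, arcs):
    """Weakly connected components of a directed graph, largest first."""
    adj = {node: set() for node in nodes}
    for src, dst in arcs:
        # Ignore direction: a citation links both cases for this purpose.
        adj[src].add(dst)
        adj[dst].add(src)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in component:
                    component.add(v)
                    queue.append(v)
        seen |= component
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Hypothetical data: (citing case, cited case). case3 and case8 are isolated.
nodes = [f"case{i}" for i in range(1, 9)]
arcs = [("case4", "case1"), ("case4", "case2"),
        ("case5", "case4"), ("case7", "case6")]

components = weak_components(nodes, arcs)
largest_share = len(components[0]) / len(nodes)
print(components[0], largest_share)
```

Strong components would additionally require a directed path in both directions between every pair, which this sketch does not compute.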

Connected Components

Open “Giant Component” from the NetLogo Models Library

Connected Components

Notice the fraction of nodes in the giant component

Notice the size of the “Giant Component”

Model has been advanced 25+ ticks

Connected Components

Model has been advanced 80+ ticks

Notice the fraction of nodes in the giant component

Notice the size of the “Giant Component”

Connected Components

Model has been advanced 120+ ticks

Notice the fraction of nodes in the giant component

Notice the size of the “Giant Component”: it now equals “num-nodes” in the slider
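The growth of the giant component can also be reproduced outside NetLogo. The sketch below is not a port of the NetLogo model; it assumes one random edge is added per step and tracks the largest component with a union-find structure. The node count, step count, seed, and function name are illustrative choices.

```python
import random


def grow_random_graph(num_nodes, num_edges, seed=0):
    """Add random edges one at a time; return largest-component share per step."""
    rng = random.Random(seed)
    parent = list(range(num_nodes))
    size = [1] * num_nodes

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    largest = 1
    history = [largest / num_nodes]  # step 0: every node is its own component
    for _ in range(num_edges):
        a, b = rng.sample(range(num_nodes), 2)  # the same pair may repeat
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            if size[root_a] < size[root_b]:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a
            size[root_a] += size[root_b]
            largest = max(largest, size[root_a])
        history.append(largest / num_nodes)
    return history


history = grow_random_graph(num_nodes=80, num_edges=400)
for step in (25, 80, 120, 400):
    print(step, round(history[step], 2))
```

Printing the share at a few steps shows the same qualitative picture as the screenshots: a small largest component early on, then rapid growth until it absorbs nearly every node.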

Degree Distributions

outdegree: how many directed edges (arcs) originate at a node

indegree: how many directed edges (arcs) are incident on a node

degree (in or out): number of edges incident on a node

Indegree=3

Outdegree=2

Degree=5

Node Degree from Matrix Values

Outdegree:

outdegree for node 3 = 2, which we obtain by counting the non-zero entries in the 3rd row

Indegree:

indegree for node 3 = 1, which we obtain by counting the non-zero entries in the 3rd column
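The row and column counts are easy to verify in code. The slide's matrix is not in the transcript, so the 4 × 4 matrix below is a hypothetical one chosen so that node 3 has outdegree 2 and indegree 1; it assumes the convention that entry (i, j) is non-zero when there is an arc from node i to node j.

```python
# Hypothetical adjacency matrix; row i, column j is non-zero for an arc i -> j.
adjacency = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
]


def outdegree(matrix, node):
    """Count non-zero entries in the node's row (nodes are numbered from 1)."""
    return sum(1 for value in matrix[node - 1] if value != 0)


def indegree(matrix, node):
    """Count non-zero entries in the node's column (nodes are numbered from 1)."""
    return sum(1 for row in matrix if row[node - 1] != 0)


print("outdegree of node 3:", outdegree(adjacency, 3))
print("indegree of node 3:", indegree(adjacency, 3))
```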

Degree Distributions

These are degree counts for particular nodes, but we are also interested in the distribution of arcs (or edges) across all nodes

These Distributions are called “degree distributions”

Degree distribution: A frequency count of the occurrence of each degree

Degree Distributions

Imagine we have this 8 node network:

In-degree sequence: [2, 2, 2, 1, 1, 1, 1, 0]

Out-degree sequence: [2, 2, 2, 2, 1, 1, 1, 0]

(undirected) degree sequence: [3, 3, 3, 2, 2, 1, 1, 1]

In-degree distribution: [(2,3) (1,4) (0,1)]

Out-degree distribution: [(2,4) (1,3) (0,1)]

(undirected) distribution: [(3,3) (2,2) (1,3)]
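Going from a degree sequence to a degree distribution is just a frequency count. The sketch below applies it to the in-degree and undirected sequences quoted above; the function name is illustrative.

```python
from collections import Counter


def degree_distribution(sequence):
    """Return (degree, count) pairs, highest degree first."""
    return sorted(Counter(sequence).items(), reverse=True)


in_degrees = [2, 2, 2, 1, 1, 1, 1, 0]
undirected_degrees = [3, 3, 3, 2, 2, 1, 1, 1]

print(degree_distribution(in_degrees))
print(degree_distribution(undirected_degrees))
```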

Why are Degree Distributions Useful?

They are the signature of a dynamic process

We will discuss in greater detail tomorrow

Consider several canonical network models

Canonical Network Models

Erdős-Rényi Random Network

Highly Clustered Network

Watts-Strogatz Small World Network

Barabási-Albert Preferential Attachment Network

Why are Degree Distributions Useful?

Barabási-Albert Preferential Attachment Network

Barabási-Albert Preferential Attachment

NetLogo Models Library --> Networks --> Preferential Attachment

Watch the Changing Degree Distribution
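For readers without NetLogo at hand, the same process can be sketched in plain Python. This is an illustration rather than the NetLogo code: it assumes the network starts with two linked nodes and that each new node attaches to exactly one existing node, chosen with probability proportional to that node's current degree. The node count, seed, and function name are arbitrary choices.

```python
import random
from collections import Counter


def preferential_attachment(num_nodes, seed=0):
    """Grow a network one node at a time; return a {node: degree} mapping."""
    rng = random.Random(seed)
    degree = {0: 1, 1: 1}   # start from two linked nodes
    endpoints = [0, 1]      # each node appears once per edge it touches
    for new_node in range(2, num_nodes):
        # Picking a uniform entry of `endpoints` selects an existing node
        # with probability proportional to its degree.
        partner = rng.choice(endpoints)
        degree[new_node] = 1
        degree[partner] += 1
        endpoints.extend([new_node, partner])
    return degree


degree = preferential_attachment(500)
distribution = sorted(Counter(degree.values()).items())
for k, count in distribution[:5]:
    print(f"degree {k}: {count} nodes")
print("largest degree:", max(degree.values()))
```

Rerunning with larger `num_nodes` lets you watch the distribution change: most nodes stay at low degree while a few early nodes accumulate many links.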


Back to the Milgram Experiment

The Milgram Experiment

How did the successful subjects actually succeed?

How did they manage to get the envelope from Nebraska to Boston?

This is a question about how individuals conduct searches in their networks, given that most individuals do not know the path to distantly linked individuals

Search in Networks

Most individuals do not know the path to an individual who is many hops away

They must rely on some sort of heuristic rule to determine a possible path

Search in Networks

What information about the problem might the individual attempt to leverage?

Visual by Duncan Watts

Dimensional data:

send it to a stockbroker

send it to the closest possible city to Boston
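One way to picture such a heuristic is a greedy rule: each holder passes the letter to whichever acquaintance is geographically closest to the target, using only local knowledge. The sketch below covers the geographic dimension only (not occupation), and the people, the contact lists, and the one-dimensional positions are all invented for illustration.

```python
# Hypothetical positions on a west-to-east line (arbitrary units).
position = {
    "sender": 0,
    "friend_west": -5,
    "friend_midwest": 5,
    "cousin_east": 9,
    "target": 10,
}

# Hypothetical acquaintance lists; each person knows only their own contacts.
contacts = {
    "sender": ["friend_west", "friend_midwest"],
    "friend_west": ["sender"],
    "friend_midwest": ["sender", "cousin_east"],
    "cousin_east": ["friend_midwest", "target"],
    "target": ["cousin_east"],
}


def greedy_route(contacts, position, start, target):
    """Forward to the contact nearest the target; give up if none is closer."""

    def gap(person):
        return abs(position[person] - position[target])

    chain = [start]
    current = start
    while current != target:
        if not contacts[current]:
            return None
        best = min(contacts[current], key=gap)
        if gap(best) >= gap(current):
            return None  # stuck: no acquaintance is closer than the holder
        chain.append(best)
        current = best
    return chain


chain = greedy_route(contacts, position, "sender", "target")
print(chain, "hops:", len(chain) - 1)
```

Because the distance to the target must shrink at every hop, the chain either arrives or stops; a chain that gets stuck loosely parallels the many letters in the experiment that never reached the target.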

Follow-up to the Original Experiment

available at: http://research.yahoo.com/pub/2397

Published in Science in 2003