1 CSC 321: Data Structures Fall 2013 Graphs & search graphs, isomorphism simple vs. directed vs....

38
1 CSC 321: Data Structures Fall 2013 Graphs & search graphs, isomorphism simple vs. directed vs. weighted adjacency matrix vs. adjacency lists Java implementations graph traversals depth-first search, breadth-first search

Transcript of 1 CSC 321: Data Structures Fall 2013 Graphs & search graphs, isomorphism simple vs. directed vs....

Page 1: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

1

CSC 321: Data Structures

Fall 2013

Graphs & search graphs, isomorphism simple vs. directed vs. weighted adjacency matrix vs. adjacency lists Java implementations graph traversals

depth-first search, breadth-first search

Page 2: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

2

Graphs

trees are special instances of the more general data structure: graphsinformally, a graph is a collection of nodes/data elements with connections

many real-world and CS-related applications rely on graphs the connectivity of computers in a network cities on a map, connected by roads cities on a airline schedule, connected by flights Facebook friends finite automata

Page 3: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Interesting graphs

3

Page 4: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Interesting graphs

4

graph of the Internetcreated at San Diego Supercomputer Center (UCSD)depicts round-trip times of data packets sent from Herndon, VA to various sites on the Internet

http://ucsdnews.ucsd.edu/

Page 5: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Interesting graphs

5

TouchGraph Facebook Browserallows you to view all of your friends and groups as a graph

http://www.touchgraph.com/facebook

Page 6: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Interesting graphs

6

graph view of www.creighton.edu

blue: for links red: for tables green: for the DIV tagviolet: for images yellow: for forms orange: for breaks & blockquotes black: the HTML taggray: all other tags

http://www.aharef.info/static/htmlgraph/

Page 7: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Interesting graphs

7

NFL 2011 @ week 11edges from one team to another show a clear beat path

e.g., Chicago > Tampa > Indy

http://beatpaths.com/

Page 8: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Formal definition

a simple graph G consists of a nonempty set V of vertices, and a set E of edges (unordered vertex pairs)can write simply as G = (V, E)

e.g., consider G = (V, E) where V = {a, b, c, d} and E = {{a,b}, {a, c}, {b,c}, {c,d}} DRAW IT!

8

graph terminology for G = (V, E)vertices u and v are adjacent if {u, v} E (i.e., connected by an edge)edge {u,v} is incident to vertices u and v the degree of v is the # of edges incident with it (or # of vertices adjacent to it)|V| = # of vertices |E| = # of edgesa path between v1 and vn is a sequence of edges from E:

[ {v1, v2}, {v2, v3}, …, {vn-2, vn-1}, {vn-1, vn} ]

Page 9: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Depicting graphs

you can draw the same set of vertices & edges many different ways

9

are these different graphs?

Page 10: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Isomorphism

how can we tell if two graphs are basically the same?

G1 = (V1, E1) and G2 = (V2, E2) are isomorphic if there is mapping f:V1V2 such that {u,v} E1 iff {f(u), f(v)} E2

i.e., G1 and G2 are isomorphic if they are identical modulo a renaming of vertices

10

Page 11: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Common graphs empty graph En has n vertices and no edges line graph Ln has n vertices connected in sequence HOW MANY? complete graph Kn has n vertices with an edge between each pair HOW MANY?

11

E5 L5 K5

Page 12: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Simple vs. directed?

in a simple graph, edges are presumed to be bidirectional• {u,v} E is equivalent to saying {v,u} E

for many real-world applications, this is appropriate if computer c1 is connected to computer c2, then c2 is connected to c1 if person p1 is Facebook friends with person p2, then p2 is friends with p1

however, in other applications the connection may be 1-way a flight from city c1 to city c2 does not imply a return flight from c2 to c1 roads connecting buildings in a town can be 1-way (with no direct return route) NFL beat paths are clearly directional

12

Page 13: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Directed graphs

a directed graph or digraph G consists of a nonempty set V of vertices, and a set E of edges (ordered vertex pairs)

draw edge (u,v) as an arrow pointing from vertex u to vertex v

e.g., consider G = (V, E) where V = {a, b, c, d} and E = {(a,b), (a, c), (b,c), (c,d)}DRAW IT!

13

Page 14: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Weighted graphs

a weighted graph G = (V, E) is a graph/digraph with an additional weight function ω:E

draw edge (u,v) labeled with its weight

shortest path from BOS to SFO? does the fewest legs automatically imply the shortest path in terms of miles?

14

from Algorithm Design by Goodrich & Tamassia

Page 15: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Graph data structures an adjacency matrix is a 2-D array, indexed by the vertices

for graph G, adjMatrix[u][v] = 1 iff {u,v} E (store weights for digraphs)

15

an adjacency list is an array of linked listsfor graph G, adjList[u] contains v iff {u,v} E (also store weights for digraphs)

Page 16: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Matrix vs. list

space tradeoff adjacency matrix requires O(|V|2) space adjacency list requires O(|V| + |E|) space

which is better?

16

it depends

if the graph is sparse (few edges), then |V|+|E| < |V|2 adjacency list

if the graph is dense, i.e., |E| is O(|V|2), then |V|2 < |V|+|E| adjacency matrix

many graph algorithms involve visiting every adjacent neighbor of a node adjacency list makes this efficient, so tends to be used more in practice

Page 17: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Java implementations

to allow for alternative implementations, we will define a Graph interface

import java.util.Set;

public interface Graph<E> { public void addEdge(E v1, E v2); public Set<E> getAdj(E v); public boolean containsEdge(E v1, E v2); }

many other methods are possible, but these suffice for now

from this, we can implement simple graphs or directed graphscan utilize an adjacency matrix or an adjacency list

17

Page 18: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

DiGraphMatrixpublic class DiGraphMatrix<E> implements Graph<E>{ private E[] vertices; private Map<E, Integer> lookup; private boolean[][] adjMatrix; public DiGraphMatrix(Collection<E> vertices) { this.vertices = (E[])(new Object[vertices.size()]); this.lookup = new HashMap<E, Integer>(); int index = 0; for (E nextVertex : vertices) { this.vertices[index] = nextVertex; lookup.put(nextVertex, index); index++; } this.adjMatrix = new boolean[index][index]; }

public void addEdge(E v1, E v2) { Integer index1 = this.lookup.get(v1); Integer index2 = this.lookup.get(v2); if (index1 == null || index2 == null) { throw new java.util.NoSuchElementException(); } this.adjMatrix[index1][index2] = true; }

...

18

to create adjacency matrix, constructor must be given a list of all vertices

however, matrix entries are not directly accessible by vertex name

index vertex requires vertices array

vertex index requires lookup map

Page 19: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

DiGraphMatrix

...

public boolean containsEdge(E v1, E v2) { Integer index1 = this.lookup.get(v1); Integer index2 = this.lookup.get(v2); return index1 != null && index2 != null && this.adjMatrix[index1][index2]; }

public Set<E> getAdj(E v) { int row = this.lookup.get(v); Set<E> neighbors = new HashSet<E>(); for (int c = 0; c < this.adjMatrix.length; c++) { if (this.adjMatrix[row][c]) { neighbors.add(this.vertices[c]); } } return neighbors; }}

19

containsEdge is easy lookup indices of verticesthen access row/column

getAdj is more work lookup index of vertex traverse rowcollect all adjacent vertices in a set

Page 20: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

DiGraphList

public class DiGraphList<E> implements Graph<E>{ private Map<E, Set<E>> adjLists; public DiGraphList() { this.adjLists = new HashMap<E, Set<E>>(); }

public void addEdge(E v1, E v2) { if (!this.adjLists.containsKey(v1)) { this.adjLists.put(v1, new HashSet<E>()); } this.adjLists.get(v1).add(v2); }

public Set<E> getAdj(E v) { if (!this.adjLists.containsKey(v)) { return new HashSet<E>(); } return this.adjLists.get(v); }

public boolean containsEdge(E v1, E v2) { return this.adjLists.containsKey(v1) && this.getAdj(v1).contains(v2); }}

20

we could implement the adjacency list using an array of LinkedLists simpler to make use of

HashMap to map each vertex to a Set of adjacent vertices (wastes some memory, but easy to code)

note that constructor does not need to know all vertices ahead of time

Page 21: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

GraphMatrix & GraphList

public class GraphMatrix<E> extends DiGraphMatrix<E> { public GraphMatrix(Collection<E> vertices) { super(vertices); } public void addEdge(E v1, E v2) { super.addEdge(v1, v2); super.addEdge(v2, v1); }}

21

constructors simply call the superclass constructors

addEdge adds edges in both directions using the superclass addEdge

public class GraphList<E> extends DiGraphList<E> { public GraphList() { super(); } public void addEdge(E v1, E v2) { super.addEdge(v1, v2); super.addEdge(v2, v1); }}

can utilize inheritance to implement simple graph variants can utilize the same internal structures, just add edges in both directions

Page 22: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Graph traversal

many applications involve systematically searching through a graph starting at some vertex, want to traverse the graph and inspect/process vertices

along paths

e.g., list all the computers on the networke.g., find a sequence of flights that get the customer from BOS to SFO (perhaps with

some property?)e.g., display the football teams in the partial ordering

similar to tree traversals, we can follow various patterns recall for trees: inorder, preorder, postorder

for arbitrary graphs, two main alternatives depth-first search (DFS): boldly follow a single path as long as possible breadth-first search (BFS): tentatively try all paths level-by-level

22

Page 23: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

DFS vs. BFS

Consider the following digraph:

23

depth-first search (starting at 0) would select an adjacent vertex 0 1 select adjacent vertex from there 0 1 3 select adjacent vertex from there 0 1 3 2 select adjacent vertex from there 0 1 3 2

4 …

breadth-first search (starting at 0) would generate length-1 paths 0 1

0 4 expand to length-2 paths 0 1 3

0 4 1 expand to length-3 paths 0 1 3 2

0 4 1 3

Page 24: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Bigger examples

DFS? BFS?

24

Page 25: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

DFS implementation

depth-first search can be implemented easily using recursion

25

public static <E> void DFS1(Graph<E> g, E v) { GraphSearch.DFS1(g, v, new HashSet<E>()); } private static <E> void DFS1(Graph<E> g, E v, Set<E> visited) { if (!visited.contains(v)) { System.out.println(v); visited.add(v); for (E adj : g.getAdj(v)) { GraphSearch.DFS1(g, adj, visited); } } }

here, it just prints the vertices as they are visited – we could process in a variety of ways

need to keep track of the visited vertices to avoid cycles

here, pass around a HashSet to remember visited vertices

Page 26: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

DFS implementation

depth-first search can be implemented easily using recursion

26

public static <E> void DFS1(Graph<E> g, E v) { GraphSearch.DFS1(g, v, new HashSet<E>()); } private static <E> void DFS1(Graph<E> g, E v, Set<E> visited) { if (!visited.contains(v)) { System.out.println(v); visited.add(v); for (E adj : g.getAdj(v)) { GraphSearch.DFS1(g, adj, visited); } } }

here, it just prints the vertices as they are visited – we could process in a variety of ways

need to keep track of the visited vertices to avoid cycles

here, pass around a HashSet to remember visited vertices

Page 27: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Stacks & Queues

DFS can alternatively be implemented using a stack

start with the initial vertex on the stack at each step, visit the vertex at the top of

the stack when you visit a vertex, pop it & push its

adjacent neighbors (as long as not already visited)

5 1 3 3 2 4

2 2 2 2 2 2

0 4 4 4 4 4 4 4

BFS is most naturally implemented using a queue

start with the initial vertex in the queue at each step, visit the vertex at the front

of the queue when you visit a vertex, remove it & add

its adjacent neighbors (as long as not already visited)

0 4 2 13 5 4 24 3 5 41 4 3 51 4 32 1 42 12

27

5

Page 28: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Implementationspublic static <E> void DFS2(Graph<E> g, E v) { Stack<E> unvisited = new Stack<E>(); unvisited.push(v); Set visited = new HashSet<E>(); while (!unvisited.isEmpty()) { E nextV = unvisited.pop(); if (!visited.contains(nextV)) { System.out.println(nextV); visited.add(nextV); for (E adj : g.getAdj(nextV)) { unvisited.push(adj); } } }}

28

public static <E> void BFS(Graph<E> g, E v) { Queue<E> unvisited = new LinkedList<E>(); unvisited.add(v); Set visited = new HashSet<E>(); while (!unvisited.isEmpty()) { E nextV = unvisited.remove(); if (!visited.contains(nextV)) { System.out.println(nextV); visited.add(nextV); for (E adj : g.getAdj(nextV)) { unvisited.add(adj); } } }}

DFS utilizes a Stack of unvisited vertices results in the longest path being

expanded

as before, uses a Set to keep track of visited vertices & avoid cycles

BFS is identical except that the unvisited vertices are stored in a Queue

results in the shortest path being expanded

similarly uses a Set to avoid cycles

Page 29: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Topological sort

a slight variant of DFS can be used to solve an interesting problem topological sort: given a partial ordering of items (e.g., teams in a beat-path

graph), generate a possible complete ordering (e.g., a possible ranking of teams)

29

Consider Bears subgraph only: 1. CHI 2. ATL 3. TB 4. TEN 5. SEA 6. IND 7. SD 8. MIN 9. CAR10. ARI11. STL

1. CHI 2. TB 3. SD 4. MIN 5. ATL 6. TEN 7. SEA 8. ARI 9. STL10. IND11. CAR

Page 30: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Topological sort

30

public static <E> void DFS1(Graph<E> g, E v) { GraphSearch.DFS1(g, v, new HashSet<E>());} private static <E> void DFS1(Graph<E> g, E v, Set<E> visited) { if (!visited.contains(v)) { System.out.println(v); visited.add(v); for (E adj : g.getAdj(v)) { GraphSearch.DFS1(g, adj, visited); } }}

public static <E> void topo(Graph<E> g, E v) { GraphSearch.topo(g, v, new HashSet<E>());} private static <E> void topo(Graph<E> g, E v, Set<E> visited) { if (!visited.contains(v)) { visited.add(v); for (E adj : g.getAdj(v)) { GraphSearch.topo(g, adj, visited); } System.out.println(v); }}

if you take the recursive DFS code and do the print AFTER the recursion, you end up with a topological sort (in reverse)

Page 31: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Finite State Machines

many useful problems can be defined using a special weighted graph a Finite State Machine (a.k.a. Finite Automaton) defines finite set of states (i.e.,

vertices) along with transitions between those states (i.e., edges)

e.g., the logic controlling a coin-operated turnstile

31

can be in one of two states: locked or unlocked if locked, pushing it does not allow passage & stays locked

inserting coin unlocks it if unlocked, pushing allows passage & then relocks

inserting coin keeps it unlocked

Page 32: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

More complex examplee.g., a vending machine that takes coins (N, D, Q) and sells gum for 35¢ (for now,

assume exact change only)

32

10¢

20¢

15¢

25¢

30¢

Q

Q

D

D D

DD

DN

N

N

N

Q

N

N

N

35¢

0¢ is the start state (a.k.a. the initial state) there can only be one

35¢ is a goal state (a.k.a. an accepting state)

there can be multiple ones

HOW COULD WE HANDLE RETURN CHANGE?

Page 33: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Recognition/acceptance

we say that a FSM recognizes a pattern if all sequences that meet that pattern terminate in an accepting state

33

alternatively, we say that a FSM defines a language made up of the sequences accepted by the FSM a language is known as a regular language if there exists a FSM that accepts it

note: * denotes arbitrary repetition, + denotes OR

L1 = 1*(01*0)*1* L2 = (01 + 10)*

Page 34: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Deterministic vs. nondeterministic

the examples so far are deterministic, i.e., each transition is unambiguous a FSM can also be nondeterministic, if there are multiple transitions from a state

with different labels/weights nondetermineistic FSM are especially useful for defining complex regular

languages

34

Page 35: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

Uses of FSM

FSM machines are useful in many areas of CS designing software for controlling devices (e.g., vending machine, elevator, …)

may start by identifying states, constructing a table of state transitions, finally implementing in code

parsing symbols for programming languages or text processingmay define the language using rules or via a regular expression, then build a FSM to parse the characters into tokens & symbols

designing circuitrycommonly used to model complex circuitry, design components

35

Page 36: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

MFT CS exam – sample question

36

Page 37: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

GRE CS exam – sample question

37

Page 38: 1 CSC 321: Data Structures Fall 2013 Graphs & search  graphs, isomorphism  simple vs. directed vs. weighted  adjacency matrix vs. adjacency lists

GRE CS exam – sample question

38