Algorithms (OBF) Dummies_SPARK

Algorithms (of, by, and for) Dummies
Akshay Padmanabha, Raja Selvakumar
MIT ESP - Spark 2015

Introduction of ourselves
Raja Selvakumar: Course X, Class of 2017
Akshay Padmanabha: Course VI, VIII, Class of 2017

What is an algorithm?
Finite and precise sequence of instructions for solving a problem
Minimizing costs of operation: time, resources, energy
Examples: tying a shoe, determining the maximum in a list of numbers, getting directions to classes for SPLASH
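One of the examples above, determining the maximum in a list of numbers, can be written out as a finite and precise sequence of steps. This is only a minimal sketch; the function name `find_maximum` and the sample list are made up for illustration.

```python
def find_maximum(numbers):
    """Return the largest value in a non-empty list by looking at each item once."""
    if not numbers:
        raise ValueError("cannot take the maximum of an empty list")
    largest = numbers[0]          # start with the first number as the best so far
    for value in numbers[1:]:     # compare every remaining number against it
        if value > largest:
            largest = value
    return largest


print(find_maximum([3, 17, 8, 42, 5]))  # 42
```

The cost here is one comparison per element, which is the kind of "cost of operation" the slide refers to.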

Topics in Algorithms
Graph Search: BFS, DFS
Data Structures: Hash Tables, BST

Graph Search
List of all adjacent nodes (an adjacency list)
Nodes: A, B, C, D, E, F, G
A -> E, D, G
D -> B, A
B -> D, E, F
E -> A, B
F -> B, C
C -> F
G -> A

Graph Search - BFS
BFS: Breadth First Search
Expands all surrounding nodes concurrently to find desired path
[Figure: graph with nodes A, B, C, D, E, F, G]
Path from A -> C (No Pruning):
-> A
-> AG, AE, AD
-> AG, AEB, ADB
-> AEBF, AEBD, ADBF, ADBE
-> AEBFC
*Paths are entered in reverse alphabetic order (i.e. A, AG, AE, AD)

Graph Search - DFS
DFS: Depth First Search
Expands first node entered until path terminates
[Figure: graph with nodes A, B, C, D, E, F, G]
Path from A -> C (No Pruning):
-> A
-> AG, AE, AD
-> AG, AEB
-> AEBF, AEBD
-> AEBFC
*Paths are entered in reverse alphabetic order (i.e. A, AG, AE, AD)
Which method was better (in this example)?

Graph Search - BFS
BFS: Breadth First Search
Expands all surrounding nodes concurrently to find desired path
[Figure: graph with nodes A, B, C, D, E, F, G]
*Paths are entered in reverse alphabetic order (i.e. A, AG, AE, AD)
Find all nodes (No Pruning):
-> A
-> AG, AE, AD
-> AG, AEB, ADB
-> AEBF, AEBD, ADBF, ADBE
-> AEBFC, AEBDA, ADBFC, ADBEA
-> AEBFC, AEBDAG, AEBDAE, ADBFC, ...
CYCLE
Find all nodes (With Pruning):
-> A
-> AG, AE, AD
-> AG, AEB, ADB
-> AEBF, AEBD
-> AEBFC
-> AEBFC

Graph Search - DFS
DFS: Depth First Search
Expands first node entered until path terminates
[Figure: graph with nodes A, B, C, D, E, F, G]
*Paths are entered in reverse alphabetic order (i.e. A, AG, AE, AD)
Find all nodes (No Pruning):
-> A
-> AG, AE, AD
-> AG, AEB
-> AEBF, AEBD
-> AEBFC
-> AEBFC
-> AEBDAG, AEBDAE
Find all nodes (With Pruning):
-> A
-> AG, AE, AD
-> AG, AEB
-> AEBF, AEBD
-> AEBFC
-> AEBFC
(Paths marked on the slide figure: ADB, AEBDA)
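The BFS and DFS walks above, with pruning, can be sketched in a few lines. This is a rough illustration rather than the presenters' own code: the adjacency list is copied from the slides, while the function names and the exact pruning rule (a node is marked as visited the first time it is reached) are assumptions, so the intermediate frontiers can differ slightly from the slide traces.

```python
from collections import deque

# Adjacency list copied from the slide's example graph.
GRAPH = {
    "A": ["E", "D", "G"],
    "B": ["D", "E", "F"],
    "C": ["F"],
    "D": ["B", "A"],
    "E": ["A", "B"],
    "F": ["B", "C"],
    "G": ["A"],
}


def neighbors_in_slide_order(graph, node):
    # The slides enter paths in reverse alphabetic order (G, then E, then D).
    return sorted(graph[node], reverse=True)


def bfs_path(graph, start, goal):
    """Breadth first: expand paths in the order they were entered (a queue)."""
    visited = {start}                 # pruning: never revisit a node
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors_in_slide_order(graph, path[-1]):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


def dfs_path(graph, start, goal, visited=None):
    """Depth first: follow the first neighbor all the way before trying the next."""
    if visited is None:
        visited = set()
    visited.add(start)                # pruning: never revisit a node
    if start == goal:
        return [start]
    for nxt in neighbors_in_slide_order(graph, start):
        if nxt not in visited:
            rest = dfs_path(graph, nxt, goal, visited)
            if rest is not None:
                return [start] + rest
    return None


print("".join(bfs_path(GRAPH, "A", "C")))  # AEBFC
print("".join(dfs_path(GRAPH, "A", "C")))  # AEBFC
```

On this small graph both searches end with the same path, AEBFC, as in the slides; what differs is the order in which they explore the graph.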

Graph Search Discussion Question
Consider that Akshay has returned from his travels abroad to Kendall Square. He wants to return to main campus, but as he is tired from his journey, he wants to take the shortest path back. Propose a path for Akshay to take using BFS and DFS.
Assume pruning is in place
Consider that paths are inserted in numerical order

[Figure: graph with numbered nodes for the discussion question]

Graph Search - Summary
BFS: Breadth First Search
Prefer breadth over depth, expand all attached nodes concurrently
Useful for finding nodes small distances away
DFS: Depth First Search
Prefer depth over breadth, expand the first node all the way until the path ends
Useful for finding nodes further away
Pruning
Prevents cycles in graph: no revisiting of nodes!

Topics in Algorithms
Graph Search: BFS, DFS
Data Structures: Hash Tables, BST

What is a binary tree?
BST stands for Binary Search Tree
What is a binary tree?
A tree where each node has at most 2 children
What is a tree?
A connected graph with no cycles!

So what is a BST?
A Binary Search Tree (BST) is an ordered binary tree
What does this mean?
The value at a node is larger than every value in its left subtree and smaller than every value in its right subtree

Inserting an Element
How do we insert an element into a BST?
Starting at the root, if the element is greater than the node, we move to the right child
Otherwise, we move to the left child
We repeat this until we reach an empty spot, and put the element there
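The insertion rule above can be sketched as follows. The `Node` class, the `insert` function, and the sample values are all made up for illustration; the only rule taken from the slide is "greater goes right, otherwise go left".

```python
class Node:
    """One node of a binary search tree."""

    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Insert value into the tree rooted at root and return the (possibly new) root."""
    if root is None:
        return Node(value)            # empty spot found: the element goes here
    if value > root.value:
        root.right = insert(root.right, value)   # greater: move to the right child
    else:
        root.left = insert(root.left, value)     # otherwise: move to the left child
    return root


root = None
for v in [5, 1, 9, 2]:                # hypothetical values
    root = insert(root, v)

print(root.value, root.left.value, root.right.value, root.left.right.value)  # 5 1 9 2
```

Here 2 is smaller than 5, so it moves left to 1; it is greater than 1, so it ends up as the right child of 1.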

[Figure: example BST]

Balanced BST
What is a balanced BST?
If there are n elements in the BST, the height of a balanced BST is at most log2 n
Simpler terms: most nodes that are not leaves have 2 children

1 + 2 + 4 + 8 = 15 (nodes per level in a full tree of height 3)
For 16: log2 16 = 4 (new height)

Cool Things about BSTs - Finding an Element
If there are n elements in a balanced BST, at most how many elements do we need to visit to find if the element is in the BST?
About log2 n! (We have to travel the height of the tree in the worst case.)
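To make the look-up cost concrete, here is a small sketch that builds a balanced BST with 15 elements (the 1 + 2 + 4 + 8 case above) and counts how many nodes a search visits. The helper names `build_balanced` and `find` are assumptions for this example, and building the tree from a sorted list is just a convenient way to get a balanced tree.

```python
class Node:
    """One node of a binary search tree."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def build_balanced(sorted_values):
    """Build a balanced BST by always putting the middle value at the root."""
    if not sorted_values:
        return None
    mid = len(sorted_values) // 2
    return Node(
        sorted_values[mid],
        build_balanced(sorted_values[:mid]),
        build_balanced(sorted_values[mid + 1:]),
    )


def find(root, target):
    """Return (found, nodes_visited) for a search that starts at the root."""
    visited = 0
    node = root
    while node is not None:
        visited += 1
        if target == node.value:
            return True, visited
        # Greater goes right, otherwise go left, exactly as for insertion.
        node = node.right if target > node.value else node.left
    return False, visited


tree = build_balanced(list(range(1, 16)))   # 15 elements: levels of 1, 2, 4, 8 nodes
print(find(tree, 15))   # (True, 4)
print(find(tree, 16))   # (False, 4)
```

Even in the worst case the search touches 4 of the 15 nodes: one per level of the tree.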

Cool Things about BSTs - Sorting a List
How do we get a sorted list from a BST?
In-order traversal! (Visit the left subtree, then the node, then the right subtree.)
Why does this work?
It works because everything in the left subtree is less than the parent node, and everything in the right subtree is greater!
Thus, BSTs can be used for sorting!
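A minimal sketch of sorting with a BST: insert every element, then read the tree back with an in-order traversal. The names `Node`, `insert`, `in_order`, and `tree_sort` are made up for this example, and equal values are sent to the left, following the "otherwise, move left" insertion rule.

```python
class Node:
    """One node of a binary search tree."""

    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Insert value into the tree rooted at root and return the root."""
    if root is None:
        return Node(value)
    if value > root.value:
        root.right = insert(root.right, value)
    else:
        root.left = insert(root.left, value)
    return root


def in_order(node, out):
    """Append values to out: left subtree first, then the node, then the right subtree."""
    if node is None:
        return
    in_order(node.left, out)
    out.append(node.value)
    in_order(node.right, out)


def tree_sort(values):
    """Sort a list by inserting everything into a BST and traversing it in order."""
    root = None
    for v in values:
        root = insert(root, v)
    result = []
    in_order(root, result)
    return result


print(tree_sort([5, 1, 9, 2]))  # [1, 2, 5, 9]
```

Note that this plain insertion does not keep the tree balanced, so an already sorted input produces a tall, one-sided tree and the sort becomes slow.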

What is a Hash Table?
Say we have a (key, value) pair:
For example, (John, 1), (Lisa, 7), etc.
How do we put these pairs into a structure so that we can look them up easily by key?
Use a hash table!

What is a Hash Function?

A hash table uses a list of buckets
These buckets are simply an array of objects
A hash function takes a key and puts it into a slot in the array

Small Example
Suppose we have an empty array of size 5 (slots numbered 1 to 5): [ - , - , - , - , - ]
Say the hash function we are using is: put the object at the location specified by the value of the object (looping if necessary)
For example, adding 2: [ - , 2 , - , - , - ]
Adding 9: [ - , - , - , 9 , - ] (9 loops past the end of the array: 9 - 5 = 4, so it goes in slot 4)
Take this empty array and this hash function and use this as a hash table to put the following values: 5, 7, 9, 1
[ 1 , 7 , - , 9 , 5 ]

Retrieving Values
Let's take the final array from before: [ 1 , 7 , - , 9 , 5 ]
How do we check to see if a value is in the array?
Let us see if 7 is in the array.
First, we take 7 and apply our hash function to it. This lets us know that 7 should be in slot 2. We then check slot 2 in the array and we see that in fact 7 IS in the array! This only took looking at one slot in the array.
Let us now try finding 2 in the array. We apply our hash function and find that we need to look at slot 2. Since we do not see a 2 in the slot, we know that 2 is NOT in the array. Again, this only took looking at one slot in the array.

Collisions
Let's take the final array from before: [ 1 , 7 , - , 9 , 5 ]
What if we try to put 2?
This spot is taken! This is called a collision
What do we do?
We can put the value in the next available open slot (this is called open addressing): [ 1 , 7 , 2 , 9 , 5 ]
However, looking for an element can take up to n look-ups in the array (where there are n elements in the array), which is very bad

Chaining
What are other ways we can handle collisions?
We can use chaining!
Let us use the same array as before ([ 1 , 7 , - , 9 , 5 ]) and try to add 2 again.
This time, we keep track of all the elements hashed into the slot: [ 1 , [7, 2] , - , 9 , 5 ]
Now, the chance that we have to use more than one lookup is smaller than before, as long as we are using good hash functions!

Deleting Elements
How do we delete elements?
Let us use the same array in the chaining example: [ 1 , [7, 2] , - , 9 , 5 ]
How do we delete 9? First we must see if 9 is in the array (using the hashing function) and then if it exists, we replace it with a -: [ 1 , [7, 2] , - , - , 5 ]
However, if we try to delete 7, we should actually delete it from its chain, rather than replacing it with a -: [ 1 , 2 , - , 9 , 5 ]
Deletion works better with chaining than with open addressing. With open addressing, we need a flag (F) to mark a deleted element, so that later look-ups keep searching past that slot:
[ 1 , 7 , 2 , 9 , 5 ] becomes [ 1 , F , 2 , 9 , 5 ]
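The chaining example above can be sketched as a tiny class. This is only an illustration: the class name `ChainedHashTable` and its method names are made up, and the hash function is the toy one from the slides (a value goes to the slot with its own number, looping if necessary). The slides count slots from 1 while Python lists count from 0, hence the `value - 1`.

```python
class ChainedHashTable:
    """Toy hash table for integers that resolves collisions by chaining."""

    def __init__(self, size=5):
        self.size = size
        self.slots = [[] for _ in range(size)]   # one chain (list) per slot

    def _index(self, value):
        # Slide rule: value v goes to slot v (1-based), looping past the end.
        return (value - 1) % self.size

    def add(self, value):
        chain = self.slots[self._index(value)]
        if value not in chain:
            chain.append(value)       # a collision simply makes the chain longer

    def contains(self, value):
        # Only the one slot chosen by the hash function has to be inspected.
        return value in self.slots[self._index(value)]

    def remove(self, value):
        chain = self.slots[self._index(value)]
        if value in chain:
            chain.remove(value)       # really delete it; no flag is needed
            return True
        return False


table = ChainedHashTable()
for v in [5, 7, 9, 1, 2]:
    table.add(v)
print(table.slots)          # [[1], [7, 2], [], [9], [5]]

table.remove(7)
print(table.slots)          # [[1], [2], [], [9], [5]]
print(table.contains(2))    # True
```

A real hash table would use a hash function that spreads keys more evenly and would grow the array as it fills up; both are left out here to stay close to the slide example.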

Interview Question 1
You are given two arrays. Find all the elements that are in both arrays.

We put all the elements of the first array in a hash table. Then, we check whether each element of the second array is in the hash table.
This algorithm is really fast because it goes through each element only once.

Interview Question 2
You are given a large graph with a starting node. How would you find all paths from the starting node to every node 2 edges away from the starting node?

We use BFS from the starting node and stop it when we are two edges away from the starting node.
This algorithm is really fast (unlike DFS) because it does not travel down the whole depth of the graph.

Interview Question 3
Find the number of times every element appears in an array.

We create a hash table. If the element is not in the hash table, we add it to the hash table with value 1. If the element is in the hash table, we increment the value for that key by 1. Finally, we go through each key in the hash table and return the key-value pair, which represents the number of times each element appears in the array.
This algorithm is really fast because it goes through each element in the array only once.
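The three interview answers above can be sketched in Python, whose built-in `set` and `dict` types are hash tables. The function names and sample inputs are made up for illustration, and the small graph is the one from the graph search slides. For Question 2, this sketch reads "2 edges away" as "nodes whose shortest distance from the start is 2", which is an assumption.

```python
def common_elements(first, second):
    """Question 1: elements that appear in both arrays, each reported once."""
    seen = set(first)                 # hash table holding the first array
    reported = set()
    result = []
    for item in second:               # one hash look-up per element
        if item in seen and item not in reported:
            reported.add(item)
            result.append(item)
    return result


def paths_at_depth(graph, start, depth=2):
    """Question 2: BFS that stops after `depth` levels and returns the paths found."""
    distance = {start: 0}             # level at which each node was first reached
    frontier = [[start]]
    for level in range(1, depth + 1):
        next_frontier = []
        for path in frontier:
            for neighbor in graph[path[-1]]:
                # Keep the path only if the neighbor is first reached at this level.
                if distance.setdefault(neighbor, level) == level:
                    next_frontier.append(path + [neighbor])
        frontier = next_frontier
    return frontier


def count_occurrences(items):
    """Question 3: number of times each element appears."""
    counts = {}                       # hash table: element -> count
    for item in items:
        if item not in counts:
            counts[item] = 1
        else:
            counts[item] += 1
    return counts


GRAPH = {
    "A": ["E", "D", "G"], "B": ["D", "E", "F"], "C": ["F"], "D": ["B", "A"],
    "E": ["A", "B"], "F": ["B", "C"], "G": ["A"],
}

print(common_elements([1, 2, 3, 4], [3, 4, 5, 3]))   # [3, 4]
print(paths_at_depth(GRAPH, "A"))                    # [['A', 'E', 'B'], ['A', 'D', 'B']]
print(count_occurrences(["a", "b", "a"]))            # {'a': 2, 'b': 1}
```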
