ISOM MIS 215 Module 5 – Binary Trees


Page 1

MIS 215 Module 5 – Binary Trees

Page 2

Where are we?

Course roadmap for MIS215 (from the slide diagram):

• Audience progression: Newbie, Programmers, Developers, Professionals, Designers

• Introduction: Intro to Java and the course; Java language basics; Arrays

• Basic Algorithms: Search techniques (binary search); Sorting techniques (bubble sort); Fast sorting algorithms (quicksort, mergesort)

• List Structures: Linked lists; Stacks, queues

• Advanced structures: Hash tables; Graphs, trees

Page 3

Today’s buzzwords

• Key: A component of an object that is typically used for quick retrieval of the object

• Hashing function: A function that takes an object and generates a number (or some form of an address) for the location where the object should be placed

• Hash Table: A data structure that stores items in designated places using hashing functions for speeding up key-based search
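The three terms above can be tied together in a few lines of Java. This is a minimal sketch, assuming string keys and a 10-slot table; the class name `HashingSketch` and the method `indexFor` are illustrative names, not part of the course material.

```java
// Illustrative sketch only: the names and the table size are assumptions.
class HashingSketch {
    // Hashing function: turns a key into an index in [0, tableSize).
    static int indexFor(String key, int tableSize) {
        // Math.floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(key.hashCode(), tableSize);
    }

    public static void main(String[] args) {
        String[] table = new String[10];          // the hash table: an array of slots
        String key = "alice";                     // the key used for retrieval
        int index = indexFor(key, table.length);
        table[index] = key;                       // place the item at its designated slot
        System.out.println(key + " -> slot " + index);
        // Retrieval recomputes the index instead of scanning the whole array.
        System.out.println("found: " + key.equals(table[indexFor(key, table.length)]));
    }
}
```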

Page 4

Hash Tables: A New Data Structure

• Start with an array that holds the hash table.

• Use a hash function to take a key and map it to some index in the array. This function will generally map several different keys to the same index.

• If the desired record is in the location given by the index, then we’re finished; otherwise we must use some method to resolve the collision that may have occurred between two records wanting to go to the same location.

• This process is called hashing. To use hashing we must find good hash functions and determine how to resolve collisions.
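To make the collision case concrete, here is a small sketch that stores keys with no collision resolution at all and simply reports when a slot is already taken. The class and method names are hypothetical. It relies on the fact that the strings "Aa" and "BB" have the same `String.hashCode()` in Java, so they always map to the same index.

```java
// Illustrative sketch only: shows where a collision is detected, not how to resolve it.
class CollisionSketch {
    static int indexFor(String key, int tableSize) {
        return Math.floorMod(key.hashCode(), tableSize);
    }

    // Returns true if the key is stored at its index,
    // false if that slot is already occupied by a different key.
    static boolean putWithoutResolution(String[] table, String key) {
        int index = indexFor(key, table.length);
        if (table[index] == null || table[index].equals(key)) {
            table[index] = key;
            return true;
        }
        return false; // collision: a resolution strategy is needed here
    }

    public static void main(String[] args) {
        String[] table = new String[8];
        System.out.println(putWithoutResolution(table, "Aa")); // prints true
        System.out.println(putWithoutResolution(table, "BB")); // prints false (same index as "Aa")
    }
}
```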

Page 5

Hash Function Requirements

• Hash functions must: be fast (computed in O(1) time) and distribute keys evenly over the hash table

• Preferably, all locations of the hash table should have equal probability of being filled
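One informal way to check the "distribute keys evenly" requirement is to count how many sample keys land in each slot. The sketch below assumes made-up keys of the form `customer0`, `customer1`, and so on, and a 10-slot table; a real check would use keys that resemble the actual data.

```java
import java.util.Arrays;

// Illustrative sketch only: sample keys and table size are assumptions.
class DistributionSketch {
    // Counts how many keys land in each slot of a table with tableSize slots.
    static int[] slotCounts(String[] keys, int tableSize) {
        int[] counts = new int[tableSize];
        for (String key : keys) {
            counts[Math.floorMod(key.hashCode(), tableSize)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] keys = new String[1000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = "customer" + i; // hypothetical sample keys
        }
        // Roughly equal counts suggest an even spread for these particular keys.
        System.out.println(Arrays.toString(slotCounts(keys, 10)));
    }
}
```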

Page 6

Collision Resolution with Open Addressing

• Linear Probing:

Linear probing starts with the hash address and searches sequentially for the target key or an empty position. The array should be considered circular, so that when the last location is reached, the search proceeds to the first location of the array.

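Clustering: [Figure: the same table positions a b c d e f shown three times.] With linear probing, occupied positions tend to build up into contiguous runs, so later insertions and searches must probe further.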
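Linear probing as described above can be sketched as follows. This is a minimal version under stated assumptions: string keys, a fixed-size table, and no deletion (removing entries from an open-addressing table needs extra bookkeeping that is outside this sketch). The class and method names are illustrative.

```java
// Illustrative sketch only: fixed capacity, no deletion, names are assumptions.
class LinearProbingSketch {
    private final String[] slots;

    LinearProbingSketch(int capacity) {
        slots = new String[capacity];
    }

    private int home(String key) {
        return Math.floorMod(key.hashCode(), slots.length);
    }

    // Returns the slot index used for the key, or -1 if the table is full.
    int insert(String key) {
        int start = home(key);
        for (int step = 0; step < slots.length; step++) {
            int i = (start + step) % slots.length; // circular array: wrap to the first location
            if (slots[i] == null || slots[i].equals(key)) {
                slots[i] = key;
                return i;
            }
        }
        return -1;
    }

    // Returns the slot index holding the key, or -1 if it is absent.
    int find(String key) {
        int start = home(key);
        for (int step = 0; step < slots.length; step++) {
            int i = (start + step) % slots.length;
            if (slots[i] == null) return -1;     // empty position: the key is not in the table
            if (slots[i].equals(key)) return i;  // target key found
        }
        return -1;
    }

    public static void main(String[] args) {
        LinearProbingSketch table = new LinearProbingSketch(8);
        // "Aa" and "BB" share the same String.hashCode(), so they have the same hash address.
        System.out.println(table.insert("Aa")); // prints 0
        System.out.println(table.insert("BB")); // prints 1 (next position after the collision)
        System.out.println(table.find("BB"));   // prints 1
    }
}
```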

Page 7

Collision Resolution with Open Addressing (Contd.)

• Quadratic Probing: If there is a collision at hash address h, quadratic probing goes to locations h+1, h+4, h+9, …, that is, to locations h + i^2 for i = 1, 2, ..., each reduced modulo the table size.

• Other Methods: Key-dependent increments; Random probing
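The quadratic probe sequence can be listed directly. The sketch below assumes a non-negative hash address `h` and wraps each location with the table size; the class and method names are illustrative.

```java
import java.util.Arrays;

// Illustrative sketch only: assumes h >= 0 and small values, so i * i does not overflow.
class QuadraticProbingSketch {
    // Returns the first `count` locations tried after a collision at hash address h.
    static int[] probeSequence(int h, int tableSize, int count) {
        int[] sequence = new int[count];
        for (int i = 1; i <= count; i++) {
            sequence[i - 1] = (h + i * i) % tableSize; // h + i^2, wrapped around the table
        }
        return sequence;
    }

    public static void main(String[] args) {
        // Collision at h = 3 in an 11-slot table: tries 3+1, 3+4, 3+9, 3+16 (mod 11).
        System.out.println(Arrays.toString(probeSequence(3, 11, 4))); // prints [4, 7, 1, 8]
    }
}
```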

Page 8

Chained Hash Tables
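As a rough sketch of chaining, assume each table slot holds a list (a "chain") of all keys that hash to that slot, so colliding keys simply share a chain. The class name, method names, and the choice of `LinkedList` for the chains are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Illustrative sketch only: string keys, fixed number of chains, names are assumptions.
class ChainedTableSketch {
    private final List<List<String>> chains = new ArrayList<>();
    private int size;

    ChainedTableSketch(int capacity) {
        for (int i = 0; i < capacity; i++) {
            chains.add(new LinkedList<>()); // one initially empty chain per slot
        }
    }

    private List<String> chainFor(String key) {
        return chains.get(Math.floorMod(key.hashCode(), chains.size()));
    }

    void insert(String key) {
        List<String> chain = chainFor(key);
        if (!chain.contains(key)) { // ignore duplicates
            chain.add(key);
            size++;
        }
    }

    boolean contains(String key) {
        return chainFor(key).contains(key); // only one chain is searched
    }

    boolean remove(String key) {
        boolean removed = chainFor(key).remove(key);
        if (removed) size--;
        return removed;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        ChainedTableSketch table = new ChainedTableSketch(8);
        table.insert("Aa");
        table.insert("BB"); // same hash code as "Aa": both end up in the same chain
        System.out.println(table.contains("Aa") && table.contains("BB")); // prints true
    }
}
```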

Page 9

Issues with Chaining

• Worst case – all elements map to the same cell – all operations O(N)!

• Load factor: Ratio of actual number of elements to maximum possible number of elements

• Turns out – at high load factors (>0.7) open addressing performs worse than chaining.

Page 10

Avoid High Load factors!

• Start with a large enough hash table compared to expected number of entries

• Rehashing: Once a high load factor is detected, create a new hash table and rehash all elements to the new table
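Rehashing can be sketched on top of a small linear-probing table. The 0.7 threshold and the choice to double the array are assumptions made for this example, as are all class and method names; the point is only that every key's index is recomputed for the new table size.

```java
// Illustrative sketch only: the threshold, growth policy, and names are assumptions.
class RehashingSketch {
    private static final double MAX_LOAD = 0.7; // assumed "high load factor" threshold
    private String[] slots = new String[4];
    private int count;

    double loadFactor() {
        return (double) count / slots.length;
    }

    int capacity() {
        return slots.length;
    }

    void insert(String key) {
        // Grow first if adding one more element would push the load factor too high.
        if ((double) (count + 1) / slots.length > MAX_LOAD) {
            rehash();
        }
        if (place(slots, key)) {
            count++;
        }
    }

    boolean contains(String key) {
        int start = Math.floorMod(key.hashCode(), slots.length);
        for (int step = 0; step < slots.length; step++) {
            String candidate = slots[(start + step) % slots.length];
            if (candidate == null) return false;
            if (candidate.equals(key)) return true;
        }
        return false;
    }

    // Linear probing into the given table; returns true if the key was newly stored.
    private static boolean place(String[] table, String key) {
        int start = Math.floorMod(key.hashCode(), table.length);
        for (int step = 0; step < table.length; step++) {
            int i = (start + step) % table.length;
            if (table[i] == null) {
                table[i] = key;
                return true;
            }
            if (table[i].equals(key)) return false; // already present
        }
        throw new IllegalStateException("table is full");
    }

    // Creates a larger table and re-inserts every element into it.
    private void rehash() {
        String[] bigger = new String[slots.length * 2];
        for (String key : slots) {
            if (key != null) {
                place(bigger, key); // the index is recomputed for the new table size
            }
        }
        slots = bigger;
    }

    public static void main(String[] args) {
        RehashingSketch table = new RehashingSketch();
        for (int i = 0; i < 10; i++) {
            table.insert("k" + i);
        }
        System.out.println("capacity=" + table.capacity() + " load=" + table.loadFactor());
    }
}
```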

Page 11

Birthday Surprise: How Are Collisions Possible?

• If 24 or more randomly chosen people are in a room, then it is more likely than not that at least two will share a birthday. For hashing, the birthday surprise says that for any problem of reasonable size, collisions will almost certainly occur.
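The birthday figure can be checked with a short calculation. The sketch below assumes 365 equally likely birthdays (or, for hashing, equally likely slots) and computes one minus the probability that all values are distinct; the class and method names are illustrative.

```java
// Illustrative sketch only: assumes every slot (or birthday) is equally likely.
class BirthdaySketch {
    // Probability that at least two of `items` values coincide among `slots` possibilities.
    static double collisionProbability(int items, int slots) {
        double allDistinct = 1.0;
        for (int i = 0; i < items; i++) {
            // The (i+1)-th item must avoid the i values already taken.
            allDistinct *= (double) (slots - i) / slots;
        }
        return 1.0 - allDistinct;
    }

    public static void main(String[] args) {
        // 24 people and 365 days: the result is a little above one half.
        System.out.printf("%.3f%n", collisionProbability(24, 365));
        // The same calculation for 100 keys hashed into a 10,000-slot table.
        System.out.printf("%.3f%n", collisionProbability(100, 10_000));
    }
}
```

Read as a hashing statement, the second call shows that collisions become probable long before the table is anywhere near full.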

Page 12

Comparison of data structures

Data structure        | Insert | Remove | Find
Unordered array       |        |        |
Ordered array         |        |        |
Unordered Linked List |        |        |
Ordered Linked List   |        |        |
Binary Tree           |        |        |
Binary Search Tree    |        |        |
Hash Table            |        |        |

Page 13

Summary & Discussion

• In using a hash table, let the nature of the data and the required operations help you decide between chaining and open addressing.

• Hash functions must usually be custom-designed for the kind of keys used for accessing the hash table.

• Recall from the analysis of hashing that some collisions will almost inevitably occur.

• For open addressing, clustering is unlikely to be a problem until the hash table is more than half full.