Computer notes - Hashing

Class No.35 Data Structures http://ecomputernotes.com


Page 1: Computer notes - Hashing

Class No.35

Data Structures


Page 2: Computer notes - Hashing

Skip List: Implementation

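[Figure: a skip list drawn as four lists, S0 (bottom) to S3 (top). S0 holds the keys 12, 23, 34, 45; the higher lists hold the subsets 23, 34 and 34.]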


Page 3: Computer notes - Hashing

Implementation: TowerNode

A TowerNode will have an array of next pointers. The actual number of next pointers will be decided by the random procedure. Define MAXLEVEL as an upper limit on the number of levels in a node.
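A minimal sketch of such a node in C follows. The value of MAXLEVEL, the int key type, the fair coin flip used as the random procedure, and the helper names random_level and tower_node_new are all assumptions made for illustration; the slide does not specify them.

```c
#include <assert.h>
#include <stdlib.h>

#define MAXLEVEL 16   /* assumed upper limit; the slide gives no value */

typedef struct TowerNode {
    int key;                            /* assumed key type */
    int level;                          /* next pointers in use: 1..MAXLEVEL */
    struct TowerNode *next[MAXLEVEL];   /* next[i] is the successor in list S_i */
} TowerNode;

/* Random procedure (assumed here to be fair coin flips): keep adding a
   level while the coin comes up heads, stopping at MAXLEVEL. */
int random_level(void) {
    int level = 1;
    while (level < MAXLEVEL && (rand() % 2) == 1)
        level++;
    return level;
}

/* Allocate a node for the given key with a randomly chosen level.
   All next pointers start as NULL. Returns NULL if allocation fails. */
TowerNode *tower_node_new(int key) {
    TowerNode *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->key = key;
    node->level = random_level();
    assert(node->level >= 1 && node->level <= MAXLEVEL);
    for (int i = 0; i < MAXLEVEL; i++)
        node->next[i] = NULL;
    return node;
}
```

Every node here reserves MAXLEVEL pointers even when it uses fewer; allocating only level pointers per node would save space at the cost of a slightly more involved allocation.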

[Figure: tower nodes of varying height linked between head and tail, holding the keys 20, 26, 30, 40, 50, 57, 60.]


Page 4: Computer notes - Hashing

Implementation: QuadNode

A quad-node stores:

• item

• link to the node before

• link to the node after

• link to the node below

• link to the node above

This will require copying the key (item) at different levels.
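The quad-node can be sketched in C as a struct with one item and four links. The int item type and the helper names quad_node_new and quad_node_promote are hypothetical; the second helper only illustrates the copying of the item to a higher level and is not a full insertion routine.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct QuadNode {
    int item;                    /* assumed item type */
    struct QuadNode *before;     /* previous node on the same level */
    struct QuadNode *after;      /* next node on the same level */
    struct QuadNode *below;      /* same item, one level down */
    struct QuadNode *above;      /* same item, one level up */
} QuadNode;

/* Allocate an unlinked quad-node. Returns NULL if allocation fails. */
QuadNode *quad_node_new(int item) {
    QuadNode *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->item = item;
    node->before = node->after = node->below = node->above = NULL;
    return node;
}

/* Create a copy of lower's item one level up and link the two vertically.
   The before/after links of the new node are left for the caller to set. */
QuadNode *quad_node_promote(QuadNode *lower) {
    assert(lower != NULL);
    QuadNode *upper = quad_node_new(lower->item);
    if (upper == NULL)
        return NULL;
    upper->below = lower;
    lower->above = upper;
    return upper;
}
```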

[Figure: a quad-node holding the item x.]


Page 5: Computer notes - Hashing

Skip Lists with Quad Nodes

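[Figure: a skip list built from quad-nodes, with lists S0 (bottom) to S3 (top). S0 holds the keys 12, 23, 26, 31, 34, 44, 56, 64, 78; the higher lists hold the subsets 23, 31, 34, 64 and 31.]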


Page 6: Computer notes - Hashing

Performance of Skip Lists

In a skip list with n items:

• The expected space used is proportional to n.

• The expected search, insertion and deletion time is proportional to log n.

Skip lists are fast and simple to implement in practice.
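A short argument for the space bound, under the assumption (not stated on the slide) that the random procedure promotes a key to the next list with probability 1/2: a given key then appears in list $S_i$ with probability $1/2^i$, so the expected number of nodes over lists $S_0, \dots, S_h$ is

$$\sum_{i=0}^{h} \frac{n}{2^i} < 2n$$

which is proportional to n. Under the same assumption about n/2 keys are expected to survive each promotion, so about log n lists are expected, which is consistent with the log n search time.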


Page 7: Computer notes - Hashing

Implementation 5: AVL tree

An AVL tree, ordered by key:

• insert: a standard insert; O(log n)

• find: a standard find (without removing, of course); O(log n)

• remove: a standard remove; O(log n)

[Figure: a tree whose nodes each hold a (key, entry) pair, and so on.]


Page 8: Computer notes - Hashing

Anything better?

So far we have implementations of find, remove and insert whose running time varies between constant and log n.

It would be nice to have all three as constant time operations!


Page 9: Computer notes - Hashing

Implementation 6: Hashing

An array in which TableNodes are not stored consecutively.

Their place of storage is calculated using the key and a hash function.

Keys and entries are scattered throughout the array.

[Figure: a key is passed through the hash function to give an array index; (key, entry) pairs are stored at scattered indices such as 4, 10 and 123.]


Page 10: Computer notes - Hashing

Hashing

insert: calculate place of storage, insert TableNode; O(1)

find: calculate place of storage, retrieve entry; O(1)

remove: calculate place of storage, set it to null; O(1)

All are constant time, O(1)!

[Figure: (key, entry) pairs stored at array indices 4, 10 and 123.]
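The three operations can be sketched in C as below. This is only an illustration under several assumptions: the table size of 11, the string keys with int entries, the placeholder hash (a sum of character codes), and all function names are hypothetical. Like the slide, the sketch ignores what happens when two keys map to the same place of storage, so a second key simply overwrites the first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 11   /* assumed: a small prime table size */

typedef struct {
    const char *key;    /* NULL marks an empty slot */
    int entry;
} TableNode;

static TableNode table[TABLE_SIZE];   /* zero-initialized: all slots empty */

/* Placeholder hash: sum of the character codes modulo the table size. */
static size_t place_of_storage(const char *key) {
    assert(key != NULL);
    unsigned long sum = 0;
    for (; *key != '\0'; key++)
        sum += (unsigned char)*key;
    return (size_t)(sum % TABLE_SIZE);
}

/* insert: calculate place of storage, store the TableNode there. */
void table_insert(const char *key, int entry) {
    size_t i = place_of_storage(key);
    table[i].key = key;       /* no collision handling: overwrites the slot */
    table[i].entry = entry;
}

/* find: calculate place of storage, retrieve the entry.
   Returns NULL if the slot is empty or holds a different key. */
const int *table_find(const char *key) {
    size_t i = place_of_storage(key);
    if (table[i].key != NULL && strcmp(table[i].key, key) == 0)
        return &table[i].entry;
    return NULL;
}

/* remove: calculate place of storage, set it to null. */
void table_remove(const char *key) {
    size_t i = place_of_storage(key);
    if (table[i].key != NULL && strcmp(table[i].key, key) == 0)
        table[i].key = NULL;
}
```

Each operation does one hash computation and one array access, which is where the constant-time claim comes from, treating key length as bounded.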


Page 11: Computer notes - Hashing

Hashing

We use an array of some fixed size T to hold the data. T is typically prime.

Each key is mapped into some number in the range 0 to T-1 using a hash function, which ideally should be efficient to compute.


Page 12: Computer notes - Hashing

Example: fruits

Suppose our hash function gave us the following values:

hashCode("apple") = 5
hashCode("watermelon") = 3
hashCode("grapes") = 8
hashCode("cantaloupe") = 7
hashCode("kiwi") = 0
hashCode("strawberry") = 9
hashCode("mango") = 6
hashCode("banana") = 2

index   contents
0       kiwi
1
2       banana
3       watermelon
4
5       apple
6       mango
7       cantaloupe
8       grapes
9       strawberry


Page 13: Computer notes - Hashing

Example

Store data in a table array:

table[5] = "apple"
table[3] = "watermelon"
table[8] = "grapes"
table[7] = "cantaloupe"
table[0] = "kiwi"
table[9] = "strawberry"
table[6] = "mango"
table[2] = "banana"

[Figure: the same ten-slot table as on Page 12, indices 0 to 9, with slots 1 and 4 empty.]


Page 14: Computer notes - Hashing

Example

Associative array:

table["apple"]
table["watermelon"]
table["grapes"]
table["cantaloupe"]
table["kiwi"]
table["strawberry"]
table["mango"]
table["banana"]

[Figure: the same ten-slot table as on Page 12, indices 0 to 9, with slots 1 and 4 empty.]


Page 15: Computer notes - Hashing

Example Hash Functions

If the keys are strings the hash function is some function of the characters in the strings.

One possibility is to simply add the ASCII values of the characters:

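$$h(\mathit{str}) = \left( \sum_{i=0}^{\mathit{length}-1} \mathit{str}[i] \right) \% \; \mathit{TableSize}$$

Example:

$$h(\text{ABC}) = (65 + 66 + 67) \; \% \; \mathit{TableSize}$$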


Page 16: Computer notes - Hashing

Finding the hash function

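#include <string.h>

// TABLESIZE is assumed to be defined elsewhere as the size of the table.
int hashCode( const char* s ){
    int i, sum = 0;
    int len = (int)strlen(s);   // compute the length once, not on every pass
    for( i = 0; i < len; i++ )
        sum = sum + (unsigned char)s[i];   // ascii value; the cast keeps it non-negative
    return sum % TABLESIZE;
}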


Page 17: Computer notes - Hashing

Example Hash Functions

Another possibility is to convert the string into some number in some arbitrary base b (b also might be a prime number):

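$$h(\mathit{str}) = \left( \sum_{i=0}^{\mathit{length}-1} \mathit{str}[i] \times b^{i} \right) \% \; T$$

Example:

$$h(\text{ABC}) = (65\,b^{0} + 66\,b^{1} + 67\,b^{2}) \; \% \; T$$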
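A possible C version of this base-b hash is sketched below. The function name hash_base_b and its parameters are hypothetical. The modulus is applied at every step rather than once at the end, which gives the same result as the formula while keeping the intermediate values small; the sketch assumes T is small enough (below 65536) that the products cannot overflow an unsigned long.

```c
#include <assert.h>
#include <stddef.h>

/* h(str) = (sum over i of str[i] * b^i) % T, reduced modulo T at each step.
   Assumes 0 < T < 65536 so that no product overflows an unsigned long. */
unsigned long hash_base_b(const char *str, unsigned long b, unsigned long T) {
    assert(str != NULL);
    assert(T > 0 && T < 65536UL);
    unsigned long sum = 0;
    unsigned long power = 1 % T;              /* b^0 mod T */
    for (size_t i = 0; str[i] != '\0'; i++) {
        unsigned long c = (unsigned char)str[i] % T;
        sum = (sum + c * power) % T;          /* add str[i] * b^i */
        power = (power * (b % T)) % T;        /* advance to b^(i+1) */
    }
    return sum;
}
```

For instance, with the assumed values b = 31 and T = 101, the string "ABC" gives (65 + 66·31 + 67·961) % 101 = 66498 % 101 = 40. Unlike the plain sum of ASCII values, this hash depends on character order, so "ABC" and "CBA" generally land in different places.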


Page 18: Computer notes - Hashing

Example Hash Functions

If the keys are integers then key%T is generally a good hash function, unless the data has some undesirable features.

For example, if T = 10 and all keys end in zeros, then key%T = 0 for all keys.

In general, to avoid situations like this, T should be a prime number.
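The effect of the table size can be checked with a small C sketch. The helper names hash_int and distinct_slots, the sample keys, and the limit of 64 slots are assumptions chosen for illustration.

```c
#include <assert.h>

/* key % T for non-negative integer keys. */
unsigned hash_int(unsigned key, unsigned T) {
    assert(T > 0);
    return key % T;
}

/* Count how many different slots the n keys occupy in a table of size T.
   For this sketch T is limited to 64. */
unsigned distinct_slots(const unsigned *keys, unsigned n, unsigned T) {
    assert(T > 0 && T <= 64);
    int used[64] = {0};
    unsigned count = 0;
    for (unsigned i = 0; i < n; i++) {
        unsigned slot = hash_int(keys[i], T);
        if (!used[slot]) {
            used[slot] = 1;
            count++;
        }
    }
    return count;
}
```

With the keys 10, 20, 30, 40, 50, 60, 70, a table of size T = 10 puts every key in slot 0, so distinct_slots returns 1. With the prime T = 11 the same keys map to slots 10, 9, 8, 7, 6, 5, 4, so it returns 7.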
