Radix Search Tree

Radix search tree

By:Ajay kumar gupta

3rd cse

Radix Search tree

It is quite often the case that search keys are very long, consisting of many characters. In such a situation, the cost of comparing a search key for equality with a key from the data structure can be

a dominant cost which cannot be neglected. Digital tree searching uses such a comparison at each tree node; and it is possible in most cases to get by with only one comparison per search.

2

ExampleA 00001S 10011E 00101R 10010C 00011H 01000I 01001N 01110G 00111X 11000M 01101P 10000L 01100

A

P

X

M

E

G

C

L

H

NI

S

R

10

0

00

0

0

0

0

0

0

0

0

0

1

11

11

1

1

1

1

1

1

1

The idea is to not store keys in tree nodes at all, but rather to put all the keys in external nodes of the tree.

Thus, we have two types of nodes:

• internal nodes, which just contain links to other nodes,

• external nodes, which contain keys and no links.

4

To search for a key in such a structure, we just branch according to its bits, as above, but we don't compare it to anything until we get to an external node. Each key in the tree is stored in an external node on the path described by the leading bit pattern of the key and each search key winds up at one external node, so one full key comparison completes the search:

5

6

ExampleA 00001S 10011E 00101R 10010C 00011

A C

E

R S

0

0

0

0

0

0

0

1

1

1

1

1

1

1

1

0

Searching

1:Start from the root node and from the most significant bit

2:Go forward bit by bit on the trie until you find a leaf node

3:Check if the item is in the leaf node

7

Insertion

1:Find the place of the item by following bits

2:If there is nothing, just insert the item there as a

leaf node

3:If there is something on the leaf node, it becomes a new innernode. Build a new subtree or subtrees to that inner node depending how the item to be inserted and the item that was in the leaf node differs.

4:Create new leaf nodes where you store the item that was to be inserted and the item that was originally in the leaf node.

9

10

Example - insertionA 00001S 10011E 00101R 10010C 00011H 01000

A C

E

R S

H

0

0

0

0

0

0

0

1

1

1

1

1

1

1

1

0

External node - empty

11

Example - insertionA 00001S 10011E 00101R 10010C 00011H 01000I 01001

A C

E

R SH

0

0

0

0

0

0

0

1

1

1

1

1

1

1

1

0I

0

0

0

1

1

1

External node - occupied

Deletion

Remove the item Remove the leaf node where the item was If the nearest sister node is also a leaf node, shorten

the tree until the sister node differs only by one bit from some other branch of the tree.

12

13

Radix Search Trees - summary

• Program implementation - Necessity to maintain two types of nodes Low-level implementation

• Complexity: about logN bit comparisons in average case and b bit comparisons in the worst case in a tree built from N random b-bit keys.

Annoying feature: One-way branching for keys with a large number of common leading bits :

The number of the nodes may exceed the number of the keys.

On average – N/ln2 = 1.44N nodes

14

The following are the main advantages of tries over binary search trees (BSTs):

1:Looking up keys is faster. Looking up a key of length m takes worst case O(m) time. A BST takes O(log n) time, where n is the number of elements in the tree, because lookups depend on the depth of the tree, which is logarithmic in the number of keys. Also, the simple operations tries use during lookup, such as array indexing using a character, are fast on real machines.

2:Tries can require less space when they contain a large number of short strings, because the keys are not stored explicitly and nodes are shared between keys with common initial subsequences.

Advantages and drawbacks, relative to binary search tree

A trie can also be used to replace a hash table, over which it has the following advantages:

Looking up data in a trie is faster in the worst case, O(m) time, compared to an imperfect hash table. An imperfect hash table can have key collisions. A key collision is the hash function mapping of different keys to the same position in a hash table. The worst-case lookup speed in an imperfect hash table is O(N) time, but far more typically is O(1), with O(m) time spent evaluating the hash.

There are no collisions of different keys in a trie. Buckets in a trie which are analogous to hash table buckets that store

key collisions are only necessary if a single key is associated with more than one value.

There is no need to provide a hash function or to change hash functions as more keys are added to a trie.

A trie can provide an alphabetical ordering of the entries by key

15

Tries do have some drawbacks as well:

Tries can be slower in some cases than hash tables for looking up data, especially if the data is directly accessed on a hard disk drive or some other secondary storage device where the random access time is high compared to main memory.

It is not easy to represent all keys as strings, such as floating point numbers, which can have multiple string representations for the same floating point number, e.g. 1, 1.0, 1.00, +1.0, etc.

Tries are frequently less space-efficient than hash tables. Unlike hash tables, tries are generally not already available in

programming language toolkits. dependent on computer’s architecture – difficult to do

efficient implementations in some high-level languages

16

Application

Dictionary representation A common application of a trie is storing a dictionary,

such as one found on a mobile telephone. Such applications take advantage of a trie's ability to quickly search for, insert, and delete entries.

Tries are also well suited for implementing approximate matching algorithms, including those used in spell checking software.

17

Radix Search Tree

Documents

Transcript of Radix Search Tree