CS 2412 Data Structures
Chapter 10
Sorting and Searching
Some concepts
• Sorting is one of the most common data-processing
applications.
• Sorting algorithms are classed as either internal or external.
• Sorting order can be either ascending sequence or descending
sequence.
• Sort stability is an attribute of a sort, indicating that data with
equal keys maintain their relative input order in the output.
• Sort efficiency is usually measured by the number of comparisons
and moves required for the sorting. The best possible
comparison-based sorting algorithms are O(n log n).
• During the sorting process, each traversal of the data is
referred to as a sort pass.
Data Structure 2016 R. Wei 2
Selection sorts
• Heap sort: already discussed. First build a heap. Then
repeatedly remove the root of the heap, move the last element to
the root, and reheap down.
• Straight selection sort: in each pass of the selection sort, the
smallest element is selected from the unsorted sublist and
exchanged with the element at the beginning of the unsorted sublist.
Algorithm selectionSort (list, last)
  set current to 0
  loop (until last element sorted)
    set smallest to current
    set walker to current + 1
    loop (walker <= last)
      if (walker key < smallest key)
        set smallest to walker
      end if
      increment walker
    end loop
    exchange (current, smallest)
    increment current
  end loop
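A minimal C sketch of the straight selection sort described above. It assumes an array of int keys and, as in the pseudocode, that last is the index of the final element; the inner loop scans the whole unsorted sublist before the exchange.

```c
/* Straight selection sort on list[0..last], ascending order. */
void selectionSort(int list[], int last)
{
    for (int current = 0; current < last; current++) {
        /* Find the smallest key in the unsorted sublist. */
        int smallest = current;
        for (int walker = current + 1; walker <= last; walker++) {
            if (list[walker] < list[smallest])
                smallest = walker;
        }
        /* Exchange it with the first element of the unsorted sublist. */
        int temp = list[current];
        list[current] = list[smallest];
        list[smallest] = temp;
    }
}
```

For a six-element array the call would be selectionSort(a, 5).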
The efficiency of selection sort
• Straight selection sort: O(n^2). The algorithm has two nested
loops, each of which executes about n times.
• Heap sort: O(n log n). Building the heap needs about n log n
loop iterations. Sorting from the heap needs about another n log n.
In big-O notation, the complexity is O(n log n).
Insertion sorts
• Straight insertion sort: the list is divided into sorted and
unsorted sublists. In each pass the first element of the unsorted
sublist is inserted into the sorted sublist at its correct position.
• Shell sort: the list is divided into K segments and each
segment is sorted (the segments are dispersed through the
list). After each pass, the number of segments is reduced
according to an increment. When the number of segments is
reduced to 1, the list is sorted.
Algorithm insertionSort (list, last)
  set current to 1
  loop (until last element sorted)
    move current element to hold
    set walker to current - 1
    loop (walker >= 0 AND hold key < walker key)
      move walker element right one element
      decrement walker
    end loop
    move hold to walker + 1 element
    increment current
  end loop
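The insertionSort pseudocode translates almost line for line into C. The following is a sketch for an array of int keys, where last is assumed to be the index of the final element.

```c
/* Straight insertion sort on list[0..last], ascending order. */
void insertionSort(int list[], int last)
{
    for (int current = 1; current <= last; current++) {
        int hold = list[current];      /* first element of the unsorted sublist */
        int walker = current - 1;
        /* Shift larger elements of the sorted sublist one place right. */
        while (walker >= 0 && hold < list[walker]) {
            list[walker + 1] = list[walker];
            walker--;
        }
        list[walker + 1] = hold;       /* insert at the correct position */
    }
}
```

Because the comparison is a strict "less than", elements with equal keys keep their relative order, so this sketch is stable.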
The main idea of the Shell sort is to divide the list into segments
and use insertion sort to sort each segment.
The elements of a segment are spaced one increment apart in the
list. In the following example, the list is of size 10. The 5
segments for increment K = 5 are as follows:
Segment 1. A[0], A[5]
Segment 2. A[1], A[6]
Segment 3. A[2], A[7]
Segment 4. A[3], A[8]
Segment 5. A[4], A[9]
Then for increment K = 2 the 2 segments are:
Segment 1. A[0], A[2], A[4], A[6], A[8]
Segment 2. A[1], A[3], A[5], A[7], A[9]
Algorithm shellSort (list, last)
  set incre to (last + 1) / 2
  loop (incre not 0)
    set current to incre
    loop (until last element sorted)
      move current element to hold
      set walker to current - incre
      loop (walker >= 0 AND hold key < walker key)
        move walker element one increment right
        set walker to walker - incre
      end loop
      move hold to walker + incre element
      increment current
    end loop
    set incre to incre / 2
  end loop
void shellSort (int list [], int last)
{
    int hold;
    int incre;
    int walker;

    // Start from half the list size n = last + 1. Using last / 2
    // would give 0 for a two-element list and leave it unsorted.
    incre = (last + 1) / 2;
    while (incre != 0)
    {
        for (int curr = incre; curr <= last; curr++)
        {
            hold   = list [curr];
            walker = curr - incre;
            while (walker >= 0 && hold < list [walker])
            {
                // Move the larger element one increment right
                list [walker + incre] = list [walker];
                walker = ( walker - incre );
            } // while
            list [walker + incre] = hold;
        } // for curr
        incre = incre / 2;
    } // while
    return;
} // shellSort
Note
In the above algorithm, the increment starts from n/2 and is then
halved after each pass. This is not the most efficient choice, but
it is simple. Ideally the increments should be set so that no two
elements appear in the same segment more than once, but this is not
easy to achieve in general.
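As an illustration of a different increment choice, the sketch below uses the sequence 1, 4, 13, 40, ... (each increment is three times the previous one plus one), which is commonly attributed to Knuth. This sequence is not taken from these slides, and it is not claimed to be the ideal sequence described in the note; the function name shellSortKnuth is hypothetical.

```c
/* Shell sort on list[0..last] with increments ..., 40, 13, 4, 1.
   The increment sequence is an assumption, not the slides' method. */
void shellSortKnuth(int list[], int last)
{
    int n = last + 1;
    int incre = 1;

    /* Largest increment of the form 3h + 1 that is still useful. */
    while (incre < n / 3)
        incre = 3 * incre + 1;

    while (incre >= 1) {
        /* Insertion sort over each segment of elements incre apart. */
        for (int curr = incre; curr <= last; curr++) {
            int hold = list[curr];
            int walker = curr - incre;
            while (walker >= 0 && hold < list[walker]) {
                list[walker + incre] = list[walker];
                walker -= incre;
            }
            list[walker + incre] = hold;
        }
        incre /= 3;   /* 13 -> 4 -> 1 -> 0 with integer division */
    }
}
```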
Insertion sort efficiency:
• Straight insertion sort: O(n^2). The algorithm has two
nested loops. The number of executions is about n(n + 1)/2.
• Shell sort: the complexity is difficult to analyze.
Empirical studies show that the average sort complexity is
about O(n^1.25).
Exchange sorts
• Bubble sort: the list is divided into two sublists: sorted and
unsorted. In each pass the smallest element is bubbled from the
unsorted sublist to the sorted sublist.
• Quick sort: each time a pivot is selected. Then the elements
less than the pivot and the elements greater than or equal to the
pivot are separated into two sublists. The pivot is put at its
ultimately correct location in the list.
Example (bubble sort; ∥ separates the sorted sublist from the unsorted sublist):
23 78 45 8 56 32
8 ∥23 78 45 32 56
8 23 ∥32 78 45 56
8 23 32 ∥45 78 56
8 23 32 45 ∥56 78
Algorithm bubbleSort (list, last)
  set current to 0
  set sorted to false
  loop (current <= last AND sorted false)
    set walker to last
    set sorted to true
    loop (walker > current)
      if (walker data < walker - 1 data)
        set sorted to false
        exchange (list, walker, walker - 1)
      end if
      decrement walker
    end loop
    increment current
  end loop
Note for quick sort
• There are different methods for selecting the pivot.
– Select the first element.
– Select the middle element.
– Select the median value of three elements: left, right and
the element in the middle of the list. This text uses this
method.
• When the partition becomes small, a straight insertion sort can
be used, which may be more efficient.
[Figure: example of one pass of a quick sort]
Algorithm medianLeft (sortData, left, right)
  set mid to (left + right) / 2
  if (left key > mid key)
    exchange (sortData, left, mid)
  end if
  if (left key > right key)
    exchange (sortData, left, right)
  end if
  if (mid key > right key)
    exchange (sortData, mid, right)
  end if
  exchange (sortData, left, mid) // put pivot in left
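The sketch below shows how medianLeft might be combined with a partition step to form a complete quick sort in C. Only medianLeft follows the pseudocode; the partition loop, the helper swapInts, and the names sortLeft and sortRight are assumptions made for this illustration, and the switch to insertion sort for small partitions is omitted.

```c
static void swapInts(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

/* Afterwards the median of the left, middle and right keys is at left. */
static void medianLeft(int sortData[], int left, int right)
{
    int mid = left + (right - left) / 2;
    if (sortData[left] > sortData[mid])
        swapInts(&sortData[left], &sortData[mid]);
    if (sortData[left] > sortData[right])
        swapInts(&sortData[left], &sortData[right]);
    if (sortData[mid] > sortData[right])
        swapInts(&sortData[mid], &sortData[right]);
    swapInts(&sortData[left], &sortData[mid]);   /* put pivot in left */
}

/* Quick sort on sortData[left..right], ascending order. */
void quickSort(int sortData[], int left, int right)
{
    if (left >= right)
        return;

    medianLeft(sortData, left, right);
    int pivot = sortData[left];
    int sortLeft = left + 1;
    int sortRight = right;

    /* Keys < pivot end up on the left, keys >= pivot on the right. */
    while (sortLeft <= sortRight) {
        while (sortLeft <= sortRight && sortData[sortLeft] < pivot)
            sortLeft++;
        while (sortLeft <= sortRight && sortData[sortRight] >= pivot)
            sortRight--;
        if (sortLeft < sortRight) {
            swapInts(&sortData[sortLeft], &sortData[sortRight]);
            sortLeft++;
            sortRight--;
        }
    }

    /* sortRight is now the pivot's final location. */
    swapInts(&sortData[left], &sortData[sortRight]);
    quickSort(sortData, left, sortRight - 1);
    quickSort(sortData, sortRight + 1, right);
}
```

To sort a whole array of n elements, the call would be quickSort(a, 0, n - 1).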
The list in Figure 12-15 is sorted as follows:
The exchange sort efficiency:
• Bubble sort: O(n^2). There are two nested loops in the algorithm.
The number of comparisons is about n(n + 1)/2.
• Quick sort: O(n log n) on average. The algorithm has 5 loops.
However, in each pass a partition is generally about half the size
of the partition in the previous pass. Roughly speaking, there are
log2 n passes in total, each examining about n elements.
void bubbleSort (int list [], int last)
{
    int temp;
    int sorted = 0;   // declared once, so the inner loop cannot shadow it

    for (int current = 0; current <= last && !sorted; current++)
    {
        sorted = 1;   // assume sorted until an exchange occurs
        for (int walker = last; walker > current; walker--)
            if (list[walker] < list[walker - 1])
            {
                sorted = 0;
                temp             = list[walker];
                list[walker]     = list[walker - 1];
                list[walker - 1] = temp;
            } // if
    } // for current
    return;
} // bubbleSort
External sorts
In external sorting, portions of the data may be stored in secondary
memory during the sorting process.
One important method for external sorting is to merge (sorted)
files into one sorted file.
Merge sorts
A simple merge combines two sorted files into one sorted file. For
example, we have two sorted lists:
• 1, 3, 5
• 2, 4, 6, 8, 10
After merging these two lists, we should obtain the following list:
1, 2, 3, 4, 5, 6, 8, 10.
The following algorithm merges two sorted files, file1 and file2.
The combined data are written into file3.
Algorithm mergeFiles
  open files
  read (file1 into record1)
  read (file2 into record2)
  loop (not end file1 or not end file2)
    if (record1.key <= record2.key)
      write (record1 to file3)
      read (file1 into record1)
      if (end of file1)
        set record1.key to infinity
      end if
    else
      write (record2 to file3)
      read (file2 into record2)
      if (end of file2)
        set record2.key to infinity
      end if
    end if
  end loop
  close files
end mergeFiles
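An in-memory analogue of mergeFiles can make the control flow easier to follow. The sketch below merges two sorted int arrays instead of files; rather than setting a key to infinity, it checks directly whether an input is exhausted. The function name mergeSorted and its parameters are hypothetical.

```c
#include <stddef.h>

/* Merge sorted a[0..na-1] and b[0..nb-1] into out, which must have
   room for na + nb elements. Returns the number of elements written. */
size_t mergeSorted(const int a[], size_t na,
                   const int b[], size_t nb, int out[])
{
    size_t i = 0, j = 0, k = 0;

    while (i < na || j < nb) {
        /* Take from a when b is exhausted or a's key is not larger. */
        if (j >= nb || (i < na && a[i] <= b[j]))
            out[k++] = a[i++];
        else
            out[k++] = b[j++];
    }
    return k;
}
```

Taking from the first input on equal keys keeps the merge stable.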
Merge unsorted files:
• Form merge runs in the files. Each run is ordered.
• The end of each run is identified by a stepdown (a key that is
smaller than the key before it).
• Merge the corresponding runs of the two files.
• When one run reaches its stepdown, the rest of the other run is
rolled out (copied to the merged file).
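To make the idea of a stepdown concrete, the following sketch counts the merge runs in an array by looking for keys that are smaller than their predecessor. It works on an in-memory array rather than a file, and the name countMergeRuns is hypothetical.

```c
#include <stddef.h>

/* Count the ordered runs in data[0..n-1]; a run ends at each stepdown. */
size_t countMergeRuns(const int data[], size_t n)
{
    if (n == 0)
        return 0;

    size_t runs = 1;
    for (size_t i = 1; i < n; i++) {
        if (data[i] < data[i - 1])   /* stepdown: a new run starts here */
            runs++;
    }
    return runs;
}
```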
The sorting process:
• Sort phase: divide the file into merge files according to the size
of memory. For example, suppose we have 2300 records, but the
memory can handle only 500 records. We first read in 500
records and sort them as the first run of merge file 1. Then we read
and sort records 501-1000 as the first run of merge file 2, etc.
This gives five runs, the last holding 300 records.
• Merge phase: merge the sorted runs.
There are different merge concepts. We discuss 3 of them as
examples:
• Natural merge: after each merge, all data are written to one file,
and a distribution phase is needed to redistribute the data to two
files.
• Balanced merge: uses a constant number of input merge files and
the same number of output merge files.
• Polyphase merge: a constant number of input merge files are
merged to one output merge file, and the input merge files are
immediately reused when their input has been completely
merged.
Searching
• Binary search: for sorted lists.
• Sequential search:
– Straight sequential search: each step checks whether the key
equals the target AND whether it is the last key.
– Sentinel sequential search: add the target at the end of the
list so that each step only checks whether the key equals the
target.
– Probability search: when a target is found, move the
element containing the target up one location. In this way, the
most frequent targets become easier to find.
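The search variants listed above can be sketched in C as follows. The sketches assume an array of int keys where last is the index of the final element, and they return the index of the target or -1 when it is not found; these conventions and the function names are assumptions made for the illustration.

```c
/* Sentinel sequential search. The array must have one spare slot at
   index last + 1, which is overwritten with the target. */
int sentinelSearch(int list[], int last, int target)
{
    list[last + 1] = target;          /* the sentinel stops the loop */
    int looker = 0;
    while (list[looker] != target)    /* only one test per step */
        looker++;
    return (looker <= last) ? looker : -1;
}

/* Probability search: on a hit, move the element up one location. */
int probabilitySearch(int list[], int last, int target)
{
    for (int looker = 0; looker <= last; looker++) {
        if (list[looker] == target) {
            if (looker == 0)
                return 0;
            int temp = list[looker - 1];
            list[looker - 1] = list[looker];
            list[looker] = temp;
            return looker - 1;        /* new location of the target */
        }
    }
    return -1;
}

/* Binary search on a list sorted in ascending order. */
int binarySearch(const int list[], int last, int target)
{
    int begin = 0;
    int end = last;
    while (begin <= end) {
        int mid = begin + (end - begin) / 2;
        if (target > list[mid])
            begin = mid + 1;
        else if (target < list[mid])
            end = mid - 1;
        else
            return mid;
    }
    return -1;
}
```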
Hashed list searches
• Hashing is a method that uses a key-to-address mapping to find
the data quickly.
• The basic idea is to use a hash function to map a key (which comes
from a large range) to an index (which lies in a small range) of
the data.
• Some keys may be mapped to the same index (synonyms). Then
we need some method to resolve the collision.
• The main task in hashing is to find good hashing methods.
Hashing methods:
• Direct method: the range of keys and the range of index are
the same.
• Subtraction method: subtract a fixed number from the key.
This also requires that both ranges are the same size.
• Modulo-division method: index = key MODULO listSize
• Digit-extraction method: select digits at certain positions as
the index.
• Midsquare method: key is squared and the middle digits are
used as index.
• Folding method: there are two variants. Fold shift: the key is
divided into parts whose size matches the size of the index, then
the left and right parts are shifted and added to the middle part.
Fold boundary: the left and right parts are folded on a fixed
boundary between them and the center part, so the digits of the two
outside parts are reversed before adding.
• Rotation method: rotate the last character to the front of the
key. Usually used in combination with other methods.
• Pseudorandom method: the key is used as the seed in a
pseudorandom number generator; the resulting random number
is then scaled into the possible index range.
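Three of the hashing methods above are sketched below in C for non-negative integer keys. The digit counts are fixed for illustration (a 4-digit key for midsquare, 3-digit parts for fold shift); these sizes and the function names are assumptions, not part of the slides.

```c
/* Modulo-division method: index = key MODULO listSize. */
unsigned hashModulo(unsigned long key, unsigned listSize)
{
    return (unsigned)(key % listSize);
}

/* Midsquare method for a 4-digit key: square the key and take the
   middle four digits of the (up to) 8-digit result. */
unsigned hashMidsquare4(unsigned key)
{
    unsigned long square = (unsigned long)key * key;
    return (unsigned)((square / 100) % 10000);
}

/* Fold shift: split the decimal key into 3-digit parts, add the
   parts, and keep the low three digits of the sum. */
unsigned hashFoldShift3(unsigned long key)
{
    unsigned long sum = 0;
    while (key > 0) {
        sum += key % 1000;
        key /= 1000;
    }
    return (unsigned)(sum % 1000);
}
```

For example, 9452 squared is 89340304, so hashMidsquare4(9452) yields 3403, and the key 123456789 folds to 123 + 456 + 789 = 1368, of which the low three digits 368 are kept.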
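Some concepts used in collision resolution methods:
• Load factor: the number of elements in the list divided by the
number of physical elements allocated for the list, expressed as a
percentage (preferably less than 75):
$\alpha = \frac{k}{n} \times 100$,
where k is the number of filled elements and n is the number of
elements allocated.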
• Clustering: as data are added to a list and collisions are
resolved, some hashing algorithms tend to cause data to group
within the list.
Open addressing to resolve collisions (disadvantage: each collision
resolution increases the probability of future collisions).
• Linear probe: when data cannot be stored in the home address,
we resolve the collision by adding 1 to the current address.
• Quadratic probe: the increment is the collision probe number
squared.
• Pseudorandom collision resolution (double hashing): use a
pseudorandom number to resolve the collision. The collision
address is used as the key of the pseudorandom generator.
• Key offset (double hashing): calculate the new address as a
function of the old address and the key.
For example:
offSet = key / listSize
address = (offSet + old address) modulo listSize
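The open addressing rules above can be written as small address-calculation functions. The sketch below assumes non-negative integer keys, a table of int slots, and a marker value EMPTY_SLOT for unused slots; insertLinear, which stores a key using the modulo-division hash and a linear probe, is a hypothetical helper added for illustration.

```c
#define EMPTY_SLOT (-1)   /* assumed marker for an unused slot */

/* Linear probe: add 1 to the current address, wrapping around. */
int linearProbe(int address, int listSize)
{
    return (address + 1) % listSize;
}

/* Quadratic probe: the increment is the probe number squared. */
int quadraticProbe(int address, int probe, int listSize)
{
    return (address + probe * probe) % listSize;
}

/* Key offset: the new address depends on the old address and the key. */
int keyOffset(int key, int address, int listSize)
{
    int offSet = key / listSize;
    return (offSet + address) % listSize;
}

/* Insert key using modulo-division hashing and a linear probe.
   Returns the address used, or -1 when the table is full. */
int insertLinear(int table[], int listSize, int key)
{
    int address = key % listSize;     /* home address */
    for (int probes = 0; probes < listSize; probes++) {
        if (table[address] == EMPTY_SLOT) {
            table[address] = key;
            return address;
        }
        address = linearProbe(address, listSize);
    }
    return -1;
}
```

With listSize 307, key 166702 and old address 1, keyOffset computes offSet = 543 and returns (543 + 1) modulo 307 = 237.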
Linked list collision resolution: uses a separate area to store
collisions and chains all synonyms together in a linked list (usually
in LIFO sequence). Two storage areas are used: the prime area and
the overflow area.
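A simplified C sketch of linked list collision resolution follows. As a simplifying assumption, the prime area here stores only the head pointer of each chain, and every element lives in a node allocated from the heap (standing in for the overflow area). New synonyms are added at the front of the chain, which gives the LIFO sequence. The names NODE, chainInsert and chainSearch are hypothetical, and keys are assumed to be non-negative.

```c
#include <stdlib.h>

typedef struct node {
    int key;
    struct node *next;
} NODE;

/* Insert key at the front of its chain (LIFO). Returns 1 on success,
   0 when memory cannot be allocated. */
int chainInsert(NODE *table[], int listSize, int key)
{
    NODE *newNode = malloc(sizeof *newNode);
    if (newNode == NULL)
        return 0;

    int address = key % listSize;      /* home address */
    newNode->key = key;
    newNode->next = table[address];    /* older synonyms follow */
    table[address] = newNode;
    return 1;
}

/* Return the node holding key, or NULL when key is not in the list. */
NODE *chainSearch(NODE *table[], int listSize, int key)
{
    for (NODE *p = table[key % listSize]; p != NULL; p = p->next) {
        if (p->key == key)
            return p;
    }
    return NULL;
}
```

The table must start with every head pointer set to NULL, for example NODE *table[7] = {0};.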
Bucket hashing: keys are hashed to buckets, which are nodes that
accommodate multiple data occurrences. (Disadvantages: it uses more
space, much of which may be empty, and a collision still occurs when
a bucket is full.)
Combination approaches may be used: for example,
bucket hashing first, then a linear probe if the bucket is full.
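The combination of bucket hashing with a linear probe might look like the following sketch. The bucket capacity BUCKET_SIZE, the BUCKET type and the function bucketInsert are assumptions made for this example; keys are non-negative integers hashed with the modulo-division method.

```c
#define BUCKET_SIZE 3   /* assumed capacity of each bucket */

typedef struct {
    int keys[BUCKET_SIZE];
    int count;          /* number of keys currently stored */
} BUCKET;

/* Store key in its home bucket, or in the next bucket with free
   space (linear probe over buckets). Returns the bucket number used,
   or -1 when every bucket is full. */
int bucketInsert(BUCKET table[], int numBuckets, int key)
{
    int home = key % numBuckets;
    for (int probes = 0; probes < numBuckets; probes++) {
        int address = (home + probes) % numBuckets;
        BUCKET *bucket = &table[address];
        if (bucket->count < BUCKET_SIZE) {
            bucket->keys[bucket->count++] = key;
            return address;
        }
    }
    return -1;
}
```

A table declared as BUCKET table[4] = {0}; starts with all buckets empty.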