240-222 CPT: Search/17 1
240-222 Computer Programming Techniques
Semester 1, 1998
Objectives of these slides:
– to discuss searching: its implementation, and some complexity analyses
17. Searching
Overview:
1. Searching Definition
2. External/Internal Searching
3. Simplifying the Search
4. Analysing Searching
5. Linear Search
6. Binary Search
7. Comparison of Searching Algorithms
1. Searching Definition
We are given a collection of n elements: a_0, a_1, ..., a_(n-1)
Each a_i consists of two parts: a unique ID, or key, and some data.
Searching is the process of finding an a_i such that its key equals some specified key value, k.
2. External/Internal Searching
Searching algorithms fall into two broad categories:
– external searching (large amount of data on disk)
– internal searching (small amount of data in memory)
2.1. External Searching
The data is too large to be all loaded into memory at once.
Scan all the data files to find the requested item.
Try to minimise the number of blocks read.
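To make the block-reading idea concrete, here is a minimal sketch in C. It assumes the data is a binary file of int keys; the names BLOCK_INTS and file_search and the blocks_read counter are hypothetical and do not come from the slides. The function loads one block at a time and stops as soon as the key is found, so later blocks are never read.

```c
#include <stdio.h>

#define BLOCK_INTS 256   /* assumed block size, measured in ints */

/* Scan a binary stream of ints one block at a time.
   Returns the position (counted in ints) of key, or -1 if it is absent.
   *blocks_read reports how many blocks were loaded. */
long file_search(FILE *fp, int key, long *blocks_read)
{
    int block[BLOCK_INTS];
    long base = 0;            /* stream position of block[0] */
    size_t got, i;

    *blocks_read = 0;
    while ((got = fread(block, sizeof block[0], BLOCK_INTS, fp)) > 0) {
        (*blocks_read)++;
        for (i = 0; i < got; i++)
            if (block[i] == key)
                return base + (long)i;
        base += (long)got;
    }
    return -1;
}
```

With unsorted data, an unsuccessful search still has to read every block; reducing the block count further needs an index or sorted files, which this sketch does not attempt.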
2.2. Internal Searching
Traverse the data structure holding the items.
Arrays, lists and trees can be used as data structures.
Try to minimise the number of key comparisons.
3. Simplifying the Search
3.1. Internal Search (arrays)
3.2. Simplified Search Data Structure
3.3. Search Function Interface
3.4. Search Driver
3.1. Internal Search (arrays)
Array-based search data structure:

#define SIZE 100

typedef ?? data_item;    /* e.g. string */

struct elem {
    int key;
    data_item s;
};

struct elem a[SIZE];
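As an illustration of searching this structure by key, the sketch below fills in the unspecified data_item with a short character array. That choice is an assumption, and find_by_key is a hypothetical helper that does not appear in the slides.

```c
#include <stddef.h>

typedef char data_item[16];   /* assumption: the data is a short string */

struct elem {
    int key;
    data_item s;
};

/* Return a pointer to the element whose key equals k,
   or NULL if no element has that key. */
struct elem *find_by_key(struct elem a[], int n, int k)
{
    int i;

    for (i = 0; i < n; i++)
        if (a[i].key == k)
            return &a[i];
    return NULL;
}
```

Only the key field is compared; the data travels along with the element that is found.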
Pictorially:

index:      a[0]  a[1]  a[2]  a[3]  a[4]  . . .
key:          5    12     4    45     6    11  . . .
data item:  jim  jane   bob   ann  jane   jim  . . .
3.2. Simplified Search Data Structure
Remove keys
Represent the data items as integers
Each integer (data item) is unique: no duplicates
– can use data items as keys
e.g.
  12    5    7    1    4   17  . . .
 a[0] a[1] a[2] a[3]  . . .
3.3. Search Function Interface
int search(int array[], int key, int size)
{
    /* ... */
}
The function examines the array looking for an item containing key.
– Success: return array index
– Failure: return -1
3.4. Search Driver
/* Search an integer array */
#include <stdio.h>
#define SIZE 100

int search(int [], int, int);

int main(void)
{
    int a[SIZE], x, searchkey, index;

    for (x = 0; x < SIZE; x++)
        a[x] = 2 * x;

    printf("Enter integer search key:\n");
    scanf("%d", &searchkey);

    index = search(a, searchkey, SIZE);
    if (index != -1)
        printf("Array index %d\n", index);
    else
        printf("Value not found\n");
    return 0;
}

int search(int array[], int key, int size)
{
    /* search implementation */
}
4. Analysing Search
Searching algorithms should be both space and time efficient.
With internal searching, the critical factor is the number of key comparisons, C.
For each searching algorithm, consider its best, worst and average case performance in terms of C.
5. Linear Search
5.1. Linear Search Algorithm
5.2. Linear Search Function
5.3. Linear Search Program
5.4. Analysis of Linear Search
5.5. Sorted Linear Search
5.6. Recursive Linear Search
5.1. Linear Search Algorithm
Move along the data structure, comparing the search key, k, to the key value of the current item until:
– a match is found, or
– all of the data structure has been considered
5.2. Linear Search Function
int linear_search(int array[], int key, int size)
{
    int n;

    for (n = 0; n < size; n++)
        if (array[n] == key)
            return n;
    return -1;
}
5.3. Linear Search Program (Fig 6.18)
/* Linear search of an array */
#include <stdio.h>
#define SIZE 100

int linear_search(int [], int, int);

int main(void)
{
    int a[SIZE], x, searchkey, index;

    for (x = 0; x < SIZE; x++)
        a[x] = 2 * x;

    printf("Enter integer search key:\n");
    scanf("%d", &searchkey);

    index = linear_search(a, searchkey, SIZE);

    if (index != -1)
        printf("Array index %d\n", index);
    else
        printf("Value not found\n");
    return 0;
}
Execution
Enter integer search key:
36
Array index 18

Enter integer search key:
37
Value not found
5.4. Analysis of Linear Search
Consider an array with n items.
In the best case, find a match at the start of the array:
Cmin = 1
In the worst case, all items are examined:
Cmax = n
In the average case, about half of the items are considered:
Caverage ≈ n/2 = O(n)
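These three cases can be checked by counting comparisons directly. The sketch below is an instrumented variant of linear search; counted_linear_search and average_comparisons are hypothetical names, not from the slides. It assumes each item is searched for equally often, in which case the average over successful searches comes out as (n+1)/2, which is about n/2.

```c
/* Linear search that also reports the number of key comparisons in *count. */
int counted_linear_search(const int array[], int key, int size, int *count)
{
    int n;

    *count = 0;
    for (n = 0; n < size; n++) {
        (*count)++;               /* one key comparison per item visited */
        if (array[n] == key)
            return n;
    }
    return -1;
}

/* Average comparisons over one successful search for each item. */
double average_comparisons(const int array[], int size)
{
    long total = 0;
    int n, count;

    for (n = 0; n < size; n++) {
        counted_linear_search(array, array[n], size, &count);
        total += count;
    }
    return (double)total / size;
}
```

For the 100-item array used by the driver, the best case costs 1 comparison, a miss costs 100, and the average over hits is 50.5.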
Meaning of O()
Read O() as “about”: constant factors and lower-order terms are ignored, so only the dominant growth in n matters.
For example:
– 5n + 2 = O(n)
– 5n^2 + n = O(n^2)
– 6 = O(1) /* constant */
O() is useful for giving rough estimates.
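As a worked example of why the lower-order term can be dropped, the bound below uses the usual "at most a constant multiple" reading of O(), which the slides do not state formally:

```latex
5n^2 + n \;\le\; 5n^2 + n^2 \;=\; 6n^2 \quad \text{for all } n \ge 1,
\qquad \text{so } 5n^2 + n = O(n^2).
```

The constant 6 is then ignored in the same way as the 5 in 5n + 2.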
5.5. Sorted Linear Search
int sl_search(int array[], int key, int size)
{
    int n;

    for (n = 0; n < size; n++)
        if (array[n] == key)
            return n;
        else if (array[n] > key)
            return -1;      /* larger key found */
    return -1;
}
The array must be sorted in ascending order. The function runs faster on some searches, since an unsuccessful search can stop as soon as a larger key is found, but the average complexity remains O(n).
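A small sketch of the saving, assuming an ascending array: sorted_items_visited is a hypothetical helper, not from the slides, that reports how many items a sorted linear search looks at before it stops.

```c
/* Count how many items a sorted linear search visits before stopping.
   Assumes array[] is sorted in ascending order. */
int sorted_items_visited(const int array[], int key, int size)
{
    int n;

    for (n = 0; n < size; n++)
        if (array[n] >= key)      /* match or larger key: stop here */
            return n + 1;
    return size;                  /* key is larger than every item */
}
```

In the driver's array (a[x] = 2 * x, 100 items), a search for the missing key 37 stops at a[19] = 38 after visiting 20 items instead of 100. A key larger than every item still visits all 100, which is why the complexity stays O(n).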
5.6. Recursive Linear Search
int rl_search(int array[], int key, int index, int size)
{
    if (index >= size)
        return -1;
    if (array[index] == key)
        return index;
    else
        return( rl_search(array, key, index+1, size) );
}
Call from main():

element = rl_search(a, searchkey, 0, SIZE);    /* 0 is initial index */
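Since every caller has to pass 0 as the starting index, one option is to hide it behind a wrapper that has the same interface as search(). This is only a sketch; rl_search_from_start is a hypothetical name, not from the slides.

```c
/* Recursive linear search, as in the slides: look at array[index],
   then recurse on the rest of the array. */
static int rl_search(const int array[], int key, int index, int size)
{
    if (index >= size)
        return -1;
    if (array[index] == key)
        return index;
    return rl_search(array, key, index + 1, size);
}

/* Hypothetical wrapper: same interface as search(), starts at index 0. */
int rl_search_from_start(const int array[], int key, int size)
{
    return rl_search(array, key, 0, size);
}
```

Unless the compiler turns the tail call into a loop, the recursion uses one stack frame per item examined, so the iterative version is usually the safer choice for large arrays.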
6. Binary Search
6.1. Binary Search Algorithm
6.2. Execution of Binary Search
6.3. Binary Search Function
6.4. Analysis of Binary Search
6.5. Recursive Binary Search Function
6.1. Binary Search Algorithm
A divide-and-conquer algorithm; the array must be sorted by key.
Search for key value k:
1. Take the middle item of the array segment: a_mid
2. If k == a_mid's key, then the search is successful
3. Otherwise, if the range of array items under consideration is empty, then the search has failed
4. If k < a_mid's key, then restrict the search to the lower half of the array and go to step 1
5. If k > a_mid's key, then restrict the search to the upper half of the array and go to step 1
6.2. Execution of Binary Search
Searching for 15 in:
2, 4, 6, 7, 10, 11, 15, 17, 20, 29, 30
Since 15 > the middle key (11), consider the upper half:
15, 17, 20, 29, 30
Since 15 < the middle key (20), consider the lower half:
15, 17
15 == middle item, success.
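The trace above can be reproduced with an instrumented binary search. This is a sketch only: traced_binary_search is a hypothetical name, and the printing is added for illustration. With two items left, integer division picks the lower one as the middle, which is why 15 is inspected rather than 17.

```c
#include <stdio.h>

/* Binary search that prints each middle key it inspects.
   Assumes array[] is sorted in ascending order. */
int traced_binary_search(const int array[], int key, int size)
{
    int low = 0, high = size - 1, mid;

    while (low <= high) {
        mid = (low + high) / 2;
        printf("low=%d high=%d middle key=%d\n", low, high, array[mid]);
        if (key < array[mid])
            high = mid - 1;
        else if (key > array[mid])
            low = mid + 1;
        else
            return mid;
    }
    return -1;
}
```

Searching the 11-item array for 15 prints the middle keys 11, 20 and 15 in turn and returns index 6.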
6.3. Binary Search Function
int binary_search(int array[], int key, int size)
{
    int low = 0, high = size - 1, mid;

    while (low <= high) {
        mid = (low + high) / 2;
        if (key < array[mid])
            high = mid - 1;
        else if (key > array[mid])
            low = mid + 1;
        else                /* found match */
            return mid;
    }
    return -1;              /* no match */
}
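One detail worth knowing: for very large arrays, low + high can exceed the largest int value. A common alternative, shown here as a sketch and not taken from the slides, computes the midpoint from the width of the range instead. binary_search_safe is a hypothetical name.

```c
/* Binary search with an overflow-safe midpoint.
   Assumes array[] is sorted in ascending order. */
int binary_search_safe(const int array[], int key, int size)
{
    int low = 0, high = size - 1;

    while (low <= high) {
        int mid = low + (high - low) / 2;   /* stays within low..high */

        if (key < array[mid])
            high = mid - 1;
        else if (key > array[mid])
            low = mid + 1;
        else
            return mid;                     /* found match */
    }
    return -1;                              /* no match */
}
```

For the array sizes used in these slides both versions behave identically.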
6.4. Analysis of Binary Search
Consider an array with 2^k items.
At each iteration:
– either one or two key comparisons are made
– the number of items under consideration is halved
At an array range of size 1, either:
– found the item, or
– item not in array.
It takes k iterations to reach an array range of size 1:
2^k items → 1 item in k steps
so
k items → 1 item in log2 k steps
Thus, for an array of size n, it takes (about) log2 n steps/iterations.
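The "about log2 n" estimate can be checked by counting loop iterations. The following is a sketch; binary_search_iterations is a hypothetical name, and the loop body is the same as in the binary search function from section 6.3.

```c
/* Number of loop iterations binary search makes for a given key.
   Assumes array[] is sorted in ascending order. */
int binary_search_iterations(const int array[], int key, int size)
{
    int low = 0, high = size - 1, mid, iterations = 0;

    while (low <= high) {
        iterations++;
        mid = (low + high) / 2;
        if (key < array[mid])
            high = mid - 1;
        else if (key > array[mid])
            low = mid + 1;
        else
            break;              /* found match */
    }
    return iterations;
}
```

For a sorted array of 1024 = 2^10 items, an unsuccessful search takes 10 iterations when the key is smaller than every item and 11 when it is larger than every item, so the count is log2 n give or take one.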
Worst Case
Key is not in the array
Array is divided in half log2 n times
Each iteration requires roughly 2 key comparisons
Cmax ≈ 2 * no. of iterations
     ≈ 2 * log2 n
     = O(log2 n)
Average Case
Need to consider all possible cases:
– matches at all positions
– misses at all positions
The result:
Caverage ≈ 1.8 * log n
         = O(log2 n)
Close to worst case performance
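Key comparisons can also be counted directly. The sketch below uses the hypothetical names binary_search_comparisons and average_hit_comparisons; it counts the "key < array[mid]" and "key > array[mid]" tests separately and then averages over one successful search per item. It covers only matches, not misses, so it is an illustration of the method and not a reproduction of the figure quoted above.

```c
/* Number of key comparisons binary search makes for a given key.
   Assumes array[] is sorted in ascending order. */
int binary_search_comparisons(const int array[], int key, int size)
{
    int low = 0, high = size - 1, mid, comparisons = 0;

    while (low <= high) {
        mid = (low + high) / 2;
        comparisons++;                  /* the test key < array[mid] */
        if (key < array[mid]) {
            high = mid - 1;
        } else {
            comparisons++;              /* the test key > array[mid] */
            if (key > array[mid])
                low = mid + 1;
            else
                break;                  /* found match */
        }
    }
    return comparisons;
}

/* Average comparisons over one successful search for each item. */
double average_hit_comparisons(const int array[], int size)
{
    long total = 0;
    int n;

    for (n = 0; n < size; n++)
        total += binary_search_comparisons(array, array[n], size);
    return (double)total / size;
}
```

On a sorted array of 1024 items, a key larger than every item costs 22 comparisons (2 per iteration over 11 iterations), in line with the worst-case estimate of about 2 * log2 n.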
6.5. Recursive Binary Search Function
int rb_search(int array[], int left, int right, int key)
{
    int mid;

    if (left > right)       /* nowhere to look */
        return -1;
    mid = (left+right)/2;
    if (array[mid] == key)
        return mid;
    if (array[mid] < key)
        return( rb_search(array, mid+1, right, key) );
    else
        return( rb_search(array, left, mid-1, key) );
}
The initial call from main():
element = rb_search(a, 0, SIZE-1, searchkey);
Average complexity remains at O(log2 n)
7. Comparison of Searching Algorithms
Linear search is O(n)
Binary search is O(log2 n), but it requires a sorted array
Plug in values of n to see that binary search is better (faster) for large n.
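To plug in values, the sketch below prints the two estimates side by side. ceil_log2 and print_comparison are hypothetical helpers, and the figures are the rough estimates from the analyses above (n/2 for linear search, log2 n rounded up for binary search), not measured timings.

```c
#include <stdio.h>

/* log2(n) rounded up to a whole number; requires n >= 1. */
int ceil_log2(unsigned long n)
{
    int k = 0;

    n--;                    /* so exact powers of two are not rounded up */
    while (n > 0) {
        n >>= 1;
        k++;
    }
    return k;
}

/* Print the rough cost of each algorithm for an array of n items. */
void print_comparison(unsigned long n)
{
    printf("n = %lu: linear search about %lu comparisons, "
           "binary search about %d iterations\n",
           n, n / 2, ceil_log2(n));
}
```

For n = 1,000,000 this gives about 500,000 comparisons for linear search against about 20 iterations for binary search.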