16. Algo analysis & Design - Data Structures using C++ by Varsha Patil
Oxford University Press © 2012
Data Structures Using C++ by Dr Varsha Patil
16. Algorithm Analysis and Design
Objectives
After completing this chapter, the reader will be able to understand the following:
Basic tools needed to develop and analyze algorithms
Methods to compute the efficiency of algorithms
Ways to make a wise choice among many solutions for a given problem
Introduction
Algorithm Analysis
Asymptotic Notations (Ω, Θ, O)
Big Omega (Ω)
Divide-and-Conquer
The following steps describe the general structure of the divide-and-conquer strategy:
1. If the data size n of problem P is fundamental (small enough to be solved directly), calculate the result of P(n) and go to step 4
2. If the data size n of problem P is not fundamental, divide the problem P(n) into equivalent subproblems P(n1), P(n2), …, P(ni) such that i ≥ 1
3. Apply divide-and-conquer recursively to each individual subproblem P(n1), P(n2), …, P(ni)
4. Combine the results of all subproblems P(n1), P(n2), …, P(ni) to get the final solution of P(n)
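The steps above can be sketched in C++ using merge sort as the example problem. This is only an illustration, not code from the book: the name mergeSort and the choice of "fewer than two elements" as the fundamental size are assumptions made here.

```cpp
#include <algorithm>
#include <vector>

using Iter = std::vector<int>::iterator;

// Sorts the half-open range [first, last) by divide-and-conquer.
// mergeSort is a hypothetical name chosen for this sketch.
void mergeSort(Iter first, Iter last) {
    // Fundamental case: zero or one element is already sorted.
    if (last - first < 2) return;
    // Divide P(n) into two subproblems of roughly n/2 elements each.
    Iter mid = first + (last - first) / 2;
    // Solve each subproblem recursively.
    mergeSort(first, mid);
    mergeSort(mid, last);
    // Combine the two sorted halves into one sorted range.
    std::inplace_merge(first, mid, last);
}
```

Calling mergeSort(v.begin(), v.end()) on a std::vector<int> v sorts it in ascending order.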
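Analysis of Quick Sort
At level one, only one call to partition is made with n elements; at level two, at most two calls are made, which together cover at most (n − 1) elements, and so on
In the worst case, the partitions shrink by only one element per level, so the total number of comparisons is C(n) = O(n²)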
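The quadratic worst case can be observed by counting comparisons. The sketch below assumes a Lomuto-style partition that always picks the last element as the pivot; the slides do not say which partition scheme the book uses, and the names lomutoPartition and quickSort are hypothetical. With this pivot choice, an already sorted input of n elements removes just one element per level, giving (n − 1) + (n − 2) + … + 1 = n(n − 1)/2 comparisons.

```cpp
#include <utility>
#include <vector>

// Partitions a[lo..hi] around the last element and returns the pivot's final
// index. Every element-to-pivot comparison is added to `comparisons`.
int lomutoPartition(std::vector<int>& a, int lo, int hi, long& comparisons) {
    const int pivot = a[hi];
    int i = lo;
    for (int j = lo; j < hi; ++j) {
        ++comparisons;
        if (a[j] <= pivot) std::swap(a[i++], a[j]);
    }
    std::swap(a[i], a[hi]);
    return i;
}

// Sorts a[lo..hi] (inclusive bounds) and accumulates the comparison count.
void quickSort(std::vector<int>& a, int lo, int hi, long& comparisons) {
    if (lo >= hi) return;
    const int p = lomutoPartition(a, lo, hi, comparisons);
    quickSort(a, lo, p - 1, comparisons);
    quickSort(a, p + 1, hi, comparisons);
}
```

For the sorted input 1, 2, …, 8 this makes 28 comparisons, which is 8 · 7 / 2.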
Divide-and-Conquer
General Knapsack Problem
Elements of Greedy Strategy
Greedy Method
Elements of Greedy Strategy
To decide whether a problem can be solved using a greedy strategy, the following elements should be considered:
Greedy-choice property
Optimal substructure
Greedy-choice property
A problem exhibits the greedy-choice property if a globally optimal solution can be arrived at by making a locally optimal (greedy) choice
That is, we make the choice that seems best at the moment, without considering the results from the subproblems
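As an illustration of a greedy choice, here is a minimal sketch of the general (fractional) knapsack problem, where items may be taken in part. The Item struct and the function name fractionalKnapsack are assumptions made for this example, and all weights are assumed to be positive. The locally optimal choice is to always take as much as possible of the item with the highest value-to-weight ratio.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical item type for this sketch; weight is assumed to be > 0.
struct Item {
    double weight;
    double value;
};

// Returns the maximum total value that fits into `capacity` when items
// may be split into fractions.
double fractionalKnapsack(std::vector<Item> items, double capacity) {
    // Greedy order: highest value/weight ratio first
    // (cross-multiplied to avoid dividing).
    std::sort(items.begin(), items.end(), [](const Item& a, const Item& b) {
        return a.value * b.weight > b.value * a.weight;
    });
    double total = 0.0;
    for (const Item& item : items) {
        if (capacity <= 0.0) break;
        // Take the whole item if it fits, otherwise the fraction that does.
        const double take = std::min(item.weight, capacity);
        total += item.value * (take / item.weight);
        capacity -= take;
    }
    return total;
}
```

With items of (weight, value) = (10, 60), (20, 100), (30, 120) and a capacity of 50, the sketch takes the first two items whole and two thirds of the third, for a total value of about 240.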
Dynamic Programming
The General Method
Elements of Dynamic Programming
Principle of Optimality
Limitations of Dynamic Programming
Knapsack Problem
Elements of Dynamic Programming
A dynamic programming solution has the following three components:
Formulate the answer as a recurrence relation or a recursive algorithm
Show that the number of different instances of your recurrence is bounded by a polynomial
Specify an order of evaluation for the recurrence
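The three components can be seen in a small example. The sketch below uses the longest common subsequence (LCS) length of two strings, a problem chosen here for illustration rather than taken from the slides; the name lcsLength is hypothetical. The recurrence is stated in the comments, the number of distinct instances is (m + 1)(n + 1) for strings of lengths m and n, and the table is filled row by row so that every entry is ready before it is needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Length of the longest common subsequence of a and b.
std::size_t lcsLength(const std::string& a, const std::string& b) {
    const std::size_t m = a.size(), n = b.size();
    // len[i][j] = LCS length of the first i characters of a and the first j of b.
    // Row 0 and column 0 stay 0 (an empty prefix has no common subsequence).
    std::vector<std::vector<std::size_t>> len(m + 1, std::vector<std::size_t>(n + 1, 0));
    for (std::size_t i = 1; i <= m; ++i) {
        for (std::size_t j = 1; j <= n; ++j) {
            if (a[i - 1] == b[j - 1]) {
                // Matching last characters extend the LCS of the shorter prefixes.
                len[i][j] = len[i - 1][j - 1] + 1;
            } else {
                // Otherwise drop the last character of one string or the other.
                len[i][j] = std::max(len[i - 1][j], len[i][j - 1]);
            }
        }
    }
    return len[m][n];
}
```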
Elements of Dynamic Programming
To decide whether a problem can be solved using the dynamic programming method, the following three elements of dynamic programming should be considered:
Optimal substructure
Overlapping subproblems
Memoization
Optimal Substructure
A problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems
It also means that dynamic programming (and the greedy method) might apply
Overlapping Subproblems
When a recursive algorithm revisits the same subproblem repeatedly, the optimization problem is said to have overlapping subproblems
This is beneficial for dynamic programming
Dynamic programming solves each subproblem once and stores the answer in a table
The stored answer can then be looked up in constant time when required. This contrasts with the divide-and-conquer strategy, where a new subproblem is generated at each step of the recursion
Memoization
Memoization stores the solutions to subproblems in a table; however, it uses a control structure similar to that of the recursive algorithm
In a memoized recursive algorithm, an entry is maintained in a table for the solution to each subproblem
Initially, all entries contain a special value, which indicates that the entry is not yet used
For each subproblem that is encountered for the first time, its solution is computed and stored in the table
The next time that subproblem is encountered, its entry is looked up and the stored value is used
This can be implemented using hashing
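A minimal sketch of a memoized recursive algorithm, using Fibonacci numbers as the example problem (an assumption made here; the slides do not name one). The table is a hash table, std::unordered_map, so a missing key plays the role of the special "not yet used" value. The name fibMemo is hypothetical.

```cpp
#include <cstdint>
#include <unordered_map>

// Returns the n-th Fibonacci number, storing each computed subproblem in `table`.
std::uint64_t fibMemo(unsigned n, std::unordered_map<unsigned, std::uint64_t>& table) {
    if (n < 2) return n;  // base cases: F(0) = 0, F(1) = 1
    // Seen before: look up the stored answer instead of recomputing it.
    const auto found = table.find(n);
    if (found != table.end()) return found->second;
    // First encounter: compute with the usual recursive structure, then store.
    const std::uint64_t result = fibMemo(n - 1, table) + fibMemo(n - 2, table);
    table.emplace(n, result);
    return result;
}
```

The control flow is the same as in the plain recursive version, but each subproblem is computed only once, so the number of additions grows linearly with n instead of exponentially.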
Principle of Optimality
The principle of optimality states that an optimal sequence of decisions has the property that whatever the initial state and decision are, the remaining decisions must constitute an optimal decision sequence with regard to the state resulting from the first decision
Pattern Matching
Pattern matching is the process of finding the presence of a particular string (pattern) in the given string (text)
A few applications of pattern matching are as follows:
Database search
Search engines
Text editors
Intrusion detection
Natural language processing
Feature detection in digitized images
Popular techniques for string pattern search
The most popular are the following:
Brute-force approach
Boyer–Moore algorithm
Knuth–Morris–Pratt algorithm
Rabin–Karp algorithm
Text partitioning algorithm
Semi-numerical algorithm
String: A string is a finite sequence of symbols chosen from a set called an alphabet
Alphabet: An alphabet is a set of characters or symbols
Substring: A substring of a string is a contiguous sequence of its symbols in their original order; a subsequence also preserves the order but need not be contiguous
Suffix: A suffix of a string S of length m is a substring S[i, …, m − 1], where i ranges between 0 and m − 1
Brute-force Approach
This is a simple, straightforward approach based on comparing the pattern, character by character, with the text
The steps involved in this approach are as follows:
1. Align the pattern P with the beginning of the text
2. Moving from left to right, compare each character of the pattern with the corresponding character in the text
3. Continue with step 2 until successful (all characters of the pattern are matched) or unsuccessful (a mismatch is detected)
4. On a mismatch, slide the pattern one position to the right and repeat from step 2 until the text is exhausted
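The steps above can be sketched as follows. The function name bruteForceSearch and the convention of returning −1 when the pattern does not occur are assumptions made for this example.

```cpp
#include <cstddef>
#include <string>

// Returns the index of the first occurrence of `pattern` in `text`, or -1 if none.
long bruteForceSearch(const std::string& text, const std::string& pattern) {
    const std::size_t n = text.size(), m = pattern.size();
    if (m > n) return -1;
    // s is the current alignment (shift) of the pattern against the text.
    for (std::size_t s = 0; s + m <= n; ++s) {
        std::size_t j = 0;
        // Compare left to right until a mismatch or the end of the pattern.
        while (j < m && text[s + j] == pattern[j]) ++j;
        if (j == m) return static_cast<long>(s);  // all m characters matched
        // Mismatch: the loop slides the pattern one position to the right.
    }
    return -1;
}
```

In the worst case every alignment compares up to m characters, so the running time is O(nm) for a text of length n and a pattern of length m.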
Boyer–Moore Algorithm
Boyer and Moore developed an efficient pattern matching algorithm
Instead of sliding the pattern one character to the right at a time, the Boyer–Moore approach slides it to the right in longer steps
The algorithm scans the characters of the pattern from right to left, beginning with the rightmost character
If the text symbol compared with the rightmost pattern symbol does not occur in the pattern at all, then the pattern can be shifted by m positions (where m is the length of the pattern)
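The right-to-left scan and the long shift can be sketched with the Horspool simplification of Boyer–Moore, which uses only a shift table indexed by the text character aligned with the last pattern position. This is not the full Boyer–Moore algorithm (it omits the good-suffix rule), and the name horspoolSearch is hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Returns the index of the first occurrence of `pattern` in `text`, or -1 if none.
long horspoolSearch(const std::string& text, const std::string& pattern) {
    const std::size_t n = text.size(), m = pattern.size();
    if (m == 0) return 0;
    if (m > n) return -1;
    // A character that does not occur in the pattern allows a shift of m.
    std::array<std::size_t, 256> shift;
    shift.fill(m);
    // Otherwise shift so that its rightmost occurrence (excluding the last
    // pattern position) lines up with the text character.
    for (std::size_t k = 0; k + 1 < m; ++k)
        shift[static_cast<unsigned char>(pattern[k])] = m - 1 - k;
    std::size_t s = 0;
    while (s + m <= n) {
        // Scan the pattern from right to left.
        std::size_t j = m;
        while (j > 0 && pattern[j - 1] == text[s + j - 1]) --j;
        if (j == 0) return static_cast<long>(s);
        // Slide by the table entry for the text character under the last position.
        s += shift[static_cast<unsigned char>(text[s + m - 1])];
    }
    return -1;
}
```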
Knuth–Morris–Pratt Algorithm
The researchers Knuth, Morris, and Pratt proposed a linear-time algorithm for the string matching problem
In this approach, a matching time of O(n) is achieved by avoiding comparisons with characters of T that have previously been involved in a comparison with some element of the pattern P, so that backtracking is avoided
KMP Matcher
The KMP matcher takes the text T, the pattern P, and the prefix function π as inputs; it finds the occurrence of P in T and returns the number of shifts of P after which the occurrence is found
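A minimal sketch of the prefix function and the matcher, assuming 0-based indexing and a return value of −1 when P does not occur in T (both conventions are choices made here, as are the names prefixFunction and kmpMatch). Entry pi[q] is the length of the longest proper prefix of P[0..q] that is also a suffix of it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Computes the prefix function of the pattern p.
std::vector<std::size_t> prefixFunction(const std::string& p) {
    std::vector<std::size_t> pi(p.size(), 0);
    std::size_t k = 0;  // length of the currently matched prefix
    for (std::size_t q = 1; q < p.size(); ++q) {
        while (k > 0 && p[k] != p[q]) k = pi[k - 1];  // fall back to a shorter border
        if (p[k] == p[q]) ++k;
        pi[q] = k;
    }
    return pi;
}

// Returns the shift at which p first occurs in t, or -1 if it never does.
long kmpMatch(const std::string& t, const std::string& p,
              const std::vector<std::size_t>& pi) {
    if (p.empty()) return 0;
    std::size_t k = 0;  // number of pattern characters matched so far
    for (std::size_t i = 0; i < t.size(); ++i) {
        // The text index i never moves backwards; only k falls back.
        while (k > 0 && p[k] != t[i]) k = pi[k - 1];
        if (p[k] == t[i]) ++k;
        if (k == p.size()) return static_cast<long>(i + 1 - p.size());
    }
    return -1;
}
```

A call looks like kmpMatch(T, P, prefixFunction(P)); building the table takes O(m) time and the scan takes O(n).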
Tries
A trie is a compact data structure that represents a set of strings (such as all the words in a text)
A trie is a tree-based data structure for storing strings to make pattern matching faster
A trie helps in pattern matching in time that is proportional to the length of the pattern
Tries can be used to perform prefix queries for information retrieval
A prefix query searches for the longest prefix of a given string that matches a prefix of some string in the trie
There are variants of tries, which are listed as follows:
Standard tries
Compressed tries
Suffix tries
The standard trie for a set of strings S is an ordered tree such that
Each node but the root is labelled with a character
The children of a node are alphabetically ordered
The paths from the external nodes to the root yield the strings of S
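A standard trie along these lines can be sketched as below. The class and member names are hypothetical. Two choices here go beyond the slides: children are kept in a std::map so that they stay in character order, and an isEnd flag marks where a stored string ends, which lets the sketch also handle a string that is a prefix of another.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> children;  // ordered by character
    bool isEnd = false;  // true if a stored string ends at this node
};

class Trie {
public:
    // Adds `word`, creating one node per character not already on the path.
    void insert(const std::string& word) {
        TrieNode* node = &root_;
        for (char ch : word) {
            std::unique_ptr<TrieNode>& child = node->children[ch];
            if (!child) child = std::make_unique<TrieNode>();
            node = child.get();
        }
        node->isEnd = true;
    }

    // Exact lookup in time proportional to the length of `word`.
    bool contains(const std::string& word) const {
        const TrieNode* node = &root_;
        for (char ch : word) {
            const auto it = node->children.find(ch);
            if (it == node->children.end()) return false;
            node = it->second.get();
        }
        return node->isEnd;
    }

    // Prefix query: length of the longest prefix of `query` that is also
    // a prefix of some stored string.
    std::size_t longestPrefix(const std::string& query) const {
        const TrieNode* node = &root_;
        std::size_t matched = 0;
        for (char ch : query) {
            const auto it = node->children.find(ch);
            if (it == node->children.end()) break;
            node = it->second.get();
            ++matched;
        }
        return matched;
    }

private:
    TrieNode root_;  // the root carries no character
};
```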
Compressed Tries
Similar to the standard trie, a compressed trie is a tree-based data structure for storing strings in order to make pattern matching much faster
This is an optimized approach for pattern matching, especially suitable for applications where time is a more crucial factor
The following are the unique characteristics of a compressed trie:
A compressed trie (or Patricia trie) has internal nodes of degree at least 2
It is obtained from a standard trie by compressing chains of redundant nodes
Compressed Trie
[Figure: example of a compressed trie, with multi-character labels such as "ell" and "to" on its nodes]
Suffix Tries
[Figure: suffix trie for the text "meghavind", whose characters are indexed 0 to 8]
Suffix Tries
A suffix trie is a compressed trie for all the suffixes of a text
Because it is a compressed trie, it possesses all the features of a compressed trie, and because it includes all the suffixes of a text, it makes searching faster
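The idea of indexing every suffix can be sketched as below. To stay short, this sketch builds an uncompressed trie of the suffixes, so it is not the compressed structure described above and can need O(n²) nodes for a text of length n; the class and member names are hypothetical. A pattern is a substring of the text exactly when it is a prefix of some suffix, so the query just walks down from the root.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

struct SuffixTrieNode {
    std::map<char, std::unique_ptr<SuffixTrieNode>> children;
};

class SuffixTrie {
public:
    // Inserts every suffix text[i..] of `text` into the trie.
    explicit SuffixTrie(const std::string& text) {
        for (std::size_t i = 0; i < text.size(); ++i) {
            SuffixTrieNode* node = &root_;
            for (std::size_t j = i; j < text.size(); ++j) {
                std::unique_ptr<SuffixTrieNode>& child = node->children[text[j]];
                if (!child) child = std::make_unique<SuffixTrieNode>();
                node = child.get();
            }
        }
    }

    // True if `pattern` occurs anywhere in the text; the walk takes time
    // proportional to the length of the pattern.
    bool containsSubstring(const std::string& pattern) const {
        const SuffixTrieNode* node = &root_;
        for (char ch : pattern) {
            const auto it = node->children.find(ch);
            if (it == node->children.end()) return false;
            node = it->second.get();
        }
        return true;
    }

private:
    SuffixTrieNode root_;
};
```

For the text "meghavind", the patterns "havi" and "vind" are found, while "gav" is not.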
End of Chapter 16…!