16. Algo analysis & Design - Data Structures using C++ by Varsha Patil

Oxford University Press © 2012 Data Structures Using C++ by Dr Varsha Patil 1 16. Algorithm Analysis And Design

Transcript of 16. Algo analysis & Design - Data Structures using C++ by Varsha Patil



Objectives

After completing this chapter, the reader will be able to understand the following:

Basic tools needed to develop and analyze algorithms

Methods to compute the efficiency of algorithms

Ways to make a wise choice among many solutions for a given problem


Introduction

Algorithm Analysis

Asymptotic Notations (Ω, Θ, O)

Big Omega (Ω)



Divide-and-Conquer

The following steps elaborate the general structure of the divide-and-conquer strategy:

1. If the data size n of problem P is fundamental, calculate the result of P(n) and go to step 4

2. If the data size n of problem P is not fundamental, divide the problem P(n) into smaller subproblems of the same kind P(n1), P(n2), …, P(ni) such that i ≥ 1

3. Apply divide-and-conquer recursively to each individual subproblem P(n1), P(n2), …, P(ni)

4. Combine the results of all subproblems P(n1), P(n2), …, P(ni) to get the final solution of P(n)
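
The four steps above can be illustrated with a small sketch that finds the maximum of an array. This example is not from the slides; the function name dcMax and the choice of splitting the range into two halves are assumptions made only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the maximum of a[lo..hi] (inclusive).
// Assumes the vector is non-empty and lo <= hi.
int dcMax(const std::vector<int>& a, std::size_t lo, std::size_t hi) {
    // Step 1: a one-element range is "fundamental", so answer directly.
    if (lo == hi) return a[lo];

    // Step 2: divide P(n) into two subproblems of roughly half the size.
    std::size_t mid = lo + (hi - lo) / 2;

    // Step 3: apply divide-and-conquer recursively to each subproblem.
    int left = dcMax(a, lo, mid);
    int right = dcMax(a, mid + 1, hi);

    // Step 4: combine the results of the subproblems.
    return std::max(left, right);
}
```

For a vector v holding 3, 9, 2, 7, 5, the call dcMax(v, 0, v.size() - 1) returns 9.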


Analysis of Quick sort

In the worst case, at level one, only one call to partition is made, with n elements; at level two, at most two calls are made, with n − 1 elements in total; and so on

C(n) ≤ n + (n − 1) + … + 1, so C(n) = O(n²)
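
The quadratic bound can be observed by counting comparisons. The sketch below is an assumption-laden illustration, not the book's code: it uses the Lomuto partition scheme with the last element as pivot, and the names lomutoPartition, quickSort and countComparisons are hypothetical. With this pivot choice, an already sorted input triggers the worst case.

```cpp
#include <utility>
#include <vector>

// Lomuto partition around the last element; adds one to `comparisons`
// for every element compared with the pivot.
int lomutoPartition(std::vector<int>& a, int lo, int hi, long& comparisons) {
    int pivot = a[hi];
    int i = lo;
    for (int j = lo; j < hi; ++j) {
        ++comparisons;
        if (a[j] < pivot) {
            std::swap(a[i], a[j]);
            ++i;
        }
    }
    std::swap(a[i], a[hi]);
    return i;  // final position of the pivot
}

void quickSort(std::vector<int>& a, int lo, int hi, long& comparisons) {
    if (lo >= hi) return;  // zero or one element: nothing to do
    int p = lomutoPartition(a, lo, hi, comparisons);
    quickSort(a, lo, p - 1, comparisons);
    quickSort(a, p + 1, hi, comparisons);
}

// Sorts a copy of the input and returns the number of comparisons made.
long countComparisons(std::vector<int> a) {
    long comparisons = 0;
    quickSort(a, 0, static_cast<int>(a.size()) - 1, comparisons);
    return comparisons;
}
```

On the sorted input 1, 2, …, 8 each partition call peels off only the pivot, so the count is 7 + 6 + … + 1 = 28 = n(n − 1)/2.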


Divide-and-Conquer

General Knapsack Problem

Elements of Greedy Strategy


Greedy Method

Elements of Greedy Strategy

To decide whether a problem can be solved using a greedy strategy, the following elements should be considered:

Greedy-choice property

Optimal substructure


Greedy-choice property

A problem exhibits the greedy-choice property if a globally optimal solution can be arrived at by making a locally optimal (greedy) choice

That is, we make the choice that seems best at that time without considering the results from the subproblems
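
A standard problem with the greedy-choice property is the fractional (general) knapsack problem, where items may be split. The sketch below is an illustration under assumptions: the Item struct and the function name fractionalKnapsack are hypothetical, and all weights are assumed to be positive. The locally best choice is always the item with the highest profit per unit weight.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical item type: weight must be positive.
struct Item {
    double weight;
    double profit;
};

// Returns the maximum profit obtainable with the given capacity
// when fractions of items may be taken.
double fractionalKnapsack(std::vector<Item> items, double capacity) {
    // Greedy order: highest profit/weight ratio first
    // (cross-multiplied to avoid a division in the comparator).
    std::sort(items.begin(), items.end(), [](const Item& a, const Item& b) {
        return a.profit * b.weight > b.profit * a.weight;
    });

    double total = 0.0;
    for (const Item& item : items) {
        if (capacity <= 0.0) break;
        // Take as much of the current best item as still fits.
        double take = std::min(item.weight, capacity);
        total += item.profit * (take / item.weight);
        capacity -= take;
    }
    return total;
}
```

With items (weight, profit) = (10, 60), (20, 100), (30, 120) and capacity 50, the greedy order takes the first two items whole and two thirds of the third, for a total profit of about 240.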


Dynamic Programming

The General Method

Elements of Dynamic Programming

Principle of Optimality

Limitations of Dynamic Programming

Knapsack Problem



Elements of Dynamic Programming

A dynamic programming solution has the following three components:

Formulate the answer as a recurrence relation or a recursive algorithm

Show that the number of different instances of your recurrence is bounded by a polynomial

Specify an order of evaluation for the recurrence
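
The three components can be seen together in a bottom-up solution to the 0/1 knapsack problem. This is a minimal sketch, not the book's code; the function name knapsack01 and the table name best are assumptions, and capacity is assumed to be non-negative. The recurrence is best[i][c] = max(best[i-1][c], best[i-1][c - w_i] + v_i), there are (n + 1) × (capacity + 1) distinct instances, and the table is filled row by row so that every entry is ready before it is needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the best total value of a subset of items whose total
// weight does not exceed capacity (each item used at most once).
int knapsack01(const std::vector<int>& weight, const std::vector<int>& value,
               int capacity) {
    const std::size_t n = weight.size();
    // best[i][c]: best value using only the first i items with capacity c.
    std::vector<std::vector<int>> best(
        n + 1, std::vector<int>(static_cast<std::size_t>(capacity) + 1, 0));

    // Order of evaluation: increasing i, then increasing c.
    for (std::size_t i = 1; i <= n; ++i) {
        for (int c = 0; c <= capacity; ++c) {
            best[i][c] = best[i - 1][c];  // skip item i
            if (weight[i - 1] <= c) {     // or take item i if it fits
                best[i][c] = std::max(
                    best[i][c],
                    best[i - 1][c - weight[i - 1]] + value[i - 1]);
            }
        }
    }
    return best[n][capacity];
}
```

For weights 10, 20, 30 with values 60, 100, 120 and capacity 50, the function returns 220 (the second and third items).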


Elements of Dynamic Programming

To decide whether a problem can be solved using the dynamic programming method, the following three elements of dynamic programming should be considered:

Optimal substructure

Overlapping subproblems

Memoization


Optimal Substructure

A problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems

It also means that dynamic programming (and greedy method) might apply


Overlapping Subproblems

When a recursive algorithm revisits the same subproblem repeatedly, the optimization problem is said to have overlapping subproblems

This is beneficial for dynamic programming

Dynamic programming solves each subproblem once and stores the answer in a table

This answer can be looked up in constant time when required. This is in contrast to the divide-and-conquer strategy, where new subproblems are generated at each step of recursion



Memoization

A memoized algorithm uses a control structure similar to that of the recursive algorithm

In a memoized recursive algorithm, an entry is maintained in a table for the solution to each subproblem

Initially, all entries contain a special value, which indicates that the entry is not yet used

For each subproblem that is encountered for the first time, its solution is computed and stored in the table

The next time that subproblem is encountered, its entry is looked up and the stored value is used

This can be implemented using hashing
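
The table-with-sentinel scheme described above can be sketched on the Fibonacci numbers, a common textbook example chosen here for illustration. The names kUnused, fibMemo and fib are hypothetical. The sketch uses an array indexed by n as the table; a hash table such as std::unordered_map would serve the same purpose when subproblems are not small integers.

```cpp
#include <cstddef>
#include <vector>

// Special value marking a table entry that has not been computed yet.
const long long kUnused = -1;

// Same control structure as the plain recursion, plus a table lookup.
long long fibMemo(std::size_t n, std::vector<long long>& table) {
    if (n <= 1) return static_cast<long long>(n);  // base cases F(0), F(1)
    if (table[n] != kUnused) return table[n];      // seen before: reuse
    // First encounter: compute once and store in the table.
    table[n] = fibMemo(n - 1, table) + fibMemo(n - 2, table);
    return table[n];
}

long long fib(std::size_t n) {
    std::vector<long long> table(n + 1, kUnused);  // all entries unused
    return fibMemo(n, table);
}
```

Here fib(10) returns 55, and fib(50) returns 12586269025 after only about 50 table fills, whereas the plain recursion would make billions of calls.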


Principle of Optimality

The principle of optimality states that an optimal sequence of decisions has the property that whatever the initial state and decision are, the remaining decisions must constitute an optimal decision sequence with regard to the state resulting from the first decision


Pattern Matching

Pattern matching is the process of finding the presence of a particular string (pattern) in the given string (text)


A few such applications are as follows:

Database search

Search engines

Text editors

Intrusion detection

Natural language processing

Feature detection in digitized images


Popular techniques for string pattern search

The most popular are the following:

Brute-force approach

Boyer–Moore algorithm

Knuth–Morris–Pratt algorithm

Rabin–Karp algorithm

Text partitioning algorithm

Semi-numerical algorithm


String: A string is a finite sequence of symbols chosen from a set called an alphabet

Alphabet: An alphabet is a set of characters or symbols

Substring: A substring of a string is a contiguous run of its symbols; a subsequence is any selection of its symbols in which the order of elements is preserved

Suffix: A suffix of a string S of length m is a substring S[i, …, m − 1], where i ranges between 0 and m − 1
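
The suffix definition can be made concrete with a short sketch that lists every suffix S[i, …, m − 1] of a string. The function name allSuffixes is an assumption made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the suffixes S[i..m-1] for i = 0, 1, ..., m-1, longest first.
std::vector<std::string> allSuffixes(const std::string& s) {
    std::vector<std::string> suffixes;
    for (std::size_t i = 0; i < s.size(); ++i) {
        suffixes.push_back(s.substr(i));  // from index i to the end
    }
    return suffixes;
}
```

For the string "abc" this yields "abc", "bc" and "c".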


Brute-force Approach

This is a simple, straightforward approach based on comparing the pattern, character by character, with the text


The steps involved in this approach are as follows:

1. Align the pattern P with the beginning of the text

2. Moving from left to right, compare each character of the pattern with the corresponding character in the text

3. Continue with step 2 until successful (all characters of the pattern are matched) or unsuccessful (a mismatch is detected); on a mismatch, shift the pattern one position to the right and repeat from step 2
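
The steps above can be sketched as follows. This is an illustrative version, not the book's listing; the name bruteForceSearch is hypothetical, and returning std::string::npos when the pattern is absent is a convention borrowed from std::string::find.

```cpp
#include <cstddef>
#include <string>

// Returns the index of the first occurrence of pattern in text,
// or std::string::npos if there is none.
std::size_t bruteForceSearch(const std::string& text,
                             const std::string& pattern) {
    const std::size_t n = text.size();
    const std::size_t m = pattern.size();
    if (m > n) return std::string::npos;

    // Try every alignment of the pattern, starting at the beginning.
    for (std::size_t shift = 0; shift + m <= n; ++shift) {
        std::size_t j = 0;
        // Compare left to right until a mismatch or a full match.
        while (j < m && text[shift + j] == pattern[j]) ++j;
        if (j == m) return shift;  // all characters matched
        // Mismatch: the loop slides the pattern one position right.
    }
    return std::string::npos;
}
```

Searching for "cad" in "abracadabra" returns 4. In the worst case the method makes on the order of n × m character comparisons.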


Boyer–Moore Algorithm

Boyer and Moore developed an efficient pattern matching algorithm

Instead of sliding by one character to the right at a time, in the Boyer–Moore approach, the sliding to the right is done in longer steps

The algorithm scans the characters of the pattern from right to left, beginning with the rightmost character

If the text symbol compared with the rightmost pattern symbol does not occur in the pattern at all, then the pattern can be shifted by m positions (where m is the length of the pattern)
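
The right-to-left scan and the long shifts can be sketched with the Horspool simplification of Boyer–Moore, which keeps only a shift table driven by the text character aligned with the last pattern position. This is a swapped-in simplified variant, not the full Boyer–Moore algorithm with its good-suffix rule; the name horspoolSearch is hypothetical and 8-bit characters are assumed.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Returns the index of the first occurrence of pattern in text,
// or std::string::npos if there is none.
std::size_t horspoolSearch(const std::string& text,
                           const std::string& pattern) {
    const std::size_t n = text.size();
    const std::size_t m = pattern.size();
    if (m == 0) return 0;
    if (m > n) return std::string::npos;

    // A character that does not occur in the pattern allows a shift of m.
    std::array<std::size_t, 256> shift;
    shift.fill(m);
    // Otherwise shift so its rightmost occurrence (excluding the last
    // pattern position) lines up with the text character.
    for (std::size_t i = 0; i + 1 < m; ++i) {
        shift[static_cast<unsigned char>(pattern[i])] = m - 1 - i;
    }

    std::size_t pos = 0;
    while (pos + m <= n) {
        // Scan the pattern from right to left.
        std::size_t j = m;
        while (j > 0 && text[pos + j - 1] == pattern[j - 1]) --j;
        if (j == 0) return pos;  // whole pattern matched
        pos += shift[static_cast<unsigned char>(text[pos + m - 1])];
    }
    return std::string::npos;
}
```

Searching for "example" in "here is a simple example" returns 17, and several alignments are skipped along the way because the text character under the last pattern position does not occur in the pattern.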


Knuth–Morris–Pratt Algorithm

The researchers Knuth, Morris, and Pratt proposed a linear time algorithm for the string matching problem

In this approach, a matching time of O(n) is achieved by avoiding comparisons with characters of T that have previously been involved in comparison with some element of the pattern P to be matched, so that backtracking is avoided


KMP Matcher

The KMP matcher takes the text T, the pattern P, and the prefix function π as inputs; it finds the occurrence of P in T and returns the number of shifts of P after which the occurrence is found
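
A minimal sketch of the prefix function and the matcher follows. It is an illustration under assumptions rather than the book's listing: the names prefixFunction and kmpSearch are hypothetical, indices are 0-based, the matcher computes π itself instead of receiving it as an argument, and it returns std::string::npos when P does not occur in T.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// pi[i] = length of the longest proper prefix of p[0..i]
// that is also a suffix of p[0..i].
std::vector<std::size_t> prefixFunction(const std::string& p) {
    std::vector<std::size_t> pi(p.size(), 0);
    std::size_t k = 0;  // length of the currently matched prefix
    for (std::size_t i = 1; i < p.size(); ++i) {
        while (k > 0 && p[i] != p[k]) k = pi[k - 1];  // fall back
        if (p[i] == p[k]) ++k;
        pi[i] = k;
    }
    return pi;
}

// Returns the shift of the first occurrence of p in t,
// or std::string::npos if there is none.
std::size_t kmpSearch(const std::string& t, const std::string& p) {
    if (p.empty()) return 0;
    const std::vector<std::size_t> pi = prefixFunction(p);
    std::size_t k = 0;  // number of pattern characters matched so far
    for (std::size_t i = 0; i < t.size(); ++i) {
        // The text index i never moves backwards; only k falls back.
        while (k > 0 && t[i] != p[k]) k = pi[k - 1];
        if (t[i] == p[k]) ++k;
        if (k == p.size()) return i + 1 - k;
    }
    return std::string::npos;
}
```

For the pattern "ababaca" the prefix function is 0, 0, 1, 2, 3, 0, 1, and kmpSearch("abababaca", "ababaca") returns 2.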


Tries

A trie is a compact data structure that represents a set of strings (such as all the words in a text)

A trie is a tree-based data structure for storing strings to make pattern matching faster

A trie helps in pattern matching in time that is proportional to the length of the pattern

Tries can be used to perform prefix queries for information retrieval

A prefix query searches for the longest prefix of a given string that matches a prefix of some string in the trie
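
A standard (uncompressed) trie with insertion, exact lookup and the prefix query described above can be sketched as follows. The class and member names (TrieNode, Trie, insert, contains, longestPrefixLength) are assumptions made for illustration; std::make_unique requires C++14.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

struct TrieNode {
    // std::map keeps the children ordered by character.
    std::map<char, std::unique_ptr<TrieNode>> children;
    bool isWord = false;  // true if a stored string ends at this node
};

class Trie {
public:
    void insert(const std::string& word) {
        TrieNode* node = &root_;
        for (char ch : word) {
            std::unique_ptr<TrieNode>& child = node->children[ch];
            if (!child) child = std::make_unique<TrieNode>();
            node = child.get();
        }
        node->isWord = true;
    }

    bool contains(const std::string& word) const {
        const TrieNode* node = walk(word, nullptr);
        return node != nullptr && node->isWord;
    }

    // Length of the longest prefix of query that is also a prefix
    // of some stored string.
    std::size_t longestPrefixLength(const std::string& query) const {
        std::size_t matched = 0;
        walk(query, &matched);
        return matched;
    }

private:
    // Follows key from the root; returns the node reached if the whole
    // key was consumed, otherwise nullptr. Records how far it got.
    const TrieNode* walk(const std::string& key, std::size_t* matched) const {
        const TrieNode* node = &root_;
        std::size_t len = 0;
        for (char ch : key) {
            auto it = node->children.find(ch);
            if (it == node->children.end()) { node = nullptr; break; }
            node = it->second.get();
            ++len;
        }
        if (matched) *matched = len;
        return node;
    }

    TrieNode root_;
};
```

After inserting "bear", "bell" and "stop", contains("bell") is true, contains("be") is false, and longestPrefixLength("belt") is 3. Each operation visits at most one node per character of its argument, which is the source of the time bound proportional to the pattern length.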


There are variants of tries, which are listed as follows:

Standard tries

Compressed tries

Suffix tries


The standard trie for a set of strings S is an ordered tree such that

Each node but the root is labelled with a character;

The children of a node are alphabetically ordered;

The paths from the root to the external nodes yield the strings of S


Compressed Tries

Similar to the standard trie, a compressed trie is a tree-based data structure for storing strings in order to make pattern matching much faster

This is an optimized approach for pattern matching, especially suitable for applications where time is a crucial factor


Following are the characteristics of a compressed trie:

A compressed trie (or Patricia trie) has internal nodes of degree at least 2

It is obtained from standard trie by compressing chains of redundant nodes


Compressed Trie (figure: example of a compressed trie, with edges labelled by character strings rather than single characters)


Suffix Tries (figure: suffix trie for the string "meghavind", whose characters are indexed 0 to 8)


Suffix Tries

A suffix trie is a compressed trie for all the suffixes of a text

Because it is a compressed trie, it possesses all the features of a compressed trie; because it includes all suffixes of the text, it makes searching for any substring faster
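
The idea that storing every suffix turns substring search into a single walk from the root can be sketched as below. This is a deliberately simplified, uncompressed version (one character per edge), so it does not have the space savings of the compressed structure described above; the names SuffixNode, SuffixTrie and containsSubstring are hypothetical, and construction takes time and space quadratic in the text length.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

struct SuffixNode {
    std::map<char, std::unique_ptr<SuffixNode>> children;
};

class SuffixTrie {
public:
    // Inserts every suffix text[i..m-1] of the text into the trie.
    explicit SuffixTrie(const std::string& text) {
        for (std::size_t i = 0; i < text.size(); ++i) {
            SuffixNode* node = &root_;
            for (std::size_t j = i; j < text.size(); ++j) {
                std::unique_ptr<SuffixNode>& child = node->children[text[j]];
                if (!child) child = std::make_unique<SuffixNode>();
                node = child.get();
            }
        }
    }

    // A pattern occurs in the text exactly when it is a prefix of
    // some suffix, that is, when it labels a path from the root.
    bool containsSubstring(const std::string& pattern) const {
        const SuffixNode* node = &root_;
        for (char ch : pattern) {
            auto it = node->children.find(ch);
            if (it == node->children.end()) return false;
            node = it->second.get();
        }
        return true;
    }

private:
    SuffixNode root_;
};
```

For the text "meghavind", containsSubstring("havi") and containsSubstring("vind") are true, while containsSubstring("mega") is false. Each query takes time proportional to the pattern length, independent of the text length.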


End of Chapter 16…!