Plan B Defense – Context-Inclusive Approach to Speed-up Function Evaluation for Statistical Queries: An Extended Abstract
Vijay Gandhi, Master's Student
Advisor: Dr. Shashi Shekhar
Committee Members: Dr. Bradley Carlin, Dr. Jaideep Srivastava
Department of Computer Science, University of Minnesota, USA
Biography
Bachelor of Computer Science & Engineering, Madras University
Master of Science, Computer Science, University of Minnesota (current)
3 years of work experience in Data Warehousing and Business Intelligence at Oracle Corporation
Publications
Context-Inclusive Approach to Speed-up Function Evaluation for Statistical Queries: An Extended Abstract, Vijay Gandhi, James Kang, Shashi Shekhar, Junchang Ju, Eric Kolaczyk, Sucharita Gopal. IEEE International Conference on Data Mining, Workshop on Spatial and Spatio-Temporal Data Mining (SSTDM), 2006
Parallelizing Multi-scale and Multi-granular Spatial Data Mining Algorithm, Vijay Gandhi, Mete Celik, Shashi Shekhar. PGAS Programming Models Conference 2006
Context-Inclusive Approach to Speed-up Function Evaluation for Statistical Queries, Vijay Gandhi, James Kang, Shashi Shekhar, Junchang Ju, Eric Kolaczyk, Sucharita Gopal. Submitted to the journal Knowledge and Information Systems, 2007
Scope of the talk
NSF Project on Land-use Classification, a joint collaboration with Boston University
Main goal: Reduce the execution time of the algorithm
Contributions:

Approach | Description | Results | Effort (hrs)
Black-box tuning | Limiting factor was modified to control the precision | Reduced CPU time by 50% with 98% accuracy | 20
Code conversion | Code was converted from MATLAB to C | Reduced CPU time by about 50% | 120
Parallel implementation | Converted code from C to UPC on Cray X1 | Obtained near-linear scalability | 20
Algorithm changes | Context-Inclusive approach | Reduced CPU time without sacrificing accuracy | 100
Overview: Motivation, Background, Problem Statement, Related Work, Contribution, Validation, Conclusion & Future Work
Motivation: Land-cover Change
Loss of land – 217 square miles of Louisiana's coastal lands were transformed to water after Hurricanes Katrina and Rita.
Deforestation – Brazil lost 150,000 sq. km. of forest between May 2000 and August 2006.
Urban Sprawl
[Figure: Mississippi River Delta, Louisiana; red represents land loss between 2004 and 2005. Courtesy: USGS]
[Figure: Deforestation, Ariquemes, Brazil. Courtesy: Global Change Program, University of Michigan]
[Figure: Urban sprawl in Atlanta; red indicates expansion between 1976 and 1992]
Multiscale Multigranular Image Classification (MSMG)
Input: Class hierarchy, likelihood of specific classes
[Figure: Land-use class hierarchy (Conifer, Hardwood, Brush, Grass) and likelihoods of the specific classes]
Output: Classified images at multiple scales
[Figure: Classified images at scales 1x1, 2x2, 4x4, ..., 64x64]
Source courtesy: Boston University
Background: Algorithm
1. Divide the input image into quad segments recursively
2. Calculate the log-likelihood for each class using Expectation Maximization
3. Decide the class for each segment

Profiling (input image: 128 x 128)
Function | # Calls | % Time
gen_loglike | 294,915 | 80
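The recursive quad division described above can be sketched as follows. This is a minimal illustration only; the generator form and the (x, y, size) coordinate scheme are assumptions, not the authors' MATLAB implementation.

```python
def quad_segments(x, y, size, min_size=1):
    """Recursively divide a square image region into quad segments,
    yielding (x, y, size) for every segment at every scale."""
    yield (x, y, size)
    if size > min_size:
        half = size // 2
        for dx in (0, half):
            for dy in (0, half):
                yield from quad_segments(x + dx, y + dy, half, min_size)

# For a 4x4 image this enumerates 1 + 4 + 16 = 21 segments.
segments = list(quad_segments(0, 0, 4))
```

Each yielded segment would then be scored by the EM-based log-likelihood routine, which is why the number of gen_loglike calls grows so quickly with image size.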
Background: gen_loglike
gen_loglike calculates the log-likelihood of a non-specific class using Expectation Maximization.
Number of gen_loglike calls = f(number of general classes, image size, spatial scale)
The number of iterations depends on the number of general classes, the image size, and the spatial scale.
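To make this dependence concrete, a hypothetical call-count model is sketched below. The function name and the one-call-per-(class, segment) assumption are illustrative only and are not derived from the profiled implementation.

```python
def gen_loglike_calls(num_general_classes, image_dim, scales):
    """Hypothetical estimate: one gen_loglike call per (general class,
    quad segment) pair, with (image_dim // s)**2 segments at scale s x s."""
    return sum(num_general_classes * (image_dim // s) ** 2 for s in scales)

# Doubling the image dimension quadruples the segment count at every scale.
small = gen_loglike_calls(3, 64, [1, 2, 4, 8])
large = gen_loglike_calls(3, 128, [1, 2, 4, 8])
```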
Problem Statement
Given:
Algorithm for Multiscale Multigranular Image Classification
Input image with likelihood of specific classes; class hierarchy
Find: Classification at each quad segment
Objective: Minimize computation time
Constraints:
Use the Expectation Maximization (EM) algorithm to calculate the quality measure of each non-specific class
High accuracy
Classification Examples
Likelihoods of the candidate classes are compared to find the best candidate. For the class hierarchy C → {C1, C2}: a 2x2 region where every pixel strongly favors C1 (e.g. likelihood 0.9 for C1 vs. 0.1 for C2) is best classified as the specific class C1, while a region whose pixels are split between C1 and C2 is better explained by the non-specific parent class C.
[Figure: example 2x2 likelihood grids for C1, C2, and C, with the summed likelihood of each candidate class]
Algorithm: Expectation Maximization
Given: Class hierarchy, likelihood of specific classes
Find: Best class and corresponding likelihood for a region (e.g. a 2x2 region)

Likelihood of a specific class = sum of its per-pixel likelihoods
(Likelihood of C1 = 2.2; likelihood of C2 = 1.8)
Log-likelihood of best specific class = -3.4296 (C1)

Likelihood of a non-specific class (EM):
1. Initialize the proportion of each corresponding specific class
2. Multiply each likelihood by the corresponding specific class proportion
3. Add the likelihoods at each corresponding pixel
4. Divide each value from Step 2 by the corresponding value from Step 3
5. Average the resulting values for each specific class to obtain the new proportions
6. Repeat Steps 2 to 5 until the required accuracy is reached

Example: class hierarchy C → {C1, C2}; likelihoods of C1 and C2 at a 2x2 region:
Lij(C1) = [0.2 0.2; 0.9 0.9], Lij(C2) = [0.8 0.8; 0.1 0.1]
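The EM steps above can be sketched for the 2x2 example as a minimal NumPy illustration of the mixing-proportion iteration; the variable names and the 1e-5 stopping threshold (the "limiting factor") are assumptions for this sketch.

```python
import numpy as np

# Per-pixel likelihoods of the two specific classes at a 2x2 region
# (values from the worked example; sums give 2.2 for C1 and 1.8 for C2)
L1 = np.array([[0.2, 0.2], [0.9, 0.9]])  # Lij(C1)
L2 = np.array([[0.8, 0.8], [0.1, 0.1]])  # Lij(C2)

p1, p2 = 0.5, 0.5                          # Step 1: initial proportions
for _ in range(200):
    w1, w2 = p1 * L1, p2 * L2              # Step 2: multiply by proportions
    total = w1 + w2                        # Step 3: add at each pixel
    r1, r2 = w1 / total, w2 / total        # Step 4: divide Step 2 by Step 3
    new_p1, new_p2 = r1.mean(), r2.mean()  # Step 5: average per class
    done = abs(new_p1 - p1) < 1e-5         # Step 6: limiting factor
    p1, p2 = new_p1, new_p2
    if done:
        break

loglike_C = float(np.log(p1 * L1 + p2 * L2).sum())
# Mixing proportions converge to about (0.6042, 0.3958);
# the log-likelihood of C is about -2.7314, matching the execution trace.
```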
Execution Trace: Expectation Maximization
Given: Class hierarchy C → {C1, C2}; likelihoods of the specific classes at a 2x2 region: Lij(C1) = [0.2 0.2; 0.9 0.9], Lij(C2) = [0.8 0.8; 0.1 0.1]
Find: Best class for the 2x2 region

Likelihood of C, first iteration, EM(p1 = 0.5, p2 = 0.5):
1. Multiply: L1ij(C1) = Lij(C1) . p1 = [0.1 0.1; 0.45 0.45]; L2ij(C2) = Lij(C2) . p2 = [0.4 0.4; 0.05 0.05]
2. Add: Lij = L1ij(C1) + L2ij(C2) = [0.5 0.5; 0.5 0.5]
3. Divide: L1ij(C1) = L1ij(C1)./Lij; L2ij(C2) = L2ij(C2)./Lij
4. Average: p1 = Avg(L1ij(C1)) = 0.55; p2 = Avg(L2ij(C2)) = 0.45

After 17 iterations, the mixing proportions converge to (0.6042, 0.3958).
Log-likelihood of C = Σij log(p1 . Lij(C1) + p2 . Lij(C2)) = -2.7314
Log-likelihood of best specific class = -3.4296 (C1)
Log-likelihoods considering penalties: C (penalty 4.3922) = -7.1235; C1 (penalty 4.0456) = -7.4752
Best class: the class with the maximum log-likelihood (C1)
Related Work
EM performance improvement:
Single candidate: [Aitken's Method], [Triple Jump – Huang et al.]
Multiple candidates: [Context-Inclusive]
Context-Exclusive Approach: Instance Tree
Each candidate model is analyzed independently until convergence; the candidate model with the maximum likelihood is selected.

Context-Exclusive Approach:
1. Select the best specific class, Brush (specific classes do not require EM)
2. Vegetation is evaluated until convergence (46 iterations)
3. Forest is evaluated until convergence (34 iterations)
4. Non-Forest is evaluated until convergence (3 iterations)
5. Select the best class (Non-Forest)
[Figure: Land-use class hierarchy and instance tree]
Total iterations: 46 + 34 + 3 = 83
Limitations of Context-Exclusive Approach: Computational Scalability
For 512 x 512 pixels, 7 hours of CPU time.
Where is the computational bottleneck?
80% of the total execution time is spent computing the maximum likelihood.
The number of function calls depends on the number of pixels and the spatial scale.
[Figure: CPU time for example datasets]
As the spatial scale increases, the computation time increases exponentially.
Contributions – Context-Inclusive Approaches
Two Context-Inclusive approaches: Ideal and Heuristic.

Context-Inclusive Approach – Ideal
If we can calculate a theoretical upper bound on the likelihood of each class, the bound can be used to filter candidates. If the upper-bound calculation is cheap and has a good filtering property, the Context-Inclusive approach may perform better.

Context-Inclusive Approach – Ideal:
1. Select the best specific class, Brush (specific classes do not require EM)
2. Calculate an upper bound for each non-specific class (Non-Forest, Forest, Vegetation)
3. Assume Non-Forest is evaluated first (3 iterations)
4. Forest can be pruned because its upper bound is less than the likelihood of Non-Forest (0 iterations)
5. Vegetation can be pruned because its upper bound is less than the likelihood of Non-Forest (0 iterations)
6. Non-Forest is selected
[Figure: Land-use class hierarchy]
Total EM iterations: 3 + 0 + 0 = 3
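The pruning logic of the Ideal approach can be sketched as follows. The `upper_bound` and `evaluate` callbacks are hypothetical placeholders for the bound computation and the full EM run; they are not part of the original implementation.

```python
def classify_with_bounds(candidates, upper_bound, evaluate):
    """Ideal Context-Inclusive: evaluate candidates in order of decreasing
    upper bound, skipping any whose bound cannot beat the best so far."""
    best_class, best_ll = None, float("-inf")
    for c in sorted(candidates, key=upper_bound, reverse=True):
        if upper_bound(c) <= best_ll:
            continue                      # pruned: 0 EM iterations spent
        ll = evaluate(c)                  # full EM evaluation
        if ll > best_ll:
            best_class, best_ll = c, ll
    return best_class, best_ll

# Toy illustration: only the first candidate needs a full evaluation.
bounds = {"Non-Forest": 6.0, "Forest": 4.0, "Vegetation": 2.0}
values = {"Non-Forest": 5.0, "Forest": 3.0, "Vegetation": 1.0}
calls = []
best, ll = classify_with_bounds(
    bounds, bounds.get, lambda c: (calls.append(c), values[c])[1])
```

In this toy run only Non-Forest is fully evaluated; Forest and Vegetation are pruned by their bounds, mirroring the 3 + 0 + 0 iteration count above.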
Context-Inclusive Approach – Ideal: Lemma
Context-Inclusive is correct: each region is classified with the best candidate class from the user-defined concept hierarchy. The likelihood of a non-specific class can never exceed its upper bound, so the upper bound can be used to compare against the likelihoods of other non-specific classes.

Discussion: Is there a method to calculate the upper bound?
C. Biernacki, "An Asymptotic Upper Bound of the Likelihood to Prevent Gaussian Mixtures from Degenerating," Preprint, 2005 – but it is expensive.
Context-Inclusive Approach – Heuristic: Instance Tree
The instance tree is evaluated with context: each candidate model is analyzed only until it is better than the current best. Uses an instance-level syntax tree.

Context-Inclusive Approach – Heuristic:
1. Select the best specific class, Brush (specific classes do not require EM)
2. Vegetation is evaluated until convergence (46 iterations)
3. Forest is evaluated (4 iterations)
4. Non-Forest is evaluated (1 iteration)
5. Non-Forest is the best-so-far
[Figure: Land-use class hierarchy and instance tree]
Total iterations: 46 + 4 + 1 = 51
Context-Exclusive vs. Context-Inclusive Heuristic

Algorithm 1: Context-Exclusive Approach
1: Function ContextExclusive(set Cand)
2:   Select the best specific class
3:   for each candidate model c ∈ Cand do
4:     repeat
5:       Refine the quality measure of candidate model c
6:     until EM converges
7:   end for
8:   Select the candidate model with the maximum quality measure
9:   return c

Algorithm 2: Context-Inclusive Approach – Heuristic
1: Function ContextInclusive(set Cand)
2:   Select the best specific class
3:   for each remaining candidate model c ∈ Cand do
4:     repeat
5:       Refine the quality measure of candidate model c
6:     until EM converges OR the quality measure exceeds the best so far
7:   end for
8:   Select the candidate model that is best so far
9:   return c
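The early-exit loop of Algorithm 2 can be sketched in Python as a simplified illustration; the function name, array layout, and thresholds are assumptions. Passing `best_so_far` reproduces the Context-Inclusive stopping rule, while omitting it gives Context-Exclusive behavior (run EM to convergence).

```python
import numpy as np

def em_loglike(child_likes, best_so_far=None, tol=1e-5, max_iter=200):
    """EM for one non-specific class over stacked per-pixel child
    likelihoods (shape: k x rows x cols). With best_so_far set, stop as
    soon as the log-likelihood exceeds it (Context-Inclusive heuristic)."""
    k = child_likes.shape[0]
    p = np.full(k, 1.0 / k)                       # initial mixing proportions
    iters = 0
    while iters < max_iter:
        iters += 1
        weighted = p[:, None, None] * child_likes      # multiply
        mixture = weighted.sum(axis=0)                 # add per pixel
        loglike = float(np.log(mixture).sum())
        if best_so_far is not None and loglike > best_so_far:
            break                                      # already beats the best
        new_p = (weighted / mixture).mean(axis=(1, 2))  # divide + average
        if np.max(np.abs(new_p - p)) < tol:
            p = new_p
            break
        p = new_p
    return loglike, iters

likes = np.array([[[0.2, 0.2], [0.9, 0.9]],
                  [[0.8, 0.8], [0.1, 0.1]]])
full_ll, full_iters = em_loglike(likes)                     # to convergence
fast_ll, fast_iters = em_loglike(likes, best_so_far=-10.0)  # exits early
```

On the 2x2 example from the execution trace, the full run converges near the log-likelihood of C, while a loose `best_so_far` bound terminates after a single iteration.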
Experimental Design
Input: Synthetic dataset and real dataset
Language: MATLAB
Platform: UltraSPARC III 1.1 GHz, 1 GB RAM
Measurements: Number of iterations, CPU time, accuracy
Candidates: Context-Exclusive, Context-Inclusive Heuristic
[Figure: experimental setup – benchmark datasets (synthetic, real) and the limiting factor feed image classification; classifications are compared for accuracy and measurements are recorded]

Experimental Questions:
How does the Context-Exclusive approach compare to the Context-Inclusive Heuristic approach in accuracy and computational efficiency?
Experiments – Dataset S
Synthetic dataset: 128 x 128 pixels, 7 classes
Input: Class hierarchy, likelihood of specific classes
[Figure: Land-use class hierarchy (Conifer, Hardwood, Brush, Grass) and likelihoods of the specific classes]
Output: Classified images at multiple scales (1x1, 2x2, 4x4, ..., 64x64)
Experiments – Dataset R
Real dataset: Plymouth County, Massachusetts; 128 x 128 pixels, 12 classes
Input: Class hierarchy, likelihood of specific classes
[Figure: Land-use class hierarchy (Barren, Brush, Pitch Pine, Bogs, ...)]
Output: Classified images at multiple scales (1x1, 2x2, 4x4, ..., 64x64)
How accurate is Context-Inclusive compared to Context-Exclusive?
Accuracy (limiting factor = 0.00001):
Above 99% for the synthetic dataset; about 98% for the real dataset
How does the computation of Context-Exclusive compare to that of Context-Inclusive?
Number of iterations (limiting factor = 0.00001):
Iterations reduced by 67% for the synthetic dataset and 61% for the real dataset
[Figure: iteration counts for Dataset S and Dataset R]
How does the computation of Context-Exclusive compare to that of Context-Inclusive?
CPU time (limiting factor = 0.00001):
CPU time reduced by 53% for the synthetic dataset and 47% for the real dataset
[Figure: CPU times for Dataset S and Dataset R]
Conclusion
Context-Inclusive approach for function evaluation
Experimental results supporting the contributions: reduced CPU time by 50% without sacrificing accuracy
Future work: Context-Inclusive with upper bound

Summary
Approach | Description | Results | Effort (hrs)
Black-box tuning | Limiting factor was modified to control the precision | Reduced CPU time by 50% with 98% accuracy | 20
Code conversion | Code was converted from MATLAB to C | Reduced CPU time by about 50% | 120
Parallel implementation | Converted code from C to UPC on Cray X1 | Obtained near-linear scalability | 20
Algorithm changes | Context-Inclusive approach | Reduced CPU time without sacrificing accuracy | 100
Acknowledgements
James Kang, Dr. Junchang Ju, Dr. Eric Kolaczyk, Dr. Sucharita Gopal
Context-Exclusive Approach: Instance Tree
Each candidate model is analyzed independently until convergence; the candidate model with the maximum likelihood is selected.

Context-Exclusive Approach:
1. Vegetation is evaluated until convergence, reaching likelihood L1
2. Forest is evaluated until convergence, reaching likelihood L2
3. Non-Forest is evaluated until convergence, reaching likelihood L3
[Figure: quality measure vs. iterations for Vegetation, Forest, and Non-Forest, each run to convergence at L1, L2, L3]
Contributions: Context-Inclusive Approach
The instance tree is evaluated with context: each candidate model is analyzed only until it is better than the current best. Uses an instance-level syntax tree.

Context-Inclusive Approach:
1. Vegetation is evaluated until convergence, L1
2. Forest is evaluated only until L2
3. Non-Forest is evaluated only until L3
[Figure: quality measure vs. iterations for Vegetation, Forest, and Non-Forest; Forest and Non-Forest stop early at L2 and L3]