Dual Bitmap Index: Space-Time Efficient Bitmap Index for Equality and Membership Queries


Page 1: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries

Dual Bitmap Index: Space-Time Efficient Bitmap Index for Equality and Membership Queries

Niwan Wattanakitrungroj and Sirirut Vanichayobon

Artificial Intelligence Research Laboratory, Department of Computer Science, Prince of Songkla University

Page 2: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries

ISCIT 2006, Artificial Intelligence Research Group

Outline

Introduction

Variations of Bitmap Index

- Simple Bitmap Index

- Interval Bitmap Index

- Scatter Bitmap Index

- Encoded Bitmap Index

- Dual Bitmap Index

Performance Study

Conclusion

Page 3: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries

Introduction

[Figure: Data Warehousing Architecture. Labeled components: data sources (operational DBs, external sources); extract, transform, load, refresh; metadata repository; monitoring and administration; data warehouse and data marts; OLAP servers; tools for analysis, query/reporting, and data mining.]

- A data warehouse is a large repository of information accessed through OLAP applications.

- A majority of requests for information from a data warehouse involve dynamic ad hoc queries.

- The ability to answer these queries quickly is a critical issue in the data warehouse environment.

Page 4: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries

Introduction

To speed up query processing:

- Summary tables

- Indexes

- Parallel machines

Page 5: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries

Introduction: Bitmap Index

Characteristics:

- simple to represent

- uses less space

- more CPU-efficient

- low-cost Boolean operations

Page 6: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries

Introduction: Bitmap Index

Employee Table

RID  Name    Gender  Education
1    Suda    F       BS
2    Wichai  M       BS
3    John    M       MS
4    Marry   F       PhD
5    Somsak  M       BS
…    …

Bitmap index on Gender

RID  F  M
1    1  0
2    0  1
3    0  1
4    1  0
5    0  1

Bitmap index on Education

RID  BS  MS  PhD
1    1   0   0
2    1   0   0
3    0   1   0
4    0   0   1
5    1   0   0

Equality Query

Select Count(*)
From Employee
Where Gender = 'F';

Answer: 2

Select Name
From Employee
Where Gender = 'M' and Education = 'MS';

Answer: John

Membership Query

Select Name
From Employee
Where Education in ('MS', 'PhD');

Answer: John, Marry
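The three queries on the Employee table can be traced with a minimal Python sketch. It assumes each bitmap is held as a Python integer whose bit i stands for the row at position i; the names `employees`, `build_simple_bitmaps`, and `names` are illustrative and do not come from the slides.

```python
# Hypothetical in-memory copy of the Employee table from the slide.
employees = [
    ("Suda", "F", "BS"),
    ("Wichai", "M", "BS"),
    ("John", "M", "MS"),
    ("Marry", "F", "PhD"),
    ("Somsak", "M", "BS"),
]


def build_simple_bitmaps(rows, column):
    """Return {value: bitmap}; bit i of a bitmap is set when row i holds that value."""
    bitmaps = {}
    for i, row in enumerate(rows):
        bitmaps[row[column]] = bitmaps.get(row[column], 0) | (1 << i)
    return bitmaps


def names(bitmap):
    """Decode a result bitmap back into the names of the matching rows."""
    return [row[0] for i, row in enumerate(employees) if bitmap >> i & 1]


gender = build_simple_bitmaps(employees, 1)
education = build_simple_bitmaps(employees, 2)

# Equality query: count the set bits of the F bitmap.
count_f = bin(gender["F"]).count("1")

# Conjunction of two equality predicates: bitwise AND.
male_ms = names(gender["M"] & education["MS"])

# Membership query: bitwise OR of the bitmaps of the listed values.
ms_or_phd = names(education["MS"] | education["PhD"])
```

With one bitmap per distinct value, an equality predicate reads a single bitmap and a membership predicate ORs one bitmap per listed value, which is why the number of bitmaps grows with the cardinality of the attribute.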


Page 8: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries

Variations of Bitmap Index: Simple Bitmap Index (Related Work)

Let C be the number of distinct values of the indexed attribute (its cardinality).

C = 15: 15 bitmap vectors

Bitmap vectors: $S^0, S^1, S^2, \ldots, S^{C-1}$

"$A = v$" $= S^v$

Query: "$A = 2$" $= S^2$

Page 9: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries

Variations of Bitmap Index: Interval Bitmap Index (Related Work)

C = 15: 8 bitmap vectors

Bitmap vectors: $I^0, I^1, I^2, \ldots, I^{\lceil C/2 \rceil - 1}$

$I^j = [j,\ j + m]$, where $m = \lfloor C/2 \rfloor - 1$

[Equation: piecewise definition of "$A = v$" over the interval bitmaps, with cases on $v$, $m$, and $C$]

Query: "$A = 2$" $= I^2 \land \lnot I^3$
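The interval example query can be checked with a small Python sketch. It assumes that bitmap $I^j$ marks the rows whose value lies in $[j, j + m]$ with $m = \lfloor C/2 \rfloor - 1$, and that "A = 2" is evaluated as $I^2$ AND NOT $I^3$; the column data and all function names are made up for illustration.

```python
import math


def build_interval_bitmaps(values, cardinality):
    """Build ceil(C/2) bitmaps; bit i of bitmap j is set when j <= values[i] <= j + m."""
    m = cardinality // 2 - 1
    count = math.ceil(cardinality / 2)
    bitmaps = [0] * count
    for i, v in enumerate(values):
        for j in range(count):
            if j <= v <= j + m:
                bitmaps[j] |= 1 << i
    return bitmaps


def matching_rows(bitmap, n_rows):
    """List the row positions whose bit is set in the bitmap."""
    return [i for i in range(n_rows) if bitmap >> i & 1]


column = [2, 7, 2, 14, 0, 3, 8, 2]    # made-up attribute values in 0..14
interval = build_interval_bitmaps(column, 15)
all_rows = (1 << len(column)) - 1      # mask that keeps NOT within the table

# "A = 2" = I^2 AND NOT I^3: in [2, 8] but not in [3, 9].
rows_equal_2 = matching_rows(interval[2] & ~interval[3] & all_rows, len(column))
```

For C = 15 this builds 8 bitmaps, and the query reads two of them, at the cost of a NOT in addition to the AND.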

Page 10: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries

Variations of Bitmap Index: Scatter Bitmap Index (Related Work)

C = 15: 8 bitmap vectors, m = 5

Bitmap vectors: $L^1, L^2, \ldots, L^{\lceil\sqrt{C}\,\rceil - 1}, Z^0, Z^1, \ldots, Z^{\lceil\sqrt{C}\,\rceil}$, where $m = \lceil\sqrt{C}\,\rceil + 1$

[Equation: piecewise definition of "$A = v$" as the AND of two bitmap vectors ($Z \land Z$ in one case, $Z \land L$ otherwise), with indices computed from $v$ and $m$]

Query: "$A = 2$" $= Z^1 \land L^2$

Page 11: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries

Variations of Bitmap Index: Encoded Bitmap Index (Related Work)

C = 15: 4 bitmap vectors

Bitmap vectors: $E^0, E^1, E^2, \ldots, E^{\lceil \log_2 C \rceil - 1}$

[Figure: mapping table and all bitmap vectors]

Query: "$A = 2$"

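For the Encoded Bitmap Index, the mapping table of the slide is not recoverable, so the following Python sketch assumes the simplest possible mapping: value v is mapped to the binary code of v, and bitmap $E^k$ stores bit k of each row's code. The function names and the sample column are hypothetical.

```python
import math


def build_encoded_bitmaps(values, cardinality):
    """Build ceil(log2 C) bitmaps; bitmap k holds bit k of each row's code.

    Assumption: value v is simply mapped to the binary code of v.
    """
    width = math.ceil(math.log2(cardinality))
    bitmaps = [0] * width
    for i, v in enumerate(values):
        for k in range(width):
            if v >> k & 1:
                bitmaps[k] |= 1 << i
    return bitmaps


def equality(bitmaps, v, n_rows):
    """AND together E^k or NOT E^k for every k, following the bits of v's code."""
    all_rows = (1 << n_rows) - 1
    result = all_rows
    for k, bitmap in enumerate(bitmaps):
        result &= bitmap if v >> k & 1 else ~bitmap & all_rows
    return [i for i in range(n_rows) if result >> i & 1]


column = [2, 7, 2, 14, 0, 3, 8, 2]    # made-up attribute values in 0..14
encoded = build_encoded_bitmaps(column, 15)
rows_equal_2 = equality(encoded, 2, len(column))
```

Under this assumed mapping an equality query touches every one of the $\lceil \log_2 C \rceil$ bitmaps, which is the price paid for the small number of vectors.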

Page 13: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries

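Variations of Bitmap Index: Dual Bitmap Index

Encoding scheme of the five bitmap indices

- Simple Bitmap Index: needs $C$ bitmap vectors

- Interval Bitmap Index: needs $\lceil C/2 \rceil$ bitmap vectors

- Scatter Bitmap Index: needs $2\lceil\sqrt{C}\,\rceil$ bitmap vectors

- Encoded Bitmap Index: needs $\lceil \log_2 C \rceil$ bitmap vectors

- Dual Bitmap Index: needs $\lceil \sqrt{2C + 0.25} + 0.5 \rceil$ bitmap vectors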
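The vector counts quoted on the individual slides for C = 15 (15, 8, 8, 4, and n = 6) can be reproduced with a short Python sketch. The rounding is an assumption chosen to match those counts: ceilings are applied to C/2, to the square root in the Scatter count, to the logarithm, and to the Dual expression. The function name `vector_counts` is illustrative.

```python
import math


def vector_counts(cardinality):
    """Number of bitmap vectors each index needs for an attribute of cardinality C."""
    c = cardinality
    return {
        "Simple": c,
        "Interval": math.ceil(c / 2),
        "Scatter": 2 * math.ceil(math.sqrt(c)),      # rounding is an assumption
        "Encoded": math.ceil(math.log2(c)),
        "Dual": math.ceil(math.sqrt(2 * c + 0.25) + 0.5),
    }


counts_15 = vector_counts(15)
```

Since every bitmap vector holds one bit per record, multiplying a count by the number of records gives the uncompressed index size in bits.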

Page 14: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries


Dual Bitmap Index

Variations of Bitmap Index

Page 15: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries

Variations of Bitmap Index: Creation of Dual Bitmap Index

Example: C = 15, A = {0, 1, 2, …, 14}

1. Assign an increasing sequence of numbers to each of the distinct values of A (i.e., 0, 1, …, C−1).

2. Calculate n (the total number of bitmap vectors created):

$n = \lceil \sqrt{2C + 0.25} + 0.5 \rceil$

For C = 15, n = 6.

3. Calculate $C_{hi}$ (the highest value of C that can be represented by n bitmap vectors):

$C_{hi} = \binom{n}{2} = \frac{n(n-1)}{2}$

For n = 6, $C_{hi}$ = 15.

4. For each record with value v of A, set the record's bit in bitmap vector $D^i$ to

$D^i = 1$ if $i = r$ or $i = s$, and $D^i = 0$ otherwise,

where

$r = \lceil \sqrt{2(C_{hi} - v) + 0.25} - 0.5 \rceil$,

$s = \left( v - \frac{(n - r)(n - r - 1)}{2} \right) \bmod r$,

and v is the value of the indexed attribute for the record.
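The creation steps can be sketched in Python as follows. The formulas for n and $C_{hi}$ follow the slide (n = 6 and $C_{hi}$ = 15 for C = 15). The expressions used for r and s are a reconstruction: they are assumptions chosen so that the slide's worked example, value 2 mapped to the vectors $D^5$ and $D^2$, comes out, and so that every value receives a distinct pair. The function names are hypothetical.

```python
import math


def dual_parameters(cardinality):
    """Return (n, c_hi): the number of bitmap vectors and the largest cardinality they cover."""
    n = math.ceil(math.sqrt(2 * cardinality + 0.25) + 0.5)
    return n, n * (n - 1) // 2


def dual_pair(v, cardinality):
    """Return the pair (r, s) of bitmap-vector indices whose bits are set for value v.

    The r and s expressions are a reconstruction, not a confirmed transcription.
    """
    n, c_hi = dual_parameters(cardinality)
    r = math.ceil(math.sqrt(2 * (c_hi - v) + 0.25) - 0.5)
    s = (v - (n - r) * (n - r - 1) // 2) % r
    return r, s


# One pair per sequence number 0..14 of the slide's example attribute.
pairs = [dual_pair(v, 15) for v in range(15)]
```

Because n vectors offer n(n−1)/2 unordered pairs, 6 vectors are enough to give each of the 15 values its own pair, which is where the space saving over one vector per value comes from.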

Page 16: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries

Variations of Bitmap Index (Proposed Bitmap Index): Equality and Membership Queries

1. Find the sequence number of the search value.

2. "$A = v$" $= D^r \land D^s$

where

$r = \lceil \sqrt{2(C_{hi} - v) + 0.25} - 0.5 \rceil$,

$s = \left( v - \frac{(n - r)(n - r - 1)}{2} \right) \bmod r$,

and v is the value of the indexed attribute for the record.

Example: "$A = 2$" $= D^5 \land D^2$
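A minimal end-to-end Python sketch of the Dual Bitmap Index follows, under stated assumptions: bitmaps are Python integers with one bit per row, the r and s expressions are a reconstruction that reproduces the slide's example ("A = 2" answered by $D^5$ AND $D^2$), and a membership query is answered by ORing the equality results of the listed values, a step the slides do not spell out. The sample column and all names are made up.

```python
import math


def dual_pair(v, n):
    """Pair (r, s) of vector indices for sequence number v, given n bitmap vectors."""
    c_hi = n * (n - 1) // 2
    r = math.ceil(math.sqrt(2 * (c_hi - v) + 0.25) - 0.5)
    s = (v - (n - r) * (n - r - 1) // 2) % r
    return r, s


def build_dual_index(values, cardinality):
    """Build n bitmaps; row i sets its bit in exactly the two vectors D^r and D^s."""
    n = math.ceil(math.sqrt(2 * cardinality + 0.25) + 0.5)
    vectors = [0] * n
    for i, v in enumerate(values):
        for k in dual_pair(v, n):
            vectors[k] |= 1 << i
    return vectors


def equality(vectors, v):
    """'A = v' is a single AND of the two vectors assigned to v."""
    r, s = dual_pair(v, len(vectors))
    return vectors[r] & vectors[s]


def membership(vectors, wanted):
    """OR together the equality results (an assumption, not taken from the slides)."""
    result = 0
    for v in wanted:
        result |= equality(vectors, v)
    return result


def rows(bitmap):
    """List the row positions whose bit is set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if bitmap >> i & 1]


column = [2, 7, 2, 14, 0, 3, 8, 2]    # made-up attribute values in 0..14
dual = build_dual_index(column, 15)
rows_equal_2 = rows(equality(dual, 2))
rows_in_0_or_14 = rows(membership(dual, [0, 14]))
```

A row sets bits in exactly two vectors, so the AND of $D^r$ and $D^s$ can only select rows whose own pair is (r, s); no NOT operation is needed.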


Page 18: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries


Performance study

Page 19: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries

Performance study

[Figure: Number of bitmap vectors used to represent an attribute with cardinality C (space). Series: Simple, Interval, Scatter, Dual, Encoded.]

Page 20: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries


Performance study

Page 21: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries

Performance study

[Figure: Space-time trade-off for the five bitmap indices (Simple, Interval, Scatter, Encoded, Dual). C = 50, N = 1,000,000; data sets from the TPC-H benchmark.]


Page 23: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries


Conclusion

The Dual Bitmap Index uses less space while maintaining query processing time for equality and membership queries.

The Dual Bitmap Index achieves this by representing each attribute value with only two bitmap vectors, so only the low-cost Boolean AND operation is needed to answer an equality query.

Dual Bitmap Index has better space-time performance than the other bitmap indexing techniques.

Simple Bitmap Index requires the most space.

The Encoded Bitmap Index has the worst processing time.

Page 24: Dual Bitmap Index: Space-Time Efficient Bitmap  Index for Equality  and Membership Queries


Thank You

Question & answer