Review: Scalable Semantic Web Data Management Using Vertical Partitioning
![Page 1: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/1.jpg)
Abadi, Marcus, Madden, Hollenbach
VLDB 2007
Presented by: {Gui}llermo Cabrera
The University of Texas at Austin
![Page 2: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/2.jpg)
Problem
Storage Goal
RDBMS use
RDF Physical Organization
Column store vs. Row Store
Materialized Path Expressions
Experiment & Results
Discussion
![Page 3: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/3.jpg)
Performance: Self-joins
Many triples
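The self-join problem can be illustrated with a small sketch. It assumes a single three-column `triples` table in an in-memory SQLite database; the table name, column names, and sample books are hypothetical and do not come from the slides. A query that touches three properties of one subject needs two self-joins over the same table.

```python
import sqlite3

# Hypothetical triple store: one table, three columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subj TEXT, prop TEXT, obj TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("book1", "title", "XYZ"),
        ("book1", "author", "Fox, Joe"),
        ("book1", "copyright", "2001"),
        ("book2", "title", "ABC"),
        ("book2", "author", "Orr, Tim"),
        ("book2", "copyright", "1985"),
    ],
)

# Three properties of the same subject -> the table is joined with itself twice.
SELF_JOIN_QUERY = """
SELECT t.obj
FROM triples AS t
JOIN triples AS a ON a.subj = t.subj
JOIN triples AS c ON c.subj = t.subj
WHERE t.prop = 'title'
  AND a.prop = 'author'    AND a.obj = 'Fox, Joe'
  AND c.prop = 'copyright' AND c.obj = '2001'
"""

titles = [row[0] for row in conn.execute(SELF_JOIN_QUERY)]
```

Each additional property in the query adds another self-join, and every join scans a table that holds all triples.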
![Page 4: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/4.jpg)
Achieve scalability & performance in triple storage
Survey approaches in RDBMS
Benefits of vertical partition and column store
![Page 5: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/5.jpg)
1 table with 3 indexed columns?
Multi-layer architecture
◦ Translate -> Optimize -> Execute
Mapping tables for long URI and literals
Jena, Oracle, Sesame, 3store (Hyunjun), Hexastore (Donghyuk)
![Page 6: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/6.jpg)
Property tables
◦ Clustered property table
Denormalize RDF (wider tables)
Clustering algorithm
NULL values
![Page 7: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/7.jpg)
![Page 8: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/8.jpg)
Property tables
◦ Property-Class Tables
Exploit the type property
Properties may exist in multiple tables
![Page 9: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/9.jpg)
![Page 10: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/10.jpg)
Advantage:
◦ Fewer joins
Disadvantage:
◦ NULL values
◦ Multivalued attributes are complicated
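Both disadvantages can be seen in a minimal sketch. It assumes a property table is one wide row per subject with a fixed set of columns; the function names, the `overflow` list, and the sample triples are hypothetical illustrations, not the design of any system named in the slides.

```python
def build_property_table(triples, columns):
    """Fold triples into one wide row per subject.

    Returns (table, overflow): overflow collects triples that do not fit,
    either because the property has no column or the cell is already taken.
    """
    table = {}      # subject -> {column: value or None}
    overflow = []   # leftover triples that must be stored elsewhere
    for subj, prop, obj in triples:
        if prop not in columns:
            overflow.append((subj, prop, obj))
            continue
        row = table.setdefault(subj, dict.fromkeys(columns))
        if row[prop] is None:
            row[prop] = obj
        else:
            # A second value for the same property has no cell to go into.
            overflow.append((subj, prop, obj))
    return table, overflow


def count_nulls(table):
    """Count empty cells across all rows."""
    return sum(value is None for row in table.values() for value in row.values())


triples = [
    ("book1", "title", "XYZ"),
    ("book1", "author", "Fox, Joe"),
    ("book1", "author", "Orr, Tim"),   # multivalued attribute
    ("book2", "title", "ABC"),
    ("cd1", "artist", "Orr, Tim"),     # property outside the cluster
]
table, overflow = build_property_table(triples, ["title", "author", "copyright"])
nulls = count_nulls(table)
```

In this sample, two rows of three columns carry three NULL cells, and the second author of `book1` has to be stored outside the table.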
![Page 11: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/11.jpg)
Vertical Partition
◦ n two-column tables, n = # of unique properties
◦ Table sorted by subject
Merge join
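The layout and the join can be sketched as follows. This is an illustrative sketch only: the function names and sample triples are hypothetical, and the tables are plain Python lists rather than database relations. Each property gets its own (subject, object) table sorted by subject, so two properties can be combined with a linear merge join.

```python
from collections import defaultdict


def vertically_partition(triples):
    """Split triples into one (subject, object) table per property, sorted by subject."""
    tables = defaultdict(list)
    for subj, prop, obj in triples:
        tables[prop].append((subj, obj))
    for rows in tables.values():
        rows.sort()
    return dict(tables)


def merge_join(left, right):
    """Join two subject-sorted tables on subject in a single pass."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        ls, rs = left[i][0], right[j][0]
        if ls < rs:
            i += 1
        elif ls > rs:
            j += 1
        else:
            # Find the run of equal subjects on each side (multi-valued properties).
            i_end = i
            while i_end < len(left) and left[i_end][0] == ls:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == ls:
                j_end += 1
            for _, left_obj in left[i:i_end]:
                for _, right_obj in right[j:j_end]:
                    out.append((ls, left_obj, right_obj))
            i, j = i_end, j_end
    return out


triples = [
    ("book1", "title", "XYZ"),
    ("book1", "author", "Fox, Joe"),
    ("book1", "author", "Orr, Tim"),
    ("book2", "title", "ABC"),
    ("book3", "author", "Lee, Ann"),
]
tables = vertically_partition(triples)
joined = merge_join(tables["title"], tables["author"])
```

Subjects that lack a property simply have no row in that property's table, so no NULLs are stored, and a subject with several values for a property has several rows.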
![Page 12: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/12.jpg)
![Page 13: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/13.jpg)
• Advantage
Multi valued attributes supported
No clustering algorithm (Property tables)
Only accessed properties are read
• Disadvantage
Queries using multiple properties need table joins
Inserts expensive
![Page 14: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/14.jpg)
Triple Store
Property Table
Vertical Partition (Row Store)
Vertical Partition Store (Column Store)
![Page 15: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/15.jpg)
Why?
Projection is free
Tuple headers (metadata on row)
◦ 35 bytes in Postgres vs. 8 bytes in C-Store
Column-oriented compression
◦ Run-length encoding (ex. 1,1,1,2,2 → 1x3, 2x2)
Optimized merge join
◦ Prefetching
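The run-length encoding example from this slide can be reproduced with a short sketch. The function names are hypothetical, and this shows only the general technique, not C-Store's actual on-disk format.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, length in runs for _ in range(length)]


column = [1, 1, 1, 2, 2]
encoded = rle_encode(column)   # [(1, 3), (2, 2)], i.e. 1x3, 2x2
decoded = rle_decode(encoded)
```

The scheme pays off on sorted columns, such as a subject column sorted by subject, because equal values end up adjacent.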
![Page 16: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/16.jpg)
<BookID1, Author, http://preamble/FoxJoe>
<http://preamble/FoxJoe, wasBorn, “1860”>
Find all books whose authors were born in 1860
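A rough sketch of how this query runs over vertically partitioned tables follows. The `author` and `was_born` tables stand for the two-column tables of the `Author` and `wasBorn` properties; the second book, the second person, and the function name are hypothetical additions for illustration. The key point is that the object of `author` must match the subject of `was_born`, which is a subject-object join.

```python
# Two-column (subject, object) tables, one per property.
author = [
    ("BookID1", "http://preamble/FoxJoe"),
    ("BookID2", "http://preamble/OrrTim"),   # hypothetical extra row
]
was_born = [
    ("http://preamble/FoxJoe", "1860"),
    ("http://preamble/OrrTim", "1901"),      # hypothetical extra row
]


def books_by_author_birth_year(author, was_born, year):
    """Subject-object join: author.object must equal was_born.subject."""
    born_in_year = {person for person, born in was_born if born == year}
    return [book for book, person in author if person in born_in_year]


books = books_by_author_birth_year(author, was_born, "1860")
```

Because `author` is sorted by subject and not by object, this join cannot use the cheap merge on subject that both tables are sorted for.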
![Page 17: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/17.jpg)
![Page 18: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/18.jpg)
Barton Libraries Dataset
Longwell Queries
◦ Calculating counts
◦ Filtering
◦ Inference
![Page 19: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/19.jpg)
8.3 GB – Triple Store (Postgres)
14 GB – Property Table (Postgres)
5.2 GB – Vertically Partitioned (Postgres)
2.7 GB – Vertically Partitioned (C-Store)
Including indices and mapping table
![Page 20: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/20.jpg)
![Page 21: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/21.jpg)
![Page 22: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/22.jpg)
![Page 23: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/23.jpg)
Replace subject-object joins with subject-subject joins
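A minimal sketch of the idea, assuming the path Author then wasBorn is precomputed into its own two-column table keyed by the book. The function name, the `title` table, and the extra rows are hypothetical. Once the path is materialized, the birth-year filter and any other book property line up on the same subject.

```python
from collections import defaultdict


def materialize_path(first, second):
    """Precompute the path first -> second as (subject of first, object of second)."""
    second_by_subject = defaultdict(list)
    for subj, obj in second:
        second_by_subject[subj].append(obj)
    return sorted(
        (subj, end)
        for subj, mid in first
        for end in second_by_subject.get(mid, [])
    )


author = [
    ("BookID1", "http://preamble/FoxJoe"),
    ("BookID2", "http://preamble/OrrTim"),   # hypothetical extra row
]
was_born = [
    ("http://preamble/FoxJoe", "1860"),
    ("http://preamble/OrrTim", "1901"),      # hypothetical extra row
]
title = [("BookID1", "XYZ"), ("BookID2", "ABC")]  # hypothetical property table

# Done once, ahead of query time: the subject-object join is paid for here.
author_was_born = materialize_path(author, was_born)

# At query time both tables are keyed by the book: a subject-subject join.
books_1860 = {book for book, year in author_was_born if year == "1860"}
titles = [name for book, name in title if book in books_1860]
```

The trade-off is extra storage and maintenance for each materialized path, in exchange for cheaper joins at query time.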
![Page 24: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/24.jpg)
Add 60 integer valued columns
7 GB increase in size
![Page 25: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/25.jpg)
Great for reads, writes not considered
What about load times?
Using another benchmark (ex. LUBM)?
Native XML databases for RDF/XML?
Test triple store in Sesame
![Page 26: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/26.jpg)
![Page 27: Review: Scalable Semantic Web Data Management Using Vertical Partitioning](https://reader033.fdocuments.net/reader033/viewer/2022060203/559e1ac91a28abd75b8b4614/html5/thumbnails/27.jpg)