HOW TO OPTIMIZE A HIERARCHY IN SQL SERVER
Louis Davidson (drsql.org), drsql@hotmail.com
Hierarchies
Hierarchy Types
• Trees – Single Parent Hierarchies
• Graphs – Multi Parent Hierarchies
• Note: Graphs can be complex to deal with as a whole, but often you can deal with them as a set of trees
(Diagram: example hierarchy with nodes Screw, Piece of Wood, Tape, Wood with Tape, and Screw and Tape)
Hierarchy Uses
• Trees
  • Species
  • Jurisdictions
  • “Simple” Organizational Charts (or at least the base manager-employee part of the organization)
  • Directory folders
• Graphs
  • Bill of materials
  • Complex Organization Chart (all those dotted lines!)
  • Genealogies
    • Biological (typically with the cardinality of parents limited to 2)
    • Family Tree (sky is the limit)
Implementation of a Hierarchy
• “There is more than one way to shave a dog”
  • None of which are pleasant for the dog or the shaver
  • And the doctor who orders it only asks for a bald dog
• Hierarchies are not at all natural to manipulate/query using relational code
  • And the natural, recursive processing of a node at a time is horribly difficult and slow in relational code
  • So, multiple methods of processing them have arisen through the years
• The topic (much like the topic of how cruel it is to shave a dog) inspires religious-like arguments
• I find all of the implementation possibilities fascinating, so I set out to do an overview of them all…
Working with Trees - Background
• Node Recursion
• Relational Recursion
Cycles in Hierarchies
• “I’m my own grandpa” syndrome
• Must be understood, or it can cause an infinite loop in processing
• Generally disallowed in trees
• Generally handled in graphs
(Diagram: a cycle among Grandparent, Parent, and Child nodes)
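One way to keep a cycle out of a tree is to check, before a reparent, whether the proposed parent already sits underneath the node being moved. The following is a minimal T-SQL sketch, not the presenter's demo code; the #Node table, its columns, and the sample rows are hypothetical and exist only for illustration.

```sql
-- Hypothetical adjacency-list table used only for this sketch.
CREATE TABLE #Node
(
    NodeId       int NOT NULL PRIMARY KEY,
    ParentNodeId int NULL
);

INSERT INTO #Node (NodeId, ParentNodeId)
VALUES (1, NULL),   -- Grandparent
       (2, 1),      -- Parent
       (3, 2);      -- Child

-- Proposed change: make node 1 (Grandparent) a child of node 3 (Child).
DECLARE @NodeId int = 1, @NewParentId int = 3;

-- Walk upward from the proposed parent. If the walk reaches the node
-- being moved, the change would make that node its own ancestor.
WITH Ancestors AS
(
    SELECT NodeId, ParentNodeId
    FROM   #Node
    WHERE  NodeId = @NewParentId
    UNION ALL
    SELECT n.NodeId, n.ParentNodeId
    FROM   #Node AS n
           JOIN Ancestors AS a
               ON n.NodeId = a.ParentNodeId
)
SELECT CASE WHEN EXISTS (SELECT * FROM Ancestors WHERE NodeId = @NodeId)
            THEN 'Would create a cycle'
            ELSE 'Safe to reparent'
       END AS Result;

DROP TABLE #Node;
```

With the sample rows, the upward walk from node 3 passes through node 2 and reaches node 1, so the proposed move is reported as a cycle. The check assumes the existing data is already cycle-free; otherwise the upward walk itself could loop.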
Tree Processing Algorithms
• There are several methods for processing trees in SQL
• We will cover:
  • Fixed Levels
  • Adjacency List
  • HierarchyId
  • Path Technique
  • Nested Sets
  • Kimball Helper Table
• Without giving away too much, pretty much all of the methods have some use…
Preconceived Notions
Which method/algorithm do you expect to be fastest?
• Fixed Levels
• Adjacency List
• HierarchyId
• Path Technique
• Nested Sets
• Kimball Helper Table
Coding for trees
• Manipulation:
  • Creating a new node
  • Moving/Reparenting a node
  • Deleting a node (with/without children)
• Usage:
  • Getting the children of a node
  • Getting the parent of a node
  • Aggregating along the tree
• Note: No tree algorithm allows for “simple” SQL solutions to all of these problems
• We will have demos of all of these operations…
Reparenting Example
• Starting with: (diagram not reproduced)
• Perhaps ending with: (diagram not reproduced)
• The moved node drags all of its child nodes along with it
Implementing a tree – Fixed Levels
CREATE TABLE CompanyHierarchy
(
    Company      varchar(100) NOT NULL,  -- primary key columns cannot be nullable in SQL Server
    Headquarters varchar(100) NOT NULL,
    Branch       varchar(100) NOT NULL,
    PRIMARY KEY (Company, Headquarters, Branch)
)
Very limited, but very fast and easy to work with
I will not demo this structure today because its use is both extremely obvious and limited
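For reference, a minimal sketch of why fixed levels are easy to work with: every level is an ordinary column, so children and rollups are plain WHERE and GROUP BY queries. The temp table mirrors the shape of the table above; the row values are made up for illustration.

```sql
-- Temp table with the same shape as the fixed-level table above.
CREATE TABLE #CompanyHierarchy
(
    Company      varchar(100) NOT NULL,
    Headquarters varchar(100) NOT NULL,
    Branch       varchar(100) NOT NULL,
    PRIMARY KEY (Company, Headquarters, Branch)
);

-- Hypothetical sample rows.
INSERT INTO #CompanyHierarchy (Company, Headquarters, Branch)
VALUES ('Company', 'HQ East', 'Branch A'),
       ('Company', 'HQ East', 'Branch B'),
       ('Company', 'HQ West', 'Branch C');

-- Children of a node: filter on the columns above the level you want.
SELECT Branch
FROM   #CompanyHierarchy
WHERE  Company = 'Company'
  AND  Headquarters = 'HQ East';

-- Aggregating along the tree: group by the level of interest.
SELECT Company, Headquarters, COUNT(*) AS BranchCount
FROM   #CompanyHierarchy
GROUP BY Company, Headquarters;

DROP TABLE #CompanyHierarchy;
```

The limitation is equally visible: adding a fourth level means adding a column and changing every query.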
Implementing a tree – Adjacency List
• Every row includes the key value of the parent in the row
• Parent-less rows have no parent value
• Code to get information out is the most complex to write (though not as inefficient as it might seem)
CREATE TABLE CompanyHierarchy
(
    Organization       varchar(100) NOT NULL PRIMARY KEY,
    ParentOrganization varchar(100) NULL
        REFERENCES CompanyHierarchy (Organization),
    Name               varchar(100) NOT NULL
)
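The "most complex to write" code for an adjacency list is typically a recursive common table expression. The sketch below is one possible way to fetch a node's subtree and to reparent a node, not the presenter's demo code. It uses a temp table with the same columns as above (without the self-referencing foreign key, since SQL Server does not enforce foreign keys on temp tables), and the organization names are hypothetical.

```sql
CREATE TABLE #CompanyHierarchy
(
    Organization       varchar(100) NOT NULL PRIMARY KEY,
    ParentOrganization varchar(100) NULL,
    Name               varchar(100) NOT NULL
);

-- Hypothetical sample rows.
INSERT INTO #CompanyHierarchy (Organization, ParentOrganization, Name)
VALUES ('Company',  NULL,      'Company'),
       ('HQ-East',  'Company', 'Eastern Headquarters'),
       ('HQ-West',  'Company', 'Western Headquarters'),
       ('Branch-A', 'HQ-East', 'Branch A'),
       ('Branch-B', 'HQ-East', 'Branch B');

DECLARE @Root varchar(100) = 'HQ-East';

-- Anchor member: the starting node. Recursive member: rows whose parent
-- is already in the result set. TreeLevel counts hops from the start.
WITH Descendants AS
(
    SELECT Organization, ParentOrganization, Name, 0 AS TreeLevel
    FROM   #CompanyHierarchy
    WHERE  Organization = @Root
    UNION ALL
    SELECT c.Organization, c.ParentOrganization, c.Name, d.TreeLevel + 1
    FROM   #CompanyHierarchy AS c
           JOIN Descendants AS d
               ON c.ParentOrganization = d.Organization
)
SELECT Organization, Name, TreeLevel
FROM   Descendants
ORDER BY TreeLevel, Organization;

-- Reparenting touches one row; the subtree follows because children
-- reference the moved node by key, not by position.
UPDATE #CompanyHierarchy
SET    ParentOrganization = 'HQ-West'
WHERE  Organization = 'Branch-A';

DROP TABLE #CompanyHierarchy;
```

This is the trade-off of the adjacency list: modifications are trivial, while reads need recursion.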
Implementing a tree – Path Method
Every row includes a representation of the path to its parent
Processing makes use of LIKE and string processing (I have seen a case that used fixed-length binary values)
The path size is limited by what string manipulation and indexing can handle
CREATE TABLE CompanyHierarchy
(
    OrganizationId int          NOT NULL PRIMARY KEY,
    Name           varchar(100) NOT NULL,
    Path           varchar(900)
)
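A minimal sketch of how LIKE does the work in the path method. The path encoding is an assumption made for illustration: each path is a backslash-delimited list of OrganizationId values from the root down to and including the row itself, with a trailing delimiter. The sample rows are hypothetical.

```sql
CREATE TABLE #CompanyHierarchy
(
    OrganizationId int          NOT NULL PRIMARY KEY,
    Name           varchar(100) NOT NULL,
    Path           varchar(900) NOT NULL
);

-- Hypothetical rows; the path format is an assumption of this sketch.
INSERT INTO #CompanyHierarchy (OrganizationId, Name, Path)
VALUES (1, 'Company',  '\1\'),
       (2, 'HQ East',  '\1\2\'),
       (3, 'HQ West',  '\1\3\'),
       (4, 'Branch A', '\1\2\4\'),
       (5, 'Branch B', '\1\2\5\');

-- Descendants of node 2: every path that starts with node 2's path.
-- The trailing delimiter keeps '\1\2\' from matching '\1\20\'.
DECLARE @ParentPath varchar(900) =
    (SELECT Path FROM #CompanyHierarchy WHERE OrganizationId = 2);

SELECT OrganizationId, Name, Path
FROM   #CompanyHierarchy
WHERE  Path LIKE @ParentPath + '%'
  AND  OrganizationId <> 2;

-- Reparenting node 2 under node 3: swap the old prefix for the new one
-- on the node and on every row in its subtree.
DECLARE @OldPrefix varchar(900) = '\1\2\',
        @NewPrefix varchar(900) = '\1\3\2\';

UPDATE #CompanyHierarchy
SET    Path = STUFF(Path, 1, LEN(@OldPrefix), @NewPrefix)
WHERE  Path LIKE @OldPrefix + '%';

DROP TABLE #CompanyHierarchy;
```

Because the LIKE pattern has no leading wildcard, an index on the path column can support the descendant search. The cost shows up on a move, which rewrites one row per node in the subtree.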
Implementing a tree – HierarchyId
A somewhat unnatural method for the typical SQL programmer
Similar to the Path Method, and has some of the same limitations when moving nodes around
The node path does not use data natural to the table, but rather positional location
CREATE TABLE CompanyHierarchy
(
    OrganizationId int          NOT NULL PRIMARY KEY,
    Name           varchar(100) NOT NULL,
    OrgNode        hierarchyid  NOT NULL
)
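A minimal sketch of the built-in hierarchyid methods applied to a table of this shape: IsDescendantOf for reads, GetDescendant for inserts, and GetReparentedValue for moves. The sample rows and node positions are hypothetical, and this is not the presenter's demo code.

```sql
CREATE TABLE #CompanyHierarchy
(
    OrganizationId int          NOT NULL PRIMARY KEY,
    Name           varchar(100) NOT NULL,
    OrgNode        hierarchyid  NOT NULL
);

-- Hypothetical rows; positions are written in the string form '/1/2/'.
INSERT INTO #CompanyHierarchy (OrganizationId, Name, OrgNode)
VALUES (1, 'Company',  hierarchyid::GetRoot()),
       (2, 'HQ East',  hierarchyid::Parse('/1/')),
       (3, 'HQ West',  hierarchyid::Parse('/2/')),
       (4, 'Branch A', hierarchyid::Parse('/1/1/')),
       (5, 'Branch B', hierarchyid::Parse('/1/2/'));

DECLARE @Parent hierarchyid =
    (SELECT OrgNode FROM #CompanyHierarchy WHERE OrganizationId = 2);

-- Subtree of node 2. IsDescendantOf treats a node as its own descendant,
-- so node 2 itself is returned as well.
SELECT OrganizationId, Name,
       OrgNode.ToString() AS NodePath,
       OrgNode.GetLevel() AS TreeLevel
FROM   #CompanyHierarchy
WHERE  OrgNode.IsDescendantOf(@Parent) = 1;

-- New child of node 2, positioned after its current last child.
DECLARE @LastChild hierarchyid =
    (SELECT MAX(OrgNode) FROM #CompanyHierarchy
     WHERE  OrgNode.GetAncestor(1) = @Parent);

INSERT INTO #CompanyHierarchy (OrganizationId, Name, OrgNode)
VALUES (6, 'Branch C', @Parent.GetDescendant(@LastChild, NULL));

-- Move node 2 and its subtree under node 3. As with the path method,
-- every row in the subtree gets a new value.
DECLARE @NewParent hierarchyid =
    (SELECT OrgNode FROM #CompanyHierarchy WHERE OrganizationId = 3);
DECLARE @NewParentLastChild hierarchyid =
    (SELECT MAX(OrgNode) FROM #CompanyHierarchy
     WHERE  OrgNode.GetAncestor(1) = @NewParent);
DECLARE @NewRoot hierarchyid =
    @NewParent.GetDescendant(@NewParentLastChild, NULL);

UPDATE #CompanyHierarchy
SET    OrgNode = OrgNode.GetReparentedValue(@Parent, @NewRoot)
WHERE  OrgNode.IsDescendantOf(@Parent) = 1;

DROP TABLE #CompanyHierarchy;
```

The hierarchyid method names are case-sensitive, which is part of what makes the type feel unnatural next to ordinary T-SQL.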
Implementing a tree – Nested Sets
Query processing is done using range queries
The structure is fragile, which makes it quite slow to maintain
Can produce excellent performance for queries
CREATE TABLE CompanyHierarchy
(
    Organization varchar(100) NOT NULL PRIMARY KEY,
    Name         varchar(100) NOT NULL,
    [Left]       int          NOT NULL,  -- LEFT and RIGHT are reserved words in T-SQL, so they are bracketed
    [Right]      int          NOT NULL
)
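A minimal sketch of the range queries that nested sets make possible. Each node's left/right pair encloses the pairs of all its descendants. The numbering below is a hypothetical example, and the column names are bracketed because LEFT and RIGHT are reserved words in T-SQL.

```sql
CREATE TABLE #CompanyHierarchy
(
    Organization varchar(100) NOT NULL PRIMARY KEY,
    Name         varchar(100) NOT NULL,
    [Left]       int          NOT NULL,
    [Right]      int          NOT NULL
);

-- Hypothetical tree, numbered by a depth-first walk:
--   Company (1,10)
--     HQ-East (2,7)
--       Branch-A (3,4)
--       Branch-B (5,6)
--     HQ-West (8,9)
INSERT INTO #CompanyHierarchy (Organization, Name, [Left], [Right])
VALUES ('Company',  'Company',              1, 10),
       ('HQ-East',  'Eastern Headquarters', 2, 7),
       ('Branch-A', 'Branch A',             3, 4),
       ('Branch-B', 'Branch B',             5, 6),
       ('HQ-West',  'Western Headquarters', 8, 9);

-- Descendants: rows whose [Left] falls strictly inside the parent's range.
SELECT child.Organization, child.Name
FROM   #CompanyHierarchy AS parent
       JOIN #CompanyHierarchy AS child
           ON  child.[Left] > parent.[Left]
           AND child.[Left] < parent.[Right]
WHERE  parent.Organization = 'HQ-East';

-- Ancestors: the same comparison with the roles reversed.
SELECT ancestor.Organization, ancestor.Name
FROM   #CompanyHierarchy AS node
       JOIN #CompanyHierarchy AS ancestor
           ON  node.[Left] > ancestor.[Left]
           AND node.[Left] < ancestor.[Right]
WHERE  node.Organization = 'Branch-B';

-- Descendant count needs no join at all.
SELECT Organization, ([Right] - [Left] - 1) / 2 AS DescendantCount
FROM   #CompanyHierarchy;

DROP TABLE #CompanyHierarchy;
```

The fragility is in the numbering: inserting a node under HQ-East would require shifting the left/right values of every row numbered after the insertion point.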
Implementing a tree – Kimball Helper
• Developed initially for data warehousing since data is modified all at once with a fixed cost
• Basically explodes the hierarchy into a table that turns all hierarchy manipulations into a relational query
• Maintenance can be slightly costly, but using the data is extremely fast
• For the highlighted rows (yellow in the slide graphic), the hierarchy expands to the table shown:

ParentId  ChildId  Distance  ParentRootNode  ChildLeafNode
1         1        0         1               0
1         2        1         1               0
1         4        2         1               1
1         5        2         1               1
2         2        0         0               0
2         4        1         0               1
2         5        1         0               1
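One way to build such a helper table is to expand an adjacency list with a recursive common table expression. This is a sketch under stated assumptions, not the presenter's demo code: the #Tree, #TreeHelper, and #Sales tables are hypothetical, and the four-node tree (1 is the root, 2 is its child, 4 and 5 are children of 2) is inferred from the rows shown above.

```sql
-- Hypothetical source tree as an adjacency list.
CREATE TABLE #Tree
(
    NodeId   int NOT NULL PRIMARY KEY,
    ParentId int NULL
);

INSERT INTO #Tree (NodeId, ParentId)
VALUES (1, NULL), (2, 1), (4, 2), (5, 2);

-- Pair every node with itself (distance 0), then extend each pair
-- downward one level at a time.
WITH Expanded AS
(
    SELECT NodeId AS ParentId, NodeId AS ChildId, 0 AS Distance
    FROM   #Tree
    UNION ALL
    SELECT e.ParentId, t.NodeId, e.Distance + 1
    FROM   Expanded AS e
           JOIN #Tree AS t
               ON t.ParentId = e.ChildId
)
SELECT e.ParentId, e.ChildId, e.Distance,
       CASE WHEN p.ParentId IS NULL THEN 1 ELSE 0 END AS ParentRootNode,
       CASE WHEN NOT EXISTS (SELECT * FROM #Tree AS k
                             WHERE k.ParentId = e.ChildId)
            THEN 1 ELSE 0 END AS ChildLeafNode
INTO   #TreeHelper
FROM   Expanded AS e
       JOIN #Tree AS p
           ON p.NodeId = e.ParentId;

-- Hypothetical measure attached to individual nodes.
CREATE TABLE #Sales (NodeId int NOT NULL, Amount int NOT NULL);
INSERT INTO #Sales (NodeId, Amount)
VALUES (2, 10), (4, 20), (5, 30);

-- Aggregating along the tree is now an ordinary join and GROUP BY:
-- each node's total includes its own rows and those of its descendants.
SELECT h.ParentId, SUM(s.Amount) AS TotalAmount
FROM   #TreeHelper AS h
       JOIN #Sales AS s
           ON s.NodeId = h.ChildId
GROUP BY h.ParentId;

DROP TABLE #Sales;
DROP TABLE #TreeHelper;
DROP TABLE #Tree;
```

For this tree the helper contains the seven rows in the table above plus the self-rows for nodes 4 and 5. The maintenance cost is that any change to the tree means regenerating the affected helper rows.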
Demo Setup
• For each style of hierarchy, we will see how to:
  • Implement a physical model that models the corporate hierarchy of the previous graphics
  • Create stored procedures for inserting, deleting, and reparenting a node
  • Write queries to access and aggregate the data in the hierarchy
• We will do this for two sets of data: the data in the presentation, and then a randomly generated set.
Demo Code
• Example code available in download
Did I change any of your minds?
Graphs
• Generally implemented in the same manner as an adjacency list
  • Can be processed in the same manner as an adjacency list
• Primary difference is that a child can have more than one parent node
• Cycles are generally acceptable
• Graph structure will always be external to data structure
• Graphs are even more natural data structures than trees
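Because a child can have several parents, the parent reference moves out of the node row into a separate edge table, and traversal needs a guard against cycles. The following is a minimal sketch; the #Edge table, its rows, and the visited-path technique are illustrative assumptions, not the presenter's demo code.

```sql
-- Hypothetical edge table: one row per parent/child link.
CREATE TABLE #Edge
(
    ParentId int NOT NULL,
    ChildId  int NOT NULL,
    PRIMARY KEY (ParentId, ChildId)
);

-- Node 4 has two parents (2 and 3), and the edge 4 -> 1 closes a cycle.
INSERT INTO #Edge (ParentId, ChildId)
VALUES (1, 2), (1, 3), (2, 4), (3, 4), (4, 1);

DECLARE @Start int = 1;

-- Each row carries the list of nodes already visited on its path.
-- The recursive member refuses to step onto a node that is in the list,
-- so the cycle cannot produce an infinite loop.
WITH Reach AS
(
    SELECT ChildId AS NodeId,
           CAST('/' + CAST(@Start AS varchar(10)) + '/'
                    + CAST(ChildId AS varchar(10)) + '/' AS varchar(900)) AS Visited
    FROM   #Edge
    WHERE  ParentId = @Start
    UNION ALL
    SELECT e.ChildId,
           CAST(r.Visited + CAST(e.ChildId AS varchar(10)) + '/' AS varchar(900))
    FROM   Reach AS r
           JOIN #Edge AS e
               ON e.ParentId = r.NodeId
    WHERE  r.Visited NOT LIKE '%/' + CAST(e.ChildId AS varchar(10)) + '/%'
)
SELECT DISTINCT NodeId
FROM   Reach
ORDER BY NodeId;

DROP TABLE #Edge;
```

Node 4 is reached along two paths, which is why the final SELECT uses DISTINCT. Without the visited-path filter, the 4 -> 1 edge would keep the recursion going until it hit the server's recursion limit.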
Graphs are Everywhere
• Almost any many-to-many relationship can be a graph
(Diagram: Movie, Actor, and Director entities, with ActingCast and MovieDirectory relating them)
Graph Demo
(Diagram: tables Person, Interest, PersonInterest, and PersonConnection)
Contact info
• Louis Davidson – drsql@hotmail.com
• Website – http://drsql.org <-- Get slides here
• Twitter – http://twitter.com/drsql
• SQL Blog – http://sqlblog.com/blogs/louis_davidson
• Simple Talk Blog (What Counts for a DBA) – http://www.simple-talk.com/community/blogs/drsql/default.aspx