Complex queries in a distributed multi-model database
![Page 1: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/1.jpg)
Complex queries in a distributed multi-model database
Max Neunhöffer
Tech Talk Geekdom, 16 March 2015
www.arangodb.com
![Page 2: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/2.jpg)
Documents and collections
{
  "_key": "123456",
  "_id": "chars/123456",
  "name": "Duck",
  "firstname": "Donald",
  "dob": "1934-11-13",
  "hobbies": ["Golf",
              "Singing",
              "Running"],
  "home":
    {"town": "Duck town",
     "street": "Lake Road",
     "number": 17},
  "species": "duck"
}
When I say “document”, I mean “JSON”.
A “collection” is a set of documents in a DB.
The DB can inspect the values, allowing for secondary indexes.
Or one can just treat the DB as a key/value store.
Sharding: the data of a collection is distributed between multiple servers.
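The last three statements can be sketched in a few lines of Python. This is a toy model, not ArangoDB's implementation: the shard count, the CRC32-based placement, the helper names `shard_for` and `build_index`, and the two extra sample documents are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 3  # hypothetical cluster size


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a document key to a shard number with a stable hash (illustrative only)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def build_index(docs, attribute):
    """Secondary index: attribute value -> list of document keys."""
    index = defaultdict(list)
    for doc in docs:
        if attribute in doc:
            index[doc[attribute]].append(doc["_key"])
    return index


# Made-up sample documents in the style of the slide.
docs = [
    {"_key": "123456", "name": "Duck", "firstname": "Donald", "species": "duck"},
    {"_key": "123457", "name": "Duck", "firstname": "Daisy", "species": "duck"},
    {"_key": "123458", "name": "Mouse", "firstname": "Mickey", "species": "mouse"},
]

# Key/value view: the store only ever looks at _key.
by_key = {doc["_key"]: doc for doc in docs}

# Sharded view: every document lives on exactly one shard.
shards = defaultdict(dict)
for doc in docs:
    shards[shard_for(doc["_key"])][doc["_key"]] = doc

# Document view: the store inspects values and can index them.
species_index = build_index(docs, "species")
```

Looking up `species_index["duck"]` answers a query by value without scanning every document, which is what the key/value view alone cannot do.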
![Page 8: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/8.jpg)
Graphs
[Figure: a graph with vertices A, B, C, D, E, F, G and edges labelled "likes" and "hates"]
A graph consists of vertices and edges.
Graphs model relations, can be directed or undirected.
Vertices and edges are documents.
Every edge has a _from and a _to attribute.
The database offers queries and transactions dealing with graphs.
For example, paths in the graph are interesting.
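As a rough illustration of "edges are documents with _from and _to", the following Python sketch stores edges as plain dictionaries and enumerates paths. The edge layout is invented (the slide's picture is not recoverable from the transcript), and `all_paths` is a hypothetical helper, not a database function.

```python
from collections import defaultdict

# Edges are ordinary documents; only _from and _to are mandatory.
edges = [
    {"_from": "v/A", "_to": "v/B", "type": "likes"},
    {"_from": "v/B", "_to": "v/C", "type": "likes"},
    {"_from": "v/A", "_to": "v/D", "type": "hates"},
    {"_from": "v/D", "_to": "v/C", "type": "likes"},
]


def all_paths(edges, start, goal):
    """Enumerate all simple directed paths from start to goal (depth-first)."""
    outgoing = defaultdict(list)
    for edge in edges:
        outgoing[edge["_from"]].append(edge["_to"])

    paths = []

    def walk(vertex, path):
        if vertex == goal:
            paths.append(path)
            return
        for nxt in outgoing[vertex]:
            if nxt not in path:          # never visit a vertex twice
                walk(nxt, path + [nxt])

    walk(start, [start])
    return paths


paths_a_to_c = all_paths(edges, "v/A", "v/C")
```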
![Page 15: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/15.jpg)
Query 1
Fetch all documents in a collection
FOR p IN people
  RETURN p
[ { "name": "Schmidt", "firstname": "Helmut",
    "hobbies": ["Smoking"]},
  { "name": "Neunhöffer", "firstname": "Max",
    "hobbies": ["Piano", "Golf"]},
  ...
]
(Actually, a cursor is returned.)
![Page 18: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/18.jpg)
Query 2
Use filtering, sorting and limit
FOR p IN people
  FILTER p.age >= @minage
  SORT p.name, p.firstname
  LIMIT @nrlimit
  RETURN { name: CONCAT(p.name, ", ",
                        p.firstname),
           age : p.age }
[ { "name": "Neunhöffer, Max", "age": 44 },
  { "name": "Schmidt, Helmut", "age": 95 },
  ...
]
![Page 20: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/20.jpg)
Query 3
Aggregation and functions
FOR p IN people
  COLLECT a = p.age INTO L
  FILTER a >= @minage
  RETURN { "age": a, "number": LENGTH(L) }
[ { "age": 18, "number": 10 },
  { "age": 19, "number": 17 },
  { "age": 20, "number": 12 },
  ...
]
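Read in Python terms, Query 3 groups first and filters the groups afterwards. The sketch below mirrors that order; `count_by_age` and the sample data are made up, and sorting the groups is a choice made here to keep the output deterministic.

```python
from collections import defaultdict


def count_by_age(people, minage):
    """Group people by age, then keep only groups with age >= minage."""
    groups = defaultdict(list)              # COLLECT a = p.age INTO L
    for p in people:
        groups[p["age"]].append(p)
    return [
        {"age": a, "number": len(members)}  # RETURN { age, number }
        for a, members in sorted(groups.items())
        if a >= minage                      # FILTER a >= @minage
    ]


people = [{"age": 18}, {"age": 18}, {"age": 17}, {"age": 20}]
age_counts = count_by_age(people, 18)
```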
![Page 22: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/22.jpg)
Query 4
Joins
FOR p IN @@peoplecollection
  FOR h IN houses
    FILTER p._key == h.owner
    SORT h.streetname, h.housename
    RETURN { housename: h.housename,
             streetname: h.streetname,
             owner: p.name,
             value: h.value }
[ { "housename": "Firlefanz",
    "streetname": "Meyer street",
    "owner": "Hans Schmidt", "value": 423000
  },
  ...
]
![Page 24: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/24.jpg)
Query 5
Modifying data
FOR e IN events
  FILTER e.timestamp < "2014-09-01T09:53+0200"
  INSERT e IN oldevents
FOR e IN events
  FILTER e.timestamp < "2014-09-01T09:53+0200"
  REMOVE e._key IN events
![Page 25: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/25.jpg)
Query 6
Graph queries
FOR x IN GRAPH_SHORTEST_PATH(
    "routeplanner", "germanCity/Cologne",
    "frenchCity/Paris", {weight: "distance"} )
  RETURN { begin    : x.startVertex,
           end      : x.vertex,
           distance : x.distance,
           nrPaths  : LENGTH(x.paths) }
[ { "begin": "germanCity/Cologne",
    "end"  : {"_id": "frenchCity/Paris", ... },
    "distance": 550,
    "nrPaths": 10 },
  ...
]
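What a weighted shortest-path query has to compute can be sketched with Dijkstra's algorithm. This is a generic textbook version, not the database's implementation: the tiny graph and its distances are invented, and treating every edge as usable in both directions is an assumption.

```python
import heapq


def shortest_path(edges, start, goal, weight="distance"):
    """Dijkstra over weighted edges; returns (total, path) or None if unreachable."""
    neighbours = {}
    for e in edges:
        # Assumption: edges may be followed in either direction.
        neighbours.setdefault(e["_from"], []).append((e["_to"], e[weight]))
        neighbours.setdefault(e["_to"], []).append((e["_from"], e[weight]))

    queue = [(0, start, [start])]   # (distance so far, vertex, path taken)
    done = set()
    while queue:
        total, vertex, path = heapq.heappop(queue)
        if vertex == goal:
            return total, path
        if vertex in done:
            continue
        done.add(vertex)
        for nxt, w in neighbours.get(vertex, []):
            if nxt not in done:
                heapq.heappush(queue, (total + w, nxt, path + [nxt]))
    return None


# Made-up example graph with made-up distances.
edges = [
    {"_from": "city/A", "_to": "city/B", "distance": 4},
    {"_from": "city/B", "_to": "city/C", "distance": 3},
    {"_from": "city/A", "_to": "city/C", "distance": 9},
]
best = shortest_path(edges, "city/A", "city/C")
```

Here the detour via `city/B` (4 + 3) beats the direct edge (9), which is exactly the case where the `weight` option changes the answer.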
![Page 27: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/27.jpg)
Life of a query
1. Text and query parameters come from user
2. Parse text, produce abstract syntax tree (AST)
3. Substitute query parameters
4. First optimisation: constant expressions, etc.
5. Translate AST into an execution plan (EXP)
6. Optimise one EXP, produce many, potentially better EXPs
7. Reason about distribution in cluster
8. Optimise distributed EXPs
9. Estimate costs for all EXPs, and sort by ascending cost
10. Instantiate “cheapest” plan, i.e. set up execution engine
11. Distribute and link up engines on different servers
12. Execute plan, provide cursor API
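Steps 6, 9 and 10 can be pictured as "apply rules, collect every resulting plan, pick the cheapest". The Python sketch below is only a schematic of that idea: the `Plan` class, the `use_index` rule, the node name `IndexRange` and the cost numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    nodes: tuple           # node names in pipeline order
    estimated_cost: float  # made-up cost figure


def optimise(initial, rules):
    """Apply every rule to every known plan and keep all distinct results."""
    plans = [initial]
    for rule in rules:
        for plan in list(plans):
            candidate = rule(plan)
            if candidate is not None and candidate not in plans:
                plans.append(candidate)
    return plans


def cheapest(plans):
    """Sort by ascending estimated cost and take the first plan."""
    return sorted(plans, key=lambda p: p.estimated_cost)[0]


def use_index(plan):
    """Hypothetical rule: replace a full scan plus filter by an index lookup."""
    if "EnumerateCollection" in plan.nodes and "Filter" in plan.nodes:
        nodes = tuple(
            "IndexRange" if n == "EnumerateCollection" else n
            for n in plan.nodes
            if n != "Filter"
        )
        return Plan(nodes, plan.estimated_cost / 10)
    return None


initial = Plan(("Singleton", "EnumerateCollection", "Filter", "Return"), 1000.0)
plans = optimise(initial, [use_index])
best_plan = cheapest(plans)
```

Keeping the original plan in the list matters: a rule may produce a plan that turns out to be more expensive, and the cost estimate decides in the end.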
![Page 39: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/39.jpg)
Execution plans
Query → EXP
Query:
FOR a IN collA
  LET xx = a.x
  FOR b IN collB
    FILTER xx == b.y
    RETURN {x: a.x, z: b.z}
Execution plan (each node depends on the node listed before it):
Singleton
EnumerateCollection a
Calculation xx
EnumerateCollection b
Calculation xx == b.y
Filter xx == b.y
Calc {x: a.x, z: b.z}
Return {x: a.x, z: b.z}
Black arrows are dependencies
Think of a pipeline
Each node provides a cursor API
Blocks of “Items” travel through the pipeline
What is an “item”???
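The pipeline idea maps naturally onto Python generators: each node pulls from its dependency and yields items onward. The sketch below runs the slide's query on made-up collections. It is a simplification under stated assumptions: items travel one at a time rather than in blocks, and the node functions and the variable names `cond` and `out` are invented for the example.

```python
def singleton():
    yield {}                                    # exactly one item, no variables


def enumerate_collection(dep, var, collection):
    for item in dep:                            # pull from the dependency
        for doc in collection:
            yield {**item, var: doc}            # item gains one variable


def calculation(dep, var, fn):
    for item in dep:
        yield {**item, var: fn(item)}


def filter_node(dep, var):
    for item in dep:
        if item[var]:
            yield item


def return_node(dep, var):
    for item in dep:
        yield item[var]


# Made-up sample collections.
collA = [{"x": 1}, {"x": 2}]
collB = [{"y": 1, "z": "one"}, {"y": 2, "z": "two"}, {"y": 2, "z": "zwei"}]

plan = singleton()
plan = enumerate_collection(plan, "a", collA)                    # FOR a IN collA
plan = calculation(plan, "xx", lambda i: i["a"]["x"])            # LET xx = a.x
plan = enumerate_collection(plan, "b", collB)                    # FOR b IN collB
plan = calculation(plan, "cond", lambda i: i["xx"] == i["b"]["y"])
plan = filter_node(plan, "cond")                                 # FILTER xx == b.y
plan = calculation(plan, "out", lambda i: {"x": i["a"]["x"], "z": i["b"]["z"]})
plan = return_node(plan, "out")                                  # RETURN {...}

result = list(plan)
```

Nothing runs until `list(plan)` pulls on the last node, which is the "cursor API" behaviour: the consumer drives the pipeline.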
![Page 45: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/45.jpg)
Pipeline and items
Query (first lines):
FOR a IN collA
  LET xx = a.x
  FOR b IN collB
Plan:
Singleton (items have no vars)
EnumerateCollection a
Calculation xx (items have vars a, xx)
EnumerateCollection b
Items are the thingies traveling through the pipeline.
An item holds values of those variables in the current frame.
Thus: Items look different in different parts of the plan.
We always deal with blocks of items for performance reasons.
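A small Python sketch of these two points: items are modelled as dictionaries whose keys are the variables available at that stage, and a hypothetical `blocks` helper groups them into fixed-size batches. The block size and the sample values are arbitrary choices for illustration.

```python
from itertools import islice


def blocks(items, size):
    """Group an iterable of items into lists of at most `size` items."""
    it = iter(items)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


# After Singleton: one item without variables.
after_singleton = [{}]

# After "Calculation xx": every item carries the variables a and xx.
after_calculation = [{"a": {"x": n}, "xx": n} for n in range(5)]

block_sizes = [len(block) for block in blocks(after_calculation, 2)]
```

Passing lists of items between nodes instead of single items reduces the per-item call overhead, which is the performance reason named on the slide.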
![Page 50: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/50.jpg)
Move filters up
FOR a IN collA
  FOR b IN collB
    FILTER a.x == 10
    FILTER a.u == b.v
    RETURN {u:a.u,w:b.w}
The result and behaviour do not change if the first FILTER is pulled out of the inner FOR.
However, the number of items travelling in the pipeline is decreased.
Note that the two FOR statements could be interchanged!
Plan:
Singleton
EnumColl a
EnumColl b
Calc a.x == 10
Filter a.x == 10
Calc a.u == b.v
Filter a.u == b.v
Return {u:a.u,w:b.w}
![Page 52: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/52.jpg)
Move filters up
After pulling the first FILTER out of the inner FOR:
FOR a IN collA
  FILTER a.x == 10
  FOR b IN collB
    FILTER a.u == b.v
    RETURN {u:a.u,w:b.w}
Plan:
Singleton
EnumColl a
Calc a.x == 10
Filter a.x == 10
EnumColl b
Calc a.u == b.v
Filter a.u == b.v
Return {u:a.u,w:b.w}
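The effect of moving the filter can be checked by counting how many (a, b) combinations the inner loop produces in each variant. The Python sketch below uses made-up data, and the function names `late_filter` and `early_filter` are invented for the comparison.

```python
def late_filter(collA, collB):
    """Filter on a.x inside the inner loop: every (a, b) pair is produced first."""
    produced, out = 0, []
    for a in collA:
        for b in collB:
            produced += 1
            if a["x"] == 10 and a["u"] == b["v"]:
                out.append({"u": a["u"], "w": b["w"]})
    return produced, out


def early_filter(collA, collB):
    """Filter on a.x before the inner loop: rejected a's never reach collB."""
    produced, out = 0, []
    for a in collA:
        if a["x"] != 10:
            continue
        for b in collB:
            produced += 1
            if a["u"] == b["v"]:
                out.append({"u": a["u"], "w": b["w"]})
    return produced, out


# Made-up sample data: only the first document of collA has x == 10.
collA = [{"x": 10, "u": 1}, {"x": 11, "u": 2}, {"x": 12, "u": 1}]
collB = [{"v": 1, "w": "p"}, {"v": 2, "w": "q"}]

late_count, late_result = late_filter(collA, collB)
early_count, early_result = early_filter(collA, collB)
```

Both variants return the same result, but the early filter produces 2 combinations instead of 6 on this data.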
![Page 54: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/54.jpg)
Remove unnecessary calculations
FOR a IN collA
  LET L = LENGTH(a.hobbies)
  FOR b IN collB
    FILTER a.u == b.v
    RETURN {h:a.hobbies,w:b.w}
The calculation of L is unnecessary: its result is never used.
Since it cannot throw an exception, we can just leave it out.
Plan:
Singleton
EnumColl a
Calc L = ...
EnumColl b
Calc a.u == b.v
Filter a.u == b.v
Return {...}
![Page 56: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/56.jpg)
Remove unnecessary calculations
FOR a IN collA
  FOR b IN collB
    FILTER a.u == b.v
    RETURN {h:a.hobbies,w:b.w}
With the calculation of L left out, the plan is one node shorter:
Singleton
EnumColl a
EnumColl b
Calc a.u == b.v
Filter a.u == b.v
Return {...}
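The rule can be sketched in a few lines of Python. This is only an illustration of the idea, not the optimizer's real data structures: the `Node` class, its fields and the variable name `cond` for the filter condition are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    kind: str                # e.g. "EnumColl", "Calc", "Filter", "Return"
    defines: str = ""        # variable introduced by this node, if any
    uses: tuple = ()         # variables read by this node
    can_throw: bool = False  # whether evaluating the node may raise an error


def remove_unused_calculations(plan):
    """Drop Calc nodes whose result is never read and which cannot throw."""
    kept = []
    needed = set()
    # Walk backwards so we already know what the later nodes read.
    for node in reversed(plan):
        unused = node.kind == "Calc" and node.defines not in needed
        if unused and not node.can_throw:
            continue
        needed.update(node.uses)
        kept.append(node)
    kept.reverse()
    return kept


plan = [
    Node("Singleton"),
    Node("EnumColl", defines="a"),
    Node("Calc", defines="L", uses=("a",)),         # LET L = LENGTH(a.hobbies)
    Node("EnumColl", defines="b"),
    Node("Calc", defines="cond", uses=("a", "b")),  # a.u == b.v
    Node("Filter", uses=("cond",)),
    Node("Return", uses=("a", "b")),
]

optimised = remove_unused_calculations(plan)
optimised_kinds = [node.kind for node in optimised]
```

The calculation of `cond` survives because the Filter reads it, while the calculation of `L` disappears. A calculation flagged with `can_throw=True` would be kept even if unused, since dropping it could change the behaviour of the query.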
Use index for FILTER and SORT
FOR a IN collA
  FILTER a.x > 17 &&
         a.x <= 23 &&
         a.y == 10
  SORT a.y, a.x
  RETURN a
Assume collA has a skiplist index on “y” and “x” (in this order). Then we can read off the half-open interval between { y: 10, x: 17 } and { y: 10, x: 23 } (lower bound excluded, upper bound included) from the skiplist index.
The result will automatically be sorted by y and then by x.
Singleton
EnumColl a
Calc ...
Filter ...
Sort a.y, a.x
Return a
![Page 60: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/60.jpg)
Use index for FILTER and SORT
Same query as before. The index range scan replaces EnumColl, Calc and Filter:
Singleton
IndexRange a
Sort a.y, a.x
Return a
![Page 61: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/61.jpg)
Use index for FILTER and SORT
Since the index already delivers the items sorted by y and then by x, the Sort node can be removed as well:
Singleton
IndexRange a
Return a
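The following Python sketch mimics the range lookup with a sorted list of (y, x) pairs and the standard `bisect` module. The sorted list is only a stand-in for a skiplist index (an assumption for illustration), and the sample documents and the helper name `index_range` are invented.

```python
import bisect

# Hypothetical documents; only the indexed attributes matter here.
docs = [{"y": y, "x": x} for y in (9, 10, 11) for x in (15, 17, 20, 23, 25)]

# Stand-in for a skiplist index on "y" and "x" (in this order).
index = sorted((d["y"], d["x"]) for d in docs)


def index_range(index, y, x_low_exclusive, x_high_inclusive):
    """Return all (y, x) entries with x_low_exclusive < x <= x_high_inclusive."""
    # bisect_right skips entries equal to the bound, so the lower bound is
    # excluded and the upper bound is included: a half-open interval.
    lo = bisect.bisect_right(index, (y, x_low_exclusive))
    hi = bisect.bisect_right(index, (y, x_high_inclusive))
    return index[lo:hi]


hits = index_range(index, 10, 17, 23)

# The same result computed the slow way: full scan, filter, then sort.
scanned = sorted(
    (d["y"], d["x"])
    for d in docs
    if d["x"] > 17 and d["x"] <= 23 and d["y"] == 10
)
```

Here `hits` comes straight out of the index in sorted order, which is why neither the Filter nor the Sort node is needed any more.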
![Page 62: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/62.jpg)
Data distribution in a cluster
[Diagram: requests arrive at two coordinators, which talk to three DBservers holding the numbered shards]
The shards of a collection are distributed across the DBservers.
The coordinators receive queries and organise their execution.
![Page 64: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/64.jpg)
Scatter/gather
EnumerateCollection
![Page 65: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/65.jpg)
Scatter/gather
[Diagram: the single EnumerateCollection node becomes a Scatter node, followed by three parallel branches of Remote, EnumShard, Remote, followed by a Concat/Merge node that gathers the results]
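A rough Python model of this shape is shown below. It assumes that "Concat" means appending the shard results in arbitrary order and "Merge" means merging shard results that are already sorted locally; that reading of the Concat/Merge label, the shard contents and all function names are assumptions for illustration. The Remote nodes (network hops) are not modelled.

```python
import heapq

# Hypothetical contents of three shards, one per EnumShard node.
shards = {
    "s1": [{"x": 5}, {"x": 1}],
    "s2": [{"x": 4}, {"x": 2}],
    "s3": [{"x": 3}],
}


def by_x(doc):
    return doc["x"]


def enum_shard(docs, sort_key=None):
    """Runs on a DBserver: enumerate one shard, optionally sorted locally."""
    if sort_key is None:
        return list(docs)
    return sorted(docs, key=sort_key)


def scatter_gather(shards, sort_key=None):
    """Runs on the coordinator: ask every shard, then combine the answers."""
    streams = [enum_shard(docs, sort_key) for docs in shards.values()]
    if sort_key is None:
        # Concat: no order required, just append the streams.
        return [doc for stream in streams for doc in stream]
    # Merge: each stream is sorted, so a k-way merge keeps the global order.
    return list(heapq.merge(*streams, key=sort_key))


concatenated = scatter_gather(shards)
merged = scatter_gather(shards, sort_key=by_x)
```

The merge variant never has to sort the full result on the coordinator; it only interleaves streams that were sorted on the DBservers.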
![Page 67: Complex queries in a distributed multi-model database](https://reader033.fdocuments.net/reader033/viewer/2022051414/55a9a67f1a28abc2518b4897/html5/thumbnails/67.jpg)
Modifying queries
Fortunately:
There can be at most one modifying node in each query.
There can be no modifying nodes in subqueries.
Modifying nodes
The modifying node in a query is executed on the DBservers. To this end, we either scatter the items to all DBservers or, if possible, distribute each item to the shard that is responsible for the modification.
Sometimes we can even optimise away a gather/scatter combination and parallelise completely.
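The difference between scattering and distributing can be sketched as follows. The sketch assumes that the responsible shard is found by hashing the document key; the use of `zlib.crc32`, the shard count and the function names are illustrative assumptions and not the database's actual hash function or API.

```python
import zlib


def responsible_shard(key, num_shards):
    """Map a document key to a shard number with a stable hash (illustrative)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def scatter(items, num_shards):
    """Send every item to every shard; each shard picks what concerns it."""
    return {shard: list(items) for shard in range(num_shards)}


def distribute(items, num_shards):
    """Send each item only to the shard responsible for its key."""
    buckets = {shard: [] for shard in range(num_shards)}
    for item in items:
        buckets[responsible_shard(item["_key"], num_shards)].append(item)
    return buckets


# Hypothetical items to be modified.
items = [{"_key": str(k), "name": f"doc{k}"} for k in range(10)]
num_shards = 3

scattered = scatter(items, num_shards)
distributed = distribute(items, num_shards)

sent_by_scatter = sum(len(bucket) for bucket in scattered.values())
sent_by_distribute = sum(len(bucket) for bucket in distributed.values())
```

With scattering, every item crosses the network once per shard; with distributing, each item is sent exactly once, which is why distributing is preferred whenever the responsible shard can be determined up front.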