Implementation of Minimum Edge Constraints Sets for Proximity Graphs
by
Edward Duong
A thesis submitted to the Faculty of Graduate and Postdoctoral Affairs
in partial fulfillment of the requirements for the degree of
Master of Computer Science
Carleton University
Ottawa, Ontario
© 2018 Edward Duong
Abstract
For certain edge-constrained proximity algorithms, not every edge of the resulting
graph needs to be explicitly stored. This has implications for graph compression. We
examine the application and runtime performance of minimum edge-constrained
algorithms for three proximity graph types: Delaunay triangulation, Gabriel graph
and minimum spanning tree.
Implementation details of these algorithms are given, and their performance on both
large real-world datasets and randomized datasets is evaluated. In addition, their
compression metrics, that is, the number of edges removed from the constraint
edge set, are investigated.
Acknowledgments
Firstly, I would like to thank my friends, colleagues and family for all their support.
For the past few years, I have been pursuing a part-time Master’s degree while
working full-time. I realize now that this involves many sacrifices. I would like to
thank those who have supported me throughout this journey - your patience and
thoughtfulness have not gone unnoticed.
From the beginning of my undergraduate studies at Carleton University up until my last
graduate course, I’d like to extend my gratitude to all the professors who have
passionately imparted their knowledge. They have shaped my view of computer
science for years to come. It is their passion to teach and their love for learning that
inspires me to continue down the path of graduate studies.
I would like to extend my appreciation to the students, graduates and researchers
who have shared resources to help make this thesis possible. They have given
invaluable suggestions and recommendations, and provided extensive free-to-use
geometry platforms. I would like to thank Gregory Blint for his LaTeX template (which he
has shared over GitHub). I would like to thank the programmers and researchers
of CGAL and Boost, who have spent countless hours crafting wonderful libraries for
the public.
Last, but not least, I would like to thank Professor Smid for the many hours he
provided in the form of supervision. With great detail, he explained the concepts
of the research that forms the basis for this thesis. His experience with computational
geometry software was paramount in suggesting a platform that met our
goals. Without his expertise and guidance, this work would not have been possible.
Contents
Abstract
Acknowledgments
1 Introduction
1.1 Motivation
1.2 Problem Statements
1.3 Related Work
1.4 Summary of Contributions
1.5 Organization of the Thesis
2 Preliminaries
2.1 Geometric Properties
2.2 Dynamic Tree
3 Minimum Edge-Constrained Proximity Graphs
3.1 Constrained Delaunay Triangulation
3.2 Minimum Constrained Delaunay Triangulation
3.2.1 Algorithm
3.3 Constrained Gabriel Graph
3.4 Minimum Constrained Gabriel Graph
3.4.1 Algorithm
3.5 Constrained Minimum Spanning Tree
3.6 Minimum Constrained Minimum Spanning Tree
3.6.1 Algorithm
4 Geometry Platforms and Technologies
4.1 Survey of Geometry Platforms
4.1.1 CGAL (Computational Geometry Algorithms Library)
4.1.2 Boost Geometry
4.1.3 LEDA (Library of Efficient Data types and Algorithms)
4.1.4 JTS (Java Topology Suite)
4.1.5 Shapely
4.2 Selected Platform and Technologies
5 Data Structures and Functions
5.1 Graph Data Structure
5.2 Graph Containers
5.2.1 Vertex
5.2.2 Edge
5.2.3 Vertex Container
5.2.4 Edge Container
5.2.5 Geometric Functions
5.2.6 Graph Traversal
6 Implementation Details
6.1 Minimum Constrained Delaunay Triangulation
6.2 Minimum Constrained Gabriel Graph
6.3 Minimum Constrained Minimum Spanning Tree
7 Experimental Results
7.1 Introduction
7.1.1 Setup
7.1.2 Input Data
7.2 On Performance Results
7.2.1 Minimum Edge-Constrained Delaunay Triangulation
7.2.2 Minimum Edge-Constrained Gabriel Graph
7.2.3 Minimum Edge-Constrained Minimum Spanning Tree
8 Conclusion
8.1 Summary of Contributions
8.2 Future Work
Bibliography
List of Figures
1.1 Proximity graphs
1.2 Minimum Constrained Delaunay triangulation example
1.3 Minimum Constrained Gabriel graph example
1.4 Minimum Constrained minimum spanning tree example
1.5 3-color graph coloring
1.6 2-degree edge constraint example
2.1 Constrained Delaunay triangulation property
2.2 Heavy edge
2.3 Link cut tree paths
3.1 Constrained Delaunay triangulation
3.2 Delaunay edge flip
3.3 Constrained Gabriel graph
3.4 Delaunay but not Gabriel
3.5 MST cut property
3.6 MST cycle property
3.7 Example of minimum edge-constrained minimum spanning tree (part 2)
4.1 CGAL triangulation package
4.2 Boost Geometry class diagram
4.3 LEDA algorithm snapshot
4.4 JTS class snapshot
4.5 Shapely triangulation packages
5.1 Adjacency List / Adjacency Matrix Type
5.2 Edge struct
5.3 Traversing CGAL vertices, edges and faces
6.1 Infinity vertex representation in CGAL
7.1 Washington DC road network
7.2 Hawaii road network
7.3 Kaua’i County, Hawaii road network
7.4 Randomized disc with short edges
7.5 Randomized disc with random edges
7.6 Randomized circle
7.7 Edge differences in the CDT(V, S)
7.8 Minimum Delaunay triangulation for Kaua’i County, Hawaii
7.9 Minimum Gabriel graph for Kaua’i County, Hawaii
7.10 Minimum edge minimum spanning tree for Kaua’i County, Hawaii
7.11 Minimum Delaunay triangulation for Washington DC
List of Algorithms
3.1 Minimum edge-constrained Delaunay triangulation
3.2 Minimum edge-constrained Gabriel graph
3.3 Kruskal’s Euclidean Minimum Spanning Tree
3.4 Constrained Minimum Spanning Tree (backed by Kruskal’s MST)
3.5 Minimum edge-constrained Minimum Spanning Tree
6.1 Impl. minimum edge-constrained Delaunay triangulation
6.2 Impl. minimum edge-constrained Gabriel graph
6.3 Impl. minimum edge-constrained Minimum Spanning Tree
Chapter 1
Introduction
A proximity graph is a graph whose edges reflect the nearness, or distance, of
vertices to one another in Euclidean space.[34] For
some proximity graphs, adjacency is defined by an empty region between any pair of
vertices. Of the three proximity graphs we discuss, namely the Delaunay triangulation,
the Gabriel graph and the minimum spanning tree (MST), the first two adhere to
this definition.
A Delaunay triangulation DT(V) for a set V of points in the plane is the triangulation
with vertex set V such that there is an edge pq, with p, q ∈ V, if p and q lie on the
boundary of some disc that is empty of all other vertices of V. We call the property
that defines the empty region the Delaunay property.
We define the Gabriel graph similarly. A Gabriel graph GG(V) for a set V of
points in the plane is the graph with vertex set V where there is an edge pq, with p,
q ∈ V, if the disc with diameter ||p − q||, with p and q on its boundary, is empty of all
other vertices of V. We call the property that defines the empty region the Gabriel
property.
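To make the Gabriel property concrete, the following is a minimal brute-force sketch in Python. It is for illustration only and is not the implementation described later in the thesis. The function names are hypothetical, coordinates are assumed to be integers so that the test is exact, and a vertex lying exactly on the disc boundary is assumed not to block an edge.

```python
from itertools import combinations

def in_diametral_disc(p, q, r):
    """Return True if r lies strictly inside the disc whose diameter is segment pq."""
    # r is strictly inside exactly when the angle p-r-q is obtuse,
    # i.e. when the dot product (p - r) . (q - r) is negative.
    return (p[0] - r[0]) * (q[0] - r[0]) + (p[1] - r[1]) * (q[1] - r[1]) < 0

def gabriel_graph(points):
    """Brute-force O(n^3) Gabriel graph; returns a set of index pairs (i, j) with i < j."""
    edges = set()
    for i, j in combinations(range(len(points)), 2):
        p, q = points[i], points[j]
        blocked = any(in_diametral_disc(p, q, r)
                      for k, r in enumerate(points) if k != i and k != j)
        if not blocked:
            edges.add((i, j))
    return edges

# Hypothetical input: the point (2, 1) lies inside the disc with diameter (0,0)-(4,0).
points = [(0, 0), (4, 0), (2, 1)]
print(sorted(gabriel_graph(points)))  # [(0, 2), (1, 2)]
```

The cubic running time is acceptable only for small inputs; it is meant to mirror the definition, not to be efficient.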
Figure 1.1: A) represents a vertex set, V. B) is the Delaunay triangulation, DT(V). C) is the Gabriel graph, GG(V). D) is the minimum spanning tree, MST(V).
While the minimum spanning tree is a proximity graph, its traditional definition
does not use an empty region property for adjacency. Instead, a minimum spanning
tree MST(G), for a graph G = (V, E) with vertex set V, edge set E and a cost function
mapping each edge to a cost, is the spanning tree of G with minimum total cost. That is
to say, any vertex pair in MST(G) is connected by exactly one path, and no
edge in E can replace an edge in MST(G) to yield a spanning tree of lower total cost.
(We provide an example of a minimum spanning tree in figure 1.1.)
A constrained graph is a graph that enforces limitations on its
characteristics. For the three proximity graphs we investigate, our interest is in their
edge-constrained variants. An edge-constrained graph forces specific edges to exist
even if they violate properties of the non-constrained version. A real-world
use of this type of graph could be to guide an autonomous vehicle along a
constrained detour, instead of its typical route, because of, say, an accident.
Edge-constrained graphs do not need to store every edge explicitly. It is possible
to devise algorithms that find the minimum constrained edge set, which we typically
denote as S, such that S is minimum and can reconstruct the edge-constrained
graph. By minimum, we mean that no set of constraint edges smaller than S is capable of
reconstructing the edge-constrained graph.
Figure 1.2: A) is a vertex set. Thick edges in B) represent constraint edges in the edge-constrained Delaunay triangulation, C). The graph D) is the minimum edge set that can reconstruct C).
We provide an example for a minimum constrained Delaunay triangulation in
figure 1.2. As expected, the constrained Delaunay triangulation C) contains all of
the original constrained edges of B). However, the minimum constrained Delaunay
triangulation, D), contains only one constrained edge. It is capable of reconstructing
the constrained Delaunay triangulation C).
Our goal is to address questions regarding the effectiveness and performance
of minimum edge-constrained algorithms in practice. Firstly, we are interested in
their compression effectiveness. We define a compression ratio as (|E| − |S|)/|E|,
where |E| is the cardinality of the constrained edge set and |S| is the cardinality of
the minimum edge set. The closer the compression ratio to 1.0, the more effective
the algorithm at reducing the constraint set size. Our second area of interest is
the execution time to identify and construct the minimum edge set. This gives us
a better understanding of the feasibility of the algorithm in real-world software.
Lastly, we are interested in executing these algorithms on real-world and randomized
datasets to better understand their limitations on different graphs.
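The compression ratio defined above can be computed directly from the two cardinalities. The helper below is a small sketch with a hypothetical function name; the edge counts in the usage lines are made-up values, not measurements from our experiments.

```python
def compression_ratio(num_constraints, num_minimum):
    """Return (|E| - |S|) / |E| for |E| constraint edges and |S| minimum edges."""
    if num_constraints <= 0:
        raise ValueError("the constraint edge set E must not be empty")
    if not 0 <= num_minimum <= num_constraints:
        raise ValueError("S must be a subset of E, so 0 <= |S| <= |E|")
    return (num_constraints - num_minimum) / num_constraints

# Hypothetical counts: 10 constraint edges, 4 of which must be kept.
print(compression_ratio(10, 4))  # 0.6
print(compression_ratio(5, 5))   # 0.0, nothing could be removed
```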
We organize the sections in the introduction as follows. Section 1.1 describes our
motivation and interest in the problem. In section 1.2, we outline and categorize the
problem that we aim to implement. In section 1.3, similar problems are discussed.
In section 1.4, we state the contributions of the thesis. Lastly, section 1.5 describes
the larger organization of the thesis.
1.1 Motivation
Graphs are pervasive in everyday life. They describe social connections, geographic
information systems (GIS), route planning, and many other disciplines. They are
found in the construction of efficient road networks of large cities, allowing traffic
to flow freely, avoiding congestion and bottlenecks. Even the Internet and the
information that flows through it is based on graph theory: imagine that a vertex
represents an address and an edge represents a communication line between two
addresses.
As graphs become larger, so, too, do their demands on resources. They require
faster processors, more memory and larger storage mediums to support their size.
As such, there is interest in algorithms that reduce graph memory footprint. Often,
algorithms address this problem by efficiently representing large graphs without
explicitly storing the full graph and by providing a way to restore them to their full
representation.
It must be mentioned that a large part of the motivation is inspired by Devillers
et al. Their contributions to graph research and, more recently, their paper titled
Minimal Set of Constraints for 2D Constrained Delaunay Reconstruction[8] provided
the basis for the construction of the minimum constrained Delaunay triangulation
algorithm. Their algorithm is able to compress constrained Delaunay triangulations
by storing only a subset of the original constrained edges, greatly reducing the
size of the graph and, as a result, its memory requirement.
More recently, work by Bose et al.[3] broadens the minimum constrained graph
research by finding minimum constrained edge sets for other proximity graphs.
Their paper, Essential Constraints of Edge-Constrained Proximity Graphs, outlines the
proofs and algorithms on minimum edge constraints for: Gabriel graph, minimum
spanning tree and β-skeleton.
1.2 Problem Statements
We investigate three minimum constrained proximity graph algorithms, provide an
implementation for each and analyze their performance. The three proximity graphs are:
Delaunay triangulation, Gabriel graph and minimum spanning tree. We assume the
graphs use the Euclidean metric as their measure of distance. Below, we give the
problem statement for each algorithm without delving too deeply into the details.
(Formal definitions will be given in Chapters 2 and 3.)
Minimum Edge-Constrained Delaunay Triangulation:
A constrained Delaunay triangulation CDT(G) for a plane graph G = (V, E)
with vertex set V and constraint edge set E is a generalization of a Delaunay
triangulation that forces edges from E to be part of the triangulation, i.e. E ⊆
CDT(V, E). Given a constrained Delaunay triangulation CDT(V, E), we are
interested in finding the smallest subset S such that S ⊆ E and CDT(V, E) =
CDT(V, S).
Minimum Edge-Constrained Gabriel Graph:
The constrained Gabriel graph CGG(G) for a plane graph G = (V, E) with vertex
set V and constraint edge set E is a generalization of a Gabriel graph that forces
edges from E to be present in CGG(G). For a given constrained Gabriel graph
CGG(V, E), we are interested in the smallest edge set S such that S ⊆ E and
CGG(V, E) = CGG(V, S). See figure 1.3 for an example.
Figure 1.3: A) is a vertex set. Thick edges in B) represent constraint edges in the edge-constrained Gabriel graph, C). The graph D) is the minimum edge-constrained Gabriel graph that can reconstruct C).
Minimum Edge-Constrained Minimum Spanning Tree:
The constrained minimum spanning tree CMST(F) for a plane forest F =
(V, E) with vertex set V and constraint edge set E is a minimum spanning tree
that is forced to contain every edge of E, i.e. E ⊆ CMST(V, E). Given CMST(V, E),
we are interested in the smallest subset S for which S ⊆ E and
CMST(V, E) = CMST(V, S). We provide an example
in figure 1.4.
Figure 1.4: A) is a vertex set. Thick edges in B) represent constraint edges in the edge-constrained MST, C). The graph D) is the minimum edge-constrained MST that can reconstruct C).
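As an informal illustration of the last problem statement, the sketch below forces the constraint edges into the tree first and then completes the tree greedily, in the style of Kruskal's algorithm. It is not the algorithm given later in the thesis. It rests on simplifying assumptions: candidate edges are taken from the complete Euclidean graph, the constraints are only checked to be cycle-free (not plane), and all names are hypothetical.

```python
import math

def constrained_mst(points, constraints):
    """Spanning tree that contains every constraint edge and is otherwise greedy by length."""
    parent = list(range(len(points)))

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for u, v in constraints:  # forced edges are united before any other edge
        ru, rv = find(u), find(v)
        if ru == rv:
            raise ValueError("constraint edges must form a forest")
        parent[ru] = rv
        tree.append((u, v))

    candidates = sorted((math.dist(points[u], points[v]), u, v)
                        for u in range(len(points))
                        for v in range(u + 1, len(points)))
    for _, u, v in candidates:  # complete the tree with the shortest safe edges
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
    return tree

points = [(0, 0), (1, 0), (2, 0), (0, 5)]
print(sorted(constrained_mst(points, [])))        # [(0, 1), (0, 3), (1, 2)]
print(sorted(constrained_mst(points, [(2, 3)])))  # [(0, 1), (1, 2), (2, 3)]
```

In this toy instance, the edge (0, 3) already belongs to the unconstrained tree, so listing it as a constraint changes nothing and it would not need to be stored in S. The edge (2, 3) is different: it only appears when forced.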
1.3 Related Work
Our problem deals with two broader topics: geometric compression and constrained
graphs. The former describes methodologies for encoding or representing graphs
in an efficient manner. The latter considers graph problems with a defined set of
restrictions that the graph must satisfy. Both topics are independent and stand in
their own right, with decades of research. Unsurprisingly, their applicability makes
them well-suited for use in real-world problems.
Geometric Compression
Geometric compression is an important field that deals with efficient representation
of geometric objects.[24] The basis of computational geometry lies in the way it
represents shapes, polyhedra and complex objects, and in the semantics of how to
encode or decode them. For example, to describe a triangle, we must define its
vertices, edges, dimensions, and positions.
Figure 1.5: A 3-coloring example[1] of a Petersen graph. Coloring graph problems are part of a family of constraint satisfaction problems.
Describing a graph in the sense of computational geometry means encoding the
attributes of vertices and edges in a way that a computer system can make use of it.
A schema for representing a graph in memory can be as simple as a list of vertices
with an inner list of pointers to other vertices, where a vertex is an ordered pair of
integers. However, this explicit form takes no advantage of any compression.
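The explicit schema just described can be sketched as follows. This is an illustrative layout only, not the data structure used in our implementation; list indices stand in for pointers, and the variable and function names are hypothetical.

```python
# Each vertex is an ordered pair of integers.
vertices = [(0, 0), (3, 0), (3, 4)]

# adjacency[i] lists the indices of the vertices adjacent to vertex i (a triangle here).
adjacency = [[1, 2], [0, 2], [0, 1]]

def edge_list(adjacency):
    """Recover each undirected edge once, as a sorted list of index pairs."""
    return sorted({(min(u, v), max(u, v))
                   for u, neighbors in enumerate(adjacency)
                   for v in neighbors})

def stored_references(adjacency):
    """Count the neighbor references held by the explicit representation."""
    return sum(len(neighbors) for neighbors in adjacency)

print(edge_list(adjacency))          # [(0, 1), (0, 2), (1, 2)]
print(stored_references(adjacency))  # 6
```

Every undirected edge is stored twice, once per endpoint, which is the kind of redundancy a compressed representation tries to avoid.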
Constrained Graphs
A constrained graph problem involves a graph and a set of constraints. These
constraints are abstract limitations that the graph must satisfy. Take, for instance,
the famous 3-color vertex coloring problem (Figure 1.5). The constraint is that
no two adjacent vertices share the same color. The objective is to find a
’coloring’, if possible, that assigns one of three colors to each vertex until all vertices
are colored. Another example (Figure 1.6) is a graph with the constraint that each
vertex must have degree (the number of incident edges) of at most 2.
Figure 1.6: An example of a graph with degree constraint of at most 2.
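Both example constraints can be checked mechanically. The sketch below uses a simple backtracking search for the coloring constraint and a degree count for the degree constraint. The vertex numbering of the Petersen graph and all function names are our own assumptions for illustration.

```python
def petersen_edges():
    """Petersen graph: outer 5-cycle (0-4), inner pentagram (5-9) and five spokes."""
    outer = [(i, (i + 1) % 5) for i in range(5)]
    spokes = [(i, i + 5) for i in range(5)]
    inner = [(5 + i, 5 + (i + 2) % 5) for i in range(5)]
    return outer + spokes + inner

def is_proper_coloring(edges, color):
    """The coloring constraint: no edge joins two vertices of the same color."""
    return all(color[u] != color[v] for u, v in edges)

def find_coloring(num_vertices, edges, num_colors=3):
    """Backtracking search; returns a list of colors, or None if no coloring exists."""
    neighbors = [[] for _ in range(num_vertices)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    color = [None] * num_vertices

    def assign(v):
        if v == num_vertices:
            return True
        for c in range(num_colors):
            if all(color[w] != c for w in neighbors[v]):
                color[v] = c
                if assign(v + 1):
                    return True
        color[v] = None  # undo and backtrack
        return False

    return color if assign(0) else None

def max_degree(num_vertices, edges):
    """Largest number of incident edges over all vertices."""
    degree = [0] * num_vertices
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return max(degree)

edges = petersen_edges()
coloring = find_coloring(10, edges)
print(is_proper_coloring(edges, coloring))  # True
print(max_degree(10, edges) <= 2)           # False: every vertex has degree 3
```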
More broadly, a survey[17] of constrained classification describes two major
groups. The first group is a partition of objects into classes: objects in the same
class have similar properties, while objects in different classes have dissimilar
ones. The second group is a hierarchical
classification containing a nested set of partitions. The first classification is of
interest because it holds the class of constrained graphs and, more specifically, proximity
graphs. Their property, that a pair of vertices shares an edge when close in distance
and does not share an edge when far apart, forms the basis of similarity and dissimilarity.
Among the graphs that belong in the constrained classification survey are: Delaunay
triangulation, Gabriel graph, relative neighborhood graph and minimum
spanning tree. And while it requires additional proof, their relationship to one
another is: MST ⊆ RNG ⊆ GG ⊆ DT.[34]
We divide the following subsection into three parts, one part for each of the
constrained proximity graphs.
Constrained Delaunay Triangulation
The classical Delaunay triangulation problem is named after Boris Delaunay for his
publication[7] in 1934, describing a method for triangulating points. For a set of
points V, a Delaunay triangulation DT(V) is the triangulation connecting points q
and r, q ≠ r, if there exists a disc C with q and r on its boundary such that no other
point p ∈ V lies inside C. This property is also known as the “Delaunay property”,
“Delaunay criterion” or “Delaunay condition”.
The Delaunay triangulation and the Voronoi diagram are duals of one another,
i.e. starting with either graph, one can convert to the other. This implies that computing a
Voronoi diagram effectively computes the Delaunay triangulation of the same point set.
One of the fastest methods for computing a Voronoi diagram is the Fortune plane
sweep[11], which computes the Voronoi diagram using a sweep line. This sweep
line moves from vertex to vertex, sorted along one axis. Each vertex that the sweep
line has passed contributes a parabola, consisting of the points equidistant from
that vertex and the sweep line, which expands as the sweep line advances.
To convert the Voronoi diagram to a Delaunay triangulation, we place
a vertex in every face and an edge between every two vertices whose faces are
adjacent in the Voronoi diagram.
On the topic of constrained Delaunay triangulation, as early as 1978, computer
scientist D.T. Lee introduced this problem as generalized Delaunay
triangulation[22]. While his work does not explicitly provide an algorithm, he was the first
to define the constraint problem as a generalization of the Delaunay triangulation:
a type of triangulation that includes specific edges in the graph such that it is as
close as possible to a Delaunay triangulation.
Nearly a decade later, computational geometry researcher Paul Chew referred
to this problem as obstacle triangulation[5]. He devised a divide-and-conquer
algorithm[4], inspired by a Voronoi diagram algorithm by Yap et al.[35] Chew’s
algorithm assumes all vertices are bounded by an imaginary rectangle, subdivided into
strips. Each strip maintains edges that cross it, but only if one or more endpoints of
that edge lie in the strip. (It would be infeasible to store all crossing edges as there
could be upwards of O(n^2) of them.) Recursively, the graph is put back together:
adjacent strips are stitched together into a single strip. Sophisticated logic is used to
ensure constraint edges are handled. The running time is O(n log n) for a vertex
set of size n, matching the complexity of computing an unconstrained
Delaunay triangulation or Voronoi diagram.
Further improvements have been made to different aspects of constrained Delaunay
triangulations. For example, there has been work done on a dynamic data
structure that supports insertion and removal of constraints, allowing for iterative
graph construction.[19][6] This work also includes support for cases that are
considered invalid in the original input, such as degeneracies from edge intersections
and overlapping edges.
In terms of practical and performant algorithms, Sloan’s algorithm[30] is capable
of iteratively computing a constrained Delaunay triangulation in O(n^2) worst-case
and O(n^(5/4)) average-case time by first creating the unconstrained Delaunay
triangulation and augmenting it with an adjacency list that is used to apply constrained
edges. It is considered among the fastest algorithms in practice.
Constrained Gabriel Graph
Gabriel graphs have their origin in statistical analysis and sampling, a method
intended to survey and categorize population by geography. The name Gabriel graph
has since been attributed to the original authors K. R. Gabriel and R. R. Sokal for
their publication in 1969.[12]
Much like the Delaunay triangulation, Gabriel graphs express proximity of points
by their Euclidean distance. A Gabriel graph GG(V ) is a graph over a vertex set V .
An edge pq connects a pair of vertices p and q if the disc with diameter
||p − q|| (centered on the midpoint of pq) is empty of any other vertex.
Compared to the vast amount of research that exists for constrained Delaunay
triangulation, literature on constrained Gabriel graph
problems is scarce. One early solution, from Su et al.[32], gives a “pruning” algorithm. It
starts by constructing a CDT, then prunes select edges from it to obtain a CGG. This
works because in the hierarchy of proximity graphs, a Gabriel graph is a subgraph
of the Delaunay triangulation. Less intuitively, it follows that a constrained Gabriel
graph is also a subgraph of a constrained Delaunay triangulation. By looking at
each edge common to two adjacent triangles, their algorithm determines if that
edge should be pruned. This step takes O(n) time for n edges. Therefore, the
time complexity of their algorithm is dominated by computing the
initial CDT, which, at best, has time complexity O(n log n).
Constrained Minimum Spanning Tree
Classical Problems
The classical minimum spanning tree algorithm was discovered in 1926 by Otakar
Borůvka for the purpose of laying efficient, cost-effective electrical infrastructure
in Moravia, a region of what is now the Czech Republic.[2] He was interested in finding
electrical transmission routes that were cost-effective and minimized the amount
of wiring needed to cover all cities and towns in a region. He framed this puzzle
as a graph theory problem, which, at the time, was not acknowledged as a true
mathematical discipline.
Borůvka’s algorithm constructs a minimum spanning tree through edge contraction.
Each edge contraction joins a pair of connected vertices as part of a growing
component; each vertex is initially its own component. All components together
form a forest. Only adjacent edges with minimum cost are considered in the
contraction step. While there is more than one component, the algorithm
contracts, for each component, its minimum-cost adjacent edge. The end result is a single
component where the set of edges collected during the contraction steps becomes the
minimum spanning tree.
The time complexity for Borůvka’s algorithm can be analyzed in two parts.
Each contraction step at least halves the number of components. Hence, there are at most
O(log |V|) iterations (for |V| vertices) of this step before there is a single component.
During each contraction step, there are at most |E| edges to examine. Therefore,
the total runtime is O(|E| log |V|).
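The contraction loop can be sketched compactly by tracking components with a union-find array instead of physically contracting edges. This is a minimal illustration under our own assumptions: edges are given as (cost, u, v) tuples, ties are broken by comparing whole tuples, and the function name is hypothetical.

```python
def boruvka_mst(num_vertices, edges):
    """Boruvka's algorithm; edges are distinct (cost, u, v) tuples."""
    component = list(range(num_vertices))  # each vertex starts as its own component

    def find(v):
        # Representative of v's component, with path halving.
        while component[v] != v:
            component[v] = component[component[v]]
            v = component[v]
        return v

    tree = []
    num_components = num_vertices
    while num_components > 1:
        cheapest = {}  # component representative -> cheapest edge leaving it
        for edge in edges:
            _, u, v = edge
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # both endpoints already in the same component
            for root in (ru, rv):
                if root not in cheapest or edge < cheapest[root]:
                    cheapest[root] = edge
        if not cheapest:
            raise ValueError("graph is not connected")
        for cost, u, v in cheapest.values():
            ru, rv = find(u), find(v)
            if ru != rv:  # the same edge may have been chosen by both components
                component[ru] = rv
                tree.append((cost, u, v))
                num_components -= 1
    return tree

# Hypothetical example: a 4-cycle with one diagonal.
edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)]
tree = boruvka_mst(4, edges)
print(sum(cost for cost, _, _ in tree))  # 6
```

Comparing whole tuples gives a consistent total order on edges, which avoids cycles when several edges share the same cost.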
In 1930, Prim’s algorithm[18], which constructs a minimum spanning tree, was
discovered by Jarník. Ironically, the algorithm is credited under the name of Robert Prim,
who rediscovered Jarník’s work decades later. And again, the same events happened:
Jarník’s algorithm was rediscovered by Edsger Dijkstra two years after Prim.
In the literature, the algorithm is often referred to as the DJP algorithm, a
concatenation of the first letter of each author’s last name.
Prim’s algorithm constructs the minimum spanning tree by growing a tree:
it repeatedly adds the lowest-cost edge that connects the tree to a vertex not yet part of it.
The edges considered are those that are adjacent to the current tree. Each vertex
outside the tree keeps track of the cheapest known edge connecting it to the tree. Depending
on the data structure used to maintain this, it may require upward of |V|^2 space and
time if an adjacency matrix is used. Alternatively, O(|E| log |V|) time is
required if a binary heap and adjacency list are used instead. Once all vertices are
reached, the algorithm reports the tree as the minimum spanning tree.
The time complexity of Prim’s algorithm is dominated by maintaining these
costs. At most |E| edges are examined, each costing O(log |V|) in heap
operations; therefore the total
runtime if a binary heap and adjacency list are used is O(|E| log |V|).
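The binary heap variant can be sketched with Python's standard heapq module. This is the so-called lazy form, which leaves stale entries in the heap and skips them when popped, rather than the decrease-key form. The adjacency layout and the function name are our own assumptions.

```python
import heapq

def prim_mst(adjacency, start=0):
    """Lazy Prim; adjacency[v] is a list of (cost, neighbor) pairs."""
    in_tree = {start}
    # Heap of candidate edges (cost, tree_vertex, outside_vertex).
    heap = [(cost, start, v) for cost, v in adjacency[start]]
    heapq.heapify(heap)
    tree = []
    while heap and len(in_tree) < len(adjacency):
        cost, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue  # stale entry: v was already reached by a cheaper edge
        in_tree.add(v)
        tree.append((cost, u, v))
        for next_cost, w in adjacency[v]:
            if w not in in_tree:
                heapq.heappush(heap, (next_cost, v, w))
    return tree

# Hypothetical example: a 4-cycle with one diagonal.
edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)]
adjacency = [[] for _ in range(4)]
for cost, u, v in edges:
    adjacency[u].append((cost, v))
    adjacency[v].append((cost, u))
print(prim_mst(adjacency))  # [(1, 0, 1), (2, 1, 2), (3, 2, 3)]
```

Each edge is pushed at most twice, so the heap holds O(|E|) entries and every operation costs O(log |E|), which is O(log |V|).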
The last classical algorithm was provided by Kruskal in 1956.[21] It is considered
conceptually simple because it leverages two data structures to abstract most
of the complexity, leaving an intuitive algorithm. The first step in Kruskal’s algorithm
involves sorting all edges by their weight. Starting with the lowest cost up to
the highest cost edge, each edge is added back to an initially empty graph if it does
not form a cycle with any existing edges. To determine if adding an edge forms
a cycle, a disjoint-set data structure, more commonly known as a union-find data
structure, is used. This data structure reports, in logarithmic time, the membership
of a vertex to a group. Each connected component in the graph forms its own group
in the disjoint-set. Clearly, if both endpoints of a new edge have the same
representative vertex, then adding the edge must form a cycle.
The time complexity of Kruskal’s algorithm is primarily bounded by the sorting step.
Selecting an arbitrary fast sorting algorithm, say merge sort, yields at best O(|E| log |E|)
for |E| edge weights. Following this, each edge is potentially added back to the
graph, costing O(|E|). As well, each edge checks the disjoint-set for membership
at the cost of O(log |E|). In total, the time complexity for Kruskal’s algorithm is
O(|E| log |E|). Since |E| ≤ |V|^2, this is O(|E| log |V|).
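The two steps, sorting and cycle detection with a disjoint-set, can be sketched as follows. This is a generic illustration with a hypothetical function name, not the listing from Chapter 3; the union-find here uses path halving without union by rank, which is enough for a small example.

```python
def kruskal_mst(num_vertices, edges):
    """Kruskal's algorithm; edges are (cost, u, v) tuples."""
    parent = list(range(num_vertices))  # disjoint-set: each vertex is its own group

    def find(v):
        # Representative of v's group, with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for cost, u, v in sorted(edges):  # lowest cost first
        ru, rv = find(u), find(v)
        if ru != rv:  # different groups, so the edge cannot close a cycle
            parent[ru] = rv
            tree.append((cost, u, v))
    return tree

# Hypothetical example: a 4-cycle with one diagonal.
edges = [(4, 0, 3), (5, 0, 2), (1, 0, 1), (3, 2, 3), (2, 1, 2)]
print(kruskal_mst(4, edges))  # [(1, 0, 1), (2, 1, 2), (3, 2, 3)]
```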
Constrained Graph Variants
In terms of research on constrained minimum spanning trees (CMST), no other
constraint is as well-studied as degree constraints.[26][20][28] A degree constraint
places a restriction on the number of outgoing or incoming edges a vertex can have.
This may be defined as a numerical constant or as a ratio comparing other vertices.
While not as well-studied, there is some interest in the topic of diameter CMSTs.[14]
A diameter CMST finds the minimum spanning tree among all spanning trees that
minimizes the longest path of the tree. As it turns out, this problem is NP-complete
for the majority of graph types, including complete graphs.
Currently, there is scarce information on the subject of edge-constrained
MSTs. Research on the topic of minimum edge sets for CMSTs is equally
scarce. To this day, the only research material originates from the work of Bose et
al.[3].
1.4 Summary of Contributions
The main contributions of this thesis are: source code for three minimal constrained
algorithms, experimental results on randomized and large, real-world graphs and
documentation, explaining the implementation of these algorithms.
The complete source code, separated into three projects, one for each algorithm,
can be accessed online (https://github.com/eduong). They are currently available
under the BSD license, for use by all parties and for future improvements.
Performance metrics for graphs of different sizes and types are also given. We
analyze circular graphs (vertices on a circle boundary), disk graphs (vertices lie on
or within a disc) and large real-world data sets, containing well over 60000 vertices
and 60000 edges. We provide results with exact and inexact arithmetic for the
real-world datasets. In addition, for each algorithm, we provide a simple way to test
its output. This involves reconstructing the graph from the minimal edge
set and comparing the edges added back against the expected constraint graph.
Lastly, this thesis documents the process for designing and building each
algorithm, including pitfalls encountered throughout the process. Any shortcomings
or real-world considerations are documented; this includes design choices for each
algorithm and the data structures, frameworks and third-party libraries used.
1.5 Organization of the Thesis
We structure the thesis into the following chapters: Chapter 2 describes important
concepts, properties and definitions that relate to the problem statement. Chap-
ter 3 describes the three main algorithms and provides a sketch of their correctness.
Chapter 4 surveys the geometry platform choices across different programming lan-
guages, examining their advantages and disadvantages. Chapter 5 further delves
into the geometry platform details, explaining their data structures and how we
make use of their API (application programming interface). Chapter 6 details the
algorithm implementation on the selected platform. We outline our difficulties and
experience encountered during implementation. Chapter 7 gives context to our
experimental setup and the methodology to our input datasets. The effectiveness
and execution times of the algorithms are also given. Chapter 8 summarizes our
findings and leaves the reader with a few open topics for future work.
Chapter 2
Preliminaries
This chapter describes properties, data structures and concepts that are related to
the problem statement. Unless otherwise specified, we assume a graph to be undirected and
its edges to be straight line segments.
2.1 Geometric Properties
Definition 2.1. Tree: A tree is a connected graph with a single path between any
pair of vertices. A disjoint union of trees is referred to as a forest.
Definition 2.2. Spanning tree: Let G be an undirected graph with at least one
path between any pair of vertices. A spanning tree of G is a subgraph that has the
same vertex set as G and forms a tree.
Definition 2.3. Minimum spanning tree: Let G = (V,E) be an undirected graph
with vertex set V and edge set E. Let w(e), e ∈ E be a function that maps an edge
to a cost. A minimum spanning tree MST (G) is a spanning tree of G with minimum
total cost.
Definition 2.4. Plane graph: A graph G is a plane graph if it can be drawn on a
plane such that edges in G do not intersect except at their endpoints.
Definition 2.5. Triangulation: A plane graph G is a triangulation if the addition of
any edge makes G non-planar.
Definition 2.6. Delaunay property: Let V be a vertex set. Vertices p and q are
connected by a Delaunay edge if there exists a disc that passes through p and q that
is empty of all other vertices in V .
Definition 2.7. Locally Delaunay property: For a vertex set V, let T be a
triangulation over V. Let the triangles △(u, p, v) and △(u, q, v) in T share a common edge uv.
We say uv is locally Delaunay if the circumcircle of △(u, p, v) does not contain q and if the
circumcircle of △(u, q, v) does not contain p.
Definition 2.8. Vertex Visibility: In a straight-edge graph G = (V,E) where V is
the vertex set and E is the edge set, vertices, p and q are visible if the segment pq
does not intersect any other edge in E.
Definition 2.9. Constrained Delaunay Triangulation: Let G be a plane graph over
a vertex set V and constrained edge set E. The constrained Delaunay triangulation
CDT(G) is a triangulation of G such that any edge, pq, belongs to E or has the
following properties:
• p and q are visible
• p and q lie on the boundary of a disc C
• any vertex inside C has visibility to at most one endpoint: p or q.
See figure 2.1 for an example.
Figure 2.1: Let thick edges in A) be constraints and B) be its CDT. In C), we show the constrained Delaunay edge property for edge zx as example. Notice that vertices within the disc have visibility to at most one endpoint of zx. In D) we select a disc that does not meet the constrained Delaunay edge property because it contains a, which is visible to both endpoints.
Definition 2.10. Gabriel property: Let p and q be vertices in a vertex set V . There
exists a Gabriel edge connecting p and q if the disc with diameter ||p − q|| is empty
of all other vertices in V .
Definition 2.11. Locally Gabriel property: Let G be a plane graph over the vertex
set V . An edge pq in G is locally Gabriel if the disc with diameter ||p − q|| does not
contain any other vertices adjacent to pq.
Definition 2.12. Constrained Gabriel graph: Let G be a plane graph over a vertex
set V and constrained edge set E. The constrained Gabriel graph CGG(G) is a plane
graph where any edge pq belongs to E or has the following properties:
• p and q are visible
• p and q lie on the boundary of the disc C with diameter ||p− q||
• any vertex inside C has visibility to at most one endpoint: p or q.
Definition 2.13. Euclidean Minimum Spanning Tree: Let DT (V ) be a Delaunay
triangulation over a vertex set V . Let the cost of an edge be the Euclidean distance
between endpoints. The Euclidean minimum spanning tree MST (V ) is a spanning
tree of DT (V ) with minimum total cost.
Definition 2.14. Constrained Euclidean Minimum Spanning Tree: Let F =
(V,E) be a plane forest for a constraint edge set E and vertex set V . The con-
strained Euclidean minimum spanning tree CMST (F ) is a spanning tree of CDT (F )
with minimum total cost where E ⊆ CMST (F ).
2.2 Dynamic Tree
The dynamic tree or link/cut tree (as it is more commonly called) is a data structure
that maintains path connectivity in rooted trees as part of a larger encompassing
forest. It has many applications for path reachability and membership queries. The
data structure was introduced by computer scientists Robert
Tarjan and Daniel Sleator in 1982.[29] At the time, they referred to it
as a dynamic tree.
Operations on dynamic trees have amortized O(log n) run time for n nodes; the
tree self-adjusts internally to improve subsequent operation times. This means an
individual operation may take longer than O(log n), as long as such operations are
infrequent and any sufficiently long sequence of p operations takes at worst O(p log n) time.
Dynamic trees support 4 base operations:
1. Link(v, w): Link a tree rooted at node v to a node w, belonging to another tree
2. Cut(v): Cut the subtree rooted at node v from its tree by disconnecting v from its
parent node, if it has one
3. Root(v): Find node v’s root node
4. Path(v): Report a property for the path v to Root(v)
The path operation can be augmented to report a property that requires aggre-
gation, such as the min / max / total cost of nodes in a path.
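To make the interface concrete, the four operations can be mimicked with plain parent pointers. This is a deliberately naive sketch, not a link/cut tree: every query walks to the root and so costs time proportional to the node's depth rather than amortized O(log n). The class name NaiveForest, its method names and the choice of a max aggregate are assumptions made for illustration.

```python
class NaiveForest:
    """Parent-pointer forest exposing Link, Cut, Root and Path.

    Same interface as a dynamic tree, but each query walks up to the
    root, so it costs O(depth) instead of amortized O(log n).
    """

    def __init__(self):
        self.parent = {}  # node -> parent node (None for a root)
        self.cost = {}    # node -> cost stored on the node

    def add(self, v, cost=0.0):
        self.parent[v] = None
        self.cost[v] = cost

    def root(self, v):
        while self.parent[v] is not None:
            v = self.parent[v]
        return v

    def link(self, v, w):
        """Attach the tree rooted at v below node w of another tree."""
        if self.parent[v] is not None:
            raise ValueError("v must be a root")
        if self.root(w) == v:
            raise ValueError("w already belongs to v's tree")
        self.parent[v] = w

    def cut(self, v):
        """Detach the subtree rooted at v from its parent, if any."""
        self.parent[v] = None

    def path_max(self, v):
        """Aggregate (here: maximum cost) over the path from v to Root(v)."""
        best = self.cost[v]
        while self.parent[v] is not None:
            v = self.parent[v]
            best = max(best, self.cost[v])
        return best
```

A link/cut tree keeps this interface but replaces the root walks with operations on the auxiliary structure described next.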
Queries in an unbalanced tree can be slow. In the worst case, they are no faster
than a linked list traversal. To prevent this, operations on dynamic trees partition
the edges of the tree into preferred (solid) and unpreferred (dotted) edges. For any node
and its immediate descendants, at most one preferred edge enters a descendant
node; all other edges must be unpreferred. To construct a preferred path, an
internal operation Expose(v), also known as Access(v), walks the path from v to its
root and, for each visited node, performs a splice, changing the edge on the path to
preferred and setting the previously preferred edge, if it exists, to unpreferred.
Definition 2.15. Preferred child: The preferred child for a node v is:
• none if the last Access or Expose in v’s subtree is v
• node w if the last Access or Expose is in w’s subtree, w a child of v
Definition 2.16. Preferred path decomposition: A preferred path for a node v is
the path created by repeatedly following preferred edges of a descendant until there
are none. Preferred paths decompose nodes in the tree.
The number of splices is bounded by the height of the path that must be traversed
and the number of Expose(v) operations. The original algorithm decomposed the
height of the tree using heavy and light edges. An edge uv, u the parent of v, in a tree
is considered heavy if size(v), the number of nodes in the subtree rooted at v, is greater than
(1/2)·size(u). (See figure 2.2 for an example.) So, while a node may have multiple
child edges, at most one of them can be heavy. All other edges are referred to as light
edges. Without delving too deeply into the details, this allows the Expose operation
to take amortized O(log n) steps for n nodes.
Definition 2.17. Light-depth: Light-depth(v) is the number of light edges on the path
from Root(v) to v. The light-depth of v is ≤ log_2(n) because traversing down a light
edge at least halves the number of nodes in the current subtree, whereas a heavy edge
leads to a subtree containing more than 1/2 of them.
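The heavy/light classification and the light-depth bound can be checked on a small rooted tree. The sketch below assumes the tree is given as a dictionary mapping each node to a list of its children; the function names are hypothetical.

```python
import math


def subtree_sizes(children, root):
    """Return {node: number of nodes in the subtree rooted at node}."""
    order = [root]
    for node in order:  # the list grows while iterating: breadth-first order
        order.extend(children.get(node, []))
    size = {}
    for node in reversed(order):  # children are handled before their parent
        size[node] = 1 + sum(size[c] for c in children.get(node, []))
    return size


def light_depths(children, root):
    """Return {node: number of light edges on the path from root to node}."""
    size = subtree_sizes(children, root)
    depth = {root: 0}
    stack = [root]
    while stack:
        u = stack.pop()
        for v in children.get(u, []):
            heavy = 2 * size[v] > size[u]  # size(v) > (1/2) * size(u)
            depth[v] = depth[u] + (0 if heavy else 1)
            stack.append(v)
    return depth


# Example: r has children a and b; a has children c, d and e.
children = {"r": ["a", "b"], "a": ["c", "d", "e"]}
size = subtree_sizes(children, "r")
depth = light_depths(children, "r")
# Every light-depth is bounded by log2 of the number of nodes.
assert all(d <= math.log2(size["r"]) for d in depth.values())
```

In this example only the edge from r to a is heavy, since size(a) = 4 and size(r) = 6.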
Queries made on the preferred (solid) path in the represented tree (the depiction
of the tree and its paths) may be unbalanced. (See figure 2.3 for example). An
auxiliary data structure is used internally to store the preferred path. In the original
publication, biased binary trees were used. More recently, biased binary trees have
been replaced with splay trees for performance reasons, studies showing them to
be faster in application[23].
There may be multiple preferred paths in a represented tree, so there may also
be multiple auxiliary trees. This forms a tree of auxiliary trees. Auxiliary trees
connect to one another through a path-parent pointer, which connects the root of an
auxiliary tree to a node belonging to another auxiliary tree.
Figure 2.2: The edge uv is a heavy edge because size(v) = 5, size(u) = 9 and size(v) > (1/2)·size(u).
Figure 2.3: In this figure[9], the represented tree on the left shows the preferred path decomposition. For example, the preferred (solid) path a ↔ b ↔ e in the left represented tree is stored in a balanced auxiliary tree on the right. Nodes g, i and o each lie on a preferred path with a single node.
Chapter 3
Minimum Edge-Constrained
Proximity Graphs
In this chapter, we provide the formal definition of the problem and the outline for
the main algorithms. Due to their complexity and length, we suggest the reader
refer to the original papers for the complete proof[8][3].
3.1 Constrained Delaunay Triangulation
When we refer to constrained Delaunay triangulation, we mean edge-constrained
Delaunay triangulation, a graph that forces specific edges to be present. Both the
Delaunay triangulation and constrained Delaunay triangulation, as their name sug-
gests, are triangulations. Each face of the graph is a triangle, with the exception
of the outermost face. Their main difference, however, is that not every edge in a
constrained Delaunay triangulation, unlike the Delaunay triangulation, needs to be
a Delaunay edge.
If we consider the Delaunay edge property for a constrained Delaunay
triangulation, we notice that the requirement that the disc be empty of vertices does not always hold.
Therefore, we must use a different property. In place of the Delaunay edge property,
we define a constrained Delaunay property, a looser condition that allows for certain
vertices to exist inside the disc (normally empty of points). For a triangulation T
and an edge uv in T , we say uv is a constrained Delaunay edge if u and v lie on the
boundary of some disc C such that any other vertex inside C has visibility to at most
one endpoint, u or v.
We define a constrained Delaunay triangulation CDT(G), for a plane graph
G = (V,E) where V is the vertex set and E is the constrained edge set (edges that must
be present in the graph), as a triangulation such that every edge either belongs to E or
is a constrained Delaunay edge (see figure 3.1 for an example). One notable remark
is that CDT(V,E) where E = ∅ is equivalent to a Delaunay triangulation over
V.
The computation of a Delaunay triangulation requires O(n log n) steps for n
vertices[11]. Similarly, we can compute a CDT in the same runtime.[4] This can
be done using a divide-and-conquer technique that subdivides the graph into strips.
The algorithm recombines each strip with its neighboring strip using a complex set
of rules to triangulate edges crossing two strips while keeping in place constraint
edges. The algorithm outputs the constrained Delaunay triangulation when a single
strip remains.
Figure 3.1: Each edge in this CDT(V, E) either belongs to the constrained edge set, E, or is a constrained Delaunay edge. Take for example edge rs: it belongs to E. The edge ps is a constrained Delaunay edge because there exists a disc, with p and s on its boundary, such that any vertex within sees at most one endpoint, i.e. q can see s but not p.
3.2 Minimum Constrained Delaunay Triangulation
For a given plane graph G = (V,E), a minimum constrained Delaunay triangulation
is the graph MCDT (G) = (V, S) where S ⊆ E, |S| is minimum and CDT (V,E) =
CDT (V, S). Thus, S is the minimum edge set in the sense that there is no smaller
set that reconstructs CDT (V,E). Ideally, we want the cardinality of S to be as small
as possible, which in turn greatly reduces the number of edges that we must store.
Work done by Devillers et al.[8] states that the minimum edge set S consists
of the non-locally Delaunay edges from CDT(V,E). Understanding why requires more
work. We sketch their proof in a similar fashion using an alternate definition for
Delaunay triangulation.
Definition 3.1. Edge flip: Let v1v2 be an edge in a triangulation T and let v3 and
v4, v3 ≠ v4, be the vertices belonging to the 2 triangles adjacent to v1v2. Replacing
v1v2 with an edge v3v4 is a Delaunay edge flip if the sum of the angles ∠v1v3v2 and
∠v1v4v2 (the angles at v3 and v4 opposite v1v2) is greater than 180 degrees (see
figure 3.2 for an illustration).
Figure 3.2: Delaunay triangulations attempt to avoid ’sliver’ quadrilaterals, preferring to maximize the interior angle. By performing an edge flip, the left triangulation becomes a Delaunay triangulation on the right.
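The flip criterion of definition 3.1 can be evaluated directly from coordinates. The following is a small sketch using floating-point (inexact) arithmetic; points are assumed to be (x, y) tuples, v3 and v4 are assumed to lie on opposite sides of v1v2, and the function names are hypothetical.

```python
import math


def angle_at(apex, a, b):
    """Interior angle, in degrees, at `apex` between rays apex->a and apex->b."""
    ax, ay = a[0] - apex[0], a[1] - apex[1]
    bx, by = b[0] - apex[0], b[1] - apex[1]
    cross = ax * by - ay * bx
    dot = ax * bx + ay * by
    return math.degrees(abs(math.atan2(cross, dot)))


def should_flip(v1, v2, v3, v4):
    """True if the angles opposite edge v1v2, at v3 and v4, sum to > 180."""
    return angle_at(v3, v1, v2) + angle_at(v4, v1, v2) > 180.0


# A flat "sliver" pair of triangles sharing the long edge (0,0)-(4,0):
print(should_flip((0, 0), (4, 0), (2, 1), (2, -1)))   # True: flip to v3v4
# After the flip, the new diagonal is no longer flippable:
print(should_flip((2, 1), (2, -1), (0, 0), (4, 0)))   # False
```

Nearly cocircular inputs can give unstable answers with inexact arithmetic, so a robust implementation would likely rely on an exact predicate instead.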
Definition 3.2. Alternate Delaunay triangulation definition: A triangulation over
a vertex set V is a Delaunay triangulation if and only if there are no edges that can
be flipped such that they become Delaunay edges.
Theorem 3.3. Let G = (V,E) be a plane graph, where V is the vertex set and E is
the set of edge constraints. Then the set NLD_CDT(G) of non-locally Delaunay edges of CDT(G)
forms the minimum edge set S such that CDT(V, S) = CDT(V,E).
A sketch of the proof is as follows. Given a plane graph G = (V,E) and a
constrained Delaunay triangulation CDT(G), the edges of CDT(G) belong to either
NLD_CDT(G) (the set of non-locally Delaunay edges) or LD_CDT(G) (the set of locally
Delaunay edges). Both sets are subsets of CDT(G), i.e. NLD_CDT(G) ⊆ CDT(G)
and LD_CDT(G) ⊆ CDT(G). The set NLD_CDT(G) is a sufficient constraint set because
we can add locally Delaunay edges to NLD_CDT(G) to obtain CDT(G). This means that
to compute MCDT(G), we simply take away LD_CDT(G) from CDT(G).
Why is NLD_CDT(G) necessary, i.e. why can there not exist a smaller set S′ ⊂ S such
that CDT(V, S′) = CDT(V,E)? Assume S′ = S \ {e′} where e′ = (u′, v′) is an edge that
can be removed such that CDT(V, S′) = CDT(V,E). It follows that CDT(V, S′ ∪
{e′}) = CDT(V,E) since we can take away e′. Consider in CDT(V, S′) the edge e′
in triangles △{u′, v′, q} and △{u′, v′, p} where p and q, p ≠ q, are adjacent to e′.
Recall that the circumcircle {u′, v′, p} from the locally Delaunay property 2.7 does
not contain q and also {u′, v′, q} does not contain p. This is a contradiction: p or
q or both p and q fall within each other’s circumcircles because e′ is non-locally
Delaunay. Hence, there is no such edge e′ that we can remove from S.
3.2.1 Algorithm
Algorithm 3.1: Minimum edge-constrained Delaunay triangulation MCDT(G)
Input: plane graph G = (V,E)
Output: minimum set S ⊆ E of constraints such that CDT(V,E) = CDT(V, S)
    Construct CDT(G)
    Initialize S = ∅
    foreach uv ∈ CDT(G) ∩ E do
        Consider triangles △(u, v, p) and △(u, v, q)
        Consider circumcircles C_{u,v,p} and C_{u,v,q}
        if p ∈ C_{u,v,q} or q ∈ C_{u,v,p} then
            set S ← S ∪ {uv}
        end
    end
    return S
Algorithm 3.1 accepts a plane graph as input in the form of a graph G = (V,E)
where V is the vertex set and E are the constraint edges, and outputs a minimum
constraint set S. Firstly, we initialize the constraint set S to the empty set, popu-
lating it throughout the for loop with non-locally Delaunay edges. After iterating
through each edge in CDT (G) ∩ E, S contains our minimum edge constraint set.
We discuss this algorithm with more detail in chapter 6.
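The filtering loop of algorithm 3.1 can be sketched as below. This is an illustration under stated assumptions rather than the implementation discussed in chapter 6: the CDT is assumed to be computed elsewhere and passed in as a list of vertex-index triples, points is a mapping from index to (x, y), every constraint is assumed to appear in the triangulation, boundary edges with a single adjacent triangle are treated as locally Delaunay, and the in-circle test uses inexact floating-point arithmetic. All function names are hypothetical.

```python
from collections import defaultdict


def orient(a, b, c):
    """Twice the signed area of triangle abc (> 0 if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, d):
    """True if d lies strictly inside the circle through a, b and c."""
    (ax, ay), (bx, by), (cx, cy) = [(p[0] - d[0], p[1] - d[1]) for p in (a, b, c)]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    # The determinant's sign depends on the orientation of a, b, c.
    return det * orient(a, b, c) > 0


def minimum_cdt_constraints(points, triangles, constraints):
    """Return the constraints that are non-locally Delaunay in the given CDT."""
    opposite = defaultdict(list)  # undirected edge -> opposite vertices
    for t in triangles:
        for i in range(3):
            u, v, w = t[i], t[(i + 1) % 3], t[(i + 2) % 3]
            opposite[frozenset((u, v))].append(w)
    kept = set()
    for u, v in constraints:
        apexes = opposite.get(frozenset((u, v)), [])
        if len(apexes) != 2:
            continue  # boundary edge: only one adjacent triangle
        p, q = apexes
        if (in_circumcircle(points[u], points[v], points[p], points[q])
                or in_circumcircle(points[u], points[v], points[q], points[p])):
            kept.add(frozenset((u, v)))
    return kept
```

For example, with points {0: (0, 0), 1: (4, 0), 2: (2, 1), 3: (2, -1)} and triangles (0, 1, 2) and (0, 3, 1), the constraint (0, 1) is kept because vertex 3 lies inside the circumcircle of triangle (0, 1, 2).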
3.3 Constrained Gabriel Graph
An edge-constrained Gabriel graph is a graph that forces certain edges to be present
while being as close as possible to a Gabriel graph. By close as possible, we mean
that not every edge in the graph needs to be a Gabriel edge, in particular, constraint
edges and edges adjacent to constraints may be non-Gabriel edges. This implies
that the disc that normally represents an empty region of the Gabriel property may
contain other vertices.
Instead of the Gabriel property, we use a looser constrained Gabriel property.
Consider a plane graph G and an edge uv in G. The edge uv is a constrained Gabriel
edge if the disc C with u and v on its boundary and diameter ||u−v|| has the property
that any other vertex inside C has visibility to at most one endpoint: u or v.
Using the constrained Gabriel edge property, we can define the constrained Gabriel
graph. Consider a plane graph G = (V,E), where V is a vertex set and E is the con-
strained edge set. A constrained Gabriel graph CGG(G) is the plane graph where
each edge either belongs to E (the edges that must be present in the graph) or is a
constrained Gabriel edge. See figure 3.3 for an example. If E is the empty set, then
the CGG(V,E) is equivalent to the Gabriel graph over V .
Figure 3.3: Each edge in the CGG(V, E) either belongs to the constrained edge set, E, or is a constrained Gabriel edge. Edge rs belongs to E. The edge ps is a constrained Gabriel edge; any vertex inside the disc has visibility to at most one endpoint.
We can describe a constrained Gabriel graph algorithm for a plane graph G =
(V,E) using the constrained Delaunay triangulation, CDT (G). The algorithm is
conceptually simple. First, we construct the CDT(G), then we prune edges that
are neither constrained Gabriel edges nor constraint edges. The resulting graph contains
only edges which are constrained Gabriel edges or constraint edges belonging to E.
3.4 Minimum Constrained Gabriel Graph
For a plane graph G = (V,E), where V is the vertex set and E denotes the constraint
edges, the minimum constrained Gabriel graph is the graph MCGG(G) = (V, S)
where S ⊆ E, |S| is minimum and CGG(V,E) = CGG(V, S). So given S, to return
the graph to its original state, CGG(V,E), we apply the operation CGG(V, S).
The proof from Bose et al.[3] shows that non-locally Gabriel edges of CGG(G)
are the edges that form S, the minimum constrained edge set of MCGG(G). The
ideas are similar to the proof for the minimum Delaunay triangulation algorithm
where non-locally Delaunay edges make up the minimum constraint edge set.
Theorem 3.4. Let G = (V,E) be a plane graph and let S be the set of edges in E that
are non-locally Gabriel. S is the minimum set of constraints such that CGG(V,E) =
CGG(V, S).
A sketch of the proof of the minimality of S from theorem 3.4 is as follows. Firstly,
S is the set of edges of E that are non-locally Gabriel. This means we can augment
S by adding edges E \ S until we obtain E. Since S contains every non-locally Gabriel edge
of E, the edges in E \ S must be locally Gabriel. Therefore if we compute CGG(V, S), all edges
in S are, by definition of the algorithm, guaranteed to exist in CGG(V, S) while all other edges
are locally Gabriel. Hence, S is a sufficient set of constraint edges.
To show why S is the minimum set, assume there is an edge set S′ such that: S′
is not empty, S′ ⊂ S and G ⊆ CGG(V, S′). Let e′ = (u′, v′) be an edge of S \ S′. This
implies there exists a point p ∈ V such that a disc with diameter ||u′ − v′|| contains
p (by definition of the non-locally Gabriel property). Since G ⊆ CGG(V, S′), it
follows that e′ ∈ CGG({u′, v′, p}, ∅) but this is a contradiction because e′ is
non-locally Gabriel. Hence, there are no edges that can be taken away from S, so S is a
necessary set.
Surprisingly, algorithm 3.2 for computing a minimum constrained Gabriel
graph is similar to the minimum constrained Delaunay triangulation algorithm. As
it turns out, the CDT is an important step of the algorithm. We can prune all edges
from the CDT(G) except for non-locally Gabriel edges. This is possible because of
the proximity graph hierarchy that states: CGG(G) ⊆ CDT(G). It follows that if
an edge e is locally Gabriel in CGG(G), then e is also locally Delaunay in CDT(G).
However, the converse is not always true. A locally Delaunay edge e ∈ CDT(G)
is not necessarily a locally Gabriel edge, i.e. possibly e ∉ CGG(G). Refer to figure 3.4 for an
example.
Figure 3.4: The edge uv is a Delaunay edge but not a Gabriel edge.
Lemma 3.5. Given a plane graph G = (V,E), CGG(G) ⊆ CDT (G).
A proof for this is as follows: let e = (u, v) be an edge in CGG(G). If e is part of
the constrained edge set E, then e ∈ CDT(G) because the algorithm by definition
forces e to exist in its output. Otherwise, if e ∉ E, then e must be locally Gabriel
and by lemma 3.6, e must also be locally Delaunay.
Lemma 3.6. For an arbitrary triangulation T over a vertex set V, given a locally
Gabriel edge e = (u, v) and adjacent triangles △{u, v, p} and △{u, v, q}, e is a locally Delaunay
edge as well.
Since e = (u, v) is locally Gabriel, there is a disc C with ||u − v|| as diameter that is empty
of the points p and q. This implies that the circumcircle of (u, v, p) is empty of q
and the circumcircle of (u, v, q) is empty of p. Hence, e is also locally Delaunay.
3.4.1 Algorithm
Algorithm 3.2: Minimum edge-constrained Gabriel graph MCGG(G)
Input: plane forest G = (V,E)
Output: min. set S ⊆ E of constraints such that CGG(V,E) = CGG(V, S)
    Construct CDT(G)
    Initialize S = ∅
    foreach uv ∈ CDT(G) ∩ E do
        Consider triangles △(u, v, p) and △(u, v, q)
        Consider the disc C_{u,v} with diameter ||u − v||
        if p ∈ C_{u,v} or q ∈ C_{u,v} then
            set S ← S ∪ {uv}
        end
    end
    return S
To begin, algorithm 3.2 accepts as input a plane forest G = (V,E) and outputs
the minimum edge set S. Similar to the minimum constrained Delaunay triangula-
tion algorithm, the for loop checks for non-locally Gabriel edges in CDT (G) ∩ E,
returning them as S. We provide more details on the implementation in chapter 6.
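The Gabriel test in algorithm 3.2 reduces to a sign check: a point p lies strictly inside the disc with diameter uv exactly when the angle at p subtended by u and v is obtuse, i.e. when the dot product (u − p)·(v − p) is negative. The sketch below assumes, as an illustration only, that the CDT is given as a list of vertex-index triples and points as a mapping from index to (x, y); the function names are hypothetical.

```python
from collections import defaultdict


def in_diametral_disc(u, v, p):
    """True if p lies strictly inside the disc whose diameter is segment uv."""
    return (u[0] - p[0]) * (v[0] - p[0]) + (u[1] - p[1]) * (v[1] - p[1]) < 0


def minimum_cgg_constraints(points, triangles, constraints):
    """Return the constraints that are non-locally Gabriel in the given CDT."""
    opposite = defaultdict(list)  # undirected edge -> opposite vertices
    for a, b, c in triangles:
        for u, v, w in ((a, b, c), (b, c, a), (c, a, b)):
            opposite[frozenset((u, v))].append(w)
    kept = set()
    for u, v in constraints:
        for w in opposite.get(frozenset((u, v)), []):
            if in_diametral_disc(points[u], points[v], points[w]):
                kept.add(frozenset((u, v)))
                break
    return kept
```

Compared with the Delaunay variant, only the predicate changes; the traversal over CDT(G) ∩ E is the same.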
3.5 Constrained Minimum Spanning Tree
Let G = (V,E) be a plane graph over a vertex set V and edge set E. Let w(e), e ∈ E
be a function that maps an edge to a cost. A minimum spanning tree MST (G) is the
subgraph of G that is a spanning tree with minimum cost. A Euclidean minimum
spanning tree uses the Euclidean length of an edge as its cost. There are many
simple, efficient algorithms that compute minimum spanning trees.[2][18][21] We
focus on the Euclidean variant of Kruskal’s minimum spanning tree (see algorithm
3.3).
Minimum spanning trees have a useful property known as the cut property. For
MST(G), consider any cut C in G, i.e. a partition of V into two disjoint subsets. If e is the
edge crossing the cut C with the lowest cost among all edges crossing C, then e belongs
to MST(G). See figure 3.5 for an example of this property. Later on, we will
see that this property is useful in the proof outline for the minimum constrained
MST.
Figure 3.5: An example of the cut property. Edges in orange represent edges in the cut set, partitioning S and V. The lowest cost edge of that cut, cost 4, is in the MST.
When we refer to constrained minimum spanning tree, we mean an edge-constrained
Euclidean minimum spanning tree. We can think of a constrained minimum spanning
tree as forcing certain edges to exist in the spanning tree. Let F = (V,E) be
a plane forest where V is the vertex set and E is the constraint edge set. A constrained
minimum spanning tree CMST(V,E) is the MST(CDT(V,E)) such that
the constraint edges are present, i.e. E ⊆ CMST (V,E). If E is the empty set,
then CMST (V,E) is equivalent to MST (DT (V )), which is the Euclidean minimum
spanning tree over V .
The algorithm for CMST(V,E) is straightforward. It is the MST(CDT(V,E))
where the cost function maps e ∈ E to a value of 0 and all other edges have their
original cost. This causes the algorithm to include 0 cost edges in the MST over any
positive-cost edge. In Kruskal’s MST, we repeatedly add the minimum cost edge
that does not form a cycle to the MST until we span all vertices. We simply modify
Kruskal’s algorithm to treat constraint edges as having cost 0 when sorting the edges
by their cost.
Algorithm 3.3: Kruskal’s Euclidean Minimum Spanning Tree KruskalMST(V)
Input: Vertex set V
Output: minimum spanning tree E
    Let DS be an empty disjoint set
    Let DT(V) be the Delaunay triangulation of V
    foreach u ∈ V do
        DS.MakeSet(u)
    end
    Let H be a min-heap containing all edges e ∈ DT(V)
    Initialize E = ∅
    while e = (u, v) ← H.ExtractMin() not nil do
        if DS.Find(u) ≠ DS.Find(v) then
            set E ← E ∪ {e}
            DS.Union(u, v)
        end
    end
    return E
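The zero-cost modification can be sketched as a small variation on Kruskal's algorithm. The code below is an illustration under assumptions: the edge list (standing in for the edges of CDT(V,E)) is supplied by the caller, points maps a vertex identifier to (x, y), and the function name constrained_mst is hypothetical. It requires Python 3.8 or later for math.dist.

```python
import math


def constrained_mst(points, edges, constraints):
    """Kruskal over `edges`, treating every edge in `constraints` as cost 0."""
    forced = {frozenset(e) for e in constraints}

    def cost(e):
        u, v = e
        return 0.0 if frozenset(e) in forced else math.dist(points[u], points[v])

    parent = {v: v for v in points}  # minimal union-find

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for u, v in sorted(edges, key=cost):  # constraints sort first (cost 0)
        ru, rv = find(u), find(v)
        if ru != rv:  # different components: adding the edge forms no cycle
            parent[ru] = rv
            tree.append((u, v))
    return tree
```

Because the constraint edges form a forest, they never close a cycle among themselves, so all of them are accepted before any positive-cost edge is considered.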
Algorithm 3.4: Constrained Minimum Spanning Tree (backed by Kruskal’s MST) CMST(F = (V,E))
Input: Plane forest F containing vertices V and constraint edges E
Output: Constrained minimum spanning tree T
    Construct CDT(F)
    Let DS be an empty disjoint set
    foreach u ∈ V do
        DS.MakeSet(u)
    end
    foreach e ∈ CDT(F) do
        if e ∈ E then
            w(e) ← 0
        end
    end
    Let H be a min-heap initialized with edges from CDT(F)
    Initialize T = ∅
    while e = (u, v) ← H.ExtractMin() not nil do
        if DS.Find(u) ≠ DS.Find(v) then
            set T ← T ∪ {e}
            DS.Union(u, v)
        end
    end
    return T
3.6 Minimum Constrained Minimum Spanning Tree
We can think of the minimum constrained MST as a graph with a subset of edges
from CMST(F), where F = (V,E) is a plane forest, V is a vertex set and E is the
constraint edge set. In more formal terms, the minimum edge-constrained MST,
MCMST(F), is a graph such that its edge set S is the minimum subset of E that
is able to reconstruct CMST(V,E). This means computing CMST(V, S) gives the
equivalent graph CMST(V,E). It does so by identifying edges in the MST that can
be recalculated by the CMST algorithm, i.e. we do not need to include in S edges
that are part of MST(DT(F)).
The MCMST (F ) has a few other implications. We know the minimum edge
set, S, creates the smallest subset: S ⊆ E. In addition, F = (V,E) ⊆ CMST (V,E)
by the CMST definition, so it follows that since CMST (V,E) = CMST (V, S), F ⊆
CMST (V, S). If F is a spanning tree then F = CMST (F ) because the cost of all
edges in E is 0, making it the minimum cost spanning tree.
To show that S is sufficient, we show that F = (V,E) ⊆ CMST(V, S). Consider
the edge e = (u, v) belonging to E.
• If e ∈ S, then e ∈ CMST(V, S) by definition of the CMST(V, S) algorithm
• If e ∉ S, we focus on what CMST(V, S) does to augment S such that e ∈
CMST(V, S). We show this through proof by contradiction.
For e ∈ E but e ∉ S, assume e is not part of CMST(V, S). The CMST(V, S)
assigns to all edges e_s ∈ S a cost of 0, then performs MST(CDT(V, S)). Consider
any cut U, V \ U such that u ∈ U and v ∈ V \ U in the graph CDT(V, S). Let edge
e′ ∉ E be in the cut. Adding e forms a cycle created by MST(CDT(V, S)) ∪ {e}.
However, this is a contradiction: cost(e) ≤ cost(e′). Hence, according to the cut
property, e ∈ MST(CDT(V, S)).
Now, we sketch the proof for why S is minimum. Assume there is a non-empty
edge set S′ such that F ⊆ CMST(V, S′) and |S′| < |S|. Put another way, we say
that S′ is a subset of S that is able to reconstruct CMST(V,E), meaning that we
can take away at least one edge from S. To show that S′ is impossible, we show a
contradiction to the statement: F ⊆ CMST(V, S′).
Let e = (u, v) be an edge of S \ S′. This edge must come from F, i.e. e ∈ F,
otherwise if e ∉ F, then e does not exist in S and neither would it exist in S′.
Hence, we consider a scenario for CMST(F) where it has a cut U, V \ U such that
u ∈ U and v ∈ V \ U. The presence of e ∈ S means that it has higher cost and
takes the place of a lower cost edge e′, e′ ∉ E, from the cut. Adding e′ back forms
a cycle created by CMST(F) ∪ {e′}, where cost(e) > cost(e′). So, CMST(V, S′)
produces the graph containing e′ but this contradicts CMST(V, S′) = CMST(V, S)
since CMST(V, S) does not contain e′. Therefore, there is no smaller subset S′ and
S is minimum.
We now address the relationship that CMST (F ) has with CDT (F ). The lemma
3.7 tells us that for an arbitrary triangulation, a CMST ⊆ CDT . Note that a CDT
does not have intersections, hence CMST, a subgraph of CDT, also will not have
intersections.
Lemma 3.7. CMST (F ) ⊆ CDT (F ) where F = (V,E) is a plane forest, V is a vertex
set and E is a constraint edge set.
Consider an edge e = (u, v) belonging to CMST (F ). There are 2 cases to
consider:
• If e ∈ E, then e ∈ CDT (F ) by definition of the algorithm, i.e. e is constrained
in the Delaunay triangulation output.
• If e ∉ E, then we need to show that e ∈ CDT(F), which is the same as proving
the non-constrained version, MST(F) ⊆ DT(F). Assume e ∈ MST(F) but
e ∉ DT(F). We use the lemmas below to show a contradiction.
Figure 3.6: Both the distance from uv → vp and vu→ up is greater than up→ pv.
Lemma 3.8. A Gabriel edge is a Delaunay edge. Consider vertices u and v in plane
graph G = (V,E) and the disc C that lies on vertices u, v with diameter ||u− v||. If C
does not contain any other vertex in V , then e = (u, v) is a Delaunay edge in DT (G)
by the Delaunay property.
Lemma 3.9. Heavy edge property: consider a cycle Y = {v1, ..., vn} in a plane graph
G = (V,E) where V is the vertex set and E is the edge set with a cost associated with
each edge. If e_heavy is the edge with the highest cost in Y, then e_heavy is not part of
MST(G).
Since we assume e = (u, v) is not a Delaunay edge, by the contrapositive of lemma 3.8, if we
form a disc C that lies on u, v with diameter ||u − v||, there must be some other point p
inside of C. Now consider the cycle {u, v, p} (see figure 3.6). The distance from either u or v
to p is shorter than the distance |uv|, so uv is the edge with the highest cost (by Euclidean
distance) in this cycle. By lemma 3.9, uv cannot be part of MST(F), which contradicts
our assumption that e ∈ MST(F). Hence, e ∈ DT(F).
Algorithm 3.5: Minimum edge-constrained Minimum Spanning Tree MCMST(F)
Input: plane forest F = (V, E)
Output: minimum set S ⊆ E of constraints such that CMST(V, E) = CMST(V, S)
Construct T′ = MST(CDT(F))
Construct CMST(F) = MST(CDT◦(F))
Initialize S = ∅
foreach e′ ∈ T′ do
    if CMST(F) ∪ {e′} creates a cycle ce′ then
        foreach e ∈ ce′ ∩ E do
            if w(e) > w(e′) then
                set S ← S ∪ {e}
            end
        end
    end
end
return S
3.6.1 Algorithm
Construction of a minimum edge-constrained MST MCMST (F ) takes as input a
plane forest, F = (V,E) and returns the minimum constraint edge set S. Much
like the 2 previous minimum constraint algorithms, the MCMST (F ) starts with
the construction of CDT (F ). The difference here is that we use CDT (F ) twice -
once to assemble MST (CDT (F )) and once more to assemble MST (CDT ◦(F )), the
former MST being a regular MST and the latter having a special cost function that
assigns 0 cost to all edges belonging to E.
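The two trees differ only in the cost function handed to the MST routine. A minimal C++ sketch of that idea follows, using Kruskal's algorithm over a plain edge list that stands in for the edges of CDT(F). The `Edge` struct, its `constrained` flag, the `DisjointSets` helper and the `zero_cost_constraints` switch are hypothetical names introduced for illustration; this is not the thesis implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical edge record: endpoints, Euclidean weight, membership in E.
struct Edge {
    std::size_t u, v;
    double weight;
    bool constrained;
};

// Minimal union-find used by Kruskal's algorithm.
struct DisjointSets {
    std::vector<std::size_t> parent;
    explicit DisjointSets(std::size_t n) : parent(n) {
        std::iota(parent.begin(), parent.end(), std::size_t(0));
    }
    std::size_t find(std::size_t x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }
    bool unite(std::size_t a, std::size_t b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;  // already connected: edge would close a cycle
        parent[a] = b;
        return true;
    }
};

// Kruskal's MST over n vertices. When zero_cost_constraints is true, every
// edge of E is treated as costing 0 (the special cost function); otherwise
// each edge keeps its own weight (the regular MST).
std::vector<Edge> kruskal(std::size_t n, std::vector<Edge> edges,
                          bool zero_cost_constraints) {
    auto cost = [zero_cost_constraints](const Edge& e) {
        return (zero_cost_constraints && e.constrained) ? 0.0 : e.weight;
    };
    std::stable_sort(edges.begin(), edges.end(),
                     [&cost](const Edge& a, const Edge& b) {
                         return cost(a) < cost(b);
                     });
    DisjointSets sets(n);
    std::vector<Edge> tree;
    for (const Edge& e : edges)
        if (sets.unite(e.u, e.v)) tree.push_back(e);
    return tree;
}
```

Calling `kruskal` twice on the same edge list, once with the flag off and once with it on, yields the two trees that the rest of the algorithm compares.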
The next step initializes our constraint edge set, S, to the empty set. Shortly, we
will iteratively add edges to S, but first we require a link cut tree, or dynamic tree
as it is sometimes known. Using MST (CDT ◦(F )), we pre-populate the link cut
tree. Each vertex in MST (CDT ◦(F )) corresponds to a node in the link cut tree. If
vertex v has an edge to vertex u, the link cut tree also has an edge from the node
representing v to the node representing u, with cost equal to the Euclidean distance
|uv|.
Figure 3.7: A) represents T′ = MST(CDT(F)). B) represents MST(CDT◦(F)). C) and D) show the addition of a red edge from T′ on to MST(CDT◦(F)) that causes a cycle. E) the minimum constrained set S.
In this step, the algorithm loops over all edges e′ ∈ T′ = MST(CDT(F)). (An
edge that exists in both MST(CDT(F)) and MST(CDT◦(F)) is surely not a constraint
edge.) Using e′ = (u, v), if the nodes u and v already belong to the same connected
component of the link-cut tree of CMST(F), then adding e′ must create a
cycle. Still, this alone does not indicate that the edges in the cycle Ce′ formed by e′
belong in S; we must look at the cost of the edges in the entire cycle.
Within the cycle, we are interested in edges that have Euclidean distance greater
than e′ and belong to E, i.e. edges in E ∩ Ce′. These edges are added to S. Once
the outer for loop terminates, the minimum constraint edge set S is returned. We
provide an example in figure 3.7.
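The loop described above can be sketched in C++ as follows. As a simplifying assumption, the sketch replaces the link cut tree with a plain depth-first search for the tree path between u and v, which costs linear time per query instead of the logarithmic time a link cut tree is designed for. The names `Edge`, `find_path` and `minimum_constraints` are hypothetical, and the recursive search is only suitable for small inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Edge {
    std::size_t u, v;
    double weight;     // true Euclidean weight, also for constraint edges
    bool constrained;  // true if the edge belongs to E
};

typedef std::vector<std::vector<std::pair<std::size_t, std::size_t> > > Adjacency;

// Collects the indices of tree edges on the unique path from `from` to
// `target`. adj[x] holds (neighbor, tree edge index) pairs.
bool find_path(const Adjacency& adj, std::size_t from, std::size_t target,
               std::size_t parent, std::vector<std::size_t>& path) {
    if (from == target) return true;
    for (const auto& next : adj[from]) {
        if (next.first == parent) continue;  // do not walk back up the tree
        path.push_back(next.second);
        if (find_path(adj, next.first, target, from, path)) return true;
        path.pop_back();
    }
    return false;
}

// Returns the indices (into cmst) of the constraint edges placed in S.
std::vector<std::size_t> minimum_constraints(std::size_t n,
                                             const std::vector<Edge>& cmst,
                                             const std::vector<Edge>& t_prime) {
    Adjacency adj(n);
    for (std::size_t i = 0; i < cmst.size(); ++i) {
        adj[cmst[i].u].push_back(std::make_pair(cmst[i].v, i));
        adj[cmst[i].v].push_back(std::make_pair(cmst[i].u, i));
    }
    std::vector<bool> in_s(cmst.size(), false);
    for (const Edge& e_prime : t_prime) {
        std::vector<std::size_t> path;  // the cycle minus e' itself
        // n is used as a sentinel "no parent" value for the start vertex.
        if (!find_path(adj, e_prime.u, e_prime.v, n, path)) continue;
        for (std::size_t idx : path)
            if (cmst[idx].constrained && cmst[idx].weight > e_prime.weight)
                in_s[idx] = true;
    }
    std::vector<std::size_t> s;
    for (std::size_t i = 0; i < in_s.size(); ++i)
        if (in_s[i]) s.push_back(i);
    return s;
}
```

When e′ is itself a tree edge, the path consists of that single edge, whose weight is not greater than w(e′), so nothing is added to S in that case.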
Chapter 4
Geometry Platforms and
Technologies
4.1 Survey of Geometry Platforms
There are many factors to consider in selecting a platform for an application. One
needs to assess the main objectives and prioritize them. For example, is software
reliability a concern or is performance the greater concern? There are no silver bullets
for choosing a platform, nor is there a technology that fits every need. The programmer
must be aware of these trade-offs and make the decision consciously using the resources
available online, aligning the choice as closely as possible with their values.
For the applications in this thesis, we value two characteristics:
• our applications should be efficient and competitive compared to existing ap-
plications, if they exist
• our applications should leverage an existing framework for future extension
and take advantage of well-tested software
Both goals are in line with the applications that are typically written for the
purpose of an experimental study. That is to say, the application is written for
performance and accuracy, but without the rigor required of commercial-quality software.
On the first objective, we strive for results that are competitive in the sense that
different algorithms that produce the same output can be compared against ours, giving
researchers a better idea of their capabilities. To do this, we gather the execution
time of each algorithm. We also consider graphs with different characteristics to
give a wider variety of performance metrics.
On the second objective, we use frameworks to leverage existing features. There
are a handful of computational geometry packages or libraries that offer the neces-
sary tools to build our application. Unfortunately, it is rare to find a framework that
meets all of our needs. Well-designed frameworks are extremely difficult to craft;
they often suffer from lack of resources, funding, programmers with interest or will-
ingness to donate their time, and experts in the domain. Framework architects must
juggle between a variety of different considerations, including but not limited to,
feature richness of their API (application programming interface), maintainability
of their codebase, execution performance and responsiveness.
We looked at 5 different computational geometry libraries: CGAL, Boost Geometry,
LEDA, JTS (Java Topology Suite) and Shapely, across 3 major programming
languages: C++, Java and Python. We provide details on each library in the
upcoming sections.
4.1.1 CGAL (Computational Geometry Algorithms Library)
CGAL is a software library that offers a wide field of algorithms and data struc-
tures, in particular for geometric processing, geographical information systems,
computer graphics, molecular biology, medical imaging and robotics (see figure
4.1). Their library supports essential operations for triangulations, Voronoi dia-
grams, boolean operations on polygons and polyhedra, point set processing, ar-
rangements of curves, surface and volume mesh generation, geometry processing,
alpha shapes, convex hull algorithms, shape analysis, AABB and KD trees.[10]
CGAL is arguably the largest available library in feature richness, maturity and
active user community. Starting from a European consortium of 8 research
institutions and universities, CGAL has since grown into its own project. Their initial
release was in 1996 and now they include many new contributors, a 13 member
editorial board and over 30 developers and reviewers from around the world.
They provide monthly releases with bug fixes and new features. Their library
is free to use for academics; however, the option to license it for commercial use
in for-profit software is also available. The library targets the C++ language
as its main execution platform, but support for Java and Python is available
through interface bindings. The current version, 4.11, is available under a GNU
General Public License (GPL) or GNU Lesser General Public License (LGPL).
4.1.2 Boost Geometry
Boost Geometry is part of the larger Boost project, a ubiquitous collection of li-
braries in the programming world of C++ (see figure 4.2). (Many founders of
Boost sit on the C++ standards committee). Boost’s goal is to construct robust li-
Figure 4.1: A description of the triangulation package for CGAL
braries that support features such as tasks, data structures, linear algebra solvers,
pseudorandom number generation, multithreading, image processing, regular
expressions, and unit testing. Currently, they offer over eighty individual libraries.
Boost Geometry provides dimension modules, coordinate modules and a scalable
kernel, based on meta-functions and tag dispatching. Their geometry shape
models support many operations, including the calculation of area, length,
perimeter, centroid, convex hull, intersection (clipping), point in polygon check,
distance, envelope (bounding box), and transformations. For precise arithmetic
numbers and calculations, the standard math library can be replaced with the
numerical precision library, ttmath.[15]
According to the developers of Boost, the predominant use case for Boost Ge-
ometry is geographical information systems. Their API design allows for quick and
easy geometric and spatial operations. However, the authors do not exclude Boost
Geometry from other usage. In fact, the library is generic and has support for other
topics such as game development, computer graphics, robotics and astronomy.
The Boost library was first released in 1999 as open source, free to use both among
academics and in proprietary software projects. As the collection evolved and grew,
a decade later, the Boost Geometry library was released. Unsurprisingly, the ubiq-
Figure 4.2: A small handful of the Boost Geometry concept and model offerings.
uity of Boost libraries in the C++ language made it an integrated part of the CGAL
library. Users of CGAL are likely to directly or indirectly use Boost components.
4.1.3 LEDA (Library of Efficient Data types and Algorithms)
Similar to CGAL, LEDA is a library that provides precise algorithms and data struc-
tures for graphs, topology and networks (see figure 4.3). This includes algorithms
for both 2D and 3D convex hulls, triangulation, dual graphs, closest pair and Minkowski
sums. The library also offers basic data structures, such as sequences, dictionar-
ies, trees, points, and segments, that normally are available in a library of this
maturity.[25]
LEDA was originally developed as a project from the Max Planck Institute for
Informatics in Saarbrücken, Germany. In 2001, it was purchased by Algorithmic
Solutions Software GmbH. They are the current developers and maintainers of the
source base. The source code for the library is proprietary but can be purchased at
a steep cost of approximately 6000 Euros.
LEDA supports all common C++ compilers and the following operating systems:
Figure 4.3: Some of the geometry algorithms provided by LEDA.
Windows, Linux and Unix. Similar to CGAL’s licensing model, LEDA offers free
licensing to researchers, individuals or non-profit organizations. Their free-to-use
version, however, is greatly limited in features and capabilities; among the omissions are
important graph algorithms and graph data structures that are necessary for our
application. Regrettably, the price for the lowest cost package that includes graph
features is prohibitively expensive at 1200 Euros per team license.
4.1.4 JTS (Java Topology Suite)
There are few geometry libraries natively written for Java. JTS is arguably the
most extensive and well-maintained to date. It is an open source (GNU Lesser
General Public License) library that supports Euclidean planar linear geometry to-
Figure 4.4: A snapshot of Delaunay triangulation classes offered by the JTS package.
gether with a set of fundamental geometric functions. The creators claim that JTS
should be used as a core component of vector-based software such as geographical
information systems (see figure 4.4). It also provides a general-purpose algorithm
library for computational geometry.[31]
The JTS project initially was funded by GeoConnections, a Canadian national
program with responsibilities in geospatial data research, and by the province of
British Columbia. Now it is developed by a proprietary software company, Vivid
Solutions and more recently, LocationTech. The source code remains publicly avail-
able.
4.1.5 Shapely
Shapely is an open source BSD-licensed Python package for manipulation and anal-
ysis of planar geometric objects, available as an extension to native Python pack-
ages. It is strongly influenced by GEOS, the C++ port of JTS. The library supports
geometric data models, set operations and linear algebra operations (see figure
4.5). Also available are GIS algorithms and operations provided by GEOS, the geometry
engine of PostGIS. Some computational geometry algorithms may be lacking in comparison to
LEDA or CGAL.[16]
Figure 4.5: Search results for Python Delaunay triangulation packages.
The source code for Shapely is available online under an open source license.
It is currently maintained by a large group of Python community members. Initial
funding for Shapely is largely attributed to the U.S. National Endowment for the
Humanities.
4.2 Selected Platform and Technologies
Our current implementation uses CGAL 4.9 and Boost Geometry 1.59.0 in a C++
environment. CGAL has nearly two decades of software maturity, exceeding that
of other geometric libraries. As well, the C++ language compiles to code that
runs natively on most operating systems, guaranteeing that no virtual machine
translation occurs during execution, which is a possible area of slowdown for interpreted
languages.[27] Lastly, CGAL now includes and supports Boost Geometry data
structures and algorithms. This is a major incentive for selecting CGAL. Boost Geometry
complements CGAL by including graph data structures in the form of adjacency list,
adjacency matrix and custom graphs. These additions make CGAL and Boost the
platform of choice.
As for the Python alternative, Shapely, it is missing major algorithms that are
crucial, namely constrained graph algorithms. To find alternatives in the respective
platform, we need to investigate 3rd party libraries, such as OrbisGIS or generalized
GIS libraries. Unfortunately, the feature depth of these libraries is only a subset
of the features in CGAL; however, in the future, these libraries could become a
mainstay in computational geometry for Python users. At the very least, Shapely is
easily integrated into Python’s main packages.
Licensing costs are also considered. Of the libraries we examined, most were
free for academic use and provided open source code, with the exception of LEDA.
Portions of LEDA are free, but the computational geometry library is not. Disap-
pointingly, LEDA licenses are prohibitively priced, making it difficult to choose LEDA
as our platform. The work done in this thesis is for the enrichment of computa-
tional geometry and there are no plans to commercialize or distribute our software
for profit. With this in mind, any library that is free-to-use and supports academic
research is highly regarded.
Chapter 5
Data Structures and Functions
5.1 Graph Data Structure
One major incentive for using the Boost Graph and Geometry library is its
support for well-documented and flexible features. The Boost committee philosophy
encourages the reuse of common data structures and algorithms; as such, they
provide generic and standardized interfaces for programmers. This means their graph
interface abstracts the implementation details away from the programmer, leaving
only the necessary set of functions to call. For the majority of our use cases, we
require only their C++ headers, exposing the graph and geometry API (application
programming interface).
In the Boost Graph and Geometry library, there are different implementations
for graph data structures. They offer 2 standard graph objects:
• Adjacency List
• Adjacency Matrix
Figure 5.1: Tables 5.1 and 5.2 show different internal representations of the same directed graph

Table 5.1: Adjacency List
Node   Edges
A      {B, C}
B      {C}
C      {}
D      {}

Table 5.2: Adjacency Matrix
       A   B   C   D
A      -   X   X   -
B      -   -   X   -
C      -   -   -   -
D      -   -   -   -
An adjacency list is a general all-purpose representation of a graph. It stores
vertices in a simple one-dimensional data structure, such as an array, linked-list or
hash table. For each vertex, a secondary data structure stores edges entering or
leaving this vertex. Typically, this secondary data structure is an array or linked-list.
Adjacency lists may allow a variety of configurations such as directed or undirected
edges, multi-edge support or improvements to insertion, searching or deletion of
an edge or vertex. Most of these configurations can be found in the Boost Graph
library in their API documentation.
The adjacency matrix is a common alternative graph data structure. The graph is
stored in a |V| × |V| matrix data structure, such as a 2 dimensional array or
nested hash table, where |V| is the number of vertices. An edge uv is located at the
row denoting u and the column denoting v. Representing
the presence of an edge is as simple as flipping a flag in that position or storing a
reference to an object that represents the edge with various properties. The former
has the advantage of requiring the least amount of memory, while the latter is far
more flexible and descriptive in the data that can be stored. The main advantage an
adjacency matrix has over an adjacency list is that it improves insertion, deletion
and lookup speed to O(1) if the graph size is known ahead of time. This is
especially useful for dense graphs, where the number of edges approaches |V|^2. The
disadvantage is the memory requirement; often |V|^2 memory is prohibitive for large
graphs.
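To make the trade-off concrete, the following C++ sketch builds both representations for the directed graph of tables 5.1 and 5.2. It uses standard library containers rather than the Boost graph classes, and the type aliases and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Vertices A, B, C, D are numbered 0..3; the edges are A->B, A->C, B->C.
using AdjacencyList = std::vector<std::vector<std::size_t>>;
using AdjacencyMatrix = std::vector<std::vector<bool>>;

AdjacencyList make_list() {
    AdjacencyList list(4);
    list[0].push_back(1);  // A -> B
    list[0].push_back(2);  // A -> C
    list[1].push_back(2);  // B -> C
    return list;
}

// Expands the list into a |V| x |V| matrix of flags.
AdjacencyMatrix to_matrix(const AdjacencyList& list) {
    AdjacencyMatrix m(list.size(), std::vector<bool>(list.size(), false));
    for (std::size_t u = 0; u < list.size(); ++u)
        for (std::size_t v : list[u]) m[u][v] = true;
    return m;
}

// Lookup in the list scans the out-edges of u: linear in the degree of u.
bool list_has_edge(const AdjacencyList& list, std::size_t u, std::size_t v) {
    for (std::size_t w : list[u])
        if (w == v) return true;
    return false;
}
```

The matrix answers `m[u][v]` in constant time but always holds |V|^2 cells, while the list holds one entry per edge and pays for lookups with a scan.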
For our experiment, both graph data structures were examined. The predominant
factor for deciding which would be used came down to runtime performance and
memory consumption. When we performed initial experiments, we populated the
adjacency matrix with a large graph containing over 60000 edges. This allocated
pointers in memory for over 60000^2 edge objects, promptly causing the
application to crash with an insufficient memory exception. Unfortunately, at 4 bytes per
pointer multiplied by 60000^2 edges, the adjacency matrix required over 14GB of
memory. This immediately made the usage of the adjacency matrix unsuitable for
large datasets under most non-commercial hardware configurations.
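The figure can be checked with a short calculation: 60000^2 = 3.6 × 10^9 cells, and 3.6 × 10^9 × 4 bytes = 1.44 × 10^10 bytes, or about 14.4GB. The C++ sketch below reproduces this arithmetic; `matrix_bytes` is a hypothetical helper and the inputs are the values quoted in the text.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for an n x n matrix whose cells are pointer_bytes wide.
// 64-bit arithmetic is used because the result overflows 32 bits.
constexpr std::uint64_t matrix_bytes(std::uint64_t n,
                                     std::uint64_t pointer_bytes) {
    return n * n * pointer_bytes;
}
```

For n = 60000 and 4-byte pointers this evaluates to 14,400,000,000 bytes.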
During initial testing and implementation work, the Boost graph adjacency list
was the data structure of choice by process of elimination. Its configuration for
our algorithm is: undirected and without multi-edges. For both underlying
vertex and edge data structures, Boost vector data structures were used. This meant
that traversal over vertices to find a particular vertex, as well as the operations
add_vertex() and remove_vertex(), are done in linear time. To find, add or delete an
edge, the adjacency list must traverse all vertices to find the vertex pair that
corresponds to the edge. Following this, an additional linear time iteration is made
across the vector containing edges corresponding to that vertex. In the worst case,
this operation visits every vertex and every edge. As we will see later, this leads to
significant slowdown in our application.
We now explain the pitfalls of the Boost graph adjacency list and why it was
discarded in the final implementation. At many points throughout the algorithm, a find
or access operation to an edge is made. In particular, during Kruskal’s minimum
spanning tree, we accessed all weight-sorted edges in the Delaunay triangulation.
Due to a generalized graph interface, both Boost adjacency list and Boost
adjacency matrix share the same edge access function signature and as a result have
slower generalizations in their implementation. Experimental benchmarks
applying Kruskal’s MST using a Boost adjacency list led to significantly slower results
than our custom adjacency list. In fact, for datasets with approximately 1000
vertices and approximately 1000 edges, the Boost adjacency list took over 1 minute to
complete. Our custom adjacency list took only seconds. Considering this
slowdown across other functions of the algorithm, using Boost adjacency list was
infeasible for larger real-world datasets, which are tens of thousands of vertices and
edges in size.
While it would seem Boost adjacency list is the only option that allows easy
integration with CGAL functions that accept Boost graphs as part of their parameter,
we also considered using our own custom graph. As it turns out, this yielded far
better performance in our experiments than Boost graph data structures because of
optimizations that could be made. (A finely tuned data structure will nearly always
perform better than a general, adaptive data structure at the expense of extra work
and tuning).
Our custom adjacency list uses vectors for the underlying vertex and edge data
structure. Vertices within the vector are ordered by their index position, allowing
for quick access to a vertex given an index. This small optimization alone leads
to significant performance improvements. In the next subsections, we discuss the
implementation details.
5.2 Graph Containers
5.2.1 Vertex
In our custom graph representation, a vertex is simply a size_t type, the platform
dependent C++ alias which on our platform represents an unsigned long long
(guaranteed by the C standard to be at least 64 bits). Each vertex has a unique id value.
Randomly generated vertices will have an ascending id from 0 to n − 1, where n
is the number of vertices. Likewise, when vertices are loaded from file, they are
assigned an id from 0 to n − 1 in the order they are read. To describe properties for
a vertex, a Boost vector, aliased VertexVector, stores a pointer to a vertex struct, a
lightweight data type for grouping variables in C. A struct’s id is the index within its
parent vector, i.e. properties for vertex with id = 5 are found at index 5 of the
vector. Currently, the only property we need is the Cartesian position, a CGALPoint
data structure which stores x and y values.
In addition, each vertex has 2 useful operators: an equality function and a hash
function. The equality function, given two points, returns a boolean indicating whether
they share the same x and y values. The hash function generates a hash for a given
vertex using its x and y values. This is useful for quickly determining if the same
vertex already exists in a hash table. The hash operator is also useful by extension
for the edge equality function and hash function, as we will see shortly.
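The two vertex operators can be sketched in C++ as follows. This is an illustration under assumptions: a plain x/y struct stands in for the CGALPoint type, and the way the two coordinate hashes are mixed is a common hash-combining idiom, not something the text specifies. The names `Vertex` and `VertexHash` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical stand-in for the vertex struct; x and y replace CGALPoint.
struct Vertex {
    double x, y;
};

// Two vertices are equal when they share the same x and y values.
inline bool operator==(const Vertex& a, const Vertex& b) {
    return a.x == b.x && a.y == b.y;
}

// Hash built from the x and y values. The mixing step is an assumption.
struct VertexHash {
    std::size_t operator()(const Vertex& v) const {
        std::size_t h = std::hash<double>()(v.x);
        h ^= std::hash<double>()(v.y) + 0x9e3779b9 + (h << 6) + (h >> 2);
        return h;
    }
};

// A hash table keyed by vertex position, e.g. for duplicate detection.
using VertexSet = std::unordered_set<Vertex, VertexHash>;
```

Inserting a vertex into a `VertexSet` and checking the returned flag is enough to detect whether a vertex at the same position was already present.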
5.2.2 Edge
We represent an edge as a simple struct. Its members are two size_t variables,
for vertex v1 and vertex v2, and an edge weight. The struct represents the edge
weight as a double in inexact arithmetic and as a CGAL Gmpq type (an arbitrary
precision rational number) in exact arithmetic. We offer a constructor that accepts
as parameters all members (see figure 5.2). We store edge structs in a Boost vector
aliased as EdgeVector. A graph is said to contain an edge between v1 and v2 if there
exists an edge in the EdgeVector with values v1 and v2.
Similar to a vertex, an edge has both equality check and hash operators. They
extend the vertex operators in that two edges, v1v2 and v3v4, are equal if either of
the following is true:
• v1 equals v3 and v2 equals v4
Figure 5.2: Edge struct representation in code
• v1 equals v4 and v2 equals v3
Because edges are undirected, the edge equality operation returns true whether the
edge constructor is called with endpoints v1, v2 or v2, v1. This also
holds true for the edge hash operator, which relies on the vertex hash operator.
To create an edge hash, we invoke the hash operator of both of its vertices and
sum their values. For example, suppose hash(v1) = a and hash(v2) = b, then
hash(v1v2) = hash(v1) + hash(v2) = a + b. This ensures that the hash value for
an edge storing vertices in the configurations v1v2 and v2v1 both produce the exact
same hash. We use this hash operator in conjunction with a hash table to quickly
find an edge in constant time, e.g. to determine if there are duplicate edges or for
storing and retrieving an edge.
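A minimal C++ sketch of the undirected edge operators follows. As an assumption for brevity, the endpoints are hashed by their ids with `std::hash` instead of by vertex coordinates; the symmetric sum is the part taken from the text. The names `Edge`, `EdgeHash` and `EdgeSet` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

struct Edge {
    std::size_t v1, v2;
    double weight;
    Edge(std::size_t a, std::size_t b, double w) : v1(a), v2(b), weight(w) {}
};

// Undirected equality: v1v2 equals v3v4 in either endpoint order.
inline bool operator==(const Edge& a, const Edge& b) {
    return (a.v1 == b.v1 && a.v2 == b.v2) || (a.v1 == b.v2 && a.v2 == b.v1);
}

// hash(v1v2) = hash(v1) + hash(v2). Addition is commutative, so both
// endpoint orders produce the same value. Unsigned overflow wraps around.
struct EdgeHash {
    std::size_t operator()(const Edge& e) const {
        std::hash<std::size_t> h;
        return h(e.v1) + h(e.v2);
    }
};

using EdgeSet = std::unordered_set<Edge, EdgeHash>;
```

One consequence of summing is that distinct edges may share a hash value, for example (1, 4) and (2, 3) when the vertex hash behaves like the identity; the equality check still tells them apart, at the cost of a few extra comparisons.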
The last property of an edge is a weight variable. This double primitive type
variable stores the weight or Euclidean distance of the edge. Technically, there
are no restrictions on whether they must be actual distances, squared distances or
CGAL’s exact distances, which produce correct arithmetic results regardless of
precision loss from rounding. The only caveat is that the choice must be consistent
throughout the application. Currently, we support squared distances and use exact distances
whenever possible. It is worth noting edge weights do not need to be computed
immediately - they can be evaluated lazily when needed.
5.2.3 Vertex Container
To represent a vertex set, we use a simple Boost vector, aliased VertexVector, to store
all vertices belonging to this set, allowing O(1) access given a vertex id (which
corresponds one-to-one with a vector index). Iteration across this vector is done
simply by a loop or using Boost iterators. (An iterator provides an abstract access
interface to an underlying collection, including functions for traversing across its
elements.) Iterators are the most common form of access for the VertexVector. For
the most part, the algorithm does not modify or change the vertex set. It is also
possible to store the vertices in a hash table using the vertex hash operator, but we
did not find this to be necessary.
To manipulate a VertexVector, we provide a few useful functions. Firstly, an add
vertex function takes as parameters a single vertex v and a VertexVector to append
to. The VertexVector adds v to the end of the vector, without checking for duplicate
vertices. By duplicate, we mean an existing vertex with the same pointer value. A
delete vertex function is also available, taking the same parameters. In linear time,
we iterate the VertexVector to find and delete a vertex that satisfies the equality
function. If no vertex is found, the function simply returns.
5.2.4 Edge Container
In our application, edge sets take on different representations depending on their
purpose. Much like the VertexVector, our standard edge set uses a Boost vector,
aliased EdgeVector, except that it contains edges instead of vertices. Using an index
accessor, edge retrieval is O(1) if the edge id is known. To iterate across edges,
we use either a loop or iterator. For certain cases, there is a need to quickly find
an edge by reference or hash value, e.g. if two edge pointers refer to the same
object. For this, a hash map with the pointer value or hash value of the edge is the
data structure of choice. This also allows for a direct mapping between an edge
and another object. For example, we perform this type of mapping frequently when
converting back-and-forth between CGAL edge types and our own, requiring O(n)
time where n is the number of edges to convert.
To make our lives easier, EdgeVectors support a variety of functions. The add
edge function receives as parameters an edge e and an EdgeVector, appending e to
the end of the vector. A delete edge function takes as parameter an edge e and an
EdgeVector. The function checks for membership of e using the equality check by
iterating across all edges. If e is found, we remove it from the EdgeVector. If no
edge is found, the function will return. When the application terminates, we invoke
a clean up function, which iteratively removes all edges from an EdgeVector and
deletes them to ensure proper memory management.
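The add and delete functions can be sketched in C++ as below. The sketch makes two simplifying assumptions: it stores edges by value in a `std::vector` rather than pointers in a Boost vector, so no clean up function is needed, and `delete_edge` reports whether anything was removed instead of returning nothing. All names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Edge {
    std::size_t v1, v2;
    double weight;
};

// Undirected equality check on the endpoints only.
inline bool same_edge(const Edge& a, const Edge& b) {
    return (a.v1 == b.v1 && a.v2 == b.v2) || (a.v1 == b.v2 && a.v2 == b.v1);
}

using EdgeVector = std::vector<Edge>;

// Appends e to the end of the vector without checking for duplicates.
void add_edge(const Edge& e, EdgeVector& edges) {
    edges.push_back(e);
}

// Linear scan with the equality check; removes the first match, if any.
bool delete_edge(const Edge& e, EdgeVector& edges) {
    EdgeVector::iterator it =
        std::find_if(edges.begin(), edges.end(),
                     [&e](const Edge& other) { return same_edge(e, other); });
    if (it == edges.end()) return false;  // nothing to delete
    edges.erase(it);
    return true;
}
```

Because the lookup goes through the equality check, deleting the edge (1, 0) removes a stored edge (0, 1).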
5.2.5 Geometric Functions
There are a few common, yet useful, geometric functions our algorithms employ,
including some external to the algorithm to help with data input and validation. We
know the input must be planar, and thus cannot contain 2 edges that intersect. To
determine if an edge pair intersects, we use CGAL’s Intersect class (in 2 dimensions)
that accepts two Segments as parameters. The return type is not a simple boolean
true or false, but instead a CGAL object that describes whether an intersection
exists and, if so, the segment or point at which it occurs. If no intersection is found,
the result reports no intersection; otherwise, the position of
the intersection is returned. Behind the covers, CGAL uses a determinant check to
quickly find the intersecting point:
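\[
(P_x, P_y) = \left(
\frac{(x_1 y_2 - y_1 x_2)(x_3 - x_4) - (x_1 - x_2)(x_3 y_4 - y_3 x_4)}{(x_1 - x_2)(y_3 - y_4) - (y_1 - y_2)(x_3 - x_4)},\;
\frac{(x_1 y_2 - y_1 x_2)(y_3 - y_4) - (y_1 - y_2)(x_3 y_4 - y_3 x_4)}{(x_1 - x_2)(y_3 - y_4) - (y_1 - y_2)(x_3 - x_4)}
\right)
\]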
The intersecting point is given by (Px, Py), if line L1 represented by (x1, y1)
(x2, y2) and line L2 represented by (x3, y3) (x4, y4) are not parallel. If both lines
are parallel, the denominator of the determinant is 0.
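The formula translates directly into code. The following C++ sketch evaluates it with double arithmetic; it is not CGAL's implementation, and `Point` and `line_intersection` are hypothetical names. Note that the formula intersects the infinite lines through the points, so a segment test would additionally need to check that the result lies within both segments.

```cpp
#include <cassert>

struct Point {
    double x, y;
};

// Intersection of the line through p1, p2 with the line through p3, p4.
// Returns false when the denominator is 0, i.e. the lines are parallel;
// otherwise writes the intersection point (Px, Py) to out.
bool line_intersection(Point p1, Point p2, Point p3, Point p4, Point& out) {
    const double denom =
        (p1.x - p2.x) * (p3.y - p4.y) - (p1.y - p2.y) * (p3.x - p4.x);
    if (denom == 0.0) return false;
    const double a = p1.x * p2.y - p1.y * p2.x;  // x1*y2 - y1*x2
    const double b = p3.x * p4.y - p3.y * p4.x;  // x3*y4 - y3*x4
    out.x = (a * (p3.x - p4.x) - (p1.x - p2.x) * b) / denom;
    out.y = (a * (p3.y - p4.y) - (p1.y - p2.y) * b) / denom;
    return true;
}
```

For example, the lines through (0, 0), (2, 2) and through (0, 2), (2, 0) meet at (1, 1), while the lines through (0, 0), (1, 1) and through (0, 1), (1, 2) are parallel and yield no point.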
Another CGAL geometric feature we use is the CGALCircle class for the in-circle
test. Behind the scenes, CGAL uses a determinant check similar to the intersecting
segment check. Both are relevant for the locally Delaunay property and locally
Gabriel property functions. We use the CGALCircle to determine on which side of a
circle a point t lies, given a circle from 2 points, pq, or 3 points, pqr. To use this in-circle
function, we first construct a CGALCircle object using the points and simply invoke a
CGAL bounded_side function to determine if t lies inside, on, or outside the boundary
of the circle.
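Both in-circle tests can be written as sign computations. The C++ sketch below is an illustration with double arithmetic rather than the CGAL classes: the 2-point test uses the fact that t lies inside the circle with diameter pq exactly when the angle ptq is obtuse, and the 3-point test uses the standard in-circle determinant corrected by the orientation of pqr. The names `Side`, `side_of_diameter_circle` and `side_of_circumcircle` are hypothetical.

```cpp
#include <cassert>

struct Point {
    double x, y;
};

enum class Side { Inside, OnBoundary, Outside };

// Maps a signed quantity to a side; positive means inside.
static Side classify(double s) {
    if (s > 0.0) return Side::Inside;
    if (s < 0.0) return Side::Outside;
    return Side::OnBoundary;
}

// Circle with diameter pq (the Gabriel test): t is inside exactly when
// the dot product (p - t).(q - t) is negative.
Side side_of_diameter_circle(Point p, Point q, Point t) {
    const double dot = (p.x - t.x) * (q.x - t.x) + (p.y - t.y) * (q.y - t.y);
    return classify(-dot);
}

// Circumcircle of p, q, r (the Delaunay test). Assumes p, q, r are not
// collinear. The determinant is positive for an inside point when p, q, r
// are counterclockwise, so its sign is flipped for clockwise input.
Side side_of_circumcircle(Point p, Point q, Point r, Point t) {
    const double ax = p.x - t.x, ay = p.y - t.y;
    const double bx = q.x - t.x, by = q.y - t.y;
    const double cx = r.x - t.x, cy = r.y - t.y;
    const double det = (ax * ax + ay * ay) * (bx * cy - cx * by) -
                       (bx * bx + by * by) * (ax * cy - cx * ay) +
                       (cx * cx + cy * cy) * (ax * by - bx * ay);
    const double orient =
        (q.x - p.x) * (r.y - p.y) - (q.y - p.y) * (r.x - p.x);
    return classify(orient > 0.0 ? det : -det);
}
```

With doubles, points very close to the boundary can be misclassified by rounding, which is the situation exact predicates are meant to avoid.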
5.2.6 Graph Traversal
We now explain the relationships between vertices, edges and faces in the CGAL
library. Refer to figure 5.3 for a visual representation. A CGAL triangle face handle
of class Face_handle is composed of 3 vertices and 3 edges, all available through
accessor functions in the library. To access vertices, the function vertex(index)
returns an object of the Vertex_handle class. The index parameter is an
integer in [0, 2], each mapping to a unique vertex of the face. While the index
value may seem arbitrary at first, its direction is relative to each other adjacent face.
The helper functions cw(index) and ccw(index) return the [0, 2] index that corresponds
to the clockwise and counterclockwise vertex, respectively.
How direction is set begins during the creation of the graph: the first face of the
graph, typically the first face during graph construction, will assign a vertex to the
index 0. All other vertices assign indices in a counterclockwise fashion relative to
the first, such that vertex(ccw(0)) returns the counterclockwise Vertex_handle and
vertex(cw(0)) returns the clockwise Vertex_handle. As face construction continues,
any face edge-adjacent to that face assigns vertex indices in the reverse direction. This
reversal of indices repeats for all faces until all vertices and edges are visited.
While in our algorithms we do not directly make use of the clockwise / counter-
clockwise property, CGAL internally uses this to traverse edges of a triangulation.
By walking edges on a face in one direction, then flipping direction when visiting an
edge from an adjacent face, it is able to efficiently create a path that avoids visiting
an edge more than once.
To traverse to an adjacent face we use the function neighbor(index). Much
like the above vertex(index) function, the index parameter maps an adjacent face
Figure 5.3: Representation of vertices and their associated neighbor faces.[33] Notice that vertex i sits opposite to face neighbor(i) in triangle i. The same pattern holds true for cw(i) and ccw(i).
to the farthest vertex, i.e. for an index i, its corresponding face shares the two
other vertices with indices cw(i) and ccw(i). With this neighbor(index) function, we
traverse all faces of a triangulation.
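The index convention can be illustrated with a small C++ sketch. It does not use the CGAL face classes: `Face` is a hypothetical record holding three vertex ids and three neighbor ids, and `cw` and `ccw` are written here as modulo-3 index arithmetic, which is assumed to mirror the behavior of the CGAL helpers of the same name.

```cpp
#include <cassert>
#include <utility>

// Index arithmetic over the three vertices of a triangular face.
inline int ccw(int i) { return (i + 1) % 3; }
inline int cw(int i) { return (i + 2) % 3; }

// Hypothetical face record: three vertex ids and, for each index i, the id
// of the face opposite vertex i (-1 when there is no such face).
struct Face {
    int vertex[3];
    int neighbor[3];
};

// The edge shared with neighbor(i) joins the two vertices other than
// vertex i, namely those at indices ccw(i) and cw(i).
inline std::pair<int, int> shared_edge(const Face& f, int i) {
    return std::make_pair(f.vertex[ccw(i)], f.vertex[cw(i)]);
}
```

For two faces (0, 1, 2) and (2, 1, 3) that share the edge between vertices 1 and 2, `shared_edge` on the first face at index 0 gives (1, 2), and on the second face at index 2 gives (2, 1): the same edge seen from the other side.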
Chapter 6
Implementation Details
In this section, we look at the implementation details for the algorithms we dis-
cussed in chapter 3 on the minimum edge-constrained Delaunay triangulation,
Gabriel graph and minimum spanning tree. We focus on CGAL / Boost data struc-
tures, their classes and how they are used to build our application. In addition,
we highlight the difficulties, pitfalls and design decisions encountered during the
implementation phase.
A predicate in CGAL refers to a class or struct that describes a geometric
component. An example of this is an orientation or comparison result. Classes in CGAL
often use or return predicates to control the flow of the application. The term
construction in CGAL refers to a geometric object with state, such as an edge, triangle
or circle. These objects typically have components that involve numerical values
that are subject to rounding error.
Our implementation allows both exact and inexact arithmetic; we achieve ex-
act arithmetic by using the exact predicates and constructions headers and classes.
For inexact arithmetic, we use exact predicates / inexact constructions headers and
classes. The exact predicates / inexact constructions kernel uses standard C data
types and operations to represent and manipulate numerical values. On the other
hand, the exact predicates and constructions kernel uses GNU Multiple Precision
Arithmetic Library object types that allow arbitrary-precision numerical values and
support exact arithmetic operations.
6.1 Minimum Constrained Delaunay Triangulation
The minimum constrained Delaunay triangulation algorithm (algorithm 6.1) accepts
as input a plane graph G = (V,E), where V is the vertex set and
E is the set of constraint edges. It outputs the minimum constraint edge set S. At a
high-level, the algorithm is conceptually simple. We first construct the constrained
Delaunay triangulation CDT (V,E) and initialize the minimum constraint set S to
the empty set, populating it throughout the for-loop. At the end of the loop, inter-
secting S with E will yield the minimum edge constraints.
Algorithm 6.1: Impl. minimum edge-constrained Delaunay triangulation MCDT(G)
Input: plane graph G = (V, E)
Output: minimum set S ⊆ E of constraints such that G ⊆ CDT(V, S)
 1 Construct CDT(G)
 2 Initialize S = ∅
 3 foreach uv ∈ CDT(G) do
 4     Consider triangles △(u, v, p) and △(u, v, q)
 5     Consider circumcircles C_{u,v,p} and C_{u,v,q}
 6     if p ∈ C_{u,v,q} or q ∈ C_{u,v,p} then
 7         set S ← S ∪ {uv}
 8     end
 9 end
10 return S ∩ E
The most important step is the construction of the constrained Delaunay
triangulation, CDT(V,E). In the implementation, we use a CGAL library class,
Constrained_Delaunay_triangulation_2, to aid us. We create an instance of this
container, which initially holds an empty triangulation.
Using a for-loop, we populate it with CGAL points that define the vertices. Using a
second for-loop, we add the constraint edges E to the container. If the input is valid,
i.e. does not contain illegal input such as intersecting edges, the container provides
the collection of edges that form the constrained Delaunay triangulation. In the event
that the input is illegal, a runtime exception halts the application. To simplify
the algorithm, we supply input that is sanitized before usage.
The for-loop portion of the algorithm, steps 3-9, is straightforward. These
steps add non-locally Delaunay edges to S. From the proof outline in chapter 3, for
each edge uv in CDT(G), we check its adjacent triangles △(u, v, q) and △(u, v, p).
An edge uv is locally Delaunay if circumcircle C_{u,v,p} does not contain q and
circumcircle C_{u,v,q} does not contain p. Otherwise, uv must be non-locally Delaunay and
belongs in S if it is also in E. We use a simple CGAL in-circle test that checks
whether a point lies inside the circumcircle of three other points.
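The locally Delaunay check can be sketched with the standard in-circle determinant. This is a minimal illustration using double precision, so it corresponds to the inexact setting and is not CGAL's predicate; the names Point, orient2d, in_circumcircle and is_locally_delaunay are hypothetical. Treating points exactly on the circle as outside is an assumption of this sketch.

```cpp
#include <cassert>
#include <utility>

struct Point { double x, y; };

// Twice the signed area of triangle (a, b, c); positive when counterclockwise.
double orient2d(const Point& a, const Point& b, const Point& c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// True when d lies strictly inside the circumcircle of triangle (a, b, c).
bool in_circumcircle(Point a, Point b, Point c, const Point& d) {
    if (orient2d(a, b, c) < 0) std::swap(b, c);  // the sign test needs ccw order
    const double ax = a.x - d.x, ay = a.y - d.y, a2 = ax * ax + ay * ay;
    const double bx = b.x - d.x, by = b.y - d.y, b2 = bx * bx + by * by;
    const double cx = c.x - d.x, cy = c.y - d.y, c2 = cx * cx + cy * cy;
    // 3x3 determinant of the rows (ax, ay, a2), (bx, by, b2), (cx, cy, c2).
    const double det = ax * (by * c2 - b2 * cy)
                     - ay * (bx * c2 - b2 * cx)
                     + a2 * (bx * cy - by * cx);
    return det > 0;
}

// Edge uv with opposite vertices p and q is locally Delaunay when neither
// opposite vertex lies inside the circumcircle of the other triangle.
bool is_locally_delaunay(const Point& u, const Point& v,
                         const Point& p, const Point& q) {
    return !in_circumcircle(u, v, p, q) && !in_circumcircle(u, v, q, p);
}
```

With floating-point input that is nearly cocircular, the sign of the determinant can be wrong, which is the motivation for the exact arithmetic option described earlier in this chapter.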
In our implementation, the iteration over the edges of CDT(V,E) involves a bit
more work. For ease of use, a triangulation in CGAL lets every edge share exactly
two adjacent faces, including exterior edges. To allow edges on the exterior to
be part of a triangle face, CGAL introduces the notion of an infinite vertex. Thus,
a vertex is either a finite vertex or the infinite vertex (see figure 6.1 for an example).
By extension, edges in CGAL are finite if they connect two finite vertices, or
infinite if one endpoint is the infinite vertex.
Figure 6.1: CGAL[33] graphs provide an infinity vertex, and consequently an infinity face for simplifying graph traversal.
We begin our edge iteration over non-infinite edges since an infinite edge is
never a constraint edge. Any Triangulation::Edge (CGAL’s edge representation)
e is common to two faces. We arbitrarily set one of the two faces as the main
face f_main and the other as f_adj. We query the edge for the first face using the
member e.first, which is a Face handle. The other face we obtain by calling
neighbor(e.second) on the first. With f_adj, we construct a circle C that passes
through all three of its vertices using vertex(0), vertex(1) and vertex(2). Using the
query vertex(e.second) on f_main, we obtain the vertex with which to perform the
in-circle test. If the vertex fails the locally Delaunay
check with C, then the algorithm considers the edge for the constraint set S. By
swapping the faces f_main and f_adj, we perform the second in-circle test of the locally
Delaunay check. In the last step, we compute S ∩ E (removing edges that are not
constraint edges) to obtain the final minimum edge constraint set.
There is a small optimization worth mentioning. We can skip checking edges
on the convex hull because they are locally Delaunay and therefore will not appear
in S. We detect such edges by checking whether the edge is adjacent to an
infinite face; at best, this yields a slight speed-up.
6.2 Minimum Constrained Gabriel Graph
To briefly reiterate, the minimum edge-constrained Gabriel graph algorithm (algorithm 6.2)
accepts as input a plane graph G = (V,E) and outputs the minimum constraint edge set S. The
input G must be planar and, while not strictly necessary, we keep G free of cycles.
Additional checks against invalid input are done in a separate function, i.e. the
algorithm assumes that data sanitization occurs before invocation.
Algorithm 6.2: Impl. minimum edge-constrained Gabriel graph MCGG(G)
Input: plane graph G = (V, E)
Output: minimum set S ⊆ E of constraints such that G ⊆ CGG(V, S)
    Construct CDT(G)
    Initialize S = ∅
    foreach uv ∈ CDT(G) do
        Consider the disc C_{u,v} with diameter ||u − v||
        Consider adjacent vertices p and q to uv
        if p ∈ C_{u,v} or q ∈ C_{u,v} then
            set S ← S ∪ {uv}
        end
    end
    return S ∩ E
Similar to the other algorithms, the first step is to construct the crucial
constrained Delaunay triangulation CDT(G). This triangulation contains the necessary
locally Gabriel edges; recall that an edge in CGG(G) is also in CDT(G) since
CGG(G) ⊆ CDT(G). In code, we use the CGAL library class
Constrained_Delaunay_triangulation_2 to initialize CDT(V,E), a container for an
initially empty Delaunay triangulation. We fill this container using a for loop to
iteratively add each vertex and edge from G. Following this, the next step converts
the result of Constrained_Delaunay_triangulation_2 back to our edge data structure
representation, SimpleEdge, and vertex representation, VertexIndex.
Now, we can identify non-locally Gabriel edges. The minimum constraint edge
set, S, is initially empty. Throughout execution, we add edges to this set. This
step is enclosed in a for loop that iterates across each edge of CDT(G) and performs
a constant time in-circle check. An edge uv in CDT(V,E) is adjacent to
vertices p and q, with which it forms triangles △(u, v, p) and △(u, v, q). We check if uv is
a locally Gabriel edge by placing a disc C with diameter ||u − v|| centered on the
midpoint of uv. If either p or q falls inside C, then uv must be a non-locally Gabriel
edge. Non-locally Gabriel edges are considered for S. In the event that C does not
contain p or q, the algorithm simply continues.
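The disc test does not require constructing the disc explicitly. By Thales' theorem, a point p lies inside the disc with diameter uv exactly when the angle upv is obtuse, i.e. when the dot product (u − p)·(v − p) is negative. The sketch below illustrates this with doubles; the names GPoint, in_diametral_disc and is_locally_gabriel are hypothetical, and treating boundary points as outside is an assumption of this sketch rather than a statement about our implementation.

```cpp
#include <cassert>

struct GPoint { double x, y; };

// True when p lies strictly inside the disc whose diameter is the segment uv.
// Equivalent to the angle u-p-v being obtuse (Thales' theorem).
bool in_diametral_disc(const GPoint& u, const GPoint& v, const GPoint& p) {
    const double dot = (u.x - p.x) * (v.x - p.x) + (u.y - p.y) * (v.y - p.y);
    return dot < 0;
}

// Edge uv with opposite vertices p and q is locally Gabriel when neither
// opposite vertex falls inside the disc with diameter uv.
bool is_locally_gabriel(const GPoint& u, const GPoint& v,
                        const GPoint& p, const GPoint& q) {
    return !in_diametral_disc(u, v, p) && !in_diametral_disc(u, v, q);
}
```

The dot-product form avoids computing the midpoint and radius, which keeps the number of rounded intermediate values small.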
The remaining portion of the algorithm is similar to the minimum constrained Delaunay
triangulation algorithm. We use the same CGAL face traversal functions and accessors
to navigate CDT(V,E). After we visit all edges, we remove non-constraint
edges from S before returning the minimum constraint set.
6.3 Minimum Constrained Minimum Spanning Tree
Construction of a minimum edge-constrained MST (algorithm 6.3) takes as input
a plane forest, F = (V,E) and returns the minimum constraint edge set S. Much
like the two previous algorithms, it starts with the construction of CDT (F ). The
difference here is that we use CDT (F ) twice - once to calculate MST (CDT (F ))
and another time to calculate MST (CDT ◦(F )), the former MST being a regular
MST and the latter having a special cost function that assigns 0 cost to all constraint
edges.
The next step initializes our minimum constraint edge set, S, to the empty set.
We will iteratively add edges to S, but first we require a link cut tree, or dynamic
tree as it is sometimes called. Using MST(CDT◦(F)), we pre-populate the
link cut tree. Each vertex in MST(CDT◦(F)) corresponds to a node in the link cut
tree. Likewise, each edge in MST(CDT◦(F)) corresponds to an edge in the link
cut tree whose cost is an integer equal to its index when sorted by Euclidean
distance, i.e. we sort all edges by their Euclidean distance and assign an index
sequentially according to their sorted order.
In the outermost foreach loop, the algorithm iterates over all edges e = (u, v) ∈
MST(CDT(F)), e ∉ MST(CDT◦(F)) (an edge that exists in both MST(CDT(F))
and MST(CDT◦(F)) is not an edge in S). If the nodes u and v already belong to the
same connected component of the link cut tree, then we know the addition of
e creates a cycle. Still, this is not indicative that e belongs in S; we must look at
the cost of the entire cycle. Isolating the cycle takes a few unusual link cut tree
operations (refer to algorithm 6.3):
1. Unjoining u and v from their parents, leaving their trees T_u and T_v disjoint
2. Using the maxcost function in a loop to find the highest cost edge e_maxcost in
the cycle C_e
3. If cost(e_maxcost) > cost(v) or cost(e_maxcost) > cost(u) then we add e_maxcost to S
and repeat step 2.
4. Lastly, we reconnect T_u and T_v to their ancestors.
At this point, we simply keep only the edges in S ∩ E and return that set
as the minimum constrained edge set.
Algorithm 6.3: Impl. minimum edge-constrained minimum spanning tree MCMST(F)
Input: plane graph F = (V, E)
Output: minimum set S ⊆ E of constraints such that F ⊆ CMST(V, S)
    Construct graph CDT(F) and CDT◦(F)
    Construct T′ = MST(CDT(F)) and CMST(F) = MST(CDT◦(F))
    Compute indices for edges in CDT(F) based on their distance
    Construct link-cut tree LCT using CMST(F)
    Initialize S = ∅
    foreach edge e = (v1, v2) ∈ T′ do
        if lca(v1, v2) ≠ null then
            u = lca(v1, v2)
            p = parent(u)
            if p ≠ null then
                y = cut(u)
            end
            if parent(v1) ≠ null then
                v = maxcost(v1)
                x = cost(v)
                while w(e) < x do
                    update edge(v, −x)
                    set S ← S ∪ {v, parent(v)}
                    v = maxcost(v1)
                    x = cost(v)
                end
            end
            if parent(v2) ≠ null then
                v = maxcost(v2)
                x = cost(v)
                while w(e) < x do
                    update edge(v, −x)
                    set S ← S ∪ {v, parent(v)}
                    v = maxcost(v2)
                    x = cost(v)
                end
            end
            if p ≠ null then
                link(u, p, y)
            end
        end
    end
    return S ∩ E
Our implementation uses the same custom objects as the previous algorithms to
represent vertices and edges. Again, our graph representation is a simple adjacency
list, storing both vertices and edges as separate vectors.
To build CDT(F), we use the CGAL library class
Constrained_Delaunay_triangulation_2, creating an instance of a container for an
initially empty triangulation. We first populate the container with vertices, then
include our edges as constraints. In the output, we store the edges in a vector as
well.
Computing MST(CDT(F)) and MST(CDT◦(F)) uses a custom Kruskal MST
algorithm that accepts, as one of its parameters, the set of edges that are constraints. This
set allows a constant time lookup to determine if an edge is part of the constraint
set, which lets us assign the edge a cost of 0. To perform a lookup,
we rely on Boost’s dictionary data structure and our edge hash function.
In the Kruskal MST implementation, we sort all edges by their cost (using
squared Euclidean distance) in a vector, completely in memory. While the number of
MST edges is less than the number of vertices − 1, we add the lowest cost edge to the MST
(removing it from further consideration) if it does not create a cycle. To check if
inclusion of an edge creates a cycle, we use the Boost library disjoint set data structure.
It has similar operations to the link cut tree; however, it can only report the
membership of a vertex. The advantage of the disjoint set over the link cut tree is that it
can report membership in amortized O(α(n)) time (α is the inverse Ackermann function, which
for practical purposes is a constant) in place of amortized O(log n). Otherwise, both
disjoint set and link cut trees use O(n) space and require O(n) steps to populate.
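The following is a minimal sketch of a Kruskal variant in which constraint edges receive cost 0. It is not our implementation: it replaces the Boost disjoint set and the hash-based lookup with a small hand-written disjoint set and a std::set, and the names WeightedEdge, DisjointSet and kruskal are hypothetical. Costs are assumed to be integer squared lengths so that comparisons are exact.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <set>
#include <utility>
#include <vector>

struct WeightedEdge { int u, v; long long cost; };  // cost: squared length

// Minimal disjoint set with path halving and union by size.
struct DisjointSet {
    std::vector<int> parent, size;
    explicit DisjointSet(int n) : parent(n), size(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);
    }
    int find(int x) {
        while (parent[x] != x) x = parent[x] = parent[parent[x]];
        return x;
    }
    bool unite(int a, int b) {  // returns false if a and b are already connected
        a = find(a); b = find(b);
        if (a == b) return false;
        if (size[a] < size[b]) std::swap(a, b);
        parent[b] = a;
        size[a] += size[b];
        return true;
    }
};

// Kruskal's algorithm; edges listed in `constraints` (as (min, max) vertex
// pairs) are treated as having cost 0, so they are considered first.
std::vector<WeightedEdge> kruskal(int n, std::vector<WeightedEdge> edges,
                                  const std::set<std::pair<int, int>>& constraints) {
    auto effective = [&](const WeightedEdge& e) {
        auto key = std::make_pair(std::min(e.u, e.v), std::max(e.u, e.v));
        return constraints.count(key) ? 0LL : e.cost;
    };
    std::stable_sort(edges.begin(), edges.end(),
                     [&](const WeightedEdge& a, const WeightedEdge& b) {
                         return effective(a) < effective(b);
                     });
    DisjointSet components(n);
    std::vector<WeightedEdge> tree;
    for (const WeightedEdge& e : edges) {
        if (static_cast<int>(tree.size()) == n - 1) break;  // tree is complete
        if (components.unite(e.u, e.v)) tree.push_back(e);  // no cycle created
    }
    return tree;
}
```

Calling kruskal with an empty constraint set yields the regular MST, while passing the constraint edges yields the tree that corresponds to MST(CDT◦(F)) under the assumptions above.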
As much as we prefer to avoid employing too many data structures, we do use
BoostGraph to initialize our link cut tree. We make a straightforward, linear time
vertex-to-vertex conversion from our vertex data type to Boost’s vertex. We make
the same conversion for edges as well. The main advantage of the BoostGraph is
the access to a rich set of functions and traversals. Of course, the drawback is that
their implementation attempts to be general and, as a result, is not optimized for certain
scenarios in our algorithm, e.g. fast edge iteration and access.
Firstly, the link cut tree is initialized with a node for each vertex in our
BoostGraph. The parent-child relationship of nodes in the link cut tree is important, hence
the breadth-first-search traversal: a child vertex remains the child of an ancestor
node/tree throughout the traversal. On each edge the traversal visits, we link the
corresponding link-cut tree nodes and assign the link the same cost index as the
edge cost index.
The last step of the algorithm checks for cycles in the link cut tree. Its inputs are a link
cut tree, a VertexVector, the EdgeVector of T′ and an EdgeVector holding the constraint
edges. To simplify the algorithm, we check all edges in T′ instead of only the edges that
are in T′ but not in CMST(F).
The implementation of the cycle checking follows the pseudocode; however, we
do not compare the Euclidean distance between edges in this part of the code. The
dynamic tree library does not handle exact arithmetic, so we must compute any
parts that require exact arithmetic in advance. To reiterate, by assigning an index
to each edge based on its Euclidean length, we avoid the need to compare
exact distances. We compute these indices by sorting all edges by length from
shortest to longest, then iterating over them and assigning a value from 0 to n-1, where n
is the number of edges. We use these indices as the new cost of edges in the cycle
check.
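The index assignment can be sketched as follows. This is an illustration under the assumption of integer coordinates, so that squared lengths compare exactly without an exact number type; the names GridPoint, IndexedEdge and distance_ranks are hypothetical, and breaking ties by input order is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

struct GridPoint { long long x, y; };
struct IndexedEdge { int u, v; };  // endpoints as indices into the point vector

long long squared_length(const GridPoint& a, const GridPoint& b) {
    const long long dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Returns rank[i] in 0..n-1: the position of edge i when all edges are
// ordered from shortest to longest. Ties keep their input order.
std::vector<int> distance_ranks(const std::vector<GridPoint>& points,
                                const std::vector<IndexedEdge>& edges) {
    std::vector<int> order(edges.size());
    std::iota(order.begin(), order.end(), 0);
    std::stable_sort(order.begin(), order.end(), [&](int a, int b) {
        return squared_length(points[edges[a].u], points[edges[a].v]) <
               squared_length(points[edges[b].u], points[edges[b].v]);
    });
    std::vector<int> rank(edges.size());
    for (int pos = 0; pos < static_cast<int>(order.size()); ++pos) {
        rank[order[pos]] = pos;
    }
    return rank;
}
```

Once the ranks are computed, the cycle check only compares small integers, so the dynamic tree library never needs to evaluate a distance.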
Chapter 7
Experimental Results
7.1 Introduction
We gather and measure experimental results on three minimum constrained
algorithms: Delaunay triangulation, Gabriel graph and minimum spanning tree. We are
interested in their results on different graph types, including small graphs
(approximately 1000 nodes) and much larger, elaborate graphs (approximately 60,000
nodes). We also consider the arrangement of node positions in Euclidean space and
the Euclidean distances between them within the graph. These variables
are important for understanding how each algorithm performs under favorable or
adversarial conditions, giving us a better understanding of their performance in an
applied setting.
7.1.1 Setup
All experimental results were produced on a single desktop computer running the
Windows 7 Professional 64-bit operating system, equipped with 8GB of RAM. The
processing unit in this desktop is an Intel Core i5-2500 quad core (first released in
early 2011) capable of running at upwards of 3.3GHz clock speed; its cache is 6MB.
It is worth mentioning that input datasets are loaded into memory at the beginning
of execution, so that no file system operations occur during the algorithm code
portion. Afterwards, the output graph is written to file.
The algorithms were written in C++ and executed from Visual Studio Professional 2013
(an integrated development environment, or IDE) with the application settings for
debug symbols turned on. Some minor performance improvements could possibly
be gained from running the application in release mode without the debug
symbols; however, since all experimental runs were done in debug mode, there is a
consistent baseline across all results.
7.1.2 Input Data
Source of real-world graph data
As part of the experimental results, we provide performance metrics on large,
real-world datasets. There is significant value in knowing the performance on real-world
data, both in runtime and in compression effectiveness. We define the compression
ratio as (|E| − |S|)/|E|, where |E| is the cardinality of the constraint edge set
and |S| is the cardinality of the minimum constraint edge set.
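As a small worked illustration of this definition, the function below computes the ratio from the two cardinalities. The function name compression_ratio is hypothetical. Using the Washington D.C. values reported in table 7.1 (|E| = 9552, |S| = 730), the ratio is (9552 − 730)/9552, approximately 0.924.

```cpp
#include <cassert>
#include <cstddef>

// Compression ratio (|E| - |S|) / |E| for a constraint edge set of size
// constraint_edges and a minimum constraint edge set of size minimum_edges.
double compression_ratio(std::size_t constraint_edges, std::size_t minimum_edges) {
    assert(constraint_edges > 0 && minimum_edges <= constraint_edges);
    return static_cast<double>(constraint_edges - minimum_edges) /
           static_cast<double>(constraint_edges);
}
```

A ratio of 1 means that no constraint needs to be stored, while a ratio of 0 means that every constraint edge must be kept.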
Real-world graph data can come from different sources, such as geographical
surveys, road networks, flight paths, transportation, topology of a 3D model, biology
or relationships in a social network. Undoubtedly, every graph has different
characteristics.
Our real-world dataset comes from DIMACS (Center for Discrete Mathemat-
ics and Theoretical Computer Science), initially a collaboration between Rutgers
University, Princeton University, AT&T and Bell Labs. There are 250 permanent
members, actively involved in planning and coordinating from across institutions.
Their goal is to further the applications of computer science and mathematics by
arranging competitions and challenges for finding practical, efficient solutions for
various mathematical and computational problems, such as network flows and com-
putational biology. In addition, DIMACS sponsors conferences and workshops for
researchers.
During the 9th DIMACS Implementation Challenge, a problem to implement shortest
path algorithms on the road networks of states in the United States was announced.
The DIMACS organization provided large undirected graphs as part of the challenge
data. Their graphs included edges, representing a path, road or highway with
weight equal to its length, and nodes, representing a connection or intersection
between two or more edges. Sizes of their datasets vary greatly. The Washington
D.C. dataset contains the fewest nodes and edges at 9,559 and 14,909,
respectively. On the other hand, the largest dataset, the Texas road network, contains
2,073,870 nodes and 2,584,159 edges. We selected the Washington D.C. road
network (figure 7.1) and the Hawaii road network (figures 7.2 and 7.3), the latter
containing 64,892 nodes and 76,809 edges.
While it would be convenient to immediately be able to consume these datasets,
road networks have many undesired characteristics. They contain cycles, collinear
points, and intersecting edges, all of which can be problematic for our algorithms.
We elaborate on how we deal with them in the following paragraphs.
Road networks are filled with cycles. In the most basic scenario, consider a
rectangular city block surrounded on all four sides by road. This simple road network
is a cycle. Recall that all three minimum constraint algorithms accept as input a
forest, which by definition cannot contain cycles. To remove cycles, we perform a one-time
data sanitization pass over the data. Given a road network edge set E and
our initially empty sanitized edge set E′, an edge e ∈ E is added to E′ if E′ ∪ {e} does
not contain a cycle. We maintain a disjoint-set data structure[13] to determine if the
inclusion of e creates a cycle. Edges are discarded in a first-come-first-served
manner, i.e. the data is read from beginning to end-of-file and cycle-forming edges are
discarded as they are discovered. The choice of which edge to discard is effectively
arbitrary since the original dataset imposes no ordering.
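The cycle removal pass can be sketched as a single scan with a disjoint set. This is a minimal stand-alone illustration and not our sanitization code; the names EdgeList, find_root and keep_acyclic are hypothetical, and vertices are assumed to be numbered from 0 to vertex_count − 1.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

using EdgeList = std::vector<std::pair<int, int>>;

// Root of x's component, with path halving.
static int find_root(std::vector<int>& parent, int x) {
    while (parent[x] != x) x = parent[x] = parent[parent[x]];
    return x;
}

// Scans edges in input order and keeps an edge only if its endpoints are
// not yet connected by previously kept edges, so the result is a forest.
EdgeList keep_acyclic(int vertex_count, const EdgeList& edges) {
    std::vector<int> parent(vertex_count);
    std::iota(parent.begin(), parent.end(), 0);
    EdgeList kept;
    for (const auto& e : edges) {
        const int a = find_root(parent, e.first);
        const int b = find_root(parent, e.second);
        if (a == b) continue;  // e would close a cycle: discard it
        parent[a] = b;         // merge the two components
        kept.push_back(e);
    }
    return kept;
}
```

For a city block given as four edges around a square, the first three edges are kept and the fourth, which closes the cycle, is discarded.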
Collinear points in the vertex set can be problematic. One of the initial steps of
each algorithm is to compute a constrained Delaunay triangulation. If the entire
dataset is collinear, then the constrained Delaunay triangulation implementation
will not create any triangles. Hence, these algorithms will not traverse any
triangles and as a result the minimum constrained edge set will be empty. As for
the minimum constrained MST algorithm, it will never encounter cycles, so similarly
the minimum constrained set will be empty. We also consider the case where the
dataset is close to collinear, where rounding errors in floating-point calculations
can affect the result. To prevent such errors, we use exact arithmetic from CGAL in
our algorithms.
A second reason collinear points can be problematic is that the edges that connect
them may overlap. For example, given collinear points p1, p2 and p3, assume
p2 lies between p1 and p3 in Euclidean space. The road network dataset occasionally
places an edge between p1 and p3, causing a violation of the planarity property. Our
assumption is that during the curation of this dataset, roads at different heights were
flattened, e.g. a road passing perpendicular to the end of a tunnel. To solve this, we
allow the CGAL constrained Delaunay triangulation algorithm to split edge
p1p3 into two smaller edges, p1p2 and p2p3.
Lastly, intersecting edges in the dataset arise from the three-dimensional layout
of traffic networks in modern day cities. There are numerous bridges, tunnels and
roads that pass above and below each other. The DIMACS dataset does not
distinguish between edges at different elevations. Due to the size and fixed positions of
vertices (bound in Euclidean space), it would not be possible to untangle the graph, i.e. find
an isomorphic graph that is planar by shifting the position of vertices and edges to
remove the intersecting edges. To deal with intersections, we simply remove them
as part of the data sanitization. We employ a naive O(n²) algorithm, in which each edge
performs a constant time, determinant-based line-to-line intersection check against
every other edge. An edge that is found to intersect another edge is removed. The order
of edge traversal is the order that they appear in the dataset, hence intersecting
edges are removed in the same order.
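A minimal sketch of such a quadratic pass is given below. It is one possible reading of the procedure and not our sanitization code: an edge is kept only if it does not properly cross an edge kept earlier, segments that merely share an endpoint are not treated as intersecting, and collinear overlaps are ignored because they are handled by the triangulation as described above. Integer coordinates are assumed so that the determinant sign is exact; the names IPoint, Segment, orientation, properly_cross and drop_crossing are hypothetical.

```cpp
#include <cassert>
#include <vector>

struct IPoint { long long x, y; };
struct Segment { IPoint a, b; };

// Sign of the 2x2 determinant: +1 if r is left of line pq, -1 if right, 0 if on it.
int orientation(const IPoint& p, const IPoint& q, const IPoint& r) {
    const long long d = (q.x - p.x) * (r.y - p.y) - (q.y - p.y) * (r.x - p.x);
    return (d > 0) - (d < 0);
}

// True when the interiors of s and t cross at a single point.
bool properly_cross(const Segment& s, const Segment& t) {
    return orientation(s.a, s.b, t.a) * orientation(s.a, s.b, t.b) < 0 &&
           orientation(t.a, t.b, s.a) * orientation(t.a, t.b, s.b) < 0;
}

// Naive O(n^2) pass in input order: keep an edge unless it crosses a kept edge.
std::vector<Segment> drop_crossing(const std::vector<Segment>& edges) {
    std::vector<Segment> kept;
    for (const Segment& e : edges) {
        bool crosses = false;
        for (const Segment& k : kept) {
            if (properly_cross(e, k)) { crosses = true; break; }
        }
        if (!crosses) kept.push_back(e);
    }
    return kept;
}
```

Because the scan follows the input order, which of two crossing edges survives depends only on which appears first in the dataset.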
Randomized Graph Types
We use random graphs to discern the average effectiveness of each algorithm over
10 runs. These random graphs are generated ahead of algorithm execution and
are used only once per run. We discard them when the application terminates. We
do not count the creation time of the randomized dataset as part of the algorithm
execution total; however, we do consider any data structures that the algorithm
uses or manipulates prior to running as part of execution time, e.g. converting the
dataset into an adjacency list.
Dataset                            # of Vertices   # of Edges
1. Random disc (Delaunay edges)    1000            100
2. Random disc                     1000            100
3. Random circle                   1000            100
4. Washington D.C.                 9559            9552
5. Hawaii                          64892           64321
There are three types of random graphs. We confine their points within a disc or
on the circumference of a circle. Each disc or circle is 1000 units in radius for ease
of visualization but otherwise this value can be arbitrary. All randomized graphs
contain a total of 1000 vertices with random (x, y) positions, i.e. each position
has equal probability within the boundary. Also, all randomized graphs contain a
constant 100 edges, which we select at random but with conditions.
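One common way to place points with equal probability inside a disc, or on its circumference, is sketched below. The text does not prescribe a sampling method, so this is an assumption: the radius is drawn as R times the square root of a uniform value so that points are uniform by area, and the angle is drawn uniformly. The names Sample, random_in_disc and random_on_circle, as well as the seed parameter, are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct Sample { double x, y; };

// Points uniformly distributed by area in the disc of the given radius.
std::vector<Sample> random_in_disc(std::size_t count, double radius, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<Sample> points;
    points.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const double r = radius * std::sqrt(unit(gen));  // sqrt gives uniform area
        const double theta = two_pi * unit(gen);
        points.push_back({r * std::cos(theta), r * std::sin(theta)});
    }
    return points;
}

// Points uniformly distributed by angle on the circle of the given radius.
std::vector<Sample> random_on_circle(std::size_t count, double radius, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<Sample> points;
    points.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const double theta = two_pi * unit(gen);
        points.push_back({radius * std::cos(theta), radius * std::sin(theta)});
    }
    return points;
}
```

Drawing the radius uniformly, without the square root, would cluster points near the center of the disc, which is why the square root is used in this sketch.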
The first type of the randomized graph (figure 7.4) contains 1000 vertices and
100 edges in or on the region of a closed disc. To construct this graph, a Delaunay
triangulation over all vertices is performed. Edges are selected to be part of the
graph by randomly picking edges from the triangulation with equal probability.
Edges that form cycles are not selected to ensure the output is a forest. GIS and 3D
meshes are domains that may have similar graph types to this.
The second type of randomized graph (figure 7.5) contains 1000 vertices in or
on the region of a closed disc; the total number of edges is 100. To form an edge
in this graph, two vertices are selected at random with equal probability. The edge
is added to the graph if it does not form a cycle, does not intersect an existing
edge, and its length is at most 0.2 times the radius of the disc. Edges from this graph
type are typically longer than those of the first graph type, which gives us insight into how
the algorithms perform using longer constraints.
The last type of randomized graph (figure 7.6) contains 1000 vertices on the
circumference of a circle with a constant 100 edges. We select an edge to be part
of the graph by picking two vertices at random, with a few conditions. The new
edge cannot form a cycle and must not intersect any other edge already
part of the graph. This graph is particularly adversarial for our algorithm execution
time because of the position of vertices on the boundary of the circle. The portion
of the algorithm that constructs the constrained Delaunay triangulation will detect
that the circumcircle of any triangle inevitably passes through all other points on the circle.
Result Validation
For all three minimum edge-constrained algorithms, we validate that the minimum
constraint set S is a subset of the original constraint set E. In addition, we validate that
we can reconstruct the constrained graph using S, e.g., for the MCDT algorithm,
we ensure that CDT(V, S) = CDT(V,E). It is worth mentioning that for real-world
datasets, we validate the inexact edge set S using the inexact constrained algorithm.
Likewise, we validate the exact edge set S using the exact constrained algorithm.
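Both validation checks reduce to comparisons between sets of undirected edges. The sketch below shows one way to express them, assuming edges are stored as pairs of vertex indices; it is an illustration rather than our validation code, and the names UEdge, normalized, as_set, is_subset and same_edge_set are hypothetical. The second function would be applied to the edge lists of CDT(V, S) and CDT(V,E) after both triangulations have been built.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <utility>
#include <vector>

using UEdge = std::pair<int, int>;  // undirected edge as a pair of vertex indices

// Orders the endpoints so that (u, v) and (v, u) compare equal.
UEdge normalized(const UEdge& e) {
    return UEdge(std::min(e.first, e.second), std::max(e.first, e.second));
}

std::set<UEdge> as_set(const std::vector<UEdge>& edges) {
    std::set<UEdge> result;
    for (const UEdge& e : edges) result.insert(normalized(e));
    return result;
}

// True when every edge of s also appears in e (the check S ⊆ E).
bool is_subset(const std::vector<UEdge>& s, const std::vector<UEdge>& e) {
    const std::set<UEdge> small = as_set(s), large = as_set(e);
    return std::includes(large.begin(), large.end(), small.begin(), small.end());
}

// True when both edge lists describe the same undirected edge set.
bool same_edge_set(const std::vector<UEdge>& a, const std::vector<UEdge>& b) {
    return as_set(a) == as_set(b);
}
```

Normalizing the endpoint order first means the comparison does not depend on the direction in which an edge happens to be stored.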
7.2 On Performance Results
7.2.1 Minimum Edge-Constrained Delaunay Triangulation
Real-World Dataset
On the Washington D.C. DIMACS competition dataset, the minimum constrained Delaunay
triangulation algorithm produced a compression ratio of 92.4% using inexact arithmetic and
92.3% using exact arithmetic, that is to say only 7.6% and 7.7%, respectively, of the original
constraint edge set is required in order to recreate the constrained Delaunay triangulation. As
for the Hawaii DIMACS dataset, a compression ratio of 92.4% for both inexact and
exact arithmetic was achieved. Our results are documented in table 7.1.
Between the two arithmetic types for the Washington D.C. dataset, there is a
difference of 7 edges (about 0.07% of the constraint edge set) in their minimum
constraint edge sets. For inexact
arithmetic this value is 730; for exact arithmetic the value is 737. All extra 7 edges
are unique to the exact dataset and the remaining 730 are common between both
datasets. When using inexact arithmetic, due to numerical rounding, the in-circle
function for the locally Delaunay test reports these edges as locally Delaunay
when in fact they are not. Hence, there is a risk of under-reporting the actual
minimum constrained edge set when using inexact arithmetic. Results from Devillers
et al.[8] show between 0.6% and 21% of edges as cocircular in 5 different datasets.
(An edge p1p2 is cocircular if points p1, p2, p3 and p4 lie on a common circle.) They
identify these edges as a source of potential rounding error.
These results are no surprise when compared to other experimental results. In
their paper, Devillers et al.[8] claimed a compression ratio of more than 97% for
their triangulated GIS terrain model dataset. While email requests to obtain the
authors' original dataset were unsuccessful, their description and visualization of
their dataset suggest their graph had a structure similar to a triangulation. Our
DIMACS datasets are also filled with short connecting roads in dense regions. Except
for highways, which are represented by long edges, most roads are edges that are
short in length relative to the overall diameter of the graph. This is apparent in
figures 7.1 and 7.2 for both datasets.

Minimum Constrained Delaunay Triangulation
Data Type                        Time (ms)   |S|     |E|     Compression Ratio
Random disc (Delaunay edges)     256.2       0       100     1
Random disc                      260         80.6    100     0.194
Random circle                    843.3       87.1    100     0.129
Washington D.C. (Inexact)        2087        730     9552    0.924
Washington D.C. (Exact)          3230.7      737     9552    0.923
Hawaii (Inexact)                 19192       4918    64321   0.924
Hawaii (Exact)                   29292.2     4918    64321   0.924
Table 7.1: Experimental results for minimum constrained Delaunay triangulation
A comparison of the exact and inexact CDT(V, S) shows that the additional
7 edges in the exact edge set S are Delaunay flips in the same quadrilateral for
the inexact CDT(V, S) (see figure 7.7). This is a consequence of the numerical
imprecision in the inexact algorithm, which is unable to determine with sufficient
precision the correct Delaunay edge, i.e. the diagonal that maximizes the minimum
angle. Hence, it does not include these 7 edges in the inexact edge set S.
In terms of average execution time, the minimum constrained Delaunay triangulation
algorithm completed in 2,087ms (inexact arithmetic) and 3,230.7ms (exact arithmetic)
for the Washington D.C. dataset. For the Hawaii dataset, the execution time was
19,192ms (inexact arithmetic) and 29,292.2ms (exact arithmetic). For inexact arithmetic
on the Washington D.C. dataset, this algorithm completed 129ms faster than the Gabriel
algorithm; 2,529ms faster in the case of the Hawaii dataset. As for exact arithmetic,
this algorithm completed 198.5ms slower and 5,172.6ms faster than the Gabriel
algorithm for the Washington D.C. and Hawaii datasets, respectively. These metrics are
not unexpected considering the algorithms are similar. Unfortunately, there are no
timed metrics in the paper of Devillers et al. against which to compare our results.
Randomly Generated Data
The compression ratio for graph type 1 (random Delaunay edges) is closest to that
of the real-world datasets. The remaining random graphs have much poorer
compression ratios, needing to store between 80.6% and 87.1% of the original constraint
edge set. Clearly, these graph types are not conducive to effective compression.
The compression ratio of graph type 1 is trivially 100% because all constraints are
Delaunay edges. The purpose of this test is to confirm that the algorithm
behaves as expected, i.e. we expect the minimum constraint set to be
empty.
7.2.2 Minimum Edge-Constrained Gabriel Graph
Real-World Dataset
We present the results for the minimum constrained Gabriel graph in table 7.2. The
compression ratios are 83.3% and 82.4% (for both inexact and exact arithmetic) for the
Washington D.C. dataset and Hawaii dataset (figure 7.9), respectively. This is a
significant reduction in the edge set size. The Hawaii edge set was reduced from
64321 to 11314 edges while the Washington D.C. edge set was reduced from 9552 to 1599
edges. To the best of our knowledge, no external experimental results
exist using the same algorithm or a variant algorithm.
For our real-world datasets, the minimum constrained Gabriel graph algorithm
produces the same minimum constraint set for both arithmetic types. This is in
contrast with the other two algorithms. Using inexact arithmetic, the minimum
constrained Delaunay triangulation algorithm under-reports the minimum constraint
edges while the minimum constrained MST over-reports the minimum constraint
edges when compared with the exact arithmetic results.
Execution times for both the Washington D.C. and Hawaii datasets were similar to
those of the minimum constrained Delaunay triangulation. Using exact arithmetic, the
Washington D.C. dataset, containing 9559 vertices and 9552 edges, yielded an execution
time of 3032.2ms, while the much larger Hawaii dataset, with 64892 vertices and
64321 edges, completed within 34464.8ms. The execution time was dominated by the
construction of the constrained Delaunay triangulation, which accounted for
approximately 58.9% of the total time on average: 1924.4ms out of 3032.2ms (about
63.5%) for the Washington D.C. dataset and 18784ms out of 34464.8ms (about 54.5%)
for the Hawaii dataset (both exact arithmetic).
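The share of the execution time spent in the constrained Delaunay triangulation step follows directly from the timings quoted above. The short sketch below reproduces that arithmetic; the helper name is hypothetical, and the timings are the exact arithmetic values reported in the text.

```python
def step_share(step_ms: float, total_ms: float) -> float:
    """Fraction of the total execution time spent in one step."""
    return step_ms / total_ms


# (CDT construction time, total time) in ms, exact arithmetic, from the text.
timings = {
    "Washington D.C.": (1924.4, 3032.2),
    "Hawaii": (18784.0, 34464.8),
}

shares = {name: step_share(cdt, total) for name, (cdt, total) in timings.items()}
for name, value in shares.items():
    print(f"{name}: {value:.1%}")

# Unweighted mean over the two datasets.
mean_share = sum(shares.values()) / len(shares)
print(f"mean: {mean_share:.1%}")
```

The mean is taken over the two datasets without weighting by dataset size, which appears to be how the average figure in the text was obtained.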
Minimum Constrained Gabriel Graph
Data Type                       Time (ms)    |S|      |E|      Compression Ratio
Random disc (Delaunay edges)    218.3        34.7     100      0.653
Random disc                     224.1        86.6     100      0.134
Random circle                   463.6        94.6     100      0.054
Washington D.C. (Inexact)       2216         1599     9552     0.833
Washington D.C. (Exact)         3032.2       1599     9552     0.833
Hawaii (Inexact)                21721        11314    64321    0.824
Hawaii (Exact)                  34464.8      11314    64321    0.824
Table 7.2: Experimental results for minimum constrained Gabriel graph
Randomly Generated Data
Compared with the real-world datasets, the randomized graphs yield lower
compression ratios for all three graph types: 65.3%, 13.4% and 5.4% for graph
types 1, 2 and 3, respectively. These values may seem surprising; however, the
construction method for graph types 2 and 3 purposely allows edges that are not
locally Gabriel. These edges are therefore present in the minimum constraint set,
which leads to less desirable compression ratios.
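To illustrate why randomly chosen edges tend to fail the Gabriel condition, the following sketch tests the classical, unconstrained Gabriel property by brute force: an edge pq is Gabriel if no other point lies strictly inside the disc with diameter pq. This is a simplification for illustration only. It ignores constraints and visibility, so it is not the locally Gabriel test used by our algorithm, and the function names are hypothetical.

```python
def in_diametral_disc(p, q, r):
    """True if r lies strictly inside the disc whose diameter is pq.

    r is inside exactly when the angle prq is obtuse, i.e. when the
    dot product of (p - r) and (q - r) is negative.
    """
    return (p[0] - r[0]) * (q[0] - r[0]) + (p[1] - r[1]) * (q[1] - r[1]) < 0


def is_gabriel_edge(p, q, points):
    """Brute-force check of the unconstrained Gabriel property for edge pq."""
    return not any(
        in_diametral_disc(p, q, r) for r in points if r != p and r != q
    )


points = [(0, 0), (2, 0), (1, 0.5), (1, 3)]
print(is_gabriel_edge((0, 0), (2, 0), points))    # (1, 0.5) blocks this edge
print(is_gabriel_edge((0, 0), (1, 0.5), points))  # no point inside the disc
```

A long random edge has a large diametral disc, so in a dense point set it is likely to contain another vertex and must then be kept as a constraint.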
The average execution time was 218.3ms, 224.1ms and 463.6ms for graph types
1, 2 and 3, respectively. A large part of the total execution time is spent calculating
the constrained Delaunay triangulation, which takes 44% of the total time on average.
As in the minimum constrained Delaunay triangulation algorithm, this step
dominates the execution time.
7.2.3 Minimum Edge-Constrained Minimum Spanning Tree
Real-World Dataset
Compared to the other compression ratios, the minimum constrained minimum
spanning tree compression ratios are modest. At 9559 vertices and 9552 edges, the
Washington D.C. dataset yields a constraint edge set of 7923 edges (inexact arith-
metic) and 7844 edges (exact arithmetic), giving it a compression ratio of 17.1%
and 17.9%, respectively. The much larger Hawaii dataset, containing 64892 ver-
tices and 64321 edges, has a minimum constraint edge set of 42812 edges (inexact
arithmetic) and 42697 edges (exact arithmetic), giving a compression ratio of 33.4%
and 33.6%, respectively.
The minimum edge set is larger under inexact arithmetic than under exact
arithmetic for both the Washington D.C. and Hawaii datasets: by 79 edges for
Washington D.C. and by 115 edges for Hawaii. All extra edges are unique to the
inexact arithmetic edge set, and all remaining edges are common to both. During the
cycle checking step of the minimum constrained MST (algorithm 6.3), we compare
the Euclidean distance of edges to determine whether an edge should be included in
the minimum constrained edge set. With exact arithmetic, this comparison reports
the spurious edges as being equal in distance, while with inexact arithmetic,
due to rounding, they appear to have greater distance. Hence, the algorithm using
inexact arithmetic over-reports the edges in the minimum constrained edge set.
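The effect of rounding on a distance comparison can be reproduced in a few lines. The sketch below is not our CGAL-based implementation; it uses Python floats as a stand-in for inexact arithmetic and `fractions.Fraction` as a stand-in for an exact number type, and the coordinates are chosen purely for illustration.

```python
from fractions import Fraction


def squared_length(p, q):
    """Squared Euclidean length of segment pq (avoids the square root)."""
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    return dx * dx + dy * dy


def strictly_longer(edge_a, edge_b):
    """True if edge_a is strictly longer than edge_b."""
    return squared_length(*edge_a) > squared_length(*edge_b)


# Two edges of mathematically equal length 0.3, with floating-point coordinates.
float_a = ((0.0, 0.0), (0.1 + 0.2, 0.0))
float_b = ((0.0, 0.0), (0.3, 0.0))

# The same two edges with exact rational coordinates.
zero = Fraction(0)
exact_a = ((zero, zero), (Fraction(1, 10) + Fraction(2, 10), zero))
exact_b = ((zero, zero), (Fraction(3, 10), zero))

print(strictly_longer(float_a, float_b))  # True: rounding makes edge a look longer
print(strictly_longer(exact_a, exact_b))  # False: the lengths are exactly equal
```

A strict comparison that is false under exact arithmetic becomes true under floating-point arithmetic, which is the mechanism by which equal-length edges can be reported as spurious constraints.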
For the validation step, recall that we compute the inexact CMST algorithm
with an inexact minimum edge set and the exact CMST algorithm with an exact
minimum edge set. When we compare both inexact and exact CMST (V, S) graphs,
we find they are identical for the Washington D.C. dataset. This result is expected
because the exact minimum edge set is a subset of the inexact minimum edge set.
On the other hand, a comparison of the inexact and exact CMST (V, S) graphs for
the Hawaii dataset yields a difference of a single edge. Upon closer inspection, there
are two vertices in V which lie close to one another. The inexact CMST (V, S) is
unable to distinguish with sufficient numerical accuracy which of these two vertices
to include as part of the minimum spanning tree, both appearing to have equal cost.
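The comparison used in this validation step amounts to a set difference over undirected edges. A minimal sketch is given below, assuming edges are stored as pairs of vertex identifiers; the function names and the toy data are hypothetical and do not come from our datasets.

```python
def normalize(edges):
    """Treat each edge as undirected, so (u, v) and (v, u) compare equal."""
    return {frozenset(edge) for edge in edges}


def compare_edge_sets(first, second):
    """Report edges unique to each set and whether first contains second."""
    a, b = normalize(first), normalize(second)
    return {
        "only_in_first": a - b,
        "only_in_second": b - a,
        "first_contains_second": b <= a,
    }


# Toy example: the two graphs differ in a single edge.
inexact_graph = [(1, 2), (2, 3), (3, 4)]
exact_graph = [(2, 1), (3, 2), (3, 5)]

report = compare_edge_sets(inexact_graph, exact_graph)
print(report["only_in_first"])   # edge present only in the inexact graph
print(report["only_in_second"])  # edge present only in the exact graph
```

Normalizing to unordered pairs avoids false differences when the two runs report the endpoints of an edge in a different order.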
The average execution time for Washington D.C. is 3931ms (inexact arithmetic)
and 11526.5ms (exact arithmetic), a slowdown by a factor of 2.9 from inexact to
exact arithmetic. The larger Hawaii dataset's average execution time is 40735ms
(inexact arithmetic) and 120204.5ms (exact arithmetic). Due to the additional
algorithmic complexity, the overall execution time is expected to be much larger than
that of the other two algorithms. In addition to computing a constrained Delaunay
triangulation, the algorithm constructs two MSTs and a dynamic tree, and it examines
the cycles within the dynamic tree. We see that these additional steps increase the
execution time by nearly a factor of 4 in comparison to the other algorithms.
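For readers unfamiliar with the MST construction step, the sketch below shows a generic Kruskal's algorithm with a union-find structure over Euclidean edge lengths. It is an illustration under simplifying assumptions, not the implementation used in this thesis: it does not handle constraint edges, and it omits the dynamic tree and cycle examination of algorithm 6.3. All names are hypothetical.

```python
import math


def kruskal_mst(points, edges):
    """Return the MST edges of a connected graph using Kruskal's algorithm.

    points: list of (x, y) coordinates indexed by vertex id.
    edges:  list of (u, v) vertex-id pairs.
    """
    parent = list(range(len(points)))

    def find(x):
        # Find the set representative, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def length(edge):
        u, v = edge
        return math.dist(points[u], points[v])

    tree = []
    for u, v in sorted(edges, key=length):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so no cycle
            parent[root_u] = root_v
            tree.append((u, v))
    return tree


# Unit square with one diagonal.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
candidate_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
mst = kruskal_mst(square, candidate_edges)
print(mst)
```

An edge rejected by the cycle check is exactly one whose endpoints are already connected by shorter or equal-length edges, which is where the length comparisons discussed above come into play.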
Minimum Constrained Minimum Spanning Tree
Data Type                       Time (ms)    |S|      |E|      Compression Ratio
Random disc (Delaunay edges)    2050.4       68.8     100      0.312
Random disc                     1601.8       94.4     100      0.056
Random circle                   1899.9       94.2     100      0.058
Washington D.C. (Inexact)       3931         7923     9552     0.171
Washington D.C. (Exact)         11526.5      7844     9552     0.179
Hawaii (Inexact)                40735        42812    64321    0.334
Hawaii (Exact)                  120204.5     42697    64321    0.336
Table 7.3: Experimental results for minimum constrained minimum spanning tree
Randomly Generated Data
The compression ratios for all three random graph types are modest. For graph
types 1 (Delaunay edges), 2 and 3, the compression ratios are 31.2%, 5.6% and 5.8%,
respectively. This result is in line with the real-world results, which are between
17.1% (Washington D.C., figure 7.11) and 33.6% (Hawaii, figure 7.10).
For the randomized graphs, the average execution times over 10 runs for
graph types 1, 2 and 3 were 2050.4ms, 1601.8ms and 1899.9ms, respectively.
Unlike in the minimum constrained Delaunay triangulation and Gabriel graph
algorithms, the constrained Delaunay triangulation construction does not dominate
the execution time of this algorithm. Instead, it accounts for only 13% of the total
execution time. It is worth noting that the construction of the two MSTs greatly
reduces the graph size, making the remainder of the algorithm relatively quick.
Figure 7.1: The road network input data set for Washington D.C. (9559 vertices, 9552 edges)
Figure 7.2: The road network input data set for the state of Hawaii (64892 vertices, 64321 edges)
Figure 7.3: A close-up of the road network input data set for Kaua’i County, one of 6 major islands of Hawaii.
Figure 7.4: A randomized disc data set of 1000 vertices and 100 edges picked from Delaunay edges.
Figure 7.5: A randomized disc data set of 1000 vertices and 100 edges with length at most 0.2 radius selected at random.
Figure 7.6: A randomized circle data set of 1000 vertices and 100 edges selected at random.
Figure 7.7: An example of the edge difference in the exact and inexact CDT (V, S). The edge 864, 866 is present in the inexact CDT (V, S) for an inexact S, but not in the exact CDT (V, S) for an exact S. Instead, edge 863, 865 is present.
Figure 7.8: The minimum constrained Delaunay triangulation edge set for Kaua’i County, Hawaii.
Figure 7.9: The minimum constrained Gabriel graph edge set for Kaua’i County, Hawaii.
Figure 7.10: The minimum constrained MST edge set for Kaua’i County, Hawaii.
Figure 7.11: A close-up of the minimum constrained MST edge set for Washington D.C.
Chapter 8
Conclusion
We examine the applicability, implementation and runtime performance for three
minimum edge-constrained algorithms, each representing a different proximity graph.
The implementation integrates the well-known geometric framework, CGAL, with
the mature Boost library in C++. We use both randomized graphs and real-world
data, courtesy of DIMACS, to validate and benchmark our algorithm implementa-
tions.
8.1 Summary of Contributions
For real-world datasets, our results show decent compression ratios. Using exact
arithmetic, the compression ratios of the minimum constrained Delaunay
triangulation (MCDT), minimum constrained Gabriel graph (MCGG) and minimum
constrained minimum spanning tree (MCMST) algorithms for the Hawaii dataset are
92.4%, 82.4% and 33.6%, respectively. That is to say, we require only 7.6%, 17.6%
and 66.4% of the constraint edge set to form the minimum edge set.
We show that randomized graphs can have drastically different compression
ratios, ranging from 5.4% to 100%. This gives us greater insight as to which graph
types are conducive to higher compression, and for which algorithm.
Execution times for the MCDT and MCGG algorithms are similar, and both are
dominated by the construction of the constrained Delaunay triangulation. This step
accounted for over half of the total runtime. On the other hand, the MCMST
algorithm required roughly 4 times the computation time of the other algorithms.
When using exact instead of inexact arithmetic, the execution time increased by a
factor of up to roughly 3.
8.2 Future Work
There are many avenues for future work. We would like to add new features to
the existing implementation to more robustly handle input data or examine the
behavior of algorithms on new graph types.
First, the current implementations do not robustly handle data inconsistencies.
These include edge segments that intersect, either at their endpoints or along the
segments themselves. As stated earlier in section 7.1.2, all input data undergoes a
sanitization step before processing to eliminate data inconsistencies. It would be of
great interest to investigate algorithms that can gracefully handle data
inconsistencies at runtime without removing them from the data.
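As an illustration of the kind of inconsistency in question, the sketch below flags pairs of input segments that cross at a single interior point, using the standard orientation test. It is a brute-force O(n^2) check written for clarity and is not the sanitization step of section 7.1.2. It deliberately does not flag segments that merely touch at an endpoint or overlap collinearly, and all names are hypothetical.

```python
def orientation(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(s, t):
    """True if segments s and t cross at a single point interior to both."""
    (a, b), (c, d) = s, t
    # c and d must lie strictly on opposite sides of line ab, and vice versa.
    return (
        orientation(a, b, c) * orientation(a, b, d) < 0
        and orientation(c, d, a) * orientation(c, d, b) < 0
    )


def find_crossings(segments):
    """Return index pairs (i, j), i < j, of segments that properly cross."""
    return [
        (i, j)
        for i in range(len(segments))
        for j in range(i + 1, len(segments))
        if segments_cross(segments[i], segments[j])
    ]


segments = [
    ((0, 0), (2, 2)),
    ((0, 2), (2, 0)),  # crosses the first segment at (1, 1)
    ((2, 2), (3, 2)),  # only shares an endpoint with the first segment
]
print(find_crossings(segments))
```

With integer coordinates the orientation test is exact; with floating-point input the same rounding concerns discussed in chapter 7 apply.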
While we provide implementations for three minimum constrained algorithms,
there is one remaining proximity graph algorithm (minimum constrained β-Skeleton)
for which no implementation exists. It may be of interest to provide an implementation
and to compare its experimental results to our current results.
Lastly, we believe that testing against different graph types would be
advantageous. Our selection of graphs, both real-world and randomly generated,
could be bolstered by selecting different real-world graphs and by generating larger
random graphs with different structures and different ratios of vertices to edges.
As well, testing with exact arithmetic over random graphs may provide useful metrics
on the frequency of degenerate cases.
Bibliography
[1] Petersen graph 3-coloring, 2006. URL https://en.wikipedia.org/wiki/
Graph_coloring#/media/File:Petersen_graph_3-coloring.svg.
[2] O Boruvka. Contribution to the solution of a problem of economical construc-
tion of electrical networks. Elektronicky Obzor, 15:153–154, 1926.
[3] Prosenjit Bose, Jean-Lou De Carufel, Alina Shaikhet, and Michiel Smid. Es-
sential constraints of edge-constrained proximity graphs. Journal of Graph
Algorithms and Applications, 21(4):389–415, 2017.
[4] L Paul Chew. Constrained delaunay triangulations. Algorithmica, 4(1-4):97–
108, 1989.
[5] Paul Chew. There is a planar graph almost as good as the complete graph. In
Proceedings of the second annual symposium on Computational geometry, pages
169–177. ACM, 1986.
[6] Leila De Floriani and Enrico Puppo. An on-line algorithm for constrained
delaunay triangulation. CVGIP: Graphical Models and Image Processing, 54(4):
290–300, 1992.
105
Bibliography 106
[7] Boris Delaunay. Sur la sphere vide. Izv. Akad. Nauk SSSR, Otdelenie Matem-
aticheskii i Estestvennyka Nauk, 7(793-800):1–2, 1934.
[8] Olivier Devillers, Regina Estkowski, Pierre-Marie Gandoin, Ferran Hurtado,
Pedro Ramos, and Vera Sacristan. Minimal set of constraints for 2d con-
strained delaunay reconstruction. International Journal of Computational Ge-
ometry & Applications, 13(05):391–398, 2003.
[9] Drrilll. Linkcuttree1, 2013. URL https://commons.wikimedia.org/wiki/
File:Linkcuttree1.png.
[10] Andreas Fabri, Geert-Jan Giezmann, Lutz Kettner, et al. On the design of cgal
the computational geometry algorithms library. 1998.
[11] Steven Fortune. A sweepline algorithm for voronoi diagrams. Algorithmica, 2
(1-4):153, 1987.
[12] K Ruben Gabriel and Robert R Sokal. A new statistical approach to geographic
variation analysis. Systematic zoology, 18(3):259–278, 1969.
[13] Bernard A Galler and Michael J Fisher. An improved equivalence algorithm.
Communications of the ACM, 7(5):301–303, 1964.
[14] Michael R Garey, David S. Johnson, and Larry Stockmeyer. Some simpli-
fied np-complete graph problems. Theoretical computer science, 1(3):237–267,
1976.
[15] Barend Gehrels, Bruno Lalande, Mateusz Loskot, and A Wulkiewicz. Boost
geometry library, 2016.
Bibliography 107
[16] Sean Gillies, Aron Bierbaum, Kai Lautaportti, and O Tonnhofer. Shapely. GIS-
Python Lab, 2013.
[17] AD Gordon. A survey of constrained classification. Computational Statistics &
Data Analysis, 21(1):17–29, 1996.
[18] V Jarnık. About a certain minimal problem. Prace Moravske Prırodovedecke
Spolecnosti, 6:57–63, 1930.
[19] Marcelo Kallmann, Hanspeter Bieri, and Daniel Thalmann. Fully dynamic
constrained delaunay triangulations. In Geometric modeling for scientific visu-
alization, pages 241–257. Springer, 2004.
[20] Mohan Krishnamoorthy, Andreas T Ernst, and Yazid M Sharaiha. Comparison
of algorithms for the degree constrained minimum spanning tree. Journal of
heuristics, 7(6):587–611, 2001.
[21] Joseph B Kruskal. On the shortest spanning subtree of a graph and the trav-
eling salesman problem. Proceedings of the American Mathematical society, 7
(1):48–50, 1956.
[22] Der-Tsai Lee. Proximity and reachability in the plane. Technical report, ILLI-
NOIS UNIV AT URBANA-CHAMPAIGN COORDINATED SCIENCE LAB, 1978.
[23] Eric K Lee and Charles U Martel. When to use splay trees. Software: Practice
and Experience, 37(15):1559–1575, 2007.
[24] Sebastian Maneth and Fabian Peternek. A survey on methods and systems for
graph compression. arXiv preprint arXiv:1504.00616, 2015.
Bibliography 108
[25] Kurt Mehlhorn and Stefan Naher. Leda: a platform for combinatorial and
geometric computing. Communications of the ACM, 38(1):96–103, 1995.
[26] Subhash C Narula and Cesar A Ho. Degree-constrained minimum spanning
tree. Computers & Operations Research, 7(4):239–249, 1980.
[27] Lutz Prechelt. An empirical comparison of c, c++, java, perl, python, rexx
and tcl. IEEE Computer, 33(10):23–29, 2000.
[28] Martin Savelsbergh and Ton Volgenant. Edge exchanges in the degree-
constrained minimum spanning tree problem. Computers & Operations Re-
search, 12(4):341–348, 1985.
[29] Daniel D Sleator and Robert Endre Tarjan. A data structure for dynamic trees.
In Proceedings of the thirteenth annual ACM symposium on Theory of computing,
pages 114–122. ACM, 1981.
[30] SW Sloan. A fast algorithm for generating constrained delaunay triangula-
tions. Computers & Structures, 47(3):441–450, 1993.
[31] Vivid Solutions. Java topology suite, 2003.
[32] Tung-Hsin Su and Ruei-Chuan Chang. Computing the constrained relative
neighborhood graphs and constrained gabriel graphs in euclidean plane. Pat-
tern Recognition, 24(3):221–230, 1991.
[33] The CGAL Project. CGAL User and Reference Manual. CGAL Editorial Board,
4.11 edition, 2017. URL http://doc.cgal.org/4.11/Manual/packages.
html.
Bibliography 109
[34] Godfried T Toussaint. The relative neighbourhood graph of a finite planar set.
Pattern recognition, 12(4):261–268, 1980.
[35] Chee K Yap. An O(n logn) algorithm for the voronoi diagram of a set of simple
curve segments. Discrete & Computational Geometry, 2(1):365–393, 1987.