JORDAN BISERKOV - ClojuTRE Biserkov - My... · 2018. 9. 17. · Jordan Biserkov Programming...
My long path towards O(n) longest-path in 2-trees
JORDAN BISERKOV
ClojuTRE, Helsinki, Finland, September 14th 2018
Jordan Biserkov
➢Programming professionally since 2001
➢Found Lisp in 2005 via pg essays & books
➢Found Clojure on HN in 2010, fell in love
➢ Independent contractor for Cognitect since 2018
➢Biserkov.com
My epic journey in the 2-trees forests
➢End goal: implement the Big O(n) boss
➢ but first O(k) bosses in the Bottom-level
• First use of my superpower
➢The O(n√n) boss
• Side quest: Find 5 bugs in a 3rd party library
• The ancient Structural tree
➢The O(n log n) boss
• A wild stack overflow appears
➢The final fight
2-trees are NOT …
➢Binary trees
➢Even trees
2-trees are …
➢A class of undirected graphs
➢Used to model electric circuits
➢Recursively structured
2-tree recursive construction demo
Background
➢The 90’s algorithm to compute the length of the
longest path in a 2-tree has colossal hidden
constants and is “linear” only in a purely abstract sense
• Never implemented
➢ In 2013 Markov, Vassilev and Manev published a
novel algorithm
• Implemented as pseudo-code in the paper
➢Goal: Implement the MVM algorithm in O(n) time
Overview
➢Recursively split the 2-tree into sub-2-trees
• Only a few nodes change
• Perfect fit for Clojure’s persistent data structures
➢Boundary condition: leaf edges get the label [1 1 0 0 0 0 0]
➢Combine labels of subtrees to compute the parent tree's label
➢The first element of the label is the result: the length of the longest path
Code structure
Top level
• Compute-label
Middle level
• Combine-on-face
• Combine-on-edge
Bottom level – helper functions
• max-2-distinct
• max-3-distinct
$\max \{\, a_i + b_j \mid i \neq j \,\}$
(defn naive-max2DistinctFolios [a b k]
  ;; Try every index pair (i, j) with i ≠ j and keep the largest sum.
  (reduce max
          (for [i (range 0 k)
                j (range 0 k)
                :when (not= i j)]
            (+ (nth a i) (nth b j)))))
a and b are vectors with k elements each
Problem: 2 nested for-loops → O(k²) runtime
a = [1 2 3 4 5], b = [6 7 8 9 10]
  +    1   2   3   4   5
  6    ·   8   9  10  11
  7    8   ·  10  11  12
  8    9  10   ·  12  13
  9   10  11  12   ·  14
 10   11  12  13  14   ·
(· marks the excluded diagonal, where i = j)
Optimization: O(k)
➢ Iterate each vector separately, keeping track of:
• the maximum
• the second largest
• the index of the maximum
➢Check whether we can use both maxima (they sit at different
indices) and, if not, which alternative is larger:
(max (+ maxA secondB)
(+ maxB secondA))
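The single-pass idea above can be sketched as follows. This is a minimal illustration, not the talk's actual code: the names `top-2` and `max-2-distinct` and the use of `Long/MIN_VALUE` as a sentinel are assumptions, and the vectors are assumed to hold at least two integers each.

```clojure
(defn top-2
  "One pass over vector v. Returns [max-value index-of-max second-largest].
  Hypothetical helper; assumes v has at least two integer elements."
  [v]
  (reduce-kv
    (fn [[m mi s] i x]
      (cond
        (> x m) [x i m]      ; new maximum, old maximum becomes second
        (> x s) [m mi x]     ; new second largest
        :else   [m mi s]))
    [Long/MIN_VALUE -1 Long/MIN_VALUE]
    v))

(defn max-2-distinct
  "Largest a[i] + b[j] with i != j, in O(k)."
  [a b]
  {:pre [(= (count a) (count b)) (>= (count a) 2)]}
  (let [[max-a idx-a second-a] (top-2 a)
        [max-b idx-b second-b] (top-2 b)]
    (if (not= idx-a idx-b)
      (+ max-a max-b)                 ; both maxima are usable together
      (max (+ max-a second-b)         ; otherwise give up one of them
           (+ max-b second-a)))))

(max-2-distinct [1 2 3 4 5] [6 7 8 9 10]) ; => 14
```

For the example vectors both maxima sit at the last index, so the result is the larger of 5 + 9 and 10 + 4, which matches the largest off-diagonal entry of the table.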
$\max \{\, a_i + b_j + c_t \mid i \neq j,\ j \neq t,\ t \neq i \,\}$
a, b and c are vectors with k elements
Problem: 3 nested for-loops → O(k³) runtime
Optimization: O(k)
➢ Iterate each vector separately, keeping track of:
• the maximum
• the second largest
• the third largest
• the indices of the maximum and the second largest
➢Check which of the 36 combos are valid and which
sum is the largest
➢Terrible complexity, many bugs
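One way to keep the three-vector case linear is sketched below. It is a simplified variant, not the version described on the slide: it keeps the three largest (value, index) pairs of each vector and then tries the at most 27 candidate triples, relying on the observation that the other two chosen indices can block at most two of a vector's top three entries. The names `top-3` and `max-3-distinct` are hypothetical, and each vector is assumed to have at least three elements.

```clojure
(defn top-3
  "One pass over v, keeping the three largest [value index] pairs.
  Each step sorts at most four pairs, so the pass is O(k) overall."
  [v]
  (reduce-kv
    (fn [best i x]
      (vec (take 3 (sort-by first > (conj best [x i])))))
    []
    v))

(defn max-3-distinct
  "Largest a[i] + b[j] + c[t] with i, j, t pairwise distinct."
  [a b c]
  {:pre [(= (count a) (count b) (count c)) (>= (count a) 3)]}
  (reduce max
          (for [[x i] (top-3 a)
                [y j] (top-3 b)
                [z t] (top-3 c)
                :when (distinct? i j t)]   ; drop triples that reuse an index
            (+ x y z))))

(max-3-distinct [1 2 3 4 5] [6 7 8 9 10] [0 0 0 0 7]) ; => 19
```

In the example the best triple takes 7 from the last index of the third vector, which forces the first two vectors onto earlier indices (4 + 8 or 3 + 9).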
Generative testing to the rescue
➢Also called property-based testing
➢Finds complex bugs immediately
➢Difficult to come up with a useful property
➢Shrinks input to minimal case which triggers the
bug, in this case often vectors with 0 and 1
➢Use (= (naïve …)
(faster …)) as testing property
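The oracle-comparison property can be illustrated with a hand-rolled randomized check. This sketch is a simplified stand-in for a property-based testing library such as test.check: it uses only `java.util.Random`, does no shrinking, and all names (`naive-max-2`, `buggy-max-2`, `find-counterexample`) are hypothetical. The "faster" candidate here is deliberately wrong, to show how quickly a mismatch surfaces.

```clojure
(defn naive-max-2
  "Trusted but slow oracle: every pair (i, j) with i != j."
  [a b]
  (reduce max
          (for [i (range (count a))
                j (range (count b))
                :when (not= i j)]
            (+ (nth a i) (nth b j)))))

(defn buggy-max-2
  "Deliberately wrong candidate: forgets that the indices must differ."
  [a b]
  (+ (reduce max a) (reduce max b)))

(defn rand-vec
  "Vector of k random integers in [0, 10)."
  [^java.util.Random rng k]
  (vec (repeatedly k #(.nextInt rng 10))))

(defn find-counterexample
  "Runs up to `trials` random inputs; returns the first input on which
  oracle and candidate disagree, or nil if they always agree."
  [oracle candidate trials seed]
  (let [rng (java.util.Random. seed)]
    (first
      (for [_ (range trials)
            :let [k (+ 2 (.nextInt rng 4))   ; vector length between 2 and 5
                  a (rand-vec rng k)
                  b (rand-vec rng k)]
            :when (not= (oracle a b) (candidate a b))]
        {:a a :b b}))))

;; Returns a map such as {:a [...] :b [...]} for the buggy candidate,
;; and nil when the candidate agrees with the oracle on every trial.
(find-counterexample naive-max-2 buggy-max-2 1000 42)
```

A real property-based library would additionally shrink the failing input toward a minimal case, which is what produces the small vectors of 0 and 1 mentioned above.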
Previous implementation
➢ Java
➢ 2-tree represented as a matrix
➢Sub-2-tree = submatrix = tons of copying
➢O(n²) runtime
➢O(n²) memory usage
My first implementation
➢Clojure
➢ as close to the paper as possible
➢ 2-tree represented as map from int to set of int
➢O(n√n) runtime
➢Perhaps Clojure’s dynamic typing is the problem?
Optimization: use Zach Tellman’s int-map and int-set
{0 #{1 2 3 4}
1 #{0 2}
2 #{0 1 3 4}
3 #{0 2}
4 #{0 2}}
Runtime is faster, but complexity still O(n√n)
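The adjacency map shown above can be produced by the recursive 2-tree construction: start from a single edge and repeatedly attach a new vertex to both endpoints of an existing edge. The sketch below uses plain Clojure maps and sets rather than int-map and int-set, and the function name `add-vertex` and the insertion order are assumptions made for illustration.

```clojure
(defn add-vertex
  "Attach new vertex v to both endpoints of the existing edge [a b].
  Returns a new graph; the old one is left untouched (persistent data)."
  [graph v [a b]]
  {:pre [(contains? (get graph a) b)]}   ; [a b] must already be an edge
  (-> graph
      (update a conj v)
      (update b conj v)
      (assoc v #{a b})))

(def example-2-tree
  ;; Start from the single edge 0-1, then add vertices 2, 3 and 4.
  (reduce (fn [g [v edge]] (add-vertex g v edge))
          {0 #{1}, 1 #{0}}
          [[2 [0 1]]
           [3 [0 2]]
           [4 [0 2]]]))

example-2-tree
;; => {0 #{1 2 3 4}, 1 #{0 2}, 2 #{0 1 3 4}, 3 #{0 2}, 4 #{0 2}}
```

Each step touches only three entries of the map, which is the "only a few nodes change" property that makes persistent data structures a good fit.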
Sidequest: find 5 bugs in 3rd-party library
➢The problem manifests as a NullPointerException
➢Cursive’s debugger is awesome
• Breakpoint on exception
➢Zach Tellman is a great guy, fixed bug quickly
➢Problem has evolved: infinite looping in subgraph-
walk during multiple-recursion?!? How? Why?
➢ 5 times in a row, same-day bug delivery, what
sorcery is this?
The root cause of the slowdown?
➢Splitting into sub-2-trees
➢Persistent data structures are fast enough,
actual updates are not the problem
➢Computing which vertices need updating is the
problem
➢The authors told me to seek the ancient Structural
tree
Representation: map from edge to [vertices]
{[0 1] [2]
[0 2] [3 4 10]
[1 2] [5]
[1 5] [8 9]
[2 5] [6]
[5 6] [7]}
External edge nodes represented implicitly as nil
Blue nodes represented implicitly: parent edge + vertex
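A map of this shape can be built in one iterative pass over the construction steps. The sketch below is an assumption about how such a preprocessing step might look, not the talk's code: the input format (pairs of new vertex and the edge it is attached to), the name `structural-tree`, and the convention that edges are stored as sorted pairs are all hypothetical.

```clojure
(defn structural-tree
  "Builds a map from edge to the vector of vertices attached to it.
  `additions` is a sequence of [vertex [u w]] pairs with u < w."
  [additions]
  (reduce (fn [m [v edge]]
            (update m edge (fnil conj []) v))   ; start a vector on first use
          {}
          additions))

(def example-tree
  (structural-tree
    [[2 [0 1]]
     [3 [0 2]] [4 [0 2]] [10 [0 2]]
     [5 [1 2]]
     [8 [1 5]] [9 [1 5]]
     [6 [2 5]]
     [7 [5 6]]]))

(get example-tree [0 2]) ; => [3 4 10]
(get example-tree [0 3]) ; => nil, an external edge with nothing attached
```

External edges never appear as keys, so looking one up simply yields nil, which is the implicit representation described above.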
My second implementation
➢ Iterative preprocessing step: builds structural tree
➢Recursive part operates on structural tree
➢O(n log n) runtime
➢More complex, unexplored territory
➢Generative testing saves the day again
➢Best of both implementations
• Straightforward and correct, but slow one
• Complex and unproven, but faster one
Suddenly a wild stack overflow appears
➢But how?
➢ Infinite recursion?
➢Another bug?
➢No, all tests pass. What?
➢A genuine stack overflow due to one benchmark
using ultra-tall 2-trees
Workaround?
➢ Increase the call stack size via JVM options, but the problem
reappears when you double N a few times
Solution: every recursive algorithm can be made iterative
by using an explicit stack parameter instead of the call stack
Then it hit me: there is a data structure in my program that
already holds all the information needed, the EdgeVertices map.
With some modifications, the recursive calls can be removed
completely and all the work can be done during the
preprocessing (bottom-up) phase
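The explicit-stack technique can be sketched on a much simpler task than label computation: counting the vertices attached below a root edge in an edge-to-vertices map. This is an illustration only; `child-edges` and `count-attached` are hypothetical names, and storing edges as sorted pairs is an assumption.

```clojure
(defn child-edges
  "Attaching v to edge [a b] creates the two edges [a v] and [b v]."
  [[a b] v]
  [(vec (sort [a v])) (vec (sort [b v]))])

(defn count-attached
  "Counts vertices attached at or below `root`, using loop/recur and an
  explicit stack of pending edges instead of the JVM call stack."
  [edge->vertices root]
  (loop [stack [root]
         n     0]
    (if (empty? stack)
      n
      (let [edge (peek stack)
            vs   (get edge->vertices edge)]      ; nil for external edges
        (recur (into (pop stack) (mapcat #(child-edges edge %) vs))
               (+ n (count vs)))))))

;; An ultra-tall 2-tree: vertex i+2 is attached to edge [i, i+1].
(def tall
  (into {} (map (fn [i] [[i (inc i)] [(+ i 2)]]) (range 100000))))

(count-attached tall [0 1]) ; => 100000
```

The pending work lives in a heap-allocated vector, so the depth of the 2-tree no longer limits the input size.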
My third implementation
➢ Iterative, dynamic programming, no recursive part
➢O(n) runtime!!
➢Millions of vertices without overflow
➢Map from edge to vector of labels
➢Generative testing saves the day yet again
The result
[Chart: running time in seconds (0 to 180) against number of vertices (0 to 7E+6), with three series: projected O(n log n), projected O(n), and actual time]
Benchmarks via Criterium by Hugo Duncan
Implementations recap
Type      Direction   Data structure      Complexity
Java      Recursive   Matrix              O(n²)
Direct    Recursive   int-map, int-set    O(n√n)
Indirect  Iterative   int-map, int-set    O(n log n)
          Recursive   EdgeVertices map
Dynamic   Iterative   int-map, int-set    O(n)
                      EdgeLabels map
Transient variants of persistent data structures
➢ If the original value is never used after modification,
it’s safe to modify it in place, while still presenting an
immutable interface to the outside world
➢They add complexity, so make your program work without
them first, then add:
• a call to transient in the beginning
• ! to assoc, dissoc, conj and friends
• a call to persistent! at the end
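The three steps can be seen side by side in a small sketch that counts vertex degrees from an edge list. The function names and the example edge list are hypothetical; only `transient`, `assoc!`, and `persistent!` are the built-in functions the slide refers to.

```clojure
(defn degrees-persistent
  "Plain version: every update returns a new immutable map."
  [edges]
  (reduce (fn [m [u w]]
            (-> m
                (update u (fnil inc 0))
                (update w (fnil inc 0))))
          {}
          edges))

(defn degrees-transient
  "Same result with a transient in the middle. Always continue with the
  value returned by assoc!; never reuse the old transient."
  [edges]
  (persistent!                                   ; 3. freeze at the end
    (reduce (fn [m [u w]]
              (let [m (assoc! m u (inc (get m u 0)))]   ; 2. bang variants
                (assoc! m w (inc (get m w 0)))))
            (transient {})                       ; 1. transient at the start
            edges)))

(def edges [[0 1] [0 2] [1 2] [0 3] [2 3] [0 4] [2 4]])

(degrees-transient edges) ; => {0 4, 1 2, 2 4, 3 2, 4 2}
```

Callers see an ordinary immutable map in both cases, so the two versions can be checked against each other before switching.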
Further optimization of middle level functions
➢Higher level decision making – 2 simpler, faster
functions instead of 1 complex, mathematically pure one
➢Proper case simplified greatly, removed branching
➢Degenerate cases handled by specialized variant
• Simplified greatly, removed branching
• When a = 1 the expression (+ a b) becomes (inc b)
• When c = 0 the expression (max c d) becomes d
➢Frequent trivial case handled directly
• No function call cost, no unnecessary computation
Memoization
➢The function remembers the result for given
parameters to avoid costly recomputation
➢Useful whenever a big problem is divided into
smaller ones
➢The built-in memoize returns a variable-arity
function, which adds overhead.
➢ If we know the number of arguments, we can build
our own version which is simpler and faster
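A fixed-arity memoizer might look like the sketch below. It is an assumption about the general shape, not the talk's implementation: `memoize-2` is a hypothetical name, it handles exactly two arguments, and whether it is measurably faster than the built-in `memoize` depends on the workload.

```clojure
(defn memoize-2
  "Memoizes a two-argument function f. The cache is keyed on [a b];
  `find` is used so that cached nil results are also recognised."
  [f]
  (let [cache (atom {})]
    (fn [a b]
      (if-let [entry (find @cache [a b])]
        (val entry)
        (let [result (f a b)]
          (swap! cache assoc [a b] result)
          result)))))

;; Usage: the wrapped function runs only once per distinct argument pair.
(def call-count (atom 0))

(def slow-add
  (fn [a b]
    (swap! call-count inc)
    (+ a b)))

(def fast-add (memoize-2 slow-add))

(fast-add 1 2) ; => 3, computed
(fast-add 1 2) ; => 3, served from the cache
```

The fixed argument list avoids packing arguments into a sequence and calling `apply`, which is where the overhead of a variable-arity wrapper comes from.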
Resources
➢The algorithm
https://sites.google.com/site/minkommarkov/longest-2-tree--draft.pdf
➢My implementations
https://github.com/Biserkov/twotree-longest-path
➢Understanding Clojure’s transients
http://www.hypirion.com/musings/understanding-clojure-transients
Thank you! Questions?