STRIVING FOR ULTIMATE LOW LATENCY · 2017-11-22 · Latency is the time required to perform some...
Mateusz Pusz November 15, 2017
STRIVING FOR ULTIMATE LOW LATENCY
INTRODUCTION TO DEVELOPMENT OF LOW LATENCY SYSTEMS
code::dive 2017 | Striving for ultimate Low Latency
LATENCY VS THROUGHPUT
Latency is the time required to perform some action or to produce some result. Measured in units of time like hours, minutes, seconds, nanoseconds or clock periods.
Throughput is the number of such actions executed or results produced per unit of time. Measured in units of whatever is being produced per unit of time.
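To make the two definitions concrete, here is a minimal sketch that times a batch of calls with `std::chrono::steady_clock` and derives both figures. The names `stats`, `derive` and `measure` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical result type for this illustration.
struct stats {
  double avg_latency_ns;    // time per action
  double throughput_per_s;  // actions per second
};

// Derives both figures from the total time spent on `count` actions.
stats derive(std::chrono::nanoseconds total, std::size_t count)
{
  assert(count > 0);
  const double total_ns = static_cast<double>(total.count());
  return { total_ns / count, count * 1e9 / total_ns };
}

// Runs `action` `count` times back to back and measures the whole batch.
template<class F>
stats measure(F&& action, std::size_t count)
{
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < count; ++i) action();
  const auto stop = std::chrono::steady_clock::now();
  return derive(std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start), count);
}
```

In this strictly sequential sketch throughput is simply the inverse of the average latency. The two figures only diverge once work is pipelined or executed in parallel.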
WHAT DO WE MEAN BY LOW LATENCY?
Low Latency allows human-unnoticeable delays between an input being processed and the corresponding output providing real time characteristics.
Especially important for internet connections utilizing services such as trading, online gaming and VoIP.
WHY DO WE STRIVE FOR LOW LATENCY?
• In VoIP substantial delays between input from conversation participants may impair their communication
• In online gaming a player with a high latency internet connection may show slow responses in spite of superior tactics or appropriate reaction time
• Within capital markets the proliferation of algorithmic trading requires firms to react to market events faster than the competition to increase profitability of trades
HIGH-FREQUENCY TRADING (HFT)
A program trading platform that uses powerful computers to transact a large number of orders at very fast speeds
-- Investopedia
• Using complex algorithms to analyze multiple markets and execute orders based on market conditions
• Buying and selling of securities many times over a period of time (often hundreds of times an hour)
• Done to profit from time-sensitive opportunities that arise during trading hours
• Implies high turnover of capital (i.e. one's entire capital or more in a single day)
• Typically, the traders with the fastest execution speeds are more profitable
MARKET DATA PROCESSING
HOW FAST DO WE DO?
ALL SOFTWARE APPROACH: 1-10us
ALL HARDWARE APPROACH: 100-1000ns
• Average human eye blink takes 350 000us (1/3s)
• Millions of orders can be traded in that time
WHAT IF SOMETHING GOES WRONG?
KNIGHT CAPITAL
• In 2012 was the largest trader in U.S. equities
• Market share
  – 17.3% on NYSE
  – 16.9% on NASDAQ
• Had approximately $365 million in cash and equivalents
• Average daily trading volume
  – 3.3 billion trades
  – trading over 21 billion dollars
• Pre-tax loss of $440 million in 45 minutes
C++ OFTEN NOT THE MOST IMPORTANT PART OF THE SYSTEM
• Low Latency network
• Modern hardware
• BIOS profiling
• Kernel profiling
• OS profiling
SPIN, PIN, AND DROP-IN
SPIN
• Don't sleep
• Don't context switch
• Prefer single-threaded scheduling
• Disable locking and thread support
• Disable power management
• Disable C-states
• Disable interrupt coalescing
PIN
• Assign CPU affinity
• Assign interrupt affinity
• Assign memory to NUMA nodes
• Consider the physical location of NICs
• Isolate cores from general OS use
• Use a system with a single physical CPU
DROP-IN
• Choose NIC vendors based on performance and availability of drop-in kernel bypass libraries
• Use the kernel bypass library
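A minimal sketch of the "spin" and "pin" ideas, assuming Linux with glibc: a worker thread is pinned to one core with `pthread_setaffinity_np` and then busy-waits on an atomic flag instead of sleeping. The helper names (`pin_current_thread`, `spin_until`, `spin_pin_demo`) are hypothetical. Interrupt affinity, NUMA placement and core isolation are OS configuration and are not shown.

```cpp
#include <atomic>
#include <pthread.h>  // pthread_setaffinity_np (Linux/glibc extension)
#include <sched.h>    // cpu_set_t, CPU_ZERO, CPU_SET
#include <thread>

// PIN: restricts the calling thread to a single core.
// Returns false if the OS refuses (e.g. the core is not available to us).
bool pin_current_thread(int core)
{
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(core, &set);
  return pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0;
}

// SPIN: busy-waits until `ready` becomes true; never sleeps or yields.
void spin_until(const std::atomic<bool>& ready)
{
  while (!ready.load(std::memory_order_acquire)) {
    // intentionally empty: burn the core instead of blocking
  }
}

// Hypothetical demo: a pinned worker spins until a value is published.
int spin_pin_demo(int core)
{
  std::atomic<bool> ready{false};
  int payload = 0;
  int seen = -1;
  std::thread worker([&] {
    pin_current_thread(core);  // best effort in this sketch
    spin_until(ready);
    seen = payload;            // visible thanks to release/acquire ordering
  });
  payload = 42;
  ready.store(true, std::memory_order_release);
  worker.join();
  return seen;
}
```

Spinning trades a fully occupied core for the absence of wake-up latency, which is why it is usually combined with pinning and with cores isolated from general OS use.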
LET'S FOCUS ON THE SOFTWARE
CHARACTERISTICS OF LOW LATENCY SOFTWARE
• Typically only a small part of code is really important (fast path)
• That code is not executed often
• When it is executed it has to
  – start and finish as soon as possible
  – have predictable and reproducible performance (low jitter)
• Multithreading increases latency
  – it is about low latency and not throughput
  – concurrency (even on different cores) trashes CPU caches above L1, shares memory bus, shares IO, shares network
• Mistakes are really costly
  – good error checking and recovery is mandatory
  – one second is 4 billion CPU instructions (a lot can happen in that time)
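"Predictable and reproducible performance" is usually judged by the tail of the latency distribution rather than by the average. The following sketch, with hypothetical names `jitter_report`, `percentile` and `summarize`, reduces a set of latency samples to minimum, median, 99th percentile and maximum using the nearest-rank method.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical summary of fast path latency samples (nanoseconds).
struct jitter_report {
  std::int64_t min;
  std::int64_t median;
  std::int64_t p99;
  std::int64_t max;
};

// Nearest-rank percentile of an already sorted, non-empty sample set.
std::int64_t percentile(const std::vector<std::int64_t>& sorted, unsigned p)
{
  assert(!sorted.empty() && p >= 1 && p <= 100);
  const std::size_t rank = (p * sorted.size() + 99) / 100;  // ceil(p * n / 100)
  return sorted[rank - 1];
}

// Sorts a copy of the samples and extracts the figures of interest.
jitter_report summarize(std::vector<std::int64_t> samples_ns)
{
  assert(!samples_ns.empty());
  std::sort(samples_ns.begin(), samples_ns.end());
  return { samples_ns.front(), percentile(samples_ns, 50),
           percentile(samples_ns, 99), samples_ns.back() };
}
```

A single outlier barely moves the median but dominates the 99th percentile and the maximum, which is exactly the behaviour a low jitter requirement is meant to expose.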
HOW TO DEVELOP SOFTWARE THAT HAS PREDICTABLE PERFORMANCE?
It turns out that the more important question here is...
HOW NOT TO DEVELOP SOFTWARE THAT HAS PREDICTABLE PERFORMANCE?
• In a Low Latency system we care a lot about WCET (Worst Case Execution Time)
• In order to limit WCET we should limit the usage of specific C++ language features
• This is a task not only for developers but also for code architects
THINGS TO AVOID ON THE FAST PATH
1. C++ tools that trade performance for usability (e.g. std::shared_ptr<T>, std::function<>)
2. Throwing exceptions on likely code path
3. Dynamic polymorphism
4. Multiple inheritance
5. RTTI
6. Dynamic memory allocations
std::shared_ptr<T>
template<class T> class shared_ptr;
• Smart pointer that retains shared ownership of an object through a pointer
• Several shared_ptr objects may own the same object
• The shared object is destroyed and its memory deallocated when the last remaining shared_ptr owning that object is either destroyed or assigned another pointer via operator= or reset()
• Supports user provided deleter
Too often overused by C++ programmers
QUESTION: WHAT IS THE DIFFERENCE HERE?
// Version 1: exclusive ownership
void foo()
{
  std::unique_ptr<int> ptr{new int{1}};
  // some code using 'ptr'
}

// Version 2: shared ownership
void foo()
{
  std::shared_ptr<int> ptr{new int{1}};
  // some code using 'ptr'
}
KEY std::shared_ptr<T> ISSUES
• Shared state
  – performance + memory footprint
• Mandatory synchronization
  – performance
• Type Erasure
  – performance
• std::weak_ptr<T> support
  – memory footprint
• Aliasing constructor
  – memory footprint
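The shared state cost can be made visible with a small experiment. The sketch below is an illustration under assumptions, not a benchmark: `counting_allocator` and `count_control_block_allocations` are hypothetical names, and the allocator is passed to the `std::shared_ptr` constructor that accepts a pointer, a deleter and an allocator, so that the extra allocation of the control block can be counted.

```cpp
#include <cstddef>
#include <memory>

// Counts allocations made for shared_ptr control blocks in this sketch.
int control_block_allocations = 0;

// Minimal allocator (hypothetical) that counts calls to allocate().
template<class T>
struct counting_allocator {
  using value_type = T;
  counting_allocator() = default;
  template<class U>
  counting_allocator(const counting_allocator<U>&) noexcept {}
  T* allocate(std::size_t n)
  {
    ++control_block_allocations;
    return std::allocator<T>{}.allocate(n);
  }
  void deallocate(T* p, std::size_t n) noexcept { std::allocator<T>{}.deallocate(p, n); }
};

template<class T, class U>
bool operator==(const counting_allocator<T>&, const counting_allocator<U>&) noexcept { return true; }
template<class T, class U>
bool operator!=(const counting_allocator<T>&, const counting_allocator<U>&) noexcept { return false; }

// Creates one shared_ptr from a raw pointer and reports how many additional
// allocations were needed for its shared state (the control block).
int count_control_block_allocations()
{
  control_block_allocations = 0;
  {
    std::shared_ptr<int> ptr(new int{1}, std::default_delete<int>{}, counting_allocator<int>{});
  }
  return control_block_allocations;
}

// On mainstream standard libraries a unique_ptr with the default deleter is
// the size of a raw pointer, while a shared_ptr typically holds two pointers
// (object + control block). The standard does not guarantee these sizes.
constexpr std::size_t unique_size = sizeof(std::unique_ptr<int>);
constexpr std::size_t shared_size = sizeof(std::shared_ptr<int>);
```

So `std::shared_ptr<int> ptr{new int{1}}` pays for a second dynamic allocation in addition to `new int{1}`, whereas the `std::unique_ptr` version pays only for the object itself.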
MORE INFO ON CODE::DIVE 2016
C++ EXCEPTIONS
• Code generated by nearly all C++ compilers does not introduce significant runtime overhead for C++ exceptions
• ... if they are not thrown
• Throwing an exception can take significant and non-deterministic time
• Advantages of C++ exceptions usage
  – (if not thrown) actually can improve application performance
  – cannot be ignored!
  – simplify interfaces
  – make source code of likely path easier to reason about
Not using C++ exceptions is not an excuse to write exception-unsafe code!
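One way to read the advice above, sketched under assumptions: outcomes that are expected on the fast path are reported through the return value, while exceptions stay reserved for rare failures outside of it. The types `order` and `order_status` and both functions are hypothetical examples, not part of the talk.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical domain types for this illustration.
enum class order_status { accepted, rejected_price, rejected_quantity };

struct order {
  std::int64_t price;
  std::int64_t quantity;
};

// Fast path: a rejection is an expected, frequent outcome, so it is
// reported through the return value and nothing is ever thrown.
order_status validate(const order& o) noexcept
{
  if (o.price <= 0) return order_status::rejected_price;
  if (o.quantity <= 0) return order_status::rejected_quantity;
  return order_status::accepted;
}

// Slow path (e.g. startup configuration): a failure is rare and the caller
// cannot sensibly continue, so an exception is a reasonable report.
std::int64_t checked_max_quantity(std::int64_t configured)
{
  if (configured <= 0) throw std::invalid_argument("max quantity must be positive");
  return configured;
}
```

The fast path keeps a deterministic cost for every outcome, and the throwing function is only called where non-deterministic unwinding time does not matter.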
EXCEPTION SAFETY GUARANTEES
1. Nothrow (or nofail) exception guarantee
2. Strong exception guarantee
3. Basic exception guarantee
4. No exception guarantee
Only in case of "No exception guarantee" if the function throws an exception, the program may not be in a valid state: resource leaks, memory corruption, or other invariant-destroying errors may have occurred.
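The difference between the first three guarantees can be shown on one small class. This is a sketch with a hypothetical `price_levels` type. The strong variant works on a copy and commits with a swap, which costs an allocation and is therefore shown only to illustrate the guarantee levels, not as a fast path pattern.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical container used only to contrast the guarantee levels.
class price_levels {
  std::vector<int> levels_;
public:
  // Nothrow guarantee: this function cannot fail.
  std::size_t size() const noexcept { return levels_.size(); }

  // Strong guarantee: either every price is appended or *this is unchanged.
  void append_all(const std::vector<int>& prices)
  {
    std::vector<int> tmp = levels_;  // may throw, *this is untouched
    for (int p : prices) {
      if (p <= 0) throw std::invalid_argument("non-positive price");
      tmp.push_back(p);              // may throw, only tmp is affected
    }
    levels_.swap(tmp);               // commit step, does not throw
  }

  // Basic guarantee: on failure the object stays valid and nothing leaks,
  // but some of the prices may already have been appended.
  void append_all_basic(const std::vector<int>& prices)
  {
    for (int p : prices) {
      if (p <= 0) throw std::invalid_argument("non-positive price");
      levels_.push_back(p);
    }
  }
};
```

After a failed `append_all` the object looks exactly as before the call, whereas after a failed `append_all_basic` it is valid but partially modified.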
POLYMORPHISM
DYNAMIC
class base {
  virtual void setup() = 0;
  virtual void run() = 0;
  virtual void cleanup() = 0;
public:
  virtual ~base() = default;
  void process()
  {
    setup();
    run();
    cleanup();
  }
};

class derived : public base {
  void setup() override { /* ... */ }
  void run() override { /* ... */ }
  void cleanup() override { /* ... */ }
};
• Additional pointer stored in an object
• Extra indirection (pointer dereference)
• Often not possible to devirtualize
• Not inlined
• Instruction cache miss
STATIC
template<class Derived>
class base {
public:
  void process()
  {
    static_cast<Derived*>(this)->setup();
    static_cast<Derived*>(this)->run();
    static_cast<Derived*>(this)->cleanup();
  }
};

class derived : public base<derived> {
  friend class base<derived>;
  void setup() { /* ... */ }
  void run() { /* ... */ }
  void cleanup() { /* ... */ }
};
MULTIPLE INHERITANCE
• this pointer adjustments needed to call a member function (for non-empty base classes)
• Virtual inheritance as an answer
• virtual in C++ means "determined at runtime"
• Extra indirection to access data members
Always prefer composition over inheritance!
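The this pointer adjustment mentioned above can be observed directly. The sketch below is only an illustration: `engine`, `radio`, `car_mi`, `car_comp` and `base_offset()` are hypothetical names, and the exact offsets depend on the ABI.

```cpp
#include <cassert>
#include <cstddef>

struct engine { int rpm = 0; };
struct radio  { int volume = 0; };

// Multiple inheritance: two non-empty base subobjects cannot share one address,
// so converting car_mi* to one of the base pointers has to add an offset.
struct car_mi : engine, radio {};

// Composition: members are reached by name, no base-pointer conversion is involved.
struct car_comp {
  engine eng;
  radio  rad;
};

// Byte distance between the complete object and its Base subobject.
template<typename Base>
std::ptrdiff_t base_offset(car_mi& c)
{
  const char* whole = reinterpret_cast<const char*>(&c);
  const char* sub = reinterpret_cast<const char*>(static_cast<Base*>(&c));
  return sub - whole;
}
```

For plain (non-virtual) multiple inheritance the offset is a compile-time constant, so the adjustment is cheap; with virtual inheritance the offset generally has to be looked up at runtime.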
MULTIPLE INHERITANCE
DIAMOND OF DREAD
RTTI is often the sign of a smelly design
class base {
public:
  virtual ~base() = default;
  virtual void foo() = 0;
};

class derived : public base {
public:
  void foo() override;
  void boo();
};

void foo(base& b)
{
  derived* d = dynamic_cast<derived*>(&b);
  if(d) {
    d->boo();
  }
}
• Traversing an inheritance tree
• Comparisons

void foo(base& b)
{
  if(typeid(b) == typeid(derived)) {
    derived* d = static_cast<derived*>(&b);
    d->boo();
  }
}
• Only one comparison of std::type_info
• Often only one runtime pointer compare
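The two variants are not interchangeable, which is worth keeping in mind before swapping one for the other. A small sketch, assuming a hypothetical `more_derived` class that is not on the slides: `dynamic_cast` accepts any type derived from `derived`, while the `typeid` comparison accepts only an exact match.

```cpp
#include <cassert>
#include <typeinfo>

class base {
public:
  virtual ~base() = default;
  virtual void foo() = 0;
};

class derived : public base {
public:
  void foo() override {}
};

// Hypothetical further-derived class, introduced only for this comparison.
class more_derived : public derived {};

// Accepts `derived` and everything that inherits from it.
inline bool matches_with_dynamic_cast(base& b)
{
  return dynamic_cast<derived*>(&b) != nullptr;
}

// Accepts only objects whose dynamic type is exactly `derived`.
inline bool matches_with_typeid(base& b)
{
  return typeid(b) == typeid(derived);
}
```

The cheaper `typeid` check is therefore a fit mainly when the hierarchy is closed and the exact type is what the code needs.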
RUNTIME TYPE IDENTIFICATION (RTTI)
• General purpose operation
• Nondeterministic execution performance
• Causes memory fragmentation
• Memory leaks possible if not properly handled
• May fail (error handling is needed)
DYNAMIC MEMORY ALLOCATIONS
• Address specific needs (functionality and hardware constraints)
• Typically low number of dynamic memory allocations
• Data structures needed to manage big chunks of memory
template<typename T>
struct pool_allocator {
  using value_type = T;  // required by the Allocator requirements
  T* allocate(std::size_t n);
  void deallocate(T* p, std::size_t n);
};
using pool_string = std::basic_string<char, std::char_traits<char>, pool_allocator<char>>;
CUSTOM ALLOCATORS TO THE RESCUE
Preallocation makes allocator jitter more stable, helps keep related data together, and avoids long-term fragmentation.
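One possible shape of such an allocator is sketched below, under stated assumptions: the `arena` class, its 4096-byte size and the bump-pointer strategy are illustrative choices and not the speaker's implementation. Memory is preallocated once and individual deallocations are no-ops.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical fixed-size arena: preallocated once, handed out by bumping an
// offset, and released as a whole when the arena itself goes away.
class arena {
  alignas(std::max_align_t) char buffer_[4096];
  std::size_t used_ = 0;
public:
  void* allocate(std::size_t bytes, std::size_t align)
  {
    const std::size_t start = (used_ + align - 1) / align * align;  // round up
    if(start + bytes > sizeof(buffer_)) throw std::bad_alloc{};
    used_ = start + bytes;
    return buffer_ + start;
  }
  std::size_t used() const { return used_; }
};

// Minimal allocator in the sense of the C++17 Allocator requirements.
template<typename T>
struct pool_allocator {
  using value_type = T;

  arena* arena_;

  explicit pool_allocator(arena& a) noexcept : arena_(&a) {}
  template<typename U>
  pool_allocator(const pool_allocator<U>& other) noexcept : arena_(other.arena_) {}

  T* allocate(std::size_t n)
  {
    return static_cast<T*>(arena_->allocate(n * sizeof(T), alignof(T)));
  }
  void deallocate(T*, std::size_t) noexcept {}  // reclaimed together with the arena

  template<typename U>
  bool operator==(const pool_allocator<U>& rhs) const noexcept { return arena_ == rhs.arena_; }
  template<typename U>
  bool operator!=(const pool_allocator<U>& rhs) const noexcept { return arena_ != rhs.arena_; }
};

using pool_string = std::basic_string<char, std::char_traits<char>, pool_allocator<char>>;
```

A string is then created with an explicit allocator, for example `pool_string s(64, 'x', pool_allocator<char>(a));`, and the arena has to outlive every string that uses it.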
Prevent dynamic memory allocation for the (common) case of dealing with small objects
class sso_string {
  char* data_ = u_.sso_;
  size_t size_ = 0;
  union {
    char sso_[16] = "";
    size_t capacity_;
  } u_;
public:
  size_t capacity() const
  {
    return data_ == u_.sso_ ? sizeof(u_.sso_) - 1 : u_.capacity_;
  }
  // ...
};
SMALL OBJECT OPTIMIZATION (SOO / SSO / SBO)
template<std::size_t MaxSize>
class inplace_string {
  std::array<value_type, MaxSize + 1> chars_;
public:
  // string-like interface
};

struct db_contact {
  inplace_string<7> symbol;
  inplace_string<15> name;
  inplace_string<15> surname;
  inplace_string<23> company;
};
NO DYNAMIC ALLOCATION
No dynamic memory allocations or pointer indirections, guaranteed, at the cost of possibly higher memory usage
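A minimal sketch of what the elided string-like interface could look like, assuming `char` as the value type and C++17. The `assign()`, `view()` and `capacity()` members and the choice to throw `std::length_error` on overflow are assumptions made for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string_view>
#include <type_traits>

// Fixed-capacity string that keeps its characters inside the object itself.
template<std::size_t MaxSize>
class inplace_string {
  std::array<char, MaxSize + 1> chars_{};  // +1 for the terminating '\0'
  std::size_t size_ = 0;
public:
  inplace_string() = default;
  inplace_string(std::string_view text) { assign(text); }

  void assign(std::string_view text)
  {
    if(text.size() > MaxSize) throw std::length_error("inplace_string capacity exceeded");
    std::copy(text.begin(), text.end(), chars_.begin());
    chars_[text.size()] = '\0';
    size_ = text.size();
  }

  std::size_t size() const noexcept { return size_; }
  static constexpr std::size_t capacity() noexcept { return MaxSize; }
  const char* c_str() const noexcept { return chars_.data(); }
  std::string_view view() const noexcept { return {chars_.data(), size_}; }
};

struct db_contact {
  inplace_string<7> symbol;
  inplace_string<15> name;
  inplace_string<15> surname;
  inplace_string<23> company;
};

// With no owned heap memory, the whole record can be copied byte by byte.
static_assert(std::is_trivially_copyable_v<db_contact>);
```

The capacities become part of the type, so every record pays for the longest allowed string even when the actual content is short.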
1 Use tools that improve efficiency without sacrificing performance
2 Use compile time wherever possible
3 Know your hardware
4 Clearly isolate cold code from the fast path
THINGS TO DO ON THE FAST PATH
• static_assert()
• Automatic type deduction
• Type aliases
• Move semantics
• noexcept
• constexpr
• Lambda expressions
• type_traits
• std::unique_ptr<T>
• Variadic templates
• and many more...
EXAMPLE OF SAFE TO USE C++ TOOLS
The fastest programs are those that do nothing
DO YOU AGREE?
// C++11: a single return statement, recursion instead of loops
constexpr int factorial(int n)
{
  return n <= 1 ? 1 : (n * factorial(n - 1));
}

// C++14: local variables and loops are allowed
constexpr int factorial(int n)
{
  int result = 1;  // start from 1 so that factorial(0) == 1, as in the C++11 version
  while(n > 1) result *= n--;
  return result;
}

static_assert(factorial(4) == 24);  // compile-time
volatile int k = 8;
std::cout << factorial(k) << '\n';  // runtime
constexpr FUNCTION
No need to create and use manually precalculated tables anymore
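As an illustration of replacing a hand-written table, the sketch below builds a factorial lookup table during compilation. It assumes C++17 and a 32-bit `int`; `make_factorial_table()` and `factorials` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr int factorial(int n)
{
  int result = 1;
  while(n > 1) result *= n--;
  return result;
}

// Fills the table in a constant expression (writing to std::array elements
// in constexpr code needs C++17).
template<std::size_t N>
constexpr std::array<int, N> make_factorial_table()
{
  std::array<int, N> table{};
  for(std::size_t i = 0; i < N; ++i)
    table[i] = factorial(static_cast<int>(i));
  return table;
}

// 12! is the largest factorial that fits in a 32-bit int, hence 13 entries.
inline constexpr auto factorials = make_factorial_table<13>();

static_assert(factorials[0] == 1);
static_assert(factorials[5] == 120);
static_assert(factorials[12] == 479001600);
```

At runtime a lookup such as `factorials[k]` is then a plain array read, and the `static_assert`s document the table contents next to its definition.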
Support for constexpr dynamic memory allocation and deallocation added in Albuquerque, NM. Possibility to create constexpr std::vector or maybe even std::string.
C++20 SPOILER ALERT ;-)
template<typename T>
struct is_array : std::false_type {};
template<typename T>
struct is_array<T[]> : std::true_type {};
template<typename T>
constexpr bool is_array_v = is_array<T>::value;

static_assert(is_array_v<int> == false);
static_assert(is_array_v<int[]>);

void destroy(std::true_type) noexcept
{
  delete[] ptr_;
}
void destroy(std::false_type) noexcept
{
  delete ptr_;
}
void destroy() noexcept
{
  destroy(is_array<T>());
}
COMPILE TIME DISPATCH
Tag dispatch makes it possible to select the proper function overload at compile time, based on properties of a type.
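The `destroy()` overloads above refer to `ptr_` and `T` from an enclosing class that the slide does not show. A self-contained sketch of such a class could look as follows; `owner` and `counted` are hypothetical names, and the standard `std::is_array` stands in for the hand-written trait.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical owning pointer; T may be `U` or `U[]`.
template<typename T>
class owner {
  using element = std::remove_extent_t<T>;
  element* ptr_;

  // The unnamed tag parameter only steers overload resolution.
  void destroy(std::true_type) noexcept { delete[] ptr_; }
  void destroy(std::false_type) noexcept { delete ptr_; }
public:
  explicit owner(element* p) noexcept : ptr_(p) {}
  owner(const owner&) = delete;
  owner& operator=(const owner&) = delete;
  ~owner() { destroy(std::is_array<T>()); }
};

// Counts live objects so that the effect of each overload can be observed.
struct counted {
  static inline int alive = 0;
  counted() { ++alive; }
  ~counted() { --alive; }
};
```

`owner<counted>` selects the `delete` overload and `owner<counted[]>` the `delete[]` overload; the choice is made during overload resolution, so no runtime branch is involved.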
void destroy() noexcept
{
  if constexpr(is_array_v<T>)
    delete[] ptr_;
  else
    delete ptr_;
}
C++17 COMPILE TIME DISPATCH
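// X, original version
struct X {
  int a, b, c;
  int id;
};

// X, after id becomes a std::string
struct X {
  int a, b, c;
  std::string id;
};

void foo(const std::vector<X>& a1, std::vector<X>& a2)
{
  memcpy(a2.data(), a1.data(), a1.size() * sizeof(X));
  // ...
}
Ooops!!! With a std::string member X is no longer trivially copyable, so the memcpy() is now undefined behavior.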
void foo(const std::vector<X>& a1, std::vector<X>& a2)
{
  std::copy(begin(a1), end(a1), begin(a2));
  // ...
}

// Generated code when X has an int id:
struct X {
  int a, b, c;
  int id;
};
movq %rsi, %rax
movq 8(%rdi), %rdx
movq (%rdi), %rsi
cmpq %rsi, %rdx
je .L1
movq (%rax), %rdi
subq %rsi, %rdx
jmp memmove

// Generated code when X has a std::string id:
struct X {
  int a, b, c;
  std::string id;
};
// 100 lines of assembly code
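The selection that a standard library typically performs inside `std::copy` can be written out by hand as a sketch. `pod_x`, `str_x` and `copy_elements()` are hypothetical names, and the function assumes the destination already holds at least as many elements as the source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

struct pod_x { int a, b, c; int id; };
struct str_x { int a, b, c; std::string id; };

static_assert(std::is_trivially_copyable_v<pod_x>);
static_assert(!std::is_trivially_copyable_v<str_x>);

// Chooses the copy strategy at compile time from a property of T.
// Assumes dst.size() >= src.size().
template<typename T>
void copy_elements(const std::vector<T>& src, std::vector<T>& dst)
{
  if constexpr(std::is_trivially_copyable_v<T>) {
    if(!src.empty())
      std::memcpy(dst.data(), src.data(), src.size() * sizeof(T));  // raw bytes are enough
  } else {
    std::copy(src.begin(), src.end(), dst.begin());  // runs each copy assignment
  }
}
```

In everyday code calling `std::copy` directly is the simpler choice; the sketch only makes visible why the abstraction can cost nothing for trivially copyable types and still stay correct for the others.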
TYPE TRAITS: A NEGATIVE OVERHEAD ABSTRACTION
enum class side { BID, ASK };

class order_book {
  template<side S>
  class book_side {
    using compare = std::conditional_t<S == side::BID, std::greater<>, std::less<>>;
    std::vector<price> levels_;
  public:
    void insert(order o)
    {
      const auto it = lower_bound(begin(levels_), end(levels_), o.price, compare{});
      // insert when the level is missing, including at the end of the vector
      if(it == end(levels_) || *it != o.price)
        levels_.insert(it, o.price);
      // ...
    }
    bool match(price p) const
    {
      return compare{}(levels_.back(), p);
    }
    // ...
  };

  book_side<side::BID> bids_;
  book_side<side::ASK> asks_;
  // ...
};
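A runnable sketch of the same compile-time comparator selection, reduced to the `insert()` path. The slides do not define `price` or `order`, so `price_t` as an `int` alias, the one-member `order` struct and the `levels()` accessor are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <type_traits>
#include <vector>

using price_t = int;              // assumption: prices as plain integers
struct order { price_t price; };  // assumption: only the price matters here

enum class side { BID, ASK };

template<side S>
class book_side {
  // Bids are kept in descending order, asks in ascending order.
  using compare = std::conditional_t<S == side::BID, std::greater<>, std::less<>>;
  std::vector<price_t> levels_;
public:
  void insert(order o)
  {
    const auto it = std::lower_bound(levels_.begin(), levels_.end(), o.price, compare{});
    if(it == levels_.end() || *it != o.price)  // add only missing price levels
      levels_.insert(it, o.price);
  }
  const std::vector<price_t>& levels() const { return levels_; }
};
```

Both sides share one implementation, and because the comparator is an empty type chosen through the template parameter, no runtime check of the side is needed in `insert()`.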
TYPE TRAITS: COMPILE-TIME BRANCHING
constexpr int array_size = 10'000;
int array[array_size][array_size];

for(auto i = 0L; i < array_size; ++i) {
  for(auto j = 0L; j < array_size; ++j) {
    array[j][i] = i + j;
  }
}
WHAT IS WRONG HERE?
Reckless cache usage can cost you a lot of performance!
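The two traversal orders can be put side by side as a sketch. The 1024 x 1024 size, the flat `std::vector` storage and the function names are assumptions chosen so that the example runs quickly; both functions write the same values and differ only in the order of memory accesses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t dim = 1024;  // smaller than on the slide

// Row-major storage: element (row, col) lives at index row * dim + col.

// Inner loop walks along a row: consecutive addresses, whole cache lines are used.
inline void fill_row_major(std::vector<int>& a)
{
  for(std::size_t row = 0; row < dim; ++row)
    for(std::size_t col = 0; col < dim; ++col)
      a[row * dim + col] = static_cast<int>(row + col);
}

// Inner loop walks down a column: every access jumps dim * sizeof(int) bytes,
// which matches the array[j][i] pattern from the slide.
inline void fill_column_major(std::vector<int>& a)
{
  for(std::size_t col = 0; col < dim; ++col)
    for(std::size_t row = 0; row < dim; ++row)
      a[row * dim + col] = static_cast<int>(row + col);
}
```

Timing both calls, for example with `std::chrono::steady_clock`, shows the difference on a given machine; the size of the gap depends on the cache hierarchy and on the array size.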
Please everybody stand up
LAKOS'17 EXERCISE
• Less than 2x
• Less than 5x
• Less than 10x
• Less than 20x
• Less than 50x
QUIZ: HOW MUCH SLOWER IS THE BAD CASE?
2173,166562 task-clock (msec) # 0,998 CPUs utilized
5602701607 cycles # 2,578 GHz (66,61%)
1166903909 instructions # 0,21 insn per cycle (83,25%)
74953018 L1-dcache-loads # 34,490 M/sec (83,26%)
398254489 L1-dcache-load-misses # 531,34% of all L1-dcache hits (83,45%)
102530658 LLC-loads # 47,180 M/sec (83,44%)
2386907 LLC-load-misses # 2,33% of all LL-cache hits (83,30%)
97769 page-faults # 0,045 M/sec

194,177764 task-clock (msec) # 0,996 CPUs utilized
506872781 cycles # 2,610 GHz (67,06%)
812720459 instructions # 1,60 insn per cycle (83,54%)
69094773 L1-dcache-loads # 355,833 M/sec (83,53%)
13586696 L1-dcache-load-misses # 19,66% of all L1-dcache hits (83,52%)
91249 LLC-loads # 0,470 M/sec (83,52%)
37030 LLC-load-misses # 40,58% of all LL-cache hits (83,78%)
97769 page-faults # 0,504 M/sec
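As a rough answer to the quiz, and assuming the first (slower) run is the bad case and the second run the good one, the slowdown follows from dividing the two task-clock values; the ratio of the cycle counts agrees with it.

```latex
\[
\frac{t_{\text{bad}}}{t_{\text{good}}}
  = \frac{2173.17\ \text{ms}}{194.18\ \text{ms}} \approx 11.2,
\qquad
\frac{5\,602\,701\,607\ \text{cycles}}{506\,872\,781\ \text{cycles}} \approx 11.1
\]
```

For this particular measurement the bad case is therefore more than 10x but less than 20x slower, with the instructions-per-cycle rate dropping from 1,60 to 0,21.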
CPU CACHE
2173,166562 task-clock (msec) # 0,998 CPUs utilized 5602701607 cycles # 2,578 GHz (66,61%) 1166903909 instructions # 0,21 insn per cycle (83,25%) 74953018 L1-dcache-loads # 34,490 M/sec (83,26%) 398254489 L1-dcache-load-misses # 531,34% of all L1-dcache hits (83,45%) 102530658 LLC-loads # 47,180 M/sec (83,44%) 2386907 LLC-load-misses # 2,33% of all LL-cache hits (83,30%) 97769 page-faults # 0,045 M/sec
194,177764 task-clock (msec) # 0,996 CPUs utilized 506872781 cycles # 2,610 GHz (67,06%) 812720459 instructions # 1,60 insn per cycle (83,54%) 69094773 L1-dcache-loads # 355,833 M/sec (83,53%) 13586696 L1-dcache-load-misses # 19,66% of all L1-dcache hits (83,52%) 91249 LLC-loads # 0,470 M/sec (83,52%) 37030 LLC-load-misses # 40,58% of all LL-cache hits (83,78%) 97769 page-faults # 0,504 M/sec
code::dive 2017 | Striving for ultimate Low Latency
CPU CACHE
52
![Page 109: STRIVING FOR ULTIMATE LOW LATENCY · 2017-11-22 · Latency is the time required to perform some action or to produce some result. Measured in units of time like hours, minutes, seconds,](https://reader034.fdocuments.net/reader034/viewer/2022050114/5f4aaf1d1455b0043a087265/html5/thumbnails/109.jpg)
2173,166562 task-clock (msec) # 0,998 CPUs utilized 5602701607 cycles # 2,578 GHz (66,61%) 1166903909 instructions # 0,21 insn per cycle (83,25%) 74953018 L1-dcache-loads # 34,490 M/sec (83,26%) 398254489 L1-dcache-load-misses # 531,34% of all L1-dcache hits (83,45%) 102530658 LLC-loads # 47,180 M/sec (83,44%) 2386907 LLC-load-misses # 2,33% of all LL-cache hits (83,30%) 97769 page-faults # 0,045 M/sec
194,177764 task-clock (msec) # 0,996 CPUs utilized 506872781 cycles # 2,610 GHz (67,06%) 812720459 instructions # 1,60 insn per cycle (83,54%) 69094773 L1-dcache-loads # 355,833 M/sec (83,53%) 13586696 L1-dcache-load-misses # 19,66% of all L1-dcache hits (83,52%) 91249 LLC-loads # 0,470 M/sec (83,52%) 37030 LLC-load-misses # 40,58% of all LL-cache hits (83,78%) 97769 page-faults # 0,504 M/sec
code::dive 2017 | Striving for ultimate Low Latency
CPU CACHE
52
![Page 110: STRIVING FOR ULTIMATE LOW LATENCY · 2017-11-22 · Latency is the time required to perform some action or to produce some result. Measured in units of time like hours, minutes, seconds,](https://reader034.fdocuments.net/reader034/viewer/2022050114/5f4aaf1d1455b0043a087265/html5/thumbnails/110.jpg)
struct coordinates { int x, y; };

void draw(const coordinates& coord);
void verify(int threshold);

constexpr int OBJECT_COUNT = 1'000'000;
class objectMgr;

void process(const objectMgr& mgr)
{
  const auto size = mgr.size();
  for(auto i = 0UL; i < size; ++i) { draw(mgr.position(i)); }
  for(auto i = 0UL; i < size; ++i) { verify(mgr.threshold(i)); }
}
ANOTHER EXAMPLE
class objectMgr {
  struct object {
    coordinates coord;
    std::string errorTxt_1;
    std::string errorTxt_2;
    std::string errorTxt_3;
    int threshold;
    std::array<char, 100> otherData;
  };
  std::vector<object> data_;

public:
  explicit objectMgr(std::size_t size) : data_{size} {}
  std::size_t size() const { return data_.size(); }
  const coordinates& position(std::size_t idx) const { return data_[idx].coord; }
  int threshold(std::size_t idx) const { return data_[idx].threshold; }
};
NAIVE OBJECTMGR IMPLEMENTATION
• Program optimization approach motivated by efficient use of the CPU cache
• Focus on data layout
• Results in object decomposition
DOD (DATA-ORIENTED DESIGN)
Keep data used together close to each other
class objectMgr {
  std::vector<coordinates> positions_;
  std::vector<int> thresholds_;

  struct otherData {
    struct errorData {
      std::string errorTxt_1, errorTxt_2, errorTxt_3;
    };
    errorData error;
    std::array<char, 100> data;
  };
  std::vector<otherData> coldData_;

public:
  explicit objectMgr(std::size_t size) : positions_{size}, thresholds_(size), coldData_{size} {}
  std::size_t size() const { return positions_.size(); }
  const coordinates& position(std::size_t idx) const { return positions_[idx]; }
  int threshold(std::size_t idx) const { return thresholds_[idx]; }
};
OBJECT DECOMPOSITION EXAMPLE
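To see why the decomposition helps, it may be useful to count how many elements of the hot data fit into one cache line. The sketch below lifts `coordinates` and the naive `object` out of the slides; `cache_line` and `per_cache_line` are hypothetical helpers, and the 64-byte line size is an assumption that holds on typical x86-64 parts but is not guaranteed.

```cpp
#include <array>
#include <cstddef>
#include <string>

struct coordinates { int x, y; };

// Naive layout from the slides, lifted out of objectMgr for illustration.
struct object {
  coordinates coord;
  std::string errorTxt_1;
  std::string errorTxt_2;
  std::string errorTxt_3;
  int threshold;
  std::array<char, 100> otherData;
};

// Assumed cache line size; typical for x86-64, not guaranteed elsewhere.
constexpr std::size_t cache_line = 64;

// How many elements of type T fit in one cache line (0 if T is larger than a line).
template<typename T>
constexpr std::size_t per_cache_line() { return cache_line / sizeof(T); }

// Decomposed layout: every fetched line carries 8 positions or 16 thresholds
// (assuming a 4-byte int).
static_assert(per_cache_line<coordinates>() == 8);
static_assert(per_cache_line<int>() == 16);

// Naive layout: one object is larger than a whole line, so consecutive
// position(i) or threshold(i) calls each touch a different cache line.
static_assert(sizeof(object) > cache_line);
```

The exact size of `object` depends on the standard library's std::string, which is why the sketch only checks an inequality for it.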
struct A {
  char c;             // size=1, alignment=1, padding=7
  double d;           // size=8, alignment=8, padding=0
  short s;            // size=2, alignment=2, padding=2
  static double dd;
  int i;              // size=4, alignment=4, padding=0
};                    // size=24, alignment=8

static_assert( sizeof(A) == 24);
static_assert(alignof(A) == 8);

struct B {
  char c;             // size=1, alignment=1, padding=1
  short s;            // size=2, alignment=2, padding=0
  int i;              // size=4, alignment=4, padding=0
  double d;           // size=8, alignment=8, padding=0
  static double dd;
};                    // size=16, alignment=8

static_assert( sizeof(B) == 16);
static_assert(alignof(B) == 8);
MEMBERS ORDER MIGHT BE IMPORTANT
In order to satisfy alignment requirements of all non-static members of a class, padding may be inserted after some of its members
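The padding can be made visible with `offsetof`. The sketch below repeats `A` and `B` from the slide and checks where each member lands. The offsets are an assumption that holds on a typical 64-bit ABI (for example x86-64 System V); other platforms may differ.

```cpp
#include <cstddef>

struct A {
  char c;
  double d;
  short s;
  static double dd;   // static members are not part of the object layout
  int i;
};

struct B {
  char c;
  short s;
  int i;
  double d;
  static double dd;
};

// Offsets assume a typical 64-bit ABI.
static_assert(offsetof(A, c) == 0);
static_assert(offsetof(A, d) == 8);    // 7 bytes of padding after c
static_assert(offsetof(A, s) == 16);
static_assert(offsetof(A, i) == 20);   // 2 bytes of padding after s
static_assert(sizeof(A) == 24);

static_assert(offsetof(B, c) == 0);
static_assert(offsetof(B, s) == 2);    // 1 byte of padding after c
static_assert(offsetof(B, i) == 4);
static_assert(offsetof(B, d) == 8);    // no padding needed
static_assert(sizeof(B) == 16);
```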
struct B {
  char c;             // size=1, alignment=1, padding=1
  short s;            // size=2, alignment=2, padding=0
  int i;              // size=4, alignment=4, padding=0
  double d;           // size=8, alignment=8, padding=0
  static double dd;
};                    // size=16, alignment=8

static_assert( sizeof(B) == 16);
static_assert(alignof(B) == 8);

using opt = std::optional<B>;

static_assert( sizeof(opt) == 24);
static_assert(alignof(opt) == 8);

using array = std::array<opt, 256>;

static_assert( sizeof(array) == 24 * 256);
static_assert(alignof(array) == 8);
BE AWARE OF ALIGNMENT SIDE EFFECTS
Be aware of the conceptual implementation of the tools you use every day
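One possible way to avoid paying 8 bytes per element for a one-bit flag is to keep the flags apart from the values. The sketch below is a hypothetical alternative, not something shown in the talk: `optional_array` and its members are invented names, and the size checks assume a typical 64-bit ABI and standard library. Unlike std::optional, every `B` is always constructed here, which is only acceptable for cheap, default-constructible types.

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <optional>

struct B {
  char c;
  short s;
  int i;
  double d;
};

// Hypothetical replacement for std::array<std::optional<B>, 256>:
// values and "engaged" flags live in two separate members.
struct optional_array {
  std::array<B, 256> values{};
  std::bitset<256> engaged{};

  void set(std::size_t idx, const B& value) { values[idx] = value; engaged.set(idx); }
  void reset(std::size_t idx) { engaged.reset(idx); }
  // Returns nullptr when the slot is empty.
  const B* get(std::size_t idx) const { return engaged.test(idx) ? &values[idx] : nullptr; }
};

// Sizes assume a typical 64-bit ABI and standard library implementation.
static_assert(sizeof(std::array<std::optional<B>, 256>) == 24 * 256);   // 6144 bytes
static_assert(sizeof(optional_array) == 16 * 256 + 32);                 // 4128 bytes
```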
struct A {
  char c;
  double d;
  short s;
  static double dd;
  int i;
} __attribute__((packed));

static_assert( sizeof(A) == 15);
static_assert(alignof(A) == 1);

struct B {
  char c;             // size=1, alignment=1, padding=1
  short s;            // size=2, alignment=2, padding=0
  int i;              // size=4, alignment=4, padding=0
  double d;           // size=8, alignment=8, padding=0
  static double dd;
};                    // size=16, alignment=8

static_assert( sizeof(B) == 16);
static_assert(alignof(B) == 8);

On modern hardware, a packed structure may be faster than an aligned one.
PACKING
Not portable! May be slower or even crash.
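The crash risk usually comes from pointers or references to misaligned members. The sketch below is a minimal illustration under stated assumptions: `packed_msg`, `read_d`, and `read_d_from_bytes` are hypothetical names, and `__attribute__((packed))` is the GCC/Clang extension used on the slide.

```cpp
#include <cstddef>
#include <cstring>

// Same members as A on the slide, minus the static one.
struct packed_msg {
  char c;
  double d;
  short s;
  int i;
} __attribute__((packed));

static_assert(sizeof(packed_msg) == 15);
static_assert(alignof(packed_msg) == 1);

// Reading a member by value through the struct is fine: the compiler knows
// the member may be misaligned and emits a suitable load.
inline double read_d(const packed_msg& m) { return m.d; }

// Taking the address of m.d and passing it around loses that knowledge;
// on a target that requires aligned loads a later dereference may fault.
// Copying the bytes out with memcpy avoids the problem.
inline double read_d_from_bytes(const unsigned char* bytes)
{
  double value;
  std::memcpy(&value, bytes + offsetof(packed_msg, d), sizeof(value));
  return value;
}
```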
Operation                              Latency (ns)      Latency (ms)   Notes
L1 cache reference                     0.5 ns
Branch misprediction                   5 ns
L2 cache reference                     7 ns                             14x L1 cache
Mutex lock/unlock                      25 ns
Main memory reference                  100 ns                           20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy           3,000 ns
Send 1K bytes over 1 Gbps network      10,000 ns         0.01 ms
Read 4K randomly from SSD              150,000 ns        0.15 ms
Read 1 MB sequentially from memory     250,000 ns        0.25 ms
Round trip within same datacenter      500,000 ns        0.5 ms
Read 1 MB sequentially from SSD        1,000,000 ns      1 ms           4x memory
Disk seek                              10,000,000 ns     10 ms          20x datacenter roundtrip
Read 1 MB sequentially from disk       20,000,000 ns     20 ms          80x memory, 20x SSD
Send packet CA->Netherlands->CA        150,000,000 ns    150 ms
LATENCY NUMBERS EVERY PROGRAMMER SHOULD KNOW
class vector_downward {
  uint8_t *make_space(size_t len) {
    if (len > static_cast<size_t>(cur_ - buf_)) {
      auto old_size = size();
      auto largest_align = AlignOf<largest_scalar_t>();
      reserved_ += (std::max)(len, growth_policy(reserved_));
      // Round up to avoid undefined behavior from unaligned loads and stores.
      reserved_ = (reserved_ + (largest_align - 1)) & ~(largest_align - 1);
      auto new_buf = allocator_.allocate(reserved_);
      auto new_cur = new_buf + reserved_ - old_size;
      memcpy(new_cur, cur_, old_size);
      cur_ = new_cur;
      allocator_.deallocate(buf_);
      buf_ = new_buf;
    }
    cur_ -= len;
    // Beyond this, signed offsets may not have enough range:
    // (FlatBuffers > 2GB not supported).
    assert(size() < FLATBUFFERS_MAX_BUFFER_SIZE);
    return cur_;
  }
  // ...
};
WHAT IS WRONG HERE?
class vector_downward {
  uint8_t *make_space(size_t len) {
    if (len > static_cast<size_t>(cur_ - buf_))
      reallocate(len);
    cur_ -= len;
    // Beyond this, signed offsets may not have enough range:
    // (FlatBuffers > 2GB not supported).
    assert(size() < FLATBUFFERS_MAX_BUFFER_SIZE);
    return cur_;
  }

  void reallocate(size_t len) {
    auto old_size = size();
    auto largest_align = AlignOf<largest_scalar_t>();
    reserved_ += (std::max)(len, growth_policy(reserved_));
    // Round up to avoid undefined behavior from unaligned loads and stores.
    reserved_ = (reserved_ + (largest_align - 1)) & ~(largest_align - 1);
    auto new_buf = allocator_.allocate(reserved_);
    auto new_cur = new_buf + reserved_ - old_size;
    memcpy(new_cur, cur_, old_size);
    cur_ = new_cur;
    allocator_.deallocate(buf_);
    buf_ = new_buf;
  }
  // ...
};
DIVIDE THE CODE TO HOT AND COLD PATHS
Code is data too!
Performance improvement of 20% thanks to better inlining
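The same split can be tried in isolation. The sketch below is a simplified, self-contained buffer that grows downward, loosely modelled on the slide. `downward_buffer` and its members are hypothetical names unrelated to the FlatBuffers API, the growth policy (at least doubling) is an assumption, and the `gnu::noinline` and `gnu::cold` attributes are GCC/Clang extensions that act only as hints.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <utility>

class downward_buffer {
  std::size_t reserved_;
  std::unique_ptr<std::uint8_t[]> buf_;
  std::uint8_t* cur_;   // used bytes are [cur_, buf_ + reserved_)

  // Cold path: runs rarely, so it is kept out of line.
  [[gnu::noinline, gnu::cold]] void reallocate(std::size_t len)
  {
    const auto old_size = size();
    const auto new_reserved = reserved_ + std::max(len, reserved_);
    auto new_buf = std::make_unique<std::uint8_t[]>(new_reserved);
    auto* new_cur = new_buf.get() + new_reserved - old_size;
    std::memcpy(new_cur, cur_, old_size);   // old buffer is still alive here
    buf_ = std::move(new_buf);
    reserved_ = new_reserved;
    cur_ = new_cur;
  }

public:
  explicit downward_buffer(std::size_t initial)
    : reserved_{initial},
      buf_{std::make_unique<std::uint8_t[]>(initial)},
      cur_{buf_.get() + initial} {}

  std::size_t size() const { return static_cast<std::size_t>(buf_.get() + reserved_ - cur_); }
  std::size_t capacity() const { return reserved_; }

  // Hot path: one comparison and one subtraction, small enough to inline.
  std::uint8_t* make_space(std::size_t len)
  {
    if(len > static_cast<std::size_t>(cur_ - buf_.get())) reallocate(len);
    cur_ -= len;
    return cur_;
  }
};
```

Whether the split pays off in a given code base is something to measure; the 20% figure above is the speaker's result for his case.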
std::optional<Error> validate(const request& r)
{
  switch(r.type) {
    case request::type_1:
      if(/* simple check */) return std::nullopt;
      return /* complex error msg generation */;
    case request::type_2:
      if(/* simple check */) return std::nullopt;
      return /* complex error msg generation */;
    case request::type_3:
      if(/* simple check */) return std::nullopt;
      return /* complex error msg generation */;
    // ...
  }
  throw std::logic_error("");
}
TYPICAL VALIDATION AND ERROR HANDLING
bool is_valid(const request& r)
{
  switch(r.type) {
    case request::type_1: return /* simple check */;
    case request::type_2: return /* simple check */;
    case request::type_3: return /* simple check */;
    // ...
  }
  throw std::logic_error("");
}

Error make_error(const request& r)
{
  switch(r.type) {
    case request::type_1: return /* complex error msg generation */;
    case request::type_2: return /* complex error msg generation */;
    case request::type_3: return /* complex error msg generation */;
    // ...
  }
  throw std::logic_error("");
}

std::optional<Error> validate(const request& r)
{
  if(is_valid(r)) return std::nullopt;
  return make_error(r);
}
ISOLATING COLD PATH
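A compilable version of this split might look as follows. The `request` layout, the `Error` alias, and the concrete checks are assumptions made up for illustration; the slides leave them unspecified.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical request and error types; the slides do not define them.
struct request {
  enum kind { type_1, type_2 };
  kind type;
  int value;
};

using Error = std::string;

// Hot path: cheap checks only, no allocation, easy to inline.
inline bool is_valid(const request& r)
{
  switch(r.type) {
    case request::type_1: return r.value >= 0;
    case request::type_2: return r.value % 2 == 0;
  }
  throw std::logic_error("unknown request type");
}

// Cold path: the message is built only after validation has failed.
inline Error make_error(const request& r)
{
  switch(r.type) {
    case request::type_1: return "type_1: negative value " + std::to_string(r.value);
    case request::type_2: return "type_2: odd value " + std::to_string(r.value);
  }
  throw std::logic_error("unknown request type");
}

inline std::optional<Error> validate(const request& r)
{
  if(is_valid(r)) return std::nullopt;
  return make_error(r);
}
```

With this shape the string handling stays out of the instruction stream of the common, valid case.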
WRONG
if(expensiveCheck() && fastCheck()) { /* ... */ }
GOOD
if(fastCheck() && expensiveCheck()) { /* ... */ }
EXPRESSION SHORT-CIRCUITING
Bail out as early as possible and continue fast path.
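The effect of operand order can be observed by counting calls. In the sketch below the `checks` struct, its counters, and the bodies of both checks are hypothetical; only the names `fastCheck` and `expensiveCheck` come from the slide.

```cpp
// Hypothetical checks that record how often they run.
struct checks {
  int fast_calls = 0;
  int expensive_calls = 0;

  bool fastCheck(int v) { ++fast_calls; return v > 0; }

  bool expensiveCheck(int v)
  {
    ++expensive_calls;
    // Stand-in for real work: trial-division primality test.
    for(int d = 2; d * d <= v; ++d)
      if(v % d == 0) return false;
    return v > 1;
  }

  // && evaluates left to right and skips the right operand
  // when the left one is already false.
  bool accept(int v) { return fastCheck(v) && expensiveCheck(v); }
};
```

For a negative input only `fastCheck` runs; the expensive check is reached only when the cheap one has passed.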
int foo(int i)
{
  int k = 0;
  for(int j = i; j < i + 10; ++j)
    ++k;
  return k;
}

int foo(unsigned i)
{
  int k = 0;
  for(unsigned j = i; j < i + 10; ++j)
    ++k;
  return k;
}

foo(int):
  mov eax, 10
  ret

foo(unsigned int):
  cmp edi, -10
  sbb eax, eax
  and eax, 10
  ret
INTEGER ARITHMETIC
Integer arithmetic differs for the signed and unsigned integral types
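The extra instructions in the unsigned version exist because unsigned arithmetic wraps around, so the compiler has to handle the case where `i + 10` wraps. Signed overflow is undefined behavior, so the compiler may assume it never happens and fold the loop to a constant. The sketch below repeats both overloads as constexpr functions to make this checkable at compile time; `near_max` is a name introduced for illustration.

```cpp
#include <limits>

constexpr int foo(int i)
{
  int k = 0;
  for(int j = i; j < i + 10; ++j) ++k;        // overflow of i + 10 would be undefined behavior
  return k;
}

constexpr int foo(unsigned i)
{
  int k = 0;
  for(unsigned j = i; j < i + 10; ++j) ++k;   // i + 10 wraps around modulo 2^N
  return k;
}

static_assert(foo(5) == 10);
static_assert(foo(5u) == 10);

// Near the top of the range i + 10 wraps to a small number, the loop
// condition is false at once, and the result is 0 rather than 10.
constexpr unsigned near_max = std::numeric_limits<unsigned>::max() - 5;
static_assert(foo(near_max) == 0);
```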
HOW TO DEVELOP SYSTEM WITH LOW-LATENCY CONSTRAINTS
• Avoid using std::list, std::map, etc.
• std::unordered_map is also broken
• Prefer std::vector
  – also as the underlying storage for hash tables
  – consider using tsl::hopscotch_map or similar
• Preallocate storage
  – free list
  – plf::colony
• Consider storing only pointers to objects in the container
• Consider storing expensive hash values with the key
• Limit the number of type conversions
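A minimal sketch of the "prefer std::vector" and "preallocate storage, free list" advice combined. `FixedPool` and `kNoSlot` are hypothetical names introduced for illustration; this is not code from the talk and not the interface of plf::colony. All slots live in one contiguous std::vector allocated up front, and released slots are recycled through an index-based free list, so nothing allocates on the hot path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sentinel index meaning "no slot available" (hypothetical name).
constexpr std::uint32_t kNoSlot = 0xFFFFFFFFu;

// Fixed-capacity object pool. T must be default constructible.
template <typename T>
class FixedPool {
public:
    explicit FixedPool(std::uint32_t capacity)
        : slots_(capacity), next_free_(capacity) {
        // Chain every slot into the free list: 0 -> 1 -> ... -> kNoSlot.
        for (std::uint32_t i = 0; i < capacity; ++i)
            next_free_[i] = (i + 1 < capacity) ? i + 1 : kNoSlot;
        head_ = capacity ? 0 : kNoSlot;
    }

    // Returns the index of a free slot, or kNoSlot when the pool is exhausted.
    std::uint32_t acquire() {
        if (head_ == kNoSlot) return kNoSlot;
        const std::uint32_t idx = head_;
        head_ = next_free_[idx];
        ++in_use_;
        return idx;
    }

    // Puts a previously acquired slot back at the front of the free list.
    void release(std::uint32_t idx) {
        assert(idx < slots_.size());
        next_free_[idx] = head_;
        head_ = idx;
        --in_use_;
    }

    T& operator[](std::uint32_t idx) { return slots_[idx]; }
    std::size_t in_use() const { return in_use_; }

private:
    std::vector<T> slots_;                  // contiguous object storage
    std::vector<std::uint32_t> next_free_;  // free-list links, one per slot
    std::uint32_t head_ = kNoSlot;          // first free slot
    std::size_t in_use_ = 0;
};
```

Handing out 32-bit indices instead of pointers keeps handles small. The most recently released slot is reused first, which tends to keep it warm in cache.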
HOW TO DEVELOP SYSTEM WITH LOW-LATENCY CONSTRAINTS
• Keep the number of threads close to (less than or equal to) the number of available physical CPU cores
• Separate IO threads from the business logic thread (unless the business logic is extremely lightweight)
• Use fixed-size lock-free queues / busy spins to pass data between threads
• Use optimal algorithms and data structures, and follow the data locality principle
• Precompute; use compile time instead of runtime whenever possible
• The simpler the code, the faster it is likely to be
• Do not try to be smarter than the compiler
• Know the language, tools, and libraries
• Know your hardware!
• Bypass the kernel (100% user space code)
• Measure performance… ALWAYS
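To illustrate the "fixed-size lock-free queues / busy spins" bullet, here is a hedged sketch of a single-producer/single-consumer ring buffer. `SpscQueue` and `pop_spinning` are hypothetical names, the 64-byte alignment is an assumed cache line size, and the talk does not prescribe this particular design. The sketch is only valid with exactly one producer thread and one consumer thread.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Producer thread only. Returns false when the queue is full.
    bool try_push(const T& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        assert(tail - head <= Capacity);  // invariant of the two counters
        if (tail - head == Capacity) return false;
        buffer_[tail & (Capacity - 1)] = value;
        // Release: the element write above becomes visible before the new tail.
        tail_.store(tail + 1, std::memory_order_release);
        return true;
    }

    // Consumer thread only. Returns false when the queue is empty.
    bool try_pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head == tail) return false;
        out = buffer_[head & (Capacity - 1)];
        // Release: the slot may be overwritten only after it has been read.
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

private:
    std::array<T, Capacity> buffer_{};
    // Counters only grow; separate cache lines (assumed 64 bytes) avoid
    // false sharing between the producer and the consumer.
    alignas(64) std::atomic<std::size_t> head_{0};
    alignas(64) std::atomic<std::size_t> tail_{0};
};

// Busy spin: keep polling instead of blocking in the kernel.
template <typename T, std::size_t N>
T pop_spinning(SpscQueue<T, N>& queue) {
    T value{};
    while (!queue.try_pop(value)) {
        // Intentionally empty: burns a core in exchange for wake-up latency.
    }
    return value;
}
```

Busy spinning only pays off when the spinning thread has a core to itself, which is one reason the first bullet asks for no more threads than physical cores.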
THE MOST IMPORTANT RECOMMENDATION
Always measure your performance!
HOW TO MEASURE THE PERFORMANCE OF YOUR PROGRAMS
• Always measure the Release version
  cmake -DCMAKE_BUILD_TYPE=Release ..
  cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
• Prefer hardware-based black-box performance measurements
• If that is not possible, or you want to debug a specific performance issue, use a profiler
• To gather meaningful stack traces, preserve the frame pointer
  cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CXX_FLAGS="-fno-omit-frame-pointer" ..
• Familiarize yourself with Linux perf tools (xperf on Windows) and flame graphs
• Use tools like Intel VTune
• Verify output assembly code
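When neither hardware-based measurement nor a profiler is at hand, a small in-process harness can at least expose the latency distribution rather than a single average. The sketch below is an assumption-laden illustration, not a method from the talk: `measure` and `LatencyStats` are hypothetical names, and timing with std::chrono::steady_clock includes the cost of the clock calls themselves.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

struct LatencyStats {
    std::chrono::nanoseconds min, median, p99, max;
};

// Calls `op` `iterations` times and records the latency of every call.
// The caller must make sure `op` has an observable side effect, otherwise
// an optimized build may remove the work being measured.
template <typename Op>
LatencyStats measure(Op&& op, std::size_t iterations) {
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;  // monotonic clock

    std::vector<std::chrono::nanoseconds> samples;
    samples.reserve(iterations);  // preallocate: no allocation while timing

    for (std::size_t i = 0; i < iterations; ++i) {
        const auto start = clock::now();
        op();
        const auto stop = clock::now();
        samples.push_back(
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start));
    }

    std::sort(samples.begin(), samples.end());
    // Nearest-rank style lookup into the sorted samples.
    const auto at = [&samples](double q) {
        return samples[static_cast<std::size_t>(q * (samples.size() - 1))];
    };
    return {samples.front(), at(0.50), at(0.99), samples.back()};
}
```

Reporting the median and the 99th percentile next to the minimum and maximum matters for latency work, because the tail is usually what violates a latency constraint. Build with the Release or RelWithDebInfo configuration before trusting any of the numbers.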
FLAMEGRAPH
[Figure: flame graph of a profiled bash process; visible frames include main, reader_loop, execute_command, execute_builtin, do_redirections, sys_write and vfs_write]