NY Web Performance - DNS as a Web Performance Tool
Intelligent DNS & Traffic Management
April 19, 2016
Kris Beevers, co-founder & CEO
DNS as a web performance tool
Hi! Me:
● Huge performance geek
● Backend & internet infrastructure at scale
● Previously: cloud, bare metal, CDN, etc.
● These days: intelligent DNS @ NS1
Today:
● DNS crash course / refresher
● What can go wrong & how perf is impacted
● Optimizing DNS lookup performance
● DNS as an offensive tool to optimize app perf
Delegation:
Authoritative nameservers for a domain are DELEGATED via the registrar at the TLD nameservers.
…
…
Recursive lookup:
DNS is a hierarchical distributed database.
End-to-end lookups start at the root, continue at TLD servers, and are resolved by authorities for a domain (which may subdelegate, reference other domains, etc.).
Caching resolvers:
End-to-end resolution is usually performed by special servers called resolvers.
Resolvers cache DNS records according to a TTL specified by the authority.
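To make the TTL behaviour concrete, here is a minimal sketch of a resolver-style cache. The class name `TtlCache`, the injectable clock, and the example record are illustrative assumptions, not anything from the talk; a real resolver cache handles far more (negative caching, TTL caps, eviction).

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Toy resolver cache: a record expires `ttl` seconds after it is stored."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so expiry can be demonstrated without sleeping.
        self._clock = clock
        self._entries: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def put(self, name: str, rtype: str, value: str, ttl: float) -> None:
        """Store a record together with its absolute expiry time."""
        self._entries[(name, rtype)] = (self._clock() + ttl, value)

    def get(self, name: str, rtype: str) -> Optional[str]:
        """Return the cached value, or None on a miss or an expired entry."""
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: the resolver would have to ask the authority again.
            del self._entries[(name, rtype)]
            return None
        return value
```

With a 300 second TTL, a lookup inside that window is answered from the cache; once the TTL has elapsed, the next lookup is a cache miss and pays the full cost of reaching the authority.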
End to end:
(1) request: www.example.com A?
(2) local cache?
(3) query → RESOLVER
(4) RESOLVER cache checks: www.example.com A, example.com NS, com NS
(5) com NS? → ROOT (“.”) AUTHORITY
(6) example.com NS? → COM AUTHORITY
(7) www.example.com A? → EXAMPLE.COM AUTHORITY
(8) answer
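The uncached part of the walk (steps 5 to 7) can be sketched with a toy in-memory hierarchy. Everything here is an assumption for illustration: the `AUTHORITIES` table, the server labels, and the address (taken from the documentation range 192.0.2.0/24). It follows the slide's simplified "com NS?", "example.com NS?" phrasing rather than the exact wire behaviour of a real resolver.

```python
from typing import Dict, List, Tuple

# Hypothetical zone data: each authority maps (name, type) to a record value.
# An NS value names the next authority to ask; an A value is the final answer.
AUTHORITIES: Dict[str, Dict[Tuple[str, str], str]] = {
    "root": {("com", "NS"): "com-authority"},
    "com-authority": {("example.com", "NS"): "example.com-authority"},
    "example.com-authority": {("www.example.com", "A"): "192.0.2.10"},
}


def resolve(name: str) -> Tuple[str, List[str]]:
    """Walk from the root to the final authority, recording each query sent."""
    labels = name.split(".")
    server = "root"
    trail: List[str] = []
    # Ask for progressively longer suffixes: "com", then "example.com", ...
    for i in range(len(labels) - 1, 0, -1):
        zone = ".".join(labels[i:])
        trail.append(f"{server}: {zone} NS?")
        server = AUTHORITIES[server][(zone, "NS")]
    # The last server reached is authoritative for the name itself.
    trail.append(f"{server}: {name} A?")
    return AUTHORITIES[server][(name, "A")], trail


address, trail = resolve("www.example.com")
for query in trail:
    print(query)
print("answer:", address)
```

Each entry in the trail is a network round trip on a full cache miss, which is why the caches in step (4) matter so much.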
Understanding DNS perf is hard
● Caching everywhere
● Most of the lookup process is short-circuited most of the time
What are the most commonly incurred DNS lookup costs?
Depends on many factors:
● Domain popularity
● Distribution of users
● Cache TTLs
[Diagram: RESOLVER querying the ROOT (“.”), COM, and EXAMPLE.COM authorities; chart labels: 60-100%, 5-50%]
Client → Resolver
Up to the client. Not much you can do.
Resolver → Authority
● Direct latency impact on cache miss
● Long lookup times block asset loads
● Minimize authoritative DNS response times
● Avoid embedding assets that are bad at this!
A problematic asset:
What does a slow authoritative DNS lookup look like?
Results in WPT or from actual clients can be intermittent because of caching.
Examine the lookup path of your assets.
Let’s dig in (pun intended).
Follow the lookup trail:
Is it the final authority? Query it directly.
In this case, no -- we don’t have a webserver IP yet.
Let’s continue to gather more information like a resolver would.
Is it the final authority?
Look up the CNAME target like a resolver would.
We’ve gotten back A records (IP addresses) -- these are the webservers for this asset.
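The "query it directly" steps can be scripted. This is a rough sketch that assumes the `dig` CLI is installed; the helper names (`dig_command`, `classify`, `query_authority`) and the hostnames in the usage note are hypothetical. It sends a non-recursive query to one authority and reports whether the answer holds addresses (final authority) or a CNAME target to chase next.

```python
import ipaddress
import subprocess
from typing import List, Tuple


def dig_command(server: str, name: str) -> List[str]:
    """Build a non-recursive A query aimed directly at one nameserver."""
    return ["dig", "+norecurse", "+short", f"@{server}", name, "A"]


def classify(lines: List[str]) -> Tuple[List[str], List[str]]:
    """Split `dig +short` output into (addresses, cname_targets)."""
    addresses: List[str] = []
    cnames: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and dig's ";;" diagnostics
        try:
            ipaddress.ip_address(line)
            addresses.append(line)
        except ValueError:
            # Not an IP address, so treat it as a CNAME target.
            cnames.append(line.rstrip("."))
    return addresses, cnames


def query_authority(server: str, name: str) -> Tuple[List[str], List[str]]:
    """Run dig against `server` (needs network access and the dig binary)."""
    result = subprocess.run(
        dig_command(server, name),
        capture_output=True, text=True, check=True, timeout=10,
    )
    return classify(result.stdout.splitlines())
```

For example, `query_authority("ns1.example.net", "assets.example.com")` returning no addresses and one CNAME target means that nameserver is not the final authority; repeat the query for the target against the target's own nameservers.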
Checking for slowness
Recap: we needed to talk to two authoritative nameservers to resolve the asset’s webserver IPs:
How performant are they?
Let’s keep it simple: traceroutes from NYC
How about traceroutes from California?
Checking for slowness
So:
is a problematic authority.
If there is a cache miss at a user’s resolver, in NYC this lookup will introduce at least ~80ms delay for the asset, and in California at least ~150ms.
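The slides estimate these delays with traceroutes. As an alternative (a swapped-in technique, not what the talk used), the sketch below times a single DNS query over UDP against an authority's IP address. The function names and the fixed query ID are assumptions, and a single sample is noisy, so in practice you would repeat the measurement from several locations.

```python
import socket
import struct
import time


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Encode a minimal DNS query for the A record of `name`."""
    # Header: ID, flags (0 = standard query, recursion not desired),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def time_query_ms(server_ip: str, name: str, timeout: float = 2.0) -> float:
    """Return the round-trip time in milliseconds for one query (needs network)."""
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        sock.sendto(packet, (server_ip, 53))
        sock.recvfrom(512)  # raises socket.timeout if no reply arrives
        return (time.perf_counter() - start) * 1000.0
```

Calling `time_query_ms` with the IP address of each authority in the lookup path gives a direct view of what a resolver near you pays on a cache miss.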
A well-managed asset:
Quick look at a better authoritative DNS situation.
Who is authoritative? Cloudflare.
Final authority? Nope, there’s a CNAME.
CNAME authority? NS1.
Final authority? Yep, there’s the A record.
Are Cloudflare or NS1 slow from NYC? No.
From California? No.
Signs are good that this asset’s authoritative DNS won’t cause trouble!
Hunt down assets with slow lookups in the DNS resolution path.
You won’t see the impact every time, but they will cause intermittent slowdowns.
“Common knowledge” is out of date
● Never okay for one of your nameservers to fail
● Geographically redundant nameservers != fast
● Not all managed DNS topologies are created equal
Geographically redundant NS’s
SRTT = smoothed round trip time (performance affinity -- prefer fastest authority)
Even when SRTT is in place, do you want to leave your perf up to ISPs?
SRTT: 50-60% of resolvers
1-10% 90-99%
Non-SRTT: the rest
50% 50%
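A back-of-the-envelope calculation shows why this matters. The sketch assumes the percentages above are the shares of queries sent to the slow and fast nameserver respectively (the chart itself did not survive extraction), and the 20 ms and 150 ms latencies are made-up illustrative values.

```python
def expected_latency_ms(fast_ms: float, slow_ms: float, share_to_fast: float) -> float:
    """Average authority latency when `share_to_fast` of queries hit the fast NS."""
    return share_to_fast * fast_ms + (1.0 - share_to_fast) * slow_ms


# Hypothetical latencies from one resolver to two geographically spread nameservers.
FAST_MS = 20.0
SLOW_MS = 150.0

# SRTT resolver: assumed to send about 95% of queries to the fast nameserver.
srtt_ms = expected_latency_ms(FAST_MS, SLOW_MS, 0.95)
# Non-SRTT resolver: splits queries evenly between the two.
non_srtt_ms = expected_latency_ms(FAST_MS, SLOW_MS, 0.50)

print(f"SRTT resolver:     {srtt_ms:.1f} ms on average")
print(f"non-SRTT resolver: {non_srtt_ms:.1f} ms on average")
```

Under these assumptions the non-SRTT resolver pays roughly three times the average lookup latency, and it still hits the slow nameserver on half of its cache misses.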
Just use managed DNS… right?
Not all managed DNS networks are created equal
# POPs per managed DNS network (provider names not recoverable from the slide): 40+, 50+, ?, <10, 10+, 15+, 30+, 15+, 20+
Anycasted DNS Networks
The (globally) highest performing DNS networks are anycasted and tightly optimized for latency
● Operationally complex
● Expensive to build and maintain
● “Don’t try this at home”
Well-managed property: nytimes.com
Let’s look at authoritative DNS for every single asset on the nytimes.com homepage
100% anycasted managed DNS and CDN:
CDN: Facebook, Google, Akamai, Edgecast, CDNetworks
Managed DNS: NS1, AWS Route53, Dyn, Akamai
DNS lookup is an opportunity to inject intelligence in the delivery path
First indication of interest in content or application
DNS lookup: powerful tool for service endpoint selection
● Multiple datacenters
● Multiple CDNs
● Infrastructure elasticity
● …
Effective service endpoint selection is critical in 2016:
Web applications are increasingly dynamic and distributed
Web applications have increasingly wide audiences
Early 2010s
● Centralized origin / dynamic application
● Caching CDN
● Global audience
[Diagram: Origin (LB, three web servers, DB) behind a CDN cache]
Today
Distributed application nodes (dynamic app delivery moving to where the users are)
Multiple CDNs (optimized for different use cases & markets)
Impatient global audience (blame the Big Boys for paving the way)
[Diagram labels: CDN A, CDN B]
Why is DNS lookup good for service endpoint selection?
Pervasive: part of every application, used by every client
Lightweight: no application / architecture changes, no hardware / software
Modern DNS traffic management platforms use real-time perf data
Infrastructure telemetry
Route based on app infrastructure performance & workload
● Topology
● Health
● Load
● Traffic
● …
DNS
Network telemetry
Route based on real measured end-user performance
● Latency
● Throughput
● Reachability
● …
DNS
Infrastructure + network telemetry: Topology, Health, Load, Traffic, Latency, Throughput, Reachability, …
DNS: RESOLVER MODELING, APP-SPECIFIC ROUTING ALGO
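As a rough illustration of what an app-specific routing algorithm might look like, the sketch below picks an endpoint from a few of the telemetry inputs named above (health, load, latency). The `Endpoint` fields, the 0.8 load threshold, and the selection rule are all assumptions for illustration; this is not a description of any particular platform's algorithm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    name: str
    healthy: bool      # infrastructure telemetry: health check result
    load: float        # infrastructure telemetry: 0.0 (idle) to 1.0 (saturated)
    latency_ms: float  # network telemetry: measured latency for this resolver's users


def choose_endpoint(endpoints: List[Endpoint], max_load: float = 0.8) -> Optional[Endpoint]:
    """Pick the lowest-latency endpoint that is healthy and not overloaded."""
    candidates = [e for e in endpoints if e.healthy and e.load < max_load]
    if not candidates:
        # Every healthy endpoint is overloaded: fall back to any healthy one.
        candidates = [e for e in endpoints if e.healthy]
    if not candidates:
        return None  # nothing healthy; the caller decides what to answer
    return min(candidates, key=lambda e: e.latency_ms)


endpoints = [
    Endpoint("nyc", healthy=True, load=0.95, latency_ms=12.0),
    Endpoint("sjc", healthy=True, load=0.40, latency_ms=70.0),
    Endpoint("ams", healthy=False, load=0.10, latency_ms=5.0),
]
print(choose_endpoint(endpoints).name)
```

Here the closest endpoint is overloaded and the fastest one is unhealthy, so the DNS answer would point at the third. The decision is made per lookup, which is what makes DNS a lightweight place to apply it.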
Avoid embedding assets serviced by slow DNS
Follow the end-to-end DNS lookup path of your assets
Use globally anycasted DNS networks
Use intelligence in DNS lookup to control & optimize your traffic as:
● Your audience globalizes
● Your application becomes more distributed
● Your CDN needs become more specialized