Optimizing Download Delivery for Gaming
Abstract:
Whether you are using one CDN or driving a multi-CDN strategy, understanding your
global traffic and cache map analysis is critical. If you can evaluate the feasibility of
optimized delivery, you can maximize performance across CDNs. In this article, you’ll
learn best practices for download delivery, details on timeouts to override baseline
defaults, how to support periodic new releases, improve parent caching footprints
and more.
Introduction
The gaming industry continues to grow year over year, and each new release brings
an associated increase in the volume of end-user traffic. This ultimately
boils down to large file downloads in higher quality formats, for which
audiences expect great performance and high availability. Both availability and
performance are now equally critical across a diverse set of devices and networks,
including over-the-air and mobile. For highly anticipated new releases, software
patches, and periodic releases, we can achieve improved delivery performance via
optimizing publishing strategies, upgrading client protocols, and applying caching
optimizations, such as improved mapping and cache footprint. We can develop
single-CDN and/or multi-CDN strategies to provide high availability and seamless
failover, which is critical during peak download events. In this article, we will discuss
best practices as well as recommended approaches in optimizing delivery.
Table of Contents
ABSTRACT
INTRODUCTION
PUBLISHING CONTENT
Packaging
Version management and TTL
CLIENT ENHANCEMENTS
HTTP/2
Quick UDP Internet Connection (QUIC)
CACHING OPTIMIZATIONS
Capacity & Mapping
Partial object caching
DELIVERY OPTIMIZATIONS
TCP Optimizations
Prefetching
HIGH AVAILABILITY AND FAILOVER
Prewarming/Flash crowds
Optimize the Failover Experience
Multi-CDN Strategy
TRACKING METRICS
CONCLUSION
BIOGRAPHY
Publishing Content
In our efforts to optimize delivery, we first direct our focus to the content itself through
delivery and packaging strategies, as well as URL-specific publishing techniques to
improve cacheability.
PACKAGING
We refer to packaging as a way of publishing content and providing public access
to end users. In video streaming, we can deliver packaged content via HLS or DASH,
while using pre-segmented media or byte-range requests to fetch content on the
client. With a download delivery based approach via HTTP, we can also utilize pre-
segmented media or byte range requests to fetch content, which provides us the
ability to use features like prefetching in addition to other optimizations. The alternate
approach of using a timestamp-based segment naming convention will not work
well since we will not be able to prefetch upcoming segments from the cache.
Additionally, this approach will further limit potential CDN performance optimizations
for long tailed content and high quality streaming such as FHD, UHD and VR.
In gaming, we have various packaging options, including a single file (e.g. a format
container such as pkg) and a set of individual files (e.g. gaming SW in more than
one container). Using a set of individual files has a few downsides: ongoing
changes affect the entire package due to change management, and prefetching of
upcoming requests is not viable for individual files of a gaming title. Because
certain optimizations, such as prefetching, are only available for specific
packaging methods, it is important to take this into consideration when deciding
how to package a title.
VERSION MANAGEMENT AND TTL
A few use cases such as TV series, movies, gaming SW, and device firmware patches
require version management. For example, TV series use dates as a form of version
management, while gaming SW and firmware patches use the release version for
updates. In order to cache updates, we can apply a Time-to-Live (TTL) to ensure
content remains in cache for a desired amount of time. The TTL value should be
determined by the timing and popularity of content, while taking into consideration
backend change management within the content repository. It is recommended to
use TTLs with long durations such as 7d, 30d or 365d to ensure content is delivered
as soon as possible to end users, while offloading the backend infrastructures.
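
As a minimal sketch of the TTL and versioning guidance above, the following Python converts the durations mentioned here (7d, 30d, 365d) into seconds and builds a versioned path. The path layout, the helper names, and the use of an origin `Cache-Control` header are assumptions for illustration; on a real deployment the TTL may instead be set in the CDN configuration.

```python
import re

# Seconds per supported TTL unit.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def ttl_to_seconds(ttl: str) -> int:
    """Convert a TTL such as '7d' or '30d' into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", ttl.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized TTL: {ttl!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


def versioned_path(title: str, version: str, filename: str) -> str:
    """Hypothetical layout: the release version is part of the path, so every
    release gets its own cache key and older versions can keep a long TTL."""
    return f"/{title}/{version}/{filename}"


def cache_headers(ttl: str) -> dict:
    """Origin response header, assuming the CDN honors Cache-Control."""
    return {"Cache-Control": f"public, max-age={ttl_to_seconds(ttl)}"}
```

For example, `cache_headers("30d")` yields a `max-age` of 2592000 seconds.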
In the situation that a newly released movie, gaming SW update or device firmware
update contains incorrect or errored content, we need to be able to quickly revert
and replace content despite using long TTLs. In order to accommodate caching and
the need for frequent updates, we can use CDN purging capabilities via an interface
or API to invalidate objects in the cache. It is recommended to use purge APIs and
build this into the content publishing cycle to ensure content versioning is accurately
managed and urgent content replacement is possible.
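
One way to build purging into the publishing cycle is sketched below. The `upload` and `purge` callables are hypothetical stand-ins for whatever origin upload step and CDN purge API client a publisher already has; no real purge API is modeled here.

```python
from typing import Callable, Iterable, List


def replace_release(
    urls: Iterable[str],
    upload: Callable[[str], None],
    purge: Callable[[List[str]], None],
) -> List[str]:
    """Upload corrected objects to the origin, then invalidate cached copies.

    `upload` and `purge` are placeholders (assumptions), injected so the
    ordering can be shown without depending on a specific API.
    """
    replaced: List[str] = []
    for url in urls:
        upload(url)  # the origin must hold the fixed object before the purge
        replaced.append(url)
    if replaced:
        purge(replaced)  # one batched invalidation for the whole release
    return replaced
```

The ordering is the point of the sketch: purging before the corrected object is on the origin would let caches refill with the bad content.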
Client Enhancements
Recently, several protocols such as HTTP/2 and QUIC were introduced in an effort to
improve performance related metrics such as client-side download times and overall
connection times.
HTTP/2
HTTP/2 is the latest TCP+TLS based protocol that aims to address the shortcomings
of its predecessor HTTP/1.1. Several key features, such as multiplexing, header
compression, and resource prioritization, can improve download times significantly
for gaming packages and/or software updates. For example, header enhancements
can improve transfer times since we are essentially reducing the size of request/
response payloads. The more powerful feature of HTTP/2, multiplexing, allows
connection reuse along with parallel delivery of resources which proves useful for
package downloads. HTTP/2 can be enabled for all web and media products on the
Akamai platform today.
QUICK UDP INTERNET CONNECTION (QUIC)
While HTTP/2 provides many performance-based improvements, the underlying
protocol is still TCP, which comes with known issues such as “safe” congestion control
algorithms. This can impact larger and higher quality downloads as a consistent
throughput is necessary for an optimal end user experience.
How can we obtain the benefits of HTTP/2 while tackling the shortcomings and
limitations of the TCP+TLS stack? This can be achieved with QUIC - the new
UDP-based protocol designed to overcome the various performance limitations
associated with TCP. As we know, TCP takes a rather conservative approach to
congestion control when packet loss is detected and QUIC aims to solve the sudden
throughput drop that can occur as a result of detected packet loss. QUIC also has a
0-RTT/1-RTT setup to reduce startup latency: new connections eliminate extra round
trips, and existing connections reuse the same connection. Along with aggressive
congestion control settings, we can expect faster connection establishment, resilient
multiplexing, and consistent throughput during download sessions. With a proven
track record of boosting performance for high traffic sites, QUIC is becoming a go-
to standard and can be enabled for select media products on the Akamai platform
today. Engage your Internal Account Team to assess feasibility, and perform thorough
testing before proceeding with a wide scale deployment.
[Figure: TCP + TLS connection setup. Client: SYN; ACK, ClientHello; ClientKeyExchange, ChangeCipherSpec, Finished; ApplicationData. Server: SYN ACK; ServerHello, Certificate, ServerHelloDone; ChangeCipherSpec, Finished; ApplicationData.]
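
To put the round-trip savings in numbers, here is a small sketch that compares connection setup delay for a given RTT. The TCP + TLS figure of three round trips assumes a full TLS 1.2 handshake with no session resumption or False Start; the QUIC figures follow the 1-RTT/0-RTT behavior described above. Function and dictionary names are illustrative only.

```python
# Round trips completed before the client can send its first HTTP request.
SETUP_ROUND_TRIPS = {
    "TCP + TLS 1.2 (full handshake)": 3,  # 1 for TCP, 2 for TLS (assumption)
    "QUIC, new connection": 1,
    "QUIC, reconnecting": 0,
}


def time_to_first_request_ms(rtt_ms: float, setup_round_trips: int) -> float:
    """Setup delay is simply the RTT multiplied by the handshake round trips."""
    return rtt_ms * setup_round_trips


def setup_delays(rtt_ms: float) -> dict:
    """Setup delay in milliseconds for each connection type at the given RTT."""
    return {
        name: time_to_first_request_ms(rtt_ms, trips)
        for name, trips in SETUP_ROUND_TRIPS.items()
    }
```

On a mobile network with an 80 ms RTT, this works out to 240 ms of setup for TCP + TLS against 80 ms for a new QUIC connection and none for a reconnecting one.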
Caching Optimizations
CAPACITY & MAPPING
The Akamai Intelligent Platform consists of many servers spread around the world,
located near end users. When the platform observes popular content, a local copy is
kept and delivered to the next user requesting the same content. The user experience
improves because of the close proximity between end users and the servers
that are communicating with those end users.
A map can be thought of as a set of servers in a particular region around the world.
We can set up mapping profiles that serve traffic in an optimal way, depending on the
type of content, the location of end users and the location of origin servers.
[Figure: QUIC connection setup. Messages shown: ClientHello (empty), Rejection with source address tokens and certificates, ClientHello, ApplicationData. 1 RTT for new QUIC connections; 0 RTT on reconnecting.]

In addition to setting up optimized maps, Tiered Distribution (tiered parents)
can be used to provide greater origin offload by funneling all traffic from Akamai
edge servers to a smaller set of edge servers, before reaching the origin. Tiered
Distribution uses Akamai’s intelligent mapping to pick an optimal set of parent servers
that can communicate with the origin. As we can see, we are essentially adding an
additional caching tier to further offload the origin. Additionally, we can further tune
this multi-tiered approach by “hashing” the incoming object URLs at the edge to pick
a consistent set of servers to fetch and/or cache this content. In doing so, this method
spreads out a customer’s cache footprint and network traffic consistently around the
parent servers.
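
The idea of hashing a URL to a consistent set of parents can be sketched as follows. This uses rendezvous (highest-random-weight) hashing as a stand-in, since the hashing scheme actually used on the platform is not described here; the parent names and the `count` parameter are hypothetical.

```python
import hashlib
from typing import List, Sequence


def _score(parent: str, url: str) -> int:
    """Deterministic pseudo-random weight for a (parent, URL) pair."""
    digest = hashlib.sha256(f"{parent}|{url}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_parents(url: str, parents: Sequence[str], count: int = 2) -> List[str]:
    """Return the `count` parents with the highest score for this URL.

    Every edge server that runs this function with the same parent list picks
    the same parents for a URL, so each object is cached on few parents
    while different URLs spread across the whole set.
    """
    ranked = sorted(parents, key=lambda p: _score(p, url), reverse=True)
    return ranked[:count]
```

A useful property of this scheme is that removing a parent that was not selected for a URL leaves that URL's selection unchanged, which keeps the cache footprint stable when the parent set changes.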
PARTIAL OBJECT CACHING
Partial object caching breaks up a large file into smaller chunks, and these partial
object chunks are cached only when requested by the end user. This optimization has
two benefits: minimal bytes are wasted on overdownloading, and each partial object
chunk can be revalidated individually rather than revalidating the entire large file
object when the TTL expires. It is important to
note that this may cause a storm of revalidation requests towards parent caches and
origin servers in the case of lengthy content catalogs. With 3rd party origins like AWS
S3, this can result in wasted cost; with a NetStorage origin, it can result
in origin slowness. Given these potential caveats, it is critical that we set TTLs to longer
values such as 30d or 365d. Additionally, disabling partial object revalidation
can further reduce the number of requests and offload the origin. Overall, partial
object caching along with best practices can result in performance improvements
for NetStorage and outbound cost savings for 3rd party storage. The graphs
below illustrate the improvement in throughput, latency, and first byte
time when the partial object caching (POC) TTL and revalidation settings are optimized.
[Figure: three time-series panels, 10/16 to 11/15: Throughput (Avg), 0 to 600 Mbps; Latency (Avg), 0 ms to 1.0 s; Turn Time (Avg), 0 to 40 ms.]
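
To make the chunking behind partial object caching concrete, the sketch below maps a client byte-range request onto fixed-size chunks. The 2 MiB chunk size and the function names are assumptions for illustration; the actual partial object size is a platform setting not given here.

```python
from typing import List, Tuple

CHUNK_SIZE = 2 * 1024 * 1024  # hypothetical partial object size (2 MiB)


def chunks_for_range(
    first_byte: int, last_byte: int, chunk_size: int = CHUNK_SIZE
) -> List[int]:
    """Indices of the chunks that cover an inclusive byte range."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("invalid byte range")
    return list(range(first_byte // chunk_size, last_byte // chunk_size + 1))


def chunk_byte_range(
    index: int, object_size: int, chunk_size: int = CHUNK_SIZE
) -> Tuple[int, int]:
    """Inclusive byte range of one chunk, clipped to the object size."""
    start = index * chunk_size
    if index < 0 or start >= object_size:
        raise ValueError("chunk index outside the object")
    return start, min(start + chunk_size, object_size) - 1
```

Only the chunks returned by `chunks_for_range` need to be fetched and cached for a given request, which is where the savings from avoiding overdownloading come from.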
Delivery Optimizations
TCP OPTIMIZATIONS
The Transmission Control Protocol (TCP) is the standard transport layer protocol
used on the Internet to control and ensure delivery of the data packets that make
up a website (e.g. an HTTP request or response). Specifically, TCP controls the
setup of connections between source and destination machines, the rate of packet
transmission, packet loss detection and recovery algorithms. Akamai is able to
optimize connection windows, tune TCP timeouts and loss recovery, maximize the
use of persistent connections and control other aspects of TCP to improve site
performance. Ultimately, these optimizations maximize throughput between Akamai
Edge servers and clients, and also with origin servers.
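
Why connection windows matter for large downloads can be shown with a short calculation. A single TCP connection can have at most one window of data in flight per round trip, so its throughput is bounded by window size divided by RTT. The function names are illustrative, and the sketch ignores packet loss and slow start.

```python
def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on one connection's throughput: one window per round trip."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    bits_per_second = window_bytes * 8 * 1000 / rtt_ms
    return bits_per_second / 1_000_000


def window_needed_bytes(target_mbps: float, rtt_ms: float) -> int:
    """Bandwidth-delay product: bytes in flight needed to sustain target_mbps."""
    return round(target_mbps * 1_000_000 * rtt_ms / 8 / 1000)
```

With a 65,535-byte window and a 100 ms RTT the bound is roughly 5.2 Mbps, while sustaining 100 Mbps at a 50 ms RTT needs about 625,000 bytes in flight.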
PREFETCHING
Prefetching provides a way to proactively fetch content that may be needed at a
future point. In use cases such as VOD and Live video streaming, we should use
prefetch to retrieve the next request and keep the content ready. In the case of VOD,
we can prefetch a few segments ahead of time since we know the length of the
stream. In the case of Live, we should prefetch 1-2 segments in advance. In the case
of partial object caching, the prefetching of upcoming byte-ranges is mandated,
and determined by the size of the partial object. Keep in mind that sequential byte-
range requests can maximize performance, while non-sequential byte-range requests
(jumping back-and-forth in/out of byte-range) can waste the prefetched partial object
in cache. Other use cases that could benefit from prefetching content ahead of time
include:
• Long tailed/less popular assets - which are normally cold
• Inconsistent throughput within a download session
• Fast startup and in-game latency
• Smaller videos (e.g. 30s worth of ads)
• Large SW packages
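
The difference between sequential and non-sequential byte-range requests can be illustrated with a toy simulation. This is a simplified model, not the platform's prefetch logic: after each requested chunk it "prefetches" the next `lookahead` chunks, then counts how many prefetched chunks were later requested and how many were never used.

```python
from typing import Iterable, Set


def simulate_prefetch(requested_chunks: Iterable[int], lookahead: int = 1) -> dict:
    """Count prefetch hits and prefetched chunks that were never requested."""
    prefetched: Set[int] = set()
    used: Set[int] = set()
    hits = 0
    for chunk in requested_chunks:
        if chunk in prefetched:
            hits += 1
            used.add(chunk)
        # After serving a chunk, fetch the next `lookahead` chunks ahead of demand.
        prefetched.update(range(chunk + 1, chunk + 1 + lookahead))
    return {"prefetch_hits": hits, "wasted_prefetches": len(prefetched - used)}
```

A sequential client requesting chunks 0, 1, 2, 3 gets three prefetch hits with one unused chunk at the end, whereas a client jumping between chunks 0, 5, 2, 9 gets no hits and leaves four prefetched chunks unused.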
High Availability and Failover
Downloads keep getting bigger. High definition movies
are in the 10-20 GB range, and games consistently exceed 50 GB per download.
Consumer audiences are also becoming increasingly global, with simultaneous
multi-country product launches being the norm.
These kinds of download patterns put very high stress on infrastructure because
today’s consumers demand the content to be downloaded as fast as possible without
any hiccups. There is a need for both high availability and seamless failover behind
the scenes if anything happens.
PREWARMING/FLASH CROWDS
In the online gaming world, high traffic periods are very predictable. One way to
prepare for them is to pre-fill specific, targeted content into a carefully selected
set of Edge servers to maximize content being served from cache. Ideal cases for
prewarming are regular software updates or game releases. Akamai account teams
work closely with customers to prepare the network for expected periods of known
high demand traffic.
OPTIMIZE THE FAILOVER EXPERIENCE
When focusing on improving download delivery, it is easy to overlook how the
failover experience can be optimized as well. Within a CDN model, we can adjust
various settings such as timeout and retry values between CDN and origin, which
will provide a more robust failover experience for end users. It is recommended to
set the timeout value according to average origin response times, while also setting
the retry count to an appropriate value in order to protect the origin during peak
event outages. If using a multi-tiered approach in delivering content, such as Tiered
Distribution, we can cascade timeouts based on each tier that a request has reached.
In doing so, we can ensure that each tier is given enough time to try alternate edge
servers before serving a failure response to the end user.
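
One possible way to cascade timeouts across tiers is sketched below. It assumes every tier uses the same retry count and that a downstream tier should wait for all upstream attempts plus a small margin; the parameter names and the margin are hypothetical, and real values should come from measured origin response times.

```python
from typing import List


def cascade_timeouts(
    origin_timeout_s: float, retries: int, tiers: int, margin_s: float = 1.0
) -> List[float]:
    """Timeouts ordered from the tier nearest the origin out to the edge.

    Each tier waits long enough for the tier above it to exhaust its initial
    attempt plus `retries` retries, with `margin_s` added on top.
    """
    if tiers < 1 or retries < 0:
        raise ValueError("tiers must be >= 1 and retries >= 0")
    timeouts = [origin_timeout_s]
    for _ in range(tiers - 1):
        attempts = retries + 1
        timeouts.append(timeouts[-1] * attempts + margin_s)
    return timeouts
```

For a 5 second parent-to-origin timeout with one retry, a two-tier setup gives the edge an 11 second timeout towards its parent, so the parent can try twice before the edge gives up.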
The above recommendations provide enough time for servers to respond to a
request as well as quicker error responses during an outage, but what happens
once an error response is served? In order to ensure seamless failover in case of
connection issues, consider setting up content on a secondary origin to avoid
displaying default and non-informative errors to your end users. Akamai NetStorage
can be used as an alternate origin server for certain failover scenarios. For example,
a maintenance page can be maintained on NetStorage in case of full outages, while
static backup files can be maintained on NetStorage for planned releases. If more
complex availability and failover strategies are needed, explore the below approach
with multi-CDN.
[Figure: failover request flow. Elements shown: End User, Edge Regions, Primary Parent Map, Quick Retry Path, Secondary Parent Map, Parent – Origin SureRoute Map, Origin.]
MULTI-CDN STRATEGY
As part of availability and failover planning, many customers move forward with
a Multi-CDN strategy. These are some of the ways we have seen our customers
intelligently pick CDNs.
• Regional performance considerations - Common techniques include taking
the geography and ISP into consideration when making a decision. CDNs have
different server footprints across the world, leading to different availability and
performance.
• Real user monitoring decisions - The customer’s client continually beacons back
client-side data such as download time, throughput, end-user network and
other performance metrics. With this data, either the client or the server can
decide whether to stay with the existing CDN or to switch. The logic in these cases
would typically be embedded within an SDK in the client.
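
The real user monitoring approach might look like the following in a client SDK. This is only a sketch: the CDN names, the 20% switching threshold, and the minimum sample count are assumptions chosen to avoid flapping between CDNs on noisy data.

```python
from statistics import median
from typing import Dict, List


def choose_cdn(
    current: str,
    throughput_mbps: Dict[str, List[float]],
    switch_gain: float = 1.2,
    min_samples: int = 5,
) -> str:
    """Stay on the current CDN unless another one is clearly faster.

    `throughput_mbps` maps a CDN name to recent beaconed throughput samples.
    CDNs with fewer than `min_samples` samples are ignored.
    """
    medians = {
        cdn: median(samples)
        for cdn, samples in throughput_mbps.items()
        if len(samples) >= min_samples
    }
    if current not in medians:
        return current  # not enough data to justify a switch
    best = max(medians, key=medians.get)
    if best != current and medians[best] >= medians[current] * switch_gain:
        return best
    return current
```

Using the median rather than the mean keeps a single slow or fast beacon from triggering a switch.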
Tracking Metrics
While content delivery is critical, we need to consider how we can track and report on
various KPIs. For example, it is a requirement to track the number of downloads
that have been started and completed for certain types of downloads (e.g. software
downloads or gaming downloads).
For streaming, a few common metrics to track include the following:
• Startup time - Track how long end users have to wait for the playback/download
to start. Impatient users will walk away.
• Bit rates - Assuming that a customer is delivering multi-bitrate streams, track and
measure the average/median quality that end-users are consuming.
• Session length - Track how long users are watching your content. These days, a
lot of video content is monetized via ads.
• Rebuffer ratio - Track and ensure this is a low number since users have many
options these days and will not return if faced with a bad experience.
• Throughput - Track the throughput provided by the CDN.
• Download completion - This is most relevant for downloads: a customer
wants to know what percentage of downloads were completed. This
matters most when a user is downloading a paid game
or software, since a download that cannot be completed starts to cause
business issues.
Some of these metrics can be captured through Akamai’s media reports while more
granular client-side metrics may require SDK integrations (either Akamai or 3rd party)
to capture such data.
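
Two of the metrics above can be computed from client-side events as in the sketch below. The event format and the rebuffer ratio definition (stall time divided by stall time plus watch time) are assumptions; reporting tools may define these metrics differently.

```python
from typing import Iterable, Tuple


def download_completion_rate(events: Iterable[Tuple[str, str]]) -> float:
    """Percentage of started downloads that also completed.

    `events` holds (download_id, kind) pairs where kind is 'started' or
    'completed'. Any other kind is ignored.
    """
    started, completed = set(), set()
    for download_id, kind in events:
        if kind == "started":
            started.add(download_id)
        elif kind == "completed":
            completed.add(download_id)
    if not started:
        return 0.0
    return 100.0 * len(completed & started) / len(started)


def rebuffer_ratio(rebuffer_s: float, watch_s: float) -> float:
    """Fraction of total session time spent stalled."""
    total = rebuffer_s + watch_s
    return rebuffer_s / total if total > 0 else 0.0
```

Counting unique download IDs rather than raw events keeps retried or resumed downloads from inflating either the started or the completed count.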
Conclusion
Delivering large downloads over the internet to a demanding audience is easier
said than done. Gamers want their content now and are not willing to wait. Content
publishers have to strike a difficult balance between cost, performance, and
availability. Fortunately, Akamai is in a unique position to have worked with some of
the largest companies in the world while supporting their events seamlessly.
Akamai’s consulting team (reachable at [email protected]) has the experience
and expertise to work with and manage some of the largest download events
on the internet.
Biography
Sabrina Burney
Enterprise Architect
Sabrina Burney has worked in many different fields
since graduating from Santa Clara University. She
has a background in computer engineering and has
always had a passion for technologies in the IT world. Sabrina’s experience inside
and outside of Akamai includes roles in software development and web security,
as well as more recently the media and web experience world. She is able to utilize
her backgrounds in multiple fields to help improve the overall end user experience
when it comes to navigating the Web. Sabrina’s recent work is focused on third-party
content and ways to improve the associated vulnerabilities and concerns—she has
several patents pending in this subject area. Outside of work, she enjoys playing
soccer with her fellow coworkers as well as traveling with her family.
Rajiv Ramnath
Enterprise Architect
Rajiv Ramnath is an Enterprise Architect at Akamai.
Rajiv has worked on some of the largest events at
Akamai, such as the Super Bowl and the FIFA World Cup.
Rajiv has a background in computer engineering. Prior to joining Akamai,
Rajiv worked as a software developer in Singapore working on security projects for
the Singapore government.
Changhyeon Lim
Senior Enterprise Architect
Changhyeon Lim is a Senior Enterprise Architect at
Akamai, based in Seoul. He has led data-centric
consultancy projects covering service quality
assessment and custom reporting for large broadcasters’
live events, performance assessment of download delivery for gaming and
firmware, optimization of origin traffic and capacity, and advisory on architectural
design and technology roadmaps for customers’ media services. He continues to refine
his methodologies and to develop new areas in which to apply them. Before joining Akamai,
Changhyeon spent 9 years at Samsung Electronics in roles including SW developer,
architect, product manager and team manager, working on an Enterprise Mobility Management
Platform and Mobile Broadcasting (DVB-H and mDTV (ATSC-MH)).
He holds a doctoral degree, with research on TCP algorithm
enhancement.
Akamai secures and delivers digital experiences for the world’s largest companies. Akamai’s intelligent edge platform surrounds everything, from the enterprise to the cloud, so customers and their businesses can be fast, smart, and secure. Top brands globally rely on Akamai to help them realize competitive advantage through agile solutions that extend the power of their multi-cloud architectures. Akamai keeps decisions, apps, and experiences closer to users than anyone — and attacks and threats far away. Akamai’s portfolio of edge security, web and mobile performance, enterprise access, and video delivery solutions is supported by unmatched customer service, analytics, and 24/7/365 monitoring. To learn why the world’s top brands trust Akamai, visit www.akamai.com, blogs.akamai.com, or @Akamai on Twitter. You can find our global contact information at www.akamai.com/locations. Published 01/19.