Cache is king

Waseem Asif @folio_3 www.folio3.com Copyright 2015


Introduction

Typical Implementation

Tools

◦ Memcached, Redis

◦ Varnish

◦ HTML 5

Challenges


A cache is a component that transparently stores data so that future requests for that data can be served faster

Speeds up data lookups

Wikipedia serves 75% of its content from cache


Hardware

◦ CPU (L1, L2)

◦ HDD (Read/Write Buffers)

Applications

Database

Proxies

Web Servers

Browsers


Populating the cache

◦ Upfront Population

◦ Lazy Population

Keeping the cache and source in sync

◦ Time-based Expiry

◦ Active Expiry

◦ Write-through Cache

Managing cache size

◦ Removing data


Upfront Population

◦ No one-off cache read delay

◦ The initial build of the cache takes time

◦ Data cached may not be read at all

Lazy Population

◦ First request is not found in cache

◦ Only caches data that is actually read

◦ Has no cache build delay
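
The trade-off between the two population strategies can be sketched in a few lines of Python. This is only an illustration: the `Cache` class, `slow_source` function, and `source_reads` counter are hypothetical names, not part of any particular library.

```python
from typing import Callable, Dict, Iterable


class Cache:
    """Toy in-process cache in front of a slow source (hypothetical helper)."""

    def __init__(self, load: Callable[[str], str]) -> None:
        self._load = load                 # function that reads from the slow source
        self._data: Dict[str, str] = {}
        self.source_reads = 0             # counts trips to the source

    def _fetch(self, key: str) -> str:
        self.source_reads += 1
        return self._load(key)

    def warm_up(self, keys: Iterable[str]) -> None:
        """Upfront population: pay the build cost before any request arrives."""
        for key in keys:
            self._data[key] = self._fetch(key)

    def get(self, key: str) -> str:
        """Lazy population: load on the first miss, serve from memory afterwards."""
        if key not in self._data:
            self._data[key] = self._fetch(key)
        return self._data[key]


def slow_source(key: str) -> str:
    # Stand-in for a database query or remote call (assumption for the demo).
    return f"value-for-{key}"


lazy = Cache(slow_source)
lazy.get("a")                    # first request misses and goes to the source
lazy.get("a")                    # second request is served from the cache

eager = Cache(slow_source)
eager.warm_up(["a", "b", "c"])   # "b" and "c" are loaded even if never read
eager.get("a")                   # no miss on the first real request
```

In this run the lazy cache touches the source once, while the warmed-up cache touches it three times before serving a single request.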


Time-Based Expiry

◦ Data is removed from the cache after a set time

◦ Most commonly used

◦ A short expiry time can be a problem

Active Expiry

◦ The source tells the cache to remove data when it is updated or removed

Write-Through Cache

◦ Allows reads and writes to the cache

◦ Data is written to the source whenever it is written to the cache
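
The three sync strategies can live side by side in one small class. The sketch below assumes a plain dictionary as the "source" and an injectable clock so expiry can be shown without sleeping; `ExpiringCache` and its method names are made up for this example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ExpiringCache:
    """Toy cache showing time-based expiry, active expiry and write-through."""

    def __init__(self, source: Dict[str, str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._source = source
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # key -> (value, expires_at)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]                      # fresh hit
        # Miss or time-based expiry: reload from the source.
        value = self._source.get(key)
        if value is None:
            self._entries.pop(key, None)
            return None
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key: str) -> None:
        """Active expiry: the source calls this when it updates or deletes a key."""
        self._entries.pop(key, None)

    def write(self, key: str, value: str) -> None:
        """Write-through: update the source and the cache together."""
        self._source[key] = value
        self._entries[key] = (value, self._clock() + self._ttl)


now = [0.0]                                   # fake clock for the demo
db = {"user:1": "Ann"}
cache = ExpiringCache(db, ttl_seconds=60, clock=lambda: now[0])

first = cache.get("user:1")                   # loaded from the source
db["user:1"] = "Bob"                          # source changes behind the cache's back
stale = cache.get("user:1")                   # old value is served until the TTL elapses
now[0] = 61.0
fresh = cache.get("user:1")                   # expired, so reloaded from the source

cache.write("user:1", "Cy")                   # write-through keeps both in step
db["user:1"] = "Dee"
cache.invalidate("user:1")                    # active expiry after a source update
after_invalidate = cache.get("user:1")
```

The `stale` read is the cost of relying on time-based expiry alone; the `invalidate` and `write` paths avoid it at the price of coupling the source to the cache.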


Cache storage is smaller than the original data

Space has to be managed carefully

To make space for new data:

◦ Time-based eviction

◦ First in, first out (FIFO)

◦ Least accessed

◦ Least time between access

◦ Many others (http://en.wikipedia.org/wiki/Cache_algorithms#Examples)
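
As one concrete case from the list above, first in, first out eviction can be sketched with an ordered dictionary. `FifoCache` is a hypothetical name, and the capacity of 2 is chosen only to make the eviction visible.

```python
from collections import OrderedDict
from typing import Optional


class FifoCache:
    """Fixed-size cache that evicts the oldest inserted key first."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        # Reads do not change the eviction order in FIFO.
        return self._items.get(key)

    def put(self, key: str, value: str) -> None:
        if key not in self._items and len(self._items) >= self._capacity:
            self._items.popitem(last=False)   # drop the oldest insertion
        self._items[key] = value


cache = FifoCache(capacity=2)
cache.put("a", "1")
cache.put("b", "2")
cache.get("a")          # reading "a" does not protect it
cache.put("c", "3")     # cache is full, so "a" (the oldest) is evicted
```

Other policies differ mainly in what they track: an access-based policy would reorder or count keys inside `get`, for example by calling `move_to_end` on each read.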


Application

◦ Objects

◦ Arrays / Query results

◦ String / HTML / Full page

Database

◦ Query cache

Web Server

◦ Static resources

◦ Full page cache

Browser

◦ Headers

◦ HTML5


It is Free & Open Source

Distributed memory based cache

Key value store

Easy to deploy

Client libraries are available for most languages
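
Memcached servers do not coordinate with each other, so in a multi-server setup it is typically the client library that decides which server holds a given key. The sketch below shows the simplest form of that idea, hashing the key and taking it modulo the server count. The server addresses and the `pick_server` helper are assumptions for illustration; real client libraries have their own selection logic.

```python
import hashlib
from typing import List


def pick_server(key: str, servers: List[str]) -> str:
    """Map a key to one server by hashing it (simple modulo sharding)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


# Hypothetical server list; 11211 is the conventional memcached port.
servers = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]
chosen = pick_server("user:42", servers)
```

The same key always lands on the same server as long as the list is unchanged. With plain modulo hashing, adding or removing a server remaps most keys, which is why many clients use consistent hashing instead.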


Since 2003

Written in C

Works on Linux, Windows

Fast by default

Security can be an issue

◦ Expose only to trusted network

◦ Optional authentication


Issues with large implementations

◦ Large works are fine by default

No communication between servers

◦ No data replication

Capacity issue of a single server

No warm-up on startup

Solution

◦ Mcrouter


Tool by Facebook

Provides failover, load balancing, server warm-up and server replication

How it works:

◦ https://code.facebook.com/posts/296442737213493/introducing-mcrouter-a-memcached-protocol-router-for-scaling-memcached-deployments/


Key-value cache

Optionally a permanent store

◦ Snapshotting

◦ Log based

Supports a lot of data structures

◦ Strings, Lists (arrays), Sets (unique), Hashes (objects), Sorted Sets, Bitmaps, HyperLogLogs, etc.

◦ http://redis.io/topics/data-types-intro

A lot more than just SET, GET and DEL

◦ Increment, Append, LPUSH, RPOP, LRANGE

◦ Pipelining (send multiple commands in one go)

◦ http://redis.io/commands
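
To make "multiple commands in one go" concrete, the sketch below encodes commands in the Redis wire format (RESP, where each command is an array of bulk strings) and concatenates them into a single payload that a client could send in one write. `encode_command` and `encode_pipeline` are hypothetical helpers written for this illustration; in practice a client library handles the encoding.

```python
from typing import Iterable, List


def encode_command(*parts: str) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(parts)]
    for part in parts:
        data = part.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def encode_pipeline(commands: Iterable[List[str]]) -> bytes:
    """Pipelining: concatenate several encoded commands into one payload."""
    return b"".join(encode_command(*cmd) for cmd in commands)


payload = encode_pipeline([
    ["SET", "hits", "0"],
    ["INCR", "hits"],
    ["LPUSH", "queue", "job-1"],
])
```

The server answers each command in order, so the client pays for one network round trip instead of three.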


Replication

Client libraries are available for most languages

◦ C, C#, PHP, Objective-C, Node.js, Python, Ruby and more

◦ http://redis.io/clients

Works on Windows

◦ Windows port developed and maintained by the Microsoft Open Tech group

Available in the cloud

◦ AWS, Morpheus, RedisToGo, Rackspace, Microsoft Azure

◦ Comes with many presets such as backup, monitoring, high availability, ACLs, etc.

◦ http://en.wikipedia.org/wiki/Redis#Cloud_deployment


Performance

◦ Has a built-in benchmark that is easy to run

◦ Easy to run tests and get a realistic picture

◦ http://redis.io/topics/benchmarks

Since 2009

Written in C


Redis

◦ Stores data in a variety of formats

◦ Complex to configure (default configuration can work)

◦ Pipelining! Multiple commands at once

◦ Locking reads: will wait until another process writes data to the cache

◦ Mass insertion of data to prime a cache

◦ Partitions data across multiple Redis instances

◦ Can back data to disk

◦ Pub/Sub feature

Memcached

◦ Doesn't do anything besides being an in-memory key/value store

◦ Low complexity

◦ Simple to configure

◦ Few command macros, making it simple to master

◦ Runs like a rock

◦ Many years in production

◦ Every programming language has a memcached library


A lot of benchmarks available

◦ http://oldblog.antirez.com/post/redis-memcached-benchmark.html

◦ http://dormando.livejournal.com/525147.html

◦ http://oldblog.antirez.com/post/update-on-memcached-redis-benchmark.html


NCache: Distributed Cache for .NET

CouchDB: Can be used as a replacement for Memcached

MemSQL: The Database for Speed, Scale & Simplicity

And a lot more


Sits in front of web server(s)


Page cache

Static File Cache

Load Balancing

Rewriting and redirecting URLs

Varnish Configuration Language

◦ Powerful: can manage rules and modify headers

Community/Enterprise Editions

Becoming the default for specialized cloud hosting

◦ WordPress, Magento, Joomla, Drupal, etc.


ESI

◦ Useful for user-customized dynamic content

◦ Helps increase TTL for pages

◦ <esi:include src="/news/hot_news.html"/>

Works on Windows via Cygwin

◦ Can have IIS as the backend

Stores cache in memory or in a file

Paid edition has an admin interface

◦ Real-time statistics

◦ Cache group management

◦ Cache invalidation, etc.
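
The idea behind the ESI include tag above is that the page shell can be cached for a long time while only the small included fragment is fetched or refreshed separately. The following toy processor mimics that substitution step in Python. It is not how Varnish implements ESI; `render_esi` and the `fragments` dictionary are hypothetical stand-ins for the cache and backend.

```python
import re
from typing import Callable

# Matches self-closing include tags such as <esi:include src="/path"/>.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def render_esi(page: str, fetch_fragment: Callable[[str], str]) -> str:
    """Replace each ESI include tag with the fragment fetched for its src."""
    return ESI_INCLUDE.sub(lambda match: fetch_fragment(match.group(1)), page)


# Long-lived cached shell with one short-lived fragment (demo data).
cached_page = (
    '<html><body><h1>Home</h1>'
    '<esi:include src="/news/hot_news.html"/>'
    '</body></html>'
)
fragments = {"/news/hot_news.html": "<ul><li>Breaking story</li></ul>"}

html = render_esi(cached_page, fragments.__getitem__)
```

Because the shell and the fragment are separate cache entries, each can have its own TTL, which is what lets the page as a whole stay cached longer.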


LiteSpeed: Provides caching, a replacement for Apache httpd, a load balancer, etc.

Apache httpd: mod_cache, mod_mem_cache

Nginx Cache


If cached properly:

◦ Resources are not downloaded again on subsequent requests

Set the "Expires" header for static resources

"304 Not Modified" responses

◦ Involves an HTTP round trip, but the file is not downloaded
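
Both mechanisms can be sketched at the application level without any web framework. The example assumes an ETag validator derived from a hash of the body and a one-day lifetime; the `static_headers` and `respond` functions, the ETag scheme, and the added `Cache-Control` header are choices made for this illustration rather than something the slides prescribe.

```python
import hashlib
from email.utils import formatdate
from typing import Dict, Optional, Tuple


def static_headers(body: bytes, now: float, max_age: int) -> Dict[str, str]:
    """Build caching headers for a static resource."""
    return {
        # HTTP dates are always expressed in GMT.
        "Expires": formatdate(now + max_age, usegmt=True),
        "Cache-Control": f"max-age={max_age}",
        # Validator the browser echoes back in If-None-Match (assumed scheme).
        "ETag": '"' + hashlib.sha256(body).hexdigest()[:16] + '"',
    }


def respond(body: bytes, if_none_match: Optional[str], now: float,
            max_age: int = 86400) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body); send 304 with no body if unchanged."""
    headers = static_headers(body, now, max_age)
    if if_none_match == headers["ETag"]:
        return 304, headers, b""
    return 200, headers, body


css = b"body { margin: 0; }"
status, headers, payload = respond(css, None, now=0.0)             # first visit
status_again, _, payload_again = respond(css, headers["ETag"], now=0.0)
```

Until the `Expires` time passes, the browser does not need to ask at all. After that it revalidates, and the 304 path costs a round trip but no download.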


Easy to implement

Very effective

Headers can be set on

◦ Application

◦ Web Server


Application Cache

◦ manifest="demo.appcache"

◦ Define the files that are to be cached

CACHE MANIFEST: files to be cached

NETWORK: resources that are not to be cached

FALLBACK: pages to fall back to if a page is not available

◦ http://diveintohtml5.info/offline.html
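
A manifest with the three sections listed above can be generated from the application. The sketch below builds the file content as text; `build_manifest` and the example paths are hypothetical, and the output would be served as the file named in the `manifest` attribute.

```python
from typing import Dict, List


def build_manifest(cache: List[str], network: List[str],
                   fallback: Dict[str, str]) -> str:
    """Return the text of an application cache manifest."""
    lines = ["CACHE MANIFEST", "", "CACHE:"]
    lines += cache                                   # files to store offline
    lines += ["", "NETWORK:"] + network              # always fetched online
    lines += ["", "FALLBACK:"]
    lines += [f"{url} {alt}" for url, alt in fallback.items()]
    return "\n".join(lines) + "\n"


manifest = build_manifest(
    cache=["/css/site.css", "/js/app.js"],
    network=["/api/"],
    fallback={"/": "/offline.html"},
)
```

Each FALLBACK line pairs a URL prefix with the page to show when a matching resource cannot be reached.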


Which data to cache

◦ Can't cache everything

◦ Space is limited

Expiry time

◦ Have to be selective; too short or too long can be a problem

Multiple cache backends

◦ Use frameworks so it is easy to switch

Full page / parts

◦ Based on needs and usage

Logged-in / logged-out users

◦ ESI: pages for logged-in users still have parts that can be cached

Removing data from cache

◦ There must be a way to remove an individual item

Key scheme

Monitor hit/miss
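
The last few challenges (a key scheme, removing individual items, and monitoring hits and misses) can be illustrated together. The namespace-plus-version key format and the `MonitoredCache` class below are assumptions chosen for the example, not a recommended standard.

```python
from typing import Callable, Dict


def make_key(namespace: str, version: int, *parts: object) -> str:
    """Build a predictable key so a single item can be found and removed."""
    return ":".join([namespace, f"v{version}", *map(str, parts)])


class MonitoredCache:
    """Toy cache that counts hits and misses."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key: str, load: Callable[[], str]) -> str:
        if key in self._data:
            self.hits += 1
        else:
            self.misses += 1
            self._data[key] = load()
        return self._data[key]

    def delete(self, key: str) -> None:
        """Remove one item without flushing everything else."""
        self._data.pop(key, None)

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = MonitoredCache()
key = make_key("product", 2, 1001)          # "product:v2:1001"
for _ in range(3):
    cache.get_or_load(key, lambda: "<div>Widget</div>")
ratio = cache.hit_ratio()                   # 1 miss followed by 2 hits
```

Bumping the version part of the key is one simple way to make a whole group of old entries unreachable when the cached format changes, while `delete` covers the single-item case.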
