Prepare to Scale


Page 1: Prepare to Scale

Prepare to Scale

Bill O’Connor, CTO
d.o.: csevb10

Page 4: Prepare to Scale

Basic Infrastructure

• Single-Server
  – Database & application on the same server
  – Start optimizing what you have
    • Apache
    • Drupal
    • PHP
    • Database
  – Optimizations you make for the first server will be applicable to future servers
  – Strategy: Optimize what you have, then divert traffic through caching and specialization.

Page 5: Prepare to Scale

Drupal

• 1-word: Pressflow
  – Support for Database Replication
  – Support for Squid/Varnish
  – MySQL optimizations
  – PHP5 optimizations
  – http://fourkitchens.com/pressflow-makes-drupal-scale/downloads

Page 6: Prepare to Scale

DB

• MyISAM
  – Default storage engine for <= Drupal 6
  – Good for selects
    • Suits mostly read-only websites
  – Poor read-write performance for large websites

Page 7: Prepare to Scale

DB, Cont.

• Falcon
  – Beta-stage project for MySQL
  – Different performance characteristics than other engines (both + & -)
    • Not ready for primetime, but worth watching

Page 8: Prepare to Scale

DB, Cont.

• InnoDB is your friend in most scenarios.
  – Row-level vs. table-level locking
  – Improves read/write performance
    • Does slow pure-read performance to some degree
  – Easier to do it right from the start than to revisit the issue later when you have users and traffic
  – Default storage engine of Drupal 7+
  – Best bet at the moment for allowing your site to scale
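As a rough illustration of moving existing tables to InnoDB, the sketch below generates the MySQL `ALTER TABLE ... ENGINE=InnoDB` statement for each table that is not already converted. The function name and the sample table list are assumptions made for this example; in practice the table/engine pairs would come from your own database, and the statements should be reviewed and run against a backup first.

```python
def innodb_conversion_sql(tables):
    """Return one ALTER TABLE statement per table that is not yet InnoDB.

    `tables` maps table name -> current storage engine name.
    """
    statements = []
    for name, engine in sorted(tables.items()):
        if engine.lower() == "innodb":
            continue  # already converted, nothing to do
        # Backticks guard against table names that collide with SQL keywords.
        statements.append(f"ALTER TABLE `{name}` ENGINE=InnoDB;")
    return statements


# Hypothetical table list, for illustration only.
example_tables = {"node": "MyISAM", "users": "MyISAM", "cache": "InnoDB"}
for statement in innodb_conversion_sql(example_tables):
    print(statement)
```

Generating the statements instead of executing them keeps the sketch free of any database driver and lets you inspect the SQL before it touches a live site.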

Page 9: Prepare to Scale

PHP

• Opcode caching
  – Sort of like having a compiled version of your application
  – Optimizes your application
  – Stores the compiled PHP bytecode in memory for execution
  – Result: Smaller PHP memory footprint (read: more users with less hardware) and faster execution of code.
    • Virtually a necessity for any large-scale/high-volume Drupal deployment

Page 10: Prepare to Scale

PHP, Cont.

• Opcode caching
  – eAccelerator
    • Maintained off and on
    • Only works with threadsafe PHP
    • Has, in my experience, led to some strange crashing, WSOD, etc.
  – Xcache
    • Reasonable performance improvement, though it tends to benchmark slowest of the 3
    • Actively maintained.
    • Stable, but still prone to cache corruption, WSOD, etc.

Page 11: Prepare to Scale

PHP, Cont.

• Opcode caching, cont.
  – APC
    • Current opcode cache of choice.
      – Most actively updated.
      – Most stable of the 3.
      – Usually the winner in performance benchmarks.
    • Maintained by core PHP developers (Rasmus).

Page 12: Prepare to Scale

Static Caching

• Static Caching Modules
  – Create and store rendered versions of the HTML
    • Rather than building the page on request
  – Depending on the implementation, avoids having to load any part of your application
  – Acts as a layer between the user and actual execution of your program
    • Alleviates DB issues since the DB is no longer involved
    • Minimizes PHP execution
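A minimal sketch of file-based static caching, assuming a `render` callable that stands in for the full page build: rendered HTML is written to disk the first time an anonymous visitor requests a path and is read back on later requests. The class name, the hashing of paths into file names, and the `logged_in` flag are illustrative choices, not the behavior of any particular Drupal module.

```python
import hashlib
import os
import tempfile


class StaticPageCache:
    """Store rendered HTML on disk and reuse it for anonymous requests."""

    def __init__(self, cache_dir, render):
        self.cache_dir = cache_dir
        self.render = render  # callable: path -> HTML (the expensive step)
        os.makedirs(cache_dir, exist_ok=True)

    def _file_for(self, path):
        digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
        return os.path.join(self.cache_dir, digest + ".html")

    def get(self, path, logged_in=False):
        if logged_in:
            return self.render(path)  # personalised pages bypass the cache
        filename = self._file_for(path)
        if os.path.exists(filename):
            with open(filename, "r", encoding="utf-8") as handle:
                return handle.read()  # cache hit: no rendering at all
        html = self.render(path)
        with open(filename, "w", encoding="utf-8") as handle:
            handle.write(html)
        return html


render_calls = []


def render_page(path):
    # Hypothetical stand-in for bootstrapping the application and querying the DB.
    render_calls.append(path)
    return f"<html><body>Page for {path}</body></html>"


cache = StaticPageCache(tempfile.mkdtemp(), render_page)
cache.get("/node/1")
cache.get("/node/1")  # served from disk; render_page is not called again
```

A real deployment also needs a way to expire or delete cached files when content changes, which this sketch leaves out.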

Page 13: Prepare to Scale

Static Caching, Cont.

• Static Caching Modules, Cont.
  – Boost Module
    • Static file caching
    • Good for anonymous traffic only
    • Great performance for small sites
    • Ideal for shared hosts
  – AuthCache Module
    • Static file caching
    • Attempts to handle logged-in traffic
    • Plays nice with and/or can utilize multiple caching engines (more on those later)
    • Can be a bit of a pain for user-specific content, as you have to write particular cases for each user-specific area

Page 14: Prepare to Scale

Static Caching, Cont.

• Static Caching Modules, Cont.
  – Shameless plug: Ajaxify Regions
    • Aptly named... or not.
      – Actually pulls blocks, not regions, via Ajax
    • Early release w/ plenty of work to do, needs more real-world testing, etc.
    • Automatically handles all user-specific block content based on block-caching settings
      – BLOCK_NO_CACHE
      – BLOCK_CACHE_PER_USER
      – BLOCK_CACHE_PER_ROLE
    • Concept: Ajax-load anything that can’t be cached for everyone.
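The concept on this slide can be sketched as a simple partition: blocks whose cache setting is per-user, per-role, or no-cache are replaced by a placeholder and loaded via Ajax, while everything else is rendered into the cached page. The enum below borrows the names of Drupal's block cache settings, but its values, the `PER_PAGE` and `GLOBAL` members, and the placeholder markup are assumptions for this example and do not reflect the module's actual code.

```python
from enum import Enum


class BlockCache(Enum):
    # Names mirror Drupal's block cache settings; the values are
    # placeholders for this sketch, not Drupal's actual integers.
    NO_CACHE = "no_cache"
    PER_USER = "per_user"
    PER_ROLE = "per_role"
    PER_PAGE = "per_page"
    GLOBAL = "global"


# Settings whose output can differ between visitors.
USER_SPECIFIC = {BlockCache.NO_CACHE, BlockCache.PER_USER, BlockCache.PER_ROLE}


def split_blocks(blocks):
    """Split blocks into (render inline in the cached page, load via Ajax)."""
    inline, deferred = [], []
    for name, setting in blocks.items():
        if setting in USER_SPECIFIC:
            deferred.append(name)
        else:
            inline.append(name)
    return inline, deferred


def placeholder(name):
    """Markup left in the cached page; a script would fill it in later."""
    return f'<div class="ajax-block" data-block="{name}"></div>'
```

For example, `split_blocks({"navigation": BlockCache.GLOBAL, "user_menu": BlockCache.PER_USER})` keeps `navigation` in the cached page and defers `user_menu`.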

Page 15: Prepare to Scale

Object-level Caching

• Object-level caching
  – Provides a way to store fully generated objects
  – An object can be the amalgam of many queries
    • Think of all the queries run on a node_load vs. retrieving all that information in 1 query.
  – Stores the information in memory for fast access
  – Performance characteristics not significantly different than MySQL when MySQL can handle the load
    • BUT can handle a much higher load
  – Protects the DB (the area most likely to inhibit performance for Drupal) from becoming overwhelmed
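Object-level caching usually follows a cache-aside pattern, sketched below: look for the finished object in memory, and only on a miss run the expensive loader and store its result. The `ObjectCache` class and `load_node_from_db` function are hypothetical stand-ins; the plain dictionary takes the place of a real in-memory store.

```python
class ObjectCache:
    """Cache-aside wrapper: check memory first, fall back to the loader."""

    def __init__(self, loader):
        self.loader = loader  # callable: key -> fully built object
        self.store = {}       # stands in for a real in-memory cache
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        value = self.loader(key)  # the many-query path
        self.store[key] = value
        return value

    def invalidate(self, key):
        """Drop a cached object, e.g. after the underlying record is saved."""
        self.store.pop(key, None)


def load_node_from_db(nid):
    # Hypothetical stand-in for a node_load-style function that would
    # normally issue several database queries to assemble the object.
    return {"nid": nid, "title": f"Node {nid}", "fields": {"body": "..."}}


nodes = ObjectCache(load_node_from_db)
nodes.get(1)  # miss: built by the loader
nodes.get(1)  # hit: returned straight from memory
```

Invalidation on save is the part that needs care in practice; a stale object served from the cache is the usual failure mode of this pattern.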

Page 16: Prepare to Scale

Object-level Caching, Cont.

• Object-level caching, Cont.
  – APC
    • Not a typo.
    • APC can handle object caching as well as opcode caching.
    • It’s fast: everything is stored in local memory.
    • It caches only for one server.
      – This means that you could have synchronization issues between servers if you have more than one.
      – If that’s not an issue, it’s a quick and easy solution.
    • Ideal for single-server implementations or when synchronization isn’t an issue.

Page 17: Prepare to Scale

Object-level Caching, Cont.

• Object-level caching, Cont.
  – Memcache
    • Utilized by most high-profile sites.
      – Facebook, for instance, makes tremendous use of lots and lots of memcache servers.
      – Drupal.org uses it.
    • Provides an object cache that can be used by multiple servers.
    • Slower in the single-server instance than APC, but keeps servers in sync.
    • Multiple silos/buckets can be created for information, so you can distribute information across multiple servers.
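One way to picture silos spread across servers is the sketch below: each cache bin is assigned a list of servers, and a stable hash of the key picks one server from that list. The bin names, host names, and modulo-hash scheme are assumptions for illustration; real memcache clients typically offer their own distribution strategies.

```python
import zlib

# Hypothetical bin -> server mapping; host names are illustrative only.
BIN_SERVERS = {
    "cache_page": ["mc1.example.com:11211", "mc2.example.com:11211"],
    "default": ["mc3.example.com:11211"],
}


def server_for(bin_name, key):
    """Pick the server that should hold `key` within the given bin."""
    servers = BIN_SERVERS.get(bin_name, BIN_SERVERS["default"])
    # crc32 is stable across processes, unlike Python's built-in hash(),
    # so every web server agrees on where a key lives.
    index = zlib.crc32(f"{bin_name}:{key}".encode("utf-8")) % len(servers)
    return servers[index]
```

Simple modulo hashing remaps most keys whenever the server list changes, which is why larger deployments often prefer consistent hashing.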

Page 18: Prepare to Scale

Advanced Infrastructure (ex)

[Diagram: Load Balancer, Static-Caching, Application, Database, Slave DB, Solr, Memcache, Deployment]

Page 19: Prepare to Scale

Specialization

• Specialized Servers/Services
  – DB Server
  – SOLR
  – Memcache
  – Static-caching
  – CDN

Page 20: Prepare to Scale

Specialization

• MySQL Server
  – One of the fastest ways to improve performance is to separate your MySQL DB from your application
  – This allows both your application and your DB to make full use of independent hardware
  – The change is basically transparent at the application layer: just a single change to settings.php
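To show how small that settings.php change is, the sketch below assumes the Drupal 6 style of connection string (`mysql://user:password@host/database`) and swaps only the host portion. The helper name, credentials, and host names are made up for the example; Drupal 7 and later configure the database through a settings array instead, so the same idea would apply to the host entry there.

```python
from urllib.parse import urlsplit, urlunsplit


def point_at_db_host(db_url, new_host):
    """Return db_url with only the host swapped; credentials and DB name stay."""
    parts = urlsplit(db_url)
    credentials, _, _old_host = parts.netloc.rpartition("@")
    netloc = f"{credentials}@{new_host}" if credentials else new_host
    return urlunsplit(parts._replace(netloc=netloc))


# Hypothetical connection string and host name.
local_url = "mysql://drupal:secret@localhost/drupal_db"
remote_url = point_at_db_host(local_url, "db1.example.com")
```

Everything except the host is untouched, which is why the application layer does not notice the move.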

Page 21: Prepare to Scale

Specialization

• Search
  – Problem: Search is incredibly hard on the system
    • Particularly w/ multiple search terms
    • Drupal search works but, despite great efforts, is still not as quick or useful as an outside solution
    • Search is particularly hard on the DB, Drupal’s traditional bottleneck
      – In other words, search makes a bad problem worse

Page 22: Prepare to Scale

Specialization

• Search, Cont.
  – Solution: Solr
    • Communication layer between the website and the Lucene search index
    • Offloads all of the complex processing to a separate box
      – More power for searches (search faster!)
      – Doesn’t lock up your website DB
      – Website can focus on what it does, search can focus on what it does
    • Additional benefit: faceting (filtering), sorting
      – Ability to search content based on specific criteria (content type, author, taxonomy terms) and sort based on criteria (title, date, author, content type)
    • Hosted model (Acquia Search) or can be installed on a server in your infrastructure
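Faceting and sorting map onto a handful of Solr request parameters (`q`, `fq`, `facet`, `facet.field`, `sort`). The sketch below builds such a query string; the helper name and the field names used in the example (`type`, `author`, `created`) are assumptions, since the real field names depend on how the index schema is defined.

```python
from urllib.parse import urlencode


def solr_query_string(keywords, filters=None, facet_fields=(), sort=None):
    """Build the query string for a Solr select request."""
    params = [("q", keywords), ("wt", "json")]
    # Each filter query narrows results without affecting relevance scoring.
    for field, value in (filters or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    if facet_fields:
        params.append(("facet", "true"))
        params.extend(("facet.field", field) for field in facet_fields)
    if sort:
        params.append(("sort", sort))
    return urlencode(params)


# Hypothetical field names; they depend on the index schema.
query = solr_query_string(
    "drupal scaling",
    filters={"type": "article"},
    facet_fields=["author"],
    sort="created desc",
)
```

The resulting string would be appended to the search server's select URL, so the heavy lifting happens on the Solr box instead of the website database.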

Page 23: Prepare to Scale

Specialization

• Static Caching
  – Static-caching on the same server as the website provides performance improvement
    • Downside: there’s still a lot of wasted overhead. Apache is loaded with everything it needs to run a website, not just serve HTML; PHP also has to load.
  – Static-caching elsewhere provides the opportunity to optimize the server for static-caching
    • Side effect: your web server now has more memory free to handle requests that require PHP processing.
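A dedicated caching layer in front of the web server can be sketched as a small in-memory cache with a time-to-live: anonymous requests are answered from memory until the entry expires, and only misses reach the backend. The class name, the "any cookie means personalised" rule, and the injectable clock are simplifying assumptions for this example, not the behavior of a specific HTTP accelerator.

```python
import time


class ReverseProxyCache:
    """Answer repeat requests from memory; forward the rest to the backend."""

    def __init__(self, backend, ttl_seconds=300, clock=time.monotonic):
        self.backend = backend  # callable: url -> response body
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so expiry can be tested
        self.entries = {}       # url -> (expires_at, body)

    def handle(self, url, cookies=None):
        if cookies:
            # Assumption: any cookie means a personalised response.
            return self.backend(url)
        now = self.clock()
        entry = self.entries.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh copy: the web server is never contacted
        body = self.backend(url)
        self.entries[url] = (now + self.ttl, body)
        return body
```

Because the backend is only called on a miss or for cookie-bearing requests, the web server's memory stays available for the requests that need PHP.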

Page 24: Prepare to Scale

Specialization

• Static Caching, Cont.
  – Squid
    • Free
    • Not specifically designed just for HTTP acceleration
    • Difficult to set up/configure
    • Performance improvement, but less than the competition

Page 25: Prepare to Scale

Specialization

• Static Caching, Cont.
  – Varnish
    • Free (to download)
    • Pressflow built to work w/ Varnish
    • Varnish servers set up for Drupal and usable off Amazon EC2 (developed by Chapter 3) ($.34/hr + $.17/GB)
    • Designed from the ground up for HTTP acceleration
    • Can take time/expertise to get the performance you want
    • Can create a significant performance improvement once configured correctly

Page 26: Prepare to Scale

Specialization

• Static Caching, Cont.
  – AI-Cache
    • Best performance of the bunch
    • Simple configuration
    • Provides additional features for caching
      – header recognition
      – session caching
    • Drop-in solution
    • Not free
    • Amazon EC2 instance is available ($.68/hr + $.20/GB)

Page 27: Prepare to Scale

Specialization

• CDN
  – Cache content that is static (outside of full pages)
    • Images
    • Video
    • CSS
    • JS
  – Popular examples
    • Akamai
    • LimeLight
    • Amazon CloudFront
  – Separate domains, more bandwidth, geographic servers all equal faster loading
  – Can be an expensive option
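The "separate domains" point can be illustrated with a URL rewriter: static assets are pointed at one of several CDN host names, chosen by a stable hash so a given file always maps to the same host, while page requests stay on the main site. The host names, extension list, and function name are hypothetical, and query strings are not handled in this sketch.

```python
import posixpath
import zlib

STATIC_EXTENSIONS = {".png", ".jpg", ".gif", ".css", ".js", ".mp4"}
CDN_HOSTS = ["cdn1.example.com", "cdn2.example.com"]  # hypothetical hosts


def cdn_url(path, site_host="www.example.com"):
    """Send static assets to a CDN host; leave everything else on the site."""
    extension = posixpath.splitext(path)[1].lower()
    if extension not in STATIC_EXTENSIONS:
        return f"http://{site_host}{path}"
    # A stable hash keeps each asset on one host, so browser caches stay warm.
    host = CDN_HOSTS[zlib.crc32(path.encode("utf-8")) % len(CDN_HOSTS)]
    return f"http://{host}{path}"
```

Spreading assets over more than one host name lets browsers open additional parallel connections, which is one of the loading-speed gains the slide refers to.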

Page 28: Prepare to Scale

Summary

• Start small and make the easy optimizations:
  – Pressflow
  – InnoDB
  – APC
• Add servers and services as necessary and based on individual traffic:
  – MySQL
  – SOLR
  – Memcache
  – Static Cache
  – CDN

Page 29: Prepare to Scale

The End.

• Questions?