Scaling apps for the big time

Pro IT Consulting Scaling apps for the big time


Accompanying slides for the "Scaling apps for the big time" presentation delivered at MelbDjango 1.4, hosted by Common Code

Transcript of Scaling apps for the big time

Page 1: Scaling apps for the big time

Pro IT Consulting
Scaling apps for the big time

Page 2: Scaling apps for the big time

The Challenge?

• You have an app that works
• You have users that like it

Awesome

• Performance is suffering as you scale.
• Reliability is getting worse, not better.
• As your data sets grow, the problems are more pronounced.
• The operations team are talking about problems, not solutions

Not so awesome

Page 3: Scaling apps for the big time

So what happens if you win big?

Page 4: Scaling apps for the big time

You are not alone – unfortunately…

• Your cool app
• May end up supported
• By lots of things
• You can’t control

Page 5: Scaling apps for the big time

You are not alone – unfortunately…

Page 6: Scaling apps for the big time

What is the root cause?

• Take the time to understand what happens when your code asks the server to do some task.

select * from some_production_table_with_100_000_000_records;

is really not the same workload as

select * from some_dev_table_with_100_records;

• Look for evidence in logs and tools that provide real insight.
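The gap between the two queries above can be measured rather than guessed. The following is a minimal sketch, assuming SQLite as a stand-in for the production database; the table names, payload size, and row counts (scaled down to 100 and 200,000) are hypothetical choices made so that it runs quickly on a laptop.

```python
import sqlite3
import time


def build_table(conn, name, rows):
    """Create a table with `rows` records (name is a trusted, hard-coded identifier)."""
    conn.execute(f"CREATE TABLE {name} (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        f"INSERT INTO {name} (payload) VALUES (?)",
        (("x" * 100,) for _ in range(rows)),
    )


def time_select_all(conn, name):
    """Return (row_count, seconds) for an unfiltered SELECT * on the table."""
    start = time.perf_counter()
    rows = conn.execute(f"SELECT * FROM {name}").fetchall()
    return len(rows), time.perf_counter() - start


conn = sqlite3.connect(":memory:")
build_table(conn, "dev_table", 100)            # hypothetical dev-sized table
build_table(conn, "prod_like_table", 200_000)  # hypothetical, scaled-down "production"

dev_count, dev_secs = time_select_all(conn, "dev_table")
prod_count, prod_secs = time_select_all(conn, "prod_like_table")

print(f"dev:       {dev_count:>7} rows in {dev_secs:.4f}s")
print(f"prod-like: {prod_count:>7} rows in {prod_secs:.4f}s")
```

The same statement does very different amounts of work depending on the table behind it, which is why timings taken against development data say little about production.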

Page 7: Scaling apps for the big time

What is the root cause?

Page 8: Scaling apps for the big time

Issues of priority…

• Disk drive, single user session

• Disk drives, multiple users…

Page 9: Scaling apps for the big time

Issues of Scale…

• Fetching Blocks, single user session

• Fetching Blocks, enterprise workload

Page 10: Scaling apps for the big time

Storage

• Many database and operating system vendor recommendations are woefully out of date.

• Modern techniques utilising flash in the right way can deliver millions of random IOPS.

• SAN and flash vendors have made dramatic changes over the last few years that invalidate many of the old recommendations.

• Some principles still hold and are important for optimised performance
  – 1 process writes to each disk group
  – Avoid reads and writes occurring simultaneously if possible

Page 11: Scaling apps for the big time

CPU

• CPUs are not all created equal.
• Use SPECint to compare if it matters for your workload.
• Split up the work and scale wide if you can. There is a reason the web scale companies do this.
• Don’t process work now that can wait until later.
• Later might be in a few seconds and on another box.
• Schedule intensive workloads like reports.
• Don’t expect your laptop and the production server to scale the same way.
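The "don't process work now that can wait until later" advice can be sketched with a queue and a background worker. This is only an illustration using the Python standard library on one machine; `handle_request` and `build_report` are hypothetical names, and a real deployment would more likely push the job to a task queue running on another box.

```python
import queue
import threading

work_queue = queue.Queue()
results = []


def build_report(n):
    """Stand-in for an expensive report: sum of squares below n."""
    return sum(i * i for i in range(n))


def worker():
    """Drain the queue in the background until a None sentinel arrives."""
    while True:
        job = work_queue.get()
        if job is None:
            work_queue.task_done()
            break
        name, func, args = job
        results.append((name, func(*args)))
        work_queue.task_done()


def handle_request(user_id):
    """Hypothetical request handler: enqueue the heavy work and return at once."""
    work_queue.put((f"report-{user_id}", build_report, (10_000,)))
    return "accepted"


thread = threading.Thread(target=worker, daemon=True)
thread.start()

responses = [handle_request(user_id) for user_id in (1, 2, 3)]
print(responses)        # the caller is answered before any report is built

work_queue.join()       # wait for the deferred jobs (only needed in this demo)
work_queue.put(None)    # tell the worker to stop
thread.join()
print([name for name, _ in results])
```

The request path only pays for a queue insert; the report is built a moment later, off the critical path.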

Page 12: Scaling apps for the big time

Memory

• Memory is addressable in various forms with performance tradeoffs for capacity.

• Use the lowest latency one you can afford.

Memory type    Typical capacity    Approximate access time
CPU cache      30 MB               < 10 ns
DDR3           64 GB               < 100 ns
SSD            ~800 GB             < 20,000 ns
FC or SAS      ~1 TB               < 20,000,000 ns
SATA           4 TB+               < 8,000,000 ns
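To make the gaps between these tiers easier to feel, the access times can be expressed as multiples of the CPU cache figure. The sketch below uses the upper-bound numbers from the slide as given; treating them as exact values is an assumption made purely for the arithmetic.

```python
# Upper-bound access times in nanoseconds, taken from the slide as-is.
ACCESS_TIME_NS = {
    "CPU cache": 10,
    "DDR3": 100,
    "SSD": 20_000,
    "FC or SAS": 20_000_000,
    "SATA": 8_000_000,
}

baseline = ACCESS_TIME_NS["CPU cache"]

# How many CPU-cache accesses fit into one access of each tier.
relative = {tier: ns / baseline for tier, ns in ACCESS_TIME_NS.items()}

for tier, factor in relative.items():
    print(f"{tier:<10} {factor:>12,.0f} x the CPU cache access time")
```

Even the fastest persistent tier in the list is thousands of times slower than cache, which is the reason for using the lowest latency option you can afford.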

Page 13: Scaling apps for the big time

Network

• Why is it that we conceptualise networks from an individual point of view?

Page 14: Scaling apps for the big time

Network

The best transport is context dependent

Page 15: Scaling apps for the big time

Network

• Latency & Bandwidth are not the same thing.
  – Think satellite delay on a TV interview

• In this context we use these definitions
  – Latency is the amount of time data takes to reach the other end of the network.
  – Bandwidth is the rate at which we can successfully transmit data to the other end.

• This is why you need to test your app through a latency generator.
  – There are capable free open source tools such as WANem
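A rough model shows why latency and bandwidth have to be considered separately. The sketch below assumes a simple cost model (total time = round trips × round-trip time + payload ÷ bandwidth); the function name, the 200 round trips, and the link figures are hypothetical, and real protocols add effects this ignores.

```python
def transfer_seconds(payload_bytes, round_trips, rtt_ms, bandwidth_mbps):
    """Estimate elapsed time for a chatty exchange under a simple cost model."""
    latency_cost = round_trips * (rtt_ms / 1000)                        # waiting on the wire
    bandwidth_cost = (payload_bytes * 8) / (bandwidth_mbps * 1_000_000)  # pushing the bits
    return latency_cost + bandwidth_cost


# Same app, same 1 MB payload, same 100 Mbit/s link; only the latency differs.
lan = transfer_seconds(1_000_000, round_trips=200, rtt_ms=1, bandwidth_mbps=100)
wan = transfer_seconds(1_000_000, round_trips=200, rtt_ms=150, bandwidth_mbps=100)

print(f"LAN (1 ms RTT):   {lan:.2f} s")
print(f"WAN (150 ms RTT): {wan:.2f} s")
```

Under these assumptions the bandwidth term is identical in both cases, so the whole difference comes from the number of round trips multiplied by the latency. An app that feels instant on the office LAN can be unusable over a long link without any change in bandwidth.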

Page 16: Scaling apps for the big time

Middleware

• WebSphere, WebLogic, JBoss, Tomcat
  – Garbage collection tradeoffs between JVM size and system memory/CPU capacities.

• Django
  – Read High Performance Django by the team from Lincoln Loop
  – Sponsored by the Common Code team

Page 17: Scaling apps for the big time

SQL databases

• Microsoft SQL Server, Oracle DB, PostgreSQL & MySQL.
• Various strengths & weaknesses for each but have some key things in common.
• Offload reporting away from OLTP workloads
• Indexes are important
• Transaction Logs are a performance bottleneck
• Think deeply about scaling out
• Think about caching queries
• Backups are critical because you will need to restore one day
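The point about indexes can be checked by asking the database how it plans to run a query. This is a minimal sketch using SQLite's `EXPLAIN QUERY PLAN` as a stand-in for the equivalent tooling in the databases listed above; the `orders` table, its columns, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    ((i % 1000, float(i)) for i in range(50_000)),
)

QUERY = "SELECT * FROM orders WHERE customer_id = ?"


def query_plan(conn):
    """Return SQLite's plan description for QUERY (the last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn)  # no index yet: expect a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn)   # the planner can now search via the index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording varies between SQLite versions, but the first plan reports a scan of the whole table and the second names the index. Each of the larger databases has its own plan output that answers the same question.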

Page 18: Scaling apps for the big time

Backup is about Restore

• Enterprise-wide backup will find all your infrastructure failings by pushing more data for longer while other work continues.
• Test your restores. Really, test them.
• Offload large backups away from your production systems.
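A restore test can be as simple as backing up, opening the copy, and checking that the data is really there. The sketch below is one way to do that with SQLite's online backup API from the Python standard library; the `accounts` table and the specific checks (integrity check plus row count) are assumptions, and a production restore test would cover much more.

```python
import os
import sqlite3
import tempfile


def make_source():
    """Build a small in-memory database standing in for production."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO accounts (name) VALUES (?)",
        ((f"user-{i}",) for i in range(1000)),
    )
    conn.commit()
    return conn


def backup_to(conn, path):
    """Copy the live database to a file using SQLite's online backup API."""
    dest = sqlite3.connect(path)
    conn.backup(dest)
    dest.close()


def verify_restore(path, expected_rows):
    """Open the backup as if restoring it and check it is intact and complete."""
    restored = sqlite3.connect(path)
    try:
        integrity = restored.execute("PRAGMA integrity_check").fetchone()[0]
        count = restored.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
    finally:
        restored.close()
    return integrity == "ok" and count == expected_rows


source = make_source()
with tempfile.TemporaryDirectory() as tmp:
    backup_path = os.path.join(tmp, "backup.db")
    backup_to(source, backup_path)
    restore_ok = verify_restore(backup_path, expected_rows=1000)
source.close()

print("restore verified:", restore_ok)
```

A backup only counts once a check like this has passed against the restored copy, not against the original.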

Page 19: Scaling apps for the big time

Questions?

How to get in touch?

James Clifford
Email: [email protected]: 0421 648 034

Brenton Carbins
Email: [email protected]: 0409 779 230