Writing Scalable Software in Java
![Page 1: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/1.jpg)
Writing Scalable Software in Java
From multi-core to grid-computing
![Page 2: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/2.jpg)
Me
• Ruben Badaró
• Dev Expert at Changingworlds/Amdocs
• PT.JUG Leader
• http://www.zonaj.org
![Page 3: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/3.jpg)
What this talk is not about
• Sales pitch
• Cloud Computing
• Service Oriented Architectures
• Java EE
• How to write multi-threaded code
![Page 4: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/4.jpg)
Summary
• Define Performance and Scalability
• Vertical Scalability - scaling up
• Horizontal Scalability - scaling out
• Q&A
![Page 5: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/5.jpg)
Performance != Scalability
![Page 6: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/6.jpg)
Performance
Amount of useful work accomplished by a computer system compared to the time and
resources used
![Page 7: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/7.jpg)
Scalability
Capability of a system to increase the amount of useful work as resources and load are added to
the system
![Page 8: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/8.jpg)
Scalability
• A system that performs fast with 10 users might not do so with 1000 - it doesn’t scale
• Designing for scalability usually costs some raw performance - coordination and distribution add overhead
![Page 9: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/9.jpg)
Linear Scalability
Throughput
Resources
![Page 10: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/10.jpg)
Reality is sub-linear
Throughput
Resources
![Page 11: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/11.jpg)
Amdahl’s Law
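The slide carries only the name of the law, so here is a minimal sketch of the usual statement: speedup = 1 / ((1 - p) + p / n), where p is the fraction of work that can run in parallel and n is the number of processors. The class and method names are hypothetical and exist only for illustration.

```java
// Illustrative sketch; class and method names are hypothetical.
final class AmdahlSketch {
    /**
     * Amdahl's Law: speedup = 1 / ((1 - p) + p / n)
     * p = fraction of the work that can run in parallel (0..1)
     * n = number of processors
     */
    static double speedup(double parallelFraction, int processors) {
        if (parallelFraction < 0.0 || parallelFraction > 1.0 || processors < 1) {
            throw new IllegalArgumentException("need 0 <= p <= 1 and n >= 1");
        }
        double serialFraction = 1.0 - parallelFraction;
        return 1.0 / (serialFraction + parallelFraction / processors);
    }

    public static void main(String[] args) {
        // With 90% parallel work, the serial 10% keeps the speedup below 10x
        // no matter how many processors are added.
        for (int n : new int[] {1, 2, 8, 64, 1024}) {
            System.out.printf("n=%d speedup=%.2f%n", n, speedup(0.9, n));
        }
    }
}
```

The point of the curve is that the serial fraction, not the core count, sets the ceiling.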
![Page 12: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/12.jpg)
Scalability is about parallelizing
• Parallel decomposition allows division of work
• Parallelizing might mean more work
• There’s almost always a serial part of the computation
![Page 13: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/13.jpg)
Vertical Scalability
![Page 14: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/14.jpg)
Vertical Scalability
Somewhat hard
![Page 15: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/15.jpg)
Vertical Scalability
Scale Up
• Bigger, meaner machines
- More cores (and more powerful)
- More memory
- Faster local storage
• Limited
- Technical constraints
- Cost - big machines get exponentially expensive
![Page 16: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/16.jpg)
Shared State
• Need to use those cores
• Java - shared-state concurrency
- Mutable state protected with locks
- Hard to get right
- Most developers don’t have experience writing multithreaded code
![Page 17: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/17.jpg)
This is what it looks like
public static synchronized SomeObject getInstance() {
    return instance;
}

public SomeObject doConcurrentThingy() {
    synchronized (this) {
        //...
    }
    return ..;
}
![Page 18: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/18.jpg)
Single vs Multi-threaded
• Single-threaded
- No scheduling cost
- No synchronization cost
• Multi-threaded
- Context Switching (high cost)
- Memory Synchronization (memory barriers)
- Blocking
![Page 19: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/19.jpg)
Lock Contention
Little’s Law
The average number of customers in a stable system is equal to their average arrival rate
multiplied by their average time in the system
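A small worked sketch of the law as L = λ × W, applied to a lock: treating threads as the "customers", the average number of threads waiting for or holding a lock is the rate at which the lock is requested times the average time each request spends waiting plus holding it. The class name and the figures in `main` are made up for illustration.

```java
// Illustrative sketch; names and numbers are hypothetical.
final class LittlesLawSketch {
    /** Little's Law: L = lambda * W. */
    static double averageInSystem(double arrivalsPerSecond, double avgSecondsInSystem) {
        return arrivalsPerSecond * avgSecondsInSystem;
    }

    public static void main(String[] args) {
        // Assumed workload: 200 lock requests per second, each spending
        // 10 ms on average waiting for and holding the lock.
        double threads = averageInSystem(200.0, 0.010);
        System.out.printf("average threads at the lock: %.1f%n", threads);
    }
}
```

Read the other way round, the law says contention drops if locks are requested less often or held for less time, which is what the next slide is about.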
![Page 20: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/20.jpg)
Reducing Contention
• Reduce lock duration
• Reduce frequency with which locks are requested (lock striping)
• Replace exclusive locks with other mechanisms
- Concurrent Collections
- ReadWriteLocks
- Atomic Variables
- Immutable Objects
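A minimal sketch of lock striping, assuming a counter keyed by arbitrary objects: instead of one lock guarding everything, each key hashes to one of several independent locks, so threads touching different stripes do not contend. The class is hypothetical and not taken from the talk.

```java
// Illustrative sketch of lock striping; the class is hypothetical.
final class StripedCounterSketch {
    private final Object[] locks;
    private final long[] counts;

    StripedCounterSketch(int stripes) {
        if (stripes < 1) {
            throw new IllegalArgumentException("need at least one stripe");
        }
        locks = new Object[stripes];
        counts = new long[stripes];
        for (int i = 0; i < stripes; i++) {
            locks[i] = new Object();
        }
    }

    private int stripeFor(Object key) {
        // Mask off the sign bit so the index is never negative.
        return (key.hashCode() & 0x7fffffff) % locks.length;
    }

    /** Only the stripe owning this key is locked. */
    void increment(Object key) {
        int s = stripeFor(key);
        synchronized (locks[s]) {
            counts[s]++;
        }
    }

    /** Sums stripe by stripe; not an atomic snapshot under concurrent writes. */
    long total() {
        long sum = 0;
        for (int i = 0; i < locks.length; i++) {
            synchronized (locks[i]) {
                sum += counts[i];
            }
        }
        return sum;
    }
}
```

The trade-off is that operations spanning all stripes, such as `total()`, become more expensive and weaker in their guarantees.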
![Page 21: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/21.jpg)
Concurrent Collections
• Use lock striping
• Includes putIfAbsent() and replace() methods
• ConcurrentHashMap has 16 separate locks by default (up to Java 7; Java 8 and later lock individual hash bins instead)
• Don’t reinvent the wheel
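A short sketch of how `putIfAbsent()` and `replace()` let a check-then-act sequence run without an external lock. The hit counter class is a hypothetical example; only the `ConcurrentMap` calls come from the JDK.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative sketch; the class itself is hypothetical.
final class HitCounterSketch {
    private final ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<>();

    /** Records one hit and returns the new count for the page. */
    int record(String page) {
        while (true) {
            Integer current = hits.putIfAbsent(page, 1);
            if (current == null) {
                return 1; // we inserted the first value
            }
            int next = current + 1;
            // replace() succeeds only if the value is still 'current'.
            if (hits.replace(page, current, next)) {
                return next;
            }
            // Another thread updated the entry in between; retry.
        }
    }

    int get(String page) {
        return hits.getOrDefault(page, 0);
    }
}
```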
![Page 22: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/22.jpg)
ReadWriteLocks
• Pair of locks
• Read lock can be held by multiple threads if there are no writers
• Write lock is exclusive
• Good improvement if the object has far fewer writers than readers
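A minimal sketch of the read/write lock pair guarding a read-mostly map, using `ReentrantReadWriteLock` from `java.util.concurrent.locks`. The cache class is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch; the class is hypothetical.
final class ReadMostlyCacheSketch {
    private final Map<String, String> data = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    /** Many readers may hold the read lock at the same time. */
    String get(String key) {
        lock.readLock().lock();
        try {
            return data.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** The write lock is exclusive: it waits for readers and blocks new ones. */
    void put(String key, String value) {
        lock.writeLock().lock();
        try {
            data.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```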
![Page 23: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/23.jpg)
Atomic Variables
• Allow check-then-update operations to be performed atomically
• Without locks - use low-level CPU instructions such as compare-and-swap
• It’s volatile on steroids (visibility + atomicity)
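A sketch of a check-then-update done with `AtomicInteger.compareAndSet()` instead of a lock: the increment only happens while the counter is below a limit. The bounded counter is a hypothetical example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch; the class is hypothetical.
final class BoundedCounterSketch {
    private final AtomicInteger value = new AtomicInteger();
    private final int max;

    BoundedCounterSketch(int max) {
        this.max = max;
    }

    /** Increments only while below max; returns false once the bound is reached. */
    boolean tryIncrement() {
        while (true) {
            int current = value.get();
            if (current >= max) {
                return false;
            }
            // Succeeds only if no other thread changed the value since get().
            if (value.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    int get() {
        return value.get();
    }
}
```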
![Page 24: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/24.jpg)
Immutable Objects
• Immutability makes concurrency simple - thread-safety guaranteed
• An immutable object:
- Class is final
- Fields are final and private
- Constructor constructs the object completely
- No state changing methods
- Copy internal mutable objects when receiving or returning
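The rules above can be sketched in one small class. The `ImmutableTeamSketch` type is hypothetical; it copies the incoming list and only ever exposes an unmodifiable view of its private copy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch; the class is hypothetical.
final class ImmutableTeamSketch {
    private final String name;
    private final List<String> members;

    ImmutableTeamSketch(String name, List<String> members) {
        this.name = name;
        // Defensive copy: later changes to the caller's list cannot leak in.
        this.members = Collections.unmodifiableList(new ArrayList<>(members));
    }

    String name() {
        return name;
    }

    /** Unmodifiable view of the private copy, safe to share between threads. */
    List<String> members() {
        return members;
    }

    /** "Changes" produce a new object instead of mutating this one. */
    ImmutableTeamSketch withMember(String member) {
        List<String> copy = new ArrayList<>(members);
        copy.add(member);
        return new ImmutableTeamSketch(name, copy);
    }
}
```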
![Page 25: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/25.jpg)
JVM issues
• Caching is useful - storing stuff in memory
• Larger JVM heap size means longer garbage collection times
• Not acceptable to have long pauses
• Solutions
- Maximum size for heap 2GB/4GB
- Multiple JVMs per machine
- Better garbage collectors: G1 might help
![Page 26: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/26.jpg)
Scaling Up: Other Approaches
• Change the paradigm
- Actors (Erlang and Scala)
- Dataflow programming (GParallelizer)
- Software Transactional Memory (Pastrami)
- Functional languages, such as Clojure
![Page 27: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/27.jpg)
Scaling Up: Other Approaches
• Dedicated JVM-friendly hardware
- Azul Systems is amazing
- Hundreds of cores
- Enormous heap sizes with negligible gc pauses
- HTM included
- Built-in lock elision mechanism
![Page 28: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/28.jpg)
Horizontal Scalability
![Page 29: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/29.jpg)
Horizontal Scalability
The hard part
![Page 30: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/30.jpg)
Horizontal Scalability
Scale Out
• Big machines are expensive - 1 x 32 core normally much more expensive than 4 x 8 core
• Increase throughput by adding more machines
• Distributed Systems research revisited - not new
![Page 31: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/31.jpg)
Requirements
• Scalability
• Availability
• Reliability
• Performance
![Page 32: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/32.jpg)
Typical Server Architecture
![Page 33: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/33.jpg)
... # of users increases
![Page 34: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/34.jpg)
... and increases
![Page 35: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/35.jpg)
... too much load
![Page 36: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/36.jpg)
... and we lose availability
![Page 37: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/37.jpg)
... so we add servers
![Page 38: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/38.jpg)
... and a load balancer
![Page 39: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/39.jpg)
... and another one rides the bus
![Page 40: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/40.jpg)
... we create a DB cluster
![Page 41: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/41.jpg)
... and we cache wherever we can
![Page 42: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/42.jpg)
Challenges
• How do we route requests to servers?
• How do we distribute data between servers?
• How do we handle failures?
• How do we keep our cache consistent?
• How do we handle load peaks?
![Page 43: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/43.jpg)
Technique #1: Partitioning
Users partitioned by name range: A...E | F...J | K...O | P...T | U...Z
![Page 44: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/44.jpg)
Technique #1: Partitioning
• Each server handles a subset of data
• Improves scalability by parallelizing
• Requires predictable routing
• Introduces problems with locality
• Move work to where the data is!
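A sketch of predictable routing for the A...E to U...Z split shown on the previous slide: the first letter of the user name picks the partition. The server names are placeholders and the router class is hypothetical.

```java
import java.util.TreeMap;

// Illustrative sketch; server names and the class itself are hypothetical.
final class RangeRouterSketch {
    // Maps the first letter of each range to the server owning that range.
    private final TreeMap<Character, String> ranges = new TreeMap<>();

    RangeRouterSketch() {
        ranges.put('A', "server-1"); // A...E
        ranges.put('F', "server-2"); // F...J
        ranges.put('K', "server-3"); // K...O
        ranges.put('P', "server-4"); // P...T
        ranges.put('U', "server-5"); // U...Z
    }

    /** The same user name always routes to the same server. */
    String serverFor(String userName) {
        if (userName == null || userName.isEmpty()) {
            throw new IllegalArgumentException("user name required");
        }
        char first = Character.toUpperCase(userName.charAt(0));
        if (first < 'A' || first > 'Z') {
            throw new IllegalArgumentException("expected a name starting with A-Z");
        }
        // floorEntry finds the range whose starting letter is <= first.
        return ranges.floorEntry(first).getValue();
    }
}
```

Range partitioning keeps routing easy to reason about, but name distributions are rarely uniform, so some servers may end up hotter than others; hashing the key is a common alternative.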
![Page 45: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/45.jpg)
Technique #2: Replication
Active
Backup
![Page 46: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/46.jpg)
Technique #2: Replication
• Keep copies of data/state in multiple servers
• Used for fail-over - increases availability
• Requires more cold hardware
• Overhead of replicating might reduce performance
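A deliberately simplified sketch of active/backup replication. Two in-memory maps stand in for two servers (an assumption made so the example stays in one process); every write is copied to the backup before it returns, which is where the overhead comes from, and fail-over promotes the backup.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; two maps in one process stand in for two servers.
final class ActiveBackupSketch {
    private Map<String, String> active = new HashMap<>();
    private Map<String, String> backup = new HashMap<>();

    /** Synchronous replication: each write does the work twice. */
    synchronized void put(String key, String value) {
        active.put(key, value);
        backup.put(key, value);
    }

    synchronized String get(String key) {
        return active.get(key);
    }

    /** Simulates losing the active node: the backup is promoted and a new backup is seeded from it. */
    synchronized void failOver() {
        active = backup;
        backup = new HashMap<>(active);
    }
}
```

A real deployment also has to detect the failure and handle writes that were in flight when it happened, which this sketch ignores.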
![Page 47: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/47.jpg)
Technique #3: Messaging
![Page 48: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/48.jpg)
Technique #3: Messaging
• Use message passing, queues and pub/sub models - JMS
• Improves reliability easily
• Helps deal with peaks
- The queue keeps filling
- If it gets too big, extra requests are rejected
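A sketch of the peak-handling behaviour described above, using an in-process `ArrayBlockingQueue` as a stand-in for a JMS queue (an assumption made to keep the example self-contained). `offer()` returns `false` instead of blocking when the queue is full, so extra requests are rejected.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative sketch; an in-process queue stands in for a message broker.
final class RequestBufferSketch {
    private final BlockingQueue<String> queue;

    RequestBufferSketch(int capacity) {
        queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns false (request rejected) when the queue is already full. */
    boolean submit(String request) {
        return queue.offer(request);
    }

    /** Returns the next request in arrival order, or null if none is waiting. */
    String next() {
        return queue.poll();
    }
}
```

Producers and consumers are decoupled: a burst fills the queue while workers keep draining it at their own pace.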
![Page 49: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/49.jpg)
Solution #1: De-normalize DB
• Faster queries
• Additional work to generate tables
• Less space efficiency
• Harder to maintain consistency
![Page 50: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/50.jpg)
Solution #2: Non-SQL Database
• Why not remove the relational part altogether?
• Bad for complex queries
• Berkeley DB is a prime example
![Page 51: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/51.jpg)
Solution #3: Distributed Key/Value Stores
• Highly scalable - used in the largest websites in the world, based on Amazon’s Dynamo and Google’s BigTable
• Mostly open source
• Partitioned
• Replicated
• Versioned
• No SPOF
• Voldemort (LinkedIn), Cassandra (Facebook) and HBase are written in Java
![Page 52: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/52.jpg)
Solution #4: MapReduce
Map...
• Divide Work
• Compute
Reduce...
• Return and aggregate
![Page 62: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/62.jpg)
Solution #4: MapReduce
• Google’s programming model for splitting work, processing the pieces in parallel and reducing the results to an answer
• Used for offline processing of large amounts of data
• Hadoop is used everywhere! Other options such as GridGain exist
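A single-machine sketch of the map and reduce steps, using a thread pool in place of a cluster (an assumption; this is not the Hadoop API). Each chunk of text is mapped to partial word counts in parallel, then the partial results are reduced into one total. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch; a thread pool stands in for a cluster of workers.
final class WordCountSketch {
    /** Map step: count the words in one chunk of input. */
    static Map<String, Integer> mapChunk(String chunk) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : chunk.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Reduce step: merge the partial counts into one result. */
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new HashMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, n) -> total.merge(word, n, Integer::sum));
        }
        return total;
    }

    /** Divides the work across the pool, waits for every map task, then reduces. */
    static Map<String, Integer> count(List<String> chunks, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Map<String, Integer>>> futures = new ArrayList<>();
            for (String chunk : chunks) {
                Callable<Map<String, Integer>> task = () -> mapChunk(chunk);
                futures.add(pool.submit(task));
            }
            List<Map<String, Integer>> partials = new ArrayList<>();
            for (Future<Map<String, Integer>> future : futures) {
                partials.add(future.get());
            }
            return reduce(partials);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while counting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a map task failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The map tasks share no state, which is what lets a real framework spread them over many machines.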
![Page 63: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/63.jpg)
Solution #5: Data Grid
• Data (and computations)
• In-memory - low response times
• Database back-end (SQL or not)
• Partitioned - operations on data executed in specific partition
• Replicated - handles failover automatically
• Transactional
![Page 64: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/64.jpg)
Solution #5: Data Grid
• It’s a distributed cache + computational engine
• Can be used as a cache with JPA and the like
• Oracle Coherence is very good.
• Alternatives: Terracotta, GridGain, GemFire, GigaSpaces, Velocity (Microsoft) and WebSphere eXtreme Scale (IBM)
![Page 65: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/65.jpg)
Retrospective
• You need to scale up and out
• Write code thinking of hundreds of cores
• Relational might not be the way to go
• Cache whenever you can
• Be aware of data locality
![Page 66: Writing Scalable Software in Java](https://reader033.fdocuments.net/reader033/viewer/2022050613/554e8c3ab4c90526358b4b0d/html5/thumbnails/66.jpg)
Q & A
Thanks for listening!
Ruben Badaró
http://www.zonaj.org