Performance optimization techniques for Java code
Who am I and why should you trust me?
● Attila-Mihály Balázs, http://hype-free.blogspot.com/
● Former malware researcher ("low-level guy")
● Current Java dev ("high-level dude")
● Spent the last ~6 months optimizing a large (1 000 000+ LOC) legacy system
● Will spend the next 6 months on it too (at least)
Question everything!
?
What's this about
● Core principles
● Demo 1: collections framework
● Demo 2, 3, 4: synchronization performance
● Demo 5: ugly code, is it worth it?
● Demo 6, 7, 8: playing with Strings
● Conclusions
● Q&A
What this is not about
● Selecting efficient algorithms
● High-level optimizations (architectural changes)
● These are important too! (but they require more effort, and we are going for the quick win here)
Core principles
● Performance is a balance, an endless game of shifting bottlenecks; no silver bullets here!
[Diagram labels: CPU, Memory, Disk, Network, Your program]
Perform on all levels!
● Performance has many levels:
  – Compiler (JIT): Java 5 to Java 6: 100% (1)
  – Memory: L1/L2 cache, main memory
  – Disk: cache, RAID, SSD
  – Network: 10 Mbit, 100 Mbit, 1000 Mbit
● Until recently we had it easy (performance doubled every 18 months)
● Now we need to do some work
(1) http://java.sun.com/performance/reference/whitepapers/6_performance.html
Core principles
● Measure, measure, measure! (before, during, after)
● Try using realistic data!
● Watch out for the Heisenberg effect (more on this later)
● Some things are not intuitive:
  – Pop question: if processing 1000 messages takes 1 second, how long does the processing of 1 message take?
Core principles
● Throughput
● Latency
● Thread context, context switching
● Lock contention
● Queueing theory
● Profiling
● Sampling
Feasibility – "numbers everyone should know" (2)
● L1 cache reference: 0.5 ns
● Branch mispredict: 5 ns
● L2 cache reference: 7 ns
● Mutex lock/unlock: 100 ns
● Main memory reference: 100 ns
● Compress 1K bytes with Zippy: 10,000 ns
● Send 2K bytes over 1 Gbps network: 20,000 ns
● Read 1 MB sequentially from memory: 250,000 ns
● Round trip within same datacenter: 500,000 ns
● Disk seek: 10,000,000 ns
● Read 1 MB sequentially from network: 10,000,000 ns
● Read 1 MB sequentially from disk: 30,000,000 ns
● Send packet CA->Netherlands->CA: 150,000,000 ns
(2) http://research.google.com/people/jeff/stanford-295-talk.pdf
Feasibility
● Amdahl's law: the speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program.
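Amdahl's law as stated above can be turned into a quick feasibility check before any threading work starts. The sketch below uses the usual formula, speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction and n the number of processors. The class name, method name and the 90% figure are made up for illustration.

```java
final class AmdahlSketch {

    /**
     * Upper bound on the speedup according to Amdahl's law.
     *
     * @param parallelFraction share of the run time that can be parallelized (0.0 to 1.0)
     * @param processors       number of processors working on the parallel part (at least 1)
     */
    static double speedup(double parallelFraction, int processors) {
        if (parallelFraction < 0.0 || parallelFraction > 1.0 || processors < 1) {
            throw new IllegalArgumentException("fraction must be in [0, 1] and processors >= 1");
        }
        double sequentialFraction = 1.0 - parallelFraction;
        return 1.0 / (sequentialFraction + parallelFraction / processors);
    }

    public static void main(String[] args) {
        // Hypothetical program: 90% of its run time can be parallelized.
        for (int n : new int[] {1, 2, 4, 8, 64, 1024}) {
            System.out.printf("%4d processors: at most %.2fx faster%n", n, speedup(0.9, n));
        }
    }
}
```

With a 10% sequential part the speedup can never reach 10x, no matter how many processors are added.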
Course of action
● Have a clear (written?), measurable goal: operation X should take less than 100 ms in 99.9% of the cases
● Measure (profile)
● Is the goal met? → The End
● Optimize hotspots → go to step 2 (Measure)
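A goal such as "less than 100 ms in 99.9% of the cases" can be checked mechanically once the latencies are recorded. The following is a minimal sketch, not a replacement for a profiler: the class, the method names and the stand-in for "operation X" are hypothetical, and the timing loop ignores effects such as GC pauses and timer resolution.

```java
import java.util.concurrent.TimeUnit;

final class LatencyGoalSketch {

    /** Runs the operation repeatedly and records how long each run took, in nanoseconds. */
    static long[] measure(Runnable operation, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            operation.run(); // give the JIT a chance to compile the hot path first
        }
        long[] latenciesNanos = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            operation.run();
            latenciesNanos[i] = System.nanoTime() - start;
        }
        return latenciesNanos;
    }

    /**
     * True if at least requiredPermille out of every 1000 samples finished in less than
     * limitNanos. Integer arithmetic avoids rounding surprises with values like 99.9%.
     */
    static boolean goalMet(long[] latenciesNanos, long limitNanos, int requiredPermille) {
        long fastEnough = 0;
        for (long latency : latenciesNanos) {
            if (latency < limitNanos) {
                fastEnough++;
            }
        }
        return fastEnough * 1000L >= (long) requiredPermille * latenciesNanos.length;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for "operation X"; replace with the real call.
        Runnable operationX = () -> Math.sqrt(System.nanoTime());
        long[] samples = measure(operationX, 10_000, 100_000);
        long limitNanos = TimeUnit.MILLISECONDS.toNanos(100);
        System.out.println("goal met: " + goalMet(samples, limitNanos, 999));
    }
}
```

Checking a percentile rather than the average matters here: a handful of very slow runs can hide behind a good mean.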
Tools
● VisualVM
● JProfiler
● YourKit
● Eclipse TPTP
● Netbeans Profiler
Demo 1: collections framework
● Name 3 things wrong with this code:
    Vector<String> v1;
    …
    if (!v1.contains(s)) { v1.add(s); }
Demo 1: collections framework
● Wrong data structure (list / array instead of set), hence slooow performance for large data sets (but not for small ones!)
● Extra synchronization if used by a single thread only
● Not actually thread safe! (only "exception safe")
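One possible fix for all three problems, as a sketch only (the class and method names are made up, and the demo's own solution may differ): use a Set, whose add() already means "insert if absent". The Vector version is not thread safe because contains() and add() are each synchronized on their own, so another thread can insert the same string between the two calls.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class UniqueStringsSketch {

    /** For a single thread: no locking, and insertion order is kept as it was in the Vector. */
    static Set<String> newSingleThreadedSet() {
        return new LinkedHashSet<>();
    }

    /** For several threads: add() is one atomic "insert if absent" step. */
    static Set<String> newConcurrentSet() {
        return ConcurrentHashMap.newKeySet(); // requires Java 8 or later
    }

    public static void main(String[] args) {
        Set<String> seen = newSingleThreadedSet();
        for (String s : new String[] {"a", "b", "a"}) {
            // add() returns false for a duplicate, so no separate contains() call is needed
            System.out.println(s + " added: " + seen.add(s));
        }
        System.out.println(seen);
    }
}
```

A hash-based set answers "is it already there?" in roughly constant time, while Vector.contains() scans the whole list, which is why the difference only shows up with large data sets.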
Demo 1: lessons
● Use existing classes
● Use realistic sample data
● Thread safety is hard!
● Heisenberg (observer) effect
Demo 2, 3, 4: synchronization performance
● If I have N units of work and use 4 threads, it must be faster than using a single thread, right?
● What does lock contention look like?
● What does a "synchronization train(wreck)" look like?
Demo 2, 3, 4: lessons
● Use existing classes
  – ReadWriteLock
  – java.util.concurrent.*
● Use realistic sample data (too short / too long units of work)
● Sometimes throwing a thread pool at it makes it worse!
● Consider using a private copy of the variable for each thread
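The last lesson, a private copy per thread, might look like the sketch below. The counting task and all names are invented for illustration and are not the demo's code. Each worker accumulates into a local variable and the partial results are combined once at the end, so the hot loop contains no lock and no shared counter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PrivateCopySketch {

    /** Counts the even numbers in data using the given number of worker threads. */
    static long countEven(int[] data, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (data.length + threads - 1) / threads; // ceiling division
            List<Future<Long>> partials = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int from = Math.min(t * chunk, data.length);
                final int to = Math.min(from + chunk, data.length);
                Callable<Long> task = () -> {
                    long local = 0; // private to this task: no lock, no contention
                    for (int i = from; i < to; i++) {
                        if ((data[i] & 1) == 0) {
                            local++;
                        }
                    }
                    return local;
                };
                partials.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get(); // the only point where results are combined
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("worker failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Whether this beats a single thread still depends on the size of each unit of work, so it needs measuring with realistic data like everything else.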
Demo 5: ugly code, is it worth it?
● Parsing a logfile
Demo 5: lessons
● Sometimes yes, but always profile first!
Demo 6: String.substring
● How are strings stored in Java?
Demo 6: Lesson
● You can look inside the JRE when needed!
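The slides do not spell out the answer, so the following rests on an assumption about the runtime. In older Sun/Oracle JREs a String held a char[] plus an offset and a count, and substring() returned a new String that shared the parent's array, which kept the whole parent alive. That changed during the Java 7 update releases, after which substring() copies. The sketch shows the defensive copy commonly used on the older runtimes; the method name and the log line in the usage note are made up.

```java
final class SubstringCopySketch {

    /** Returns the first fieldLength characters of line as a String with its own storage. */
    static String firstField(String line, int fieldLength) {
        // Assumption: on a JRE where substring() shares the parent's char[], new String(...)
        // copies just the characters that are needed, so a long line can be garbage collected.
        // On newer JREs substring() already copies and this extra copy is merely redundant.
        return new String(line.substring(0, fieldLength));
    }
}
```

For example, firstField("2010-03-01 INFO started", 10) yields the date without pinning the rest of the line in memory. Reading the String source of the JRE actually in use is the way to confirm which behavior applies.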
Demo 7: repetitive strings
Demo 7: Lessons
● You shouldn't use String.intern:
  – Slow
  – You have to use it everywhere
  – Needs hand-tuning
● Use a WeakHashMap for caching (don't forget to synchronize!)
● Use String.equals (not ==)
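A synchronized WeakHashMap cache for repetitive strings could be sketched as follows; the class and method names are hypothetical and the demo's own code may differ. One detail is easy to miss: the value must not hold a strong reference to the key, otherwise the entry can never be collected, so the value is wrapped in a WeakReference.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

final class StringCacheSketch {

    // Key and value refer to the same String; both references are weak, so an entry
    // disappears once no caller holds the canonical instance any more.
    private final Map<String, WeakReference<String>> cache = new WeakHashMap<>();

    /** Returns one shared instance for all strings that are equal to s. */
    synchronized String canonical(String s) {
        if (s == null) {
            return null;
        }
        WeakReference<String> ref = cache.get(s);
        String cached = (ref == null) ? null : ref.get();
        if (cached != null) {
            return cached;
        }
        cache.put(s, new WeakReference<>(s));
        return s;
    }
}
```

Callers should still compare the results with equals(), as the last bullet says: the cache saves memory, but code that relies on == breaks as soon as one string bypasses it.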
Demo 8: charsets
  – ASCII
  – ISO-8859-1
  – UTF-8
  – UTF-16
Demo 8: lessons
● Use UTF-8 where possible
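One way to see the trade-off between the four charsets is to compare encoded sizes. The sketch below is illustrative only: the class name and sample strings are made up, and UTF_16BE is used instead of plain UTF-16 so that no byte order mark is added to the count.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetSizeSketch {

    /** Number of bytes needed to store text in the given charset. */
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        Charset[] charsets = {
            StandardCharsets.US_ASCII,
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_8,
            StandardCharsets.UTF_16BE
        };
        // "caf\u00e9" is "cafe" with an acute accent on the e, written as an escape
        // so the source file encoding does not matter.
        for (String sample : new String[] {"hello", "caf\u00e9"}) {
            for (Charset charset : charsets) {
                System.out.printf("%-6s in %-10s: %d bytes%n",
                        sample, charset.name(), encodedLength(sample, charset));
            }
        }
    }
}
```

UTF-8 is as compact as ASCII for plain ASCII text and can still represent every character. ASCII cannot: getBytes() silently replaces the accented letter with '?', so the byte count looks fine while the data is lost.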
Conclusions
● Measure twice, cut once
● Don't trust advice you didn't test! (including mine)
● Most of the time you don't need to sacrifice clean code for performant code
Conclusions
● Slides:
  – Google Groups
  – http://hype-free.blogspot.com/
  – [email protected]
● Source code:
  – http://code.google.com/p/hype-free/source/browse/#svn/trunk/java-perfopt-201003
● Profiler evaluation licenses
Resources
● https://visualvm.dev.java.net/
● http://www.ej-technologies.com/
● http://blog.ej-technologies.com/
● http://www.yourkit.com/
● http://www.yourkit.com/docs/index.jsp
● http://www.yourkit.com/eap/index.jsp
Thank you!
Questions?