Shopzilla On Concurrency
- 1 -
Shopzilla on Concurrency
Concurrency as Shopzilla's performance building block
Rod Barlow, Architect
3/2/2010
- 2 -
Agenda
Introduce Shopzilla
History of Java Concurrency
Java 5 Concurrency Features
Concurrency in Frameworks
Concurrency @ Shopzilla
Future
- 3 -
Shopzilla, Inc. - Online Shopping Network
100M impressions/day
20-29M unique visitors (UVs) per month
8,000+ searches per second
100M+ products
- 4 -
History of Java Concurrency
Concurrent code before Java 1.5 was difficult and error-prone.
Unintended side effects
Doug Lea's concurrent package (circa 1998)
Java users need to write reliable multi-threaded software!
JSR-133 Java Memory Model (threads, locks, volatiles, ...)
JSR-166 Concurrency Utilities
Expert groups including Bloch, Goetz, and Lea
- 5 -
Where is Concurrent Code?
Concurrent code is in our application containers
org.apache.tomcat.util.threads.ThreadPool
"A thread pool that is trying to copy the apache process management. Should we remove this in favor of Doug Lea's thread package?"
http://tomcat.apache.org/tomcat-6.0-doc/api/org/apache/tomcat/util/threads/ThreadPool.html
- 6 -
Where is Concurrent Code?
Concurrent code is in the frameworks we all use
Concurrent code is increasingly in our application code
Concurrent code @ Shopzilla!
- 7 -
Immutability
Immutability = a class whose instances can't be modified
E.g., String, boxed primitives, BigDecimal, BigInteger
Joshua Bloch's Effective Java sets forth guidelines:
Eliminate mutators
Eliminate extensibility
All fields final
Exclusivity for mutable components
Further, Bloch's Effective Java reminds us
Immutable objects are inherently thread safe.
Immutable objects require no synchronization
Immutable objects can be shared freely
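The four guidelines above can be sketched in a small value class. ProductSnapshot and its fields are hypothetical names chosen for illustration, not code from the talk; this is a minimal sketch, assuming the class owns a list of tags as its only mutable component.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical value class (not from the talk) that follows the guidelines.
final class ProductSnapshot {                    // eliminate extensibility: final class
    private final long productId;                // all fields final
    private final List<String> tags;             // mutable component, owned exclusively

    ProductSnapshot(long productId, List<String> tags) {
        this.productId = productId;
        // Defensive copy: later changes to the caller's list cannot reach us.
        this.tags = Collections.unmodifiableList(new ArrayList<>(tags));
    }

    long getProductId() { return productId; }

    List<String> getTags() { return tags; }      // read-only; no mutators are exposed

    // A "change" returns a new instance and leaves this one untouched.
    ProductSnapshot withTag(String tag) {
        List<String> copy = new ArrayList<>(tags);
        copy.add(tag);
        return new ProductSnapshot(productId, copy);
    }
}

class ImmutabilityDemo {
    public static void main(String[] args) {
        List<String> source = new ArrayList<>();
        source.add("camera");
        ProductSnapshot a = new ProductSnapshot(42L, source);
        source.add("lens");                      // does not leak into a
        ProductSnapshot b = a.withTag("tripod");
        System.out.println(a.getTags() + " " + b.getTags());
    }
}
```

Because no state can change after construction, instances of such a class can be handed to any number of threads without locking.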
- 8 -
Immutability - Guava
Formerly Google Collections, now Guava - http://code.google.com/p/guava-libraries/
http://www.infoq.com/news/2010/01/google_collections_10
“Immutability” guarantees no other actor in the system can change the state of the collection
“Unmodifiable” (java.util.Collections.unmodifiable... factory methods) only guarantees that the client of the collection – the user – can never change the collection
An instance of ImmutableList contains its own private data and will never change
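The difference between a read-only view and a private copy can be shown with the JDK alone. In this sketch the snapshot() helper is only a stand-in, using a defensive copy, for what Guava's ImmutableList.copyOf provides; the class and method names are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class UnmodifiableVsImmutable {
    // Read-only *view*: the client cannot modify it, but it still
    // reflects later changes made to the backing list by someone else.
    static List<String> view(List<String> backing) {
        return Collections.unmodifiableList(backing);
    }

    // Private copy wrapped read-only: a JDK-only stand-in for
    // ImmutableList.copyOf(backing). Nobody else holds the copy.
    static List<String> snapshot(List<String> backing) {
        return Collections.unmodifiableList(new ArrayList<>(backing));
    }

    public static void main(String[] args) {
        List<String> backing = new ArrayList<>();
        backing.add("a");
        List<String> view = view(backing);
        List<String> snapshot = snapshot(backing);
        backing.add("b");   // another actor changes the original list
        System.out.println("view size = " + view.size());
        System.out.println("snapshot size = " + snapshot.size());
    }
}
```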
- 9 -
Atomic References
Immune to deadlock and other liveness issues
Offer non-blocking synchronization of single variables
Offer lower scheduling overhead than traditional synchronization techniques
Effectively volatile variables with extra features
Modern hardware support through compare-and-swap processor instructions
public final boolean compareAndSet(V expect, V update);
- 10 -
Atomic References – Unique ID
We needed a unique ID value
Unique across multiple data centers and silos
Configuration elements prime the singleton IdGenerator for distributed uniqueness
A portion is based on a time value, e.g.: the seconds since the start of the month
An additional portion provides uniqueness within a single JVM, using an Atomic Reference
private static AtomicInteger sessionIdCounter = new AtomicInteger(100000);
...
sessionIdCounter.compareAndSet(999999, 100000);
...
sessionIdCounter.incrementAndGet();
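In the fragment above, the reset (compareAndSet) and the increment (incrementAndGet) are two separate atomic steps. Depending on how the elided code arranges them, two threads could interleave between the steps and push the counter past the upper bound. One way to make the wrap-around a single atomic step is a compare-and-set retry loop. This is a sketch only: WrappingCounter is a hypothetical name, and the bounds are simply the ones visible in the slide.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not Shopzilla's IdGenerator.
class WrappingCounter {
    static final int MIN = 100000;   // bounds taken from the slide's fragment
    static final int MAX = 999999;

    private final AtomicInteger counter;

    WrappingCounter(int start) {
        counter = new AtomicInteger(start);
    }

    // Read the current value, compute the successor, and publish it with
    // compareAndSet; if another thread changed the value meanwhile, retry.
    int next() {
        while (true) {
            int current = counter.get();
            int next = (current >= MAX) ? MIN : current + 1;
            if (counter.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    public static void main(String[] args) {
        WrappingCounter ids = new WrappingCounter(MAX - 1);
        System.out.println(ids.next() + " " + ids.next() + " " + ids.next());
    }
}
```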
- 11 -
Atomic References – Parent Node
AtomicReference's compareAndSet()
Used for visibility
Used to enforce data integrity constraints
Note ImmutableList
public class Node {
    private final NodeInfo nodeInfo;
    private final Collection<Node> children;
    private final AtomicReference<Node> parent;

    public Node(NodeInfo nodeInfo) {
        this.nodeInfo = nodeInfo;
        children = new ConcurrentLinkedQueue<Node>();
        parent = new AtomicReference<Node>();
    }

    public void addChild(Node szNode) {
        szNode.setParent(this);
        children.add(szNode);
    }

    // A node may be given a parent only once; a second attempt fails
    private void setParent(Node newParent) {
        if (!parent.compareAndSet(null, newParent)) {
            throw new NodeException("Child: " + this + " already has parent. Attempted:" + newParent);
        }
    }

    public Node getParent() {
        return parent.get();
    }

    public Collection<Node> getChildren() {
        return ImmutableList.copyOf(children);
    }

    public NodeInfo getNodeInfo() {
        return nodeInfo;
    }
}
- 12 -
Atomic References – Takeaways
Volatiles suffice where atomic check-then-act is overkill
Some atomic nonblocking algorithms retry in a loop after a failed compareAndSet()
Under high thread contention the repeated retries can make this less efficient than locking
Most real-world threads have far more to do than mere lock contention though
- 13 -
Blocking Queues
Acts as a thread-safe implementation of a producer / consumer pattern
Similar to a JMS queue, though not distributed
Insertion blocks until there is space available (for bounded queues)
- 14 -
Blocking Queues – Data Publish
A very large (50GB) flat-file
Consumers send data to a remote grid cache
Multiple queue consumers increased throughput
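BlockingQueue<String> blockingQueue = new LinkedBlockingQueue<String>(upperBound);
…
private final BlockingQueue<String> blockingQueue;

// Producer: put() blocks while the bounded queue is full,
// whereas offer() would return false and silently drop the line
BufferedReader reader = new BufferedReader(new FileReader("numbers.txt"));
String line = reader.readLine();
while (line != null) {
    blockingQueue.put(line);
    line = reader.readLine();
}
…
// Consumer: take() blocks until an element is available
String string = blockingQueue.take();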
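A complete producer with several consumers might look like the following sketch. The sentinel ("poison pill") shutdown, the class name PublishPipeline, and the in-memory list standing in for the remote grid cache are all assumptions made for illustration; the talk does not say how its consumers were stopped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class PublishPipeline {
    // Unique sentinel, compared by identity, that tells a consumer to stop.
    private static final String POISON = new String("EOF");

    static List<String> run(List<String> lines, int consumers, int bound) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(bound);
        // Stand-in for the remote cache the consumers would write to.
        List<String> published = Collections.synchronizedList(new ArrayList<>());
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < consumers; i++) {
            Thread t = new Thread(() -> {
                try {
                    // take() blocks until a line (or the sentinel) arrives.
                    for (String line = queue.take(); line != POISON; line = queue.take()) {
                        published.add(line);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            workers.add(t);
        }
        try {
            for (String line : lines) {
                queue.put(line);              // blocks while the queue is full
            }
            for (int i = 0; i < consumers; i++) {
                queue.put(POISON);            // one sentinel per consumer
            }
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while publishing", e);
        }
        return published;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1", "2", "3", "4", "5");
        System.out.println(run(lines, 3, 2).size() + " lines published");
    }
}
```

The bound on the queue is what keeps a fast file reader from running far ahead of slower consumers and exhausting memory.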
- 15 -
Locks – ReadWriteLock
Allows for multiple concurrent read locks
No new read locks once a write lock is placed
Write lock blocks until read locks complete
Lock modes: non-fair (default), fair
Reentrancy
Downgrading
readWriteLock = new ReentrantReadWriteLock();
readLock = readWriteLock.readLock();
writeLock = readWriteLock.writeLock();
...
writeLock.lock();
try {
    ...
} finally {
    writeLock.unlock();
}
readLock.lock();
try {
    ...
} finally {
    readLock.unlock();
}
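Downgrading, listed above, means acquiring the read lock while still holding the write lock and then releasing the write lock, so no other writer can slip in between the update and the read that follows. A minimal sketch, assuming a hypothetical PriceCache class that is not from the talk:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical cache used only to illustrate read locks and downgrading.
class PriceCache {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, Integer> prices = new HashMap<>();

    // Many readers may hold the read lock at the same time.
    Integer get(String sku) {
        lock.readLock().lock();
        try {
            return prices.get(sku);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Update, then downgrade from the write lock to the read lock.
    int putAndRead(String sku, int cents) {
        lock.writeLock().lock();
        try {
            prices.put(sku, cents);
            lock.readLock().lock();      // acquire read while still holding write
        } finally {
            lock.writeLock().unlock();   // release write; other readers may proceed
        }
        try {
            return prices.get(sku);      // no writer could have intervened
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        PriceCache cache = new PriceCache();
        System.out.println(cache.putAndRead("sku-1", 1999));
    }
}
```

The reverse direction, upgrading from a read lock to a write lock, is not supported by ReentrantReadWriteLock.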
- 16 -
Distributed Cached Data Snapshot
Multiple clients → Multiple HTTP requests
Multiple load balanced JVMs
Distributed data grid cache
Need to stream a data snapshot to n clients
- 17 -
Distributed Cached Data Snapshot
Coherence partitioned cache supports cluster-wide, key-based locks
Locked objects can still be read by other cluster threads without a lock
Locks are unaffected by server failure (and will fail over to a backup server). Locks are immediately released when the lock owner (client) fails.
Lock timeouts: -1 (block until acquired), 0 (return immediately), 1+ (wait up to that many milliseconds)
- 18 -
Distributed Cached Data Snapshot
Hazelcast
http://www.hazelcast.com/
Open source clustering and highly scalable data distribution platform for Java
Distributed data structures
Queue / Topic
Map, MultiMap, Set, List
Lock
Effectively distributed java.util.concurrent
Uses TCP/IP
Cluster wide ID generators
Distributed executor services
Lock lock = Hazelcast.getLock(myLockedObject);
lock.lock();
try {
    // do something here
} finally {
    lock.unlock();
}
...
if (lock.tryLock(5000, TimeUnit.MILLISECONDS)) {
    try {
        // do some stuff here..
    } finally {
        lock.unlock();
    }
}
- 19 -
Distributed Cached Data Snapshot
Apache Zookeeper http://hadoop.apache.org/zookeeper/
Terracotta http://www.terracotta.org/
- 20 -
Distributed Cached Data Snapshot
Distributed competition for the publishing privilege
Computation of completeness
Communicate completeness
Other threads in other JVMs happily polling for State.DONE
public class QueueState {
    State state;
    Long expectedSize;
    ...
}

public void populateQueue(final String version) {
    final NamedCache cache = CacheFactory.getCache("queue-state");
    // A wait of 0 returns immediately if another cluster member holds the lock
    final boolean lockObtained = cache.lock(version, 0);
    if (lockObtained) {
        try {
            QueueState state = (QueueState) cache.get(version);
            cache.put(version, new QueueState(State.PUBLISHING));
            publishSnapshotIds(version);
            // compute completeness ...
            cache.put(version, new QueueState(State.DONE));
        } finally {
            cache.unlock(version);
        }
    }
}

long publishSnapshotIds(Identifier version) {
    final Set<?> keySet = dataCache.keySet();
    for (Object key : keySet) {
        messagingSession.publishMessage(version, (Long) key);
    }
    return keySet.size();
}
- 21 -
Distributed Cached Data Snapshot
n clients
n HTTP requests across 6 load-balanced Tomcat JVMs
Threads failing to acquire lock immediately start shipping data
What about the thread obtaining the lock?
@GET
@Path("data/snapshot/{version}")
@Produces( { "application/fastinfoset" })
@Override
public StreamingOutput getDataSnapshot(@PathParam("version") final String version) throws Exception {
    distributedQueueLock.populateQueue(version);
    return new DataStreamingOutput(messagingSession, dataDao, dataModelAdapter, version);
}
- 22 -
Concurrency in Frameworks
Hibernate Core 3.5.0
CountDownLatch
Use when a thread must wait until some number of events have occurred
Constructed with the number of events that must occur before waiting threads are released
Callable, ExecutorService, ReentrantLock, AtomicReference
final CountDownLatch readerLatch = new CountDownLatch(1);
final CountDownLatch writerLatch = new CountDownLatch(1);
final CountDownLatch completionLatch = new CountDownLatch(1);
final ExceptionHolder holder = new ExceptionHolder();

Thread reader = new Thread() {
    public void run() {
        try {
            BatchModeTransactionManager.getInstance().begin();
            log.debug("Transaction began, get value for key");
            assertTrue(VALUE2.equals(region.get(KEY)) == false);
            BatchModeTransactionManager.getInstance().commit();
        } catch (AssertionFailedError e) {
            holder.a1 = e;
            rollback();
        } catch (Exception e) {
            holder.e1 = e;
            rollback();
        } finally {
            readerLatch.countDown();
        }
    }
};
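The latch pattern in the Hibernate test above reduces to the following self-contained sketch: a coordinating thread constructs the latch with the number of expected events and blocks in await() until each worker has called countDown(). LatchDemo and the five-second timeout are illustrative assumptions, not part of the Hibernate code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class LatchDemo {
    // Starts one thread per event and waits until every thread has counted down.
    static int awaitEvents(int events) {
        CountDownLatch done = new CountDownLatch(events);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < events; i++) {
            new Thread(() -> {
                try {
                    completed.incrementAndGet();   // the "work"
                } finally {
                    done.countDown();              // always signal, even on failure
                }
            }).start();
        }
        try {
            // Bounded wait so a lost countDown() cannot hang the caller forever.
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for workers");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(awaitEvents(4) + " events observed");
    }
}
```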
- 23 -
Concurrency in Frameworks
Spring Framework 3.0.1
TaskExecutor
Spring 2.0 supported Java 1.4
TaskExecutor did not implement Executor
In Spring 3.0 TaskExecutor extends Executor
TaskExecutor sees wide use within Spring framework
Quartz
Message Driven POJO
Spring Enterprise Recipes – Josh Long, Gary Mak
public interface TaskExecutor extends Executor { ... }

public interface AsyncTaskExecutor extends TaskExecutor { ... }

public class SimpleAsyncTaskExecutor extends CustomizableThreadCreator
        implements AsyncTaskExecutor, Serializable { ... }

public void execute(Runnable task, long startTimeout) {
    Assert.notNull(task, "Runnable must not be null");
    if (isThrottleActive() && startTimeout > TIMEOUT_IMMEDIATE) {
        this.concurrencyThrottle.beforeAccess();
        doExecute(new ConcurrencyThrottlingRunnable(task));
    } else {
        doExecute(task);
    }
}
- 24 -
Shopzilla's Website Concurrency
Needed sub-650ms server-side response time
Simplify the layers
Decompose architecture
Functionally separate, individually testable, loosely coupled web-services
Define SLAs
- 25 -
Shopzilla's Website Concurrency
How to invoke 30+ web-services and ship a page in <650ms?
Concurrency!
Our pages today ship within 250ms
- 26 -
Shopzilla's Website Concurrency
Pods & Service Calls
- 27 -
Shopzilla's Website Concurrency
Started simple
Implement only the concurrency features required
Concurrency isolated to pods
Pods responsible for fetching data
We're using simple building blocks
Incremental implementation based solely on requirements
Haven't seen deadlocks
- 28 -
Shopzilla's Website Concurrency
Thread longevity configured at the HTTP connection level
HTTPClient connectionTimeout
Spring wired HTTPClient implementation
Ability to add a pod to a controller
<!-- Bean used for HTTP communication. -->
<bean id="commonsHttpClientTemplate" class="com.shopzilla.site.common.http.CommonsHttpClientTe...">
    <!-- Max time waiting to open a connection -->
    <property name="connectionTimeout" value="1000" />
    <!-- Max time waiting to receive data -->
    <property name="socketTimeout" value="1000" />
    <!-- Essentially disable an upper limit; we own these hosts -->
    <property name="maxConnectionsPerHost" value="9999" />
</bean>
new FuturePodResult(executor.submit(new PodCallable<SearchCommand>(
relatedSearchPod, search, request, dictionary))));
- 29 -
Shopzilla's Website Concurrency
FuturePodResult implements the PodResult interface
Abstracts the details of the future
PodCallable types the pod and command
public class FuturePodResult implements PodResult {
    private final Future<PodResult> future;

    public FuturePodResult(Future<PodResult> future) {
        this.future = future;
    }

    public Map<String, Macro> getMacros() throws InterruptedException, PodException {
        try {
            return future.get().getMacros();
            ...

public class PodCallable<T> implements Callable<PodResult> {
    final Pod<T> pod;
    final HttpServletRequest request;
    final T search;
    final Map<String, Object> dictionary;

    public PodResult call() throws PodException {
        ThreadLocalHelper.beforeExecute(request);
        try {
            return pod.invoke(request, search, dictionary);
        } finally {
            ThreadLocalHelper.afterExecute();
        }
    }
}
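The overall fan-out idea (submit every pod to an executor, then collect the futures) can be sketched without the Shopzilla types. Everything here is an assumption for illustration: PageAssembler, the use of plain strings as pod output, and in particular the shared page budget enforced with Future.get(timeout). The talk itself describes bounding thread lifetime through HTTP connection and socket timeouts.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical page assembler; pods are modelled as Callable<String>.
class PageAssembler {
    static Map<String, String> render(Map<String, Callable<String>> pods, long budgetMillis) {
        ExecutorService executor = Executors.newFixedThreadPool(Math.max(1, pods.size()));
        try {
            // Submit everything first so all pods run concurrently.
            Map<String, Future<String>> futures = new LinkedHashMap<>();
            for (Map.Entry<String, Callable<String>> pod : pods.entrySet()) {
                futures.put(pod.getKey(), executor.submit(pod.getValue()));
            }
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(budgetMillis);
            Map<String, String> page = new LinkedHashMap<>();
            for (Map.Entry<String, Future<String>> entry : futures.entrySet()) {
                long remaining = Math.max(0, deadline - System.nanoTime());
                try {
                    page.put(entry.getKey(), entry.getValue().get(remaining, TimeUnit.NANOSECONDS));
                } catch (TimeoutException | ExecutionException e) {
                    entry.getValue().cancel(true);
                    page.put(entry.getKey(), "");   // degrade: ship the page without this pod
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    page.put(entry.getKey(), "");
                }
            }
            return page;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Map<String, Callable<String>> pods = new LinkedHashMap<>();
        pods.put("products", () -> "<div>products</div>");
        pods.put("reviews", () -> { Thread.sleep(2000); return "<div>reviews</div>"; });
        System.out.println(render(pods, 250));
    }
}
```

With this shape the page latency approaches that of the slowest pod rather than the sum of all pods. A production version would reuse one long-lived executor instead of creating a pool per request.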
- 30 -
Shopzilla's Website Concurrency
Need to execute pods
Configurable ExecutorService
Backed with a queue
Naming of threads proved useful in initial testing (JMX)
public ExecutorService create() {
    // Note: with an unbounded queue the pool never grows beyond corePoolSize
    final LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>();
    final ThreadFactory threadFactory;
    if (poolName == null) {
        threadFactory = Executors.defaultThreadFactory();
    } else {
        threadFactory = new NamedPoolThreadFactory(poolName);
    }
    return new ThreadPoolExecutor(corePoolSize, maximumPoolSize,
            keepAliveTimeSeconds, TimeUnit.SECONDS, queue, threadFactory);
}

private static class NamedPoolThreadFactory implements ThreadFactory {
    private static final AtomicInteger poolNumber = new AtomicInteger(1);
    final AtomicInteger threadNumber = new AtomicInteger(1);
    private final String namePrefix;

    private NamedPoolThreadFactory(String poolName) {
        namePrefix = poolName + "-" + poolNumber.getAndIncrement() + "-thread-";
    }

    public Thread newThread(Runnable runnable) {
        return new Thread(runnable, namePrefix + threadNumber.getAndIncrement());
    }
}
- 31 -
Shopzilla's Website Concurrency
Once the concept was proven, interesting feature requests materialized
Product Review pod
Distilled: two pods needed to share a single result
Added ServiceInvocation concept
- 32 -
Shopzilla's Website Concurrency
Pods now have access to a Service Invocation Map
get() blocks on the result of the service invocation
A single service invocation result can be shared between two pods
public interface ServiceInvocationCallback<T> {
    public T invokeService() throws ServiceInvocationException;
}

public class ServiceInvocationCallable<T, E extends ServiceInvocationCallback<T>> implements Callable<T> {
    private E callback;

    public ServiceInvocationCallable(E callback) {
        this.callback = callback;
    }

    public T call() throws ServiceInvocationException {
        try {
            return this.callback.invokeService();
            ...

serviceInvocationMap.add(CommonServiceInvocations.PRODUCTS,
        new ServiceInvocationCallable<ProductSearchResults, SearchCommand>(
                productServiceCallback, search, request));

// Product Pod
serviceInvocationMap.get(CommonServiceInvocations.PRODUCTS, ProductSearchResults.class);

// Product Reviews Pod
serviceInvocationMap.get(CommonServiceInvocations.PRODUCTS, ProductSearchResults.class);
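The talk does not show how the Service Invocation Map is implemented. One plausible shape, offered purely as a sketch, is a concurrent map from a key to a Future: add() starts the call once, and every get() for that key blocks on the same Future. InvocationMap and its method signatures are assumptions modelled on the usage above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

// Hypothetical stand-in for the Service Invocation Map described in the talk.
class InvocationMap {
    private final ExecutorService executor;
    private final ConcurrentMap<String, Future<?>> futures = new ConcurrentHashMap<>();

    InvocationMap(ExecutorService executor) {
        this.executor = executor;
    }

    // Registers and starts the call; a second add() for the same key is ignored.
    <T> void add(String key, Callable<T> call) {
        FutureTask<T> task = new FutureTask<>(call);
        if (futures.putIfAbsent(key, task) == null) {
            executor.execute(task);
        }
    }

    // Blocks until the shared result is available, then casts it to the requested type.
    <T> T get(String key, Class<T> type) {
        Future<?> future = futures.get(key);
        if (future == null) {
            throw new IllegalArgumentException("No invocation registered for " + key);
        }
        try {
            return type.cast(future.get());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        InvocationMap map = new InvocationMap(executor);
        map.add("PRODUCTS", () -> "product search results");
        System.out.println(map.get("PRODUCTS", String.class));   // first pod
        System.out.println(map.get("PRODUCTS", String.class));   // second pod, same result
        executor.shutdown();
    }
}
```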
- 33 -
Shopzilla's Website Concurrency
Now that we were sharing results, we were done, right?
Product Review information was now required in-line in the product pod
Still needed the special Product Review pod too!
Dependent Service Invocations
- 34 -
Shopzilla's Website Concurrency
Service Invocations can now depend on results of others
Dependent Callable is configured with two callbacks:
A callback whose result is blocked for
A callback which is invoked once the blocking result arrives
public class DependentServiceInvocationCallable<T, E> implements Callable<T> {
    BlockingServiceInvocationCallback<E> blockingCallback;
    ServiceInvocationCallback<T, E> serviceCallback;
    HttpServletRequest request;

    public DependentServiceInvocationCallable(
            BlockingServiceInvocationCallback<E> blockingCallback,
            ServiceInvocationCallback<T, E> serviceCallback,
            HttpServletRequest request) { ... }

    public T call() throws ServiceInvocationException {
        ThreadLocalHelper.beforeExecute(request);
        try {
            // Block until the upstream invocation has produced its result
            E serviceRequest = blockingCallback.waitForResult();
            if (serviceRequest == null) {
                throw new ServiceInvocationException("Result of blocking helper was null");
            }
            return serviceCallback.invokeService(serviceRequest, request);
        } finally {
            ThreadLocalHelper.afterExecute();
        }
    }
}
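Stripped of the Shopzilla types, a dependent invocation is a Callable that first blocks on an upstream Future and then feeds that result into a second call. The sketch below uses hypothetical names (DependentCall, DependentDemo) and java.util.function.Function in place of the two callback interfaces.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical analogue of DependentServiceInvocationCallable.
class DependentCall<E, T> implements Callable<T> {
    private final Future<E> upstream;        // the result that is blocked for
    private final Function<E, T> service;    // invoked once that result arrives

    DependentCall(Future<E> upstream, Function<E, T> service) {
        this.upstream = upstream;
        this.service = service;
    }

    public T call() throws Exception {
        E input = upstream.get();
        if (input == null) {
            throw new IllegalStateException("Result of blocking call was null");
        }
        return service.apply(input);
    }
}

class DependentDemo {
    static String run() {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            Future<Integer> productId = executor.submit(() -> 7);
            Future<String> reviews = executor.submit(
                    new DependentCall<Integer, String>(productId, id -> "reviews-for-" + id));
            return reviews.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

One design caveat: a dependent task occupies a pool thread while it waits. If a bounded pool fills up with waiting dependents before their upstream calls are scheduled, no thread is left to run those upstream calls, so upstream work should be submitted first or given its own pool.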
- 35 -
Future
More use of distributed data structures
Spring 3.0
@Async
@Scheduled
JSR-315 Servlets 3.0
AsyncContext
More parallelism
BlockingQueue<MyTask> q = Hazelcast.getQueue("tasks");
q.put(new MyTask());
MyTask task = q.take();

@Async // Spring 3
public Future<Customer> createCustomer(String fn, String ln, String email);

@Scheduled(cron="*/5 * * * * MON-FRI")
public void doSomething() {
    // something that should execute on weekdays only
}
- 36 -
Resources, Books
Java Concurrency in Practice (Goetz)
Effective Java (Bloch)
Spring Enterprise Recipes (Long, Mak)
http://jcp.org/en/jsr/detail?id=133
http://jcp.org/en/jsr/detail?id=166
Spring 3.0
- 37 -
For more info…go to:
http://tech.shopzilla.com
http://rodneybarlow.org