Developing distributed applications with Akka and Akka Cluster
konstantin-tsykulenko -
What is Akka?
• A Scala and Java framework for scalability, fault-tolerance, concurrency and remoting through actors.
• Inspired by Erlang OTP.
• Developed by Typesafe: https://www.typesafe.com/.
Concurrency paradigms
• Shared state and locks
• Software Transactional Memory (STM)
• Message-Passing Concurrency (Actors)
• Dataflow Concurrency
• and more…
STM
Dataflow Concurrency
Actors
• Originate in a 1973 paper by Carl Hewitt
• Implemented in Erlang
• Encapsulate state and behavior
• Closer to the definition of OO than classes
• Implement Message-Passing Concurrency
• Share nothing
• Isolated lightweight processes
• Communicate through messages
• Asynchronous and non-blocking
Concurrency model
• No shared state, no synchronization
• Each actor has a mailbox (message queue)
• Non-blocking send
• Blocking receive
• Messages are immutable
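import akka.actor.Actor

case class User(name: String)

// Fire-and-forget send to the actor's mailbox
userActor ! User("John Doe")

class UserActor extends Actor {
  def receive = {
    // The s prefix is needed for $name to be interpolated
    case User(name) => sender() ! s"Hi $name"
  }
}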
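The mailbox model above can be imitated without Akka to see what "non-blocking send, blocking receive" means in practice. This is only a toy sketch in plain Scala: `ToyActor`, `stopAndWait` and `demo` are hypothetical names, and one thread per actor is an assumption made for brevity (Akka multiplexes many actors over a small thread pool).

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.collection.mutable.ListBuffer

object MailboxSketch {
  // Immutable message, mirroring the User message on the slide.
  final case class User(name: String)

  // Toy actor: one mailbox, one thread draining it, state touched only by that thread.
  final class ToyActor(handler: AnyRef => Unit) {
    private val mailbox = new LinkedBlockingQueue[AnyRef]()
    private val Stop = new Object // private sentinel that ends the receive loop

    private val worker = new Thread(new Runnable {
      def run(): Unit = {
        var running = true
        while (running) {
          val msg = mailbox.take() // blocking receive: waits until a message arrives
          if (msg eq Stop) running = false else handler(msg)
        }
      }
    })
    worker.start()

    // Send: enqueue and return immediately (the queue is unbounded).
    def !(msg: AnyRef): Unit = mailbox.put(msg)

    // Ask the loop to finish after the messages already queued, then wait for it.
    def stopAndWait(): Unit = { mailbox.put(Stop); worker.join() }
  }

  def demo(): List[String] = {
    val greetings = ListBuffer.empty[String] // written only by the actor's thread
    val actor = new ToyActor({
      case User(name) => greetings += s"Hi $name"; ()
      case _          => ()
    })
    actor ! User("John Doe")
    actor ! User("Jane Roe")
    actor.stopAndWait() // join() makes the worker's writes visible here
    greetings.toList
  }
}
```

Messages from a single sender are handled in the order they were sent, so `MailboxSketch.demo()` yields the two greetings in send order.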
Dispatchers
sample-dispatcher {
  type = Dispatcher
  executor = "fork-join-executor"
  fork-join-executor {
    parallelism-min = 2
    parallelism-factor = 2.0
    parallelism-max = 10
  }
  throughput = 100
}
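As far as the Akka documentation describes it, the fork-join pool size is the number of available processors multiplied by parallelism-factor, rounded up and then clamped between parallelism-min and parallelism-max, while throughput is the maximum number of messages an actor processes before its thread moves on to the next actor. The arithmetic can be sketched as follows; `poolSize` is a hypothetical helper, not an Akka API.

```scala
object DispatcherSizing {
  // ceil(cores * factor), clamped to the range [min, max].
  def poolSize(cores: Int, min: Int, factor: Double, max: Int): Int =
    math.min(math.max(math.ceil(cores * factor).toInt, min), max)

  // The values from the sample-dispatcher config above.
  def samplePoolSize(cores: Int): Int =
    poolSize(cores, min = 2, factor = 2.0, max = 10)
}
```

With the sample values, a 4-core machine would get 8 threads, while an 8-core machine would be capped at 10.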
Supervision and hierarchy
Routers
• Local
• Remote
• Various routing algorithms (round robin, random, consistent hashing, etc.)
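To make two of the routing algorithms concrete, here is a minimal sketch in plain Scala. It does not use Akka's router classes: `RoundRobin`, `ConsistentHash` and the `virtualNodes` parameter are hypothetical names, and MurmurHash3 is just a convenient hash from the standard library.

```scala
import java.util.concurrent.atomic.AtomicLong
import scala.collection.immutable.TreeMap
import scala.util.hashing.MurmurHash3

object RoutingSketch {
  // Round robin: hand out routees in turn.
  final class RoundRobin[A](routees: Vector[A]) {
    require(routees.nonEmpty, "need at least one routee")
    private val next = new AtomicLong(0)
    def select(): A = routees((next.getAndIncrement() % routees.size).toInt)
  }

  // Consistent hashing: each routee owns several points on a hash ring;
  // a key goes to the routee owning the first point at or after the key's hash.
  final class ConsistentHash[A](routees: Seq[A], virtualNodes: Int = 10) {
    require(routees.nonEmpty, "need at least one routee")
    private val ring: TreeMap[Int, A] = TreeMap.from(
      for { r <- routees; i <- 0 until virtualNodes }
        yield MurmurHash3.stringHash(s"$r-$i") -> r
    )

    def select(key: String): A = {
      val h = MurmurHash3.stringHash(key)
      // Wrap around to the start of the ring when no point follows the hash.
      ring.rangeFrom(h).headOption.getOrElse(ring.head)._2
    }
  }
}
```

The same key always lands on the same routee, and removing some other routee leaves that key's owner unchanged, which is why consistent hashing suits work that benefits from per-key affinity.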
Let’s build a web crawler!
1. Fetch a page
2. Parse the page to get links
3. Check if max crawl depth has been reached and if yes, finish
4. Go to 1 for all parsed links
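The four steps can be sketched sequentially before spreading them over actors. This is a sketch under stated assumptions, not the demo's code: `crawl` and `FetchLinks` are hypothetical names, and fetching plus parsing (steps 1 and 2) is abstracted into a function so the example runs without network access.

```scala
import scala.annotation.tailrec

object CrawlSketch {
  // Stand-in for steps 1 and 2: maps a URL to the links found on that page.
  type FetchLinks = String => Seq[String]

  // Returns every (url, depth) pair fetched, in breadth-first order.
  def crawl(start: String, maxDepth: Int, fetchLinks: FetchLinks): Vector[(String, Int)] = {
    @tailrec
    def loop(frontier: List[(String, Int)], fetched: Vector[(String, Int)]): Vector[(String, Int)] =
      frontier match {
        case Nil => fetched
        case (url, depth) :: rest =>
          val links = fetchLinks(url) // steps 1 and 2
          // Step 3: stop following links once the depth limit is reached.
          val next = if (depth >= maxDepth) Nil else links.map(_ -> (depth + 1)).toList
          loop(rest ++ next, fetched :+ (url -> depth)) // step 4
      }
    loop(List(start -> 0), Vector.empty)
  }
}
```

With a fake web where page a links to b and c, and b links back to a, a crawl from a with depth limit 2 fetches a twice. Only the depth limit stops the loop, which is the problem the "Detecting cycles" slide addresses.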
Demo time
Going remote
• Everything works using asynchronous message passing which is good for remoting
• Akka-remoting allows working with remote actors just as if they were in the same JVM
• Still need to handle additional concerns such as serialization and potential network failures
Akka Cluster
• Based on the clustering approach of Amazon's Dynamo
• Completely decentralized, uses a gossip protocol for membership and failure detection
• Cluster aware routers can be used for balancing tasks across the cluster
Cluster aware routers
actor {
  deployment {
    /crawlerService/crawlWorkers {
      router = consistent-hashing-group
      nr-of-instances = 100
      routees.paths = ["/user/crawlWorker"]
      cluster {
        enabled = on
        allow-local-routees = on
        use-role = backend
      }
    }
  }
  provider = "akka.cluster.ClusterActorRefProvider"
}
Demo time
Detecting cycles
• Need to detect link cycles to avoid needless downloads
• Distributed shared state?
• Akka CRDTs can help!
CRDTs
• Good performance and scalability, the cost is eventual consistency
• Two main classes: operation based and state based
CmRDTs
• Based on messages
• Requires messages to be delivered and processed exactly once (complex!)
• No need to transfer the whole state
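Why exactly-once matters can be seen with an operation-based counter. The sketch below is plain Scala and does not use Akka's CRDT API. `Add`, `naive` and `deduplicated` are hypothetical names, and tagging each operation with a unique id is just one common way to approximate exactly-once processing.

```scala
object CmRdtSketch {
  // An operation broadcast to every replica; the id identifies this particular operation.
  final case class Add(id: String, n: Int)

  // Naive replica: applies whatever arrives, so a redelivered message is counted twice.
  def naive(delivered: Seq[Add]): Int =
    delivered.foldLeft(0)((total, op) => total + op.n)

  // Deduplicating replica: remembers applied ids and ignores repeats.
  def deduplicated(delivered: Seq[Add]): Int =
    delivered
      .foldLeft((0, Set.empty[String])) { case ((total, seen), op) =>
        if (seen(op.id)) (total, seen) else (total + op.n, seen + op.id)
      }
      ._1
}
```

Because addition is commutative, delivery order does not matter, but a duplicate delivery changes the naive result. The deduplicating replica pays for correctness with extra bookkeeping, which hints at the "complex!" on the slide.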
CvRDTs
• Based on the object’s state
• Need merge function that must be commutative, associative, and idempotent
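A grow-only set is probably the simplest state-based example, and it fits the crawler's need to remember visited URLs. The following is a minimal sketch in plain Scala, not Akka's CRDT implementation; `GSet` here is a hypothetical local class whose merge is set union.

```scala
object CvRdtSketch {
  // Grow-only set: elements can be added but never removed.
  final case class GSet[A](elements: Set[A] = Set.empty[A]) {
    def add(a: A): GSet[A] = GSet(elements + a)
    def contains(a: A): Boolean = elements(a)

    // Merge is set union, which is commutative, associative and idempotent,
    // so replicas converge regardless of merge order or repeated merges.
    def merge(other: GSet[A]): GSet[A] = GSet(elements union other.elements)
  }
}
```

Each crawler node could add the URLs it has visited to its own replica and merge in replicas received from other nodes. A node may briefly miss a URL another node has already seen, which is the eventual-consistency cost mentioned on the CRDTs slide.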
Demo time
References
• Akka documentation: http://akka.io/
• Good presentation on CRDTs: https://vimeo.com/43903960
• On Dynamo clustering: http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
Questions?