Apache Mesos: a simple explanation of basics

Mesos

“There's Just No Getting around It: You're Building a Distributed System” -Mark Cavage

A simple presentation on Mesos by Gladson Manuel

What is Mesos?

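● Mesos is a kernel for distributed systems: it manages the resources of a whole cluster the way an operating system kernel manages those of a single machine.

● In a distributed environment, Mesos runs on every machine.

● It is a scheduler capable of handling multiple resource types, such as CPU and memory.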

Why Mesos?

● Scalability

● Fault tolerance

● Docker support

● Isolation between tasks with Linux containers

● Frameworks can be built in Java/Python/Scala/C/C++

● WebUI

Architecture

Before and after Mesos

Structure of kernel

Representation

Mesos, because static partitioning is harmful

Mesos partitioning

● Mesos is a datacentre kernel, so the resources of a node do not belong to that node alone. They belong to the whole distributed system.

Zookeeper:

● Used to elect a new master if the running master fails.

--It is recommended to keep the number of masters (and ZooKeeper servers) odd. Leader election is based on a strict majority, so if the group is split in two, only the side holding more than half of the nodes can elect a leader. An odd count rules out a tie.

Zookeeper quorum:

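The minimum number of ZooKeeper servers that must agree before the ensemble can act: a strict majority, for example 2 out of 3 or 3 out of 5.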
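The majority rule can be made concrete with a little arithmetic. The sketch below is only an illustration: the function names `quorum_size` and `tolerated_failures` are made up for this example and are not part of ZooKeeper or Mesos.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble of the given size."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers may fail while a quorum can still be formed."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5):
    # Columns: ensemble size, quorum size, failures tolerated
    print(n, quorum_size(n), tolerated_failures(n))
```

Going from 3 to 4 servers raises the quorum from 2 to 3 but still tolerates only one failure, which is why odd sizes are the usual recommendation.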

The Flow

● Slave 1 reports to the master that it has 4 CPUs and 4 GB of memory free. The master then invokes the allocation policy module, which tells it that framework 1 should be offered all available resources.

● The master sends a resource offer describing what is available on slave 1 to framework 1.

● The framework’s scheduler replies to the master with information about two tasks to run on the slave, using <2 CPUs, 1 GB RAM> for the first task, and <1 CPU, 2 GB RAM> for the second task.

● Finally, the master sends the tasks to the slave, which allocates appropriate resources to the framework’s executor, which in turn launches the two tasks (depicted with dotted-line borders in the figure). Because 1 CPU and 1 GB of RAM are still unallocated, the allocation module may now offer them to framework 2.

● In addition, this resource offer process repeats when tasks finish and new resources become free.
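The bookkeeping in the flow above can be checked with a few lines of Python. This is a toy model, not Mesos code: the `Resources` class and the `accept_offer` function are hypothetical names invented for the sketch, and only the numbers come from the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Resources:
    cpus: int
    mem_gb: int

    def minus(self, other: "Resources") -> "Resources":
        """Return what is left after taking `other` out of these resources."""
        remaining = Resources(self.cpus - other.cpus, self.mem_gb - other.mem_gb)
        if remaining.cpus < 0 or remaining.mem_gb < 0:
            raise ValueError("tasks ask for more than was offered")
        return remaining


def accept_offer(offer: Resources, tasks: List[Resources]) -> Resources:
    """Subtract each task's resources from the offer and return the leftover."""
    remaining = offer
    for task in tasks:
        remaining = remaining.minus(task)
    return remaining


# Slave 1 reports 4 CPUs and 4 GB; framework 1 launches two tasks.
offer_to_framework_1 = Resources(cpus=4, mem_gb=4)
tasks = [Resources(cpus=2, mem_gb=1), Resources(cpus=1, mem_gb=2)]
leftover = accept_offer(offer_to_framework_1, tasks)
print(leftover)  # Resources(cpus=1, mem_gb=1)
```

The leftover of 1 CPU and 1 GB is what the allocation module may then offer to framework 2.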

Getting Started

● Download the source, install the dependencies, and build

$ wget http://www.apache.org/dist/mesos/0.20.1/mesos-0.20.1.tar.gz

● Clone from git repository

$ git clone http://git-wip-us.apache.org/repos/asf/mesos.git

Common issues while building

● Java home not set

--Fix: export JAVA_HOME=/usr/java/<jdk_as_mentioned_by mesos> (point it at the JDK directory itself, not at its bin/java binary)

● Maven downloads nothing

--Fix: set the Maven proxy in ~/.m2/settings.xml

● DELAY!!

--Sorry, it's a compilation process. Either upgrade your hardware to a monster configuration or try precompiled packages such as those from Mesosphere (the practical workaround).

Mesosphere installation

● Add the key of the Mesosphere repository

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF

DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')

CODENAME=$(lsb_release -cs)

● Add the Mesosphere repository to the Ubuntu sources

echo "deb http://repos.mesosphere.io/${DISTRO} ${CODENAME} main" | \
sudo tee /etc/apt/sources.list.d/mesosphere.list

sudo apt-get -y update

sudo apt-get install mesosphere

Build is done (finally). Now what?

● Start the Mesos master

./bin/mesos-master.sh --ip=127.0.0.1 --work_dir=/var/lib/mesos

● Start the Mesos slave

./bin/mesos-slave.sh --master=127.0.0.1:5050

● Access the WebUI

http://127.0.0.1:5050

Running example frameworks

● ./src/examples/python/test-framework 127.0.0.1:5050

Frameworks in detail

Components:

Scheduler: Receives resource offers and launches tasks

Executor: Launched by the slave to execute tasks on that slave

Flow of Execution

● The slave notifies the master about its available resources.

● Tasks are scheduled by the scheduler, so the scheduler knows which tasks are waiting to run.

● The scheduler sends each task to a suitable slave based on that slave's available resources.

● The slave checks whether the executor is already running. If not, it launches a new one, then runs the task on the executor.
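The last step, reusing a running executor or starting a new one, can be sketched as follows. The `Slave` class here is a hypothetical toy, not the Mesos slave implementation, and it tracks executors in a plain dictionary.

```python
class Slave:
    """Toy model of a slave that reuses executors across tasks."""

    def __init__(self):
        # executor id -> list of task ids handed to that executor
        self.executors = {}

    def run_task(self, executor_id, task_id):
        """Run a task, launching the executor first if it is not running.

        Returns True when a new executor had to be launched.
        """
        started_new = executor_id not in self.executors
        if started_new:
            self.executors[executor_id] = []
        self.executors[executor_id].append(task_id)
        return started_new


slave = Slave()
first = slave.run_task("executor-1", "task-1")   # launches executor-1
second = slave.run_task("executor-1", "task-2")  # reuses executor-1
print(first, second, slave.executors)
```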

Status Updates

● Non-terminal updates (TASK_RUNNING)

● Terminal updates (TASK_FINISHED). Terminal updates are very important because they are the only way Mesos learns that a task is done. Only then are the resources freed.

● TASK_LOST (slave terminated)
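Why terminal updates matter can be seen in a small model of resource accounting. This is a sketch under assumptions: `ResourceLedger` is a made-up class, and the set of terminal states is limited to the two named on this slide (Mesos also defines others, such as TASK_FAILED and TASK_KILLED).

```python
# Assumption: only the terminal states named on the slide are modelled here.
TERMINAL_STATES = {"TASK_FINISHED", "TASK_LOST"}


class ResourceLedger:
    """Tracks free CPUs and releases them only on a terminal update."""

    def __init__(self, total_cpus):
        self.free_cpus = total_cpus
        self.running = {}  # task id -> CPUs held by that task

    def launch(self, task_id, cpus):
        if cpus > self.free_cpus:
            raise ValueError("not enough free CPUs")
        self.free_cpus -= cpus
        self.running[task_id] = cpus

    def status_update(self, task_id, state):
        # Non-terminal updates such as TASK_RUNNING leave the resources held.
        if state in TERMINAL_STATES and task_id in self.running:
            self.free_cpus += self.running.pop(task_id)


ledger = ResourceLedger(total_cpus=4)
ledger.launch("task-1", cpus=2)
ledger.status_update("task-1", "TASK_RUNNING")
free_while_running = ledger.free_cpus   # still 2: the task holds its CPUs
ledger.status_update("task-1", "TASK_FINISHED")
free_after_finish = ledger.free_cpus    # back to 4
print(free_while_running, free_after_finish)
```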

Activities

● Callbacks

Callbacks are synchronous and single threaded. Because only one callback is invoked at a time, callbacks do not race with one another.

--registered(frameworkId,masterInfo)

--resourceOffers(offers)

--statusUpdate(taskStatus)

● Actions

Actions are asynchronous. For example, sendStatusUpdate() is queued in the driver rather than carried out on the spot.

--launchTasks(offerId,taskInfo,filters)

--killTask(taskId)

--declineOffer(offerId,filters)
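A rough picture of how callbacks and actions fit together is sketched below. It does not use the Mesos bindings: `StubDriver` is a hypothetical stand-in that only queues actions, `DemoScheduler` is an illustrative class, and the offers and statuses are plain dictionaries. Passing the driver as the first argument of each callback is an assumption of this sketch.

```python
from collections import deque


class StubDriver:
    """Hypothetical stand-in for a scheduler driver: actions are only queued."""

    def __init__(self):
        self.queue = deque()

    def launchTasks(self, offer_id, tasks, filters=None):
        self.queue.append(("launchTasks", offer_id, tasks))

    def declineOffer(self, offer_id, filters=None):
        self.queue.append(("declineOffer", offer_id))

    def killTask(self, task_id):
        self.queue.append(("killTask", task_id))


class DemoScheduler:
    """Illustrative scheduler with the three callbacks listed above."""

    def __init__(self):
        self.framework_id = None
        self.states = {}  # task id -> last reported state

    def registered(self, driver, framework_id, master_info):
        self.framework_id = framework_id

    def resourceOffers(self, driver, offers):
        for offer in offers:
            if offer["cpus"] >= 1:
                driver.launchTasks(offer["id"], ["task-for-" + offer["id"]])
            else:
                driver.declineOffer(offer["id"])

    def statusUpdate(self, driver, status):
        self.states[status["task_id"]] = status["state"]


driver = StubDriver()
scheduler = DemoScheduler()
# Callbacks are invoked one after another, never concurrently.
scheduler.registered(driver, "framework-1", {"ip": "127.0.0.1"})
scheduler.resourceOffers(driver, [{"id": "o1", "cpus": 2}, {"id": "o2", "cpus": 0}])
scheduler.statusUpdate(driver, {"task_id": "task-for-o1", "state": "TASK_RUNNING"})
print(list(driver.queue))
```

Each callback returns quickly because the actions it triggers are only appended to the driver's queue.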

What's going on in the code?

● Every request is checked against each offer.

● If an offer satisfies a request, a MesosTask is created and launched by calling driver.launchTasks(offer.getId,tasks,filters)

● The task is executed and then performs an exit().
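The request-to-offer loop can be sketched without any Mesos dependency. The function `match_requests` and the dictionary shapes are assumptions made for this illustration. In a real framework, each matched group would be passed to driver.launchTasks.

```python
def match_requests(requests, offers):
    """Place each request on the first offer that still has room for it.

    Returns (launches, unplaced): launches maps an offer id to the names of
    the requests placed on it, and unplaced lists requests that fit nowhere.
    """
    remaining = {o["id"]: {"cpus": o["cpus"], "mem": o["mem"]} for o in offers}
    launches = {}
    unplaced = []
    for req in requests:
        for offer_id, free in remaining.items():
            if req["cpus"] <= free["cpus"] and req["mem"] <= free["mem"]:
                free["cpus"] -= req["cpus"]
                free["mem"] -= req["mem"]
                launches.setdefault(offer_id, []).append(req["name"])
                break
        else:
            # No offer could satisfy this request.
            unplaced.append(req["name"])
    return launches, unplaced


requests = [
    {"name": "a", "cpus": 2, "mem": 1024},
    {"name": "b", "cpus": 1, "mem": 2048},
    {"name": "c", "cpus": 4, "mem": 512},
]
offers = [
    {"id": "o1", "cpus": 4, "mem": 4096},
    {"id": "o2", "cpus": 1, "mem": 512},
]
launches, unplaced = match_requests(requests, offers)
print(launches, unplaced)
```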

Available frameworks

● Scala: Chronos, Spark

● Java: Hadoop, Jenkins, Storm

● Python: DPark