Introduction to Mesos

Transcript of Introduction to Mesos

  1. [email protected] Introduction to Apache Mesos
  2. Overview
     - Introduction
     - Architecture
     - Security
     - Containers
     - High availability
  3. Introduction
     - First released in 2009 at the University of California, Berkeley
     - A framework for using datacenter resources efficiently
     - Combines CPU, storage, memory, etc. into one big shared virtual resource
     - A distributed systems kernel
     - 10,000 lines of C++ code
  4. Introduction - Definitions
     - Master: the scheduler that offers slave resources to frameworks
     - Slaves: the worker nodes
     - Frameworks: applications running on Mesos
     - Executors: run tasks on the slaves
     - Executor task: a job running on a slave
     - Resource offer: slave resources that the frameworks can use
  5. Architecture
  6. Introduction - Frameworks: https://docs.mesosphere.com/frameworks/
  7. Resource Allocation
  8. Resource Allocation
     1) Slave 1 reports to the master that it has 4 CPUs and 4 GB of memory free. The master then invokes the allocation policy module, which tells it that framework 1 should be offered all available resources.
     2) The master sends a resource offer describing what is available on slave 1 to framework 1.
  9. Resource Allocation
     3) The framework's scheduler replies to the master with information about two tasks to run on the slave, stating the resources to use for the first task and for the second task.
     4) Finally, the master sends the tasks to the slave, which allocates appropriate resources to the framework's executor, which in turn launches the two tasks.
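
The four steps above can be sketched as a small simulation. This is only an illustrative model, not Mesos code: the class names, the `pick_framework` stand-in for the allocation policy module, and the two task sizes (2 CPUs with 1 GB, and 1 CPU with 2 GB) are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Resources:
    cpus: float
    mem_gb: float

    def fits(self, other: "Resources") -> bool:
        """Return True if `other` fits inside these resources."""
        return other.cpus <= self.cpus and other.mem_gb <= self.mem_gb

    def minus(self, other: "Resources") -> "Resources":
        return Resources(self.cpus - other.cpus, self.mem_gb - other.mem_gb)


@dataclass
class Slave:
    name: str
    free: Resources
    running: list = field(default_factory=list)

    def launch(self, task_name: str, needs: Resources) -> None:
        # Step 4: the slave sets resources aside and the task is started.
        self.free = self.free.minus(needs)
        self.running.append(task_name)


class Framework:
    """Hypothetical framework whose scheduler wants to run some tasks."""

    def __init__(self, wanted):
        self.wanted = wanted  # list of (task_name, Resources)

    def resource_offer(self, offer: Resources):
        # Step 3: answer the offer with the tasks that fit into it.
        accepted, remaining = [], offer
        for task_name, needs in self.wanted:
            if remaining.fits(needs):
                accepted.append((task_name, needs))
                remaining = remaining.minus(needs)
        return accepted


class Master:
    def __init__(self, frameworks):
        self.frameworks = frameworks

    def pick_framework(self):
        # Stand-in for the allocation policy module (assumption: first one wins).
        return self.frameworks[0]

    def handle_report(self, slave: Slave):
        framework = self.pick_framework()             # step 1
        tasks = framework.resource_offer(slave.free)  # steps 2 and 3
        for task_name, needs in tasks:                # step 4
            slave.launch(task_name, needs)
        return tasks


slave1 = Slave("slave1", Resources(cpus=4, mem_gb=4))
framework1 = Framework([("task1", Resources(2, 1)), ("task2", Resources(1, 2))])
Master([framework1]).handle_report(slave1)
print(slave1.running, slave1.free)
```

With these assumed task sizes, 1 CPU and 1 GB remain free on the slave, which the master could then offer to another framework.
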
  10. Resource Allocation - DRF
      - Resource offer decisions are made by the resource allocation module in the master
      - In a heterogeneous environment, resource allocation is difficult
      - What is a fair share when user A requires 1 CPU and 4 GB RAM, and user B requires 3 CPUs and 1 GB RAM?
      - Mesos uses Dominant Resource Fairness (DRF)
  11. DRF
      - A modified fair-share algorithm
      - The goal is that each framework receives a fair share of the resource it needs most
      - Dominant resource: the resource the framework demands most
      - Dominant share: the largest fraction of any single resource that is allocated to the framework
  12. DRF - Example
      - Resource offer: 9 CPUs, 18 GB RAM
      - Tasks of user A: 1 CPU, 4 GB RAM (dominant resource: RAM)
      - Tasks of user B: 3 CPUs, 1 GB RAM (dominant resource: CPU)
      - Each framework ends up with a dominant share of 2/3
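
The numbers in this example can be checked with a short sketch of DRF as progressive filling: the next task always goes to the user with the lowest dominant share, until that user's next task no longer fits. The function name and data layout are assumptions for illustration; this is not the allocator code shipped with Mesos.

```python
def drf_allocate(capacity, demands):
    """Hand out tasks one at a time to the user with the lowest dominant share.

    capacity: total amount per resource, e.g. {"cpus": 9, "mem_gb": 18}
    demands:  per-task demand of each user, keyed like `capacity`
    Returns (tasks per user, dominant share per user).
    """
    used = {resource: 0 for resource in capacity}
    tasks = {user: 0 for user in demands}

    def dominant_share(user):
        # Largest fraction of any single resource held by this user.
        return max(
            tasks[user] * demands[user][resource] / capacity[resource]
            for resource in capacity
        )

    while True:
        user = min(demands, key=dominant_share)
        need = demands[user]
        if any(used[r] + need[r] > capacity[r] for r in capacity):
            break  # the next task does not fit: the cluster is full
        for r in capacity:
            used[r] += need[r]
        tasks[user] += 1

    return tasks, {user: dominant_share(user) for user in demands}


tasks, shares = drf_allocate(
    capacity={"cpus": 9, "mem_gb": 18},
    demands={
        "A": {"cpus": 1, "mem_gb": 4},
        "B": {"cpus": 3, "mem_gb": 1},
    },
)
print(tasks)
print(shares)
```

User A ends up with 3 tasks (3 CPUs, 12 GB) and user B with 2 tasks (6 CPUs, 2 GB), so A holds 12/18 of the RAM and B holds 6/9 of the CPUs: a dominant share of 2/3 each.
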
  13. DRF - Example
      - Framework 1: 1 CPU, 4 GB RAM
      - Framework 2: 3 CPUs, 1 GB RAM
      - Buggy tasks can be killed by Mesos
      - A framework can have a guaranteed allocation; none of its tasks within that allocation should be killed
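  14. Resource Allocation - Master Configuration
      | Name                | Default | Example               |
      |---------------------|---------|-----------------------|
      | allocation_interval | 1s      |                       |
      | framework_sorter    | drf     |                       |
      | user_sorter         | drf     |                       |
      | offer_timeout       | -       | 5 minutes             |
      | roles               | -       | marathon,jenkins      |
      | weights             | -       | marathon=2,jenkins=1  |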
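  15. Resource Allocation - Slave Configuration
      | Name         | Default | Example                                                                 |
      |--------------|---------|-------------------------------------------------------------------------|
      | attributes   |         | ssd:true;rack:2                                                         |
      | default_role | *       |                                                                         |
      | resources    |         | cpus(jenkins):1;disk(jenkins):10000;cpus(marathon):3;mem(marathon):2000 |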
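
To make the `resources` example easier to read, the sketch below splits such a string into per-role reservations. It is a simplified, hypothetical parser written for this example (scalar values only), not the parser Mesos uses; entries without a role in parentheses are assigned to the default role `*`.

```python
import re
from collections import defaultdict

# name, optional "(role)", then a scalar value, e.g. "cpus(jenkins):1"
_ITEM = re.compile(r"^(\w+)(?:\(([^)]+)\))?\s*:\s*([0-9.]+)$")


def parse_resources(text, default_role="*"):
    """Group scalar resources from a `resources` string by role."""
    by_role = defaultdict(dict)
    for item in (part.strip() for part in text.split(";")):
        if not item:
            continue  # tolerate a trailing ";"
        match = _ITEM.match(item)
        if match is None:
            raise ValueError(f"unsupported resource item: {item!r}")
        name, role, value = match.groups()
        by_role[role or default_role][name] = float(value)
    return dict(by_role)


example = "cpus(jenkins):1;disk(jenkins):10000;cpus(marathon):3;mem(marathon):2000"
print(parse_resources(example))
```

For the string from the slide, the `jenkins` role gets 1 CPU and 10000 units of disk, and the `marathon` role gets 3 CPUs and 2000 units of memory.
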
  16. Mesos Security
      - Default configuration = no security
      | Node   | Name                | Example       |
      |--------|---------------------|---------------|
      | Master | authenticate_slaves | true          |
      | Master | credentials         | /etc/mesos.pw |
      | Master | authenticators      | crammd5       |
      | Master | authenticate        | true          |
      | Slave  | credential          | /etc/mesos.pw |
  17. Framework Security
      Authorization allows:
      1) Frameworks to (re-)register with authorized roles
      2) Frameworks to launch tasks/executors as authorized users
      3) Authorized principals to shut down frameworks through the /shutdown HTTP endpoint
  18. Security ACLs
      - Subjects: principals, usernames
      - Actions: register_frameworks, run_tasks, shutdown_frameworks
      - Objects: roles, users, framework_principals
      - A set of subjects can perform an action on a set of objects
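
The rule "a set of subjects can perform an action on a set of objects" can be modeled in a few lines. This is a simplified sketch under stated assumptions: the `Acl` class, the `is_allowed` helper, the deny-by-default behavior, and the principal and user names are all hypothetical, and Mesos's own matching rules are more detailed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Acl:
    """One rule: these subjects may perform `action` on these objects."""
    action: str
    subjects: frozenset
    objects: frozenset


def is_allowed(acls, action, subject, obj):
    """Return True if any rule grants `subject` the `action` on `obj`."""
    return any(
        acl.action == action and subject in acl.subjects and obj in acl.objects
        for acl in acls
    )


# Hypothetical rules; names are illustrative only.
acls = [
    Acl("register_frameworks", frozenset({"marathon"}), frozenset({"marathon"})),
    Acl("run_tasks", frozenset({"marathon", "jenkins"}), frozenset({"nobody"})),
]

print(is_allowed(acls, "register_frameworks", "marathon", "marathon"))  # True
print(is_allowed(acls, "run_tasks", "jenkins", "root"))                 # False
```
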
  19. Security ACLs Example
  20. Extract of the Mesos API
      | URL                                           | Function                        |
      |-----------------------------------------------|---------------------------------|
      | master:5050/help                              | REST documentation              |
      | master:5050/metrics/snapshot                  | Metrics of the cluster          |
      | master:5050/master/tasks.json                 | List Mesos tasks                |
      | master:5050/master/redirect                   | 307 redirect to the leading master |
      | master:5050/master/shutdown                   | Shut down a framework           |
      | master:5050/registrar(1)/registry             | Content of the current registry |
      | slave:5051/files/browse.json?path=pathOnSlave | Browse files in the sandbox     |
      | slave:5051/files/read.json?path=stdoutOnSlave | Read stdout from the sandbox    |
      | slave:5051/system/stats.json                  | Local system metrics            |
  21. Resource Isolation
      - Mesos supports Docker containers and Mesos containers
      - Resource isolation with cgroups or POSIX
  22. Mesos and Docker
  23. Mesos HA
  24. Mesos Task States
      | TaskState     | Int | Description                                        |
      |---------------|-----|----------------------------------------------------|
      | TASK_STARTING | 0   |                                                    |
      | TASK_RUNNING  | 1   |                                                    |
      | TASK_FINISHED | 2   | TERMINAL: The task finished successfully           |
      | TASK_FAILED   | 3   | TERMINAL: The task failed to finish successfully   |
      | TASK_KILLED   | 4   | TERMINAL: The task was killed by the executor      |
      | TASK_LOST     | 5   | TERMINAL: The task failed but can be rescheduled   |
      | TASK_STAGING  | 6   | Initial state                                      |
      | TASK_ERROR    | 7   | TERMINAL: The task description contains an error   |
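
A framework that receives status updates usually needs to know whether a state is terminal. The sketch below mirrors the table as a Python enum; the `is_terminal` helper is an assumption for illustration and not part of any Mesos binding.

```python
from enum import IntEnum


class TaskState(IntEnum):
    """Task states and integer values as listed in the table above."""
    TASK_STARTING = 0
    TASK_RUNNING = 1
    TASK_FINISHED = 2
    TASK_FAILED = 3
    TASK_KILLED = 4
    TASK_LOST = 5
    TASK_STAGING = 6
    TASK_ERROR = 7


# States marked TERMINAL in the table.
TERMINAL_STATES = frozenset({
    TaskState.TASK_FINISHED,
    TaskState.TASK_FAILED,
    TaskState.TASK_KILLED,
    TaskState.TASK_LOST,
    TaskState.TASK_ERROR,
})


def is_terminal(state):
    """Accept either a TaskState or its integer value."""
    return TaskState(state) in TERMINAL_STATES


print(TaskState(6).name)                    # TASK_STAGING
print(is_terminal(TaskState.TASK_RUNNING))  # False
print(is_terminal(5))                       # True
```
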
  25. References
      - http://mesos.apache.org/documentation/latest/
      - Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
      - Dominant Resource Fairness: Fair Allocation of Multiple Resource Types
      - playing-traffic-cop-resource-allocation-in-apache-mesos
      - https://mesosphere.com/
  26. Thank you