Managing computational resources with Apache Mesos

Jackson Oliveira @cyber_jso

Software architect with more than 13 years of experience in IT.

Currently working at ilegra as a consultant.

SOA specialist.

Believes in agile and DevOps principles, people transformation, and the ideas behind the open source community.

Likes video games and watching series.

Football fan.

Blog: http://jackson-s-oliveira.blogspot.com.br/
LinkedIn: https://www.linkedin.com/in/jacksonsoliveira
Facebook: https://www.facebook.com/jackson.dossantos.5

Story map of resource usage in traditional data centers: the 90s, a different scenario

Story map of resource usage in traditional data centers: the 90s, changes for datacenter administration

Story map of resource usage in traditional data centers: virtualization, a bit of a game changer

Virtualization brought a different issue: the siloed cluster

Siloed cluster (static partitioning, low granularity): no shared resources

Siloed data center vs. data center managed by Mesos

Resilience

Management complexity (technology, hardware, facts)

Distributed systems added complexity

Releases demand more effort

Orchestration complexity

Heterogeneous architectures in the same datacenter

Failures

Service Discovery

Big Data - distribution is needed

Mesos - The datacenter operating system

Resources as abstractions

Mesos Architecture

Slave node anatomy

Master node responsibilities

Master nodes High Availability

Frameworks responsibility

Frameworks Ecosystem

Long Running jobs

Big Data Processing

Batch Scheduling

Data Storage

Frameworks can coexist on the same datacenter

Resource Offering process
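The offer cycle can be sketched with a toy simulation: the master hands a framework a list of resource offers, and the framework's scheduler accepts the ones that fit its pending tasks and declines the rest. This is a minimal sketch and not the Mesos API; `Offer`, `Task`, and `schedule` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Resources one agent (slave) currently has free. Hypothetical type."""
    agent: str
    cpus: float
    mem_mb: int


@dataclass
class Task:
    """A unit of work the framework wants to run. Hypothetical type."""
    name: str
    cpus: float
    mem_mb: int


def schedule(offers, pending):
    """Toy framework scheduler: use offers that fit pending tasks, decline the rest."""
    launched, declined = [], []
    queue = list(pending)
    for offer in offers:
        cpus, mem = offer.cpus, offer.mem_mb
        placed = False
        for task in list(queue):
            if task.cpus <= cpus and task.mem_mb <= mem:
                # Carve the task's share out of what is left of this offer.
                cpus -= task.cpus
                mem -= task.mem_mb
                launched.append((task.name, offer.agent))
                queue.remove(task)
                placed = True
        if not placed:
            # Declined offers go back to the master for other frameworks.
            declined.append(offer.agent)
    return launched, declined, queue


offers = [Offer("agent-1", 4.0, 8192), Offer("agent-2", 1.0, 1024)]
pending = [Task("web", 2.0, 2048), Task("db", 2.0, 4096), Task("cache", 2.0, 2048)]
launched, declined, waiting = schedule(offers, pending)
print(launched, declined, [t.name for t in waiting])
```

Here the framework decides what to run where, while the master only decides which resources to offer. The unplaced task simply waits for a later offer.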

Resource Isolation

Native isolation using Linux containers

Isolation Mechanisms

CPU shares, disk quotas and bandwidth limits

CPU: core isolation

Disk: enforce maximum space usage limits

Network: limit I/O bandwidth usage, ports

Resource allocation

How to prevent frameworks from starving?

Before workload vs. after workload

Static reservation - Good for Stateful services

Dynamic reservation - Good for Stateful Scheduled tasks
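The effect of a reservation can be sketched as a filter on what each framework role gets offered: unreserved resources are visible to everyone, reserved ones only to their role. This is a toy model loosely based on Mesos roles, not Mesos code; the role names `ads` and `web`, the amounts, and the function `offerable` are assumptions made for illustration.

```python
def offerable(resources, role):
    """Sum what a framework in `role` may be offered:
    unreserved resources ("*") plus resources reserved for its own role."""
    total = {}
    for res_role, name, amount in resources:
        if res_role in ("*", role):
            total[name] = total.get(name, 0) + amount
    return total


# Hypothetical agent: 4 unreserved CPUs, plus 8 CPUs reserved for role "ads".
agent = [
    ("*", "cpus", 4),
    ("*", "mem", 2048),
    ("ads", "cpus", 8),
    ("ads", "mem", 4096),
]

ads_view = offerable(agent, "ads")    # sees unreserved + its reservation
other_view = offerable(agent, "web")  # sees only the unreserved part
print(ads_view, other_view)
```

Because the reserved share is never offered to other roles, the reserving framework cannot be starved of it, at the cost of that share sitting idle when unused.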

Resource preemption

Frameworks: Marathon!

Built to support long running jobs

Specifying constraints

Good to ensure tasks are:

● Running on all slaves
● Running at least one per datacenter
● Running on specific slaves
● Running at least one per cluster group
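As a rough illustration of placement constraints, the sketch below builds a Marathon-style app definition as a Python dict and serializes it to JSON. The `constraints` field with the `UNIQUE` and `GROUP_BY` operators follows Marathon's documented app format; the app id, command, resource sizes, and the `rack_id` agent attribute are assumptions made up for this example.

```python
import json

# Hypothetical app definition; only the shape of "constraints" matters here.
app = {
    "id": "/web",
    "cmd": "python3 -m http.server 8080",
    "cpus": 0.5,
    "mem": 128,
    "instances": 3,
    "constraints": [
        # At most one task per host, so instances spread across slaves.
        ["hostname", "UNIQUE"],
        # Spread tasks evenly across values of an agent attribute
        # ("rack_id" is an assumed attribute name).
        ["rack_id", "GROUP_BY"],
    ],
}

payload = json.dumps(app, indent=2)
print(payload)
```

Pinning tasks to specific slaves would use a different operator on the same field, for example `["rack_id", "CLUSTER", "rack-1"]`, where `rack-1` is again a made-up value.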

Scaling applications out

New tasks can be allocated dynamically
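Scaling out usually means asking Marathon for a higher instance count. The sketch below only builds the HTTP request and does not send it. It assumes Marathon's REST endpoint `PUT /v2/apps/{appId}` with an `instances` field; the host `marathon.example.com`, the app id `/web`, and the helper `scale_request` are hypothetical.

```python
import json
import urllib.request


def scale_request(base_url, app_id, instances):
    """Build (but do not send) a request asking Marathon to run `instances` tasks."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v2/apps{app_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed Marathon address and app id, for illustration only.
req = scale_request("http://marathon.example.com:8080", "/web", 5)
print(req.get_method(), req.full_url, req.data)

# Sending it would be: urllib.request.urlopen(req)
```

Marathon then launches the extra tasks from whatever resource offers it receives, so no host has to be picked by hand.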

Handling failures

Timeout!

Report to the frameworks!

Reschedule the tasks!
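The three steps above (timeout, report, reschedule) can be sketched as a small simulation. This is a toy model and not Mesos code: the heartbeat bookkeeping, the round-robin rescheduling, and the function `handle_failures` are assumptions for illustration. `TASK_LOST` is borrowed from Mesos task state naming.

```python
def handle_failures(last_seen, placements, now, timeout):
    """Toy failure handling.

    last_seen:  agent name -> timestamp of its last heartbeat
    placements: task name  -> agent name it runs on
    Returns (status updates for the framework, new placements).
    """
    # 1. Timeout: agents that stayed silent for too long are considered lost.
    lost = {a for a, t in last_seen.items() if now - t > timeout}
    healthy = sorted(set(last_seen) - lost)

    updates, rescheduled = [], dict(placements)
    for task in sorted(placements):
        if placements[task] in lost:
            # 2. Report: the framework is told the task was lost.
            updates.append((task, "TASK_LOST"))
            # 3. Reschedule: the framework places it on a healthy agent
            #    (simple round-robin, an assumption of this sketch).
            if healthy:
                rescheduled[task] = healthy[(len(updates) - 1) % len(healthy)]
    return updates, rescheduled


last_seen = {"agent-1": 100, "agent-2": 70, "agent-3": 98}
placements = {"web": "agent-2", "db": "agent-1", "cache": "agent-2"}
updates, rescheduled = handle_failures(last_seen, placements, now=105, timeout=10)
print(updates, rescheduled)
```

In this model the master only detects and reports the failure. The decision to reschedule, and where, stays with the framework.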

Considerations

● Automation is heavily needed in this environment
● Troubleshooting can be tricky
● Monolithic systems may not take full advantage of this solution
● Ops: supporting the platform rather than specific products
● Applications that demand specific OS and hardware improvements may not leverage the benefits of this approach

Jackson Oliveira @cyber_jso

Cesar Mesquita @cmesquita00

Thank you!