SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

21
Hajira Jabeen, University of Bonn M1-M18 Review Meeting BDE Architecture

Transcript of SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

Page 1: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

Hajira Jabeen, University of Bonn

M1-M18 Review Meeting

BDE Architecture

Page 2: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

Structure

◎Evolution of BDE architecture

◎User of BDE

◎Working

2

Page 3: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

Platform Description

3

Page 4: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

Technology assessment

◎Lessons learned:o A lot of technologies availableo Big Data space moves fasto High barrier to entry

◎Focus:o Ease of use

❖ Installation, development, deployment, monitoring

o Flexibility❖Keep options open for future

o Reuse effort of the community❖Don't reinvent the wheel

4

Page 5: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

Technical requirements

◎Input: o WP2: General requirements elicitationo WP5: Specific pilot requirements

◎Initial idea: platform profile per Vo Not 1 V that overrules the others per SC⇒ Provide component suggestions per V

5

Page 6: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

Architectural design6

Page 7: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

Architectural design7

Page 8: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

Architectural design8

Page 9: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

User of BDE

The minimum knowledge requirements for theBDE user are:◎Ability to write programs for his particular use

case◎Inter connectivity of components, if he wants

to create a pipeline of different components◎Basics of distributed systems and web-

services◎However, this does not exclude experienced

users or data scientists from using theplatform with ease.

9

Page 10: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

User profiles10

Page 12: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

Developing a component

◎Base Docker imageso Serve as a template for a (Big Data) technologyo Easily extendable custom algorithm/data

◎Published componentso Responsibilities divided b/w partnerso Image repositories on GitHubo Automated builds on DockerHubo Documentation on BDE Wiki

12

Page 13: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

Deploying a Big Data pipeline

◎Pipeline: collection of communicating components to solve a specific problem

◎Described in Docker Composeo Component configurationo Application topology

◎Orchestrator required for initialization processo Components may depend on each othero Components may require manual intervention

13

Page 14: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

Scalability of BDE

◎1000 Nodes◎3000 Containers◎1 Swarm Manager◎Docker swarm V 1.0

14

Page 15: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

BDE vs Hadoop distributions

15

Page 16: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

BDE vs Hadoop distributions

Hortonworks Cloudera MapR Bigtop BDE

File System HDFS HDFS NFS HDFS HDFS

Installation Native Native Native Native lightweight virtualization

Plug & play components (no rigid schema)

no no no no yes

High Availability Single failure recovery (yarn)

Single failure recovery (yarn)

Self healing, mult. failure rec.

Single failure recovery (yarn)

Multiple Failure recovery

Cost Commercial Commercial Commercial Free Free

Scaling Freemium Freemium Freemium Free Free

Addition of custom components

Not easy No No No Yes

Integration testing yes yes yes yes --

Operating systems Linux Linux Linux Linux All

Management tool Ambari Cloudera manager

MapR Control system

- Docker swarm UI+ Custom

16

Page 17: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

BDE vs Hadoop distributions

BDE is:◎Not built on top of existing distributions◎Targets

o Communitieso Research institutions

◎Bridges scientists and open data◎Multi Tier research efforts towards Smart

Data

17

Page 18: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

User interfaces

◎Target: facilitate use of the platform

◎Available interfaces

o Workflow UIs

❖Workflow Builder

❖Workflow Monitor

o Swarm UI

o Integrator UI

18

Page 19: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

BDE Workflow builder19

Page 20: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

BDE Workflow monitor20

Page 21: SC4 Workshop 2: Hajira Jabeen BDE Platform architecture

Swarm UI21