S U M M I T - Amazon Web Services... · Task2/Slide1 Task Dispatcher Spark Driver Spark Worker...

SUMMIT BERLIN


© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Building an Image Analysis auto-scaling hybrid HPC to research cancer
Amador Pahim
Quality Assurance Engineer
Definiens AG


Our vision is to improve patient lives by matching patients to the best therapies based on the most comprehensive digital profiling.


Our proprietary technology finds structures, patterns and textures in the tumor tissue image to better understand the disease biology.


Project Steps
• Project Initiation
• Receiving Inspection
• Region annotations
• Image Analysis
• Data processing
• QC
• Report and Delivery


Definiens Proprietary Software



Internal Grid System

800 cores
35 TB of RAM

Tissue
Blur
Nucleus Detection
Tumor
Stroma
Annotations
Cell Segmentation
Level Aggregation


Requirements
• Multiple data sets
• Different types of tasks
• Hybrid cloud support
• Auto scaling
• Job flow control
• Easy deployment


Architecture
Components shown: API, Web UI, Task Scheduler, Task Dispatcher, and four Executors.


Deployment
Components shown: four Executors, Task Dispatcher, API, Web UI, and Task Scheduler.


First Level of Parallelism: per input data

User Input

Tasks:
• Task1
  • Type: python
  • App: convert_format.py
• Task2
  • Type: spark
  • App: heatmaps_calculation.py
  • Upstream tasks: Task1

Input Data
• Slide1
• Slide2

Resulting Tasks Workflow
Task1/Slide1 → Task2/Slide1
Task1/Slide2 → Task2/Slide2
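The expansion from tasks and input data into a per-slide workflow can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the helper names `expand` and `ready` are assumptions made for the example, not part of the system shown in the talk.

```python
from itertools import product

# Task definitions mirroring the slide; the dict layout is hypothetical.
tasks = {
    "Task1": {"type": "python", "app": "convert_format.py", "upstream": []},
    "Task2": {"type": "spark", "app": "heatmaps_calculation.py",
              "upstream": ["Task1"]},
}
input_data = ["Slide1", "Slide2"]


def expand(tasks, input_data):
    """Return {node: [upstream nodes]} with one node per (task, input) pair."""
    workflow = {}
    for name, item in product(tasks, input_data):
        # A node depends only on upstream tasks for the same input item.
        workflow[f"{name}/{item}"] = [
            f"{up}/{item}" for up in tasks[name]["upstream"]
        ]
    return workflow


def ready(workflow, done):
    """Nodes not yet done whose upstream nodes have all completed."""
    return sorted(
        node for node, ups in workflow.items()
        if node not in done and all(up in done for up in ups)
    )


workflow = expand(tasks, input_data)
```

With nothing finished, `ready(workflow, set())` returns both Task1 nodes, so the two slides can be processed independently. Task2/Slide1 becomes ready as soon as Task1/Slide1 is done, without waiting for Slide2.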


Second Level of Parallelism: multiprocessing

Task Dispatcher
Task1/Slide1
- Python Executor provisioning
- Task parameters
- Parallel execution of App run() methods
- Results report
- Executor teardown
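A minimal sketch of the "parallel execution of App run() methods" step, using Python's standard multiprocessing module. The `ConvertFormat` class, the `execute_task` helper, and the one-process-per-App policy are assumptions for illustration; the slides only state that Apps expose `run()`. The sketch uses the fork start method, so it assumes a POSIX host.

```python
import multiprocessing as mp


class ConvertFormat:
    """Hypothetical App; stands in for something like convert_format.py."""

    def __init__(self, region):
        self.region = region

    def run(self):
        # Placeholder for real work on one region of a slide.
        return f"converted {self.region}"


def _worker(index, app, results):
    # Runs in a child process: call run() and send the result back.
    results.put((index, app.run()))


def execute_task(apps):
    """Run every app.run() in its own process; return results in input order."""
    ctx = mp.get_context("fork")  # assumption: POSIX host
    results = ctx.Queue()
    procs = [
        ctx.Process(target=_worker, args=(i, app, results))
        for i, app in enumerate(apps)
    ]
    for proc in procs:
        proc.start()  # parallel execution of run()
    # Results report: collect one (index, value) pair per process.
    collected = dict(results.get(timeout=30) for _ in procs)
    for proc in procs:
        proc.join()  # teardown: wait for every child to exit
    return [collected[i] for i in range(len(apps))]
```

Calling `execute_task([ConvertFormat("region0"), ConvertFormat("region1")])` returns one result per App. A real executor would probably cap the number of processes with a pool instead of starting one per App.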


Second Level of Parallelism: distributed processing

Task Dispatcher
Task2/Slide1
- Spark Driver provisioning
- Task parameters
- Spark Workers provisioning
- Processing orchestration
- Results report
- Cluster teardown
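The six steps read as a provision, run, teardown lifecycle. The sketch below models only that ordering, with in-memory stand-ins and Python context managers so that teardown also happens when a step fails. It calls no Spark API; `provisioned` and `run_distributed` are hypothetical names, and summing numbers stands in for the real processing.

```python
from contextlib import contextmanager


@contextmanager
def provisioned(kind, log):
    """Hypothetical provisioning stand-in: records setup and teardown."""
    log.append(f"provision {kind}")
    try:
        yield kind
    finally:
        # Runs even if the body raises, so nothing is left running.
        log.append(f"teardown {kind}")


def run_distributed(tiles, n_workers=3):
    """Walk through the lifecycle and return (result, ordered step log)."""
    log = []
    with provisioned("driver", log):
        log.append("task parameters")
        with provisioned("workers", log):
            # Orchestration stand-in: deal tiles round-robin to the workers.
            shares = [tiles[i::n_workers] for i in range(n_workers)]
            partial = [sum(share) for share in shares]
            log.append("processing orchestration")
            total = sum(partial)
            log.append("results report")
    return total, log
```

The returned log lists the steps in the same order as the slide: driver, parameters, workers, orchestration, report, then teardown of workers and driver.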


Data Access
• DSS - Data Streaming Service
• Serves tiles from multiple file formats
• Standard data access service for internal applications
• Can be executed as a container
• Supports multiple storage backends (S3 included)


Data Access (diagram): Executor and DSS


User Interface – First Version

from jpc import InputData, Job, PythonTask

# A single Python task that runs Heatmaps.py from the project repository.
task1 = PythonTask(name='heatmaps',
                   app='Heatmaps.py',
                   app_args=['-p', '-r'],
                   repository_url='[email protected]:projects/12312.git')

# Input data: one DSS URI per slide.
input_data = InputData([['dss://dss.definiens.com/projects/12312/slide1'],
                        ['dss://dss.definiens.com/projects/12312/slide2']])

# A job bundles the tasks with the input data they run on.
job = Job(name='heatmaps_generation',
          tasks=[task1],
          input_data=input_data)

job_status = job.submit()


Next Steps
• Data Access Framework
• Final User Interface: Portal Integration
• More executors to come:
  • Amazon SageMaker
  • Amazon EMR
  • AWS Lambda experiment
• Projects billing


Thank you!


Amador [email protected]
