D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System...

25
This document is issued within the frame and for the purpose of the CANDELA project. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 776193. The opinions expressed and arguments employed herein do not necessarily reflect the official views of the European Commission. The dissemination of this document reflects only the author’s view and the European Commission is not responsible for any use that may be made of the information it contains. This document and its content are the property of the CANDELA Consortium. The content of all or parts of this document can be used and distributed provided that the CANDELA project and the document are properly referenced. Each CANDELA Partner may use this document in conformity with the CANDELA Consortium Grant Agreement provisions. (*) Dissemination level: PU: Public, fully open, e.g. web; CO: Confidential, restricted under conditions set out in Model Grant Agreement; CI: Classified, Int = Internal Working Document, information as referred to in Commission Decision 2001/844/EC. Copernicus Access Platform Intermediate Layers Small Scale Demonstrator D3.1 System architecture design and Operational scenarios document v1 Keywords: Cloud, Platform as a Service, data processing, earth observation Document Identification Status Final Due Date 31/10/2018 Version 1.0 Submission Date 29/10/2018 Related WP WP3 Document Reference D3.1 Related Deliverable(s) N/A Dissemination Level (*) PU Lead Participant Atos France Lead Author Fabien CASTEL (Atos France) Contributors Anne-Sophie TONNEAU (Atos France) Reviewers Michelle Aubrun (TAS FR) Ref. Ares(2018)5539748 - 29/10/2018

Transcript of D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System...

Page 1: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

This document is issued within the frame and for the purpose of the CANDELA project. This project has received funding from the European

Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 776193. The opinions expressed and arguments

employed herein do not necessarily reflect the official views of the European Commission.

The dissemination of this document reflects only the author’s view and the European Commission is not responsible for any use that may

be made of the information it contains. This document and its content are the property of the CANDELA Consortium. The content of all or

parts of this document can be used and distributed provided that the CANDELA project and the document are properly referenced.

Each CANDELA Partner may use this document in conformity with the CANDELA Consortium Grant Agreement provisions.

(*) Dissemination level: PU: Public, fully open, e.g. web; CO: Confidential, restricted under conditions set out in Model Grant Agreement; CI:

Classified, Int = Internal Working Document, information as referred to in Commission Decision 2001/844/EC.

Copernicus Access Platform Intermediate Layers Small Scale Demonstrator

D3.1 System architecture design and Operational

scenarios document v1

Keywords:

Cloud, Platform as a Service, data processing, earth observation

Document Identification

Status Final Due Date 31/10/2018

Version 1.0 Submission Date 29/10/2018

Related WP WP3 Document Reference D3.1

Related Deliverable(s)

N/A Dissemination Level (*) PU

Lead Participant Atos France Lead Author Fabien CASTEL (Atos France)

Contributors Anne-Sophie TONNEAU (Atos France)

Reviewers Michelle Aubrun (TAS FR)

Ref. Ares(2018)5539748 - 29/10/2018

Page 2: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 2 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

Document Information

List of Contributors

Name Partner

Fabien CASTEL Atos France

Anne-Sophie TONNEAU Atos France

Document History

Version Date Change editors Changes

0.1 31/08/2018 Anne-Sophie TONNEAU (ATOS FR)

Initial Table of Content

0.2 29/09/2018 Fabien CASTEL (ATOS FR) Reworked table of content

Recap of the topic to be addressed in each chapter

0.3 17/10/2018 Fabien CASTEL (ATOS FR) Version to be reviewed by partners

0.4 26/10/2018 Fabien CASTEL (ATOS FR) Version for quality review

0.5 29/10/2018 Juan Alonso (ATOS ES) Quality Assessment

1.0 29/10/2018 Jose Lorenzo (ATOS ES) Coordinator approval for submission

Quality Control

Role Who (Partner short name) Approval Date

Deliverable leader Fabien CASTEL (Atos FR) 26/10/2018

Quality manager Juan Alonso (ATOS ES) 29/10/2018

Project Coordinator Jose Lorenzo (ATOS ES) 29/10/2018

Page 3: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 3 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

Table of Contents

Document Information ............................................................................................................................ 2

Table of Contents .................................................................................................................................... 3

List of Tables ............................................................................................................................................ 4

List of Figures ........................................................................................................................................... 5

List of Acronyms ...................................................................................................................................... 6

Executive Summary ................................................................................................................................. 7

1 Platform architecture overview ....................................................................................................... 8

1.1 Architectural overview of the infrastructure .......................................................................... 8

1.2 Architectural overview of the PaaS ......................................................................................... 9

1.2.1 Platform as a Service architecture ...................................................................................... 9

1.2.2 Cloud Elastic Resources ....................................................................................................... 9

1.2.3 Application Management .................................................................................................... 9

2 COTS components .......................................................................................................................... 10

2.1 GeoServer .............................................................................................................................. 10

2.2 User Management with Keycloak ......................................................................................... 10

3 Candela components ..................................................................................................................... 11

3.1 Data access components ....................................................................................................... 11

3.1.1 CreoDIAS Data Connector ................................................................................................. 11

3.1.2 Internal data management ................................................................................................ 12

3.2 User access components ....................................................................................................... 13

3.2.1 Notebook Environment ..................................................................................................... 13

3.2.2 JupyterHub ........................................................................................................................ 14

3.3 Processing pipeline ................................................................................................................ 15

4 Operational scenarios .................................................................................................................... 17

4.1 Change detection processing chain ....................................................................................... 17

4.1.1 Functional view.................................................................................................................. 17

4.1.2 Technical view ................................................................................................................... 17

5 Conclusions .................................................................................................................................... 19

References ............................................................................................................................................. 20

Annexes ................................................................................................................................................. 21

Annex I - Change detection service description ................................................................................ 21

Annex II - Change detection pipeline execution request ................................................................... 23

Page 4: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 4 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

List of Tables

Table 1: Kubernetes Cluster Machines .................................................................................................... 8

Table 2: Jupyter main kernels ................................................................................................................ 14

Page 5: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 5 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

List of Figures

Figure 1: CANDELA development platform ______________________________________________ 8

Figure 2: Metadata extract of a Sentinel-1 product ______________________________________ 12

Figure 3: Collections available through s3fs mount point __________________________________ 12

Figure 4: Jupyter Notebook web interface _____________________________________________ 13

Figure 5: Notebook environment architecture __________________________________________ 14

Figure 6: JupyterHub global architecture ______________________________________________ 15

Figure 7: CANDELA platform processing pipeline ________________________________________ 16

Figure 8: TAS-FR Change Detection Pipeline ____________________________________________ 17

Page 6: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 6 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

List of Acronyms

Abbreviation / acronym

Description

API Application Programming Interface

COTS Commercial Off-The-Shelf

CSW Catalogue Service for the Web, OGC standard for data catalogue requesting

D3.1 Deliverable number y belonging to WP3

EC European Commission

EO Earth Observation

GIS Geographic Information System

HTML HyperText Markup Langage

HTTP HyperText Transfer Protocol

JSON JavaScript Object Notation

LDAP Lightweight Directory Access Protocol

OGC Open Geospatial Consortium

PaaS Platform as a service

REST Representational State Transfer

SAR Synthetic Aperture Radar

SMTP Simple Mail Transfer Protocol

SSH Secure Shell

URL Uniform Resource Locator

VM Virtual Machine

WCS Web Coverage Service

WFS Web Feature Service, OGC standard for georeferenced object

WMS Web Map Service, OGC standard for georeferenced coverage

WP Work Package

WPS Web Processing Service, OGC standard for geospatial processing services

WS Web Service

XML Extensible Markup Language

Page 7: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 7 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

Executive Summary

This deliverable describes the first version of the CANDELA platform. It presents all the foreseen components and their interactions. It is the result of the first development phase iteration.

At this step of the overall project, and according to the planned development process, inputs from other work packages (mainly WP2) are minimal as functional requirements are still being elaborated. Thus, the architecture presented here is a basis, with high level features that are required whatever the functional choices that will be taken (PaaS setup, user management) or specific features that are strongly foreseen to be required (processing pipeline).

The document presents an overview of the platform architecture. The main concepts of Platform as a Service, resources management and application integration on a cloud environment are exposed.

It then provides a comprehensive list of the components deployed on the platform. These components fall into the COTS (reused components) or custom (component involving specific development) categories.

Finally, a description of the first operational scenario is described, based on the image processing chain provided by TAS-FR. A functional view is presented followed by the description of the technical work required to integrate this chain to the CANDELA platform.

Page 8: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 8 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

1 Platform architecture overview

1.1 Architectural overview of the infrastructure

The CANDELA development platform is organized as follow:

Figure 1: CANDELA development platform

The bounce server is the only access point from the outside. An SSH connection has to be initiated to this machine. This step requires transmitting an SSH public key to the platform administrator so that the SSH connection could be established.

Once connected to the bounce server, all the other machines can be accessed through SSH without any more constraints. The computing cluster is composed of a Kubernetes Master and 2 Kubernetes Workers. The following table recap the characteristics of the different machines.

Table 1: Kubernetes Cluster Machines

Type Machine Name CPU RAM Local Disk Space

Master candela-0001 4 8 GB 100 GB

Worker candela-0002 8 16 GB 100 GB

Worker candela-0003 8 16 GB 100 GB

Page 9: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 9 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

1.2 Architectural overview of the PaaS

1.2.1 Platform as a Service architecture

The cloud environment of the CANDELA platform is hosted on the CreoDIAS cloud facility, provided by CloudFerro. CreoDIAS is part of the ESA DIAS program. Static VMs and storage units are provisioned by a platform administrator.

The platform components are independent application running on this infrastructure, integrated and communicating together to provide a set of features. Running all this components on same VM would be challenging (incompatibilities on the dependencies from all the components, high complexity induced by running several applications on the same system, network issues…). Using one VM for each component would require the provisioning of a lot of VMs which would increase the overall hosting cost. The solution that is used for the platform is to use an intermediate PaaS – Platform as a Service – layer to abstract the infrastructure layer for the applications. The hosted applications are managed as containers. The VMs are gathered as nodes into a cluster. Containers are executed in whatever node of the cluster.

Docker is used as the basis containerization technology. Kubernetes is used as the PaaS technology to setup the cluster and ensure the container management.

1.2.2 Cloud Elastic Resources

The need in terms of computation resources is very variable depending on the algorithms executed on the platform by the users. To adapt to this variable consumption, an elastic provisioning of the cloud computation resources has to be implemented.

The elasticity management on CANDELA relies on OpenStack [1] APIs provided by the CreoDIAS infrastructure. A specific configuration has to be done on the CANDELA Kubernetes instance to connect with these APIs, and a logic has to be implemented to describe how the cluster should increase or decrease according to the activity of the hosted applications.

This feature is not implemented on the first version of the platform but will be a major task on the future versions.

1.2.3 Application Management

All the components constituting the platform are packaged into Docker containers. This packaging is done by writing Docker specific configuration file (i.e. Dockerfile).

Containers are executed by Kubernetes into the cluster. The way each container is handled by Kubernetes is configured into Kubernetes specific configuration files. In this file all the following features can be configured:

• Network (all containers can be executed on a single virtual network or several sub-networks can be defined if some isolation is required)

• Redundancy (to automatically execute several instances of an application an manage the load-balancing between each instance)

• failure handling (to automatically restart an application if it fails inside the container)

• port mapping (to expose some ports to the outside)

• shared volume (to share folders between a container and the host system and/or between containers)

Page 10: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 10 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

2 COTS components

2.1 GeoServer

GeoServer [3] is an open-source server written in Java that allows users sharing, processing and editing geospatial data. Designed for interoperability, it publishes data from any major spatial data source using open standards. GeoServer functions are the reference implementation of the Open Geospatial Consortium Web Feature Service standard, and also implements the Web Map Service, Web Coverage Service and Web Processing Service specifications.

GeoServer can be considered more as a processing tool than a catalogue. Its main use is to distribute georeferenced data by implementing the OGC standards (WFS, WMS, WCS…). However, it implements the OGC WPS standard, as such can be used as a processing tool catalogue.

2.2 User Management with Keycloak

User management in the CANDELA platform is ensured by reusing an off-the-shelf component deployed on the cluster, called Keycloak [2]. Keycloak is an open source Identity and Access Management solution providing a simple Single-Sign-On solution. Users authenticate with Keycloak rather than doing so on each individual CANDELA component. This means that the components don't have to deal with login forms, authenticating users, and storing users. Once logged-in to Keycloak, users don't have to login again to access a different application.

Keycloak can connect to various sources of authorized users (LDAP, Active Directory, RDBMS "Social login"...). In the CANDELA configuration, it is configured to store users in a dedicated PostgreSQL database pod saving its content on the cluster filesystem. New users are added by a Keycloak administrator through the management console or through a dedicated API. User accounts are initialized with a random password that the user will change at its first connection. The admin has no access to user passwords by design. The Keycloak server can be connected to an SMTP server to automatize the registration and password reset processes.

CANDELA components are connected to Keycloak through the standard OAuth 2 protocol. OAuth 2 clients such as the Jupyter Hub component are registered by an administrator over the Keycloak management interface or API, and are given a client ID and secret, corresponding to an origin URL and a call-back URL.

When trying to authenticate a user, the client authenticates itself to Keycloak with its Id and Secret, the origin URL and the call-back address given by the client is verified against Keycloak registered information. When client identity is ensured, the user is redirected to a login page. When user identity is ensured Keycloak calls back the client with an authorization and refresh token. Authorization token can be used by the client to get additional information about the user. Refresh token is used to generate a new token for the user without it having to go through the login process.

Page 11: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 11 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

3 Candela components

3.1 Data access components

3.1.1 CreoDIAS Data Connector

CreoDIAS provides access to metadata through a Data Finder API [5]. This is an HTTP WS API that is accessible for free and anonymously.

The collections available are:

• Sentinel1

• Sentinel2

• Landsat8

• Landsat7

• Landsat5

• Envisat

An XML description is available for each collection, for instance for Sentinel1, the description of the request to perform is:

https://finder.creodias.eu/resto/api/collections/Sentinel1/describe.xml

Requests are performed following the OpenSearch standard. For instance, to get the Sentinel1 products covering Poland:

https://finder.creodias.eu/resto/api/collections/Sentinel1/search.json?_pretty=true&maxRecords=1000&q=Poland

This request returns a JSON description of every products covering Poland. As it can be seen in Figure 2, the field “productIdentifier” contains the path where the product files are located on the platform. See next section to know how to access product files.

Page 12: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 12 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

Figure 2: Metadata extract of a Sentinel-1 product

For more information about OpenSearch, see [6].

3.1.2 Internal data management

An s3fs volume is mounted on the platform, making directly accessible the EO data provided by CreoDIAS.

The mount point is /s3fs, and from this folder the following collections are available:

Figure 3: Collections available through s3fs mount point

For more information about how to access EO data from CreoDIAS, see [7]

Page 13: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 13 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

3.2 User access components

One feature of CANDELA is to enable users to execute processing tools on a remote platform. In general, processing tools are black boxes: pieces of code packaged and integrated on the platform that cannot be explored. The CANDELA development environment goal is to give the users the capacity to work at a lower level. It enables users to develop their own code and execute it on the platform with full access to the platform computation resources and data repository.

3.2.1 Notebook Environment

Jupyter Notebook [4] is the technology used to allow user to interact with CANDELA’s server components.

Figure 4: Jupyter Notebook web interface

Jupyter enables users to manipulate notebook documents. A document is an HTML file containing code, textual information and additional metadata (execution language, version…). From the user point of view, a document is a sequence of cells. A cell can contain code or rich text information encoded with markdown syntax. Each cell can be executed independently. An execution context is managed by the server to keep in memory all variables defined in a previously executed cell.

Page 14: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 14 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

Jupyter is a remote development environment available for the users from their browser that allows writing code and executing it on a server machine.

Figure 5: Notebook environment architecture

The notebook documents are stored on the server machine. They can be downloaded by the user to be kept locally or shared with other users.

There are several obvious benefits of such an approach. First, users do not need to install anything in their local machine. They can run Python scripts without any local Python installation for instance. Moreover, programmes executed in the Notebook environment can access data located on the Notebook server. There is no need to download locally all the data. When dealing with big amounts of data, it is possible to execute code that filters, selects or reduces them, and transfers only small amounts to the user machine for displaying.

The core Notebook server is responsible for routing the client request and managing the Notebook documents. The actual execution of the code is performed by components installed in addition to the core server. The basic Jupyter installation provides a Python 2.7 kernel. To execute code from any other language, additional kernels should be installed and configured. When installing a language specific kernel, it is possible to install additional libraries so that users can use them natively in their Notebook.

Targeted languages and additional libraries are listed in the following table:

Table 2: Jupyter main kernels

Kernel Name Coding Language

Language Version

Additional libraries

Python 3 Python 3.6.5 rasterio, netCDF4, NumPy, SciPy, pandas

3.2.2 JupyterHub

The JupyterHub project is a Jupyter sub-project adding a multi-user layer to the Jupyter core server with an authentication system that allows using it efficiently in a business environment.

JupyterHub is composed of:

Page 15: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 15 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

• A Hub component, managing the multi-users connections, with their specific access rights and password.

• A frontend proxy, routing request to the Hub and creating Jupyter server instances specific for each user. The proxy is then in charge of transferring the incoming requests to the corresponding Jupyter server.

• Several standard Jupyter servers.

Figure 6: JupyterHub global architecture

The action of creating a Jupyter server specific for a user is called spawning. In the Hub, a specific module, the Spawner, handles this action. Several implementations exist for this module according to the strategy chosen to spawn the Notebook servers: local or remote spawning based on cluster management software (Torque, PBS...), on Docker, on Kubernetes… The spawner implementation installed on the CANDELA platform is based on Kubernetes.

A specific module handles the user management on the hub. As for the spawning, several implementations of this module exist: based on the UNIX users, on the OAuth protocol or on LDAP. LDAP is the adopted solution for the platform, as it is the technology used for the Identity and Authorization Manager.

A specific Jupyter instance is created every time a user launches the Notebook feature in the platform. Thus, by default the workspace of this instance is empty. Different solutions exist to enable a persistent workspace. On the platform, instance spawning is based on Kubernetes, and thus by extension on Docker containerization. Each newly created instance is a new Docker container running on the Kubernetes platform. Kubernetes/Docker allows configuring volumes, i.e. folders shared between containers, or between a container and the host system. The strategy to make the user workspace persistent on JupyterHub is to create a volume mapping the JupyterHub workspace folder with the user workspace folder currently used on the platform.

3.3 Processing pipeline

Algorithms are delivered by service providers as Docker containers, in addition with a JSON description of inputs/outputs of the service. We talk about processing services. A GeoServer and Kubernetes API are used to parameterize, launch and monitor executed services. Each service Docker is manually built, and the Docker image is created and ready to be launched through the Kubernetes API.

Page 16: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 16 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

GeoServer provides the ability to discover, execute and manage processing services through an OGC Web Processing Standard (WPS) interface. Users can then interact with processing services with WPS requests to the GeoServer instance. The WPS standard also allows to chain processing services, creating a processing pipeline of services. GeoServer will handle the chaining of the outputs of one service as inputs of other services and will run each service of the pipeline step by step.

Figure 7: CANDELA platform processing pipeline

For each processing service, based on the JSON description, a corresponding processing script describing the inputs and outputs of the processing service is manually deployed in GeoServer. The GeoServer exposes the service, thanks to its WPS plugin. The service is then retrieved when performing a “GetCapabilities” request to GeoServer.

When GeoServer receives an "ExecuteProcess" WPS request, the script is parameterized with the WPS request inputs and GeoServer launches the Execution script corresponding to the processing service. The script sets the environment of the processing service, sends a scheduling order to the Kubernetes API with the corresponding Docker image, and monitors the execution of the resulting container.

During the execution of the service, a repository is created with the id of the execution and contains resulting files processed by the service and log files. User can interact with the file system and gather its results.

Page 17: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 17 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

4 Operational scenarios

4.1 Change detection processing chain

4.1.1 Functional view

TAS proposes to develop an application of change detection on Earth Observation satellite images time series. Using the fine temporal (few days) and spatial (5 - 10 meters per pixel) resolution of Sentinel-1 and Sentinel-2, a lot of use cases can be carried out in many diverse sectors (e.g. Security, Emergency, Marine and Land surveillance). Moreover, this application will allow not only to make faster decisions, but also to analyse trends and to predict future changes, thus simplifying the work of many operators and generate economic and social value assessment.

Figure 8: TAS-FR Change Detection Pipeline

Due to the difference between optical and SAR sensors, change detection chains proposed are different for both sensors. Each chain has the purpose of being scalable and split into micro services for easy deployment.

The change detection tool for both optical and SAR images time series are basically composed of three main modules: change detection map generation, change index calculator and classification.

• The first module provides the geo-referenced change detection maps between consecutive images in a time series.

• The second module transforms the raw change-detection maps from the first module into a more interpretable change index image

• The third module infers groups of change families through time by making clustering analysis on the change index images

4.1.2 Technical view

The ChangeDetection, ChangeIndex and ChangeClustering processes are integrated to Candela platform as GeoServer services, as described in section 3.3 Processing pipeline.

To get the description of these services or run the pipeline of services described in Figure 8: TAS-FR Change Detection Pipeline, a WPS API is exposed by GeoServer.

Page 18: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 18 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

For instance, the Annex I - Change detection service description shows that expected inputs are:

• IMAGES: a folder containing GeoTIFF Sentinel 2 images

• USERNAME: the name of the user

• OUTPUT_FILENAME (optional): the name of the resulting file of the ChangeDetection process

• LOG_FILE (optional): the name of the log file

As described in Figure 8: TAS-FR Change Detection Pipeline, the pipeline consists of chaining the ChangeDetection, ChangeIndex and ChangeClustering services. That can be performed with a WPS request, as described in the example of request in Annex II - Change detection pipeline execution request. Performing a POST request to Candela GeoServer will run simultaneously the 3 services on the platform.

Page 19: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 19 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

5 Conclusions

This document presented the first version of the CANDELA platform. A set of initial features have been described that allows covering a first scenario. That scenario while minimal and embryonic is operational at this point.

The very next step will be to generalise the work done for this scenario to integrate the other expected scenarios provided by the other WP1 and WP2 partners.

Another work to carry on the months to come will be to improve the integration between the development environment (the Jupyter user interface) and the WPS engine (processing pipeline). An easy to use API should be provided to easily instantiate, chain and trigger WPS executions.

Finally, the most important topic in the next step of the project will be to progress on the scalability features that the platform provides. In particular, the management of elastic cloud resources will be the main solution to investigate.

Page 20: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 20 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

References

[1] OpenStack official website, https://docs.openstack.org, retrieved 2018-10-12

[2] Keycloak official web page, https://www.keycloak.org/, retrieved 2018-10-15

[3] GeoServer, http://geoserver.org, retrieved 2018-06-14

[4] Jupyter Notebook, http://jupyter.org, retrieved 2018-06-12

[5] CreoDIAS Data Finder API https://creodias.eu/eo-data-finder-api-manual , retrieved 2018-10-17

[6] OpenSearch standard http://www.opensearch.org/Home , retrieved 2018-10-17

[7] How to access earth observation data https://creodias.eu/-/how-do-i-access-earth-observation-eo-data- , retrieved 2018-10-17

Page 21: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 21 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

Annexes

Annex I - Change detection service description

<?xml version="1.0" encoding="UTF-8"?>

<wps:ProcessDescriptions

xmlns:xs=http://www.w3.org/2001/XMLSchema

xmlns:ows=http://www.opengis.net/ows/1.1

xmlns:wps=http://www.opengis.net/wps/1.0.0

xmlns:xlink=http://www.w3.org/1999/xlink

xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance

xml:lang="en" service="WPS" version="1.0.0"

xsi:schemaLocation="http://www.opengis.net/wps/1.0.0

http://schemas.opengis.net/wps/1.0.0/wpsAll.xsd">

<ProcessDescription wps:processVersion="1.0.0" statusSupported="true"

storeSupported="true">

<ows:Identifier>candela:ChangeDetectionProcessing</ows:Identifier>

<ows:Title>ChangeDetection</ows:Title>

<ows:Abstract>&lt;p&gt;Computes unsupervised change detection on a time-series of

sentinel 2 images.&lt;/p&gt;</ows:Abstract>

<DataInputs>

<Input maxOccurs="1" minOccurs="1">

<ows:Identifier>IMAGES</ows:Identifier>

<ows:Title>IMAGES</ows:Title>

<ows:Abstract>Path to the sentinel 2 time-series folder, each image should be

in GeoTiff format</ows:Abstract>

<LiteralData>

<ows:AnyValue/>

</LiteralData>

</Input>

<Input maxOccurs="1" minOccurs="1">

<ows:Identifier>USERNAME</ows:Identifier>

<ows:Title>USERNAME</ows:Title>

<ows:Abstract>User name</ows:Abstract>

<LiteralData>

<ows:AnyValue/>

</LiteralData>

</Input>

<Input maxOccurs="1" minOccurs="0">

<ows:Identifier>OUTPUT_FILENAME</ows:Identifier>

<ows:Title>OUTPUT_FILENAME</ows:Title>

<ows:Abstract>User defined OUTPUT_FILENAME filename. If empty, default file

basename is detected_changes.tif</ows:Abstract>

<LiteralData>

<ows:AnyValue/>

<DefaultValue/>

</LiteralData>

</Input>

<Input maxOccurs="1" minOccurs="0">

<ows:Identifier>LOG_FILE</ows:Identifier>

<ows:Title>LOG_FILE</ows:Title>

<ows:Abstract>name of the output log file. Must have .log extension. If empty,

default file basename is debug.log</ows:Abstract>

Page 22: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 22 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

<LiteralData>

<ows:AnyValue/>

<DefaultValue/>

</LiteralData>

</Input>

</DataInputs>

<ProcessOutputs>

<Output>

<ows:Identifier>filePath</ows:Identifier>

<ows:Title>filePath</ows:Title>

<LiteralOutput/>

</Output>

<Output>

<ows:Identifier>logFiles</ows:Identifier>

<ows:Title>logFiles</ows:Title>

<LiteralOutput/>

</Output>

</ProcessOutputs>

</ProcessDescription>

</wps:ProcessDescriptions>

Page 23: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 23 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

Annex II - Change detection pipeline execution request

<?xml version="1.0" encoding="UTF-8"?>

<wps:Execute

xmlns:wps=http://www.opengis.net/wps/1.0.0

xmlns=http://www.opengis.net/wps/1.0.0

xmlns:gml=http://www.opengis.net/gml

xmlns:ogc=http://www.opengis.net/ogc

xmlns:ows=http://www.opengis.net/ows/1.1

xmlns:wcs=http://www.opengis.net/wcs/1.1.1

xmlns:wfs=http://www.opengis.net/wfs

xmlns:xlink=http://www.w3.org/1999/xlink

xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance

version="1.0.0" service="WPS"

xsi:schemaLocation="http://www.opengis.net/wps/1.0.0

http://schemas.opengis.net/wps/1.0.0/wpsAll.xsd">

<ows:Identifier>candela:ChangeClusteringProcessing</ows:Identifier>

<wps:DataInputs>

<wps:Input>

<ows:Identifier>IMAGE</ows:Identifier>

<wps:Reference method="POST" mimeType="text/xml"

xlink:href="http://geoserver/wps">

<wps:Body>

<wps:Execute service="WPS" version="1.0.0">

<ows:Identifier>candela:ChangeIndexProcessing</ows:Identifier>

<wps:DataInputs>

<wps:Input>

<ows:Identifier>IMAGE</ows:Identifier>

<wps:Reference method="POST" mimeType="text/xml"

xlink:href="http://geoserver/wps">

<wps:Body>

<wps:Execute service="WPS" version="1.0.0">

<ows:Identifier>candela:ChangeDetectionProcessing</ows:Identifier>

<wps:DataInputs>

<wps:Input>

<ows:Identifier>IMAGES</ows:Identifier>

<wps:Data>

<wps:LiteralData>/data/exchange/candela/common_data/test-

files/Images/Harbour/TimeSeries</wps:LiteralData>

</wps:Data>

</wps:Input>

<wps:Input>

<ows:Identifier>USERNAME</ows:Identifier>

<wps:Data>

<wps:LiteralData>a.tonneau</wps:LiteralData>

</wps:Data>

</wps:Input>

<wps:Input>

<ows:Identifier>OUTPUT_FILENAME</ows:Identifier>

<wps:Data>

Page 24: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 24 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

<wps:LiteralData>changedetection20181009_result.tif</wps:LiteralData>

</wps:Data>

</wps:Input>

<wps:Input>

<ows:Identifier>LOG_FILE</ows:Identifier>

<wps:Data>

<wps:LiteralData>debug.log</wps:LiteralData>

</wps:Data>

</wps:Input>

</wps:DataInputs>

<wps:ResponseForm>

<wps:RawDataOutput>

<ows:Identifier>filePath</ows:Identifier>

</wps:RawDataOutput>

</wps:ResponseForm>

</wps:Execute>

</wps:Body>

</wps:Reference>

</wps:Input>

<wps:Input>

<ows:Identifier>USERNAME</ows:Identifier>

<wps:Data>

<wps:LiteralData>a.tonneau</wps:LiteralData>

</wps:Data>

</wps:Input>

<wps:Input>

<ows:Identifier>OUTPUT_FILENAME</ows:Identifier>

<wps:Data>

<wps:LiteralData>changeindex20181009_result.tif</wps:LiteralData>

</wps:Data>

</wps:Input>

<wps:Input>

<ows:Identifier>LOG_FILE</ows:Identifier>

<wps:Data>

<wps:LiteralData>debug.log</wps:LiteralData>

</wps:Data>

</wps:Input>

</wps:DataInputs>

<wps:ResponseForm>

<wps:RawDataOutput>

<ows:Identifier>filePath</ows:Identifier>

</wps:RawDataOutput>

</wps:ResponseForm>

</wps:Execute>

</wps:Body>

</wps:Reference>

</wps:Input>

<wps:Input>

<ows:Identifier>USERNAME</ows:Identifier>

<wps:Data>

<wps:LiteralData>a.tonneau</wps:LiteralData>

Page 25: D3.1 System architecture design and Operational scenarios ... · Document name: D3.1 System architecture design and Operational scenarios document v1 Page: 2 of 25 Reference: D3.1

Document name: D3.1 System architecture design and Operational scenarios document v1

Page: 25 of 25

Reference: D3.1 Dissemination: PU Version: 1.0 Status: Final

</wps:Data>

</wps:Input>

<wps:Input>

<ows:Identifier>OUTPUT_FILENAME</ows:Identifier>

<wps:Data>

<wps:LiteralData>changeclustering20181009_result.tif</wps:LiteralData>

</wps:Data>

</wps:Input>

<wps:Input>

<ows:Identifier>LOG_FILE</ows:Identifier>

<wps:Data>

<wps:LiteralData>debug.log</wps:LiteralData>

</wps:Data>

</wps:Input>

</wps:DataInputs>

<wps:ResponseForm>

<wps:ResponseDocument/>

</wps:ResponseForm>

</wps:Execute>