KDUbiq Kick-Off Meeting · 2017-01-31 · ICT, STREP FERARI ICT-FP7-619491 Flexible Event...



ICT, STREP

FERARI ICT-FP7-619491

Flexible Event pRocessing for big dAta aRchItectures

Collaborative Project

D 6.4

Project Workshop, Seminar and Training Course 01.02.2016 – 31.07.2017 (preparation period)

Contractual Date of Delivery: 01.08.2016

Actual Date of Delivery: 31.01.2017

Author(s): Marko Stajcer Sr. (PI), Marko Stajcer Jr. (PI), Michael Kamp (FHG), Michael Mock (FHG)

Institution: PI

Workpackage: WP6

Security: PU

Nature: R

Total number of pages: 17


Project funded by the European Community under the Information and Communication Technologies Programme Contract ICT-FP7-619491

Project coordinator name: Michael Mock

Project coordinator organisation name: Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS)

Revision: 1

Schloss Birlinghoven, 53754 Sankt Augustin, Germany

URL: http://www.iais.fraunhofer.de

Abstract:

During the FERARI project’s lifetime, the consortium organised several workshops to disseminate the project’s results and research. These workshops targeted the scientific community as well as the industry. In this document, we list the workshops held and provide a description of their topics, their agendas and participants. We also describe already planned workshops that will take place between month 30 and the project end.


Revision history

Administration Status

Project acronym: FERARI ID: ICT-FP7-619491

Document identifier: D 6.4 Project Workshop, Seminar and Training Course (03.02.2016 – 31.01.2017)

Leading Partner: PI

Report version: 1 Report preparation date: 09.01.2017 Classification: PU

Nature: REPORT

Author(s) and contributors: Marko Stajcer Sr. (PI), Marko Stajcer Jr. (PI), Michael Kamp (FHG), Michael Mock (FHG)

Status: - Plan

- Draft

- Working

- Final

x Submitted

Copyright

This report is © FERARI Consortium 2016. Its duplication is restricted to the personal use within the consortium and the European Commission. www.ferari-project.eu


FERARI Deliverable D6.4

Project Workshop, Seminar and Training Course

Marko Stajcer Sr. (PI), Marko Stajcer Jr. (PI), Michael Kamp (FHG), Michael Mock (FHG)


Contents

1. Introduction

2. Event Processing, Forecasting and Decision-Making in the Big Data Era

3. Large-Scale Distributed Streaming – Methods And Applications

4. Big Data Summer School

5. FOI Data Science Summer School (Planned)

6. Workshop on Mobile Phone Fraud Detection (planned)

7. Summary


1. Introduction

Until month 30, FERARI organized three project-related workshops and plans to hold another two, one in September and another in November. The first workshop was co-located with the EDBT/ICDT joint conference on Extending Database Technology and Database Theory and was organized by TUC. It focused on the core topic of FERARI, namely event processing and forecasting in the Big Data era. The second workshop took place in Munich, Germany, together with researchers from LMU Munich, CERN and experts from Siemens AG; it was organized by FhG. It focused on distributed stream processing and M2M interaction for the Internet of Things. The third workshop was organized by Poslovna Inteligencija together with researchers from TVZ – Polytechnicum Zagrabiense as part of the Big Data Summer School and was held in Zagreb. The fourth workshop will be held in Varaždin as part of the Data Science Summer School at the Faculty of Organization and Informatics. The fifth workshop will take place in Zagreb, Croatia, together with researchers from the University of Zagreb and experts from Ericsson Nikola Tesla and IN2Data; it is organized by HT. The focus will be on use-cases for the FERARI architecture in the industry, in particular on the FERARI approach to mobile fraud detection.


2. Event Processing, Forecasting and Decision-Making in the Big Data Era

Date and Location: March 27th, 2015, Brussels, Belgium (in conjunction with EDBT 2015)

Chairs:

General Chairs:

Alexander Artikis, NCSR Demokritos, Greece

Antonios Deligiannakis, Technical University of Crete, Greece

Program Committee Chairs:

Minos Garofalakis, Technical University of Crete, Greece

Pedro Bizarro, FeedZai, Portugal

Publicity Chair:

Elias Alevizos (CONTACT PERSON), NCSR Demokritos, Greece

Program Committee:

Baber, Chris (University of Birmingham)

Boley, Mario (Fraunhofer)

Etzion, Opher (Yezreel Valley College)

Fournier, Fabiana (IBM Research Haifa)

Gal, Avi (Technion Institute of Technology)

Giatrakos, Nikos (Technical University of Crete)

Goulart, Paul (Oxford University)

Kamp, Michael (Fraunhofer)

Karkaletsis, Vagelis (NCSR Demokritos)

Keren, Daniel (Haifa University)

Lygeros, John (ETH Zurich)

Paliouras, George (NCSR Demokritos)

Papapetrou, Odysseas (Technical University of Crete)

Patroumpas, Kostas (Athena Research and Innovation Center)

Pelekis, Nikos (University of Piraeus)

Pitt, Jeremy (Imperial College London)

Sharfman, Izchak (Technion)

Stathis, Kostas (Royal Holloway, Univ. of London)

Theodoridis, Yannis (University of Piraeus)

Weidlich, Matthias (Imperial College London)

Website: http://cer.iit.demokritos.gr/epfordm


Program:

9:00-9:10 Opening

9:10-10:10 Keynote: Challenges from Industrial Data Analytics
Michael May, Siemens

Abstract:

Big data applications in industry pose a number of unique challenges, setting them apart from domains such as consumer analytics on the web. Central to many industrial applications is time series data generated at a high rate by often hundreds or thousands of sensors, e.g. in a turbine. Other important data sources are log files generated by control units in complex technical equipment, e.g. PLCs (programmable logic controllers). This data can be used for failure statistics, root cause analysis, predictive maintenance, or for optimizing performance during product design. Especially interesting are use cases that combine in-situ streaming analytics inside the local devices with centralized information, e.g. time series data collected from a whole fleet of wind turbines. In this talk I will describe a number of Siemens’ machine learning applications, especially failure diagnostics at the CERN Large Hadron Collider, self-optimizing wind turbines, and levee monitoring for Waternet Amsterdam. I will also discuss architectural challenges for such systems from a Big Data point of view.

About the speaker:

Michael May is Head of the Technology Field Business Analytics & Monitoring at Siemens Research and Technology Center, and responsible for ten research groups in Munich, Vienna, Brasov, St. Petersburg, Princeton, and Berkeley. He is driving research at Siemens in data analytics and big data architectures and, with his teams, implements data analytics solutions across Siemens. Before joining Siemens in 2013, he was Head of the Knowledge Discovery Department at the Fraunhofer Institute for Intelligent Analysis and Information Systems in Bonn, Germany. In cooperation with industry he developed Big Data Analytics applications in sectors ranging from telecommunication, automotive, retail and logistics to finance and advertising.

Michael was responsible for a number of nationally and European-funded research projects in the area of Data Mining, Machine Learning, and Big Data. Between 2002 and 2009 he coordinated two Research Networks in Data Mining and Machine Learning at the European level, and he was local chair of ICML 2005. He did his PhD on machine discovery of causal relationships at the Graduate Programme for Cognitive Science at the University of Hamburg.

10:10-10:30 Complex Event Processing under Uncertainty: A Short Survey
Elias Alevizos, Anastasios Skarlatidis, Alexander Artikis, Georgios Paliouras

10:30-11:00 BREAK

11:00-11:20 Extending Event-Driven Architecture for Proactive Systems
Fabiana Fournier, Alexander Kofman, Inna Skarbovsky, Anastasios Skarlatidis

11:20-11:40 Towards Flexible Event Processing in Distributed Data Streams
Sebastian Bothe, Vasiliki Manikaki, Antonios Deligiannakis

11:40-12:00 Latent Fault Detection With Unbalanced Workloads
Moshe Gabel, Kento Sato, Daniel Keren, Satoshi Matsuoka, Assaf Schuster


12:00-12:20 What You See Is What You Do: applying Ecological Interface Design to Visual Analytics
Natan Morar, Chris Baber, Peter Bak, Adam Duncan

12:20-12:30 The Proasense project
Hans Torvatn

12:30 Closing

Workshop Proceedings: http://ceur-ws.org/Vol-1330/EDBTICDT-WS2015-complete.pdf

Workshop Description

The Big Data era has posed a number of challenges in applications related to event processing. In particular, the data volume, velocity and distribution necessitate the design of new scalable approaches for the efficient and timely processing of the produced data. The lack of veracity in the handled data and events further complicates the problem. Moreover, key challenges concern the use of the voluminous data in order to forecast future events and perform proactive event-driven decision-making.

Event forecasting is important because eliminating or mitigating an anticipated problem, or capitalizing on a forecast opportunity, can substantially improve our quality of life and prevent environmental and economic damage. For example, changing traffic-light priority and speed limits to avoid traffic congestion will reduce carbon emissions, optimize transportation and increase the productivity of commuters. At the business level, making smart decisions ahead of time can become a differentiator leading to significant competitive advantage. In a wide range of applications, prevention is more effective than the cure. To prevent problems and to capitalize on opportunities before they even occur, a proactive event-driven decision-making paradigm is necessary. Decisions are triggered by forecasting events instead of reacting to them once they happen. Moreover, decisions are made in real time and require on-the-fly processing of Big Data, that is, extremely large amounts of noisy data flooding in from various locations, as well as historical data.

The aim of the EPForDM workshop is to bring together computer scientists with interests in the fields of event processing, event forecasting and event-driven decision-making to present recent innovations, find topics of common interest and stimulate further development of new approaches to make sense of Big Data.

Topics of interest include (but are not limited to):

Scalable event processing under uncertainty

Distributed event processing

Event forecasting

Multi-scale temporal aggregation of events

Machine learning for event processing and forecasting

Distributed machine learning

Event-driven decision-making

Visual analytics for proactive decision-making and Big Data

Human Factors evaluation of proactive event-driven systems

Novel architectures for Big Data processing

Engineering proactive event-driven systems

Position papers on proactive event-driven systems

Privacy issues in Big Data processing


Energy efficiency and reliability in Big Data processing

Scheduling and provisioning issues in Big Data processing

The Workshop on Event Processing, Forecasting and Decision-Making in the Big Data Era took place in conjunction with EDBT 2015 in Brussels, Belgium. The workshop received 16 submissions, of which 6 were selected for presentation by a scientific committee. Moreover, the workshop organizers were able to invite Michael May from Siemens AG to give a keynote talk. The workshop was well attended during the whole day and led to lively and interesting technical discussions, both during the sessions and the breaks.


3. Large-Scale Distributed Streaming – Methods And Applications

Date and Location: June 15th to 17th, 2016, Munich, Germany

Chairs:

Michael Mock (Fraunhofer IAIS)

Website: http://www.ferari-project.eu/2016/06/20/large-scale-distributed-streaming-methods-and-applications-workshop-successfully-held-at-fraunhofer-gesellschaft-in-munich/

Program:

9:00 – 9:30

Michael Mock, Fraunhofer IAIS

Welcome, Overview on the FERARI Project

9:30 – 10:00

Daniel Keren, Haifa University

Lightweight Monitoring of Distributed Streams Using Convex Bounding Functions

10:00 – 10:30

Minos Garofalakis, TUC

Efficient Analytics over Big, Dynamic, Distributed Data

11:00 – 11:20

Moshe Gabel, Technion

Using Stream Mining Techniques for Machine Health Monitoring

11:20 – 11:40

Moshe Gabel, Technion

Monitoring Least Squares Models of Distributed Streams

11:40 – 12:10

Assaf Schuster, Technion

Lazy Detection of Complex Events over Event Streams

12:10 – 12:40

Antonios Deligiannakis, TUC

Optimizing Massive-Scale Complex Event Processing

14:00 – 14:30

Michael May, Siemens

Industrial Data Analytics @ Siemens

14:30 – 15:00

Volker Tresp, Siemens

Machine Learning for Context Sensitive Event Prediction

15:30 – 16:00

Filippo M. Tilaro, Manuel G. Berges, CERN

Data Analytics for CERN Control System

16:00 – 16:30

Thomas Seidl, LMU

Analyzing even faster data streams

16:30 – 17:00

Matthias Schubert, LMU

Monitoring Relational Patterns in Volatile Data Set

Workshop Description

In recent years, the number of data-generating devices has been growing rapidly: mobile phones, sensors in cars, smart home devices, and industrial machines. Many of today’s Big Data technologies were built for processing human-generated data and focus on batch processing of data stored on distributed file systems. As Big Data finds its way to other data sources, this design decision becomes limiting. Areas with great future potential are machine-to-machine (M2M) interaction and the Internet of Things. These, however, require processing of massive and predominantly transient data streams.


The goal of this workshop is to bring together researchers and practitioners in the field of large-scale distributed streaming in order to discuss current developments at the interface between research and industrial applications. It aims to foster an exchange between the research communities as well as to promote recent research results to applications in industry. This public workshop is organized by the FERARI Project in collaboration with Siemens and the LMU Munich.

The workshop took place at the Fraunhofer headquarters in Munich, Germany. It was well attended, with 12 speakers and 16 participants. Besides members of the FERARI consortium, the workshop attracted speakers from the Ludwig-Maximilians-University Munich, from Siemens AG and from CERN. In addition to the excellent talks, the workshop enabled lively discussion on use-cases for FERARI technology at both CERN and Siemens, as well as on the possibilities of collaboration with LMU.


4. Big Data Summer School

Date and Location: July 4th to 8th, 2016, Zagreb, Croatia

Chairs:

Marko Stajcer (Poslovna Inteligencija)

Sergej Luković (TVZ)

Hrvoje Appelt (TVZ)

Lada Banić (Poslovna Inteligencija)

Hrvoje Gabelica (Poslovna Inteligencija)

Iva Sorić (Poslovna Inteligencija)

Web site: http://www2.tvz.hr/2016/05/big-data-summer-school/

Program:

Day 1: Business Aspect. Way of thinking: Sociotechnical systems way of thinking, Think like a Data Scientist

Day 2: Big data concept / design. Way of thinking: Think like a Big Data Architect

Day 3: Big Data Architecture Ingestion and Processing Big Data. Way of thinking: Think like a Big Data Architect

Day 4: Big Data Analytics. Way of thinking: Think like a Data Scientist

Day 5: Machine Learning and Visualization. Way of thinking: Think like a Data Scientist

Course Description

The school combines theory and practice. Participants are introduced to concepts and best practices, which are combined with real-world case studies, practical examples, hands-on exercises and knowledge transfer sessions. The Big Data summer school is based on knowledge transfer with main pillars:


Trainings and knowledge transfers are provided by academic researchers on Big Data concepts, environments and analytics as well as leading practitioner experts. All of our trainings are based on our extensive practical knowledge and expertise within different industries and business segments with lots of practical examples and exercises.

The FERARI workshop was part of the Day 3 module: Big Data Architecture Ingestion and Processing Big Data.

Program:

13:00 – 13:45 Streaming processing in Big Data Environments

13:45 – 14:30 FERARI architecture and Use-cases

14:30 – 16:00 Demo - Fraud detection in telecommunication using FERARI architecture


5. FOI Data Science Summer School (Planned)

Date and Location: September 13th to 15th, 2016, Faculty of Organization and Informatics, Varaždin, Croatia

Chairs:

Marko Štajcer (Poslovna Inteligencija)

Dijana Oreški (Faculty of Organization and Informatics)

Web site: http://www.foi.unizg.hr/hr/novosti/ljetna-skola-data-science

Program:

9:30 – 10:15 Streaming processing in Big Data Environments

10:15 – 11:00 FERARI architecture and Use-cases

11:00 – 12:00 Demo - Fraud detection in telecommunication using FERARI architecture

Workshop Description

The FERARI project and its role in the Big Data world.


6. Workshop on Mobile Phone Fraud Detection (planned)

Date and Location: November 22nd, 2016, Zagreb, Croatia

Chairs:

Michael Mock (Fraunhofer IAIS)

Damir Bogadi (Hrvatski Telekom)

Program:

9:30 – 10:15 Overview on the FERARI project and architecture
Michael Mock

10:15 – 11:00 Adaptive Query Optimization
Antonis Deligiannakis

11:00 – 12:00 Discussion

Workshop Description

Communication service providers are often targets of fraud schemes that can significantly impact their revenues and service performance. The 2015 Global Loss Survey states that loss from fraud in telecommunications accounts for 1.69 percent of lost revenues. Typical fraud detection systems in telecommunications use billing and usage information, network data, location data, CRM data and some sources of external data to detect patterns that correspond to fraudulent behaviour. The goal in fraud mining is to identify users who use a network service without the intention to pay for that use. Many fraud mining systems in telecommunications use some form of rules, often defined by fraud experts or generated automatically by software, to raise alarms. These alarms are checked by fraud investigators on a case-by-case basis. At night, when no fraud investigators are present, the software may automatically block certain calls to prevent damage. During the day, the fraud investigators take action after they have investigated a case. It is their duty to decide whether a suspicious behaviour is fraudulent or legal. This depends on the current call, the call history, the customer history and the subscription plan of the customer. The focus within FERARI lies on the identification of suspicious calls and users and on the design of distributed, communication-efficient systems for this task. Within this coarse definition of telecommunication fraud several well-known patterns exist, each with its own characteristics.

This workshop brings together the FERARI consortium, fraud mining experts from Hrvatski Telekom, experts from Ericsson Nikola Tesla, scientists from the University of Zagreb and experts from IN2Data in order to discuss the FERARI approach and architecture for fraud detection.
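The rule-based alarm flow described in this workshop description can be illustrated with a minimal sketch. It is not the FERARI implementation: the `Call` fields, the two example rules, their thresholds, the home country code and the 22:00 to 06:00 night window are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List, Tuple


@dataclass
class Call:
    caller: str
    callee_country: str
    duration_s: int
    started: datetime


# A rule inspects the current call and the call history and
# returns True when the call looks suspicious.
Rule = Callable[[Call, List[Call]], bool]


def long_international_call(call: Call, history: List[Call]) -> bool:
    # Hypothetical rule: international call longer than one hour
    # ("HR" is an assumed home country code).
    return call.callee_country != "HR" and call.duration_s > 3600


def burst_of_calls(call: Call, history: List[Call]) -> bool:
    # Hypothetical rule: more than 20 earlier calls by the same caller
    # within the hour before the current call.
    recent = [
        c for c in history
        if c.caller == call.caller
        and 0 <= (call.started - c.started).total_seconds() <= 3600
    ]
    return len(recent) > 20


RULES: Dict[str, Rule] = {
    "long_international_call": long_international_call,
    "burst_of_calls": burst_of_calls,
}


def is_night(ts: datetime) -> bool:
    # Assumed window in which no fraud investigators are present.
    return ts.hour >= 22 or ts.hour < 6


def handle_call(call: Call, history: List[Call]) -> Tuple[str, List[str]]:
    """Return the action taken and the names of the rules that fired."""
    fired = [name for name, rule in RULES.items() if rule(call, history)]
    if not fired:
        return "allow", fired
    if is_night(call.started):
        # At night the software blocks automatically to prevent damage.
        return "block", fired
    # During the day an alarm is raised for case-by-case investigation.
    return "queue_for_investigator", fired
```

In this sketch, a suspicious call at night is blocked directly, while the same call during the day only produces an alarm that a human investigator would review together with the call history.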


7. Summary

The FERARI consortium organised five workshops in total, including one workshop in conjunction with a top-tier conference, two workshops that brought together researchers with the industry, and two summer schools. These events were used to (i) disseminate the consortium’s research and the FERARI approach in general, (ii) advertise the FERARI open source architecture and its building blocks, and (iii) receive feedback and discuss the approach. The workshops were very successful regarding all three of these goals. The EDBT workshop (2) increased the visibility of FERARI in the scientific community already at the beginning of the second year of the project. The workshop with mobile phone experts from HT and Ericsson (6) not only advertised the FERARI architecture to the industry but also brought valuable feedback and sparked fruitful discussions on our approach to the fraud detection use-case, as well as the general architecture. The Munich workshop (3) extended the project’s visibility with attendees and speakers from Siemens AG, CERN, and the Ludwig-Maximilians-University Munich. The two summer schools (4, 5) further disseminated the project within the industry and linked the FERARI in-situ approach deeply with Big Data and Data Science topics. In addition to the workshops listed, the consortium members plan several further workshops that will take place after the project ends. A list of these workshops can be found in Deliverable D6.6.

The workshops reflect the project’s goal of simultaneously performing advanced research, driving the development of in-situ methods, and providing tangible solution prototypes to the industry. Together with the exploitation and dissemination activities and plans described in Deliverable D6.6, they are integral to the successful dissemination of the FERARI approach and the open source architecture.