Data Science, Knowledge Discovery, Mining and Learning

WCN Panel. Interdisciplinary Research for Cloud Computing: Future and challenges. Data Science, Knowledge Discovery, Mining and Learning. Wagner Meira Jr., Universidade Federal de Minas Gerais, Brazil


WCN Panel

Interdisciplinary Research for Cloud Computing: Future and challenges

Data Science, Knowledge Discovery, Mining and Learning

Wagner Meira Jr.

Universidade Federal de Minas Gerais, Brazil

Panel Questions

1) How are the basic premises and challenges of a given research area being affected by the rise of cloud computing?

2) What are the changes we already see in those areas?

3) Where are we heading? Are there new holy grails?

Context: Data Science, Knowledge Discovery, Mining and Learning

● Data science (and related areas) aims to extract actionable knowledge from large volumes of data (semi-)automatically.

● Enablers:

– Data storage has reached unprecedented scale and is still increasing (e.g., IoT).

– Computational power is cheaper than ever.

– Techniques are mature and have broad applicability.

– Growing, broad interest in using computational analytics and intelligence in daily activities. (That's the difference!)

Question 1: How does cloud computing change premises?

● Latency:

– Interactive tasks (e.g., visualization) become more challenging.

● Bandwidth:

– Transferring large amounts of data is not trivial, affecting application setup and computation.

● Computational power:

– Elasticity adds a new variable to analytics, since resource usage can now be varied on demand.

● Storage:

– Greater storage availability allows trading off computation and communication against storage.

● Streaming vs. Batch:

– Better connectivity enables better integration of streaming and batch tasks.

● Computing workload:

– How to exploit reference locality and other strategies for multi-user scenarios.
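The bandwidth and elasticity points above can be made concrete with a back-of-the-envelope model. The sketch below is an illustration only, not something from the talk: the function names, the split of a job into a serial and a perfectly parallel part, and the per-worker-hour price are all assumptions.

```python
def transfer_seconds(data_bytes: float, bandwidth_bits_per_s: float,
                     latency_s: float = 0.0) -> float:
    """Time to move data over a link: fixed latency plus size / bandwidth."""
    return latency_s + (data_bytes * 8) / bandwidth_bits_per_s


def elastic_runtime_seconds(serial_s: float, parallel_s: float, workers: int) -> float:
    """Runtime when only the parallel part shrinks as workers are added.

    Assumes the parallel part scales perfectly, which real jobs rarely do.
    """
    if workers < 1:
        raise ValueError("workers must be >= 1")
    return serial_s + parallel_s / workers


def elastic_cost(serial_s: float, parallel_s: float, workers: int,
                 price_per_worker_hour: float) -> float:
    """Cost when every worker is billed for the whole runtime (hypothetical pricing)."""
    runtime_s = elastic_runtime_seconds(serial_s, parallel_s, workers)
    return workers * (runtime_s / 3600.0) * price_per_worker_hour


if __name__ == "__main__":
    # Moving 1 TB over a 1 Gbit/s link takes hours, before any computation starts.
    print(transfer_seconds(1e12, 1e9))
    # More workers finish sooner but, under this model, cost more in total.
    for n in (1, 4, 16):
        print(n, elastic_runtime_seconds(600, 3600, n), elastic_cost(600, 3600, n, 0.10))
```

Under these assumptions, elasticity turns the analysis into a time-versus-cost decision rather than a fixed-capacity one, which is the "additional variable" mentioned above.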

Question 2: What are the changes we already see?

● Issues:

– QoS: virtualization enables computing management, but several SLA dimensions are still open.

– Scalability: parallel programming environments are in place, but they are not necessarily accessible or capable of efficiently meeting all algorithmic demands.

– Privacy and security: very few proposals are clear, effective and cover a broad spectrum. Legal issues are a challenge.

● Commercially, Hadoop has enabled several services, and these are now being extended to Spark.

● There is still a huge market for data-related applications and services to be integrated into the cloud, with much more to come.

Question 3: Where are we heading?

● Data will keep increasing in terms of volume, diversity and complexity, demanding novel models and algorithms, and making QoS, scalability, privacy and security even more challenging:

– Multimedia data

– Small data

– IoT

● Data scientists' role is to design data flows, not necessarily to code them.

● Data-driven decision making will become commonplace:

– Agents

– Analytics applications

– Smart cities

● Cloud-aware algorithms: strategies for making algorithms cost-effective, considering their intensity in terms of both computation and communication:

– Hybrid Memory Cube

– GPU
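One way to read "cloud-aware" is to pick between alternative execution plans by pricing their computation and communication together. The sketch below is a minimal illustration under that reading: the `Plan` class, the plan names and all prices are hypothetical and do not come from the talk or from any provider's price list.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Plan:
    """One way of running the same analysis (all fields are illustrative)."""
    name: str
    compute_hours: float  # computation intensity
    transfer_gb: float    # communication intensity


def plan_cost(plan: Plan, price_per_compute_hour: float, price_per_gb: float) -> float:
    """Total cost as a weighted sum of computation and communication."""
    return plan.compute_hours * price_per_compute_hour + plan.transfer_gb * price_per_gb


def cheapest_plan(plans: Iterable[Plan], price_per_compute_hour: float,
                  price_per_gb: float) -> Plan:
    """Return the plan with the lowest combined cost under the given prices."""
    return min(plans, key=lambda p: plan_cost(p, price_per_compute_hour, price_per_gb))


PLANS = [
    # Little computation, but the raw data crosses the network.
    Plan("ship-raw-data", compute_hours=2.0, transfer_gb=500.0),
    # More computation near the data, far less to transfer.
    Plan("aggregate-at-source", compute_hours=5.0, transfer_gb=20.0),
]

if __name__ == "__main__":
    print(cheapest_plan(PLANS, price_per_compute_hour=0.50, price_per_gb=0.09).name)
    print(cheapest_plan(PLANS, price_per_compute_hour=0.50, price_per_gb=0.0).name)
```

The same algorithmic choice flips as the price ratio changes, so the preferred algorithm depends on the platform's cost structure and not only on asymptotic complexity.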

Thank you!
