Cloud hpc-bigdata-challenges
Uploaded by: microsoft-azure-for-research
Category: Data & Analytics
![Page 2: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/2.jpg)
![Page 3: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/3.jpg)
![Page 4: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/4.jpg)
IT
PAC
![Page 5: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/5.jpg)
![Page 6: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/6.jpg)
Melbourne
Sydney
Brazil
Beijing
![Page 7: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/7.jpg)
Programming tools: Scala, IPython, Azure ML, …
Frameworks: Spark, Hadoop, YARN, HDInsight, REEF, Twister, Brisk
Software Defined Storage
Software Defined Networks
Hardware Abstraction/Virtualization
![Page 8: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/8.jpg)
http://tce.technion.ac.il/files/2012/06/Scott-shenker.pdf
www.opennetsummit.org/pdf/2013/presentations/albert_greenberg.pdf
http://www.cs.princeton.edu/~jrex/papers/pyretic-login13.pdf
![Page 9: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/9.jpg)
![Page 10: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/10.jpg)
The Science Perspective
![Page 11: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/11.jpg)
Every research field is now a data science field
![Page 12: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/12.jpg)
• Thousand years ago: description of natural phenomena
• Last few hundred years: Newton’s laws, Maxwell’s equations…
  $\left(\frac{\dot{a}}{a}\right)^2 = \frac{4\pi G\rho}{3} - K\frac{c^2}{a^2}$
• Last few decades: simulation of complex phenomena
• Today and the future: unify theory, experiment and simulation with large multidisciplinary data, using data exploration and data mining (from instruments, sensors, humans…); distributed communities
![Page 14: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/14.jpg)
![Page 15: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/15.jpg)
![Page 16: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/16.jpg)
![Page 17: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/17.jpg)
![Page 18: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/18.jpg)
![Page 19: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/19.jpg)
Inputs (training data)
Labels
Hidden layers
Input data
Detected features
Mona Lisa
![Page 20: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/20.jpg)
![Page 21: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/21.jpg)
• The Genetic Causes of Disease (David Heckerman)
• Wellcome Trust GWAS (genome-wide association study) of a large population
• Looking for causes of seven common diseases (bipolar disorder, rheumatoid arthritis, coronary artery disease, hypertension, …)
• Confounding is a problem; a new algorithm was needed.
• Ran on the Azure cloud using 35,000 cores in 3 weeks.
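To make the confounding point concrete, here is a minimal Python sketch. It is not the algorithm used in the study, which the slide does not describe; it only illustrates how population structure can create a spurious association and how simple stratification removes it. The population labels, allele frequencies, disease risks, and function names are all hypothetical values chosen for illustration.

```python
import random

def simulate(n_per_pop=20000, seed=0):
    """Simulate a non-causal SNP in two populations (all numbers are illustrative)."""
    rng = random.Random(seed)
    # (allele frequency, disease risk) per population; the SNP itself has no effect.
    pops = {"A": (0.1, 0.05), "B": (0.5, 0.20)}
    rows = []
    for pop, (freq, risk) in pops.items():
        for _ in range(n_per_pop):
            carrier = rng.random() < freq
            case = rng.random() < risk  # drawn independently of carrier status
            rows.append((pop, carrier, case))
    return rows

def risk_difference(rows):
    """Case rate among carriers minus case rate among non-carriers."""
    carriers = [case for _, carrier, case in rows if carrier]
    others = [case for _, carrier, case in rows if not carrier]
    return sum(carriers) / len(carriers) - sum(others) / len(others)

def stratified_risk_difference(rows):
    """Average of within-population risk differences, weighted by population size."""
    total = len(rows)
    result = 0.0
    for pop in sorted({p for p, _, _ in rows}):
        subset = [r for r in rows if r[0] == pop]
        result += len(subset) / total * risk_difference(subset)
    return result

rows = simulate()
naive = risk_difference(rows)                 # pools both populations
adjusted = stratified_risk_difference(rows)   # compares within each population
print(f"naive risk difference:    {naive:.3f}")
print(f"adjusted risk difference: {adjusted:.3f}")
```

In this toy setup the pooled comparison suggests that carriers are at higher risk, even though the SNP has no effect, because population B has both more carriers and more cases. Comparing within each population brings the difference close to zero. Real GWAS data has far more subtle structure, such as relatedness and continuous ancestry, which is presumably why a more sophisticated algorithm was needed.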
![Page 22: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/22.jpg)
![Page 23: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/23.jpg)
![Page 24: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/24.jpg)
Chameleon Cloud SDN
NIH data commons
![Page 25: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/25.jpg)
Mesos
Tachyon
Docker
Spark
Data Analytics and ML programming tools
REEF
Twister
![Page 26: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/26.jpg)
![Page 27: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/27.jpg)
![Page 28: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/28.jpg)
![Page 29: Cloud hpc-bigdata-challenges](https://reader030.fdocuments.net/reader030/viewer/2022032616/55a4dd961a28ab43768b45f1/html5/thumbnails/29.jpg)
• Many Examples
• The Challenge: sustainability
Data acquisition & modelling
Collaboration and visualisation
Analysis & data mining
Dissemination & sharing
Archiving and preserving