OpenEarth: a 4th paradigm workflow for marine and coastal science
Dutch Data Award for beta sciences 2012
Feb 28th 2013, SRIE 2013
Need for marine and coastal science/engineering
http://nupedailynews.com/wp-content/uploads/2013/01/telescope..jpg
http://chromblog.thermoscientific.com/Portals/49739/images/ lims for biobanking1.png
http://www.eiroforum.org/media/photo_galleries/cern/cern-02l.jpg
http://oeatech.net/wp-content/uploads/2011/03/RADARSAT2-satellite.jpg
suspended mud particle: 64 μm
North Atlantic Oscillation (NAO): 6400 km
gap: 10^-6 > 10^+6
• Water scarcity
• Flood protection
• Water pollution
http://upload.wikimedia.org/wikipedia/commons/2/22/ Da_Vinci_Vitruve_Luc_Viatour.jpg
Marine/coastal scales
OpenEarth is a workflow: philosophy, technology, community
OpenEarth philosophy
Open up & collaborate
• Non-zero sum game: 1 + 1 = 3
• Tragedy of the Commons
• OpenEarth wants to keep sharing going
• all collaborating is not enough
• chaos
• coordination needed
• but no overall boss
• we need smooth workflow
• like Wikipedia
check this out
OpenEarth community: social media & hands-on
LinkedIn group 269 members
Sprint sessions: hands-on outreach & feedback every year prior to the Netherlands Centrum voor Kustonderzoek (NCK) days
OpenEarth philosophy: universal principles
1) data & tools are one: database alone is useless
data product = raw data (volts, counts) + software
https://earth.esa.int/handbooks/meris/aux-files/image047.jpg http://logfurniturehowto.com/wp-content/uploads/2012/07/ Types-Of-Screwdrivers-And-Their-Uses.jpg
ESA/NASA raw satellite volts
ESA/NASA useful product physical units
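The idea that a data product is raw data plus software can be sketched in a few lines. This is only an illustration under stated assumptions: the linear calibration, the `gain` and `offset` values, and the function name `counts_to_physical` are hypothetical and do not describe any actual ESA/NASA processing chain.

```python
from typing import Iterable, List


def counts_to_physical(counts: Iterable[int], gain: float, offset: float) -> List[float]:
    """Apply a (hypothetical) linear calibration: value = gain * count + offset."""
    return [gain * c + offset for c in counts]


# Hypothetical raw sensor counts; without the software above they carry no physical meaning.
raw_counts = [0, 100, 200]

# The "data product": the same numbers expressed in (assumed) physical units.
product = counts_to_physical(raw_counts, gain=0.5, offset=1.0)
print(product)
```

Without the calibration function and its coefficients, the raw counts cannot be interpreted, which is why the data and the tools are kept together.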
2) good enough to share, if good enough to publish
transparency to combat sloppiness, mistakes & fraud. No standardization yet.
http://adaywithsfe.files.wordpress.com/2012/09/eazie5.jpg
http://static.nl.groupon-content.net/25/05/1333050000525.jpg
http://cdn.chud.com/a/af/afdcafa0_2601_vegetable_lo_mein.jpeg
http://farm4.static.flickr.com/3330/3599399622_839ec0cc66.jpg http://blog.gjvanbussel.nl/wp-content/uploads/2011/09/Stapel.jpg
3) Quality is a version control process
• Sharing will generate feedback: version control becomes inevitable
• update your own work
• allow altered copies to proliferate
• Version control is common practice for:
  • ISO 9001: doc, xls, pdf
  • Toyota Kaizen: keep on improving
  • Software engineering: Windows Update
• … science is lagging behind anyway
http://www.knups.nl/leuk_plaatje/56/Helaas_pindakaas.html http://www.food-info.net/nl/national/ww-pindakaas_clip_image001.jpg
OpenEarth technology: a pragmatic choice
Technology: an open software stack
Off-The-Shelf (OTS) vs off-the-drawing board
http://www.bnr.nl/incoming/621827-1301/fyra-578.jpg/ALTERNATES/i/Fyra-578.jpg
http://1.bp.blogspot.com/-Z8vfA9SmoOg/T84rGCE_XaI/AAAAAAAABLM/Wp2nXaTvrEc/s1600/Amsterdam+086.jpg
We seek an open source software stack that can serve our needs today. We do embrace new technology, but will not allow it to delay us from proper data management now.
Cloud technology: operational de facto web-standards
raw data > SubVersion …
standard data > ISO SQL-PostGIS; OGC netCDF-CF OPeNDAP
tailored data > OGC WxS …
graphics of data > OGC KML …
catalogue of data: what, where, when, who, why, how … data URLs
work done on server / work done on client
users: smart phone & tablet users, scientists, professionals
Subversion: version control on data and tools
Figure: # commits against time in quarters
Web services to data subsets: linking with DOI
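A web service to a data subset can be sketched as a URL that names both the dataset and the slice wanted. The example below builds an OPeNDAP (DAP2) constraint expression using the `[start:stride:stop]` hyperslab syntax. The server address, file name, and variable name `z` are hypothetical placeholders, and the helper `opendap_subset_url` is an illustration rather than part of any OpenEarth tool.

```python
from typing import Sequence, Tuple


def opendap_subset_url(
    dataset_url: str,
    variable: str,
    slices: Sequence[Tuple[int, int, int]],
    response: str = "ascii",
) -> str:
    """Build a DAP2 request URL for one variable.

    Each (start, stride, stop) tuple selects indices along one dimension,
    with an inclusive stop, as in DAP2 hyperslab notation.
    """
    hyperslab = "".join(f"[{start}:{stride}:{stop}]" for start, stride, stop in slices)
    return f"{dataset_url}.{response}?{variable}{hyperslab}"


# Hypothetical dataset: first time step, every second point of indices 10..20.
url = opendap_subset_url(
    "http://example.org/opendap/bathymetry.nc",  # placeholder server and file
    "z",                                         # placeholder variable name
    [(0, 1, 0), (10, 2, 20)],
)
print(url)
```

Because the subset is fully described by the URL, the same string can be cited or shared. Some HTTP clients require the square brackets to be percent-encoded before the request is sent.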
All data in Google Earth: DataTube
• weather models: 10 km resolution
• satellite data: 1 km resolution
• Dutch digital elevation maps: 100 m resolution
• coastal bathymetry: 10 m resolution
• dune profiles: 1 m resolution
Process scale: km/yr
scale of interest: km/yr
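Showing data in Google Earth comes down to writing KML. A minimal sketch with the Python standard library follows; the placemark name and coordinates are made up for illustration, and the helper `placemark_kml` is hypothetical, not the OpenEarth implementation.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"  # OGC KML 2.2 namespace


def placemark_kml(name: str, lon: float, lat: float) -> str:
    """Return a KML document with a single point placemark.

    KML orders coordinates as longitude,latitude.
    """
    ET.register_namespace("", KML_NS)  # serialize with KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Made-up location, used only to show the structure of the output.
kml_text = placemark_kml("dune profile (hypothetical)", 4.5, 52.5)
print(kml_text)
```

Saving the returned string to a `.kml` file gives something Google Earth can open directly.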
Data scarcity > data abundance: data scientist is the sexiest job of the 21st century
1. 1000’s yrs: Empirical (Archimedes et al.)
2. 100’s yrs: Theoretical (Newton et al.)
3. 10’s yrs: Computational (von Neumann et al.)
4. Now: 4th paradigm by Microsoft Research
Portal to filter: deal with complaints from overwhelmed users
Questions