
International Workshop on Science Gateways
Monday 19 October 2015, Brisbane

PLEASE NOTE: Registration for this workshop is via http://doodle.com/4y8hyxe7b2rwfgxs. There is no registration option via the eResearch Conference Registration Site.

Time / Speaker / Title

The Role of Science Gateways

9:00 Keynote: Nancy Wilkins-Diehr (by videoconference)
Science Gateways: The Importance of Building Community

9:30 Nigel Ward, Glenn Moloney
Reflections on the NeCTAR Virtual Laboratory Program

9:50 Richard O. Sinnott, Christopher Bayliss, Andrew Bromage, Gerson Galang, Yikai Gong, Philip Greenwood, Glenn Jayaputera, Davis Marques, Luca Morandini, Ghazal Nogoorani, Marcos Nino-Ruiz, Hossein Pursultani, Rosana Rabanal, Muhammad Sarwar, William Voorsluys, Ivo Widjaja
The Collaborative Urban Research Environment for Australia

10:10 Wojtek James Goscinski
The Characterisation Virtual Laboratory

10:30 Morning tea

Science Gateway Experiences Part 1

10:50 Richard O. Sinnott, Jemie Effendy, Stephan Gloeckner, Anthony Stell
Beyond a Disease Registry: An Integrated Virtual Environment for Adrenal Cancer Research

11:10 Michelle Barker
A Science Gateway for Malaria: Successes and Challenges

11:30 Uwe Rosebrock, Peter Oke, Roger Proter, Gary Carroll, Simon Pigot, Xiao (Ming) Fu
The Marine Virtual Laboratory – ocean modeling made easy

11:50 Aurel F. Moise, Tim Pugh, Martin Dix, Bertrand Timbal
The Australian Climate and Weather Science Virtual Laboratory (CWSLab)

12:10 Discussion

12:20 Lunch

Science Gateway Experiences Part 2

1:10 Ian M. Atkinson, Jeremy Vanderwal, Daniel Baird, Andrew Krockenberger, Nigel Bajima, Scott Mills, Nigel G. Sim
An Integrated Sensor Network and Research Data Management System for the Daintree Rainforest Observatory

1:30 Siddeswara Guru, Hoang Anh Nguyen, Shilo Banihit, Matthew Mulholland, Kim Olsson, Tim Clancy
Development of cloud-based virtual desktop environment for synthesis and analysis for ecosystem science community

1:50 David Abramson, Hoang Nguyen
Workflow driven Science Gateways

2:10 Sandra Gesing
Developing Science Gateways: Current Solutions and Future Challenges

2:40 Afternoon tea

The Future of Science Gateways

3:00 Panel discussion. Chair: Rhys Francis

4:30 End


eResearch Australasia Conference | Melbourne – Australia | 27 - 31 October - 2014

Science Gateways: The Importance of Building Community
Nancy Wilkins-Diehr

San Diego Supercomputer Centre

Science Gateways or portals provide researchers with online access to data, computational tools, and resources. They help researchers collaborate, discover, organise, analyse, visualise and engage – in short, they enable research in bold new ways. Recent research co-ordinated by sciencegateways.org involving nearly 5,000 members of the research community illustrates the importance of science gateways for researchers and educators, and opportunities for growth. The research measured the extent and characteristics of the gateway community (reliance on gateways, nature of existing resources) to understand useful services and support for builders and users. Sciencegateways.org provides a range of opportunities for sharing of experiences among gateway developers. While there are still funding challenges for gateways, the USA’s National Science Foundation recognises the importance of science gateways, and advances are being made by showing how small investments can benefit many researchers, highlighting the importance of building community.

ABOUT THE AUTHOR(S)
Nancy Wilkins-Diehr is Associate Director at San Diego Supercomputer Centre and co-director of XSEDE’s Extended Collaborative Support program. She has been involved in science gateways and their interfaces to high-performance computing since 2005. Nancy received her Bachelor’s degree from Boston College in Mathematics and Philosophy and her Master’s degree in Aerospace Engineering from San Diego State University.

 


eResearch Australasia Conference | Brisbane – Australia | 19 – 23 October – 2015

Reflections on the NeCTAR Virtual Laboratory Program
Nigel Ward, Glenn Moloney

1 The University of Melbourne, Melbourne, Australia, [email protected]
2 The University of Melbourne, Melbourne, Australia, [email protected]

The NeCTAR (National eResearch Collaboration Tools and Resources) project [1] has established eleven Virtual Laboratories that provide rich domain-oriented online environments connecting Australian researchers to facilities, data repositories and computational tools on a national scale. This presentation will provide an overview of the NeCTAR Virtual Laboratories, describe their usage and successes, and reflect on how NeCTAR’s project approach supported and hindered that success.

THE NECTAR VIRTUAL LABORATORIES
The NeCTAR Virtual Laboratories provide rich domain-oriented online environments that draw together research data, models, analysis tools and workflows to support collaborative research across institutional and discipline boundaries. They include:

• Genomics Virtual Laboratory [2], which aims to “take the information technology out of bioinformatics”, providing biologists with easy access to a suite of genomics tools and resources through a web portal.

• MARVL Marine Virtual Laboratory [3], supporting forecasting and planning for marine and coastal environments by providing access to ocean observations and ocean models which, prior to the lab, had been difficult for users to set up.

• Virtual Geophysics Laboratory [4], providing geophysicists with easy access to geophysics data, workflows, simulations, software tools and computational infrastructure.

• Climate and Weather Science Laboratory [5], supporting use of the ACCESS weather model, reproducible climate and weather analysis workflows, visualisations, and access to climate data.

• Characterisation Virtual Laboratory [6], providing a data management and workflow environment for scientists who use advanced imaging techniques in Neuroimaging, Structural Biology, Energy Materials (X-ray), and Energy Materials (Atom Probe).

• All Sky Virtual Observatory [7], housing cosmological simulations and tools that allow astronomers to observe each virtual universe as if it were real, and an environment for the hosting, analysis, and exploration of data from the SkyMapper Southern-Sky Survey.

• Humanities Networked Infrastructure (HuNI) [8], which combines information from 30 of Australia’s most significant cultural datasets, and allows researchers to assert relationships among data.

• Industrial Ecology Lab [9], supporting modeling of environmental impacts through complex inter-industry supply-chain networks.

• Biodiversity and Climate Change Virtual Laboratory [10], supporting experiments in Species Distribution Modelling, including projection onto future climate layers, Species Trait Modelling, Biodiversity modelling, and Ensemble statistical distribution.

• Endocrine Virtual Laboratory (endoVL) [11], hosting targeted disease registries which assist researchers and clinicians to gather a large enough cohort of patients to conduct a study or clinical trial with patients suffering from the rarer endocrine conditions.

• Alveo [12], which brings together data collections, analysis tools, and workflows in a common environment to allow human communication scientists to study speech, language, text, and music on a large scale.

REFLECTIONS ON SUCCESS
From a project management perspective, the NeCTAR Virtual Laboratory projects carried significant risk: they all involved cross-institutional collaboration, and all involved integration of infrastructure controlled by other organisations. Despite these risks, they are all successfully operating infrastructure, are all reporting strong uptake and utilisation by their respective research communities, and their governance committees are asserting delivery against research stakeholder expectations. How did NeCTAR and these sub-projects successfully deliver research value in spite of their inherent risks? Where did it go wrong?

What went well
The NeCTAR Request for Proposals process aimed to maximise the long-term research benefits of the infrastructure by favouring proposals that:

• addressed a real research need within an existing research community;


• involved collaborative partnerships between researchers, infrastructure developers and infrastructure operators;

• leveraged existing national investments in computation, storage, data, trust, networks, and instruments;

• included co-investment to cover operational costs beyond the development phase.

  o All projects committed to operate infrastructure for at least six months, although some have committed to operate for longer, and many have since successfully sourced additional investments to sustain their infrastructure beyond these initial commitments.

All NeCTAR sub-project governance groups contain researchers responsible for ensuring the projects deliver value to their target research domain. These research participants are also expected to act as champions and advocates for the sub-project within their research domains, supporting uptake and utilisation by the broader research community.

The most successful Virtual Laboratory projects involved strong partnerships between researchers and software developers. The researchers contributed their vision, domain knowledge and bespoke tools to the project, while the software developers helped make the tools more robust and widely available, often moving them from stand-alone tools that run on a researcher’s desktop into online research Software-as-a-Service offerings. Given the novel nature of Software-as-a-Service as a research infrastructure delivery model, NeCTAR actively fostered knowledge exchange across the software development teams supporting the Virtual Laboratories.

NeCTAR required its sub-projects to regularly report on progress against milestones, expenditure, co-investment, risks, measures of uptake, and communication activities. Having projects regularly report on issues and risks, both to NeCTAR and at sub-project governance meetings, was useful in setting NeCTAR’s own risk management and coordination agendas.

NeCTAR supported a change control process that allowed projects to respond to opportunities for delivering research value not originally envisaged in their proposals. These processes ensured that ongoing changes in the sub-project implementation were fully understood and agreed between the project owners, the research community, and NeCTAR.

NeCTAR required the involvement of research end-users in Acceptance Testing of agreed sub-project deliverables. Many projects found writing acceptance criteria ahead of time and involving researchers in sign-off on delivery useful in managing stakeholder expectations.

What didn’t go so well
Despite the success of the NeCTAR Virtual Laboratories, there were of course instances where projects were not able to meet research community expectations.

• All of the NeCTAR Virtual Laboratories delivered late. This is perhaps not surprising given they involved complex integrations and short timelines for delivery (2 years for the stage 1 projects, 18 months for stage 2).

• Some projects failed to continuously deliver infrastructure during their software development phase, which disenfranchised some of their research stakeholders. Delays in delivery or stability of third-party research infrastructure needed by the Virtual Laboratories further compounded these timeline and engagement pressures.

• Unfortunately, the relationship between the researchers and software developers collapsed in a few projects, leading to difficult governance conversations about changing collaborators mid-project, serious delays to project delivery and reduction in scope.

• While all Virtual Laboratories used the NeCTAR change control process, they universally found it laborious, expressing a desire for a change process that involves governance groups rather than lawyers.

CONCLUSION
Despite its inherent risks, the NeCTAR Virtual Laboratory program has been an outstanding success. All of the NeCTAR Virtual Laboratories are operating rich domain-oriented online environments that draw together research data, models, analysis tools and workflows to support collaborative research across institutional and discipline boundaries. All are reporting strong uptake and utilisation by their respective research communities, and many have found extra funds to continue to operate beyond their initial commitments.

We hope that this brief overview of the NeCTAR Virtual Laboratory Program provides inspiration for others to pursue the creation and operation of similar domain-oriented online research environments.


REFERENCES
1. National eResearch Collaboration Tools and Resources project. Available from http://nectar.org.au/, accessed 8 June 2015.
2. Genomics Virtual Laboratory. Available from https://genome.edu.au/, accessed 8 June 2015.
3. MARVL Marine Virtual Laboratory. Available from https://portal.marvl.org.au/, accessed 8 June 2015.
4. Virtual Geophysics Laboratory. Available from http://vgl.auscope.org/, accessed 8 June 2015.
5. Climate and Weather Science Laboratory. Available from http://cwslab.nci.org.au/, accessed 8 June 2015.
6. Characterisation Virtual Laboratory. Available from https://www.massive.org.au/cvl, accessed 8 June 2015.
7. All Sky Virtual Observatory. Available from http://www.asvo.org.au/, accessed 8 June 2015.
8. Humanities Networked Infrastructure. Available from https://huni.net.au/, accessed 8 June 2015.
9. Industrial Ecology Lab. Available from http://www.isa.org.usyd.edu.au/ielab/ielab.shtml, accessed 8 June 2015.
10. Biodiversity and Climate Change Virtual Laboratory. Available from http://www.bccvl.org.au/, accessed 8 June 2015.
11. Endocrine Virtual Laboratory. Available from https://endovl.org.au/, accessed 8 June 2015.
12. Alveo. Available from http://alveo.edu.au/, accessed 8 June 2015.

ABOUT THE AUTHOR(S)
Nigel Ward is Deputy Director (Software Infrastructure) at the NeCTAR (National eResearch Collaboration Tools and Resources) project, where he primarily co-ordinates projects developing cloud-based software tools for the Australian research community. Nigel is based at the University of Queensland, and before joining NeCTAR managed projects within the UQ ITEE eResearch Group aimed at improving research capability through the provision of IT infrastructure. In previous roles he worked on interoperability and standards for research and learning technologies in the Higher Education sector.

Glenn Moloney is Director of the NeCTAR (National eResearch Collaboration Tools and Resources) project.

 


The Collaborative Urban Research Environment for Australia

Richard O. Sinnott, Christopher Bayliss, Andrew Bromage, Gerson Galang, Yikai Gong, Philip Greenwood, Glenn Jayaputera, Davis Marques, Luca Morandini, Ghazal Nogoorani, Marcos Nino-Ruiz, Hossein Pursultani, Rosana Rabanal, Muhammad Sarwar, William Voorsluys, Ivo Widjaja (and the AURIN Network)

Department of Computing and Information Systems
University of Melbourne
Melbourne, Australia
[email protected]

Presenter Name: Richard Sinnott

ABSTRACT

The federally funded Australian Urban Research Infrastructure Network (AURIN) project (www.aurin.org.au) began in July 2010. AURIN was tasked with developing a secure, web-based virtual environment and underpinning e-Infrastructure offering seamless, secure access to diverse, distributed and extremely heterogeneous data sets from numerous agencies across Australia, together with an extensive portfolio of targeted analytical and visualization tools. This is being provisioned for Australia-wide urban and built environment researchers (itself a highly heterogeneous collection of research communities with diverse demands) through a unified urban researcher and provider-driven collaboration environment. The AURIN platform has extensive features that support the research community in gaining access to a wide array of distributed data sets, and that allow data providers to define their own access and usage demands on their data sets, including access to highly sensitive data sets. This paper describes these demands and how the e-Infrastructure has been designed and implemented to accommodate this diversity of requirements, both from the user/researcher perspective and from the data provider perspective. The utility of the e-Infrastructure is demonstrated through a range of scenarios reflecting the inter-disciplinary urban research now possible, with specific focus on hitherto challenging (impossible!) scenarios that demand utmost security in accessing sensitive (unit level) data sets and commercially sensitive data.

INTRODUCTION
The Australian Urban Research Infrastructure Network (AURIN) project (www.aurin.org.au) is a major national project across Australia that commenced formally in July 2010. AURIN initially received $20 million of funding from the Australian Government Department of Industry for the ‘establishment of facilities to enhance the understanding of urban resource use and management’ [1]. In 2013, the project received a further $4m to extend (harden!) the facilities. In particular, the AURIN project was tasked with providing urban and built environment researchers with a state of the art research infrastructure (an e-Infrastructure) offering seamless and secure access to data and tools for interrogating a wide array of distributed data sets from diverse agencies, to support a portfolio of research activities reflecting the diversity of the urban and built environment research agenda. This is being provisioned through a unified urban collaborative environment offering a complete lab-in-a-browser experience. Key to AURIN is that it provides access to data from the definitive urban data providers across Australia. At present AURIN provides access to over 1,800 data sets from 70 major agencies. These include organisations such as the Australian Bureau of Statistics, VicRoads and VicHealth, amongst many others. The basic early functionality of the AURIN platform was described in [2,3], and the way in which it has been designed and developed utilizing agile technologies is described in [4]. The use of the platform is described in [5-7]. In the last year further work has been undertaken in extending the platform to give access to more data sets and to scale the systems for increased numbers of researchers and their workloads/demands.

Many of the AURIN data sets are highly sensitive, either commercially or because they relate to individuals with explicit confidentiality requirements. To tackle this, AURIN has developed a flexible and fine-grained security model by extending the Australian Access Federation authentication with more advanced authorization capabilities. This paper describes these solutions and how they can be used to restrict access to data sets. Furthermore, many of the data sets demand that much finer-grained access and usage scenarios be supported, and the platform is expected to meet the explicit demands of data providers. We describe these solutions and how unit level data can now be utilized by exploiting advanced privacy-driven geospatial data aggregation techniques.
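The abstract does not say which aggregation technique AURIN uses. As a minimal sketch of one common privacy-driven approach, assuming unit level records tagged with a hypothetical area code, the Python below aggregates values per area and suppresses any area that contains fewer than `min_count` records. The function name, the record layout and the threshold of 5 are illustrative assumptions, not AURIN's implementation.

```python
from collections import defaultdict


def aggregate_with_suppression(records, min_count=5):
    """Aggregate (area, value) records to per-area summaries.

    Areas with fewer than `min_count` records are suppressed (set to None)
    so that no summary can be traced back to a handful of individuals.
    """
    counts = defaultdict(int)
    totals = defaultdict(float)
    for area, value in records:
        counts[area] += 1
        totals[area] += value

    summary = {}
    for area, n in counts.items():
        if n >= min_count:
            summary[area] = {"count": n, "mean": totals[area] / n}
        else:
            summary[area] = None  # too few records: suppress this area
    return summary


# Hypothetical unit level data: six records in area "A", two in area "B".
records = [("A", 10.0)] * 6 + [("B", 99.0)] * 2
summary = aggregate_with_suppression(records, min_count=5)
```

In this sketch area "A" is reported as a count and a mean, while area "B" is withheld because only two records fall inside it. A data provider could tune the threshold per data set to match its own confidentiality requirements.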

Key to the success of AURIN, or indeed any major research infrastructure, is the uptake and adoption by the research community that it is intended for. Since the project started, the AURIN platform has been accessed and used over 35,000 times, with increasing numbers of users coming from non-academic domains including government and industry. Figure 1 shows the access and usage statistics (data provided by the Australian Access Federation) since the release of the Beta-5 version of the platform in September 2014.


Figure 1: AURIN Access and Usage Statistics (September 2014 - May 2015)

This talk will cover all of these aspects and plans for the future, including new research domains that will build upon the AURIN platform.

REFERENCES
1. AURIN Final Project Plan, http://aurin.org.au/resources/final-project-plan
2. R.O. Sinnott, G. Galang, M. Tomko, R. Stimson, Towards an e-Infrastructure for Urban Research Across Australia, IEEE e-Science Conference, Stockholm, Sweden, December 2011.
3. R.O. Sinnott, C. Bayliss, G. Galang, P. Greenwood, G. Koetsier, D. Mannix, L. Morandini, M. Nino-Ruiz, C. Pettit, M. Tomko, M. Sarwar, R. Stimson, W. Voorsluys, I. Widjaja, A Data-driven Urban Research Environment for Australia, IEEE e-Science Conference, Chicago USA, October 2012.
4. R.O. Sinnott, C. Bayliss, L. Morandini, M. Tomko, Tools and Processes to Support the Development of a National Platform for Urban Research: Lessons (Being) Learnt from the AURIN Project, 11th Australasian Symposium on Parallel and Distributed Computing (AusPDC 2013), Adelaide, South Australia, January 2013.
5. C. Pettit, et al, Building an e-infrastructure to support urban and built environment research in Australia: a lens-centric view, Surveying & Spatial Sciences Conference 2013, Canberra, Australia, April 2013.
6. R.O. Sinnott, C. Bayliss, A. Bromage, G. Galang, G. Grazioli, P. Greenwood, G. Macauley, L. Morandini, M. Nino-Ruiz, C. Pettit, M. Tomko, M. Sarwar, R. Stimson, W. Voorsluys, I. Widjaja, The Australian Urban Research Gateway, Journal of Concurrency and Computation: Practice and Experience, April 2014, doi: 10.1002/cpe.3282.
7. C. Pettit, R. Stimson, J. Barton, X. Goldie, R.O. Sinnott, T. Kvan, The Australian Urban Intelligence Network supporting Smart Cities, CUPUM 2015 Conference book on Smart Cities and Planning Support Systems, eds: S. Geertman, J. Stillwell, J. Ferreira, R. Goodspeed, February 2015.

ABOUT THE AUTHORS
Professor Richard O. Sinnott is the Director of eResearch at the University of Melbourne and Chair of Applied Computing Systems. In these roles he is responsible for all aspects of eResearch (research-oriented IT development) at the University. He has been lead software engineer/architect on an extensive portfolio of national and international projects, with specific focus on those research domains requiring finer-grained access control (security). He is technical lead for the AURIN project.

Christopher Bayliss is a software engineer within AURIN with focus on security; Andrew Bromage is a software (data) engineer within AURIN; Gerson Galang is a software engineer within AURIN with focus on data clients; Yikai Gong is a PhD candidate at the University of Melbourne; Philip Greenwood is a software engineer within AURIN with focus on workflow tools; Glenn Jayaputera is the implementation project manager for the AURIN technical team; Davis Marques is a software engineer within AURIN with focus on the portal user interface; Luca Morandini is the AURIN data architect; Ghazal Nogoorani is a software engineer within AURIN with focus on the portal user interface; Marcos Nino-Ruiz is a software engineer within AURIN with focus on data clients; Hossein Pursultani is a software engineer within AURIN with focus on the supporting infrastructure and the continuous build environment; Rosana Rabanal is a software engineer within AURIN with focus on the supporting infrastructure and the continuous build environment; Muhammad Sarwar is a software engineer within AURIN with focus on the supporting middleware/business logic; William Voorsluys is a software engineer within AURIN with focus on the workflow environment; Ivo Widjaja is a software engineer within AURIN with focus on the portal user interface.


The  Characterisation  Virtual  Laboratory    Wojtek  James  Goscinski  

Monash  University,  [email protected]    

DEMONSTRATION ABSTRACT
The Characterisation Virtual Laboratory (CVL – www.massive.org.au/cvl) is a collaborative NeCTAR-funded project to develop online environments for researchers using advanced imaging techniques, and to demonstrate the impact of connecting national instruments with computing and data storage infrastructure.

The CVL project has three major goals:

1. To integrate Australia’s imaging equipment with specialised HPC and cloud capabilities.

More than 450 registered researchers have used and benefited from the technology developed by the CVL project, which gives them an easier mechanism to capture instrument data and process that data on centralised cloud and HPC infrastructure, including MASSIVE and NCI.

2. To provide scientists with a common cloud-based environment for analysis and collaboration.

The CVL has been deployed across NeCTAR federated clouds at the University of Melbourne, Monash University, and QCIF. CVL technology has been used to provide easier access to HPC facilities at MASSIVE, NCI and Central Queensland University.

3. To produce four exemplar platforms, called Workbenches, for multi-modal or large-scale imaging in Neuroimaging, Structural Biology, Energy Materials (X-ray), and Energy Materials (Atom Probe).

The CVL environment now contains 103 tools for specialised data analysis and visualisation in Workbenches. Over 20 imaging instruments have been integrated so that data automatically flows into the cloud for management and analysis.

The technology developed under the CVL provides simple access to centralised processing, analysis and visualisation software, and to HPC infrastructure, for newcomers and inexperienced HPC users.

The CVL is a NeCTAR-funded collaboration between Monash University, Australian Microscopy & Microanalysis Research Facility (AMMRF), Australian Nuclear Science and Technology Organisation (ANSTO), Australian Synchrotron, National Imaging Facility (NIF), Australian National University, The University of Sydney, and The University of Queensland.


Beyond  a  Disease  Registry:  An  Integrated  Virtual  Environment  for  Adrenal  Cancer  Research  

Richard  O.  Sinnott,  Jemie  Effendy,  Stephan  Gloeckner,  Anthony  Stell  Department  of  Computing  and  Information  Systems  

University  of  Melbourne  Melbourne,  Australia  

[email protected]    

Presenter  Name:  Richard  Sinnott    

ABSTRACT
Many biomedical research collaborations are focused on the establishment of web-based databases that capture phenotypic and, in some cases, genotypic information targeted to specific diseases: so-called disease registries. Such resources are often used for clinical matchmaking and allow information on patients and patient disorders to be shared by clinicians with wider biomedical research communities outside of a given hospital setting, and potentially with patients and/or patient advisory groups. Whilst such registries address aspects of clinical collaboration by making (targeted) biomedical data accessible, they are really only a starting point for what can be achieved to support biomedical research collaborations. In particular, registries should ideally be augmented with a portfolio of additional service offerings that facilitate secure research collaborations: bio-banking and bio-sample data tracking capabilities; support for feasibility analysis on clinical trials and studies; seamless data transfer to/from clinical trials; and search and analytical capabilities in a user-driven research environment. Such a feature-rich, Internet-based virtual research environment (VRE) has been established as part of the European Union funded ENS@T-CANCER (www.ensat-cancer.eu) project, which has a particular focus on supporting research into four primary types of adrenal tumours. This paper provides an overview of the ENS@T-CANCER VRE, outlining its core capabilities and how it has galvanized previously fragmented, country-specific database and registry efforts.

The ENSAT-CANCER VRE is now globally adopted, with 70 major cancer centres around the world using the VRE and over 20 major international multi-centre clinical trials now fully supported and integrated into the platform. This VRE provides the basis for the recently funded Horizon2020 ENSAT-HT program. This talk describes the platform and the lessons being learned in its development.

INTRODUCTION
A ubiquitous problem in undertaking biomedical research is access to clinical/biomedical data, and especially access to and sharing of data across organisational and national boundaries. These challenges are multi-faceted and comprise information governance, ethics and privacy challenges; human/social and organisational factors; and a variety of technical implementation issues that must be overcome. On the latter challenge, a body of work on the realisation of solutions for secure, web-based biomedical data sharing now exists [1-3]. For many clinical/biomedical collaborations this is through support for targeted disease registries [4,5]. Such systems are used to aggregate clinical/phenotypic data that can be used to coalesce the common understanding and treatment/management of patients with those particular diseases at a clinical level, and provide a model for the potential sharing of physical biosamples of patients with a specific phenotype, subject to ethics and the agreement of clinicians, patients and indeed the organisations involved. These biosamples can then be used for a range of bioinformatics analysis and –omics research.

One prime example of a disease registry is the international disorders of sex development registry (I-DSD, www.i-dsd.org). The I-DSD registry includes extensive phenotypic information on over 1200 patients with rare disorders of sex development. This system has been adopted on a global scale and captures best practice in disease registries. I-DSD provides the critical mass of ‘standardised’ patient data that has, for the first time, allowed research into disorders of sex development to be undertaken in a systematic and statistically relevant manner through access to large-scale phenotypic data sets covering a spectrum of DSD manifestations. However, whilst essential for inter-organisational research collaborations, such a web-based, disease-focused registry really represents only a starting point for what can be achieved as a collaboration platform.

The ENS@T-CANCER project (www.ensat-cancer.eu) has taken the basic idea of a web-based disease registry to a new technological level to support a complete virtual research environment (VRE) for integrated biomedical research into adrenal tumours. The primary focus of this paper is to describe the core features of the ENS@T-CANCER VRE and illustrate the way in which they collectively provide a step change in biomedical research capabilities for the international adrenal tumour research community. Some of these features were originally presented in [6]; however, the platform has evolved to include a range of more advanced features to support biomedical research. These include VRE data access and usage tracking and their temporal visualization; logging analysis for access and usage statistics exploiting Cloud-based big data technologies; mobile applications; and support for a range of biosample labelling and tracking capabilities. This talk will cover these capabilities and how they support the scientific research.

The current status of the ENSAT-CANCER VRE is shown in Figure 1.

Figure 1: ENSAT-CANCER Status (May 2015)

REFERENCES
1. Ahmed, S.F., Rodie, M., Jiang, J., Sinnott, R.O., The European DSD Registry – A Virtual Research Environment, International Journal on Sexual Development, Special issue on Disorders of Sex Development, “New concepts for human disorders of sex development”, Sex Dev. 2010; 4:192-198 (doi:10.1159/000313434).

2. Sinnott, R.O., Stell, A.J., Jiang, J., Classifying Architectural Data Sharing Models for e-Health Collaborations, Proceedings of the International HealthGrid Conference, Bristol, UK, June 2011.

3. Stell, A.J., Sinnott, R.O., Jiang, J., Donald, R., Chambers, I., Citerio, G., Enblad, P., Gregson, B., Howells, T., Kiening, K., Nilsson, P., Ragauskas, A., Sahuquillo, J., Piper, I., Federating Distributed Clinical Data for the Prediction of Adverse Hypotensive Events, Journal of the Philosophical Transactions of the Royal Society A, July 2009, 367:2679-2690.

4. Bellgard, M.I., Macgregor, A., Janon, F., Harvey, A., O'Leary, P., Hunter, A., Dawkins, H., A modular approach to disease registry design: Successful adoption of an internet-based rare disease registry, Human Mutation.

5. Kleophas, W., Bieber, B., Robinson, B.M., Duttlinger, J., Fliser, D., Lonnemann, G., Reichel, H., Implementation and first results of a German Chronic Kidney Disease Registry, Clinical Nephrology, 2012.

6. Stell, A.J., Sinnott, R.O., Jiang, J., Enabling Secure, Distributed Collaborations for Adrenal Tumor Research, Proceedings of the International HealthGrid Conference, Paris, France, June 2010.

ABOUT THE AUTHORS
Professor Richard O. Sinnott is the Director of eResearch at the University of Melbourne and Chair of Applied Computing Systems. In these roles he is responsible for all aspects of eResearch (research-oriented IT development) at the University. He has been lead software engineer/architect on an extensive portfolio of national and international projects, with specific focus on those research domains requiring finer-grained access control (security).

Stephan Gloeckner is a PhD candidate at the University of Melbourne. His research is in auditing and quality assurance of data collected in biomedical research settings. His PhD is joint with the University of Birmingham (UK) in Medicine and the University of Melbourne (Computing).

Jemie Effendy is a software engineer within the Melbourne eResearch Group at the University of Melbourne. His research focus is on technologies for big data processing with specific focus on their application to security-oriented data domains.

Anthony Stell is a senior software engineer working within the Melbourne eResearch Group at the University of Melbourne. He was/is the primary software developer for the ENSAT-CANCER VRE. Previously he has worked on a range of other clinical research collaboration platforms including clinical trials systems in the brain trauma domain and disorders of sex development amongst many others.

 


A Science Gateway for Malaria: Successes and challenges
Michelle Barker

James Cook University, Cairns, Australia, [email protected]

Abstract for lightning talk.

DESCRIPTION
The Vector-Borne Disease Network (www.vecnet.org) is a science gateway for the malaria community. VecNet has received more than USD $7 million in funding over 4 years, most of which has been spent on development. VecNet provides free online access to simulation modelling tools and data to increase the use of modelling in policy and funding decisions. Simulation models use existing data to predict malaria intervention outcomes. Their use within a project can reduce costs and save time by demonstrating the likely impacts of interventions before resources are committed. VecNet also provides user-friendly interfaces to data and information storage, and these can be linked to the simulation modelling interfaces. The VecNet tools offer accessible, transparent and comprehensible information and simulation modelling programs that allow users who may not usually use models to ask "what-if" questions, exploring combinations of vector and drug based interventions to determine the optimal mix for use in specific geographic areas.

This talk will offer insights from the VecNet experience into success factors in building science gateways, identifying enablers and challenges in relation to the common tensions experienced by science gateways, as described in the work of Wilkins-Diehr and Lawrence (2010):

1. Funding: Development vs. Operations
2. Project Goals: Research vs. Production
3. Tools: Standardised vs. Open-Source vs. Custom
4. Community Engagement: Delivering What the Users Want
5. Rewards & Recognition: Traditional vs. New

Both Community Engagement and Rewards & Recognition are key focuses of the program’s current phase of development. There will be discussion on the approaches being utilised to engage with different parts of the community and encourage usage, particularly the role of early adopters.

Nancy Wilkins-Diehr and Katherine A. Lawrence. 2010. Opening science gateways to future success: The challenges of gateway sustainability. Gateway Computing Environments Workshop, IEEE. http://users.sdsc.edu/~wilkinsn/GCE10_Wilkins-Diehr_Lawrence.pdf


The Marine Virtual Laboratory – ocean modelling made easy.
Lightning Talk
Uwe Rosebrock

CSIRO Ocean & Atmosphere Flagship, Hobart, [email protected]

DESCRIPTION
Ocean models are routinely applied to predict the past, present, and future state of the ocean to better understand ocean dynamics. Many applications require high-resolution ocean models to produce detailed analysis. For these applications it is common to configure and run regional models, where the domain extends only a few hundred kilometers or less. Historically, regional ocean models have been time-consuming to configure, requiring an expert modeller to configure a grid, carefully setting the spatial extent of the model domain and the model resolution. The modeller then gathers data from multiple sources for the initial set-up and for input during the simulation time, obtains observation data for validation or data assimilation, and finally manipulates the variety of data to match the requirements of the model code he or she applies.

The Australian Marine Virtual Laboratory (MARVL) is a new development in modelling frameworks for researchers in Australia. MARVL makes use of the Java-based control system named TRIKE, which has been under development by the CSIRO for some years. It allows a non-specialist modeller to automate many of the modelling preparation steps, bringing the researcher more quickly to the stage of simulation and analysis.

Currently MARVL is configured for several different hydrodynamic models (MOM4, ROMS, SHOC) and wave models (WaveWatch3, SWAN) and offers initial and boundary conditions from a variety of regional or global ocean and atmospheric models. It furthermore provides bathymetry and masking of the domain where needed, together with the observations available through IMOS.

MARVL has been applied in a number of case studies around Australia ranging in scale from locally confined estuaries to the Tasman Sea between Australia and New Zealand. The underlying infrastructure will be described at a technical level, challenges and opportunities will be highlighted, and an example of its use will be given.


The Australian Climate and Weather Science Virtual Laboratory (CWSLab)

Aurel F. Moise,1 Tim Pugh,1 Martin Dix,2 Bertrand Timbal1

1 Bureau of Meteorology Research and Development, Melbourne, Australia, [email protected]
2 CSIRO Ocean and Atmosphere Flagship, Aspendale, Australia, [email protected]

Aurel Moise

ABSTRACT

The presentation will provide an overview of the second phase of the NeCTAR-funded Australian Climate and Weather Science Laboratory (CWSLab) project to build research infrastructure, services, tools, and repositories for the climate and weather community and the Centre for Australian Weather and Climate Research (CAWCR) at the National Computational Infrastructure (NCI) petascale facility at the Australian National University. The CWSLab is leveraging and integrating existing infrastructure to support an intrinsically complex Earth-System Simulator that allows scientists to simulate and analyze climate and weather phenomena.

During the second Phase the project developed a Virtual Laboratory and Web Portal called “The Climate and Weather Science Laboratory”. This laboratory utilises and integrates the Australian Community Climate Earth-System Simulator (ACCESS) infrastructure to support coupled and uncoupled model simulations of climate and weather phenomena.

Through the proposed integration and enhancements of existing community software such as ACCESS, the laboratory produces an integrated facility for climate and weather process studies in areas such as weather prediction and extreme events, atmosphere-ocean-land-ice interactions, climate variability and change, greenhouse gases, water cycles, and carbon cycles. Additionally, the laboratory provides a facility for the analysis of climate simulations, which will assist in the assessments of Australian climate change and contribute to the future assessment reports of the United Nations Intergovernmental Panel on Climate Change (IPCC).

The virtual laboratory is a community project to establish an integrated national facility for research in climate and weather sciences that complements and leverages the Australian Super Science initiative investments in computational and storage infrastructure at the ANU/NCI facility, and the strong collaboration in place between the Australian National University (nci.org.au), Australian Bureau of Meteorology (www.bom.gov.au), the CSIRO (www.csiro.au/cmar), the Collaboration for Australian Weather and Climate Research (www.cawcr.gov.au), and the Australian Research Council’s Centre of Excellence for Climate System Science (www.climatescience.org.au).

THE SECOND PHASE OF THE CWSLab

Through the proposed integration and enhancements of existing community software such as ACCESS and VisTrails, the second phase of the CWSLab will produce an integrated facility for climate and weather process studies in areas such as weather prediction and extreme events, atmosphere-ocean-land-ice interactions, climate variability and change, greenhouse gases, water cycles, and carbon cycles. In order to showcase this capability, two prototype services have been created: the ACCESS Climate Model Metrics Tool and the Climate Data Analysis Tool.

THE ACCESS CLIMATE MODEL METRICS TOOL

This tool provides products around enhancements to the existing ACCESS CLIMATE MODEL SIMULATION service. A new package of climate model evaluation metrics from the World Climate Research Program (WCRP) is being integrated into the ACCESS modelling environment for routine verification of climate model performance. Furthermore, the runtime environment is being enhanced to capture provenance information and perform model initialisation data processing from meteorological and reanalysis data collections in the RDSI data storage system. Finally, standard ACCESS Model experiments are being developed, with associated documentation and testing, for Transpose AMIP modelling and a high-resolution regional atmospheric model to be used by the science community. “Transpose AMIP modelling” refers to a capability to initialise and run climate models in a manner similar to that for numerical weather prediction, according to certain protocols which facilitate comparison with observations, resulting in an enhanced model evaluation capability.

THE CLIMATE DATA ANALYSIS TOOL

This tool provides enhancements to VisTrails (an open-source system that supports data exploration and visualization) to build a workflow management system that supports the analysis of climate model output data (in this case, statistical downscaling of climate models). The statistical downscaling method (BoM-SDM) is based on weather analogues developed by the Bureau of Meteorology and has been used extensively in recent Australian scientific climate projects (e.g. ACCSP, NRM, SEACI, VicCI). Currently the BoM-SDM uses climate experiments from the most recent global climate model simulations (called CMIP5) and other data sources published internationally or derived locally. The VisTrails package and workflow allow the scientific community to easily access existing statistical downscaling results from CMIP5 climate simulations. In the future the package will support additional downscaling approaches, enabling comparative assessments of downscaling methods and data products.

The workflow provides runtime support for the execution of the downscaling methods based on the user’s selection of variables, climate model and experiment, and the temporal range for a given geographical location. The end product is statistically downscaled information from CMIP5 climate models tailored to the user's needs. Several additional options are provided depending on user interest, such as future greenhouse gas emission scenarios, the number of climate models to be included in the analysis, and a choice of the future time slice of interest. This constitutes the first level of service and delivers data quickly by relying on pre-computed meteorological analogues; a second level will allow users to change parameters within the SDM itself to generate new outputs.
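The user selection described above (variables, climate model and experiment, temporal range, and location) can be pictured as a small request record that is checked before any workflow runs. The following is a minimal Python sketch only; the class name `DownscalingRequest`, its fields, and the example values are hypothetical illustrations and are not taken from the CWSLab or VisTrails software.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DownscalingRequest:
    """Hypothetical record of one user's selection (not the CWSLab API)."""
    variables: Tuple[str, ...]   # e.g. ("rainfall",)
    model: str                   # name of a CMIP5 climate model
    experiment: str              # name of a CMIP5 experiment or scenario
    start_year: int              # first year of the temporal range
    end_year: int                # last year of the temporal range (inclusive)
    latitude: float              # degrees north
    longitude: float             # degrees east

    def validate(self) -> None:
        """Raise ValueError if the selection is internally inconsistent."""
        if not self.variables:
            raise ValueError("at least one variable must be selected")
        if self.start_year > self.end_year:
            raise ValueError("start_year must not be after end_year")
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude must be within [-90, 90]")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude must be within [-180, 180]")


if __name__ == "__main__":
    # Illustrative values only.
    request = DownscalingRequest(
        variables=("rainfall",),
        model="ACCESS1-0",
        experiment="rcp85",
        start_year=2040,
        end_year=2059,
        latitude=-37.8,
        longitude=145.0,
    )
    request.validate()
    print(request)
```

Validating the selection up front keeps an inconsistent request (for example, a reversed year range) from reaching the downscaling step at all.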


ABOUT THE AUTHOR(S)

Dr. Aurel Moise
Aurel Moise is a Senior Research Scientist at the Bureau of Meteorology Research and Development section. As a climate scientist, he has led several projects around the analysis of global and regional climate simulations as part of the Australian Climate Change Science Program. Since early 2015 he has been the Project Leader of the NeCTAR/CWSLab project, a new community project to establish an integrated national facility for research in climate and weather sciences that complements and leverages the Australian Super Science initiative investments in computational and storage infrastructure at the ANU/NCI facility, and the strong collaboration in place between the Australian National University, Australian Bureau of Meteorology, the CSIRO, the Collaboration for Australian Weather and Climate Research, and the Australian Research Council’s Centre of Excellence for Climate System Science.

Tim Pugh
Tim is a scientific programmer and Senior HPC Scientist with the Bureau of Meteorology specializing in computational fluid dynamics, parallel computing and application development, and internet-based information technology and data services. He is currently the Supercomputer Programme Director for the new Supercomputing Facility at the Bureau of Meteorology.

Dr. Martin Dix
Martin Dix is the leader of the ACCESS Climate Model Systems team within the Earth System Modelling Program of CSIRO. Prior to this he worked on the development and analysis of the CSIRO global climate models and on the regional model (CCAM) development. His research interests range from climate sensitivity to computational techniques. Dr. Dix is the Leader for the work creating the ACCESS CLIMATE MODEL METRICS TOOL.

Dr. Bertrand Timbal
Dr. Bertrand Timbal has worked in the climate change research area since the early 1990s. He is currently a Research Scientist within the Bureau of Meteorology Research and Development section. His research aims to develop techniques to translate climate change information from climate models to smaller scales in order to provide useful information for climate change impact studies as well as detection and attribution of on-going observed changes. Dr. Timbal has published about 100 peer-reviewed, publicly available papers, many focusing on understanding rainfall variability on all timescales across South Eastern Australia. He was a theme leader during the South Eastern Australia Climate Initiative (SEACI) and is now a project leader in the Victoria Climate Initiative (VicCI). Dr. Timbal is the Leader for the work creating the CLIMATE DATA ANALYSIS TOOL.


An  Integrated  Sensor  Network  and  Research  Data  Management  System  for  the  Daintree  Rainforest  Observatory      

Ian  M.  Atkinson,1  Jeremy  Vanderwal,1  Daniel  Baird,1  Andrew  Krockenberger,1  Nigel  Bajima,1  Scott  Mills,1    Nigel  G.  Sim2  

1  James  Cook  University,  Townsville,  Australia,  [email protected]  2  CoastalCOMS  Ltd.,  Gold  Coast,  Australia,  [email protected]  

Ian  Atkinson  

OVERVIEW
A fully integrated environmental monitoring network has been established in the Daintree Rainforest that combines over 400 fixed and moveable sensors, high and low data rates with full metadata description, linkage to Research Data Australia, storage in the RDS infrastructure and a range of portal interfaces for use by field technicians, researchers and the general public. This environment will also be adapted for an island research station.

INTRODUCTION
The Daintree Rainforest Observatory, or DRO, is a premier ecological monitoring site located in lowland tropical rainforest around 140km north of the city of Cairns in North Queensland, Australia. The Daintree rainforest has the highest biodiversity anywhere in Australia and offers access to unique Gondwanan flora. In 1988 the rainforests in which the DRO is situated were declared the Wet Tropics World Heritage Area. This is one of the few areas in the world where rainforest directly meets shorelines with coral reef, and is unique in having two World Heritage Areas sit side by side. The DRO site is flanked to the west by coastal ranges rising to more than 1400m (4600ft) and by the Coral Sea to the east.

The DRO is a 20 ha site that started life as the Australian Canopy Crane, constructed in 1998, and has been collecting long-term observational records as well as conducting explicit research experiments since that time.[1] An AU$10M expansion to the observatory from 2013 to 2015 enhanced the laboratory facilities, eco-sensing technologies and amenities of the DRO, and it is now a world-class research and education facility. However, the remote location and terrain mean that, for now, only low-speed, low-quota Internet access is available from the DRO site, so creative and dedicated cyberinfrastructure is required. In order to support the long-term monitoring and specific experiments conducted at the DRO, a range of eco-sensing instruments have been deployed across the site, with specific emphasis on the areas beneath the coverage of the 47m canopy crane, which sweeps an area of ~1 ha.

In order to provide access to the instruments, sensors and existing data collections of the DRO, as well as simple in-field maintainability, local and remote data mirroring, and data collection description and discovery from Research Data Australia (RDA; http://www.rda.edu.au), an end-to-end data management environment has been developed: the DRO-DMS. This same core data system also facilitates access by school groups and supports public education and outreach activities.

DRO SENSOR NETWORKS
A wide variety of sensors have been deployed on the DRO site, and these are constantly being extended and upgraded. The current deployments are summarized in Table 1 below and range from very low data rate devices through to streaming HD cameras. Although the main study site is remote from the main laboratory, we have interconnected the major locations via armored fiber-optic cable to ensure that the local Wi-Fi and XBee wireless networks remain uncongested. The low light of the rainforest floor has led to the use of innovative power solutions as well as the development of very-low-power electronics to conserve power wherever possible. In addition to automated sensor collection, manual data collection is supported by bespoke mobile data input systems that synchronize recorded information directly into the core DRO Data Management System. Currently there are ~400 sensors in active duty.

eResearch Australasia Conference | Brisbane – Australia | 19 – 23 October – 2015

Table 1: Sensors Currently Deployed on DRO Site

Sensor Type                    | Number | Data Type              | Frequency                         | Data Volume p.a.
-------------------------------|--------|------------------------|-----------------------------------|-----------------
HD camera                      | 6      | HD video/stills        | 2 min video/hr; still image/5 min | ~4000 GB
Tree Dendrometer (ICT Systems) | 60     | Numeric (uncalibrated) | 15 min                            | 2 GB
Tree Sapflow (ICT Systems)     | 60     | Numeric (uncalibrated) | 15 min                            | 2 GB
Temp./Rel. Humidity            | 240    | Numeric                | 5 min                             | 100 GB
Soil Moisture pits             | 10     | Numeric                | 15 min                            | 1 GB
Meteorological                 | 3      | Numeric                | 5 min / 200 Hz                    | 500 GB
Leaf traps                     | 20     | Manual count           | Weekly                            | 1 GB
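As a quick consistency check, the figures in Table 1 can be tallied. The short Python sketch below only transcribes the table rows; it assumes that the temperature/humidity entry is in gigabytes like the other rows, and the variable names are illustrative.

```python
# Rows transcribed from Table 1: (sensor type, count, approximate GB per year).
# Assumption: every volume is in gigabytes, including the temp./humidity row.
TABLE_1 = [
    ("HD camera", 6, 4000),
    ("Tree dendrometer", 60, 2),
    ("Tree sapflow", 60, 2),
    ("Temp./rel. humidity", 240, 100),
    ("Soil moisture pits", 10, 1),
    ("Meteorological", 3, 500),
    ("Leaf traps", 20, 1),
]

total_sensors = sum(count for _, count, _ in TABLE_1)
total_gb = sum(gb for _, _, gb in TABLE_1)
camera_share = TABLE_1[0][2] / total_gb  # fraction of volume from HD cameras

print(f"{total_sensors} sensors, ~{total_gb} GB per year")
print(f"HD cameras account for {camera_share:.0%} of the volume")
```

The tally gives 399 devices, in line with the ~400 sensors quoted in the text, and roughly 4.6 TB per year, most of it from the six HD cameras.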

In addition, sensor streams related to isotopic stream water composition, light intensity, high-frequency flux networks, building performance and LIMS data are being integrated into the core platform. Operational maintenance of sensors in a rainforest is a complex and demanding activity, and it is essential that the data ingestion system can either transparently account for the changing and recalibration of sensors, or allow calibration, additions and relocation of devices to be configured easily and without error in the field.

DRO-DATA MANAGEMENT SYSTEM
The DRO-DMS has at its core an instance of the CoastalCOMS platform.[2] This is an integrated Digital Asset Management environment optimized for streaming data acquisition, video analytics, event detection and metadata management. The operation of the DRO-DMS is controlled via a web interface. Easy-to-configure ingestors ensure that data from any sensor type can be accounted for, and data calibration and post-analysis can be triggered on the built-in event/analytics service. An important feature is that all devices are described from the outset with sufficient metadata to generate an RDA-compliant metadata record, so all data generated is discoverable via RDA.

Because of the poor internet access from the DRO site, data stored in the DRO-DMS is physically transferred on a weekly cycle to JCU Cairns, where it is ingested into the mainline JCU DRO-DMS. There it operates as part of the Tropical Data Hub service[3] and can be served more widely. The DMS seamlessly ingests and indexes new records and accounts for any duplicate data records. Data can be downloaded from the DRO-DMS web interface for further analysis or visualized within the portal. An innovative Minecraft-like data visualization and access tool was also developed.
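The abstract does not say how duplicate records are recognised during the weekly ingest. One common approach, shown here purely as a sketch, is to key each reading on its sensor identifier and timestamp so that re-ingesting an overlapping transfer is harmless. The `Reading` and `Store` names, the key choice and the sample values are all assumptions made for illustration, not part of the DRO-DMS.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Reading:
    sensor_id: str   # hypothetical device identifier
    timestamp: str   # ISO 8601 time of the observation
    value: float


@dataclass
class Store:
    """Toy stand-in for the mainline store; keeps one record per key."""
    records: dict = field(default_factory=dict)

    def ingest(self, batch):
        """Add unseen readings and return how many were new."""
        added = 0
        for reading in batch:
            key = (reading.sensor_id, reading.timestamp)  # assumed identity
            if key not in self.records:
                self.records[key] = reading
                added += 1
        return added


store = Store()
week_1 = [Reading("dendro-01", "2015-06-01T00:00:00", 1.20),
          Reading("dendro-01", "2015-06-01T00:15:00", 1.21)]
# The second transfer overlaps the first by one record.
week_2 = [Reading("dendro-01", "2015-06-01T00:15:00", 1.21),
          Reading("dendro-01", "2015-06-01T00:30:00", 1.23)]

first = store.ingest(week_1)
second = store.ingest(week_2)
print(first, second, len(store.records))
```

With this keying, a transfer can be replayed in full after a failed copy without creating duplicates, which suits a workflow built around physically carried media.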

Figure 1: DRO-DMS

REFERENCES
1. Active DRO research projects. Available from https://research.jcu.edu.au/dro/research/research-projects, accessed 8 June 2015.
2. CoastalCOMS core platform. Available from http://www.envirocoms.com.au/, accessed 8 June 2015.
3. Tropical Data Hub. Available from http://tropicaldatahub.org.au, accessed 8 June 2015.

ABOUT THE AUTHORS
Ian Atkinson is a Director of the eResearch Centre at James Cook University. His PhD studies were in chemical physics, but nearly 20 years ago he moved from experimental science into computational chemistry and high-performance computing. He has a long-standing interest in eResearch methods, tools, scientific data management and user interfaces for HPC tools. He is also actively involved in researching how new systems and software connect the physical and virtual worlds, particularly focusing on environmental monitoring with sensor networks. This work includes the development of the Tropical Data Hub, involvement in “The Digital Homestead” to evaluate how modern Information and Communication Technologies (ICT) such as wireless sensor networks (WSNs), data analytics and rural connectivity could support greater profitability for the Northern beef industry, and a range of ‘reef-to-rainforest’ biodiversity, climate and other environmental monitoring projects.

A/Prof. Jeremy VanDerWal is a spatial ecologist, a Senior Research Fellow at the Centre for Tropical Biodiversity and Climate Change, and the Deputy Director of the eResearch Centre at James Cook University. His research is focussed on assessing the potential impacts of past, present and future climate on the distribution and abundance of species. Much of his research explores ecological theories with applied aspects. Dr VanDerWal is interested in ensuring that science is not just ‘theoretical’ but rather is used to engage and inform a wide variety of end users.

Daniel Baird is a software engineer and user experience designer at the eResearch Centre at James Cook University. He holds degrees in psychology and computer science and uses both to create data engagement experiences at the eResearch Centre. In the past Daniel has perpetrated software in C++, Delphi, Java and Swing, PHP, Microsoft Access, Crystal Reports, ColdFusion, Ruby, and web technologies across the retail, higher education and corporate sectors, and co-founded the wiki hosting site http://tiddlyspot.com.

David Beitey is the online technologies manager for the eResearch Centre at JCU. Working closely with researchers and other groups alike, David is extremely passionate about free and open source software and all aspects of IT security. David performs development and operations for the majority of eResearch services, where most code produced is open-sourced. He also provides support for services like JCU’s research portfolio and research bait, and a variety of other IT-related tasks, such as operating the Vislab 3D visualisation room. His work takes him far and wide, delivering a high-quality end-to-end service for researchers inside and outside of JCU.

Andrew Krockenberger
Nigel Bajima
Scott Mills
Nigel G. Sim


Development of cloud-based virtual desktop environment for synthesis and analysis for ecosystem science community

Siddeswara Guru1, Hoang Anh Nguyen2, Shilo Banihit3, Matthew Mulholland3, Kim Olsson3, Tim Clancy1
1 Terrestrial Ecosystem Research Network, University of Queensland, St Lucia, Australia, [email protected], [email protected]
2 Research Computing Centre, University of Queensland, St Lucia, Australia, [email protected]
3 Queensland Cyber Infrastructure Foundation, University of Queensland, St Lucia, Australia

Siddeswara  Guru  

INTRODUCTION
Current scientific experiments are becoming increasingly complicated. These experiments often consist of multiple models, analytical tools and data stores. Furthermore, the computational components of an experiment can be executed in parallel and/or on a distributed computing infrastructure. This makes the creation and execution of these experiments challenging. Because of the vast variety of software and processes used in scientific experiments, it is also a challenge to capture all the procedures and processes used in every step of an experiment so that the experiment is transferable, shareable and reproducible.

Scientific workflow technology has become popular by providing a high-level environment that can automate, manage and execute the various steps in scientific research, with the ability to store and track provenance information. Scientific workflows provide a powerful unifying platform that allows scientists to build arbitrarily complicated applications by combining predefined components [1], which may be implemented in different programming languages. Once a workflow is built, it can be re-used and re-executed with minimal effort. These intrinsic capabilities of a workflow system with provenance tracking functionality improve the reproducibility of experiments and encourage the sharing of experimental processes and results. Workflow systems offer a broad range of components that perform tasks ranging from acquiring data from sensors, querying databases, data mining and visualization through to the execution of arbitrary applications. Workflows are widely used in most data-intensive scientific domains.
These large-scale experiments often require distributed computing resources for computation and storage. In the Australian context, researchers most often use the national research platforms, National eResearch Collaboration Tools and Resources (NeCTAR) and the Research Data Storage Infrastructure (RDSI), to obtain cloud compute and storage resources. However, both RDSI and NeCTAR provision resources under an Infrastructure as a Service (IaaS) model. Scientists need to apply for resources, build and maintain a platform, and run experiments on that platform. This requires significant system administration skills, which domain scientists most often do not have. There is therefore a need for an alternative approach in which researchers get access to distributed computing infrastructure that is not cumbersome to set up, access and manage.

In this paper, we present the development of the Collaborative Environment for Ecosystem Science Research and Analysis (CoESRA), which provides a web-based virtual desktop environment that supports analysis and synthesis using scientific workflows. CoESRA is built on NeCTAR and RDSI cloud infrastructure, which will enable users to access a ready-to-use Linux desktop environment through a web browser. One of the motivations for the development of CoESRA is to provide scientists with a desktop environment that can leverage analysis tools and distributed computing infrastructure to perform data- and compute-intensive experiments. The desktop environment will be made accessible from a web browser to lower any impediments to accessing and using it.

SYSTEM OVERVIEW
CoESRA is a Web-enabled virtual desktop environment running on cloud infrastructure. A user can access a virtual desktop environment and use it to build, execute and share workflow-based scientific analysis and synthesis activities. The high-level CoESRA system architecture is shown in Figure 1. The system has the following functionalities:

• User registration and creation of user accounts in the system,
• Creation of, and access to, virtual desktops that provide tools such as Kepler, RStudio, Python and Nimrod,
• Access to storage,
• Management of user access to virtual desktops,
• Management of virtual desktop instances,
• Ability to publish a workflow as a service record in Australian National Data Service (ANDS) Research Data Australia (RDA).

 

   

Figure  1:  Architecture  of  CoESRA  System    

SYSTEM  OPERATION  

User Registration and System Access
A user needs to register with the system to access a virtual desktop (VD). The system supports the Australian Access Federation (AAF) as a login mechanism and LDAP for internal authentication. Once a user registers with the system, a user-provisioning process starts, which includes creating an LDAP entry for the user, creating a home folder for the user on an NFS server (cloud-based storage) and updating the user information in a database. Once all these steps have been performed successfully, the user receives a system-generated email confirming that the registration process is complete.

When a registered user logs in to the system using AAF, a VD session can be requested from the CoESRA system. The DaaS Admin Service (Figure 1) looks for a free VD in the pool and assigns that VD to the requesting user. Once a VD is occupied, it cannot be assigned again until the user releases it back to the pool.
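The assignment rule described above (take a free VD from the pool, hold it until it is released) can be sketched in a few lines of Python. This is an illustration only: the `DesktopPool` class, its method names and the one-session-per-user behaviour are assumptions, not the DaaS Admin Service implementation.

```python
class DesktopPool:
    """Toy model of the free/occupied bookkeeping described in the text."""

    def __init__(self, desktop_ids):
        self.free = list(desktop_ids)  # VDs waiting in the pool
        self.assigned = {}             # user -> VD currently held

    def request(self, user):
        """Give the user a free VD, or None when the pool is empty."""
        if user in self.assigned:
            return self.assigned[user]  # assumption: one session per user
        if not self.free:
            return None
        vd = self.free.pop(0)
        self.assigned[user] = vd
        return vd

    def release(self, user):
        """Return the user's VD to the pool so it can be reassigned."""
        vd = self.assigned.pop(user, None)
        if vd is not None:
            self.free.append(vd)
        return vd


pool = DesktopPool(["vd-1", "vd-2"])
a = pool.request("alice")  # "vd-1"
b = pool.request("bob")    # "vd-2"
c = pool.request("carol")  # None: every VD is occupied
pool.release("alice")
d = pool.request("carol")  # "vd-1" again, now that it is free
```

Returning `None` when the pool is empty is a simplification; a real service would more likely queue the request until a desktop becomes available.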

Virtual Desktop Environment
The VD runs on CentOS 6.5 and is pre-configured with the Kepler scientific workflow system and related modules, the programming languages R and Python, the RStudio environment and the distributed computing tool Nimrod [2]. Kepler is used for its strong support in the ecology domain and its large number of reusable components. A VD is accessible via a web browser using Guacamole and the Remote Desktop Protocol. To prevent users from keeping VDs indefinitely, a VD is assigned to the requesting user for a fixed duration (48 hours), and an email is sent 6 hours before the session expires. A user can request a session extension by emailing the CoESRA system administrator. Otherwise, the session is terminated and the VD is released back to the pool.

A VD has access to common storage for sharing workflows and data, which promotes an informal way of sharing and collaboration. A new feature has been added to the Kepler graphical user interface to create a RIF-CS service record for workflows to make them discoverable from ANDS-RDA. This provides additional functionality to share workflows on MyExperiment.org as well as on ANDS-RDA. All virtual desktops are mapped to a centralised provenance database that stores, and allows querying of, the history of workflow runs performed by users. Users also have the flexibility to distribute jobs using Nimrod on a dedicated cluster as well as locally in a VD.
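The session timing described above (a 48-hour lease with a reminder email 6 hours before expiry) reduces to simple date arithmetic. The following Python sketch shows one way to compute it; the function names and state labels are hypothetical, and only the two durations come from the text.

```python
from datetime import datetime, timedelta

SESSION_LENGTH = timedelta(hours=48)  # fixed duration stated in the text
REMINDER_LEAD = timedelta(hours=6)    # email goes out this long before expiry


def session_schedule(assigned_at):
    """Return (reminder_time, expiry_time) for a session starting at assigned_at."""
    expires_at = assigned_at + SESSION_LENGTH
    return expires_at - REMINDER_LEAD, expires_at


def session_state(assigned_at, now):
    """Classify a session as 'active', 'reminder_due' or 'expired'."""
    reminder_at, expires_at = session_schedule(assigned_at)
    if now >= expires_at:
        return "expired"
    if now >= reminder_at:
        return "reminder_due"
    return "active"


start = datetime(2015, 10, 19, 9, 0)
reminder_at, expires_at = session_schedule(start)
print(reminder_at, expires_at)
```

For a desktop assigned at 09:00 on 19 October, the reminder falls at 03:00 on 21 October and the session ends at 09:00 the same day. Extensions granted by the administrator are not modelled here.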

Virtual Desktop Pool
All the free VDs in CoESRA are kept in a Virtual Desktop Pool (VDP). This pool is resizable depending on the number of VD requests. The load balancer (Figure 1) keeps the number of free VDs within a certain range. As soon as the number of free VDs falls below the “minimum threshold”, the load balancer launches virtual machines to create VDs so that the number of free VDs is back within the range. Similarly, the load balancer deletes VDs if the number of free VDs is higher than the “maximum threshold”.
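The threshold rule above can be expressed as a single function that reports how many VDs to launch or delete. This Python sketch assumes the balancer moves the free count only as far as the nearest threshold; the abstract does not give the actual target or threshold values, so the function name and the numbers in the example are illustrative.

```python
def rebalance(free_count, minimum, maximum):
    """Return the change in VD count needed to bring free_count into range.

    A positive result means launch that many VDs, a negative result means
    delete that many, and zero means the pool is already within range.
    """
    if minimum > maximum:
        raise ValueError("minimum threshold must not exceed maximum threshold")
    if free_count < minimum:
        return minimum - free_count  # top up to the minimum threshold
    if free_count > maximum:
        return maximum - free_count  # trim down to the maximum threshold
    return 0


# Hypothetical thresholds: keep between 3 and 6 free desktops.
print(rebalance(1, 3, 6))  # launch 2
print(rebalance(9, 3, 6))  # delete 3
print(rebalance(4, 3, 6))  # nothing to do
```

Keeping a gap between the two thresholds avoids launching and deleting virtual machines in quick succession when demand hovers around a single value.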

REFERENCES
[1] B. Ludäscher, I. Altintas, C. Berkley, D. Higgins, E. Jaeger, M. Jones, E. A. Lee, J. Tao, and Y. Zhao, "Scientific workflow management and the Kepler system," Concurrency and Computation: Practice and Experience, vol. 18, pp. 1039-1065, 2006.
[2] D. Abramson, C. Enticott, and I. Altintas, "Nimrod/K: Towards massively parallel dynamic Grid workflows," in High Performance Computing, Networking, Storage and Analysis, 2008. SC 2008. International Conference for, 2008, pp. 1-11.


 

ABOUT THE AUTHORS
Siddeswara Guru is a Data Integration and Synthesis Coordinator for the Terrestrial Ecosystem Research Network. His research area is scientific data management. He has a PhD from Melbourne University and an MBA from the University of Tasmania. Previously, he worked at CSIRO and the Australian Ocean Data Network development office.

Mr. Hoang Anh Nguyen is a systems programmer at the Research Computing Centre, the University of Queensland. He received a Bachelor's degree with Honours in Software Engineering from Monash University. He is completing his PhD, which aims to design new ways to interact with Kepler workflows running behind a science gateway.

Prof. Tim Clancy is a Director of the Terrestrial Ecosystem Research Network. Prior to this, he managed the Forest Resources Management Section of the Australian Bureau of Agriculture and Resource Economics and Sciences (ABARES) and led the organisation's Land and Forests Theme. Among his responsibilities was reporting on national forest, land use, land management and vegetation data.


eResearch Australasia Conference | Melbourne – Australia | 27 – 31 October – 2014

The WorkWays Problem Solving Environment
David Abramson, Hoang Nguyen

University  of  Queensland    

Science   gateways   allow   computational   scientists   to   interact  with   a   complex  mix   of  mathematical  models,  software   tools   and   techniques,   and   high   performance   computers.   Accordingly,   various   groups   have   built  high-­‐level  problem-­‐solving  environments   that  allow  these   to  be  mixed   freely.   In   this   talk,  we   introduce  an  interactive   workflow-­‐based   science   gateway,   called   WorkWays.   WorkWays   integrates   different   domain  specific  tools,  and  at  the  same  time  is  flexible  enough  to  support  user  input,  so  that  users  can  monitor  and  steer  simulations  as  they  execute.  A  benchmark  design  experiment  is  used  to  demonstrate  WorkWays.  

ABOUT THE AUTHOR(S)
Professor David Abramson has been involved in computer architecture and high performance computing research since 1979. He has held appointments at Griffith University, CSIRO, RMIT and Monash University. Most recently at Monash he was the Director of the Monash e-Education Centre, Deputy Director of the Monash e-Research Centre and a Professor of Computer Science in the Faculty of Information Technology. He held an Australian Research Council Professorial Fellowship from 2007 to 2011. He has worked on a variety of HPC middleware components including the Nimrod family of tools and the Guard relative debugger.

Professor Abramson is currently the Director of the Research Computing Centre at the University of Queensland. He is a fellow of the Association for Computing Machinery (ACM) and the Australian Academy of Technological Sciences and Engineering (ATSE), and a Senior Member of the IEEE.

Mr Nguyen is a PhD student in the School of Information Technology and Electrical Engineering at the University of Queensland.

 


eResearch Australasia Conference | Brisbane – Australia | 19 – 23 October – 2015

Developing Science Gateways: Current Solutions and Future Challenges

Sandra Gesing

University of Notre Dame, Notre Dame, USA, [email protected]

CURRENT SOLUTIONS
In the last 10 years, research on science gateways has grown extensively and the usage of science gateways has increased substantially. This is evident in publications such as special issues on science gateways, and in statistics: providers of distributed infrastructures reported last year that, for the first time in history, their resources had been used more via science gateways than via the command line. Quite a few mature and widely used science gateway frameworks (e.g., Galaxy, WS-PGRADE) and APIs (Apache Airavata, Agave) have evolved, which provide developers with building blocks for efficiently implementing science gateways. They address the challenges that developers face with each science gateway, from intuitive user interfaces through security features to distributed job, data and workflow management.

FUTURE CHALLENGES
The pace of novel developments in web-based technologies and agile web frameworks is steadily increasing, as is the flexibility of utilizing the Internet. To be able to support and integrate such novel developments, science gateway framework solutions need a short release cycle, enabled by a modular architecture and concepts such as micro-services. The underlying distributed infrastructures are also evolving further, with cloud technologies, with lightweight containers like Docker and with cutting-edge accelerated architectures on the hardware side. Future approaches will include not only the extension of science gateways to such new IT technologies but also the integration of data sources in labs and telescopes such as the Square Kilometre Array (SKA), which will create data rates at exascale. In particular, the sheer amount of data, whether created via computational methods or in labs, will demand new capabilities from science gateway frameworks and APIs.

The lightning talk will give a brief overview of existing solutions and conclude with a discussion of future challenges.