Integrative Biology BOF - Usable Systems in the Global Environment All Hands 2006
July 2006
Integrative Biology 1
Integrative Biology
BOF - Usable Systems in the Global Environment
All Hands 2006
Thursday 21st September
Agenda
• What is Integrative Biology? – a quick recap!
• Who are the IB users?
• Challenges in developing solutions for a diverse community
• The IB technology to date
Integrative Biology - Project Rationale
To leverage the global Grid infrastructure to build an international
“collaboratory” which places the applications scientist “within” the
Grid allowing fully integrated and collaborative use of:
• HPC resources (capacity and capability)
• Computational steering, performance control and visualisation
• Storage and data-mining of very large data sets
• Easy incorporation of experimental data
• User- and science-friendly access
=> Predictive in-silico models to guide experiment and,
ultimately, design of novel drugs and treatment regimes
What are our objectives?
Hide the complexity from the users through the use of an IB portal or client
[Diagram: global users reach the following resources through Integrative Biology: NGS RAL, NGS Ox, NGS Man and NGS Leeds (compute); Atlas DataStore; HPCx (parallel optimal codes); UCL Altix test machine; your own cluster; EU Grids; Teragrid]
The Integrative Biology Scientific Users
Typically…
Degree and/or postgraduate qualification in industrial engineering, maths, biology or physiology.
Computing skills developed over time to allow them to develop models. Not computer scientists. Not grid savvy.
Keen to use and adapt other scientists' work.
Based in Oxford, Nottingham, Birmingham, Auckland, Tulane, Washington Lee, Calgary, Baltimore, Sheffield, Utrecht, Graz…
Determining requirements
• Evolving users with disparate needs; identify current pains
• Evolving knowledge driving new requirements
• Don’t know what they want until they see and refine it
• Grid is not something they want to know about; careful consideration of language is needed
• Initial interviews assessed the "as is" position, constraints and security requirements for competitive research
• Concept of collaboration varied
• Do they need a grid? Exploratory journey for users
Key problems identified
• Data management is problematic: too much is generated, and tying information together is an art
• Current simulations tie up desktops for many hours
• Visualising results on the desktop is limited by local facilities and ad hoc development of suitable tools
• Research is sensitive: the concept of an experiment applies either to an individual or to a collaborative group
• Laptop to HPC migration is, for most users, a huge leap, not a small step
• Collaboration and communication require tools, e.g. Oxford/Tulane
• Cannot exclude members of the scientific community who have not progressed to computational models (digital pens)
The ‘collaboratory’ - What have we developed?
• Facilities for submission of compute jobs to NGS and HPCx via portal, command line or Matlab. Extension to users' own clusters is in development
• Comprehensive data management and metadata management facilities, including federation of catalogues with Auckland and the UK
• Advanced visualisation techniques, including movie generation, utilising Meshalyzer and Cool Graphics to date. Major revamp of these facilities due in the next 12-18 months for remote geometry generation and steering
• Phase space exploration for multi-variable visualisation in Leeds
• A new VRE project developing usable interfaces to a digital research domain for IB through proofs of concept. Also exploring the digital world through a trial of digital pens for life scientists.
Job submission and management via the IB Portal
Users are able to select the compute resource to be used, manage their data in their own SRB space and set up and manage their experiments through a metadata editor. Users can link files and simulation information to created studies, thus simplifying the process of managing their scientific information.
This portal allows users to submit their jobs to these compute facilities, monitor their progress and automatically pull input files from, and store results in, the project's secure repository, the ‘Storage Resource Broker’ (SRB).
Data Management and the Metadata editor
The data storage facility allows users to store any associated user files including input files, codes and output results. Provenance data is automatically captured from a simulation run and stored alongside the results for later use. These facilities are designed to offer large scale secure facilities for the individual researcher as well as those interested in working more collaboratively with colleagues through the ability to share information.
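The slides do not say how provenance is captured, so the following is only a minimal sketch of the general idea of storing a provenance record alongside a result file. Everything in it is an assumption for illustration: the function names, the record fields and the ".provenance.json" sidecar convention are hypothetical, and it writes to the local file system rather than to SRB.

```python
import hashlib
import json
import platform
from datetime import datetime, timezone
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def write_provenance(result_file: Path, input_files: list[Path], parameters: dict) -> Path:
    """Write a provenance record next to a result file and return its path.

    Hypothetical helper: the record layout is illustrative only and is
    not the format used by the IB project.
    """
    record = {
        "result": result_file.name,
        "result_sha256": file_sha256(result_file),
        # Checksums tie the result back to the exact inputs that produced it.
        "inputs": [{"name": p.name, "sha256": file_sha256(p)} for p in input_files],
        "parameters": parameters,
        "host": platform.node(),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    # Sidecar file lives in the same directory as the result.
    sidecar = result_file.parent / (result_file.name + ".provenance.json")
    sidecar.write_text(json.dumps(record, indent=2, sort_keys=True))
    return sidecar
```

A job wrapper could call write_provenance once the simulation finishes, before results are uploaded, so that the record travels with the output and can later be matched to a study in a metadata catalogue.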
Visualization
IB Tools: Cool Graphics/Meshalyzer (developed by Dr. J. Eason and Dr. E. Vigmond)
Link to SRB and NGS
Issues
Can only be done on local machine – problem for low bandwidth users
… hence revised architecture, planned over next 12-18 months
Usable Solutions or lead weight?
• Early releases have required tame users to deal with less elegant means of submitting and managing jobs
• Constrained by infrastructure and agility of change
• VRE project aims to pull together multifaceted aspects
• Generic tools versus bespoke prototypes for selected groups e.g. Washington Lee parameter sweep
• Benefits for scientists have outweighed pains (certificates, varied rules re job queues, libraries and licensing) but
• Far from ideal solution…. Constraints still exist (bandwidth, monitoring, security)
Scientific users are customers of technology
…. But the technology team are users of provided infrastructure…
– NGS
– HPCx
– (CSAR)
– SRB
– 3rd party tools …..
Benefits and challenges for users
• Benefits
– Access to powerful compute resources,
– Access to vast file store facility,
– Prompt, efficient support structures.
– New science evolving and publication rate for scientists faster!
• Challenges
– Need to apply for and manage certificates
– Code development for optimal use of facilities still a challenge
– Legacy code hurdle
Summary
• Integrative Biology has had to act as a bridge as well as a provider of interfaces and services
• Starting small and iterating with users patient enough to stick with it has enabled both teams to progress
• Security comes at a price
• Usable or tolerable?
…. But we have managed to increase publications for the user community!