Dr Richard Sinnott Dr Dave Berry 5 th February 2004 National e-Science Centre Local Developments Technical Director National e-Science Centre ||| Deputy Director (Technical) Bioinformatics Research Centre University of Glasgow [email protected] Research Manager National e-Science Centre University of Edinburgh [email protected]

Page 1:

National e-Science Centre Local Developments

5th February 2004

Dr Richard Sinnott
Technical Director, National e-Science Centre; Deputy Director (Technical), Bioinformatics Research Centre
University of Glasgow
[email protected]

Dr Dave Berry
Research Manager, National e-Science Centre
University of Edinburgh
[email protected]

Page 2:

Overview

• NeSC Role in UK e-Science
• NeSC Edinburgh developments
  – e-Science Institute
  – Infrastructure/set-up
  – Projects
  – Plans
• NeSC Glasgow developments
  – Infrastructure/set-up
  – Projects
  – Plans
• Conclusions

Page 3:

NeSC’s Role

• Help coordinate and lead the UK e-Science Programme
  – Community building activities, regional support & outreach
  – Grid building as a member of the Engineering Task Force
  – Skill building through training events & support centre
• Help establish the UK’s international role
  – International meetings, standardisation work & presentations
• Undertake R&D projects
  – To deliver reliable middleware
  – To engage industry
  – To stimulate the uptake of e-Science technology and methods
• Run the e-Science Institute
  – Knowledge building through workshops and conferences
  – Research visitors and events

Page 4:

NeSC at Edinburgh: Recent Developments

• Globus Alliance
• Digital Curation Centre
  – Edinburgh, Glasgow, UKOLN, CCLRC
• New e-Science Lecturer (Particle Physics)
• Training Team
  – PPARC and EGEE funding
  – Manager + 4 trainers
  – Europe-wide role
• DAI Two (Extension of OGSA-DAI)
• OGSA Test Grid

Page 5:

Digital Curation Centre

[Diagram. Recoverable labels: Industry; research collaborators; standards bodies; testbeds & tools; communities of practice: users; community support & outreach; research; development; services; management & co-ordination; curation organisations; Collaborative Associates Network of Data Organisations]

Page 6:

e-Science Institute

• A meeting place
  – The focus for presenting UK e-Science
• Visiting researchers
  – Collaborate in our research and development
  – Engage in and develop our event programme
  – Build bridges with their community
  – Visits last between one week and six months
• Research-oriented event programme
  – e-Science research topics
  – Training to e-Science research teams

Page 7:

eSI Workshops

• Space for real work
• Crossing communities
• Creativity: new strategies and solutions
• Written reports

• Scientific Data Mining, Integration and Visualisation
• Grid Information Systems
• Portals and Portlets
• Virtual Observatory as a Data Grid
• Imaging, Medical Analysis and Grid Environments
• Open Issues in Grid Scheduling
• Data Provenance & Annotation
• e-Science Workflow Services
• GeoSciences & Scottish Bioinformatics Forum

http://www.nesc.ac.uk/events/

Suggestions always welcome!

Page 8:

Projects

• OGSA-DAI/DAIT, MS.NETGrid, SunDCG, GridWeaver, BRIDGES, PGPGrid, FirstDIG, ODD-Genes
• EGEE, NextGrid
• OGSA Test Grid, IBM Early Evaluation
• edikt
• Publishing Scientific Data
• GridPP, AstroGrid, QCDGrid, RealityGrid Portal
• Biological Spatio-Temporal Databases
• CoAKTinG, Grid-enabled Modelling Tools and Databases for Neuroinformatics, e-Diamond
• Dynamic Configuration of Grid Fabrics, Dependable Grid Services, Deductive Synthesis Techniques, Inferring QoS Properties for Grid Applications, Mobile Resource Guarantees
• TIES, TIES-II

Page 9:

The Virtual Observatory

• International Virtual Observatory Alliance
  – UK, Australia, EU, China, Canada, Italy, Germany, Japan, Korea, US, Russia, France, India
• How to integrate many multi-TB collections of heterogeneous data distributed globally?
• Sociological and technological challenges to be met

Page 10:

Data Services

• GGF Data Access and Integration Svcs (DAIS)
  – OGSI-compliant interfaces to access relational and XML databases
  – Needs to be generalized to encompass other data sources (see next slide…)
• Generalized DAIS becomes the foundation for:
  – Replication: Data located in multiple locations
  – Federation: Composition of multiple sources
  – Provenance: How was data generated?
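As a rough illustration of federation (composition of multiple sources) with a simple provenance tag, the sketch below runs one query against two independent databases and records which source each row came from. It is not DAIS or OGSA-DAI code: the in-memory SQLite databases, the `genes` table, the sample rows and the names `make_source` and `federated_query` are all assumptions made for this example.

```python
import sqlite3


def make_source(rows):
    """Create an in-memory SQLite database standing in for one data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE genes (name TEXT, organism TEXT)")
    conn.executemany("INSERT INTO genes VALUES (?, ?)", rows)
    conn.commit()
    return conn


def federated_query(sources, sql, params=()):
    """Run the same query on every source and tag each row with its origin."""
    results = []
    for label, conn in sources.items():
        for row in conn.execute(sql, params):
            results.append({"source": label, "row": row})
    return results


# Two hypothetical sites holding made-up sample rows.
sources = {
    "site_a": make_source([("gene_1", "rat"), ("gene_2", "mouse")]),
    "site_b": make_source([("gene_3", "human"), ("gene_4", "rat")]),
}

hits = federated_query(
    sources,
    "SELECT name FROM genes WHERE organism = ? ORDER BY name",
    ("rat",),
)
for hit in hits:
    print(hit["source"], hit["row"][0])
```

The client sees one combined result list, while the `source` field keeps a minimal record of where each row originated.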

Page 11:

Data Access & Integration Services

1a. Request to Registry for sources of data about “x”
1b. Registry responds with Factory handle
2a. Request to Factory for access to database
2b. Factory creates GridDataService to manage access
2c. Factory returns handle of GDS to client
3a. Client queries GDS with XPath, SQL, etc
3b. GDS interacts with database
3c. Results of query returned to client as XML

[Diagram. Components: Client; Registry; Factory; Grid Data Service; XML / Relational database. Legend: SOAP/HTTP; service creation; API interactions]
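The registry, factory and service interaction in steps 1a to 3c can be sketched in a few lines. This is a toy, in-process model only: it does not use OGSA-DAI, OGSI or SOAP, and the classes `Registry`, `Factory` and `GridDataService`, their method names, and the `samples` table are hypothetical names chosen for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET


class GridDataService:
    """Hypothetical stand-in for a GDS: wraps one database connection."""

    def __init__(self, conn):
        self._conn = conn

    def query(self, sql, params=()):
        # 3b. The GDS interacts with the database on the client's behalf.
        cursor = self._conn.execute(sql, params)
        columns = [d[0] for d in cursor.description]
        # 3c. Results are returned to the client as XML.
        root = ET.Element("results")
        for row in cursor:
            item = ET.SubElement(root, "row")
            for col, value in zip(columns, row):
                ET.SubElement(item, col).text = str(value)
        return ET.tostring(root, encoding="unicode")


class Factory:
    """Hypothetical factory: creates a GridDataService per request."""

    def __init__(self, conn):
        self._conn = conn

    def create_service(self):
        # 2b/2c. Create a GDS and hand it back to the client.
        return GridDataService(self._conn)


class Registry:
    """Hypothetical registry: maps a topic to the factory that serves it."""

    def __init__(self):
        self._factories = {}

    def register(self, topic, factory):
        self._factories[topic] = factory

    def lookup(self, topic):
        # 1a/1b. Return the factory handle for a topic.
        return self._factories[topic]


# Set up a toy database and publish it in the registry.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (id INTEGER, label TEXT)")
conn.executemany("INSERT INTO samples VALUES (?, ?)", [(1, "a"), (2, "b")])

registry = Registry()
registry.register("x", Factory(conn))

# Client side: steps 1 to 3.
factory = registry.lookup("x")
gds = factory.create_service()
xml_result = gds.query("SELECT id, label FROM samples ORDER BY id")
print(xml_result)
```

In the architecture on the slide each arrow is a remote call between separate services; here they are plain method calls so that the order of the steps is easy to follow.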

Page 12:

edikt

• The team: 8 professional software engineers, support staff, project manager, commercialisation manager, architect, and SAB
• SHEFC funded research and development grant
  – 3 years funding: May 2002 – 2005
  – +3 years funding upon successful project and review

[Diagram. Recoverable labels: Standards; Edikt project; Requirements analysis; Technology matchmaking; Gap filling; Rigorous engineering; CS Research; Grid Services for e-Science Data Management; Commercial SW components and skills; E-Science Apps]

Page 13:

ELDAS – Data Access Service

• Implemented using Enterprise Java Beans
• Data Access Components interface to distinct DBMSs
• Accessible as a grid data service or a web data service

[Diagram. Recoverable labels: Java Framework; ELDAS; EJB - DAS; DAC (×4); DB2 DB; MySQL DB; Xindice DB; Oracle 9i DB; Web Servlet; Grid Proxy; Web User1; Grid User1; Grid User2. Captions: ELDAS runs anywhere; Suitable for grid & web]

Page 14:

BinX – accessing legacy binary data

• The Problem:
  – Many binary data files
  – Applications must “know” the data format
  – Binary data formats are machine-specific
• The Solution:
  – Write a “stand-aside” format description in XML
  – Provide a library to
    » Interpret the description
    » Provide file access across different machines
  – Build higher-level services

[Diagram. Recoverable labels: e-Science Application; BinX Library; Binary Data File (several); BinX file describes binary file structure; simulations]
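The "stand-aside description" idea can be illustrated with a small sketch: an XML document describes the layout of a binary record, and a generic reader decodes the bytes from that description alone, independent of the byte order of the machine doing the reading. The XML vocabulary below (`binary`, `field`, `byteOrder`, the type names) is invented for this example and is not the actual BinX schema; `build_layout` and `read_record` are likewise hypothetical names.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical description format, invented for this sketch (not the BinX schema).
DESCRIPTION = """
<binary byteOrder="big">
  <field name="run" type="int32"/>
  <field name="temperature" type="float64"/>
  <field name="count" type="uint16"/>
</binary>
"""

# Map the sketch's type names onto struct format characters.
TYPE_CODES = {"int32": "i", "uint16": "H", "float64": "d"}
ORDER_CODES = {"big": ">", "little": "<"}


def build_layout(description_xml):
    """Turn the XML description into a struct format string and field names."""
    root = ET.fromstring(description_xml)
    fmt = ORDER_CODES[root.get("byteOrder")]
    names = []
    for field in root.findall("field"):
        fmt += TYPE_CODES[field.get("type")]
        names.append(field.get("name"))
    return fmt, names


def read_record(data, description_xml):
    """Decode one record from raw bytes using only the description."""
    fmt, names = build_layout(description_xml)
    values = struct.unpack(fmt, data[: struct.calcsize(fmt)])
    return dict(zip(names, values))


# Bytes as a big-endian machine might have written them.
raw = struct.pack(">idH", 7, 21.5, 3)
record = read_record(raw, DESCRIPTION)
print(record)
```

Because the byte order is stated in the description rather than assumed by the application, the same reader works unchanged on a little-endian host.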

Page 15:

NeSC at Glasgow: E-Science Hub

Externally
• Glasgow end of NeSC
  – Involved in UK wide activities
    » ETF: In May 2003 became first UK e-Science Centre to run integration tests across every site of the UK (Level 2) Grid. Therefore 100% access to UK Grid resources at this time
  – Public visibility of NeSC
    » responsible for NeSC web site

Internally
• Focal point for e-Science research/activities at Glasgow
• Work closely with foundation departments
  – Department of Computing Science
  – Department of Physics & Astronomy
• Also working closely with other groups including
  – Bioinformatics Research Centre
  – Electronics and Electrical Engineering
  – Biostatistics, …

Page 16:

Glasgow e-Science Investment

• Major investment by university
• 230 m² of newly renovated floor space in Kelvin Building
  – offices
  – access grid facility
  – training room
    » equipped with 20 PCs/server for training courses
• Funding Technical Director

Page 17:

Resource Consolidation at Glasgow

• Building around ScotGrid
• Providing shared Grid resource for wide variety of scientists inside/outside Glasgow
  – Particle physicists, computer scientists, electronic engineers, bioinformaticians, …
• Focal point, knowledge pool, primary resource for e-Science activity at Glasgow
• Target shares
  – 60% PP, 20% Bioinf, 20% open share…

Hardware
• 59 IBM X Series 330 dual 1 GHz Pentium III with 2GB memory
• 2 IBM X Series 340 dual 1 GHz Pentium III with 2GB memory
• 3 IBM X Series 340 dual 1 GHz Pentium III with 2GB memory and 100 + 1000 Mbit/s ethernet
• 1TB disk
• LTO/Ultrium Tape Library
• Cisco ethernet switches
• IBM X Series 370 PIII Xeon with 32 x 512 MB RAM
• 5TB FastT500 disk 70 x 73.4 GB IBM FC Hot-Swap HDD
• eDIKT 28 IBM blades dual 2.4 GHz Xeon with 1.5GB memory
• eDIKT 6 IBM X Series 335 dual 2.4 GHz Xeon with 1.5GB memory
• CDF 10 Dell PowerEdge 2650 2.4 GHz Xeon with 1.5GB memory
• CDF 7.5TB Raid disk

Shared Resources:
• Disk ~15TB
• CPU ~330 1GHz

[Chart labels: CDF; LHC; BIO]
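To make the target shares concrete, the short calculation below applies the 60/20/20 split to the aggregate figures quoted above. The slide does not say that the shares are applied uniformly to CPU and disk, so that is an assumption of this sketch, as is the helper name `nominal_allocation`.

```python
# Aggregate figures quoted on the slide (both approximate).
TOTAL_CPU_1GHZ = 330  # CPU capacity in 1 GHz units
TOTAL_DISK_TB = 15    # shared disk in TB

# Target shares from the slide, as integer percentages.
TARGET_SHARES = {"PP": 60, "Bioinf": 20, "open": 20}


def nominal_allocation(total, shares):
    """Split a total according to percentage shares."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must sum to 100%")
    return {name: total * pct / 100 for name, pct in shares.items()}


# Assumption: the same split is applied to CPU and to disk.
cpu = nominal_allocation(TOTAL_CPU_1GHZ, TARGET_SHARES)
disk = nominal_allocation(TOTAL_DISK_TB, TARGET_SHARES)
print(cpu)
print(disk)
```

Under that assumption, particle physics would nominally receive about 198 of the roughly 330 1 GHz CPU units, with about 66 each for bioinformatics and the open share.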

Page 18:

Projects with NeSC Glasgow Involvement

• DCC: National Digital Curation Centre
• AMUSE: Autonomous Management of Ubiquitous Systems for e-Health
• P2Popt: Performance measurement & mgt of 2-Layer Peer to Peer NWs…
• PGPGrid: Peppers Ghost Productions
• Equator: Environmental e-Science Interdisciplinary Research Project
• BPS: Biochemical Pathway Simulator
• BRIDGES

Page 19:

Overview of BRIDGES

• Biomedical Research Informatics Delivered by Grid Enabled Services (BRIDGES)
  – NeSC (Edinburgh and Glasgow) and IBM
  – 2 year project started 1st October 2003
• Supporting project for CFG project
  – Generating data on hypertension
  – Rat, Mouse, Human genome databases
• Variety of tools used
  – BLAST, FASTA, MPsrch, BLAT, Gene Prediction, visualisation, …
• Variety of data sources and formats
  – Microarray data, genome DBs, project partner research data, medical records, …
• Aim is integrated infrastructure supporting
  – Data federation
  – Security

Page 20:

CFG Partner Distribution

[Diagram. Sites: Glasgow; Edinburgh; Leicester; Oxford; London; Netherlands. Labels: Shared data; Public curated data; Private data (six instances)]

Page 21:

Problems specific to Bio-Community

• DBs growing exponentially!!!
• Bibliographic (MedLine, …)
• Amino Acid Seq (SWISS-PROT, …)
• 3D Molecular Structure (PDB, …)
• Nucleotide Seq (GenBank, EMBL, …)
• Biochemical Pathways (KEGG, WIT…)
• Molecular Classifications (SCOP, CATH,…)
• Motif Libraries (PROSITE, Blocks, …)
• …

[Chart: PDB Content Growth]

Page 22:

More genomes …...

Arabidopsis thaliana
mouse
rat
Caenorhabditis elegans
Drosophila melanogaster
Mycobacterium leprae
Vibrio cholerae
Plasmodium falciparum
Mycobacterium tuberculosis
Neisseria meningitidis Z2491
Helicobacter pylori
Xylella fastidiosa
Borrelia burgdorferi
Rickettsia prowazekii
Bacillus subtilis
Archaeoglobus fulgidus
Campylobacter jejuni
Aquifex aeolicus
Thermotoga maritima
Chlamydia pneumoniae
Pseudomonas aeruginosa
Ureaplasma urealyticum
Buchnera sp. APS
Escherichia coli
Saccharomyces cerevisiae
Yersinia pestis
Salmonella enterica
Thermoplasma acidophilum

Page 23:

Complexity of Biological Data

[Figure. Labels, in order: Nucleotide sequences; Nucleotide structures; Gene expressions; Protein structures; Protein functions; Protein-protein interaction (pathways); Cell; Cell signalling; Tissues; Organs; Physiology; Organisms]

Page 24:

BRIDGES Data Integration/Federation

• Local repository being developed
  – Populated with data that cannot be federated
    » e.g. public data sets with no programmatic interface
  – Shared data sets of CFG scientists
  – Security through
    » X.509 PKI (authentication)
    » PERMIS (authorisation)
• Will make use of e-Science technologies (OGSA-DAI/DAIT, ELDAS, IBM’s DiscoveryLink)
  – Automatically keep data fresh/updated
• Web (Grid) services offered that allow use of these local data sets
  – For example for visualising, searching, querying, …
• Example usage scenario …

Page 25:

System Usage Scenario

[Diagram. Recoverable labels: BRIDGES Portal; Client, Site X; Secure access for CFG VO; Shared/Private Data Sets; Personalised Services; BLAST; Smith; WSV; DL; OGSA-DAI; Authorisation: per user, per site; Remote data in Oracle, DB2, Sybase, Excel, flat files, XML...; Browser based clients…; Java App downloaded (via WebStart); Push relevant data onto ScotGrid for BLAST’ing; Secure Data Repository; Up to date results input to DB; wrappers; Generic services used by other projects]
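The "per user, per site" authorisation over shared and private data sets can be pictured with a minimal policy lookup. This sketch assumes the user has already been authenticated (for example via an X.509 certificate) and does not use the PERMIS API; the `User` class, the `POLICY` table, the user names and the data set names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """An already-authenticated user (authentication itself is out of scope)."""
    name: str
    site: str


# Hypothetical policy table: (user, site) -> data sets that pair may read.
POLICY = {
    ("alice", "Glasgow"): {"shared", "glasgow_private"},
    ("bob", "Edinburgh"): {"shared"},
}


def is_authorised(user, data_set, policy=POLICY):
    """Allow access only if the (user, site) pair is granted the data set."""
    return data_set in policy.get((user.name, user.site), set())


def visible_data_sets(user, policy=POLICY):
    """Return the data sets a user may see, sorted for stable output."""
    return sorted(policy.get((user.name, user.site), set()))


alice = User("alice", "Glasgow")
bob = User("bob", "Edinburgh")

print(visible_data_sets(alice))
print(is_authorised(bob, "glasgow_private"))
```

Keying the policy on the (user, site) pair means the same person connecting from a different site gets no access unless that pairing has been granted explicitly.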

Page 26:

Conclusions

• NeSC continues to provide leadership in UK e-Science
  – Difficult with multitude of scientific research areas, heterogeneity of systems and fluidity of technologies
    » GT2, GT3, WSRF, GT4…?
• Closer working with GridPP beneficial for everyone
  – move towards Production Grid
  – ScotGrid a good model for co-operation
• Planning for soft landing through diversification and more integration into university
  – MRC bids, BBSRC bids, EPSRC bids, …
  – UK e-Science operating as community for upcoming DTI funding opportunities
  – Plans for developing Grid Computing teaching modules as part of advanced MSc

Page 27:

Website

• National e-Science Centre http://www.nesc.ac.uk/
  – Mission, Background, Foundation
  – Locations, Staff, Resources, Projects
  – Register interest, Mailing lists, NeSCForge
  – Regional associations and Collaborations
  – News, Notices
  – Presentations & Lectures http://www.nesc.ac.uk/presentations/
• e-Science Institute http://www.nesc.ac.uk/esi/
  – Mission, Events (Future and Past)
  – Register for Events, Visitor Programme
• UK e-Science
  – Map and Index of Centres http://www.nesc.ac.uk/centres/
  – Technical Papers http://www.nesc.ac.uk/technical_papers/
  – Index of >100 Projects http://www.nesc.ac.uk/projects/
  – Task Forces http://www.nesc.ac.uk/teams/
• General Information
  – Glossary, Bibliography, Who’s who
  – E-Science job vacancies

Page 28:

Questions…?