
ICAT Integration at DLS.

Alun Ashton

What were the requirements?

• Integrate with current business system
• Collect data and metadata relating to a proposal
• Security
• Long-term storage of data
• (Multi?) institutional repository
• Searchable metadata
• Potentially extendable with future e-science infrastructure
• Scalable

What were the restrictions?

• Only relies on the beamline network.
• Must not prevent scientists collecting data.
• Data is collected by putting files onto a UNIX file system.
• Use e-Science research output.
• Interest, vision, priorities and finances.

What was the solution? Diamond, Modified by e-Science

[Architecture diagram. Component labels: GDA; Nexus & Data; DArc and Storaged; DUO; DLS ICAT; SRB; DataPortal; Diamond Proposal Web pages; Atlas Data Store; DUO Desk; Active Directory; People DB; IKitten; Data / metadata]


What goes into the ICAT/IKitten

• Who
• When
• Where
• ‘What’
  – Abstract
  – Visit
• Triggers (I think) move the data to the IKitten

[Diagram labels: DLS ICAT; DUO Desk; IKitten]

What does the GDA read from the IKitten?

• Metadata, read only
  – Fedid: the GDA knows the user's Fedid and asks the DB for the current visit for that user, which is used for setting up data paths.
  – Instrument, abstract

[Diagram labels: GDA; IKitten; Data / metadata]
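The visit lookup described above can be sketched roughly as follows. This is only an illustration: the slides do not show the IKitten schema or the GDA code, so the `visit` table, its columns, the example Fedid and the path template are all hypothetical, and SQLite stands in for whatever database is actually used.

```python
import sqlite3
from datetime import datetime
from typing import Optional, Tuple

# Hypothetical schema: one row per (user, visit) with the visit's time window.
DEMO_SCHEMA = """
CREATE TABLE visit (
    fedid      TEXT,
    visit_id   TEXT,
    instrument TEXT,
    start_time TEXT,  -- ISO 8601, e.g. 2009-08-10T09:00:00
    end_time   TEXT
)
"""


def current_visit(conn: sqlite3.Connection, fedid: str,
                  now: datetime) -> Optional[Tuple[str, str]]:
    """Return (visit_id, instrument) for the visit active at `now`, or None."""
    stamp = now.isoformat(timespec="seconds")
    row = conn.execute(
        "SELECT visit_id, instrument FROM visit "
        "WHERE fedid = ? AND start_time <= ? AND end_time > ?",
        (fedid, stamp, stamp),
    ).fetchone()
    return (row[0], row[1]) if row else None


def data_path(instrument: str, visit_id: str, year: int) -> str:
    """Build a data directory from the visit (hypothetical path template)."""
    return f"/dls/{instrument}/data/{year}/{visit_id.lower()}"
```

The access is read only, matching the slide: the sketch only ever issues a SELECT, and the returned visit is used purely to derive a directory name.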

Role Based access and application sharing.

• Definable access
• Definable role
• If this is not right…

What does the GDA write out?

• GDA does not currently write a NeXus file
  – Detector files and anything…..
• Currently implemented only on MX beamlines

[Diagram labels: GDA; Nexus & Data]

‘GDA’ needs to initiate next step

• In the MX GDA, whenever an image file is created it runs a script
• Bash script creates an XML file as input to DArc
• The GDA can produce a script itself, but no beamlines do so today… maybe next week?

[Diagram labels: GDA; Nexus & Data; DArc]

DArc

• Python-based EDNA
• Server that checks <>5s
• http://www.edna-site.org

[Flowchart labels: Watches a directory; Converts file to XML Ingest format; Initiates storaged; Copies a filelist to a location; Moves XML to a log; Runs XMLingest; Success?]
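The watch-ingest-log loop in the flowchart might look something like the sketch below. It is not DArc's code and does not use the EDNA API: the function names, the `ingest` callback and the separate `failed_dir` for unsuccessful ingests are assumptions made for illustration, as is the reading that the XML is moved to the log only after a successful ingest. The 5-second default mirrors the check interval mentioned on the slide.

```python
import shutil
import time
from pathlib import Path
from typing import Callable, List


def process_drop_dir(drop_dir: Path, log_dir: Path, failed_dir: Path,
                     ingest: Callable[[Path], bool]) -> List[str]:
    """One polling pass over the watched directory.

    Each XML file is handed to `ingest` (standing in for the XML ingest
    step); it is then moved to log_dir on success, or to failed_dir
    (an assumption, not on the slide) otherwise.
    """
    handled = []
    for xml_file in sorted(drop_dir.glob("*.xml")):
        ok = ingest(xml_file)
        target = log_dir if ok else failed_dir
        target.mkdir(parents=True, exist_ok=True)
        shutil.move(str(xml_file), str(target / xml_file.name))
        handled.append(xml_file.name)
    return handled


def watch(drop_dir: Path, log_dir: Path, failed_dir: Path,
          ingest: Callable[[Path], bool], interval_s: float = 5.0) -> None:
    """Poll the drop directory forever, sleeping between passes."""
    while True:
        process_drop_dir(drop_dir, log_dir, failed_dir, ingest)
        time.sleep(interval_s)
```

Moving each file out of the watched directory once it has been handled keeps a pass idempotent: a file is never ingested twice, and anything left in the directory is by definition still pending.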

The XML file, minimum!

<?xml version="1.0" ?>
<icat version="1.0 RC6"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="icatXSD.xsd">
  <study>
    <investigation>
      <inv_number>MX307</inv_number>
      <visit_id>MX307-27</visit_id>
      <instrument>i04</instrument>
      <title>dont need it</title>
      <inv_type>experiment</inv_type>
      <dataset>
        <name>icr/fernando</name>
        <dataset_type>EXPERIMENT_RAW</dataset_type>
        <description>unknown</description>
        <datafile>
          <name>FG2_3_MS_3_154.img</name>
          <location>srb://srb-mcat-i18.esc.rl.ac.uk:5518/dls-2/i04/data/data/2009/mx307-27/icr/fernando/FG2_3_MS_3_154.img</location>
          <description>unknown</description>
          <datafile_version>1.0</datafile_version>
          <datafile_create_time>2009-08-11T02:58:21</datafile_create_time>
          <datafile_modify_time>2009-08-11T02:58:21</datafile_modify_time>
        </datafile>
      </dataset>
    </investigation>
  </study>
</icat>
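As a rough illustration, a document with the same element layout can be generated with Python's standard `xml.etree.ElementTree`. The element names are taken from the example above; the function name and its parameters are hypothetical, and the `xsi` schema attributes are left out for brevity, so this is a sketch rather than a validated ICAT ingest file.

```python
import xml.etree.ElementTree as ET


def build_minimal_ingest(inv_number: str, visit_id: str, instrument: str,
                         dataset_name: str, file_name: str, location: str,
                         timestamp: str, title: str = "unknown") -> str:
    """Return a minimal ingest document for one data file as a string."""
    icat = ET.Element("icat", {"version": "1.0 RC6"})
    study = ET.SubElement(icat, "study")
    inv = ET.SubElement(study, "investigation")
    for tag, text in [("inv_number", inv_number), ("visit_id", visit_id),
                      ("instrument", instrument), ("title", title),
                      ("inv_type", "experiment")]:
        ET.SubElement(inv, tag).text = text

    dataset = ET.SubElement(inv, "dataset")
    for tag, text in [("name", dataset_name),
                      ("dataset_type", "EXPERIMENT_RAW"),
                      ("description", "unknown")]:
        ET.SubElement(dataset, tag).text = text

    datafile = ET.SubElement(dataset, "datafile")
    for tag, text in [("name", file_name), ("location", location),
                      ("description", "unknown"),
                      ("datafile_version", "1.0"),
                      ("datafile_create_time", timestamp),
                      ("datafile_modify_time", timestamp)]:
        ET.SubElement(datafile, tag).text = text

    return ET.tostring(icat, encoding="unicode")
```

Building the tree with `SubElement` rather than string formatting means special characters in file names or titles are escaped automatically.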

What is our status?

• https://dataportal.diamond.ac.uk

New Eclipse Based GDA

Viewing and choosing the Metadata

Relation to NeXus

By the by… https://ispyb.diamond.ac.uk

• Did try a view, now use a script

[Diagram labels: DLS ICAT; DUO Desk; ISPyB; IKitten]

ISPyB and ICAT

• ICAT
  – Archive
  – Generic
• ISPyB
  – LIMS
  – Domain specific

How robust is the system?

[Architecture diagram repeated from "What was the solution?"]


Future Work

• Use it!!!

• Portal and API integration into in-house tools
• Sample information from user office (top-level information about the sample) and ISPyB (the instance of the sample).
  – Internal ICAT.

• Reduced data and richer metadata.

• Archive old data

Acknowledgements

• Diamond Light Source: Tobias Richter, Stuart Campbell, Karl Levik, Mark Basham, Bill Pulford (and others in groups who have contributed at various times on various beamlines)

• STFC e-Science: Michael Gleaves, Roger Downing, Rik Tyer, Glen Drinkwater, Shoaib Sufi, Phil Couch, Kerstin Kleese Van Dam, Keir Hawker, Carmine Cioffe, Gordon Brown, Lisa Blanshard, Kevin O’Neill.

• STFC Daresbury: Steve Kinder and Karen Ackroyd

• But especially Roger, Keir, Carmine, Michael, Mark and Tobias.

• BTW we have 64 TB of raw data already