iRODS 4.0 Tutorial I. Getting Started

38
iRODS 4.0 Tutorial I. Getting Started Leesa Brieger

description

iRODS 4.0 Tutorial I. Getting Started. Leesa Brieger. iRODS Tutorial Preview. iRODS Getting Started Unix client (icommands) U sage iRODS Data Grid Administration Installing server and iCAT Changing default settings Setting up users - PowerPoint PPT Presentation

Transcript of iRODS 4.0 Tutorial I. Getting Started

Page 1: iRODS 4.0 Tutorial I. Getting Started

iRODS 4.0 Tutorial

I. Getting Started

Leesa Brieger

Page 2: iRODS 4.0 Tutorial I. Getting Started

2

iRODS Tutorial Preview

I. iRODS Getting Started – Unix client (icommands)– Usage

II. iRODS Data Grid Administration– Installing server and iCAT– Changing default settings– Setting up users– Installing resource server and adding new resources

to a data grid/zone– Federating with other grids/zones, remote users– Microservices and rules for policy implementation

and enforcement

Page 3: iRODS 4.0 Tutorial I. Getting Started

3

I. iRODS Getting Started

Page 4: iRODS 4.0 Tutorial I. Getting Started

4

iRODS Info

• Main page: http://irods.org

• Github repositories: https://github.com/irods/

(Request a guest account to get access.)

• Github tickets (issues):

https://github.com/irods/irods/issues?state=open

• Chat list: [email protected]

• The original iRODS wiki: https://wiki.irods.org

• The iRODS Manual: http://irods.org/dev/wp-content/uploads/2014/03/irods-manual-4.0.0.pdf

Page 5: iRODS 4.0 Tutorial I. Getting Started

5

iRODS Books

Available from Amazon

• The integrated Rule-Oriented Data System (iRODS) Micro-service Workbookhttp://www.amazon.com/dp/1466469129

• An updated microservices reference book is coming out soon.

Page 6: iRODS 4.0 Tutorial I. Getting Started

6

iRODS Download

• Download link for binary packages from the iRODS main page

http://irods.org/download/

• Open source, BSD license

• Binary packages – rpm and deb – available for CentOS, Red Hat, Debian, Ubuntu, SUSE. Planned: Mac OSX, Windows– See http://irods.org/2014/03/irods-4-0-0-released/ – 4.0 tested platforms: CentOS 5 & 6, SuSE 11 & 12, Ubuntu 10 & 12

• iRODS source is available via GitHubhttps://github.com/irods/irods

• Other iRODS packages and plug-ins are available via GitHub athttps://github.com/irods

Page 7: iRODS 4.0 Tutorial I. Getting Started

7

Install iRODS Unix Client (icommands)

• Download the client package for your platform at https://irods.org/download/

• Install the package– Ubuntu: sudo dpkg –i irods-icommands-4.0.0-64bit.deb– CentOS 6: sudo rpm –i irods-icommands-4.0.0-64bit-centos6.rpm

• This will:– install the icommands to your path– advise you to create an environment file: ~/.irods/.irodsEnv

Page 8: iRODS 4.0 Tutorial I. Getting Started

8

The iRODS Environment File

• ~/.irods/.irodsEnv

• The .irodsEnv file determines which data grid (zone) the icommands client connects to - it determines your whole iRODS environment– server host– iRODS user name– port number– default resource– home directory– working directory– …

• Use multiple environment files to choose from among many grids– only one file at a time has the name .irodsEnv– such environment changes are only feasible for Unix accounts that

are not the “irods” service account

Page 9: iRODS 4.0 Tutorial I. Getting Started

9

Working with a Demo Data Grid

• If you have an account on an iRODS data grid, find

– your account name and password

– the environment file (.irodsEnv) that goes with it

• Make directory .irods/ in your home directory

> mkdir ~/.irods

• Copy the .irodsEnv file into ~/.irods; edit if necessary to insert your user name, zone name, default resource, etc

• This will direct your client to the intended data grid, as the intended user.

Page 10: iRODS 4.0 Tutorial I. Getting Started

10

Sample .irodsEnv file (partial)

# iRODS server host name: irodsHost ‘tutoricat.renci.org.renci.org'

# iRODS server port number: irodsPort 1247

# Default storage resource name: irodsDefResource ’resc1'

# Home directory in iRODS: irodsHome '/centosgrid/home/leesa'

# Current directory in iRODS: irodsCwd '/centosgrid/home/leesa'

# Account name: irodsUserName ’leesa'

# Zone: irodsZone ’centosgrid’…

In this example, user name is “leesa”

iRODS data grids are referred to as “zones”

Page 11: iRODS 4.0 Tutorial I. Getting Started

11

Some iRODS Clients

• iDrop web – iDrop, iDrop-litehttp://iren-web.renci.org:8080/idrop-web/login/login

• PHP web browserhttp://iren-web.renci.org/rodsweb

• icommands – Unix clienthttps://www.irods.org/index.php/icommands

• FUSE (Filesystem in Userspace) clienthttps://www.irods.org/index.php/iRODS_FUSE

• Many others supplied by user communities

Page 12: iRODS 4.0 Tutorial I. Getting Started

12

Unix client: icommands

Seehttps://wiki.irods.org/index.php/icommands for the complete

list and full documentation.

Unix-like

ils

ipwd

icd

ichmod

irm

imkdir

ipasswd

irsync

ichksum

imv

icp

ienv

FTP-like

iinit

iexit

iput

iget

These slides do not contain an exhaustive list of the icommands.

Page 13: iRODS 4.0 Tutorial I. Getting Started

13

icommands (continued)

Metadata-related

imeta

iquest

idbo

Informational

ienv

ilsresc

iuserinfo

ihelp – list all icommands

Functional

ireg

ibun

irepl

Rule-oriented

irule

iqstat

iqdel

iqmod

idbug

The –h option to any icommand gives documentation on that command.

Page 14: iRODS 4.0 Tutorial I. Getting Started

14

icommands - examples

> iinitEnter your current iRODS password: *****

> ienvNOTICE: Release Version = rods4.0.0, API Version = dNOTICE: irodsHost=tutoricat.renci.orgNOTICE: irodsPort=1247NOTICE: irodsDefResource=resc1NOTICE: irodsHome=/centosgrid/home/leesaNOTICE: irodsCwd=/centosgrid/home/leesaNOTICE: irodsUserName=leesaNOTICE: irodsZone=centosgrid…

> iexit full

This will store scrambledpw in ~/.irods/.irodsA

View your environment

This deletes the~/.irods/.irodsAauthentication file.

Page 15: iRODS 4.0 Tutorial I. Getting Started

15

icommands - examples

> ipwd/centosgrid/home/leesa

> ils/centosgrid/home/leesa: irods-resource-4.0.0-64bit.deb C- /centosgrid/home/leesa/slides

Directories in iRODS are called collections.

This entry is a collection.

Listing the contents of your current directory.

Collection naming convention: /zone_name/home/user_name/collection_name

See your current working directory.

Page 16: iRODS 4.0 Tutorial I. Getting Started

16

> iput –K notes (-K: to calculate and store checksums)

> iput iquest-attributes (no checksums)

> ils /centosgrid/home/leesa:

iquest-attributes irods-resource-4.0.0-64bit.deb notes C- /centosgrid/home/leesa/slides

> iget –k irods-resource-4.0.0-64bit.deb (-k: to verify checksum without

storing)

> irm –f notes(-f to force the remove, physically. Without –f, the file is simply moved to the Trash collection on the same resource.)

icommands – to ingest and retrieve data

-k and –K options forchecksum calculation

and comparison

Page 17: iRODS 4.0 Tutorial I. Getting Started

17

icommands - examples

> ils -L/centosgrid/home/leesa:

leesa 0 demoResc 1883 2014-05-01.13:58 & iquest-attributes demoResc generic

/var/lib/irods/iRODS/Vault/home/leesa/iquest-attributes

leesa 0 resc1 28937358 2014-04-30.14:47 & irods-resource-4.0.0-64bit.deb

02b7dd912af6e615f4680594fe3e2dc8 resc1 generic /var/irodsVault/home/

leesa/irods-resource-4.0.0-64bit.deb

C- /centosgrid/home/leesa/slides

Long listing of my current directory

Shows owner, resource, file, checksum (if calculated),physical path, …

Page 18: iRODS 4.0 Tutorial I. Getting Started

18

ilsresc – list available resources

See the resources available on your data grid

On centosgrid:

> ilsrescresc1

demoResc resc2

Resources can share hosts but typically don’t.

Each resource has its own unique physical location

(path).

Page 19: iRODS 4.0 Tutorial I. Getting Started

19

> ils -A

/centosgrid/home/leesa: ACL - leesa#centosgrid:own Inheritance - Disabled

iquest-attributes ACL - leesa#centosgrid:own

irods-resource-4.0.0-64bit.deb ACL - leesa#centosgrid:own

C- /centosgrid/home/leesa/slides

icommands – ACLs (access control lists) and ichmod

> ichmod read guest iquest-attributes

> ils –A

/centosgrid/home/leesa: ACL - leesa#centosgrid:own Inheritance - Disabled

iquest-attributes ACL - leesa#centosgrid:own

guest#centosgrid:read object

irods-resource-4.0.0-64bit.deb ACL - leesa#centosgrid:own

C- /centosgrid/home/leesa/slides

Page 20: iRODS 4.0 Tutorial I. Getting Started

20

ichmod and group “public”

> ichmod -r read public slides

> ils -A slides

/centosgrid/home/leesa/slides:

ACL - leesa#centosgrid:own g:public#centosgrid:read object Inheritance - Disabled

irods4.0-intro.pptx ACL - public#centosgrid:read object leesa#centosgrid:own

Every user in the data gridis a member of user group

“public”

“ichmod –h” for help

Page 21: iRODS 4.0 Tutorial I. Getting Started

21

> ils /centosgrid/home/rods: hello

> irepl -R resc1 hello

> ils/centosgrid/home/rods: hello

> ils -L/centosgrid/home/rods: rods 0 demoResc 22 2014-04-16.12:28 & hello demoResc generic /var/lib/irods/iRODS/Vault/home/rods/hello

rods 1 resc1 22 2014-05-01.17:03 & hello resc1 generic /var/irodsVault/home/rods/hello

irepl – replicating data objectsa replica ≠ a copy

Replicating does not create a new logical object; copying does.

Use the long listing to see all replicas and

multiple physical locations

The collection doesn’t change with replication of

data object “hello” which is still the same logical object

as before.

Page 22: iRODS 4.0 Tutorial I. Getting Started

22

irepl

• See https://irods.sdsc.edu/index.php/irepl for complete list of options

• Some options

-B: Backup mode - if a good copy already exists in selected resource, don't make another copy

-P: show progress

-r: recursive replication

-R: destination resource

-X restart_filename: turn on restart option and accumulate restart info into restart_filename

“irepl –h” for help

Page 23: iRODS 4.0 Tutorial I. Getting Started

23

Ingest data into iRODS without making an additional copy and with no data transfer

Example• Unix directory /projects/state-data (not an iRODS collection) contains

500TB of state LiDAR data.• We want to manage the LiDAR data with iRODS. • How to ingest the data into iRODS without copying or transferring

500TB?!

continued…

ireg – register data into iRODS

In-place ingestion of data into iRODS

“ireg –h” for help

Page 24: iRODS 4.0 Tutorial I. Getting Started

24

Set-up• /projects/state-data must be mounted on the iRODS server host

• Data admin sets up /projects/state-data as an iRODS resource: > iadmin mkresc stateResc unixfilesystem

tutorres.renci.org:/projects/state-data

Creating resource:Name: "stateResc"Type: "unixfilesystem"Host: "tutorres.renci.org"Path: "/projects/state-data"Context: ""

continued …

ireg example (continued)

stateResc is now in listed as a resource:> ilsresc

resc1 demoResc resc2 stateResc

Page 25: iRODS 4.0 Tutorial I. Getting Started

25

Usage

• Data admin registers existing data object or collection into iRODS:> ireg -C -R stateResc /projects/state-data/ /centosgrid/home/state-data/

> ils /centosgrid/home/state-data /centosgrid/home/state-data: lidar-d23720431300go20040824.asc

Register incoming files rigorously OR modify a directory only through iRODS

once it has been registered to keep the iCAT consistent with the directory.

ireg example (continued)

If new files are added to the directory, ireg can be run again, with option –f, to pick up as-yet unregistered files.

/centosgrid/home/state-data/Is now an iRODS collection

contained in stateResc

Page 26: iRODS 4.0 Tutorial I. Getting Started

26

ibun – for bundling files

Tar up and expand files for efficient iput/iget

• iput a tarball and expand it within iRODS> tar -chlf tutorials.tar -C irods-4.0 .

> iput -Dtar tutorials.tar

> ibun -x tutorials.tar irods-4.0

icommand> ils/centosgrid/home/leesa: iquest-attributes irods-resource-4.0.0-64bit.deb notes tutorials.tar

> ils/centosgrid/home/leesa: iquest-attributes irods-resource-4.0.0-64bit.deb notes tutorials.tar C- /centosgrid/home/leesa/irods-4.0 C- /centosgrid/home/leesa/slides

icommand

Unix command to tar up a directory

Page 27: iRODS 4.0 Tutorial I. Getting Started

27

ibun – for bundling files

Tar up and expand files for efficient iput/iget

• Tar up files in iRODS for iget

> ibun -cDtar slides.tar slides

> iget slides.tar

> tar –xvf slides.tar –C slides

“ibun –h” for help

icommand

Unix

> ils/centosgrid/home/leesa: iquest-attributes irods-resource-4.0.0-64bit.deb notes slides.tar tutorials.tar C- /centosgrid/home/leesa/irods-4.0 C- /centosgrid/home/leesa/slides

icommand

Page 28: iRODS 4.0 Tutorial I. Getting Started

28

iquest – querying the iCAT

• To see attributes that can be queried:> iquest attrs

• See https://wiki.irods.org/index.php/iquest for examples.

• > iquest "SELECT DATA_PATH, DATA_NAME, DATA_CHECKSUM WHERE

DATA_RESC_NAME like 'resc2'”

DATA_PATH = /opt/irodsVault/home/leesa/notesDATA_NAME = notesDATA_CHECKSUM = 0ac4e079aa230d87353fec33f1ec71cd------------------------------------------------------------DATA_PATH = /opt/irodsVault/home/rods/yum-fuse-libraries-notesDATA_NAME = yum-fuse-libraries-notesDATA_CHECKSUM = ------------------------------------------------------------

(Checksums displayed only when they have

already been calculated and stored, of course)

Page 29: iRODS 4.0 Tutorial I. Getting Started

29

iquest – querying the iCAT

• Useful when you want to remove a resource and you discover it isn’t empty:> iquest "SELECT DATA_NAME, USER_NAME, COLL_NAME WHERE

DATA_RESC_NAME like ‘tempResc'"DATA_NAME = slide-list-htmlUSER_NAME = rodsCOLL_NAME = /centosgrid/trash/home/rods-----------------------------------------------------------------

• Administrator can add SQL strings to be invoked by users of the data grid

• Data grid-specific queries (added by admin)> iquest --sql 'pre-defined SQL string' [format] [arguments]

A file was removed

with irm but notphysically deleted

(moved to Trash on

the resource)

Page 30: iRODS 4.0 Tutorial I. Getting Started

30

imeta – add, view, modify metadata

• > imeta add –d hello “Date” “2 february 2012”

• > imeta ls –d hello

AVUs defined for dataObj hello:attribute: Meta1value: hellounits: ----attribute: Datevalue: 2 february 2012units:

• > imeta rm –d hello “Meta1” “hello”

“imeta –h” for help

Page 31: iRODS 4.0 Tutorial I. Getting Started

31

iadmin – administrative functions

• h for help, “h <command>” for help for that command

• q to exit

• Add new users, modify passwords, add new resources, federate to remote zones, create resource groups,…– mkresc/rmresc– mkuser/rmuser, moduser (modify passwords)– mkzone/rmzone, modzone (for federation)

• Information on users, resources, tokens, etc– lt (el-tee), “lt <token type>” for listing of that token’s possible

values– lu, lr, lz,…

Page 32: iRODS 4.0 Tutorial I. Getting Started

32

1. Admin user must set up the cloud resource, for example:– S3 Resource: s3Resc

2. This is just another iRODS resource for the users, who can manage their cloud data just as all other iRODS data:

> iput –K –R s3Resc my_file (put data into the cloud resource)

> irepl –R s3Resc another_file (replicate into the cloud resource)

> ichmod read public my_file (give public access to data in the cloud)

> ichmod write user1 my_collection (give user1 permission to write

to a collection in the cloud)

Cloud Resources

Page 33: iRODS 4.0 Tutorial I. Getting Started

33

Realizable Objects – coming in 4.1???

• Typical iRODS objects contain data

• Realizable objects: – symbolic links to iRODS objects– symbolic links to external data sources– workflow procedures to regenerate data

• Implemented in community iRODS, still to come for the merged release (planned for 4.1)

Page 34: iRODS 4.0 Tutorial I. Getting Started

34

Database Resources????

• Database Resource (DBR): a database, queried and updated via SQL (or other, for non-SQL)

• Database object (DBO): an interface to a DBR, typically a (SQL) query that returns results

• iRODS agent will open and close DB as needed; results of the query are directly returned to user

• Query results are stored to an iRODS data object, a DBO Results file (DBOR).

• iRODS access controls are applied on the DBR and DBO.

Page 35: iRODS 4.0 Tutorial I. Getting Started

35

Microservices and Rules

Microservices• C code• the unit of work within iRODS• called by rules – can only be run by a rule

Rules• execute the microservices• compose microservices and other rules into workflows

Page 36: iRODS 4.0 Tutorial I. Getting Started

36

Rules

• Rules can run workflows of microservices (and other rules)

• Implement computer actionable policies

• Provide implementation of data policies– Retention, distribution, arrangement– Authenticity, provenance, description– Integrity, replication, synchronization– Deletion, trash cans, versioning– Archiving, staging, caching– Authentication, authorization, sharing– Access, approval, IRB, audit trails, report generation– Assessment criteria, validation– Derived data product generation, format parsing

Page 37: iRODS 4.0 Tutorial I. Getting Started

37

Running Rules

• Triggered by events/policy points

• Contained in the (distributed) rule base: – /etc/irods/core.re– many rules with the same name; the first rule with satisfied

condition is executed; others are skipped

• Can be run with irule: manual execution

• Delayed execution– iqstat– iqdel

Page 38: iRODS 4.0 Tutorial I. Getting Started

38

irule – to run a rule manually

• Example rules to tweak and run in the software distribution iRODS/clients/icommands/test/rules3.0

• irule -F rulemsiListEnabledMS.r – list enabled microservices

Works with 4.0.1, not 4.0

• irule -F rulemsiAdmShowCoreRE.r - can only be run by admin users

• NB: not all the test rules in iRODS/clients/icommands/test/rules3.0 are running yet in 4.0.