Short-term storage and data documentation Mari Wigham COMMIT


Page 1: Short-term storage and data documentation Mari Wigham COMMIT

Short-term storage and data documentation

Mari Wigham

COMMIT/

Page 2: Short-term storage and data documentation Mari Wigham COMMIT

Information Management @ WUR

Organising, sharing, finding and reusing data

Expertise in:

●Modelling data and how people work with it

●Semantic technology – making the data ‘understandable’ for computers so that they can better support the users

Emphasis on applications for use in real life

Page 3: Short-term storage and data documentation Mari Wigham COMMIT

Information Management @ WUR

Page 4: Short-term storage and data documentation Mari Wigham COMMIT

This presentation

What data do we store?

With what technology?

Where do we store it?

Why document your data?

Example: Study on the effects of diet on health

Questions

Page 5: Short-term storage and data documentation Mari Wigham COMMIT

What data are we storing?

Raw data

Final data

Papers

but also

Intermediate data

Drafts of papers

Methods

Equipment and materials

Lab notes

...

Page 6: Short-term storage and data documentation Mari Wigham COMMIT

What data are we storing?

Everything you need to be able to do your work

Everything your colleagues need to do their work

Everything required by your funding organisation

Everything necessary to reproduce your work

Page 7: Short-term storage and data documentation Mari Wigham COMMIT

With what technology?

Smartphone/tablet/laptop/PC...

Specialised hardware and software

Dedicated e-labnotebook software or standard software

What do you want to be able to do with it?

●Take notes?

●Access internet?

●Log on to your network?

●Write documents?

●Give presentations?

●Use in the lab?

●Link up with lab systems?

Page 8: Short-term storage and data documentation Mari Wigham COMMIT

Where do we store it?

Storage solutions

Personal computer / laptop
●Advantages: always available; portable
●Disadvantages: what if it breaks or is stolen? What if you are ill or away?
●Suitable for: temporary storage

Network drive / managed file servers
●Advantages: regularly backed up and maintained; stored securely; stored centrally
●Disadvantages: costs; may not be accessible from everywhere or by everyone
●Suitable for: master copy (if enough space is provided)

External storage devices (USB, flash etc.)
●Advantages: low cost; portable
●Disadvantages: easily damaged or lost; insecure
●Suitable for: temporary storage

Cloud services (Dropbox, Figshare, SkyDrive etc.)
●Advantages: automatic sync (some services); easy access
●Disadvantages: is it secure? No control over backup procedure
●Suitable for: data sharing

Page 9: Short-term storage and data documentation Mari Wigham COMMIT

SharePoint and OneDrive

"Between dream and deed, laws stand in the way, and practical objections" (translated from the Dutch)

“there’s many a slip twixt cup and lip”

• Proof of Concept: end of April

• If it gets a green light: available at the end of September

Page 10: Short-term storage and data documentation Mari Wigham COMMIT

Features

SharePoint and OneDrive

Microsoft SharePoint 2013
●Application: enterprise info (department, project)
●Data classification: up to Confidential
●Social functions: social
●Search facility: search facility
●Costs: high costs
●Work offline: no sync client for Linux
●AnyTime: 7x24 via App, Web, Explorer
●AnyWhere: internal / external WUR
●AnyDevice: Windows, Mac, iOS, Android
●Share with 3rd parties: yes, via Federation
●User-friendly: self & simple
●Sourcing: on premises

Microsoft OneDrive for Business
●Application: personal data
●Data classification: up to Confidential
●Social functions: social integration
●Search facility: search integration
●Costs: costs very low
●Work offline: no sync client for Linux
●AnyTime: 7x24 via App, Web, Explorer
●AnyWhere: internal / external WUR
●AnyDevice: Windows, Mac, iOS, Android
●Share with 3rd parties: yes, via Federation
●User-friendly: self & simple
●Sourcing: public cloud

Current environment

SharePoint 2010
●Application: enterprise info (department, project)
●Data classification: up to Confidential
●Social functions: no social
●Search facility: mediocre search
●Costs: high costs
●Work offline: no sync client
●AnyTime: 7x24 via App, Web, Explorer
●AnyWhere: internal / external WUR
●AnyDevice: Windows, Mac, iOS, Android
●Share with 3rd parties: X-Account
●User-friendly: complex
●Sourcing: on premises

M: / W: drive
●Application: all types of data
●Data classification: up to Secret
●Social functions: no social
●Search facility: limited search
●Costs: high costs
●Work offline: limited sync with M:\ drive on laptop
●AnyTime: 7x24 via Citrix
●AnyWhere: via Citrix
●AnyDevice: via Citrix / WURclient
●Share with 3rd parties: not possible
●User-friendly: not self / simple
●Sourcing: on premises

Page 11: Short-term storage and data documentation Mari Wigham COMMIT

Short term storage – what are the issues?

Space

Access

●From where?

●By who?

Versioning

Backups

Finding it again!

Page 12: Short-term storage and data documentation Mari Wigham COMMIT

Short term storage: Basic tips

Space

●Try to estimate how much you will need

●How will you monitor use?

●What do you do if you need more?

●What is your procedure for deletion?
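
One possible way to monitor use is a small script that reports how large a project folder has grown and how full the underlying disk is. The sketch below is only an illustration: the function names and the 80% warning threshold are assumptions, not part of any WUR tooling.

```python
import os
import shutil


def folder_size_bytes(folder):
    """Return the total size of all regular files under `folder`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so linked data is not counted twice.
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def storage_report(folder, warn_fraction=0.8):
    """Summarise the project size and how full the underlying disk is.

    `warn_fraction` is an assumed threshold; pick one that suits your storage.
    """
    usage = shutil.disk_usage(folder)  # named tuple: total, used, free
    used_fraction = usage.used / usage.total
    return {
        "project_bytes": folder_size_bytes(folder),
        "disk_free_bytes": usage.free,
        "disk_used_fraction": used_fraction,
        "warning": used_fraction >= warn_fraction,
    }


if __name__ == "__main__":
    print(storage_report("."))
```

Running this regularly (for example once a week) gives an early signal that you need to ask for more space or apply your deletion procedure.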

Access

●Think about who will need access and from where

●What is your alternative if there is temporarily no access?

●Does everyone have the same access and edit rights?

Page 13: Short-term storage and data documentation Mari Wigham COMMIT

Short term storage: Basic tips

Versioning

●Use a file in one (online) location as the “master”, and do all your modifications and processing on copies of that master

●When you have consolidated your changes and do not want to lose them, replace the master file with the consolidated file

●Indicate versions clearly – especially which is the master!

●Use a naming convention that includes date or number (e.g. ..._v1, ..._v2)

●Keep track of ‘milestone files’
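
The naming convention and the "work on a copy, not on the master" rule can be sketched in a few lines. This is a minimal illustration under assumptions: the `_vN` suffix pattern follows the example above, while the function names and the working-folder layout are hypothetical.

```python
import re
import shutil
from pathlib import Path

# Matches file stems that end in _v<number>, e.g. "measurements_v2".
VERSION_PATTERN = re.compile(r"^(?P<stem>.+)_v(?P<number>\d+)$")


def next_version_name(filename):
    """Return `filename` with its _vN suffix increased by one (or _v1 added)."""
    path = Path(filename)
    match = VERSION_PATTERN.match(path.stem)
    if match:
        stem, number = match.group("stem"), int(match.group("number")) + 1
    else:
        stem, number = path.stem, 1
    return f"{stem}_v{number}{path.suffix}"


def working_copy(master, work_dir):
    """Copy the master file into `work_dir` so that edits never touch the master."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    target = work_dir / next_version_name(Path(master).name)
    if target.exists():
        # Never silently overwrite an existing version.
        raise FileExistsError(f"{target} already exists; choose a new version")
    shutil.copy2(master, target)
    return target
```

For example, `next_version_name("measurements_v2.csv")` gives `"measurements_v3.csv"`. Once the changes in the working copy are consolidated, that copy becomes the new master.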

Page 14: Short-term storage and data documentation Mari Wigham COMMIT

Short term storage: Basic tips

Backups

●As soon as possible

●Regularly

●How easily can you get hold of the backup?

●Make sure the backup is as independent as possible from the main storage
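
A backup is only useful if the copy is intact. The sketch below shows one way to copy a folder and check each copy against the original with a checksum. The function names are hypothetical, and it assumes `backup_dir` lies outside `source_dir`, ideally on a different device, so that the backup stays independent of the main storage.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file, read in chunks to save memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source_dir, backup_dir):
    """Copy every file under `source_dir` to `backup_dir` and verify each copy."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    copied = []
    for source in sorted(source_dir.rglob("*")):
        if not source.is_file():
            continue
        # Keep the same folder structure inside the backup.
        target = backup_dir / source.relative_to(source_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)
        if sha256_of(source) != sha256_of(target):
            raise IOError(f"Backup of {source} does not match the original")
        copied.append(target)
    return copied
```

Restoring a file from the backup now and then is the simplest test of how easily you can get hold of it.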

Finding

●Use descriptive names (descriptive for others than just yourself!)

●Document your data

Page 15: Short-term storage and data documentation Mari Wigham COMMIT

Why document your data?

Page 16: Short-term storage and data documentation Mari Wigham COMMIT

For yourself

For data processing and analysis

Help in writing reports and papers

Reference for the future

●Will you still understand it in 2 months, 6 months, 2 years...?

Page 17: Short-term storage and data documentation Mari Wigham COMMIT

Include failures and dead ends!

“On 19 September 1994, on the verge of giving up, Wiles had a flash of insight that the proof could be saved by returning to his original Horizontal Iwasawa theory approach, which he had abandoned in favour of the Kolyvagin–Flach approach, this time strengthening it with expertise gained in Kolyvagin–Flach's approach”

Page 18: Short-term storage and data documentation Mari Wigham COMMIT

For others

Your research colleagues – the ‘lone genius’ is very rare.

Provenance and traceability

●Patents

●Fraud

Journals are starting to ask for the data behind the paper

Research institutes and funding institutions such as the EU and NWO also increasingly want the data

Page 19: Short-term storage and data documentation Mari Wigham COMMIT

The importance of good documentation

“I have discovered a truly marvellous proof of this, which this margin is too narrow to contain”

Page 20: Short-term storage and data documentation Mari Wigham COMMIT

Documentation = paper?

Page 21: Short-term storage and data documentation Mari Wigham COMMIT

Data documentation

Structure is essential!

Page 22: Short-term storage and data documentation Mari Wigham COMMIT

The structure comes from you!

Page 23: Short-term storage and data documentation Mari Wigham COMMIT

A hierarchy of different files...

Page 24: Short-term storage and data documentation Mari Wigham COMMIT

...or everything in one program

Page 25: Short-term storage and data documentation Mari Wigham COMMIT

Example

Study to examine the effects of diet on health

- Conducted over 3 years by 3 researchers – Peter, Lisa and Anna

There are many ways to organise the data. We will look at three:

- By researcher

- By year

- By activity
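
To make the difference between the three schemes concrete, the sketch below computes where one and the same file would end up under each of them. Everything in it is hypothetical: the top-level folder `study`, the file name, the year 2013 and the activity name `preprocessing` are invented for illustration.

```python
from pathlib import PurePosixPath

# Which piece of knowledge you need in order to find a file under each scheme.
SCHEMES = {
    "by_researcher": "researcher",
    "by_year": "year",
    "by_activity": "activity",
}


def location(scheme, record):
    """Return the path of a file under the given organisation scheme."""
    key = SCHEMES[scheme]
    return PurePosixPath("study", str(record[key]), record["filename"])


# A made-up file from the diet study.
record = {
    "researcher": "Lisa",
    "year": 2013,
    "activity": "preprocessing",
    "filename": "preprocessing_steps.txt",
}

if __name__ == "__main__":
    for scheme in SCHEMES:
        print(scheme, "->", location(scheme, record))
```

Under the first two schemes you can only go straight to the file if you already know who produced it or when. Under the third, the question you are trying to answer leads you to the folder.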

Page 26: Short-term storage and data documentation Mari Wigham COMMIT

Example

It is now the summer holidays in 2014. Peter and Anna are on holiday, and Lisa has received some urgent questions from the reviewers. They need to know:

the procedure used to produce the high protein diet

which bureau measured the data

what sort of preprocessing was carried out on the data.

Page 27: Short-term storage and data documentation Mari Wigham COMMIT

Organisation by year/researcher

You need to know what was done when, or by whom

Page 28: Short-term storage and data documentation Mari Wigham COMMIT

Example – Organising by activity

Easy to navigate: for each question you quickly find the right folder, even if you had no prior knowledge.

Page 29: Short-term storage and data documentation Mari Wigham COMMIT

Example – Organising by activity

You still need to do quite a lot of detective work to find the information: you have to rely on good names, guesswork, and reading through the content of the files.

Page 30: Short-term storage and data documentation Mari Wigham COMMIT

Structure AND metadata

Enter a brief description for each activity (folder)

It may help to identify inputs and outputs, or types of files (e.g. dataset, procedure, sample, document)

Linking to items produced in other activities allows you to:

● follow the workflow

● reuse items

● avoid problems due to multiple copies
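
One lightweight way to attach such metadata is a small description file in each activity folder. The sketch below is an assumption about how this could look, not a prescribed format: the file name `metadata.json`, the field names and both functions are hypothetical. Inputs are stored as links (relative paths) to items produced in other activities, not as copies.

```python
import json
from pathlib import Path

METADATA_FILE = "metadata.json"  # hypothetical file name, one per activity folder


def describe_activity(folder, description, inputs=(), outputs=()):
    """Write a brief description of an activity folder, with its inputs and outputs."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    record = {
        "description": description,
        # Links to items produced elsewhere, so there is only one copy of each.
        "inputs": [str(item) for item in inputs],
        "outputs": [str(item) for item in outputs],
    }
    (folder / METADATA_FILE).write_text(json.dumps(record, indent=2), encoding="utf-8")
    return record


def read_activity(folder):
    """Read back the description of an activity folder."""
    return json.loads((Path(folder) / METADATA_FILE).read_text(encoding="utf-8"))
```

A call such as `describe_activity("preprocessing", "Outlier removal", inputs=["../measurement/raw.csv"], outputs=["clean.csv"])` records what the folder is for and where its input came from (the names here are again invented).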

Page 31: Short-term storage and data documentation Mari Wigham COMMIT

Example – Organising by activity plus extra structure

Easy to navigate: for each question you quickly find the right folder, even if you had no prior knowledge.

Descriptions and structure help you to find and understand the data

Links make the whole process traceable
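
How links make the process traceable can be shown with a small sketch: if every activity records its inputs and outputs, you can walk back from any item to the activities that led to it. The activity and item names below are invented for the diet study, and `trace_back` is a hypothetical helper, not an existing tool.

```python
def build_producer_index(activities):
    """Map each output item to the activity that produced it."""
    index = {}
    for name, record in activities.items():
        for item in record["outputs"]:
            index[item] = name
    return index


def trace_back(item, activities):
    """Return the activities that led to `item`, nearest first."""
    producers = build_producer_index(activities)
    chain, queue, seen = [], [item], set()
    while queue:
        current = queue.pop(0)
        activity = producers.get(current)
        if activity is None or activity in seen:
            continue  # raw input with no recorded producer, or already visited
        seen.add(activity)
        chain.append(activity)
        # Follow the links to whatever this activity used as input.
        queue.extend(activities[activity]["inputs"])
    return chain


# Made-up activities, each with links to its inputs and outputs.
activities = {
    "prepare_diets": {"inputs": [], "outputs": ["high_protein_diet_procedure"]},
    "measure": {"inputs": ["high_protein_diet_procedure"], "outputs": ["raw_measurements"]},
    "preprocess": {"inputs": ["raw_measurements"], "outputs": ["clean_dataset"]},
}

if __name__ == "__main__":
    print(trace_back("clean_dataset", activities))
    # prints ['preprocess', 'measure', 'prepare_diets']
```

With this information Lisa could answer the reviewers' questions by starting from the final dataset and following the links backwards.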

Page 32: Short-term storage and data documentation Mari Wigham COMMIT

Electronic lab notes

Notes taken in the lab are often unstructured

May also cover different projects

Splitting the notes per activity and structuring them helps

How far you go depends on the time you have and what is necessary for understanding the data

The same goes for other large, unstructured files

Page 33: Short-term storage and data documentation Mari Wigham COMMIT

Structuring data

It takes time!

But it’s an investment – not time lost

Page 34: Short-term storage and data documentation Mari Wigham COMMIT

Why document your data?

If you store your files in a structure with description and links, you can:

See your research in context

Find, and understand, information more easily

Make your research traceable

Make your research reusable

Page 35: Short-term storage and data documentation Mari Wigham COMMIT

Questions?