2010 Collaborative NSF Grant Collaborative grant between four Pacific Northwest herbaria:...

2010 Collaborative NSF Grant

Collaborative grant between four Pacific Northwest herbaria:

1) University of Washington Herbarium (PI: Dick Olmstead, Co-PIs: David Giblin and Joe Ammirati)

2) Oregon State University Herbarium (Co-PI: Aaron Liston)

3) University of Idaho, Stillinger Herbarium (Co-PI: Dave Tank)

4) Montana State University Herbarium (Co-PI: Matt Lavin)

Submitted to NSF in July, 2009; awarded this spring; funds arrived in June.

Combined grant funds: $1,340,879

Grant covers imaging and digitization of herbarium specimens from multiple PNW herbaria.

What we proposed to do

1. Expand the taxonomic breadth of the portal by databasing 200,000 PNW non-vascular plant, fungal, and lichen specimens at OSC and WTU.

2. Image, database and provide online access to 185,000 vascular plant specimens from the region’s remaining large herbaria (ID, MONT).

3. Image, database, and provide online access to the PNW vascular plant specimens from small herbaria in Idaho, Oregon, and Washington.

4. Create connections to the Portal for other PNW herbaria with existing specimen databases (WS, UBC, MONTU, SOU, SRP, CIC). Also, improve existing connections (WTU, OSC, ALA).

5. Provide Portal data to GBIF, USVH, and other data aggregators for those collections that lack their own data access points.

6. Develop portal-based web applications and expand the PNW Herbaria web site.

Grant Budget

1. Collaborative grant between four institutions in Washington (WTU), Oregon (OSC), Idaho (ID), and Montana (MONT).

2. Each institution receives a portion of the funds.

Institution Funds

University of WA (WTU) $642,611

Oregon State U. (OSC) $325,141

U. of Idaho (ID) $301,351

Montana State U. (MONT) $71,776

TOTAL $1,340,879

WTU Funds ($)

Informatics Specialist: 184,202

Databasing coordinator: 41,357

Imaging/databasing: 162,512

Equipment/Supplies: 17,815

Travel: 6,000

2010 PNW Herbaria meeting: 3,250

Indirect costs: 227,475

OSC Funds ($)

Databasing coordinators: 68,926

Imaging/databasing: 140,000

Equipment/Supplies: 9,969

Travel: 3,500

Indirect costs: 102,746

ID Funds ($)

Databasing coordinator: 39,676

Imaging/databasing: 158,050

Equipment/Supplies: 9,921

Travel: 2,500

Indirect costs: 91,204

MONT Funds ($)

Coordinator & imaging: 44,000

Equipment/Supplies: 5,869

Travel: 500

Indirect costs: 21,407
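
As a quick cross-check of the budget slides, the sketch below adds up the line items for each institution and compares them with the stated institution totals and the combined award. The figures are transcribed from the slides; the dictionary layout and the function name are only illustrative.

```python
# Line items transcribed from the budget slides (US dollars).
BUDGET = {
    "WTU": {
        "Informatics Specialist": 184_202,
        "Databasing coordinator": 41_357,
        "Imaging/databasing": 162_512,
        "Equipment/Supplies": 17_815,
        "Travel": 6_000,
        "2010 PNW Herbaria meeting": 3_250,
        "Indirect costs": 227_475,
    },
    "OSC": {
        "Databasing coordinators": 68_926,
        "Imaging/databasing": 140_000,
        "Equipment/Supplies": 9_969,
        "Travel": 3_500,
        "Indirect costs": 102_746,
    },
    "ID": {
        "Databasing coordinator": 39_676,
        "Imaging/databasing": 158_050,
        "Equipment/Supplies": 9_921,
        "Travel": 2_500,
        "Indirect costs": 91_204,
    },
    "MONT": {
        "Coordinator & imaging": 44_000,
        "Equipment/Supplies": 5_869,
        "Travel": 500,
        "Indirect costs": 21_407,
    },
}

# Totals as stated on the "Institution Funds" table.
STATED_TOTALS = {"WTU": 642_611, "OSC": 325_141, "ID": 301_351, "MONT": 71_776}
GRAND_TOTAL = 1_340_879


def institution_totals(budget):
    """Sum the line items for each institution."""
    return {inst: sum(items.values()) for inst, items in budget.items()}


totals = institution_totals(BUDGET)
for inst, total in totals.items():
    status = "matches" if total == STATED_TOTALS[inst] else "differs from"
    print(f"{inst}: {total:,} {status} the stated {STATED_TOTALS[inst]:,}")
print(f"Combined: {sum(totals.values()):,} (stated: {GRAND_TOTAL:,})")
```

With the figures as transcribed, every institution's line items add up to its stated total, and the four totals add up to the combined award.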

Organization and Administration

WTU (lead institution): Dick Olmstead (PI)

WTU: David Giblin (Co-PI), Joe Ammirati (Co-PI); Project Facilitator; Imaging/Databasing Personnel

OSC: Aaron Liston (Co-PI); Project Facilitator; Imaging/Databasing Personnel

ID: Dave Tank (Co-PI); Project Facilitator; Imaging/Databasing Personnel

MONT: Matt Lavin (Co-PI); Project Facilitator; Imaging Personnel

Each institution has a high degree of autonomy in how their funds are used.

Ben Legler (Informatics Specialist, WTU)

Organization and Administration

My duties as Informatics Specialist:

1. Develop, configure, and deploy the specimen imaging equipment.

2. Develop the databasing software to capture label data from imaged specimens directly into the Portal server.

3. Provide training to the Project Facilitators in each state so they can assume responsibility for imaging and databasing within their state.

4. Work with each institution as needed to assist with imaging and databasing challenges.

5. Assist herbaria in setting up data access points to connect their data to the PNW Portal, GBIF, USVH, and other data aggregators.

6. Connect the collections hosted on the Portal Server to GBIF, USVH, and other data aggregators.

7. Enhance the PNW Herbaria web site with new features and web applications.

8. Manage the PNW Herbaria web site, databases, and web server.

Organization and Administration

Project Facilitator Duties:

1. Learn to use the imaging equipment, and how to deal with problems that may arise.

2. Transport the imaging equipment between collections. At each institution, set up the equipment and train the imaging personnel.

3. Oversee specimen databasing, including training and managing data entry personnel.

4. Perform image and data quality control.

Organization and Administration

Imaging & Databasing Personnel:

1. Will consist of hourly employees, work-study students, and volunteers.

2. Most will be stationed at WTU, OSC, and ID.

3. The small herbaria being imaged will recruit their own personnel to do imaging; however, funds from the grant are available to pay these personnel if needed.

The Details of Our “To-Do” List

An in-depth look at what we proposed in our grant

1. Non-Vasculars

Database non-vascular plants, fungi, and lichens at OSC and WTU.

Approximately 100,000 specimens will be databased at each.

Each will use their existing databases for this component of the project. OSC: Specify 6, WTU: FileMaker.

Timeline: start now, anticipated completion date of summer 2011.

TO-DO: Create data access points to allow the Portal to harvest these data.

2. ID and MONT

Image and database the collections at ID and MONT.

About 120,000 specimens will be imaged and databased from ID.

• ID has an existing FileMaker database that will be used.

• We need to integrate the image capture process with this database.

• Timeline: begin imaging August 2010; can potentially finish summer 2011.

About 65,000 specimens will be imaged and databased from MONT.

• Only image capture will occur at MONT. Images will then be transferred to the Portal server.

• Databasing will be done from the images at WTU, using the same database software developed for the smaller herbaria.

• Will begin imaging in August, 2010.

3. Smaller Herbaria

Image and database smaller herbaria in Idaho, Oregon, and Washington.

4-5 smaller herbaria will be selected from each of these states.

For each herbarium, the entire PNW vascular plant collection will be imaged and databased.

Images will be transferred to the Portal server for processing and storage.

Databasing will occur from the images using personnel stationed at WTU, OSC, and ID.

More about the imaging and databasing process in a moment…

Washington:

Western WA University: 26,000

Whitman College: 17,000

Central WA University: 25,000

Eastern WA University: 7,000

Pacific Lutheran University: 5,000+

TOTAL: 80,000?

Idaho:

Lewis & Clark State College: 10,000

Northern Idaho College: 10,000

Forest Service Herbaria: 5,000

TOTAL: 25,000

Oregon:

Reed College: 10,000

Portland State University: 11,000

Linfield College: 2,000

Southern Oregon University: 14,000

TOTAL: 37,000

Montana: (none)

GRAND TOTAL: 142,000+
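
The specimen estimates can be tallied with a few lines of Python. This is only a sketch: the counts are the approximate figures given in this talk, Pacific Lutheran's "5,000+" is treated as 5,000, and the ID and MONT figures come from the preceding section. The listed Washington figures add up to 80,000, which is the value consistent with the 142,000 grand total.

```python
# Approximate PNW vascular plant specimen counts for the smaller herbaria.
SMALL_HERBARIA = {
    "Washington": {
        "Western WA University": 26_000,
        "Whitman College": 17_000,
        "Central WA University": 25_000,
        "Eastern WA University": 7_000,
        "Pacific Lutheran University": 5_000,  # listed as "5,000+"
    },
    "Idaho": {
        "Lewis & Clark State College": 10_000,
        "Northern Idaho College": 10_000,
        "Forest Service Herbaria": 5_000,
    },
    "Oregon": {
        "Reed College": 10_000,
        "Portland State University": 11_000,
        "Linfield College": 2_000,
        "Southern Oregon University": 14_000,
    },
}

# Estimates for the two remaining large herbaria.
LARGE_HERBARIA = {"ID": 120_000, "MONT": 65_000}


def state_totals(herbaria):
    """Sum the specimen estimates within each state."""
    return {state: sum(counts.values()) for state, counts in herbaria.items()}


per_state = state_totals(SMALL_HERBARIA)
small_total = sum(per_state.values())
large_total = sum(LARGE_HERBARIA.values())

for state, total in per_state.items():
    print(f"{state}: {total:,}")
print(f"Smaller herbaria: {small_total:,}")
print(f"Large herbaria (ID + MONT): {large_total:,}")
print(f"Specimens to image: {small_total + large_total:,}")
```

The combined figure of about 327,000 is the number of specimen images used for the storage estimate later in the talk.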

3. Smaller Herbaria

Small herbaria house valuable specimens not present in the larger herbaria. But they are often not examined by specialists.

Limitations in staff, budgets, and computer infrastructure make it difficult for these herbaria to manage a specimen database or provide online access.

It is often challenging to justify the existence of these collections to university administrators. Can we increase their utility and prominence?

There is a recognition among the general herbarium community that larger herbaria and institutions should assist these smaller herbaria. This is what we’ve proposed to do.

Benefits of digitizing these herbaria:

4. Portal Connections

Create connections to the Portal for PNW herbaria with existing specimen databases.

The following herbaria have already databased (or are databasing) their PNW vascular plant collections. We need to connect their data to the Portal:

• Washington State University (165,000)

• University of British Columbia (440,000)

• Southern Oregon University (14,000?)

• University of Montana (90,000?)

• Boise State University (30,000?)

• Albertson College of Idaho (35,000?)

The connections with our existing providers can be improved:

• University of Alaska, Fairbanks (210,000)

• Oregon State University (194,000)

• University of Washington (198,000)

4. Portal Connections

Improving Portal connections:

• Currently using DiGIR and Darwin Core data schemas.

• Should switch to TAPIR or IPT, and use current data schemas.

• I will investigate ways to bypass these options to create custom connections allowing faster data transfer and richer data structures.
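
To give a feel for what a Darwin Core record looks like, here is a minimal sketch that maps a record from an in-house database onto Darwin Core terms. The in-house field names and the sample record are made up for illustration; the Darwin Core term names on the right-hand side are taken from the current standard, and this is not the Portal's actual harvesting code.

```python
# Hypothetical in-house field names (left) mapped to Darwin Core terms (right).
FIELD_TO_DWC = {
    "accession": "catalogNumber",
    "herbarium": "institutionCode",
    "taxon": "scientificName",
    "collector": "recordedBy",
    "date": "eventDate",
    "state": "stateProvince",
    "county": "county",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
}


def to_darwin_core(record):
    """Translate one in-house record into a dict keyed by Darwin Core terms.

    Empty or missing fields are left out rather than exported as blanks.
    """
    dwc = {"basisOfRecord": "PreservedSpecimen"}
    for field, term in FIELD_TO_DWC.items():
        value = record.get(field)
        if value not in (None, ""):
            dwc[term] = value
    return dwc


# A made-up record, for illustration only (no coordinates recorded).
example = {
    "accession": "0000001",
    "herbarium": "WTU",
    "taxon": "Lupinus sericeus",
    "collector": "A. Collector",
    "date": "2009-06-15",
    "state": "Washington",
    "county": "Kittitas",
    "lat": None,
    "lon": None,
}

for term, value in to_darwin_core(example).items():
    print(f"{term}: {value}")
```

Keeping the mapping in one table means each herbarium can keep its own field names while the Portal and the aggregators all see the same terms.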

5. Provide data to GBIF & Others

Provide Portal data to GBIF, USVH, and other data aggregators for those collections that lack their own data access points.

6. Portal Applications

Develop portal-based web apps and expand the Portal web site.

Public interface:

1. Search interface improvements: browse taxonomy, image viewer, search by polygon, search by shapefile, search by a list of values, return results as a checklist, etc.

2. Specimen-based, synonymized regional checklists for each organismal group.

3. Atlas pages with dot maps for each taxon for the region, including print-quality maps.

4. Specimen-based, dynamically generated, county-level checklists for the region.

5. Dot map of the entire region showing all collection sites, with color coding.

6. A version of the search pages targeted for mobile devices.

7. Static datasets that can be downloaded and copied to mobile devices for field use.

Back-end & administrative:

1. Improve Portal’s data harvesting processes.

2. Use the Portal to provide data from ALL Pacific Northwest herbaria to GBIF?

3. Add data quality controls such as synonymy checks, flagging records with inconsistencies, and reporting data problems back to the originating herbarium.

4. Create mechanism for automated dispatch of loan requests to participating herbaria.

5. Improve data usage tracking, and mechanisms to report statistics back to herbaria.

6. Create a GIS Web Service providing access to georeferenced specimen data.

Any more ideas?

Imaging & Databasing

Overview of the imaging & databasing workflow

Imaging Workstation: Transfer RAW images and metadata on a portable hard drive.

Portal Server: Dropbox; image processing scripts (RAW, JPEG, Tiles); populate database from images & metadata.

Data Entry: RIA using AJAX. Data entry form displays a blank record for an image. Adjacent image viewer shows label data and annotations.

Imaging and Databasing

Imaging equipment.

• Ortery Lightbox

• Canon EOS 5D Mark II

• 16-32 GB compact flash

• AC adaptor for camera

• 50 mm macro lens

• Custom camera mount

• Custom specimen holder

• 6 inch ruler

• Laptop computer

• USB cable

• Canon EOS Utility

• Image metadata form

Custom mount on top of box to hold camera

Imaging and Databasing

Images are stored in several formats:

Digital Negative (.DNG): This is a publicly documented RAW format developed by Adobe as an alternative to the numerous proprietary RAW formats from each camera manufacturer. 22 MB per image.

JPEG: Conversion from RAW formats to TIFF or JPEG is a hassle, so we will store high-quality JPEG copies for immediate access. 7 MB per image.

Tiled images: These are used by the online specimen image viewer. They function in the same way as map tiles in Google Maps. 3.5 MB per image.
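
To illustrate the map-tile idea, the sketch below counts how many tiles a single specimen image would need at each zoom level. It assumes 256-pixel square tiles and halves the resolution at each level; both are common conventions for tiled viewers rather than confirmed settings for this project. The 5616 x 3744 image size is the maximum resolution of the Canon EOS 5D Mark II and is likewise an assumption about how the specimens will be captured.

```python
import math


def tile_pyramid(width, height, tile_size=256):
    """Return (level, columns, rows) for each level of a tile pyramid.

    Level 0 is the full-resolution image here; each later level halves
    the width and height until the whole image fits in a single tile.
    (Some viewers number the levels in the opposite direction.)
    """
    levels = []
    level = 0
    w, h = width, height
    while True:
        cols = math.ceil(w / tile_size)
        rows = math.ceil(h / tile_size)
        levels.append((level, cols, rows))
        if cols == 1 and rows == 1:
            break
        w = max(1, math.ceil(w / 2))
        h = max(1, math.ceil(h / 2))
        level += 1
    return levels


pyramid = tile_pyramid(5616, 3744)  # assumed image size, see above
for level, cols, rows in pyramid:
    print(f"level {level}: {cols} x {rows} = {cols * rows} tiles")
print("total tiles:", sum(cols * rows for _, cols, rows in pyramid))
```

The viewer only has to fetch the handful of tiles covering the area on screen at the current zoom level, which is why a tiled copy is kept alongside the DNG and JPEG files.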

Storage requirements for 327,000 specimen images:

DNG 6.8 Terabytes

JPEG 2.2 Terabytes

Tiles 1.1 Terabytes

TOTAL 10.1 Terabytes

(we’ll have 13.5 Terabytes available, and can purchase more if needed)

(for comparison, an 8-bit TIFF with LZW compression is larger than the DNG, JPEG, and tiles combined)
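
The storage estimate can be reproduced from the per-image sizes. The sketch below assumes binary units (1 TB = 1024 x 1024 MB), which is a guess about how the slide's figures were derived; with that assumption the results land close to the numbers above. DNG comes out at about 6.86 TB, which the slide lists as 6.8.

```python
MB_PER_TB = 1024 * 1024  # assumption: binary units

# Approximate size of each stored copy, in MB per specimen image.
PER_IMAGE_MB = {"DNG": 22, "JPEG": 7, "Tiles": 3.5}
IMAGES = 327_000
AVAILABLE_TB = 13.5


def storage_tb(images, per_image_mb):
    """Estimate storage per format, in terabytes."""
    return {fmt: images * mb / MB_PER_TB for fmt, mb in per_image_mb.items()}


usage = storage_tb(IMAGES, PER_IMAGE_MB)
total = sum(usage.values())

for fmt, tb in usage.items():
    print(f"{fmt}: {tb:.2f} TB")
print(f"TOTAL: {total:.2f} TB of {AVAILABLE_TB} TB available")
print(f"Headroom: {AVAILABLE_TB - total:.2f} TB")
```

Changing `IMAGES` or the per-image sizes shows quickly how much headroom is left if more collections are added or the file sizes turn out larger than expected.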

Imaging and Databasing

[Figure: panels labeled Canon, Photoshop, and dcraw]

Imaging and Databasing

Specimen image viewer:

Imaging and Databasing

A web-accessible data entry interface using specimen images.

A set of databases hosted by the Portal server. Each herbarium will have its own separate database.

The data entry interface will be accessed through a web browser (or similar stand-alone client app), with access restricted to authorized personnel.

Data entry will be performed by personnel at WTU, OSC, and ID.

Following the grant period, managers and curators at these collections can use these databases as their primary database if they choose, or migrate the data into their own in-house database.

This design eliminates the need for smaller herbaria to manage their own in-house databases.

Imaging and Databasing

Data entry personnel simply click a button to pull up a blank record, database from the image, and repeat.

I may add OCR-assist to the data entry interface.
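
The "pull up a blank record, database from the image, and repeat" loop can be sketched with a small queue table. This is only an illustration using Python's built-in sqlite3 module; the table layout, column names, and function names are hypothetical and are not the Portal's actual design.

```python
import sqlite3


def create_queue(conn, image_names):
    """Create one blank specimen record per image."""
    conn.execute(
        "CREATE TABLE specimens ("
        "id INTEGER PRIMARY KEY, "
        "image TEXT NOT NULL, "
        "label_text TEXT, "
        "status TEXT NOT NULL DEFAULT 'blank')"
    )
    conn.executemany(
        "INSERT INTO specimens (image) VALUES (?)",
        [(name,) for name in image_names],
    )
    conn.commit()


def next_blank_record(conn):
    """Return (id, image) for the next record with no data, or None when done."""
    return conn.execute(
        "SELECT id, image FROM specimens "
        "WHERE status = 'blank' ORDER BY id LIMIT 1"
    ).fetchone()


def save_record(conn, record_id, label_text):
    """Store the transcribed label data and mark the record as entered."""
    conn.execute(
        "UPDATE specimens SET label_text = ?, status = 'entered' WHERE id = ?",
        (label_text, record_id),
    )
    conn.commit()


# Usage with made-up image names.
conn = sqlite3.connect(":memory:")
create_queue(conn, ["img_0001.jpg", "img_0002.jpg"])

record = next_blank_record(conn)
while record is not None:
    record_id, image = record
    save_record(conn, record_id, f"label data transcribed from {image}")
    record = next_blank_record(conn)
```

With several people entering data at once, a real system would also need to reserve a record while someone is working on it so that two people are not handed the same image.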

End Results

What will we have accomplished by the end of the grant?

Imaged and databased ca. 327,000 vascular plant specimens from at least 14 regional herbaria, and made them available online.

Databased ca. 200,000 non-vascular plant, lichen, and fungal specimens.

The PNW Herbaria Portal will host nearly 2,000,000 specimens.

Developed a means of efficiently digitizing specimens from smaller herbaria using staff and resources at the larger herbaria. This can serve as a model for use in other regions.

Expanded the features of the Portal web site, and created new ways of accessing and using the data.

End Results

What is happening first?

Database non-vascular plants, lichens, and fungi at WTU and OSC.

Finish configuring the imaging equipment, train facilitators, and deploy to:

• Western Washington University

• University of Idaho

• University of Montana

• Reed College (Oregon)

Develop and deploy the portal-based data entry system by early fall.

Starting in the fall, I will begin connecting additional collections to the Portal; help improve existing connections; and begin developing web site features.
