Case Study: OSS at the National Cancer Institute The Cancer Bioinformatics Grid (caBIG) Program
Workshop Open Source Software and the Military Health System September 22-23, 2011 Virginia Tech Arlington Research Building
Fred Prior, PhD
cancer Biomedical Informatics Grid®
An open source, open science program begun in 2004
GOAL: design and develop a collaborative IT infrastructure to link the NCI designated cancer centers and other NCI programs to accelerate the pace of biomedical research focused on the detection, diagnosis, treatment, and prevention of cancer.
Between 2004 and 2010, caBIG grew into one of the largest NCI-funded programs, with a total cost of at least $350 million over those fiscal years.
The program is ongoing.
Source: AN ASSESSMENT OF THE IMPACT OF THE NCI CANCER BIOMEDICAL INFORMATICS GRID (caBIG®), 2011 Available from: http://arc.georgetown.edu/BSAcaBIGAssessment.pdf
Phase 1: Pilot (2004-2007)
Establish the community (workspaces)
Develop core tools
  Over 70 open source software packages were developed
Link the NCI cancer centers
  Over 149 nodes on the production Grid
Phase 2: Enterprise Deployment (2007+)
caBIG® supported Deployment Program, at 56 NCI cancer centers
6 caBIG® supported Knowledge Centers to provide demonstrations, training material, answers to frequently asked questions
6 caBIG® supported In Silico Research Centers of Excellence for research using data-mining and other in silico
19 Support Service Providers: licensed commercial entities to assist users in installing, modifying and using caBIG® tools
Experience with caBIG®
Primary involvement with the In-vivo Imaging workspace
Liaison to Tissue Bank and Pathology Tools workspace
Multiple interactions with Vocabularies & Common Data Elements cross cutting workspace
Roles:
  Subject Matter Expert (funded) to facilitate community input and provide guidance on imaging priorities
  Contract Development and Maintenance of 2 Software packages
  Adoption and deployment of caBIG tools
  Open Source development of extensions
Ref: Prior F, Erickson B, Tarbox L. Open Source Software Projects of the caBIG™ In Vivo Imaging Workspace Software Special Interest Group. Journal of Digital Imaging, 2007; 20(suppl 1):94-100.
caBIG® IVI Workspace
The In Vivo Imaging Workspace was added to the caBIG program in April of 2005:
1. To advance imaging informatics for treatment of patients with cancer
2. To leverage caBIG technology such as the caGRID, Internet tools such as XML, and existing DICOM standards by creating “middleware,” to facilitate sharing of images in a variety of settings
3. To strive towards a standardized way to evaluate and annotate images, especially for evaluation of tumor burden and response
4. To facilitate secure and easy sharing of images and image analysis & visualization algorithms with an emphasis on the cancer community
Adoption of Open Source XIP Platform
caBIG Open Source imaging libraries and XIP Builder Tool
caBIG AVT Project (Algorithm Validation Toolkit)
DoD TATRC/ACR’s Interoperability in Medical Imaging
DARPA deep-bleeder acoustic coagulation
Beth Israel Intraoperative Fluorescent Imaging
NTR Optical Imaging for Drug Therapy Monitoring, multi-modality imaging
caBIG AIM Project (Annotation Imaging Markup) - Northwestern University
CenSSIS collaboration with RPI on Cellular Imaging for XIP
UPENN collaboration on multi-resolution histopathology
Pre-Clinical In Vivo In Vitro
Lessons Learned
A very large open source community is difficult to manage if the goal is to achieve consistency in implementation and interoperability of components.
Development teams largely worked in isolation.
Despite global architecture, interface and vocabulary standards, stovepipe development was all too frequent.
Interoperability “certification” processes were complex, not well communicated and not flexible enough to deal with widely disparate domains.
Architecture and interoperability were NOT community driven.
The Five Steps to Compatibility
There are five steps in developing a caBIG® compatible application:
1. Creating an Information Model
2. Performing Semantic Integration (Vocabularies)
3. Transforming the Information Model into Metadata (Common Data Elements)
4. Generating Code and Messaging Interfaces (APIs)
5. Generating a caGrid Interface
Create an Information Model in a Modeling Tool
Perform Semantic Integration using the Semantic Integration Workbench (SIW)
Transform the Information Model into Metadata using the UML Loader
Generate Code and Messaging Interfaces using the caCORE SDK Code Generator
Generate a caGrid Interface using “Introduce”
Information Models, Vocabularies, CDEs, APIs
Lesson 5: Making a Tool caBIG™ Compatible
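The ordering of the five steps and the tool named for each can be written down as a small data structure. The Python below is only an illustrative sketch: the Step class and the next_step and can_run helpers are hypothetical names invented for this example, and nothing here calls or reproduces any actual caBIG software. It assumes the steps must be completed strictly in the order listed.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Step:
    """One of the five compatibility steps and the tool the slides name for it."""
    number: int
    action: str
    tool: str


# Step names and tools are taken from the slides; the structure is hypothetical.
STEPS = [
    Step(1, "Create an Information Model", "a modeling tool"),
    Step(2, "Perform Semantic Integration", "Semantic Integration Workbench (SIW)"),
    Step(3, "Transform the Information Model into Metadata", "UML Loader"),
    Step(4, "Generate Code and Messaging Interfaces", "caCORE SDK Code Generator"),
    Step(5, "Generate a caGrid Interface", "Introduce"),
]


def can_run(step_number: int, completed: Set[int]) -> bool:
    """A step may run only when every earlier step is done (assumed strict order)."""
    return all(n in completed for n in range(1, step_number))


def next_step(completed: Set[int]) -> Optional[Step]:
    """Return the first step not yet completed, or None when all five are done."""
    for step in STEPS:
        if step.number not in completed:
            return step
    return None


if __name__ == "__main__":
    done = {1, 2}
    upcoming = next_step(done)
    if upcoming is not None:
        print(f"Next: step {upcoming.number}, {upcoming.action} using {upcoming.tool}")
```

Even in this toy form, each step depends on the output of the one before it, which is one reason a tool could not be made compatible piecemeal.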
Lessons Learned
Contract Development and Maintenance without an open source community and mechanisms for accepting and integrating community developed software leads to missed opportunities and multiple independent development paths.
Open source enhancements and extensions cannot be inserted into the code base.
Change proposals can be submitted to the contract development team, if and only if one exists.
End users and Support Service Providers are denied important features because they are not in the official release package.
The National Biomedical Image Archive becomes The Cancer Imaging Archive
NBIA as hosted by CBIIT
TCIA hosted by Washington Univ.
Open Source Enhancements without a Home
Deployment Enhancements
  Virtual Machine hosting
  Database Clustering
  High-availability and load-balancing clusters
User Interface extensions
  New look and feel
  Integrated wiki
  LDAP and Password Management
Performance improvements
  Improved download applet
  Multi-pipeline de-identification
  Enhanced upload performance
Findings of NCI Board of Scientific Advisors
Source: AN ASSESSMENT OF THE IMPACT OF THE NCI CANCER BIOMEDICAL INFORMATICS GRID (caBIG®), 2011 Available from: http://arc.georgetown.edu/BSAcaBIGAssessment.pdf
A "cart-before-the-horse" overly broad vision for the program
  Technology driven focus rather than end user requirements and needs
  In some instances a confusion between clinical practice needs and the mandated research focus of the program.
A "build it and they will come" mentality.
  An assumption that free software will always be accepted even if it is complicated, costly to integrate into a user’s environment and poorly supported
A business model that is unsustainable and not cost-effective
  Software has a life cycle and requires ongoing evolution. An active, engaged open source community can facilitate this.
  Without a self sustaining business model, when the government funding stops, the products die.
In Summary
caBIG® is a large and productive open source, open access, open science program funded and managed by the National Cancer Institute.
caBIG® is a contract development program with open source licensing.
Lots of community engagement but little support for an actual open source development process.
Stovepipe development and an inefficient certification process lead to poorly integrated components and poor interoperability.
Open Source is largely a bottom-up process; caBIG® is largely a top-down program.