CaIntegrator2 – Part 1: Create a Study with Clinical Data Fan Lin, Ph. D Molecular Analysis Tools...

Page 1:

caIntegrator2 – Part 1:

Create a Study with Clinical Data

Fan Lin, Ph.D.

Molecular Analysis Tools Knowledge Center

Columbia University and The Broad Institute of MIT and Harvard

Page 2:

caIntegrator2 Overview

caIntegrator2 is a web-based software application that allows researchers to set up custom, caBIG®-compatible web portals for integrative research, without requiring programming experience.

These portals bring together heterogeneous clinical, microarray and medical imaging data to enrich multidisciplinary research.

Users may install a local instance or use the public instance hosted by NCI:

https://caintegrator2.nci.nih.gov/caintegrator2/login.action

Page 3:

Getting Started with caIntegrator2

caIntegrator2 is a web-based application. It is available only to registered users.

If you are new to caIntegrator2, first request a user account by clicking “Register Now”.

Page 4:

Online Registration Request

A new user may select one or more of the three available roles for caIntegrator2 users, including Study Manager (creates and manages studies) and Study Investigator (queries and analyzes the data in a study).

After the online request form is submitted, the system administrator reviews the request and creates the user account via UPT.

Page 5:

caIntegrator2 Workspace 1. Overview

Upon login, the user is directed to the caIntegrator2 workspace. The workspace can be divided into 3 sections:

1. My Studies - A drop-down list.

2. Functions Side Bar Section - All available functions in caIntegrator2.

3. Study Section - This area changes based on the selection in the left side bar. It is used for the study summary, uploading, and editing. A default welcome screen is shown to a new user.

Page 6:

caIntegrator2 Workspace 2. My Studies List

The drop-down list includes all studies that the user owns (created and deployed) and all studies that the user has subscribed to. A study that is not deployed does not appear in this list.

Page 7:

caIntegrator2 Workspace 3. Side Bar Section

The Side Bar section can be further divided into two subsections:

Study Based Section

The content in this section is displayed only when a study is selected and changes from study to study. It includes:

Logo for the study
Study Name
Home: link to the home page for the selected study
Search [StudyName]: to launch queries within the study
Study Data: to open previously saved queries
Analysis Tools

Static Section

Study Management: create new studies or modify any existing study, including undeployed ones.
Application Management: manage the array design files (platforms) associated with genomic data.
caIntegrator2 menu: support, tutorial, and user guide.

Page 8:

A caIntegrator2 Study 1. Overview

A caIntegrator2 study can integrate three kinds of patient-related data:

Clinical Data – Clinical data in CSV spreadsheet files can be uploaded to the study. Each file should have a unique patient identifier in one column and the relevant data in the other columns, with one patient per row.

Genomic – Array data can be pulled from caArray, from either a local installation or the CBIIT installation. A mapping file in CSV format is needed to link the array data to the clinical subjects in the clinical data files.

Imaging – Image data from an NBIA grid node can also be integrated into a caIntegrator2 study. An image mapping file in CSV format is needed to link the images in NBIA to the clinical subjects in the clinical data files.

Page 9:

A caIntegrator2 Study 2. Structure of a Study

Each study has 4 sections, each designed for uploading a different kind of data:

1. Study Logo – An image to distinguish the study.

2. Clinical Data Source – To add clinical data.

3. Genomic Data Source – To add genomic data from caArray.

4. Imaging Data Source – To add image data from the NBIA network.

Page 10:

Create a Study with Clinical Data 1. Objectives

In this tutorial (caIntegrator2: Create a Study – Part 1), we demonstrate how to create a study and upload clinical data to it:

How to create a study
How to add a logo to a study
How to upload clinical data to a study
How to define the clinical data
How to deploy a study
How to modify an existing study

The next tutorial (caIntegrator2: Create a Study – Part 2) demonstrates how to upload genomic and image data and map them to the clinical data.

Page 11:

Create a Study with Clinical Data 2. Clinical Data Requirement

Requirements for the clinical data:

The clinical data should be available as a spreadsheet in CSV format.
There should be a unique patient identifier in one column, with one patient per row.
Patient-relevant data can be supplied in the other columns; these columns are identified as annotations from within caIntegrator2.
The clinical data can be uploaded from the user’s local drive or a network drive.
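Before uploading, it can help to check a file against these requirements locally. The following is a minimal sketch, not part of caIntegrator2: the function name `check_clinical_csv` and the sample rows are hypothetical, and only the column names Patient_ID, GENDER, and SITE come from this tutorial. It reports a missing identifier column, blank identifiers, and identifiers that appear in more than one row.

```python
import csv
import io
from collections import Counter


def check_clinical_csv(text, id_column):
    """Return a list of problems found in CSV text; an empty list means none."""
    reader = csv.DictReader(io.StringIO(text))
    headers = reader.fieldnames or []
    if id_column not in headers:
        return [f"missing identifier column: {id_column}"]
    problems = []
    # Collect the identifier of every row (one patient per row is expected).
    ids = [(row[id_column] or "").strip() for row in reader]
    if any(not value for value in ids):
        problems.append("blank patient identifier found")
    for value, count in Counter(ids).items():
        if value and count > 1:
            problems.append(f"identifier {value} appears in {count} rows")
    return problems


# Hypothetical sample data: P002 is repeated, so it is reported.
sample = "Patient_ID,GENDER,SITE\nP001,Male,A\nP002,Female,B\nP002,Unknown,A\n"
print(check_clinical_csv(sample, "Patient_ID"))
```

To check a real file, read its contents into a string first, for example with `open(path, newline="").read()`.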

Page 12:

Create a Study with Clinical Data 3. Create a New Study

To create a new study, follow the steps below:

1. In the Study Management section of the left sidebar, click Create New Study.

2. In the Create New Study dialog box that opens, provide a name and description for the study you are creating.

3. Click Save. This opens the Edit Study page.

Page 13:

Create a Study with Clinical Data 4. Edit Study Page

On the Edit Study page, verify the Name and Description of the study and modify them if needed.

Page 14:

Create a Study with Clinical Data 5. Add a Logo to the Study

Add a logo for the study:

1. Click the Browse button

2. Navigate to the file.

3. Click Upload Now.

4. Save the study.

Once a logo is added, the user may replace the image with a new one but cannot delete the logo from the study.

On the home page of the deployed study, the logo is displayed in the upper left, above the sidebar.

Page 15:

Create a Study with Clinical Data 6. Add the Clinical Data

To add or edit the clinical metadata, follow these steps:

To add a new file, click the Browse button in the Clinical Data Sources section (arrow 1). Navigate to the CSV-format data file and upload it.

Once a file is uploaded, caIntegrator2 opens the Define Fields for Clinical Data page. The user may also save the study now and edit the file later by selecting “Manage Studies” (arrow 2) and then selecting the study of interest (arrow 3) to open it.

Page 16:

Create a Study with Clinical Data 7. Define Fields for Clinical Data

There are 3 columns displayed on the Define Fields for Clinical Data page:

Field Definition: The field name in the caIntegrator2 database to which the column from the clinical data file is mapped.

Field Header from File: The column headers taken from the source CSV file.

Data from File: Up to three sample values from each column of the file.
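The last two columns of this page can be previewed locally before upload. The sketch below is illustrative only and assumes the samples are simply the values from the first three data rows; how caIntegrator2 actually picks its samples is not described here. The function name `preview_columns` and the sample rows are hypothetical.

```python
import csv
import io


def preview_columns(text, max_samples=3):
    """Map each column header to up to max_samples values from the first rows."""
    reader = csv.reader(io.StringIO(text))
    headers = next(reader, [])  # first row holds the column headers
    samples = {header: [] for header in headers}
    for row in reader:
        for header, value in zip(headers, row):
            if len(samples[header]) < max_samples:
                samples[header].append(value)
    return samples


# Hypothetical file contents with four patients.
text = "Patient_ID,SITE\nP1,A\nP2,B\nP3,A\nP4,C\n"
for header, values in preview_columns(text).items():
    print(header, values)
```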

Page 17:

Create a Study with Clinical Data 8. Defining a New Clinical Data File

There are two important steps required to ensure the successful loading of a clinical data file:

1. One column in the file must be designated as the unique “Identifier” column type.

2. Every column in the clinical study file must be mapped to a known column definition in the caIntegrator2 database.

Page 18:

Create a Study with Clinical Data 9. Column Annotation

There are no pre-defined column annotations in the caIntegrator2 database.

When a clinical study file is loaded for the first time, the user MUST review and define the column annotation for every column in the file.

Each column name needs to be defined only ONCE.

Once a column is defined, caIntegrator2 can “recognize” the same column header in other clinical study files and share the definition with them.

Page 19:

Create a Study with Clinical Data 10. Unmapped and Recognized Columns

The picture below shows a clinical study file just uploaded into caIntegrator2:

No unique identifier is assigned.
Some of the columns are new to the system and are not mapped to columns in caIntegrator2 (arrow 1).
Some of the columns are recognized by caIntegrator2 and automatically mapped to the known columns as displayed (arrow 2).
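The split between recognized and unmapped columns can be pictured as a lookup of each file header against the set of definitions already known to the system. This is only an illustration of the idea, assuming an exact match on the header text; the actual matching rule inside caIntegrator2 is not described in this tutorial. The function name `split_headers` is hypothetical, and the column names are the ones used later in this tutorial.

```python
def split_headers(headers, known_definitions):
    """Split file headers into (recognized, unmapped) by exact name match."""
    recognized = [h for h in headers if h in known_definitions]
    unmapped = [h for h in headers if h not in known_definitions]
    return recognized, unmapped


# SITE was defined earlier, so it is recognized; the other two are new.
headers = ["DC_STUDY_ID", "MICROARRAY", "SITE"]
known = {"SITE"}
recognized, unmapped = split_headers(headers, known)
print("recognized:", recognized)
print("unmapped:", unmapped)
```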

Page 20:

Create a Study with Clinical Data 11. Assign a Unique Identifier

In this study, we assign Patient_ID as the unique identifier of the study with the steps below:

1. Click “Assign Annotation Definition”.

2. Change the column type from “Annotation” to “Identifier”.

3. Click Save.

Page 21:

Create a Study with Clinical Data 12. Identify the Unmapped Columns

After the identifier is defined, we must ensure that ALL other fields have a field definition assigned.

In the picture below, DC_STUDY_ID and MICROARRAY are columns in the study that caIntegrator2 does not recognize.

SITE is a column that caIntegrator2 has recognized and automatically mapped to SITE.

We now map DC_STUDY_ID by clicking “Assign Annotation Definition”.

Page 22:

Create a Study with Clinical Data 13. Annotation Information

Annotation Information:

By default, the current annotation information is kept. The fields are shown as non-editable (arrow 1).

If a new annotation is desired, click the New button (arrow 2); the following fields then become editable:

Name, Definition, Keyword, Data Type

We now create a new definition by clicking “New”, as shown by arrow 2.

Page 23:

Create a Study with Clinical Data 14. Select Permissible Values

The permissible values are the values that are acceptable for this data field. For example, the permissible values for a patient’s GENDER are: Male, Female or Unknown.

When a file is first loaded, the permissible values panel might be blank (arrow 1). If the column header is “recognizable” by caIntegrator2, the system “guesses” the data type, and the values found in the newly uploaded file are initially displayed in the Non-permissible Values section.

Use the Add and Remove buttons to move values from one list to the other, as appropriate.

Click Save.
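To see ahead of time which values in a column would fall outside a chosen permissible list, a small local check is enough. The sketch below is an illustration under stated assumptions: the function name `non_permissible` and the sample column values are hypothetical, and the comparison is an exact, case-sensitive match, which may differ from how caIntegrator2 compares values. Only the GENDER list (Male, Female, Unknown) comes from this tutorial.

```python
def non_permissible(values, permissible):
    """Return the distinct values not in the permissible set, in first-seen order."""
    seen = []
    for value in values:
        if value not in permissible and value not in seen:
            seen.append(value)
    return seen


permissible_gender = {"Male", "Female", "Unknown"}
# Hypothetical column contents: "F" and "male" are not in the permissible list.
column_values = ["Male", "Female", "F", "Unknown", "male", "F"]
print(non_permissible(column_values, permissible_gender))
```

Values reported this way would either need to be corrected in the source file or added to the permissible list.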

Page 24:

Create a Study with Clinical Data 15. Map New Columns

The picture on the left shows the page for creating a new annotation definition for the column DC_STUDY_ID with the “New” button, as described in the previous slides.

An alternative to creating a new definition is to search for annotation definitions already present in caIntegrator2 studies or in caDSR. To do so, enter a keyword and click “Search” (see arrow).

Page 25:

Create a Study with Clinical Data 16. Search for the Definition

In this example, we search for a definition for “microarray”:

Enter “MICROARRAY” and click “Search”.
Matching annotation definitions are found in both caIntegrator2 and caDSR.
Choose one of the definitions by clicking “Select”.

Page 26:

Create a Study with Clinical Data 17. Finish the Definition

To proceed with the clinical data, make sure that every Field Definition entry on the Define Fields for Clinical Data screen has a unique entry:

One as an Identifier
The remainder as annotations

Click Done. This saves the current settings and description and returns to the “Edit Study” page.

Page 27:

Create a Study with Clinical Data 18. Load the Data Source

If the column definitions are complete, “Not Loaded” appears under Status (arrow 1). “Definition Incomplete” remains under Status if the definitions do not meet the requirements (slide 26).

Once “Load All Clinical” is pressed, the status changes to “Loaded”.

The “Loaded” status indicates that the clinical study data has been loaded into the caIntegrator2 staging area and can be mapped to data from other clinical, genomic, or image studies.
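The three status values can be summarized as a small decision rule. The sketch below is a simplified model of what this slide and slide 26 describe, not caIntegrator2 code: the function name `data_source_status` and the dictionary representation of the column definitions are hypothetical.

```python
def data_source_status(columns, loaded=False):
    """Model the Status column for a clinical data source.

    columns maps each file header to "Identifier", "Annotation",
    or None when the column has no definition yet.
    """
    types = list(columns.values())
    # Complete means: exactly one Identifier and no undefined columns.
    complete = types.count("Identifier") == 1 and all(t is not None for t in types)
    if not complete:
        return "Definition Incomplete"
    return "Loaded" if loaded else "Not Loaded"


columns = {"Patient_ID": "Identifier", "SITE": "Annotation", "DC_STUDY_ID": None}
print(data_source_status(columns))  # DC_STUDY_ID is still undefined

columns["DC_STUDY_ID"] = "Annotation"
print(data_source_status(columns))               # ready to load
print(data_source_status(columns, loaded=True))  # after "Load All Clinical"
```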

Page 28:

Create a Study with Clinical Data 19. Multiple Clinical Data Files

More than one clinical data file can be added to the same study. As long as the unique identifiers are defined correctly, caIntegrator2 can recognize the data from multiple clinical data files. For example, patients 1-20 can be in the first file, patients 21-40 in the second file, and so on.
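When a study is split across several files, it can be useful to know whether any patient identifier appears in more than one of them. This tutorial does not say how caIntegrator2 treats such a case, so the sketch below only reports the shared identifiers for review. The function name `shared_identifiers` and the file names are hypothetical.

```python
def shared_identifiers(files):
    """files maps a file name to its list of patient IDs.

    Returns a dict of each ID found in more than one file -> the files containing it.
    """
    first_seen = {}
    shared = {}
    for name, ids in files.items():
        for patient_id in set(ids):  # ignore repeats within a single file
            if patient_id in first_seen:
                shared.setdefault(patient_id, [first_seen[patient_id]]).append(name)
            else:
                first_seen[patient_id] = name
    return shared


# Patients 1-20 in the first file and 21-40 in the second: nothing is shared.
files = {
    "first.csv": [str(n) for n in range(1, 21)],
    "second.csv": [str(n) for n in range(21, 41)],
}
print(shared_identifiers(files))
```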

Page 29:

Create a Study with Clinical Data 20. Deploy the Study

Genomic and imaging data are not mandatory for the study. They can be added independently and/or at a later stage. Once the clinical data is loaded, the study should be saved (arrow 1) and can then be deployed (arrow 2). Upon deployment, caIntegrator2 commits the data from the file to the caIntegrator2 database.

Page 30:

Create a Study with Clinical Data 21. Find & Modify a Deployed Study

Once the study is deployed, it can be found in:

The drop-down list of My Studies
Manage Studies under Study Management

To open the study, click “Edit”:

Data files can be added, deleted, or replaced.
Column definitions can be modified.
“Deploy Study” must be applied again after the changes are completed.

Page 31:

Need More Information?

NCI is developing an extensive knowledge base to support various NCI molecular analysis tools. Visit us at NCI’s Molecular Analysis Tools Knowledge Center at: https://wiki.nci.nih.gov/x/R5GNAg

For more information on how to use caIntegrator2, visit the caIntegrator wiki: https://wiki.nci.nih.gov/x/c4b3Ag

Have a caIntegrator2-related question? Find the answers in the caIntegrator FAQ at https://wiki.nci.nih.gov/x/-pGNAg

Need more help? Post in the caIntegrator2 Forum at https://cabig-kc.nci.nih.gov/Molecular/forums/viewforum.php?f=6