Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for...

56
Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen

Transcript of Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for...

Page 1: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Vaccination data analyses with R and R commander

Lars T. Fadnes & Halvor SommerfeltCentre for International Health

University of Bergen

Page 2: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.
Page 3: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Aim for this session

• Introduce a brilliant tool

• Give examples on use of the tool

• Guide to further knowledge

Page 4: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

What is R?

• R is a free software environment that includes a set of base packages for graphics, math, and statistics.

• You can make use of specialized packages contributed by R users or write your own new functions.

Page 5: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Why R?

• Very powerful

• Developing extremely quickly

• Working on different platforms (not only Microsoft Windows…)

• Free of all costs

Page 6: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Why don’t all use R?

Page 7: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

What is R commander?

Page 8: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.
Page 9: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Why R commander?

• Powerful

• Free of all costs

• Working on different platforms (not only Microsoft Windows…)

• Easy to learn and to use…

Page 10: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

How to install R?For Windows:• http://cran.r-project.org/bin/windows/base/

– Easy• If here at UiB, the information department will fix it for you if you just

ask them to add it for you

For Linux (Ubuntu etc):- Very Easy

- Just search for R-base-core in Synaptic Package Manager and add it

- http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html

Page 11: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

How to install R commander?

- install.packages("Rcmdr", dependencies=TRUE)

Page 12: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Some things to note first

• R is case-sensitive – help, Help, HELP and HELF are different…– Recommendation:

• Choose one style and stick to it

• Avoid using shortcut keys in R commander!

• If it’s something you don’t know?– There are lot’s of good information on the web

• Particularly for R

Page 13: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

How to start?

• Open R– Load packages

• Rcmdr

or write

• library(Rcmdr)

Page 14: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Menu• File Menu:

– items for loading and saving script files; – for saving output and the R workspace; – and for exiting

• Edit Menu: – items (Cut, Copy, Paste, etc.) for editing the contents of the script and output

windows. – Right clicking in the script or output window also brings up an edit “context” menu.

• Data – Submenus containing menu items for reading and manipulating data.

• Statistics – Submenus containing menu items for a variety of basic statistical analyses.

Page 15: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Menu• Graphs

– Menu items for creating simple statistical graphs.

• Models – Menu items and submenus for obtaining numerical summaries, confidence intervals,

hypothesis tests, diagnostics, and graphs for a statistical model, and for adding diagnostic quantities, such as residuals, to the data set.

• Distributions – Probabilities, quantiles, and graphs of standard statistical distributions (to be used, for

example, as a substitute for statistical tables) and samples from these distributions.

• Tools – Menu items for loading R packages unrelated to the Rcmdr package (e.g., to access data

saved in another package), and for setting some options.• Help Menu

– items to obtain information about the R Commander (including this manual). As well, each R Commander dialog box has a Help button (see below).

Page 16: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

• Script Window – R commands generated by the R

Commander – You can also type R commands directly

into the script window or the R Console– The main purpose of the R

Commander, however, is to avoid having to type commands.

• Output Window – Printed output

• Messages Window– Displays error messages, warnings, and

notes

• Graphics Device window – When you create graphs, these will

appear in a separate window outside of the main R Commander window.

Page 17: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Easily available function:

Page 18: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.
Page 19: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Let’s get started…

• Change directory– Find the folder where

you placed your data file

• Import data– Give it the name: Vaccine

• Save workspace as– Give a name to your file

Page 20: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Data Types

• Vectors – including continuous variables

• Factors – Nominal/ categorical

• Matrices, arrays and data frames

• Lists

http://www.statmethods.net/input/datatypes.html

Page 21: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Define the datatypes

Variables in dataset• IDNO

– ID number: categorical = factor

• Adjuvant– Categorical = factor:

• 0: Placebo 1: Adjuvant

• Day0– Continuous (vector)

• Day30– Continuous (vector)

Page 22: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

– Variables are now defined as vectors

– Factors (categorical variables needs to be defined)

Page 23: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Now we’re ready to answer some scientific questions…

Page 24: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Vaccination response

• Effect of vaccination is often calculated by measuring antibodies before and after vaccination

– Response = concentration after vaccination concentration before vaccination

• Compute new variable: response – concentration before vaccination Day0– concentration after vaccination Day30

Page 25: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Does the new variable look reasonable?

• View data set

• Histogram

Page 26: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Summarize variable

• Calculate mean, median and standard deviation for– response

• for each group

• Numerical summaries

Page 27: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

• Placebo: adjuvant = no

• Adjuvant: adjuvant = yes• Mean, standard deviation, the percentiles and

number of cases

Page 28: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Doing calculations for subset group

Page 29: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

• Before making the ”Placebo” subset, remember to select main dataset (Vaccine)– You can now easily change between the datasets

Page 30: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Make a histogram of the response

• First for the adjuvant dataset

• Then for the adjuvant dataset

• And for the placebo dataset

– The histograms will be printed in the R window (not inside R commander)

– Right click on the graph and you can copy it as metafile to paste it into a document, print it or save it

Page 31: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Box plot

• Box plot for response– By AdjuvantCat– (for complete Vaccine dataset)

no yes

05

01

00

15

0

AdjuvantCat

resp

on

se

Page 32: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

• Does it look normally distributed?

Page 33: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

• We might need to do a logarithmic transformation of the variable – Compute new variable:

• logresponse

• Does it look more normally distributed (histogram)?

Page 34: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Check the logresponse

• Are the means and medians similar for both groups (placebo and adjuvant)?– Numerical summary (as earlier, but with logrespons)

• Check with an independent sample t-test– Adjuvant vs logresponse– Is the response different among those getting adjuvant?

Page 35: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

• Will this be confirmed with a robust test not assuming normal distribution?

• Check with a – non-paramethric

» two-sample wilcoxon test (log rank test)» Use the ’Exact’ test

– Will the test be different when using response or logresponse» Why or why not?

Page 36: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Recoding and making new variables

Different types of examples on how to recode in the box:– Data manage variables

• value = ”factor”– Factor can be either number or word

• value, value, value = ”factor”– Listed with comma

• value:value = ”factor”– From lowest to highest values

• else all other values• NA missing

Page 37: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.
Page 38: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

• How was the baseline immunological response (before vaccination)?– Variable: Day0

• Numerical summary

• Let us take a look into the upper quartile specifically– Make subsets:

• ”UpperQuartile”• ”OtherQuartiles”

Page 39: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Is baseline response associated with response after intervention?

• We can first repeat the t-tests for the subsets

• ”UpperQuartile”• ”OtherQuartiles”

– Independent sample t-test» logresponse» AdjuvantCat

Page 40: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

• What is the results?– What are the differences between the means

• What does it mean?

Page 41: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Is intervention associated with response?

• Linear regression analysis with the full dataset– Statistics Fit model Linear model

• Dependant variable (left box):– logresponse

• Independent variable:– AdjuvantCat

Page 42: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Is baseline response associated with response after intervention?

• Linear regression analysis with the full dataset– Statistics Fit model Linear model

• Dependant variable (left box):– logresponse

• Independent variables (right box with ’+’ between each):

– AdjuvantCat– UpperQuartileDay0

Page 43: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Is baseline response associated with response after intervention?

• Linear regression analysis with the full dataset– Statistics Fit model Linear model

• Dependant variable (left box):– logresponse

• Independent variables (right box with ’+’ between each):

– AdjuvantCat– Day0 (continuous)

Page 44: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

What will you conclude?

• Is adjuvant/placebo associated with response?

• Is initial immune response associated with response after vaccination?

Page 45: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.
Page 46: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

How to save?

• Save R workspace as…– This will save your data (in the R format)

• Save output as…– This will save your output– Another strategy is to cut and paste what you want to save

• Always save the commands– essential if you want to re-run the analyses later– WordPad is a better option than Word etc (does not autocorrect -

change to upper case etc)

Page 47: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

How to write and a command?

• Simply write the command in the script window, mark it and click ’Submit’ or press Ctrl+R

Page 48: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Nice to know:

• When writing comments in the syntax, start with the following sign #

• If you are uncertain about a function, use google or help(name-of-function)

Page 49: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Are there differenses in syntax between R and R commander?

• Some few:– Commands that extend over more than one line should have the second and

subsequent lines indented by one or more spaces or tabs; all lines of a multiline command must be submitted simultaneously for execution.

– Commands that include an assignment arrow (<-) will not generate printed output, even if such output would normally appear had the command been entered in the R Console [the command print(x <- 10), for example]. On the other hand, assignments made with the equals sign (=) produce printed output even when they normally would not (e.g., x = 10).

– Commands that produce normally invisible output will occasionally cause output to be printed in the output window. This behaviour can be modified by editing the entries of the log-exceptions.txt file in the R Commander’s etc directory.

– Blocks of commands enclosed by braces, i.e., {}, are not handled properly unless each command is terminated with a semicolon (;). This is poor R style, and implies that the script window is of limited use as a programming editor. For serious R programming, it would be preferable to use the script editor provided by the Windows version of R itself, or – even better – a programming editor.

Page 50: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

True or false quiz

• R was the program that gave their shareholders most profit last year?

• R commander only works for Ms Windows?

• R has even more functions than R commander?

• R is built by a few people with a secret source-code?

• There are a lot of enthusiastic people working with R providing help to their peers in the R forum?

Page 51: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Further reading:

• The R Commander A Basic-Statistics Graphical User Interface to R - John Fox 2005.pdf– http://www.jstatsoft.org/v14/i09/paper

• Getting started with the R Commander: a basic-statistics graphical user interface to R– http://socserv.mcmaster.ca/jfox/Getting-Started-with-the-Rcmdr.pdf

• Quick-R: magnificent guide– http://www.statmethods.net/

• http://cran.r-project.org/manuals.html

Page 52: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

R help forum:

• http://r.789695.n4.nabble.com/

Page 53: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.
Page 54: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

• You have learnt some basic skills and can now experiment with the program yourself

Page 55: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.

Questions and comments

Page 56: Vaccination data analyses with R and R commander Lars T. Fadnes & Halvor Sommerfelt Centre for International Health University of Bergen.