Calling C code from R an introduction Sigal Blay Dept. of Statistics and Actuarial Science Simon Fraser University October 2004

Calling C code from R

an introduction

Sigal Blay

Dept. of Statistics and Actuarial Science

Simon Fraser University

October 2004

Motivation:

• Speed

• Efficient memory management

• Using existing C libraries

The following functions provide a standard interface to compiled code that has been linked into R:

.C

.Call

.External

We will explore using .C and .Call with 7 code examples:

Using .C

I. Calling C with an integer vector

II. Calling C with different vector types

Using .Call

III. Sending R integer vectors to C

IV. Sending R character vectors to C

V. Getting an integer vector from C

VI. Getting a character vector from C

VII. Getting a list from C

And lastly, tips on creating an R package with compiled code

I.

Calling C with an integer vector

using .C

/* useC1.c */

void useC(int *i) {
    i[0] = 11;
}

The C function should be of type void.

The compiled code should not return anything except through its arguments.
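
The same convention extends to functions that compute a result: because the function returns void, the result is written into an extra argument. A minimal sketch in plain C; the function name vecSum and its arguments are hypothetical, chosen here for illustration, and do not appear in the original slides:

```c
/* vecSum.c: a hypothetical example of the .C calling convention */

/* Every argument is a pointer, because .C passes R vectors as C arrays.
   n points to the length of x; the sum is written into result[0]. */
void vecSum(double *x, int *n, double *result) {
    double total = 0.0;
    for (int k = 0; k < *n; k++) {
        total += x[k];
    }
    result[0] = total;   /* "return" the value through an argument */
}
```

From R, a call would look something like .C("vecSum", as.double(x), as.integer(length(x)), result = double(1))$result, once the file has been compiled and loaded.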

To compile the C code, type at the command prompt:

R CMD SHLIB useC1.c

The compiled code file name is useC1.so (useC1.dll on Windows).

In R:

> dyn.load("useC1.so")
> a <- 1:10   # integer vector
> a
 [1]  1  2  3  4  5  6  7  8  9 10
> out <- .C("useC", b = as.integer(a))
> a
 [1]  1  2  3  4  5  6  7  8  9 10
> out$b
 [1] 11  2  3  4  5  6  7  8  9 10

• Memory for the vectors passed to .C must be allocated in R, by creating vectors of the right length before the call.

• The first argument to .C is a character string of the C function name.

• The rest of the arguments are R objects to be passed to the C function.

• All arguments should be coerced to the correct R storage mode to prevent mismatching of types that can lead to errors.

• .C returns a list object.

• The second .C argument is given the name b. This name is used for the respective component in the returned list object (but not passed to the compiled code).

II.

Calling C with different vector types

using .C

/* useC2.c */

void useC(int *i, double *d, char **c, int *l) {
    i[0] = 11;
    d[0] = 2.333;
    c[1] = "g";
    l[0] = 0;
}

To compile the C code, type at the command prompt:

R CMD SHLIB useC2.c

to get useC2.so

To compile more than one C file:

R CMD SHLIB file1.c file2.c file3.c

to get file1.so (the output is named after the first file)

In R:

> dyn.load("useC2.so")
> i <- 1:10                         # integer vector
> d <- seq(length=3, from=1, to=2)  # real number vector
> c <- c("a", "b", "c")             # string vector
> l <- c(TRUE, FALSE)               # logical vector
> i
 [1]  1  2  3  4  5  6  7  8  9 10
> d
[1] 1.0 1.5 2.0
> c
[1] "a" "b" "c"
> l
[1]  TRUE FALSE
> out <- .C("useC", i1 = as.integer(i), d1 = as.numeric(d),
+           c1 = as.character(c), l1 = as.logical(l))
> out
$i1
 [1] 11  2  3  4  5  6  7  8  9 10

$d1
[1] 2.333 1.500 2.000

$c1
[1] "a" "g" "c"

$l1
[1] FALSE FALSE

• Other R objects can be passed to .C but it is better to use one of the other interfaces.

• With .C, the R objects are copied before being passed to the C code, and copied again to an R list object when the compiled code returns.

• Neither .Call nor .External copies its arguments.

• You should treat arguments you receive through these interfaces as read-only.
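
Treating arguments as read-only means that code which needs to change values should work on its own copy and hand that back. A sketch of the pattern in plain C, outside R's API; the function scaledCopy is hypothetical, and in a real .Call function the copy would be a freshly allocated R vector rather than malloc'd memory:

```c
#include <stdlib.h>

/* Returns a newly allocated array holding factor * in[k] for each k.
   The input is declared const, so the compiler rejects accidental writes.
   The caller owns the result and must free() it. Returns NULL on failure. */
double *scaledCopy(const double *in, size_t n, double factor) {
    double *out = malloc(n * sizeof *out);
    if (out == NULL) {
        return NULL;
    }
    for (size_t k = 0; k < n; k++) {
        out[k] = factor * in[k];
    }
    return out;
}
```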

Advantages to using .Call() instead of .C()

(Posted by Prof Brian Ripley on R-help, Jun 2004)

1) A lot less copying.

2) The ability to dimension the answer in the C code.

3) Access to other types, e.g. expressions, raw type and the ability to easily execute R code (call_R is a pain).

4) Access to the attributes of the vectors, for example the names.

5) The ability to handle missing values easily.

III.

Sending R integer vectors to C

using .Call

/* useCall1.c */

#include <R.h>
#include <Rdefines.h>

SEXP getInt(SEXP myint, SEXP myintVar) {
    int Imyint, n;   // integer variables
    int *Pmyint;     // pointer to an integer vector

    PROTECT(myint = AS_INTEGER(myint));

    Imyint = INTEGER_POINTER(myint)[0];
    Pmyint = INTEGER_POINTER(myint);
    n = INTEGER_VALUE(myintVar);

    printf(" Printed from C: \n");
    printf(" Imyint: %d \n", Imyint);
    printf(" n: %d \n", n);
    printf(" Pmyint[0], Pmyint[1]: %d %d \n",
           Pmyint[0], Pmyint[1]);

    UNPROTECT(1);
    return(R_NilValue);
}

• Rdefines.h is somewhat higher level than Rinternals.h, and is preferred if the code might be shared with S at any stage.

• SEXP stands for Simple EXPression

• myint is of type SEXP, which is a general type, hence coercion is needed to the right type.

• R objects created in the C code have to be reported using the PROTECT macro on a pointer to the object. This tells R that the object is in use so it is not destroyed.

• The protection mechanism is stack-based, so UNPROTECT(n) unprotects the last n objects which were protected. The calls to PROTECT and UNPROTECT must balance when the user's code returns.

• To work with real numbers, replace int with double and INTEGER with NUMERIC (for example AS_NUMERIC and NUMERIC_POINTER).
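
The stack discipline can be pictured with a toy model. The sketch below is plain C and is not R's implementation; toyProtect, toyUnprotect and toyDepth are hypothetical names that only mimic how PROTECT pushes a pointer and UNPROTECT(n) pops the most recent n:

```c
#include <stddef.h>

#define TOY_STACK_MAX 100

static void *toyStack[TOY_STACK_MAX];  /* pointers currently protected */
static int toyTop = 0;                 /* number of protected pointers */

/* Push a pointer, as PROTECT does; returns it so calls can be nested.
   A full stack returns NULL in this toy (R signals an error instead). */
void *toyProtect(void *obj) {
    if (toyTop >= TOY_STACK_MAX) {
        return NULL;
    }
    toyStack[toyTop++] = obj;
    return obj;
}

/* Pop the last n pointers, as UNPROTECT(n) does. */
void toyUnprotect(int n) {
    toyTop = (n > toyTop) ? 0 : toyTop - n;
}

/* Current stack depth; it should be the same before and after a call. */
int toyDepth(void) {
    return toyTop;
}
```

A function that protects three objects and then calls toyUnprotect(3) leaves the depth where it started, which is what "balanced" means here.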

In R:

> dyn.load("useCall1.so")
> myint <- c(1, 2, 3)
> out <- .Call("getInt", myint, 5)
 Printed from C:
 Imyint: 1
 n: 5
 Pmyint[0], Pmyint[1]: 1 2
> out
NULL

IV.

Reading an R character vector from C

using .Call

/* useCall2.c */

#include <string.h>
#include <R.h>
#include <Rdefines.h>

SEXP getChar(SEXP mychar) {
    char *Pmychar[5];   // array of 5 pointers to character strings

    PROTECT(mychar = AS_CHARACTER(mychar));

    // allocate memory (+ 1 for the terminating '\0'):
    Pmychar[0] = R_alloc(strlen(CHAR(STRING_ELT(mychar, 0))) + 1, sizeof(char));
    Pmychar[1] = R_alloc(strlen(CHAR(STRING_ELT(mychar, 1))) + 1, sizeof(char));

    // ... and copy mychar to Pmychar:
    strcpy(Pmychar[0], CHAR(STRING_ELT(mychar, 0)));
    strcpy(Pmychar[1], CHAR(STRING_ELT(mychar, 1)));

    printf(" Printed from C:");
    printf(" %s %s \n", Pmychar[0], Pmychar[1]);

    UNPROTECT(1);
    return(R_NilValue);
}
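
When copying a C string, the destination needs strlen() + 1 bytes, because strlen() does not count the terminating '\0' that strcpy() writes. A plain C sketch of that rule; copyString is a hypothetical helper, and malloc stands in for R_alloc so that the example compiles without R:

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of src, or NULL if allocation fails.
   The caller must free() the result. */
char *copyString(const char *src) {
    size_t len = strlen(src);        /* characters, excluding '\0' */
    char *dst = malloc(len + 1);     /* + 1 for the terminator */
    if (dst == NULL) {
        return NULL;
    }
    strcpy(dst, src);                /* copies len characters and '\0' */
    return dst;
}
```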

In R:

> dyn.load("useCall2.so")
> mychar <- c("do", "re", "mi", "fa", "so")
> out <- .Call("getChar", mychar)
 Printed from C: do re

V.

Getting an integer vector from C

using .Call

/* useCall3.c */

#include <R.h>
#include <Rdefines.h>

SEXP setInt() {
    SEXP myint;
    int *p_myint;
    int i, len = 5;

    // Allocating storage space:
    PROTECT(myint = NEW_INTEGER(len));
    p_myint = INTEGER_POINTER(myint);

    // a newly allocated integer vector is not initialized, so zero it:
    for (i = 0; i < len; i++) p_myint[i] = 0;

    p_myint[0] = 7;

    UNPROTECT(1);
    return myint;
}

// to work with real numbers, replace
// int with double and INTEGER with NUMERIC

In R:

> dyn.load("useCall3.so")
> out <- .Call("setInt")
> out
[1] 7 0 0 0 0

VI.

Getting a character vector from C

using .Call

/* useCall4.c */

#include <R.h>
#include <Rdefines.h>

SEXP setChar() {
    SEXP mychar;

    PROTECT(mychar = allocVector(STRSXP, 5));
    SET_STRING_ELT(mychar, 0, mkChar("A"));

    UNPROTECT(1);
    return mychar;
}

In R:

> dyn.load("useCall4.so")
> out <- .Call("setChar")
> out
[1] "A" ""  ""  ""  ""

VII.

Getting a list from C

using .Call

/* useCall5.c */

#include <R.h>
#include <Rdefines.h>

SEXP setList() {
    int *p_myint, i;
    double *p_double;
    SEXP mydouble, myint, list, list_names;
    char *names[2] = {"integer", "numeric"};

    // creating an integer vector:
    PROTECT(myint = NEW_INTEGER(5));
    p_myint = INTEGER_POINTER(myint);

    // ... and a vector of real numbers:
    PROTECT(mydouble = NEW_NUMERIC(5));
    p_double = NUMERIC_POINTER(mydouble);

    for (i = 0; i < 5; i++) {
        p_double[i] = 1/(double)(i + 1);
        p_myint[i] = i + 1;
    }

    // Creating a character string vector
    // of the "names" attribute of the
    // objects in our list:
    PROTECT(list_names = allocVector(STRSXP, 2));
    for (i = 0; i < 2; i++)
        SET_STRING_ELT(list_names, i, mkChar(names[i]));

    // Creating a list with 2 vector elements:
    PROTECT(list = allocVector(VECSXP, 2));

    // attaching myint vector to list:
    SET_VECTOR_ELT(list, 0, myint);

    // attaching mydouble vector to list:
    SET_VECTOR_ELT(list, 1, mydouble);

    // and attaching the vector names:
    setAttrib(list, R_NamesSymbol, list_names);

    UNPROTECT(4);
    return list;
}

• SET_VECTOR_ELT stands for Set Vector Element

In R:

> dyn.load("useCall5.so")
> out <- .Call("setList")
> out
$integer
[1] 1 2 3 4 5

$numeric
[1] 1.00000 0.50000 0.33333 0.25000 0.20000

If you are developing an R package:

Copy useC.c to myPackage/src/

So that users of the package do not have to load the compiled C code manually with dyn.load(), add a zzz.R file to myPackage/R.

zzz.R should contain the following code:

.First.lib <- function(lib, pkg) {
    library.dynam("myPackage", pkg, lib)
}

(Current versions of R no longer use .First.lib; packages instead declare useDynLib(myPackage) in their NAMESPACE file.)

If you are developing an R package (cont.),

modify the .C call: after the argument list to the C function, add PACKAGE="compiled_file".

For example, if your compiled C code file name is useC1.so, type:

.C("useC", b = as.integer(a), PACKAGE="useC1")

If you are using a Makefile, look at the output from R CMD SHLIB myfile.c for flags that you may need to incorporate in the Makefile.

Even if your R package passes 'R CMD check' cleanly:

• Try to compile your C code with 'gcc -pedantic -Wall' (the only warnings left should be ones you have a reason not to eliminate)

• Check the R code with 'R CMD check --use-gct' (it uses 'gctorture(TRUE)' when running examples/tests, and it's slow)

If you don't, CRAN will do that for you and will send you back to the drawing board.

This work has been made possible by the Statistical Genetics Working Group at the Department of Statistics and Actuarial Science, SFU.