R Tip – Directly Access the REDCap API from R

Note: As will be obvious, this example comes from a Windows XP platform.

I stumbled across this earlier and thought it would be a useful statistics tip to share with the world. Those of you working in health care research may already know about REDCap, which stands for Research Electronic Data Capture. It’s a project out of Vanderbilt University that’s designed to improve research by allowing easy creation of “databases”. I say that loosely because each project is basically a single table, with little in the way of relational structures. For more on REDCap, check out http://www.project-redcap.org/.

REDCap is free, but it’s not available at every institution. You might want to see if you can find it at your location. It’s a nice option for when you need a simple, reliable, and non-relational option to store data. It’s not a magic bullet by any means, but it’s definitely a nice tool to have in your “statistical / data management toolbox”.

REDCap has a pretty robust GUI front-end to enter data, export records, and log transactions. However, as you become a more advanced user, it’s possible to use the REDCap API to interface directly with the data. This can be a real time-saver, especially when you need to export data on a weekly basis to create reports. The API would allow this to be done by a script, which could be scheduled using Task Scheduler in Windows, or as a cron job in *nix.

It’s actually quite simple to do this using the R package RCurl, which I discuss here. For more on RCurl, check out http://cran.r-project.org/web/packages/RCurl/index.html. More on cURL generally is at http://curl.haxx.se/. Effectively, RCurl makes cURL and libcurl available in R. You can use the postForm() function from RCurl to access the API and get data into or out of R. Here is an example of a data export:

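library(RCurl)
# POST an export request to the REDCap API; the response is the CSV text
out <- postForm("https://redcap.url.org/redcap/api/",
                token="INSERT TOKEN HERE",  # API token for your project
                content="record",           # export records
                type="flat",                # one row per record
                format="csv",               # return the data as CSV
                # peer verification is disabled here; see point 3 below
                .opts=curlOptions(ssl.verifypeer=FALSE))
# write the CSV text to disk
write(out,file="C:/wherever/out.csv")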

Let me break that down a little bit for you. First, you use library(RCurl) to load the package RCurl, which you should be sure you’ve installed. Next, you create an object out by using the postForm() function from RCurl. Finally, you write that object out using a simple write(), which may need to be spruced up if your data is complex.
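If you would rather work with the export inside R than write it straight to disk, the returned CSV text can be parsed into a data frame. Here is a minimal sketch of that idea. The string below is a made-up stand-in for what postForm() returns, and the column names record_id, age, and site are hypothetical, so treat it as an illustration rather than something specific to your project:

```r
# Made-up stand-in for the CSV text returned by postForm();
# the columns record_id, age and site are hypothetical.
out <- "record_id,age,site\n1,34,A\n2,51,B\n"

# Parse the CSV text into a data frame without touching the disk.
dat <- read.csv(text = out, stringsAsFactors = FALSE)

# Inspect the column types that read.csv() guessed.
str(dat)
```

From there, dat can be summarized or written out with write.csv(), depending on what the report needs.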

A few places I ran into trouble on this:

1) Make sure you confirm that you have the correct token. You can get the token while in REDCap by going to your project, clicking “Other Functionality”, and then checking out the section on the API (useful in general if you’re going to use the API).

2) Ensure the URL starts with https://, not http://, as plain HTTP just ends up redirecting you (at least in our environment), which breaks the POST.

3) .opts=curlOptions(ssl.verifypeer=FALSE) is set because, by default, RCurl comes with no CA cert bundles at all. Actually, cURL in general is set this way, as you can see at http://curl.haxx.se/docs/sslcerts.html. Since it has no CA bundles to reference, it cannot verify the authenticity of a certificate offered by a host, and therefore SSL verification fails every time. Because I am working internally and because I was only downloading data, I was comfortable with disabling peer verification for this test. However, it is possible to identify the certificate you need and have RCurl refer to it, which is preferable.

I find that it’s easiest to go to the web interface for REDCap (https://redcap.url.org/, usually) and find the certificate there. Find whatever the top-level certificate is in the chain used to verify the site, and export that. (Be sure to use the PEM format.) Alternatively, export the site’s certificate, but ensure that the chain is exported too. In any case, place the exported .crt file in some location, and then use one of the cURL options to specify the path.

(These instructions are likely different on *nix boxes, but if you’re on a *nix box, I’m sure you can figure out what to do…)
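Before getting to the final code, the three trouble spots above can be caught early with a few checks in plain R. This is only a sketch: check_redcap_args() is a hypothetical helper, not part of RCurl or REDCap, and the token check assumes tokens are 32 hexadecimal characters, which you should confirm against your own installation.

```r
# Hypothetical helper: stop early on the three trouble spots listed above.
check_redcap_args <- function(url, token, cainfo) {
  # Assumption: tokens are 32 hexadecimal characters; adjust if yours differ.
  if (!grepl("^[0-9A-Fa-f]{32}$", token)) {
    stop("token does not look like a 32-character hexadecimal string")
  }
  # Plain http:// gets redirected, which breaks the POST.
  if (!grepl("^https://", url)) {
    stop("url must start with https://")
  }
  # The exported certificate has to exist where cainfo points.
  if (!file.exists(cainfo)) {
    stop("certificate file not found: ", cainfo)
  }
  invisible(TRUE)
}
```

Calling check_redcap_args() with the same URL, token, and certificate path you plan to hand to postForm() gives a readable error message instead of a failed request.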

The final code would be:

# cainfo points cURL at the exported PEM certificate, so peer
# verification stays on
out <- postForm("https://redcap.url.org/redcap/api/",
                token="INSERT TOKEN HERE",
                content="record",
                type="flat",
                format="csv",
                .opts=curlOptions(cainfo="C:/path/aaa.crt"))

This will allow you to export while maintaining SSL security, which is important – especially if you start using the API to WRITE data, and not just read it.
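For the scheduled weekly exports mentioned earlier, it may help to wrap the call in a function that writes each export to a dated file, so one run does not overwrite the last. The sketch below is one way to do that under a few assumptions: export_path() and export_redcap() are hypothetical names, the redcap_export_ file prefix is arbitrary, and the postForm() arguments are simply the ones used above.

```r
# Hypothetical helper: build a dated output path, e.g.
# "C:/wherever/redcap_export_2024-01-31.csv".
export_path <- function(out_dir, date = Sys.Date()) {
  file.path(out_dir,
            sprintf("redcap_export_%s.csv", format(date, "%Y-%m-%d")))
}

# Hypothetical wrapper around the postForm() call shown above.
# It needs the RCurl package installed and a reachable REDCap server.
export_redcap <- function(url, token, cainfo, out_dir) {
  out <- RCurl::postForm(url,
                         token = token,
                         content = "record",
                         type = "flat",
                         format = "csv",
                         .opts = RCurl::curlOptions(cainfo = cainfo))
  path <- export_path(out_dir)
  write(out, file = path)
  invisible(path)  # return the file path without printing it
}
```

A script that calls export_redcap() can then be run with Rscript from Task Scheduler or cron.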

So, there you have it – a simple way to access data in REDCap using R, without using the GUI frontend. Definitely useful for scripting and other automated data management. Let me know if it helps!
