Ensuring the quality and correctness of statistical software, and of scientific software in general, is one of the main responsibilities of scientific software developers and of scientists who provide code to solve a specific computational task. Sometimes these tasks are mission critical. For example, in drug trials, clinical research or the design of an aviation-related component, a wrong outcome could put human lives at risk. ACM award holder John Chambers, creator of the S language, which inspired the R project, stated in his brilliant book Software for Data Analysis: Programming with R:
Both the data analyst and the software provider therefore have a strong responsibility to produce a result that is trustworthy, and, if possible, one that can be shown to be trustworthy.
One of the tools for fulfilling this responsibility in practice is implementing unit tests for the functionality of our R code, usually in a package setting. The aim of unit tests is to ensure that the installed code, or the code in use, produces what it is designed for at the level of the smallest individual unit, i.e., functions. The RUnit package on CRAN provides a set of tools to achieve this. A set of assertions can be tested to see if they are all TRUE, leading to a passing test. Instead of talking about what coverage of unit tests is sufficient and what to test in the first place, I would like to ask a different question: what happens if our assertions on the outcome of the function are based on the data? For example, a numerical value that represents a quantity extracted from data by the function under test. Normally, we hard code that numeric value in our unit tests. But if we have lots of different data inputs, we may end up writing what are essentially multiple copies of the same unit test. One way to avoid this is to use what is generally called parametrised unit testing (PUT). In this post I would like to use PUT in a strategy mixed with meta-programming. See my recent post on how to do basic meta-programming with R.
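To make the duplication concrete, here is a minimal sketch of the hard-coded style. It uses two small made-up vectors rather than real data, and a base-R helper called checkClose, which is a hypothetical stand-in for RUnit's checkEquals so that the snippet runs without any packages installed.

```r
# Hypothetical base-R stand-in for an RUnit-style assertion:
# stops with an error on mismatch, returns TRUE otherwise.
checkClose <- function(computed, expected, tolerance = 1e-6) {
  stopifnot(isTRUE(all.equal(computed, expected, tolerance = tolerance)))
  TRUE
}

# Hard-coded style: one nearly identical test function per data set,
# each with its own expected value typed into the body.
testSdFirst <- function() {
  checkClose(sd(c(1, 2, 3, 4, 5)), 1.581139)
}

testSdSecond <- function() {
  checkClose(sd(c(2, 4, 4, 4, 5, 5, 7, 9)), 2.13809)
}

testSdFirst()
testSdSecond()
```

The two functions differ only in their data and in the expected number, which is exactly the repetition that a parametrised approach tries to remove.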
The essential idea of PUT is to write a function in R that produces the unit tests on the fly, based on parameters such as the name of the data and the expected value of the function. Let's look at one trivial example that demonstrates this idea. Imagine that we would like to check the volatility of a stock price in a given time interval. A naive way of doing this is to compute the standard deviation of the stock price using R's own sd function, so we are unit testing the sd function here. Remember that our unit test must not contain any numerical value hard coded in its body, since that is exactly what we would like to avoid. Let's use the following as a 'template' unit test function:
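    # Template test: compares the computed standard deviation of vec
    # with the expected value passed in as a parameter.
    vecSD <- function(vec, expectedSD) {
      sdComputed <- sd(vec)
      checkEquals(sdComputed, expectedSD, tolerance = 1e-6)
    }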
This is where the meta-programming exercise comes into the picture. Let's assume we would like to check the volatility of Google, Microsoft and Apple stock prices from the beginning of 2005 till the end of 2012. In other words, we need to re-write the vecSD function with 3 different parameter sets. An important note: we cannot use vecSD itself as a unit test function. Recall that unit test functions do not take any arguments; they are only executed to see whether they pass or raise an error, cf. the RUnit vignette. We need to generate 3 unit test functions that re-write vecSD using the Google, Microsoft and Apple stock prices. Let's make this a generic exercise and write a function that generates n such functions:
    vecSDmeta <- function(vecList, expectedSDvec) {
      nFunctions <- length(expectedSDvec)
      for (i in seq_len(nFunctions)) {
        fName <- paste("vecSDunit", i, sep = "")
        print(fName)
        # Build a zero-argument test function with the data and the expected
        # value substituted into its body, then define it in the caller's
        # environment under the generated name.
        assign(fName,
               eval(
                 substitute(
                   function() {
                     sdComputed <- sd(vec)
                     checkEquals(sdComputed, expectedSD, tolerance = 1e-3)
                   },
                   list(vec = vecList[[i]], expectedSD = expectedSDvec[i])
                 )
               ),
               envir = parent.frame()
        )
      }
    }
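The substitution step can be looked at in isolation. The following is a small sketch, not part of the generator itself: the names template and sdTest are made up for illustration, the input is a toy vector, and base R's all.equal is used in place of checkEquals so that it runs without RUnit.

```r
# substitute() returns an unevaluated call in which the symbols vec and
# expectedSD have been replaced by the values supplied in the list.
template <- substitute(
  function() isTRUE(all.equal(sd(vec), expectedSD, tolerance = 1e-6)),
  list(vec = c(1, 2, 3, 4, 5), expectedSD = sqrt(2.5))
)

# At this point template is still a call object; eval() turns it into
# an actual function.
sdTest <- eval(template)

length(formals(sdTest))  # 0: the generated function takes no arguments
body(sdTest)             # the data and the expected value sit inside the body
sdTest()                 # TRUE
```

Because the values are baked into the body, the generated function needs no arguments, which is what a test runner expects from a unit test function.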
We have used two functions provided by R, substitute and assign. One could use body as well, but the substitution would have to be performed similarly. Let's use this function with the real data mentioned above. First we get the stock price data using quantmod and put it in a form suitable for calling vecSDmeta:
    library(quantmod)
    library(RUnit)

    # Download the daily prices into a dedicated environment
    data.env <- new.env()
    getSymbols(c("MSFT", "GOOG", "AAPL"), src = "yahoo", env = data.env,
               from = "2005-01-01", to = "2012-12-31")

    # The first column of each series holds the opening prices
    openGoog <- data.env$GOOG[, 1]
    openMS   <- data.env$MSFT[, 1]
    openAP   <- data.env$AAPL[, 1]

    vecList <- list(
      vGoog = as.vector(openGoog),
      vMS   = as.vector(openMS),
      vAP   = as.vector(openAP)
    )

    # Expected standard deviations, in the same order as vecList
    expectedSDvec <- c(126.5066, 3.391194, 169.4987)
Now if we run vecSDmeta, 3 new unit test functions will be defined automatically. If we then run those generated unit test functions, they should all return TRUE; otherwise you have a problem with either the volatility value or the data itself.
    > vecSDmeta(vecList, expectedSDvec)
    [1] "vecSDunit1"
    [1] "vecSDunit2"
    [1] "vecSDunit3"
    > vecSDunit1()
    [1] TRUE
    > vecSDunit2()
    [1] TRUE
    > vecSDunit3()
    [1] TRUE
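To see what a failing case looks like without downloading any data, here is an offline sketch in the same spirit. Everything in it is an assumption made for illustration: makeSdTests, sdUnit1, sdUnit2 and runOne are hypothetical names, the vectors are toy data, the second expected value is deliberately wrong, and stopifnot with all.equal stands in for checkEquals.

```r
# Generator in the style of vecSDmeta, writing into a chosen environment.
makeSdTests <- function(vecList, expectedSDvec, envir = parent.frame()) {
  force(envir)
  for (i in seq_along(expectedSDvec)) {
    assign(paste0("sdUnit", i),
           eval(substitute(
             function() {
               stopifnot(isTRUE(all.equal(sd(vec), expectedSD,
                                          tolerance = 1e-3)))
               TRUE
             },
             list(vec = vecList[[i]], expectedSD = expectedSDvec[i])
           )),
           envir = envir)
  }
}

testEnv <- new.env()
makeSdTests(list(c(1, 2, 3, 4, 5), c(2, 4, 4, 4, 5, 5, 7, 9)),
            c(1.581139, 99),   # 99 is deliberately wrong
            envir = testEnv)

# Run every generated test and record whether it passed or raised an error.
runOne <- function(name) {
  tryCatch({
    get(name, envir = testEnv)()
    "pass"
  }, error = function(e) "fail")
}
outcome <- vapply(ls(testEnv), runOne, character(1))
print(outcome)  # sdUnit1 "pass", sdUnit2 "fail"
```

Keeping the generated functions in their own environment makes it easy to list and run them in one go, at the cost of not defining them directly in the workspace as vecSDmeta does.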
Note that the vecSDunitX functions take no parameters and are available only after vecSDmeta has been executed, which makes them suitable for use with RUnit. One nice extension to this exercise would be to add a parameter that gives the file path of the data, if it is read from a file. In a more complex case, a description file that maps parameters and data to the unit test generator could be established, for example using XML, YAML or JSON formats for the descriptions. Of course, the approach shown here appears quite primitive compared to the templates provided by C++. Moreover, the symbolic manipulation we performed using substitute can be extended and put into a general framework, similar to .NET PUT, so that we would only provide the template function.
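As a rough sketch of where such an extension could go, the following generator takes a template function plus a description table and produces one zero-argument test per row. It is only an illustration under several assumptions: makeUnitTests, checkClose, the column names and the toy data sets are all hypothetical, the description is given as inline CSV text read with base R instead of an XML, YAML or JSON file, and checkClose is a base-R stand-in for RUnit's checkEquals.

```r
# Base-R stand-in for an RUnit-style assertion (hypothetical helper).
checkClose <- function(computed, expected, tolerance = 1e-6) {
  stopifnot(isTRUE(all.equal(computed, expected, tolerance = tolerance)))
  TRUE
}

# Template test: takes parameters and contains no hard-coded numbers.
vecSD <- function(vec, expectedSD) {
  checkClose(sd(vec), expectedSD)
}

# Description table: one row per test. Inline text here; it could
# equally be read from a file path.
description <- read.csv(
  text = paste(c("name,dataset,expectedSD",
                 "small,a,1.581139",
                 "spread,b,2.13809"), collapse = "\n"),
  stringsAsFactors = FALSE
)

# Toy data sets, looked up by the name given in the description.
datasets <- list(a = c(1, 2, 3, 4, 5), b = c(2, 4, 4, 4, 5, 5, 7, 9))

# For each row, build a zero-argument function whose body calls the
# template with that row's parameters substituted in.
makeUnitTests <- function(template, description, datasets) {
  envir <- new.env()
  for (i in seq_len(nrow(description))) {
    params <- list(vec = datasets[[description$dataset[i]]],
                   expectedSD = description$expectedSD[i])
    testFun <- function() NULL
    body(testFun) <- substitute(do.call(f, p),
                                list(f = template, p = params))
    assign(paste0("test.", description$name[i]), testFun, envir = envir)
  }
  envir
}

tests <- makeUnitTests(vecSD, description, datasets)
results <- vapply(ls(tests), function(n) get(n, envir = tests)(), logical(1))
print(results)
```

In this sketch the parameter names vec and expectedSD are still fixed inside the generator; a more general version would derive them from the description columns.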
In summary, the technique I have described above can save a lot of time if you deal with testing lots of data, especially in an often-changing code base. You can be more confident that the code does not do funny things when you use many different datasets. The take-home message is: if you are testing your functions with data, don't use hard-coded unit tests; those unit tests will surely not last long if you have incoming datasets to test and the code base is under constant review.