
Thursday, 13 June 2013

Practicing static typing in R: Prime directive on trusting our functions with object oriented programming

John Chambers, the creator of the S language from which R is derived, wrote in his book Software for Data Analysis: Programming with R:
...This places an obligation on all creators of software to program in such a
way that the computations can be understood and trusted. This obligation
I label the Prime Directive.
He was referring to the Prime Directive from Star Trek. One practice in this direction is to have proper checks in place for the types we use. Then we can trust that if we pass, for example, a wrong type to our function, it will fail gracefully. So the type system of a programming language is quite important in mission-critical numerical computations. Since R is a dynamically typed language, similar to Perl, Python or Matlab/Octave, most R users rarely, if ever, place type checks in their functions. For example, take the following function, which takes a matrix, a vector and a function as arguments. It applies the function to each column of the matrix listed in the given vector. Assuming the function returns a single number, our function will return a vector of numbers.

myMatrixOperation  <-  function(A, v, fName) {
  sliceA <-  A[, v];
  apply(sliceA, 2, fName);
}
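
To see what happens without any checks, here is a minimal sketch that calls the unguarded function once with valid input and once with a plain vector in place of the matrix. The input values are made up for illustration only.

```r
# Unguarded version, as defined above
myMatrixOperation <- function(A, v, fName) {
  sliceA <- A[, v]
  apply(sliceA, 2, fName)
}

# Illustrative input: a 2 x 3 matrix with columns (1,2), (3,4), (5,6)
A <- matrix(1:6, nrow = 2, ncol = 3)
colMeans13 <- myMatrixOperation(A, c(1, 3), mean)
print(colMeans13)  # means of columns 1 and 3: 1.5 and 5.5

# Passing a plain vector instead of a matrix still fails, but the
# message comes from the subsetting internals, not from our function
msg <- tryCatch(
  myMatrixOperation(1:6, c(1, 3), mean),
  error = function(e) conditionMessage(e)
)
print(msg)
```

The call does stop with an error, but the message says nothing about which argument was wrong, which is what explicit type checks are meant to improve.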

One obvious way is to put an if statement for each argument in our function. The function may then look like:

myMatrixOperation <- function(A, v, fName) {
  if(!is.matrix(A)) {
   stop("A is not a matrix");
  }
  if(!is.vector(v)) {
   stop("v is not a vector");
  }
  if(!is.function(fName)) {
   stop("fName is not a function");
  }
  sliceA <- A[, v];
  apply(sliceA, 2, fName);
}
The problem with this approach is that it is too verbose: if the same pattern of arguments repeats across many functions, we end up copying and pasting the same checks many times. It not only looks ugly but also wastes our time. Luckily, there is a mechanism to address this: the S4 class system. Let's define an S4 class for our set of arguments, followed by an example instantiation.

setClass("mySlice", representation(A="matrix", v="vector", fName="function"))

myS <- new("mySlice", A=matrix(rnorm(9),3,3), v=c(1,2), fName=mean)
str(myS)
Formal class 'mySlice' [package ".GlobalEnv"] with 3 slots
  ..@ A    : num [1:3, 1:3] 0.356 -0.34 -0.642 -0.466 2.915 ...
  ..@ v    : num [1:2] 1 2
  ..@ fName:function (x, ...)

Now we can rewrite the function so that it uses our S4 class and type-checks only the single object passed in:
is.mySlice <- function(obj) {
  # TRUE for mySlice objects (and objects of classes extending it)
  is(obj, "mySlice")
}

myMatrixOperation <- function(mySliceObject) { 
  if(!is.mySlice(mySliceObject)) { 
    stop("argument is not of class mySlice")
  }  
  sliceA <- mySliceObject@A[, mySliceObject@v]; 
  apply(sliceA, 2, mySliceObject@fName); 
} 
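
Part of the appeal of the S4 approach is that new() checks the declared slot classes when the object is built, so a wrongly typed argument is rejected before our function ever sees it. The following is a minimal sketch of that behaviour. The validity rule on v is a hypothetical addition for illustration and is not part of the class defined above.

```r
library(methods)

setClass("mySlice", representation(A = "matrix", v = "vector", fName = "function"))

# Hypothetical extra rule, not in the post: column indices must exist in A
setValidity("mySlice", function(object) {
  if (!is.numeric(object@v) || any(object@v < 1) || any(object@v > ncol(object@A))) {
    return("v must hold valid column indices of A")
  }
  TRUE
})

# Wrong slot type: A is a character string, not a matrix
typeError <- tryCatch(
  new("mySlice", A = "not a matrix", v = c(1, 2), fName = mean),
  error = function(e) conditionMessage(e)
)

# Right types, but column 5 does not exist in a 3 x 3 matrix
rangeError <- tryCatch(
  new("mySlice", A = matrix(rnorm(9), 3, 3), v = c(1, 5), fName = mean),
  error = function(e) conditionMessage(e)
)

print(typeError)
print(rangeError)
```

Both calls stop at construction time, so any function that receives a mySlice object can rely on its slots having the declared types.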
This simple example demonstrates how we can introduce good organization into our R code in a way that obeys the Prime Directive. A further, more modern approach to object orientation, called Reference classes, was also introduced by John Chambers. If you practice this kind of approach in your R code, then I can only say: Live long and prosper.

Wednesday, 10 October 2012

Matlab/Octave: imagesc with variable colorbar

Plotting a discrete matrix with a colour map is a common task in many fields of science. In MATLAB, one possibility is to use imagesc, which handles discrete points and the placement of (0,0) better than other density plotting utilities. However, if the matrix entries take only a limited number of values, you may want to keep only the existing ones on the colorbar labels. One way to achieve this is to rebuild your 'colormap' from 'jet' so that it contains only the colours of the values present in the current matrix. This alone is not sufficient: you need to re-write your matrix as well, replacing each entry with its index among the values present. See the example code below, which handles this so-called variable 'colormap' situation. (This is tested with MATLAB; on Octave, I think one needs to be careful with 'colormap', as the values returned by 'jet' behave a little differently.)

% bare minimum - example -
clear all
close all
allLevelNames   = {'one', 'two', 'three', 'four'};
levelValues     = [1 2 3 4];
myMatrix        =  [3 3 3 3; 3 3 3 3; 4 4 1 1; 4 4 1 1];
% Now re-write colormap
aV              = length(levelValues);
jj              = jet(aV);
availableValues = unique(myMatrix);
getIndexes      = arrayfun(@(xx) find(levelValues == xx), availableValues); 
colormap(jj(getIndexes,:)); 
% Re-write myMatrix in the availableValues range 
myMatrix = arrayfun(@(x) find(availableValues == x), myMatrix);
% Now  Plot
h    = imagesc(myMatrix);

% Handle the case where more than one color is in use
maxV = max(availableValues); 
minV = min(availableValues); 
if(length(availableValues) > 1) caxis([minV maxV]); end; 
hcb=colorbar; 
set(hcb, 'YTick', availableValues); 
set(hcb, 'YTickLabel', allLevelNames(getIndexes));
set(gca,'YDir','normal') % Not to invert x-y but putting (0,0) on the bottom left 
xlabel('x values');
ylabel('y values (care on (0,0) location from imagesc)');
title({['Variable colormap handling'];['by msuzen at gmail']});





Saturday, 14 May 2011

Operating in Multiple Repositories within Ant

An Ant script cannot change its base directory after it is invoked. For that reason, if you are handling two repositories under the same directory, it would be wise to write a secondary wrapper script.

Friday, 8 April 2011

Ant task with wildcard expansion for svn.

Ant is a popular tool among Java developers, but one can also use it for other generic tasks. For example, using svn as a task inside an Ant script is provided by a library. However, wildcard expansion won't work under this task. One way to overcome this is to use an exec task as follows (on Windows):
<exec executable="cmd.exe">
  <arg line="/C svn --force delete pathtofiles\*.dat --username=username --password=password"/>
</exec>

Tuesday, 12 January 2010

GNU R Garbage Collection

In GNU R, if you delete your objects with

rm(list=ls())

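this action on its own does not immediately free the memory associated with those objects; the memory is reclaimed only when the garbage collector next runs. R runs the collector automatically as needed, and it also provides a utility to trigger a collection by hand; simply apply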

gc()

This frees allocations that are no longer in use. If you would like a collection to happen on (nearly) every memory allocation, apply this:

gctorture(on = TRUE)

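Note that gctorture is intended for tracking down memory protection bugs and makes R run very slowly, so switch it off again with gctorture(on = FALSE) afterwards. Consult the R Internals manual for further details.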
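
As a small illustration of the rm() and gc() steps described in this post, here is a minimal sketch. The object name and its size are arbitrary choices for the example, and the numbers that gc() reports will differ from session to session.

```r
# Illustrative object: about 8 MB of doubles (size chosen arbitrarily)
bigVector <- numeric(1e6)
print(object.size(bigVector), units = "MB")

# Remove the binding; the memory becomes eligible for collection
rm(bigVector)
stillThere <- exists("bigVector")

# Force a collection; gc() returns a usage matrix with rows Ncells and Vcells
usage <- gc()
print(usage)
```

Comparing the output of gc() before and after removing a large object is a simple way to see how much memory was released.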
(c) Copyright 2008-2024 Mehmet Suzen (suzen at acm dot org)

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License