Wednesday, 7 May 2014

Euclid Algorithm for Set of Integers: 'Reduce' vs. trees in R

The Euclid Algorithm provides a solution to the greatest common divisor (GCD) of two natural numbers $x_{1}$ and $x_{-2}$, denoted by $GCD(x_{1}, x_{2})$. This will produce the largest integer that divides $x_{1}$ and $x_{2}$. Solution is proposed by Euclid of Ancient Greece. This can be formulated as a recurrence relation:
$$x_{k} = x_{k-1}  modulo  x_{k-2}.$$
'modulo' binary operation returns the remainder from a given division. Stopping criterion for the recurrence relation is reached when $x_{k-2}=0$ and the result of GCD will be the current value of $x_{k-1}$. This process can be visualised as successive divisions. Let's implement this in R in a naive way.

# Naive Euclid algorithm by msuzen
gcd <- function(a, b)
{
  rk_1 <- a;
  rk_2 <- b;
  # Recurrence Formula:  r_k =  r_k-1 modulo r_k-2
  # Increment k until r_k-2 == 0 
  while(rk_2 != 0) {
    rk      <- rk_1%%rk_2; # remainder
    rk_1    <- rk_2;       # proceed in recurrence
    rk_2    <- rk;
  }
  return(rk_1)
}

This is a straight forward task. Let's make the problem little more generic. What happends if we would like to know GCD of $n$ natural numbers, $x_{1},..., x_{n}$? Than, a solution is to apply GCD operation pairwise, for example if $n=3$:
$$GCD(x_{1}, GCD(x_{2}, x_{3})) = GCD(GCD(x_{1}, x_{2}), x_{3})$$
How can we implement this for a vector of non-negative integers?

Tree Approach

The simplest way to reach GCD of $n$ numbers is probably thinking of this process as a binary tree, formed by pairing elements of set of integers as we obtain GCDs. It is relatively easy to implement this because ordering of pairs is not important. We can start from the beginning and pair up as we obtain the results. Here is the naive implementation.

1
2
3
4
5
6
7
8
gcdN <- function(X) {
  n      <- length(X)
  gcdEnd <- X[1]
  for(i in 2:n) {
     gcdEnd <- gcd(gcdEnd, X[i])
  } 
  return(gcdEnd)
}

'Reduce' operation 

So called tree approach we have given above is actually noting but a Reduce operation in the context of MapReduce. The function gcdN can be replaced with a single line.
Reduce("gcd", X)
This example looks trivial but having Reduce and friends in our programming toolbox makes our life a little easier. Noting so novel here! But it could saves us time not to implement tree like algorithm.