Chapter 14 Speed: Thinking in Vectors

R is interpreted at runtime, making it relatively slow compared to compiled or semi-compiled languages such as C++ or Java. To compensate for that, most functions are implemented in a faster language (often C++) and R just calls these functions to do the job. If this feature is exploited well, R might turn out to be actually quite fast.

In this section we will discuss how to make R code run fast by thinking in vectors (or matrices) rather than loops. If you are familiar with other programming languages this may feel strange at first. And if you are new to programming know at least this: it is an R thing.

14.0.1 Loops are slow

To measure the benefit of thinking in vectors, let us first introduce a way to measure execution time. This is done using the function system.time().

> system.time(runif(10^6))
   user  system elapsed 
  0.018   0.002   0.020

The function system.time() returns three measures of time:

The total user time the CPU spent on the task.
The total system time the CPU spent on the task.
The total time that elapsed from start to end.

Usually, the elapsed time is much larger than the CPU time as the CPU is also used by other processes on the system or may need to wait for some data to be loaded. The difference between the user and system time is a bit arbitrary and OS specific, but that is not of concern here.

What is of concern, is that generating $10^{6}$ random numbers using a for loop will be much slower than the runif(10^6) call above, regardless of the measure used:

> getRand <- function(n){
+   x <- numeric(n)
+   for(i in 1:n){
+     x[i] <- runif(1)
+   }
+ }
> system.time(getRand(10^6))
   user  system elapsed 
  2.326   0.377   2.725

Why is that? It is not due to the random number generation per se as both calls generate $10^{6}$ of those. Instead, the difference is due to R intrinsics: the for loop makes $10^{6}$ calls to the function runif(), while in the first case there is a single call. And in many cases, function calls are much slower than the actual mathematical problem to solve.

Here is another example.

> x <- runif(10^6)
> system.time(x*x)
   user  system elapsed 
  0.001   0.000   0.002 
> system.time(for(i in seq_along(x)){x[i]*x[i]})
   user  system elapsed 
  0.039   0.000   0.039

You might be tempted to think that the loop version also needs to increment the variable i. However, any implementation of the piece-wise product x*x will also need to do that. The difference is therefore only whether this happends in R or “inside” the function, i.e. in a faster language.

What that means for us as programmers: avoid loops in R whenever possible. Instead, try to execute commands on whole vectors (or matrices).

Since many R functions take vectors as arguments, this possible for many cases by just being a little creative. Here is an example to generate a 10x1000 matrix of 10 data sets (rows) that are all drawn from a normal distribution but with means 0, 1, …, 9 and standard deviations 1, 2, …, 10.

> X <- matrix(rnorm(10*1000, mean=rep(0:9, 1000), sd=rep(1:10, 1000)), nrow=10)
> mean(X[2,])
[1] 1.044023
> sd(X[5,])
[1] 4.858478

The trick here is to provide the matching mean and sd for each element of the matrix in one call, rather than using a loop. So, let’s be creative!

14.0.2 Exercises: Loops are slow

See Section 18.0.36 for solutions.

A particularly expensive function is exp(). Generate a vector $x$ of $10^{6}$ uniform random numbers $x_{i} \sim U (0, 1)$ to see how much slower exp(x) is compared to x+x when these functions were called on the entire vector.
Write a function that takes a vector and returns its sum $\sum_{i} x_{i}$ using a for loop. Time your function using the same vector $x$ from above and compare it to using sum(x).
Write a function similar to that above to calculate $\sum_{i} \exp (x_{i})$ using a for loop. Is it much slower than the function returning just the sum from the question above?
Generate a 100x10 matrix with 100 data sets (columns) each drawn from a Poisson distribution but with different $λ = 1, 2, \dots, 10$ (one per column).