Chapter 18 Solutions to Exercises

18.0.1 Arithmetic Operators

Corresponding exercises: see Section 2.0.1

```
1+2+3  
```
```
1/(1+2)  
```
```
2^(1+2)  
```

18.0.2 Functions

Corresponding exercises: see Section 2.1.2

Type ?round or help(round) into the console. Purpose: rounds the values in its first argument to the specified number of decimal places. Arguments: 1) x = number that should be rounded. 2) digits = how many decimal places (how many numbers after the comma should remain). Default: the second argument (digits) has default 0.

round(1.23456789)  
round(1.23456789, 3)  
round(1.23456789, 7)

```
5.7  
5.7  
5.7  
1  
```
Searching in the help pages does not help much. Try googling something like “R binomial coefficient”.
choose(5,3)

18.0.3 Variables

Corresponding exercises: see Section 2.2.2

```
value_1 <- 6.7  
value_2 <- 56.3  
```
```
sqrt(value_1)  
sqrt(value_2)  
```

x <- ((2*value_1)/value_2)+(value_1*value_2)

18.0.4 R-Scripts

Corresponding exercises: see Section 3.0.1

#-------------------------------  
# Experimenting with logarithms  
#------------------------------

# the data  
my_data <- pi

# different bases of the logarithm  
base_1 <- 10  
base_2 <- exp(1)

# print the logarithm of my_data for the different bases  
print(log(my_data, base=base_1))  
print(log(my_data, base=base_2))

Execute your script using the buttons “Run” and “Source”, or the corresponding shortcuts Ctrl + Enter and Ctrl+ Shift+ Enter, respectively.

18.0.5 Introduction to vectors

Corresponding exercises: see Section 4.0.4

x <- c(-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5)

```
c(-6, x, 6)  
```

vec1 <- c(-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5)  
vec2 <- c(6, 7, 8, 9, 10)  
vec3 <- c(vec1, vec2)  
print(vec3)  
length(vec3)

```
-5:5  
10:6  
```

vec1 <- rep(1:5, times = 3)  
vec2 <- rep(1:5, each = 3)  
vec3 <- rep(1:5, times = 1:5)  
vec4 <- rep(1:5, each = 3, times = 2)

vec1 <- seq(0, 5, by=0.5)  
vec2 <- seq(1, 8, length.out=6)  
vec3 <- seq(10, -0.4, by=-0.8)

```
x <- -5:5  
x[8]  
```

x[c(2,4,6,8,10)]  
# or  
x[seq(2,10,2)]

x[9:11] <- c(30, 40, 50)  
# or  
x[(length(x)-2):length(x)] <- c(30, 40, 50)

```
x <- numeric(5)  
x[1:5] <- 1:5  
```

x <- -5:5  
x[c(-2,-4,-6,-8,-10)]  
# or  
x[seq(-2, -10, -2)]

18.0.6 Numeric vectors

Corresponding exercises: see Section 4.1.2

a <- -5:5  
b <- 10:0  
a-b  
a+b  
a*b

```
sum(c(a, b))  
```
```
min(a)  
max(a)  
mean(a)  
```
```
(a - mean(a)) / sd(a)  
```

a * c(1, 1, 5)  
# You get a warning message because recycling expects that the longer vector is a multiple of the shorter vector. Here, a has 11 elements, which is not a multiple of 3.

18.0.7 Logical vectors

Corresponding exercises: see Section 4.2.1

```
sum(y > x)  
```
```
x[x %% 2 == 0] = 1  
```
```
x[x >= 0 & x %% 2 == 1]  
```
```
x[x < -10 | x > 13]  
```

x[rep(c(F, F, F, T), length.out=length(x))] <- 1

18.0.8 Character vectors

Corresponding exercises: see Section 4.3.1

paste0(c("A", "B"), 1:100, collapse = " ")

paste(paste0("A", 1:100), paste0("B", 1:100), collapse = " ")

paste(hello, world, bang)  
paste0(hello, world, bang)  
paste0(paste(hello, world), bang)  
paste0(paste(hello, world, sep = "-"), bang)  
paste0(paste(hello, world), paste0(rep(bang, 3), collapse = ""))  
paste0(paste(hello, world), rep(bang, 3), collapse = " ")

18.0.9 Factors

Corresponding exercises: see Section 4.4.1

```
Levels: 3 4 5 7 8  
```

months <- c("july", "december", "july", "july", "may", "december", "july", "september", "may", "october", "december", "december", "august", "may", "september", "december", "september")  
fMonths <- factor(months)  
print(fMonths)  
str(fMonths)  
# There are 6 levels: august december july may october september.

18.0.10 Sorting and shuffling vectors

Corresponding exercises: see Section 4.6.1

x <- c(3, 4, -6, 2, -7, 2, -1, 0)  
sort(x)  
sort(x, decreasing = T)

x <- c(3, 4, -6, 2, -7, 2, -1, 0)  
sort(x[1:3])

v <- seq(8,37,length.out=98)  
V <- sample(v,10)  
sort(V, decreasing = TRUE)

18.0.11 Creating matrices

Corresponding exercises: see Section 5.1.1

```
matrix(1, nrow=5, ncol=10)  
```
```
diag(1:2, nrow=10)  
```

A <- matrix(c(1:10, 10:1), nrow=2, byrow=TRUE)

```
A <- rbind(1:10,10:1)  
```

b <- 1:80  
B <- matrix(b, ncol=5)  
k <- nrow(B)

A <- matrix(2, ncol=3, nrow=3)  
diag(A) <- 1

18.0.12 Accessing Matrix Elements

Corresponding exercises: see Section 5.2.1

A <- matrix(1:16, nrow=4, byrow=TRUE); B <- A[-4,]

```
sum(A[2,])  
sum(A[,3])  
```

# The trick is to use the  
A[which(A + 1 < (A/3)^2)] <- -1

18.0.13 Matrix Arithmetic

Corresponding exercises: see Section 5.3.1

Z <- matrix(c(-1,1,1,-1), nrow=2); Z %*% t(Z)

matrix(1:4, nrow=2, byrow=TRUE) %*% matrix(c(-1,1), nrow=2, ncol=3, byrow=TRUE)

solve(matrix(c(1,-2,-1,-3,0,1,7,1,0), nrow=3))

E <- matrix(1:4, ncol = 2, nrow = 2)  
G <- matrix(5:8, ncol = 2, nrow = 2)  
E  
G  
E * G  
E / G  
E %*% G  
E + G  
E - G  
E == G

18.0.14 Data Frames

Corresponding exercises: see Section 6.1.1

x <- c(5:0)  
y <- c(0:5)  
d <- data.frame(x,y)

w <- c(10:6)  
z <- c(6:10)  
e <- data.frame(z,w)  
names(e)<-names(d)  
d <- rbind(d,e)

```
d <-cbind(d,compared=c(x>y,z>w))  
```

18.0.15 Accessing data frames

Corresponding exercises: see Section 6.2.1

d <- data.frame(x=1:10,y=c(1,10,-2,3,8,-7,2,1,9,4))  
d[d$x==d$y,]

mean((d$y/d$x)[(length(d$x)-4):length(d$x)])

18.0.16 Manipulating data frames

Corresponding exercises: see Section 6.3.1

id <- paste(c("X"),1:10,sep="")  
x <- sample(-10:10, 10, replace=TRUE)  
y <- sample(-10:10, 10, replace=TRUE)  
data <- data.frame(id,x,y)  
data$y <- data$y^2

data <- cbind(data, ok = data$y > 20)  
other <- data[which(data$ok & x>3), c("id", "y")]

18.0.17 Lists

Corresponding exercises: see Section 7.3.1

holiday <- list(country=c("CH", "USA", "Japan"), continent=c("Europe", "The Americas", "East Asia"))

x <- list(holiday=holiday, place=c("mountain", "beach"), day=as.numeric(1:20), like=c(TRUE,FALSE))

```
 x[[1]][[1]][2]  
```

# There are two solutions:  
x$holiday[[2]][1]  
# or  
x[[1]]$continent[1]

```
x[[c(1,1,3)]]  
```
```
name <- "day"; x[[name]]  
```
```
x[c(2,4)]  
```

Chapters <- c("Chapter 3. R-Scripts","Chapter 4. Vectors","Chapter 5. Matrices", "Chapter 6. Data Frames")  
strsplit(Chapters,"\\. ")[[c(4,2)]]

18.0.18 Conditionals

Corresponding exercises: see Section 8.1.1

x <- 6  
if(x %% 2 == 0){  
  x <- x + 1  
}  
print(x)

x <- c(1,4,3)  
if(x[1]<x[2]&&x[2]<x[3]){ print("increasing") }else{ print("not increasing") }

x <- 1:5; y <- 3:-1  
if(length(x) > length(y)){  
  print("x")  
} else if(length(y) > length(x)){  
  print("y")  
} else {  
  print("equal")  
}

18.0.19 Loops

Corresponding exercises: see Section 8.2.3

x <- sample(10)  
for(i in 2:length(x)){  
if(x[i] < x[i-1]){  
    print("Vector is not strictly increasing!")  
  }  
}

v <- c()  
for(i in 1:10){  
  for(j in 1:i){  
     v <- c(v, i)  
  }  
}  
# Note that the inner loop could be replaced by `rep(i, i)`.

m <- matrix(1:120, nrow=3)  
v <- integer(dim(m)[2])  
for(i in 1:dim(m)[2]){  
  v[i] <- sum(m[,i])  
}

x <- 1  
while(x <= 10){  
  print(paste("I like number", x))  
  x <- x + 1  
}

F <- numeric(100)  
F[1] <- 0  
F[2] <- 1  
for(i in 3:100){  
  F[i] <- F[i-1] + F[i-2]  
}

x <- sample(100); y <- sample(100)  
for(i in 1:length(x)){  
  if(x[i] > y[i]){  
    print(x[i]*y[i])  
  } else if(x[i] < y[i]){  
    print(x[i]+y[i])  
  }  
}

a <- c(5,6,8,2,3,7,8,1,2,3,4)  
i <- 1  
while(a[i] >= 2){  
  print(a[i])  
  i <- i + 1  
}

a <- matrix(data=NA,nrow=3,ncol=3)  
for(i in 1:nrow(a)){  
  for(j in 1:ncol(a)){  
      a[i,j] <- i+j  
   }  
}

i <- 1; s <- i  
while(s < 5000000){  
  i <- i + 1  
  s <- s * i  
}  
print(i)

18.0.20 Basic Plotting

Corresponding exercises: see Section 9.0.2

x <- seq(-5, 5, length.out=20)  
y <- x^2*2  
plot(x, y, type='b', col='red', pch=19, lty=2, xlab="My variable x", ylab="Some transformation of x", main="The plot of a noob")

par(mar=c(2,2,0,0))  
plot(x, y, type='l', col=colors()[sample(length(colors()), 1)], lwd=10, xlab="", ylab="")

x <- rep(1:26, each=26)  
y <- rep(1:26, 26)  
plot(x, y, pch=15, col=colors(), cex=1.6)

#There are at least two solutions:  
plot(1:100, pch=19, col=c(rep("orange", 6), "red"))  
# or  
plot(1:100, pch=19, col=c("orange", "red")[1 + (1:100 %% 7 == 0)])

18.0.21 Plotting

Corresponding exercises: see Section 9.1.6

x <- seq(1, 100, by=0.5)  
d <- data.frame(x=x, y=sqrt(x), z=log(x))  
plot(d$x, d$y, type='l', col='red', xlab="x", ylab="A transformation of x")  
lines(d$x, d$z, col='blue', lty=2)  
legend('bottomright', legend=c("sqrt(x)", "log(x)"), col=c('red', 'blue'), lty=c(1,2))

plot(0, type='n', xlim=c(-10, 10), ylim=c(-1,1), xlab="x", ylab="y")  
abline(v=0, col='red', lty=1)  
abline(h=0, col='red', lty=1)  
text(c(5, -5, -5, 5), c(0.5, 0.5, -0.5, -0.5), labels=c("I", "II", "III", "IV"))

av.high <- c(32, 34, 34, 32, 32, 30, 32, 30, 30, 30, 32, 32)  
av.rain <- c(16, 31, 104, 131, 161, 155, 193, 225, 192, 199, 76, 27)  
par(las=1, mar=c(4,4,0.5,4)) # las changes the orientation of the labels  
# of the axis. The plot becomes more readable this way  
plot(1:12, av.high, ylim=c(0, max(av.high)), type='b', col='red', pch=19, xlab="Month", ylab="Average temperature high [°C]")  
scale <- max(av.rain) / max(av.high)  
lines(1:12, av.rain / scale, type='b', col='blue', pch=17)  
labels <- pretty(av.rain)  
axis(side=4, at=labels / scale, labels=labels)  
mtext("Average rainfall [mm]", side=4, line=3, las=3)  
legend('bottom', bty='n', col=c('red', 'blue'), lwd=1, pch=c(19, 17), legend=c("Average temperature", "Average rainfall"))

18.0.22 Plotting Multiple Pannels

Corresponding exercises: see Section 9.2.1

x <- seq(1, 100, by=0.5)  
d <- data.frame(x=x, y=sqrt(x), z=log(x))  
par(mfrow=c(1,2))  
plot(d$x, d$y, type='l', col='red', xlab="x", ylab="A transformation of x", main="sqrt(x)")  
plot(d$x, d$z, col='blue', lty=2, xlab="x", ylab="A transformation of x", main="log(x)")

x <- 0:100  
bases <- 2:13  
par(mfrow=c(3,4))  
for(i in 1:length(bases)){  
  plot(x, log(x, base=bases[i]), col=colors()[i*5], pch=i, xlab="x", ylab="log(x)")  
  text(par("usr")[1], par("usr")[3] + 0.1*(par("usr")[4]-par("usr")[3]), paste("base =", bases[i]), pos=2)  
}

layout(matrix(c(1,1,2,3,4,4), nrow = 3, ncol = 2, byrow = TRUE))  
Colors <- c("yellow","orange","blue","black")  
for(i in 1:4){  
 plot(x,a,pch=16,col=Colors[i])  
}

18.0.23 Probability distributions

Corresponding exercises: see Section 10.0.4

```
dexp(0.15, 2)  
```
```
dpois(3, 2)  
```
```
qnorm(0.99, mean=5, sd=4)  
```
```
dbeta(0.72, 0.7, 0.7)  
```
```
qnorm(1-0.3)  
```
```
rpois(100, lambda=2)  
```
```
rexp(13, rate=0.5)  
```

x <- rnorm(1000)  
sum(x < -1)/length(x)  
# theoretical expectation  
pnorm(-1)  
# with more random draws, we get closer to theoretical expectation  
x <- rnorm(100000)  
sum(x < -1)/length(x)

x <- rbinom(1000, size = 10, prob = 0.1)  
sum(x > 2)/length(x)  
# theoretical expectation: take inverse (we are interested in >2)  
1 - pbinom(2, size = 10, prob = 0.1)

height <- seq(140, 200, length.out = 1000)  
females <- dnorm(height, 164.7, 7.1)  
males <- dnorm(height, 178.4, 7.6)

plot(height, females, col = 'dodgerblue', type = "l", ylab = "density")  
lines(height, males, col = 'orange2')

myHeight <- 175  
abline(v = myHeight, col = 'dodgerblue', lty = 2)  
dnorm(myHeight, 164.7, 7.1) # for females

height <- seq(140, 200, length.out = 1000)  
females <- pnorm(height, 164.7, 7.1)  
males <- pnorm(height, 178.4, 7.6)

plot(height, females, col = 'dodgerblue', type = "l", ylab = "cumulative density")  
lines(height, males, col = 'orange2')

myHeight <- 175  
abline(v = myHeight, col = 'dodgerblue', lty = 2)  
pnorm(myHeight, 164.7, 7.1) # for females

quantiles <- c(0.05, 0.32, 0.5, 0.68, 0.95)  
femalesQuantiles <- qnorm(quantiles, 164.7, 7.1)  
malesQuantiles <- qnorm(quantiles, 178.4, 7.6)

# plot probability density function  
height <- seq(140, 200, length.out = 1000)  
females <- dnorm(height, 164.7, 7.1)  
males <- dnorm(height, 178.4, 7.6)

plot(height, females, col = 'dodgerblue', type = "l", ylab = "density")  
lines(height, males, col = 'orange2')

# add lines  
abline(v = femalesQuantiles, col = 'dodgerblue', lty = 4)  
abline(v = malesQuantiles, col = 'orange2', lty = 4)

lambda <- c(0.5, 1, 1.5)  
color <- c("orange", "purple", "cadetblue")  
par(mfrow=c(2,1), mar=c(4,4,1,0.1))  
x <- seq(0, 5, length.out=100)  
# plot probability density function  
plot(0, type='n', xlim=range(x), ylim=c(0, 1.5), xlab="x", ylab="P(x)")  
for(l in 1:length(lambda)){  
  lines(x, dexp(x, rate=lambda[l]), col=color[l])  
}  
legend('topright', bty='n', col=color, lwd=1, legend=lambda)  
# plot cumulative distribution function  
plot(0, type='n', xlim=range(x), ylim=c(0, 1), xlab="x", ylab="P(X<=x)")  
for(l in 1:length(lambda)){  
  lines(x, pexp(x, rate=lambda[l]), col=color[l])  
}  
legend('bottomright', bty='n', col=color, lwd=1, legend=lambda)

x <- numeric(100)  
for(i in 2:100){  
  x[i] <- rnorm(1, x[i-1], 1)  
}  
plot(x, type='b', pch=19, col='purple', xlab="index", ylab="x")

nRep <- 10000  
net <- numeric(nRep)  
for(r in 1:nRep){  
  # start of one game  
  invest <- 1  
  expenses <- invest  
  # simulate first round  
  win <- runif(1) < 18/37  
  # simulate all subsequent rounds  
  while(!win & invest <= 50){  
      invest <- 2 * invest  
      expenses <- expenses + invest  
      win <- runif(1) < 18/37  
  }  
  if(win){  
    # we won -> invested money is doubled  
    net[r] <- 2 * invest - expenses  
  } else {  
    # we lost -> no income, only expenses  
    net[r] <- 0 - expenses  
  }  
}  
sum(net>0)/length(net)  
mean(net)

# The net benefit is very likely to be negative, since the probability of winning is < 0.5 and since we cut the investment at 100 CHF.

18.0.24 Empirical Distributions

Corresponding exercises: see Section 10.1.3

x <- rbinom(1000, size=5, prob=0.2)  
par(mfrow=c(1,2))  
hist(x)  
plot(density(x))  
# the histogram represets the discrete nature of the data

x <- seq(0, 20, length.out=100)  
plot(x, dnorm(x, mean=10, sd=2), col='red', type='l')  
n <- 10^(1:4)  
for(i in 1:length(n)){  
  lines(density(rnorm(n[i], mean=10, sd=2)), lty=i)  
}  
legend('topleft', lty=1:length(n), legend=n)

data("iris")

# plot histograms  
par(mfrow = c(1,3))  
hist(iris$Petal.Length[iris$Species == "setosa"], col = "dodgerblue", xlab = "petal length", main = "setosa")  
hist(iris$Petal.Length[iris$Species == "versicolor"], col = "orange2", xlab = "petal length", main = "versicolor")  
hist(iris$Petal.Length[iris$Species == "virginica"], col = "firebrick", xlab = "petal length", main = "virginica")

# plot kernel  
par(mfrow = c(1,1))  
plot(density(iris$Petal.Length), ylim = c(0, 2.6), main = "Distribution of petal length")  
lines(density(iris$Petal.Length[iris$Species == "setosa"]), col = "dodgerblue")  
lines(density(iris$Petal.Length[iris$Species == "versicolor"]), col = "orange2")  
lines(density(iris$Petal.Length[iris$Species == "virginica"]), col = "firebrick")  
legend("topright", legend = c("global", "setosa", "versicolor", "virginica"), col = c("black", "dodgerblue",  "orange2", "firebrick"), lty = rep(1, 4))

18.0.25 Data Input and Export

Corresponding exercises: see Section 11.0.1

```
getwd()  
```
Create a directory using any means your file system offers. Then switch there using setwd() or with RStudio. Check where you are with getwd().

18.0.26 Exporting and Importing

Corresponding exercises: see Section 11.1.6

d <- data.frame(x=rnorm(100), y=rnorm(100), z=rnorm(100))  
write.table(d, file="my_data_frame.txt", row.names=FALSE)  
d <- read.table("my_data_frame.txt", header=TRUE)

```
read.csv2("your_file.txt")  
```

d <- read.csv2(url("https://www.bfs.admin.ch/bfsstatic/dam/assets/12647632/appendix"), stringsAsFactors=FALSE)  
d <- d[-1,]  
plot(d$pop1930, d$pop2018, xlab="1930", ylab="2018")

18.0.27 Listing files

Corresponding exercises: see Section 11.2.1

length(list.files(path="~/Downloads",pattern="*.pdf"))/length(list.files(path="~/Downloads"))*100

18.0.28 Exporting graphics

Corresponding exercises: see Section 11.3.1

x <- rnorm(100)  
y <- 4 + 1.5*x + rnorm(100, sd=sqrt(0.2))  
pdf("x_vs_y.pdf")  
par(mar=c(4,4,0.1,0.1))  
plot(x, y)  
dev.off()

pdf("x_vs_y.pdf", width=2, height=2)  
par(mar=c(4,4,0.1,0.1))  
plot(x, y)  
dev.off()

jpeg("x_vs_y_low.jpeg", width=480, height=480, pointsize=12)  
par(mar=c(4,4,0.1,0.1))  
plot(x, y)  
dev.off()  
jpeg("x_vs_y_high.jpeg", width=1280, height=1280, pointsize=12*1280/480)  
par(mar=c(4,4,0.1,0.1))  
plot(x, y)  
dev.off()

18.0.29 RStudio Workspace

Corresponding exercises: see Section 11.4.1

18.0.30 Installing packages

Corresponding exercises: see Section 12.0.1

install.packages("RColorBrewer")  
library(RColorBrewer)  
x <- seq(-4, 4, by = 0.1)  
numSds <- 10  
cols <- brewer.pal(numSds, "Spectral")  
# generate empty plot  
plot(0, type = "n", xlim=range(x), ylim=c(0,4), xlab="x", ylab = "density")  
for (i in 1:numSds){  
  y <- dnorm(x, 0, i/numSds)  
  lines(x, y, col = cols[i])  
}

18.0.31 Writing Functions

Corresponding exercises: see Section 13.0.1

square <- function(x){  
  return(x*x)  
}

convertFahrenheitToCelsius <- function(temp_F) {  
  temp_C <- (temp_F - 32) * 5/9  
  return(temp_C)  
}

calculateSampleVariance <- function(x){  
  n <- length(x)  
  res <- 1/(n-1) * sum((x - mean(x))^2)  
  return(res)  
}  
calculateSkew <- function(x){  
  n <- length(x)  
  nominator <- 1/(n-2) * sum((x - mean(x))^3)  
  denominator <- var(x)^(3/2)  
  return(nominator / denominator)  
}

calculateBothNA <- function(x, y){  
  return(sum(is.na(x) & is.na(y)))  
}

18.0.32 Input to functions

Corresponding exercises: see Section 13.1.3

takePower <- function(x, y = 2){  
  return(x^y)  
}

18.0.33 Checking values

Corresponding exercises: see Section 13.2.1

convertKelvinToFahrenheit <- function(temp_K) {  
  if (temp_K < 0) stop("Kelvin can not have negative values!")  
  temp_F <- temp_K * 9/5 - 459.67  
  return(temp_F)  
}

18.0.34 Return Values

Corresponding exercises: see Section 13.3.4

computeFactorial <- function(n){  
  if (n == 0)  
    return(1)  
  else {  
    prod <- 1  
    for (p in 1:n){  
      prod <- p * prod  
    }  
    return(prod)  
  }  
}

computeFactorial <- function(n){  
  if (n == 0)  
    return(1)  
  else  
    return(prod(1:n))  
}

simulateDiceNum6 <- function(){  
  # first simulate  
  simul <- sample(1:6, 100, replace=T)  
  # now count number of 6's  
  num6 <- 0  
  for (i in simul){  
    if (i == 6)  
      num6 <- num6 + 1  
  }  
  return(list(simulation = simul, num6 = sum))  
}

simulateDiceNum6 <- function(){  
  # first simulate  
  simul = sample(1:6, 100, replace=T)  
  # now count number of 6's  
  num6 <- sum(simul == 6)  
  return(list(simulation = simul, num6 = sum))  
}

fizzbuzz <- function(x){  
  if (x %% 3 == 0 & x %% 5 == 0)  
    return("fizzbuzz")  
  else if (x %% 3 == 0)  
    return("fizz")  
  else if (x %% 5 == 0)  
    return("buzz")  
  else return(x)  
}

18.0.35 Environment

Corresponding exercises: see Section 13.4.1

10

18.0.36 Loops are slow

Corresponding exercises: see Section 14.0.2

x <- runif(10^6)  
system.time(exp(x))/system.time(x+x)

mySum <- function(x){  
  s <- 0  
  for(i in 1:length(x)){  
    s <- s + x[i]  
  }  
  return(s)  
}  
system.time(mySum(x))  
system.time(sum(x))

myExpSum <- function(x){  
  s <- 0  
  for(i in 1:length(x)){  
    s <- s + exp(x[i])  
  }  
  return(s)  
}  
system.time(myExpSum(x))

# As you will notice, calculating the sum of exp(x) takes about twice as long as calculating the sum of x when using a for loop.  
# When using functions on vectors (first exercise), it took about 4 times as long.  
# This suggests, that the majority of the calculation time is now caused by the for loop, not the mathematical function used.

m <- matrix(rpois(100*10, lambda=rep(1:10, each=100)), ncol=10)

18.0.37 Apply

Corresponding exercises: see Section 14.1.6

lapply(sample(1:1000, size=100, replace=TRUE), rep, 10)

X <- matrix(rnorm(11*1000, mean=rep(-5:5, 1000), sd=rep(seq(0.1, 2, length.out=11), 1000)), nrow=11)  
apply(X, 1, mean)  
apply(X, 1, sd)  
# Note that since these are simulations, the means and standard deviations should be close to those used, but not necessarily identical.

means <- rowMeans(X)  
sds <- apply(X, 1, sd)  
X <- sweep(X, 1, means, "-")  
X <- sweep(X, 1, sds, "/")  
apply(X, 1, mean)  
apply(X, 1, sd)  
# Note that the means may be just very close to zero (e.g. 10^-15) but not actually zero. This is due to numerical issues when dealing with very small numbers and perfectly normal.

sumBelow <- function(x, a){  
  return(sum(x[x<a]))  
}  
apply(X, 1, sumBelow, a=0)  
apply(X, 1, sumBelow, a=1)

par(mfrow=c(2,2))  
apply(X[4:7,], 1, hist)

18.0.38 Functional programming

Corresponding exercises: see Section 14.2.4

A <- matrix(rnorm(100*100), nrow=100)  
sum_of_squares <- function(x){  
  return(sum(x*x))  
}  
apply(A, 1, sum_of_squares)

```
apply(A, 1, function(x) sum(x*x))  
```

my_apply <- function(x, f){  
  res <- numeric(nrow(x))  
  for(i in 1:nrow(x)){  
    res[i] <- f(x[i,])  
  }  
  return(res)  
}  
my_apply(A, function(x) sum(x*x))

apply_all <- function(x, funcs){  
  res <- matrix(NA, nrow=nrow(x), ncol=length(funcs))  
  for(i in 1:nrow(x)){  
    for(f in seq_along(funcs)){  
      res[i,f] <- funcs[[f]](x[i,])  
    }  
  }  
  return(res)  
}  
apply_all(A, list(function(x) sum(x*x), function(x) sum(x)^2))

18.0.39 S3 classes

Corresponding exercises: see Section 15.5.1

circle <- function(radius){  
  if (radius < 0)  
    stop("Radius must be positive!")

  c <- list(radius = radius)

  class(c) <- "circle"  
  return(c)  
}

smallCircle <- circle(0.01)  
largeCircle <- circle(1000)  
# invalidCircle <- circle(-1)

print.circle <- function(x){  
  output <- paste0("Here's a circle with radius = ", x$radius, ".")  
  print(output)  
}

# call print  
print(smallCircle)  
print(largeCircle)

# Note that you should never call the S3 function itself (e.g. never call print.circle(smallCircle))!

plot.circle <- function(x){  
  grid.circle(r = x$radius)  
}

# call plot  
plot(smallCircle)  
plot(largeCircle) # the circle is so large that it doesn't fit the window, so don't worry if you can't see it

# This is nice, isn't it? Now we can plot circles with plot().  
# No one needs to remember a complicated name for plotting a circle.

# generic function  
perimeter <- function(x){  
  UseMethod("perimeter", x)  
}

# S3 function  
perimeter.circle <- function(x){  
  return(2 * pi * x$radius)  
}

# We need to write the generic function perimeter() because this is not a built-in  
# generic function of R.  
# On the other hand, print() and plot() are built-in generic functions, so we didn't need  
# to write them again.

# generic function  
area <- function(x){  
  UseMethod("area", x)  
}

# S3 function  
area.circle <- function(x){  
  return(pi * x$radius * x$radius)  
}

# small circle  
perimeter(smallCircle)  
area(smallCircle)

# large circle  
perimeter(largeCircle)  
area(largeCircle)

# constructor for class rectangle  
rectangle <- function(width, height){  
  if (width < 0) stop("Width must be positive!")  
  if (height < 0) stop("Height must be positive!")

  r <- list(width = width, height = height)

  class(r) <- "rectangle"  
  return(r)  
}

# create some objects class rectangle  
smallRec <- rectangle(0.01, 0.05)  
largeRec <- rectangle(50, 100)  
# invalidRec <- rectangle(-1, 0.01)

# S3 function for print  
print.rectangle <- function(x){  
  output <- paste0("Here's a rectangle with width = ", x$width, " and height = ", x$height, ".")  
  print(output)  
}

# call print  
print(smallRec)  
print(largeRec)

# S3 function for plot  
plot.rectangle <- function(x){  
  grid.rect(x$width, x$height)  
}

# call plot  
plot(smallRec)  
plot(largeRec) # the rectangle is so large that it doesn't fit the window, so don't worry if you can't see it

# S3 function for perimeter (Note: we don't need to define the perimeter() generic function again!)  
perimeter.rectangle <- function(x){  
  return(2 * (x$width + x$height))  
}

# S3 function for area (Note: we don't need to define the area() generic function again!)  
area.rectangle <- function(x){  
  return(x$width * x$height)  
}

# check if perimeter() and print() are working!  
perimeter(smallRec)  
area(smallRec)

perimeter(largeRec)  
area(largeRec)

print() is a generic function and will call UseMethod("print", x). UseMethod() will then search for a print method for class “mammal”. However, we haven’t defined such a method. It will therefore call the print.default() method, which will do some default printing.

print.mammal <- function(x, ...){  
  print(paste0("This is ", x$name, ". It's a ", x$color, " ", x$sex, " mammal."))  
}

print(bobby)  
# print() is a generic function and calls UseMethod("print", x).  
# UseMethod() searches for a print method for class "mammal".  
# As we've named our function print.mammal() - which is the correct syntax for S3 function -  
# R will call print.mammal()!

class(bobby) <- c(class(bobby), "cow")

print(bobby)  
# still prints the function print.mammal()  
# Reason: class "mammal" comes before class "cow" in the vector of class attributes.

class(bobby) <- rev(class(bobby))

print(bobby)  
# still prints the function print.mammal().  
# Although class "cow" comes before class "mammal" in the vector of class attributes,  
# class "cow" does not have a printing function yet.  
# Therefore, R goes to the second element in the vector - "mammal" - and calls the corresponding function print.mammal().

print.cow <- function(x, ...){  
    print(paste0("This is ", x$name, ". It's a ", x$color, " ", x$sex, " cow."))  
}

print(bobby)  
# As we've now defined a print function for class "cow", R calls this one!  
# Make sure that "cow" comes before "mammal" in the vector of classes.

18.0.40 Tidyverse

Corresponding exercises: see Section 16.1.9

data("iris")  
filter(iris, Sepal.Length %in% c(5.0, 6.0, 7.0))

data("ChickWeight")  
ChickWeight %>%  
  filter(Diet != 4) %>%  
  group_by(Time) %>%  
  summarise(mean_weight = mean(weight))

18.0.41 ggplot2

Corresponding exercises: see Section 17.0.2

library(datasets)  
data("CO2")

ggplot(CO2,aes(x=conc, y=uptake))+  
  geom_point(aes(color=Type))+  
  ggtitle("CO2 uptake in plants") + xlab("carbon dioxide concentrations (mL/L)") + ylab("carbon dioxide uptake rates (µmol/m^2 sec)")+  
  theme_bw()

library(datasets)  
data("CO2")

ggplot(CO2,aes(x=conc, y=uptake))+  
  geom_point(aes(color=Type))+  
  ggtitle("CO2 uptake in plants") + xlab("carbon dioxide concentrations (mL/L)") + ylab("carbon dioxide uptake rates (µmol/m^2 sec)")+  
  facet_wrap(~Treatment)+  
  geom_smooth(aes(group=Treatment))+  
  theme_bw()