6.1 Creating data frames

Data frames are constructed with the function data.frame(), which takes the vectors that form the columns of the data frame as arguments.

> x <- c(1,-7,10,12,6,8,6,5,-5,-10)
> d <- data.frame(x=x, negative=x<0, 1:10)
> d
     x negative X1.10
1    1    FALSE     1
2   -7     TRUE     2
3   10    FALSE     3
4   12    FALSE     4
5    6    FALSE     5
6    8    FALSE     6
7    6    FALSE     7
8    5    FALSE     8
9   -5     TRUE     9
10 -10     TRUE    10

If names are provided to data.frame(), they are used as column names. Otherwise, R creates a unique name automatically, as is the case for the third column in the example above. That example also illustrates that vectors can either be provided as variables (e.g. x) or created on the fly (e.g. x<0).
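As a minimal sketch of this naming rule, the code below builds the same data frame twice, once with every column named and once with the third column left unnamed. The object names d_named and d_auto and the column name index are chosen here only for illustration.

```r
x <- c(1, -7, 10, 12, 6, 8, 6, 5, -5, -10)

# Every column is named explicitly ("index" is an illustrative name).
d_named <- data.frame(x = x, negative = x < 0, index = 1:10)
names(d_named)   # "x" "negative" "index"

# The third column is unnamed, so R derives a name from the expression 1:10.
d_auto <- data.frame(x = x, negative = x < 0, 1:10)
names(d_auto)    # "x" "negative" "X1.10"
```

Naming every column up front is usually the safer choice, since automatically derived names depend on how the expression was written.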

You may add columns to an existing data frame using the function cbind(), which “binds” columns.

> a <- 1:10; b <- rep(c("x", "y"), 5)
> d <- data.frame(a, b)
> d <- cbind(d, below5 = a < 5)
> d
    a b below5
1   1 x   TRUE
2   2 y   TRUE
3   3 x   TRUE
4   4 y   TRUE
5   5 x  FALSE
6   6 y  FALSE
7   7 x  FALSE
8   8 y  FALSE
9   9 x  FALSE
10 10 y  FALSE
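One point worth checking when using cbind() is that the result is a data frame only if at least one of the arguments is a data frame. The sketch below compares the two cases; the object names m and d2 are hypothetical.

```r
a <- 1:10
b <- rep(c("x", "y"), 5)

# Binding two plain vectors yields a matrix. Because a matrix holds a
# single type, mixing numbers and strings coerces everything to character.
m <- cbind(a, b)
is.matrix(m)       # TRUE
typeof(m)          # "character"

# Starting from a data frame keeps each column's own type.
d2 <- cbind(data.frame(a, b), a < 5)
is.data.frame(d2)  # TRUE
class(d2$a)        # "integer"
ncol(d2)           # 3
```

If the printed result shows every value in quotes and row labels such as [1,], that is a sign that a matrix was produced rather than a data frame.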

Similarly, the function rbind() binds rows: given data frames, it returns a data frame with their rows stacked.

> e <- data.frame(word=c("one", "two"), number=c(1, 2))
> f <- data.frame(word=c("three", "four","five","six"), number=3:6)
> d <- rbind(e, f)
> d <- rbind(d,data.frame(word=c("seven"), number=7))
> d
   word number
1   one      1
2   two      2
3 three      3
4  four      4
5  five      5
6   six      6
7 seven      7

When using rbind(), the data frames must have the same column names; otherwise, combining them produces an error. You can, however, use names() to check whether the column names match and then rename the columns of one data frame to match the other.

> d <- data.frame(words=paste("word", 1:4, sep=""), numbers=c(1,2,3,4))
> a <- data.frame(word=c("word5"), number=c(5))
> names(d) == names(a)
[1] FALSE FALSE
> names(a) <- names(d)
> d <- rbind(d,a)
> d
  words numbers
1 word1       1
2 word2       2
3 word3       3
4 word4       4
5 word5       5
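The error itself can be observed without interrupting a script by wrapping the call in try(). The following is a minimal sketch; the object names g, result and combined are hypothetical, and identical() is used here as an alternative to the element-wise == comparison.

```r
e <- data.frame(word = c("one", "two"), number = c(1, 2))
g <- data.frame(words = "three", numbers = 3)   # names differ from e

# identical() compares the two name vectors as a whole.
identical(names(e), names(g))   # FALSE

# try() captures the error instead of stopping the script.
result <- try(rbind(e, g), silent = TRUE)
inherits(result, "try-error")   # TRUE

# After renaming, the rows bind as expected.
names(g) <- names(e)
combined <- rbind(e, g)
nrow(combined)                  # 3
```

Unlike ==, identical() always returns a single TRUE or FALSE, which makes it convenient inside an if() condition.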

Just as with matrices, the dimensions of a data frame are checked with the functions dim(), nrow(), and ncol().

> d <- data.frame(1:5, paste("Test", letters[1:5]))
> dim(d)
[1] 5 2
> nrow(d)
[1] 5
> ncol(d)
[1] 2
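As a short sketch of how these three functions relate, the code below checks that dim() agrees with nrow() and ncol() and shows how binding a column or a row changes the result. The column names id, label and flag are illustrative assumptions.

```r
d <- data.frame(id = 1:5, label = paste("Test", letters[1:5]))

# dim() returns c(rows, columns), so it agrees with nrow() and ncol().
identical(dim(d), c(nrow(d), ncol(d)))   # TRUE

# Binding a column increases the second entry.
wider <- cbind(d, flag = d$id > 2)
dim(wider)    # 5 3

# Binding a row increases the first entry.
longer <- rbind(d, data.frame(id = 6L, label = "Test f"))
dim(longer)   # 6 2
```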

6.1.1 Exercises: Data Frames

See Section 18.0.14 for solutions.

  1. Create two numeric vectors x and y containing the numbers from 0 to 5 (x decreasing, y increasing) and combine them into a data frame.

  2. Create a new data frame as in the previous exercise, but with the numbers from 6 to 10 (vector z decreasing, vector w increasing), and append this new data frame to your data frame from the previous exercise.

  3. Add a logical column (“compared”) to your previous data frame that states if the first column is bigger than the second.