15.2 Generic functions

We’ve previously discussed how to define attributes, especially the class attribute, of an object. We will now discuss how to define methods of a class. As an example, we’re going to investigate the function print().

print() is not a normal function; it is a generic function. This means that the function is written in a way that lets it do different things in different cases. You’ve already seen this behavior in action (although you may not have realized it). print() does one thing for numeric vectors:

> x <- c(1,2,3,4)
> print(x)
[1] 1 2 3 4

However, if we print a data frame, the output looks different:

> df <- data.frame(x = x)
> print(df)
  x
1 1
2 2
3 3
4 4

The output for a factor looks different again:

> fac <- factor(x)
> print(fac)
[1] 1 2 3 4
Levels: 1 2 3 4

How can print() do this? You might imagine that print() looks up the class attribute of its input and then uses a series of if statements to pick which output to display. What print() actually does is similar, but much simpler. When you call print(), R examines the class of the input that you provide. Then, R passes all of the arguments to another function that is specifically designed to handle that class of input.

For example, when you give print() a data.frame object, R calls another function, print.data.frame(), which is specifically designed to print data frames. When you give print() a factor object, R instead calls print.factor(), which knows how to print a factor. Both print.data.frame() and print.factor() work like regular R functions, just like the ones we’ve seen before. There’s nothing magic in them. Each was simply written so that print() could call it to handle one specific class of input.
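The sketch below checks this claim, reusing the objects x, df, and fac from the examples above. It compares the output of print() with the output of calling print.data.frame() and print.factor() directly. The names via_generic and via_method are only illustrative.

```r
x <- c(1, 2, 3, 4)
df <- data.frame(x = x)
fac <- factor(x)

# The class attribute decides which function print() hands the object to.
class(df)   # "data.frame"
class(fac)  # "factor"

# Capture the printed lines as text so the two routes can be compared.
via_generic <- capture.output(print(df))
via_method  <- capture.output(print.data.frame(df))

# Both routes should produce the same lines of output.
identical(via_generic, via_method)
identical(capture.output(print(fac)), capture.output(print.factor(fac)))
```

In everyday code you would still call print() and let R pick the function; the direct calls are here only to show that nothing else is going on.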

In short, the output of print() looks different for objects of different classes. Let’s now take a closer look at the mechanism behind this process.

The code of a generic function is usually very simple. For example, the code for the generic function print() looks like this:

> print <- function (x, ...) {
+   UseMethod("print", x)
+ }

When you call print(), print() calls a special function, UseMethod(). UseMethod() examines the class of the input that you provide for print() and calls another function designed to handle that class of input. For example, when you give print() a data.frame object, UseMethod() will call print.data.frame().
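To see the same mechanism outside of print(), here is a minimal sketch of a home-made generic. The generic describe() and its two helper functions are hypothetical names invented for this illustration; they are not part of R. The helpers are named in the same way as print.data.frame() and print.factor().

```r
# Hypothetical generic: its whole body is a single call to UseMethod().
describe <- function(x, ...) {
  UseMethod("describe", x)
}

# Hypothetical function for data frames.
describe.data.frame <- function(x, ...) {
  paste("a data frame with", nrow(x), "rows")
}

# Hypothetical function for factors.
describe.factor <- function(x, ...) {
  paste("a factor with", nlevels(x), "levels")
}

# UseMethod() picks the function that matches the class of the input.
describe(data.frame(x = 1:4))   # "a data frame with 4 rows"
describe(factor(1:4))           # "a factor with 4 levels"
```

Note that describe() itself contains no if statements; all of the class-specific work lives in the two helper functions.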

How does UseMethod() find the functions that can handle a specific class of input? Or, in other words, how does UseMethod() know that print.data.frame() is the function that it should call when printing a data frame?

The answer is that functions that handle a specific class of input (known as S3 methods) have a special name. In fact, every S3 method has a two-part name. The first part of the name refers to the function that the method works with (e.g., “print”). The second part refers to the class (e.g., “data.frame”). The two parts are separated by a period (.). So, for example, the print() method that works with data frames is called print.data.frame(). The print() method that works with factors is called print.factor(). And so on.
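The naming rule can be checked from the console. The sketch below builds a method name from its two parts and confirms that a function with that name exists. The variable names generic, cls, method_name, and m are illustrative; getS3method() comes from the utils package, which is normally loaded by default.

```r
# Build the method name: the generic, a period, then the class.
generic <- "print"
cls <- class(data.frame(x = 1:4))               # "data.frame"
method_name <- paste(generic, cls, sep = ".")
method_name                                     # "print.data.frame"

# A function with exactly that name exists, and so does the one for factors.
exists(method_name, mode = "function")
exists("print.factor", mode = "function")

# getS3method() retrieves a method from the same two parts.
m <- getS3method("print", "data.frame")
is.function(m)
```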

When UseMethod() is called, it searches for an R function with the correct S3-style name. The function does not have to be special in any way; it just needs to have the correct name.
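As a minimal sketch of what this means in practice, the code below writes a print() method for a made-up class. The class "coin", the object toss, and its element side are all hypothetical and exist only for this illustration.

```r
# Hypothetical class "coin": a list whose class attribute is set to "coin".
toss <- structure(list(side = "heads"), class = "coin")

# An ordinary function; only its name, print.coin, links it to print().
print.coin <- function(x, ...) {
  cat("A coin showing ", x$side, "\n", sep = "")
  invisible(x)  # print methods conventionally return their input invisibly
}

# print() now finds print.coin() by name and calls it.
print(toss)

# Without the class attribute, the same data prints as a plain list.
print(unclass(toss))
```

Nothing had to be registered or declared here: defining a function with the right two-part name was enough for print() to pick it up.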