Skip to content
hadley edited this page Oct 18, 2010 · 24 revisions

The S4 object system

Compared to S3, the S4 object system is much stricter, and much closer to other OO systems. I recommend you familiarise yourself with the way that S3 works before reading this document - many of underlying ideas are the same, but the implementation is much stricter. There are two major differences from S3:

  • formal class definitions: unlike S3, S4 formally defines the representation and inheritance for each class

  • multiple dispatch: the generic function can be dispatched to a method based on the class of any number of argument, not just one

Here we introduce the basics of S4, trying to stay away from the esoterica and focussing on the ideas that you need to understand and write the majority of S4 code. This document will hopefully give you the scaffolding to make better sense of documentation and other detailed resources such as: ...

Classes and instances

In S3, you can turn any object into an object of a particular class just by setting the class attribute. S4 is much stricter: you must define the representation of the call using setClass, and the only way to create it is through the constructer function new.

A class has three key properties:

  • a name: an alpha-numeric string that identifies the class

  • representation: a list of slots (or attributes), giving their names and classes. For example, a person class might be represented a character name and a numeric age, as follows: representation(name = "character", age = "numeric")

  • a character vector of classes that it inherits from, or in S4 terminology, contains. Note that S4 supports multiple inheritance, but this should be used with extreme caution as it makes method lookup extremely complicated.

You create a class with setClass:

setClass("Person", representation(name = "character", age = "numeric"))
setClass("Employee", representation(boss = "Person"), contains = "Person")

and create an instance of a class with new:

hadley <- new("Person", name = "Hadley", age = 31)

Unlike S3, S4 checks that all of the slots have the correct type:

hadley <- new("Person", name = "Hadley", age = "thirty")
# invalid class "Person" object: invalid object for slot "age" in class
#  "Person": got class "character", should be or extend class "numeric"

hadley <- new("Person", name = "Hadley", sex = "male")
# invalid names for slots of class "Person": sex

If you omit a slot, it will initiate it with the default object of the class. Note that to access slots of an S4 object you use @, not $.

hadley <- new("Person", name = "Hadley")
hadley@age
# numeric(0)

This is likely not what you want, so you can also assign a default prototype for the class:

setClass("Person", representation(name = "character", age = "numeric"), 
  prototype(name = NA_character_, age = NA_real_))
hadley <- new("Person", name = "Hadley")
hadley@age
# [1] NA

To access a slot given by a string you can use slot, and getSlots will return a description of all the slots of a clas:

slot(hadley, "name")
# [1] "Hadley"

getSlots("Person")
#        name         age 
# "character"   "numeric" 

You can also provide an optional method that applies additional restrictions. This function should have a single argument called object and should return TRUE if the object is valid, and if not it should return a character vector giving all reasons it is not valid.

check_person <- function(object) {
  errors <- character()
  length_age <- length(object@age)
  if (length_age != 1) {
    msg <- paste("Age is length ", length_age, ".  Should be 1", sep = "")
    errors <- c(errors, msg)
  }

  length_name <- length(object@name)
  if (length_name != 1) {
    msg <- paste("Name is length ", length_name, ".  Should be 1", sep = "")
    errors <- c(errors, msg)
  }
  
  if (length(errors) == 0) TRUE else errors
}
setClass("Person", representation(name = "character", age = "numeric"), 
  validity = check_person)

new("Person", name = "Hadley")
# invalid class "Person" object: Age is length 0.  Should be 1
new("Person", name = "Hadley", age = 1:10)
Error in validObject(.Object) : 
  invalid class "Person" object: Age is length 10.  Should be 1
  
# But note that the check is not automatically applied when we modify 
# slots directly
hadley <- new("Person", name = "Hadley", age = 31)
hadley@age <- 1:10

# Can force check with validObject:
validObject(hadley)
# invalid class "Person" object: Age is length 10.  Should be 1

There's some tension between the usual interactive functional style of R and the global side-effect causing S4 class definitions. In most programming languages, class definition occurs at compile-time, while object instantiation occurs at run-time - it's unusual to be able to create new classes interactively. In particular, note that the examples rely on the fact that multiple calls to setClass with the same class name will silently override the previous definition unless the first definition is sealed with sealed = TRUE.

Generic functions and methods

Generic functions and methods work similarly to S3, but dispatch is based on the class of all arguments, and there is a special syntax for creating both generic functions and new methods.

The setGeneric function provides two main ways to create a new generic. You can either convert and existing function to a generic function, or you can create a new one from scratch.

sides <- function(object) 0
setGeneric("sides")

If you create your own, the second argument should be a function that defines all the arguments that you want to dispatch on and contains a call to standardGeneric("genericName").

setGeneric("sides", function(object) {
  standardGeneric("sides")
})

The following example sets up a simple hierarchy of shapes to use with the sides function.

setClass("Shape")
setClass("Polygon", representation(sides = "integer"), contains = "Shape")
setClass("Triangle", contains = "Polygon")
setClass("Square", contains = "Polygon")
setClass("Circle", contains = "Shape")

Defining a method for polygons is straightforward: we just use the sides slot. The setMethod function takes three arguments: the name of the generic function, the signature to match for this method and a function to compute the result. Unfortunately R doesn't offer any syntactic sugar for this task so the code is a little verbose and repetitive.

setMethod("sides", signature(object = "Polygon"), function(object) {
  object@sides
})

For the others we supply exact values. Note that that for generics with few arguments you can can simplify the signature by not explicitly giving the argument names. This saves spaces at the expensive of having to remember which position corresponds to which argument - not a problem if there's only one argument.

setMethod("sides", signature("Triangle"), function(object) 3)
setMethod("sides", signature("Square"),   function(object) 4)
setMethod("sides", signature("Circle"),   function(object) Inf)

You can optionally also specify valueClass to define the expected output of the generic. This will raise a run-time error if a method returns output of the wrong class.

setGeneric("sides", valueClass = "numeric", function(object) {
  standardGeneric("sides")
})
setMethod("sides", signature("Triangle"), function(object) "three")
sides(new("Triangle"))
# invalid value from generic function "sides", class "character", expected
# "numeric"

Signature matching

Note that it's possible to create methods that are ambiguous - i.e. it's not clear which method to pick. In this case R will pick the method that was defined first and return a warning message about the situation:

setGeneric("foo", function(a, b) {
  standardGeneric("foo")
}) 
setMethod("foo", signature("Triangle", "Polygon"), function(a, b) {
  "TriPoly"
})
setMethod("foo", signature("Polygon", "Triangle"), function(a, b) {
  "PolyTri"
})

foo(new("Triangle"), new("Triangle"))
# Note: Method with signature "Triangle#Polygon" chosen for function
# "foo", target signature "Triangle#Triangle". "Polygon#Triangle" would
# also be valid
# [1] TriPoly

setMethod("foo", signature("Triangle", "Triangle"), function(a, b) {
  "TriTri"
})
foo(new("Triangle"), new("Triangle"))
# [1] TriTri

Inheritance

Let's develop a fuller example. This is inspired by an example from the Dylan language reference, one of the languages that inspired the S4 object system. In this example we'll develop a simple model of vehicle inspections that vary depending on the type of vehicle (car or truck) and type of inspector (normal or state).

setClass("Vehicle")
setClass("Truck", contains = "Vehicle")
setClass("Car", contains = "Vehicle")

setClass("Inspector", representation(name = "character"))
setClass("StateInspector", contains = "Inspector")

setGeneric("inspect.vehicle", function(v, i) {
   standardGeneric("inspect.vehicle")
 })

setMethod("inspect.vehicle", 
 signature(v = "Vehicle", i = "Inspector"), 
 function(v, i) {
   message("Looking for rust")
 })

setMethod("inspect.vehicle", 
 signature(v = "Car", i = "Inspector"),
 function(v, i) {  
   callNextMethod() # perform vehicle inspection
   message("Checking seat belts")
 })

setMethod("inspect.vehicle", 
 signature(v = "Truck", i = "Inspector"),
 function(v, i) {
   callNextMethod() # perform vehicle inspection
   message("Checking cargo attachments")
 })

setMethod("inspect.vehicle", 
 signature(v = "Car", i = "StateInspector"),
 function(v, i) {
   callNextMethod() # perform car inspection
   message("Checking insurance")
 })

 inspect.vehicle(new("Car"), new("Inspector"))
 inspect.vehicle(new("Car"), new("StateInspector"))
 inspect.vehicle(new("Truck"), new("StateInspector"))

Common methods

  • is, as as<-
Clone this wiki locally