Compared to S3, the S4 object system is much stricter, and much closer to other OO systems. I recommend you familiarise yourself with the way that S3 works before reading this document - many of the underlying ideas are the same, but the implementation is much stricter. There are two major differences from S3:
- formal class definitions: unlike S3, S4 formally defines the representation and inheritance for each class
- multiple dispatch: the generic function can be dispatched to a method based on the class of any number of arguments, not just one
Here we introduce the basics of S4, trying to stay away from the esoterica and focussing on the ideas that you need to understand and write the majority of S4 code. This document will hopefully give you the scaffolding to make better sense of documentation and other detailed resources such as: ...
In S3, you can turn any object into an object of a particular class just by setting the class attribute. S4 is much stricter: you must define the representation of the class using setClass, and the only way to create an instance is through the constructor function new.
A class has three key properties:
- a name: an alpha-numeric string that identifies the class
- a representation: a list of slots (or attributes), giving their names and classes. For example, a person class might be represented by a character name and a numeric age, as follows:
representation(name = "character", age = "numeric")
- a character vector of classes that it inherits from, or in S4 terminology, contains. Note that S4 supports multiple inheritance, but this should be used with extreme caution as it makes method lookup extremely complicated.
You create a class with setClass:
setClass("Person", representation(name = "character", age = "numeric"))
setClass("Employee", representation(boss = "Person"), contains = "Person")
and create an instance of a class with new:
hadley <- new("Person", name = "Hadley", age = 31)
Unlike S3, S4 checks that all of the slots have the correct type:
hadley <- new("Person", name = "Hadley", age = "thirty")
# invalid class "Person" object: invalid object for slot "age" in class
# "Person": got class "character", should be or extend class "numeric"
hadley <- new("Person", name = "Hadley", sex = "male")
# invalid names for slots of class "Person": sex
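The Employee class defined earlier can be exercised in the same way. The sketch below is illustrative only: the intern object and the name "Sam" are made up for the example. It suggests that an Employee is also a Person, and that the boss slot is type-checked like any other slot.

```r
library(methods)

setClass("Person", representation(name = "character", age = "numeric"))
setClass("Employee", representation(boss = "Person"), contains = "Person")

hadley <- new("Person", name = "Hadley", age = 31)

# An Employee has the inherited name and age slots plus its own boss slot
# ("Sam" is a hypothetical value used only for this example)
intern <- new("Employee", name = "Sam", age = 22, boss = hadley)
is(intern, "Person")   # does intern inherit from Person?
intern@boss@name       # slot access can be chained with @

# boss must be a Person (or a subclass of it); a bare string is rejected
bad <- tryCatch(
  new("Employee", name = "Sam", age = 22, boss = "Hadley"),
  error = function(e) conditionMessage(e)
)
bad
```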
If you omit a slot, it is initialised with the default object for that slot's class. Note that to access slots of an S4 object you use @, not $.
hadley <- new("Person", name = "Hadley")
hadley@age
# numeric(0)
This is likely not what you want, so you can also assign a default prototype for the class:
setClass("Person", representation(name = "character", age = "numeric"),
  prototype(name = NA_character_, age = NA_real_))
hadley <- new("Person", name = "Hadley")
hadley@age
# [1] NA
To access a slot given by a string you can use slot, and getSlots will return a description of all the slots of a class:
slot(hadley, "name")
# [1] "Hadley"
getSlots("Person")
# name age
# "character" "numeric"
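Because slot takes the slot name as a string, it can be combined with slotNames to walk over every slot of an object. The following is a small sketch of that idea, not a prescribed idiom; the variable name values is chosen only for this example.

```r
library(methods)

setClass("Person", representation(name = "character", age = "numeric"))
hadley <- new("Person", name = "Hadley", age = 31)

# slotNames() gives just the names; getSlots() also gives the classes
slotNames("Person")

# Collect every slot value into a named list
values <- lapply(slotNames(hadley), function(s) slot(hadley, s))
names(values) <- slotNames(hadley)
values

# slot() also has a replacement form for assigning by name
slot(hadley, "age") <- 32
hadley@age
```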
You can also provide an optional validity method that applies additional restrictions. This function should have a single argument called object and should return TRUE if the object is valid; otherwise it should return a character vector giving all the reasons it is not valid.
check_person <- function(object) {
  errors <- character()
  length_age <- length(object@age)
  if (length_age != 1) {
    msg <- paste("Age is length ", length_age, ". Should be 1", sep = "")
    errors <- c(errors, msg)
  }
  length_name <- length(object@name)
  if (length_name != 1) {
    msg <- paste("Name is length ", length_name, ". Should be 1", sep = "")
    errors <- c(errors, msg)
  }
  if (length(errors) == 0) TRUE else errors
}
setClass("Person", representation(name = "character", age = "numeric"),
  validity = check_person)
new("Person", name = "Hadley")
# invalid class "Person" object: Age is length 0. Should be 1
new("Person", name = "Hadley", age = 1:10)
# Error in validObject(.Object) :
#   invalid class "Person" object: Age is length 10. Should be 1
# But note that the check is not automatically applied when we modify
# slots directly
hadley <- new("Person", name = "Hadley", age = 31)
hadley@age <- 1:10
# Can force check with validObject:
validObject(hadley)
# invalid class "Person" object: Age is length 10. Should be 1
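One way to avoid unchecked modifications is to route changes through a replacement method that calls validObject itself. The sketch below is only one possible approach: the age<- generic is a hypothetical name introduced for this example, and the validity function is a shortened stand-in for check_person.

```r
library(methods)

# Simplified validity check (stand-in for check_person above)
setClass("Person", representation(name = "character", age = "numeric"),
  validity = function(object) {
    if (length(object@age) != 1) "Age should be length 1" else TRUE
  })

# Hypothetical setter generic: assigns the slot, then re-validates
setGeneric("age<-", function(object, value) standardGeneric("age<-"))
setMethod("age<-", "Person", function(object, value) {
  object@age <- value
  validObject(object)   # throws an error if the object is now invalid
  object
})

hadley <- new("Person", name = "Hadley", age = 31)
age(hadley) <- 32       # passes validation

# An invalid assignment fails before hadley is modified
err <- tryCatch({
  age(hadley) <- 1:10
  NULL
}, error = function(e) conditionMessage(e))
err
hadley@age
```

With this pattern the object is left untouched when validation fails, because the error is raised before the result is assigned back to hadley.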
There's some tension between the usual interactive functional style of R and S4 class definitions, which work through global side effects. In most programming languages, class definition occurs at compile-time, while object instantiation occurs at run-time - it's unusual to be able to create new classes interactively. In particular, note that the examples rely on the fact that multiple calls to setClass with the same class name will silently override the previous definition unless the first definition is sealed with sealed = TRUE.
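The effect of sealing can be seen with a throwaway class. This is a minimal sketch that assumes a fresh session; the class name Account and its slots are hypothetical and used only for illustration.

```r
library(methods)

# Hypothetical class, sealed at definition time
setClass("Account", representation(balance = "numeric"), sealed = TRUE)
isSealedClass("Account")

# A second definition with the same name is now an error rather than
# a silent override
result <- tryCatch(
  setClass("Account", representation(owner = "character")),
  error = function(e) conditionMessage(e)
)
result

# The original definition is still in place
getSlots("Account")
```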
Generic functions and methods work similarly to S3, but dispatch is based on the class of all arguments, and there is a special syntax for creating both generic functions and new methods.
The setGeneric function provides two main ways to create a new generic. You can either convert an existing function to a generic function, or you can create a new one from scratch.
sides <- function(object) 0
setGeneric("sides")
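When an existing function is converted this way, it is kept as the default method of the new generic. A small check of that behaviour, assuming the code is run at top level in a fresh session:

```r
library(methods)

sides <- function(object) 0
setGeneric("sides")

# sides is now an S4 generic ...
isGeneric("sides")

# ... and the original function still answers for any argument
# that has no more specific method
sides("anything")
sides(1:3)
```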
If you create your own, the second argument should be a function that defines all the arguments that you want to dispatch on and contains a call to standardGeneric("genericName").
setGeneric("sides", function(object) {
  standardGeneric("sides")
})
The following example sets up a simple hierarchy of shapes to use with the sides function.
setClass("Shape")
setClass("Polygon", representation(sides = "integer"), contains = "Shape")
setClass("Triangle", contains = "Polygon")
setClass("Square", contains = "Polygon")
setClass("Circle", contains = "Shape")
Defining a method for polygons is straightforward: we just use the sides slot. The setMethod function takes three arguments: the name of the generic function, the signature to match for this method, and a function to compute the result. Unfortunately R doesn't offer any syntactic sugar for this task so the code is a little verbose and repetitive.
setMethod("sides", signature(object = "Polygon"), function(object) {
  object@sides
})
For the others we supply exact values. Note that for generics with few arguments you can simplify the signature by not explicitly giving the argument names. This saves space at the expense of having to remember which position corresponds to which argument - not a problem if there's only one argument.
setMethod("sides", signature("Triangle"), function(object) 3)
setMethod("sides", signature("Square"), function(object) 4)
setMethod("sides", signature("Circle"), function(object) Inf)
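Putting the shape hierarchy, the generic and its methods together gives something that can be called directly. The sketch below simply repeats the definitions above in one self-contained piece and adds a few calls; the pentagon object is made up for the example.

```r
library(methods)

setClass("Shape")
setClass("Polygon", representation(sides = "integer"), contains = "Shape")
setClass("Triangle", contains = "Polygon")
setClass("Square", contains = "Polygon")
setClass("Circle", contains = "Shape")

setGeneric("sides", function(object) {
  standardGeneric("sides")
})
setMethod("sides", signature(object = "Polygon"), function(object) {
  object@sides
})
setMethod("sides", signature("Triangle"), function(object) 3)
setMethod("sides", signature("Square"), function(object) 4)
setMethod("sides", signature("Circle"), function(object) Inf)

# A generic polygon reports whatever is stored in its slot
pentagon <- new("Polygon", sides = 5L)
sides(pentagon)

# The subclasses use their own, more specific methods
sides(new("Triangle"))
sides(new("Square"))
sides(new("Circle"))

# existsMethod() only reports methods defined for exactly that class
existsMethod("sides", "Triangle")
existsMethod("sides", "Shape")
```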
You can optionally also specify valueClass to define the expected output of the generic. This will raise a run-time error if a method returns output of the wrong class.
setGeneric("sides", valueClass = "numeric", function(object) {
  standardGeneric("sides")
})
setMethod("sides", signature("Triangle"), function(object) "three")
sides(new("Triangle"))
# invalid value from generic function "sides", class "character", expected
# "numeric"
Note that it's possible to create methods that are ambiguous - i.e. it's not clear which method to pick. In this case R will pick the method that was defined first and print a note about the situation:
setGeneric("foo", function(a, b) {
  standardGeneric("foo")
})
setMethod("foo", signature("Triangle", "Polygon"), function(a, b) {
  "TriPoly"
})
setMethod("foo", signature("Polygon", "Triangle"), function(a, b) {
  "PolyTri"
})
foo(new("Triangle"), new("Triangle"))
# Note: Method with signature "Triangle#Polygon" chosen for function
# "foo", target signature "Triangle#Triangle". "Polygon#Triangle" would
# also be valid
# [1] "TriPoly"
setMethod("foo", signature("Triangle", "Triangle"), function(a, b) {
  "TriTri"
})
foo(new("Triangle"), new("Triangle"))
# [1] "TriTri"
Let's develop a fuller example. This is inspired by an example from the Dylan language reference, one of the languages that inspired the S4 object system. In this example we'll develop a simple model of vehicle inspections that vary depending on the type of vehicle (car or truck) and type of inspector (normal or state).
setClass("Vehicle")
setClass("Truck", contains = "Vehicle")
setClass("Car", contains = "Vehicle")
setClass("Inspector", representation(name = "character"))
setClass("StateInspector", contains = "Inspector")
setGeneric("inspect.vehicle", function(v, i) {
  standardGeneric("inspect.vehicle")
})
setMethod("inspect.vehicle",
  signature(v = "Vehicle", i = "Inspector"),
  function(v, i) {
    message("Looking for rust")
  })
setMethod("inspect.vehicle",
  signature(v = "Car", i = "Inspector"),
  function(v, i) {
    callNextMethod() # perform vehicle inspection
    message("Checking seat belts")
  })
setMethod("inspect.vehicle",
  signature(v = "Truck", i = "Inspector"),
  function(v, i) {
    callNextMethod() # perform vehicle inspection
    message("Checking cargo attachments")
  })
setMethod("inspect.vehicle",
  signature(v = "Car", i = "StateInspector"),
  function(v, i) {
    callNextMethod() # perform car inspection
    message("Checking insurance")
  })
inspect.vehicle(new("Car"), new("Inspector"))
inspect.vehicle(new("Car"), new("StateInspector"))
inspect.vehicle(new("Truck"), new("StateInspector"))
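To see the order in which callNextMethod walks up the hierarchy, the messages can be captured instead of printed. The following self-contained sketch repeats the definitions above in compact form; collect_messages is a hypothetical helper written for this example, not part of the methods package.

```r
library(methods)

setClass("Vehicle")
setClass("Truck", contains = "Vehicle")
setClass("Car", contains = "Vehicle")
setClass("Inspector", representation(name = "character"))
setClass("StateInspector", contains = "Inspector")

setGeneric("inspect.vehicle", function(v, i) {
  standardGeneric("inspect.vehicle")
})
setMethod("inspect.vehicle", signature("Vehicle", "Inspector"),
  function(v, i) message("Looking for rust"))
setMethod("inspect.vehicle", signature("Car", "Inspector"),
  function(v, i) {
    callNextMethod()
    message("Checking seat belts")
  })
setMethod("inspect.vehicle", signature("Truck", "Inspector"),
  function(v, i) {
    callNextMethod()
    message("Checking cargo attachments")
  })
setMethod("inspect.vehicle", signature("Car", "StateInspector"),
  function(v, i) {
    callNextMethod()
    message("Checking insurance")
  })

# Hypothetical helper: evaluate expr and return the text of every
# message it emits, in order, without printing them
collect_messages <- function(expr) {
  msgs <- character()
  withCallingHandlers(expr, message = function(m) {
    msgs <<- c(msgs, conditionMessage(m))
    invokeRestart("muffleMessage")
  })
  trimws(msgs)
}

car_state <- collect_messages(
  inspect.vehicle(new("Car"), new("StateInspector")))
truck_state <- collect_messages(
  inspect.vehicle(new("Truck"), new("StateInspector")))
car_state
truck_state
```

For the car, the state inspector's method runs the ordinary car inspection first, which in turn runs the general vehicle inspection, so the checks appear from most general to most specific. There is no truck method for state inspectors, so the truck call falls back to the method for ordinary inspectors.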
- is, as, as<-