Classes and Objects in R Programming
function calls. This must be kept in mind when using it for solving big problems.
An object is a data structure having some attributes and methods which act on its
attributes.
A class is a blueprint for the object. We can think of a class as a sketch (prototype) of a
house. It contains all the details about the floors, doors, windows etc. Based on
these descriptions we build the house.
The house is the object. Just as many houses can be made from one description, we can
create many objects from a class. An object is also called an instance of a class and
the process of creating this object is called instantiation.
While most programming languages have a single class system, R has three class
systems: S3, S4 and, more recently, Reference classes.
They have their own features and peculiarities, and choosing one over the other is a
matter of preference. Below, we give a brief introduction to them.
S3 Class
The S3 class system is somewhat primitive in nature. It lacks a formal definition, and an
object of an S3 class can be created simply by adding a class attribute to it.
This simplicity accounts for the fact that it is widely used in the R programming
language. In fact, most of the R built-in classes are of this type.
Example 1: S3 class
> # create a list with required components
> s <- list(name = "John", age = 21, GPA = 3.5)
> # name the class appropriately
> class(s) <- "student"
S4 Class
S4 classes are an improvement over S3 classes. They have a formally defined
structure, which helps make objects of the same class look more or less similar.
Class components are properly defined using the setClass() function and objects
are created using the new() function.
Example 2: S4 class
> setClass("student", slots=list(name="character", age="numeric",
GPA="numeric"))
> s <- new("student", name="John", age=21, GPA=3.5)
Reference Class
Reference classes were introduced later than the other two. They are more similar
to the object oriented programming we are used to seeing in other major
programming languages.
Comparison between S3 vs S4 vs Reference Class
S3 Class                 | S4 Class            | Reference Class
Objects are created by   | Objects are created | Objects are created using
setting the class        | using new()         | generator functions
attribute                |                     |
R S3 Class
In this article, you will learn to work with S3 classes (one of the three class
systems in R).
Most of the classes that come predefined in R are of this type. The fact that it is
simple and easy to implement is the reason behind this.
Basically, a list with its class attribute set to some class name is an S3 object. The
components of the list become the member variables of the object.
$GPA
[1] 3.5
attr(,"class")
[1] "student"
This might look awkward for programmers coming from C++, Python etc., where
there are formal class definitions and objects have properly defined attributes and
methods.
In the R S3 system, it's pretty ad hoc. You can change an object's class at will, and
objects of the same class can look completely different. It's all up to you.
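As a small illustration of how ad hoc this is, the sketch below gives two structurally unrelated values the same class. The class name "teacher" is made up for the example.

```r
# Two objects that both claim to be "student" but share no structure
s1 <- list(name = "John", age = 21, GPA = 3.5)
class(s1) <- "student"

s2 <- c(1, 2, 3)            # a numeric vector, not even a list
class(s2) <- "student"

inherits(s1, "student")
inherits(s2, "student")

# The class can also be changed at will; R performs no checks here
class(s1) <- "teacher"      # "teacher" is a hypothetical class name
class(s1)
```

Nothing stops either assignment, which is why some discipline in how objects are created is useful.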
Because of this, it is good practice to create objects with a constructor function,
usually one named after the class.
This will bring some uniformity in the creation of objects and make them look similar.
We can also add some integrity checks on the member attributes. Here is an
example. Note that in this example we use the attr() function to set the class
attribute of the object.
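The constructor itself is not shown in the text above, so here is a minimal sketch of what it could look like. The argument names n, a and g, the 0 to 4 GPA range and the error message are assumptions made for illustration.

```r
# Constructor for the S3 class "student" (argument names are illustrative)
student <- function(n, a, g) {
  # integrity check; the 0-4 range is an assumed rule for this sketch
  if (g > 4 || g < 0) stop("GPA must be between 0 and 4")
  value <- list(name = n, age = a, GPA = g)
  # the class can be set using class() or attr()
  attr(value, "class") <- "student"
  value
}

s <- student("Paul", 26, 3.7)

# An out-of-range GPA is rejected at construction time
bad <- try(student("Paul", 26, 5), silent = TRUE)
inherits(bad, "try-error")
```

The check runs only inside the constructor, so a later assignment such as s$GPA <- 5 would still go through.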
> s
$name
[1] "Paul"
$age
[1] 26
$GPA
[1] 3.7
attr(,"class")
[1] "student"
> class(s)
[1] "student"
> # these integrity checks only apply when the object is created with the constructor
> s <- student("Paul", 26, 2.5)
> s
$name
[1] "Paul"
$age
[1] 26
$GPA
[1] 2.5
attr(,"class")
[1] "student"
Furthermore, we can use print() with vectors, matrix, data frames, factors etc. and
they get printed differently according to the class they belong to.
How does print() know how to print this variety of dissimilar looking objects?
The answer is that print() is a generic function. It has a collection
of methods. You can check all these methods with methods(print).
> methods(print)
[1] print.acf*
[2] print.anova*
...
[181] print.xngettext*
[182] print.xtabs*
We can see methods like print.data.frame and print.factor in the above list.
When we call print() on a data frame, the call is dispatched to print.data.frame().
If we had done the same with a factor, the call would dispatch to print.factor().
Here, we can observe that the method names are of the
form generic_name.class_name(). This is how R is able to figure out which method to
call depending on the class.
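A short sketch of this naming rule in action: calling the generic on a factor and calling print.factor() directly should produce the same text. The factor values are arbitrary.

```r
f <- factor(c("a", "b", "a"))

# print(f) looks at class(f), which is "factor", and calls print.factor()
class(f)

# Capture what the generic prints and what the method prints
via_generic <- capture.output(print(f))
via_method  <- capture.output(print.factor(f))

identical(via_generic, via_method)
```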
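Printing our object of class "student" looks for a method of the form print.student(),
but there is no method of this form.
So, which method did our object of class "student" call?
It called print.default(), the fallback method that is used when no class-specific
method is found. There are plenty of generic functions like print() that have such a
default method. You can list these default methods with methods(class="default").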
> methods(class="default")
[1] add1.default* aggregate.default*
[3] AIC.default* all.equal.default
...
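To avoid falling back to the default, we can write our own method for the class. The following is a minimal sketch, assuming the list-based student object used earlier; the exact output layout is a choice made for this example.

```r
# Method for the print() generic, named generic_name.class_name
print.student <- function(x, ...) {
  cat(x$name, "\n")
  cat(x$age, "years old\n")
  cat("GPA:", x$GPA, "\n")
  invisible(x)   # print methods conventionally return their argument invisibly
}

s <- structure(list(name = "Paul", age = 26, GPA = 3.7), class = "student")

print(s)      # dispatches to print.student()
unclass(s)    # without the class attribute, the plain list is printed again
```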
Now this method will be called whenever we print() an object of class "student".
$GPA
[1] 3.7
> print
function (x, ...)
UseMethod("print")
<bytecode: 0x0674e230>
<environment: namespace:base>
> plot
function (x, y, ...)
UseMethod("plot")
<bytecode: 0x04fe6574>
<environment: namespace:graphics>
We can see that they have a single call to UseMethod() with the name of the generic
function passed to it. This is the dispatcher function which will handle all the
background details. It is this simple to implement a generic function.
For the sake of example, we make a new generic function called grade.
A generic function is useless without any method. Let us implement the default
method.
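A minimal sketch of the grade generic, its default method and a method for class "student" might look like this. The message printed by the default method is an assumption; the student method is written to match the sample run.

```r
# The generic: a single call to UseMethod() does the dispatching
grade <- function(obj) {
  UseMethod("grade")
}

# Default method, used when no class-specific method exists
grade.default <- function(obj) {
  cat("This is a generic function\n")
}

# Method for class "student"
grade.student <- function(obj) {
  cat("Your grade is", obj$GPA, "\n")
}

s <- structure(list(name = "Paul", age = 26, GPA = 3.7), class = "student")
grade(s)     # uses grade.student()
grade(42)    # there is no grade.numeric(), so grade.default() runs
```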
A sample run.
> grade(s)
Your grade is 3.7
In this way, we implemented a generic function called grade and later a method for
our class.
R S4 Class
In this article, you'll learn everything about S4 classes in R: how to define them,
create them, access their slots, and use them efficiently in your program.
Unlike S3 classes and objects, which lack a formal definition, S4 classes are
stricter in the sense that they have a formal definition and a uniform way to
create objects.
This adds safety to our code and prevents us from accidentally making naive
mistakes.
In R terminology, member variables are called slots. While defining a class, we need
to set the name and the slots (along with class of the slot) it is going to have.
Example 1: Definition of S4 class
setClass("student", slots=list(name="character", age="numeric",
GPA="numeric"))
In the above example, we defined a new class called student along with the three slots
it's going to have: name, age and GPA.
There are other optional arguments of setClass() which you can explore in the help
section with ?setClass.
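Objects of the class are then created with new(). The sketch below repeats the class definition so that it runs on its own, and also shows the kind of mistake the formal definition catches; the invalid value "twenty-one" is just an illustration.

```r
library(methods)  # attached by default in interactive R; explicit for scripts

setClass("student", slots = list(name = "character", age = "numeric",
                                 GPA = "numeric"))

# new() takes the class name followed by values for the slots
s <- new("student", name = "John", age = 21, GPA = 3.5)
isS4(s)

# A value of the wrong type is rejected when the object is created
bad <- try(new("student", name = "John", age = "twenty-one", GPA = 3.5),
           silent = TRUE)
inherits(bad, "try-error")
```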
> s
An object of class "student"
Slot "name":
[1] "John"
Slot "age":
[1] 21
Slot "GPA":
[1] 3.5
> isS4(s)
[1] TRUE
The setClass() function returns a generator function.
This generator function (usually given the same name as the class) can be used to
create new objects. It acts as a constructor.
> student <- setClass("student", slots=list(name="character",
age="numeric", GPA="numeric"))
> student
class generator function for class “student” from package ‘.GlobalEnv’
function (...)
new("student", ...)
Note above that our constructor in turn uses the new() function to create objects. It is
just a wrapper around new().
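As a brief sketch, calling the generator with slot values has the same effect as calling new() with the class name:

```r
library(methods)

student <- setClass("student", slots = list(name = "character",
                                            age = "numeric", GPA = "numeric"))

# Calling the generator is equivalent to new("student", ...)
s <- student(name = "John", age = 21, GPA = 3.5)
s
```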
Slot "age":
[1] 21
Slot "GPA":
[1] 3.5
Accessing slot
> s@name
[1] "John"
> s@GPA
[1] 3.5
> s@age
[1] 21
> # modify GPA
> s@GPA <- 3.7
> s
An object of class "student"
Slot "name":
[1] "John"
Slot "age":
[1] 21
Slot "GPA":
[1] 3.7
> slot(s,"name")
[1] "John"
> slot(s,"name") <- "Paul"
> s
An object of class "student"
Slot "name":
[1] "Paul"
Slot "age":
[1] 21
Slot "GPA":
[1] 3.7
You can list all the S4 generic functions and methods available using the
function showMethods().
Writing the name of the object in interactive mode prints it. This is done using the S4
generic function show().
This function appears in the list produced by showMethods(). It is the S4 analogue of the
S3 print() function.
> isS4(show)
[1] TRUE
We can list all the methods of show generic function using showMethods(show).
For example, we can implement our class method for the show() generic as follows.
setMethod("show",
"student",
function(object) {
cat(object@name, "\n")
cat(object@age, "years old\n")
cat("GPA:", object@GPA, "\n")
}
)
Now, if we write out the name of the object in interactive mode as before, the above
code is executed.
In this way we can write our own S4 class methods for generic functions.
R Reference Class
In this article, you will learn to work with reference classes, which are one of the three
class systems (the other two are S3 and S4).
Reference classes in R programming are similar to the object oriented programming we
are used to seeing in common languages like C++, Java, Python etc.
Unlike S3 and S4 classes, methods belong to the class rather than to generic functions.
Reference classes are internally implemented as S4 classes with an environment
added to them.
> setRefClass("student")
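A class is more useful with member variables, which reference classes call fields. The following minimal sketch defines student with the three fields used throughout this article and creates an object from the generator that setRefClass() returns.

```r
library(methods)

# Fields (the member variables) are declared together with their classes
student <- setRefClass("student",
  fields = list(name = "character", age = "numeric", GPA = "numeric"))

# setRefClass() returns a generator; call it to create an object
s <- student(name = "John", age = 21, GPA = 3.5)
s
```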
> s
Reference class object of class "student"
Field "name":
[1] "John"
Field "age":
[1] 21
Field "GPA":
[1] 3.5
> s$name
[1] "John"
> s$age
[1] 21
> s$GPA
[1] 3.5
> s$name <- "Paul"
> s
Reference class object of class "student"
Field "name":
[1] "Paul"
Field "age":
[1] 21
Field "GPA":
[1] 3.5
Warning Note:
In R programming, objects are copied when assigned to new variable or passed to a
function (pass by value). For example.
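> # create an illustrative list a and assign it to b
> a <- list("x" = 1, "y" = 2)
> b <- a
> # modify b
> b$y = 3
> # a remains unaffected
> a$y
[1] 2
> # only b is modified
> b$y
[1] 3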
But this is not the case with reference objects. Only a single copy exists and all
variables refer to that same copy. Hence the name, reference.
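A minimal sketch of this behaviour, using the student class from this article (the variable names a and b are illustrative):

```r
library(methods)

student <- setRefClass("student",
  fields = list(name = "character", age = "numeric", GPA = "numeric"))

a <- student(name = "John", age = 21, GPA = 3.5)
b <- a              # no copy is made; b refers to the same object

b$name <- "Paul"    # modify through b
a$name              # the change is visible through a as well
```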
> b
Reference class object of class "student"
Field "name":
[1] "Paul"
Field "age":
[1] 21
Field "GPA":
[1] 3.5
This can cause unwanted changes in values and be the source of strange bugs.
We need to keep this in mind while working with reference objects. To make a copy,
we can use the copy() method made available to us.
> # modify b
> b$name <- "Paul"
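The fragment above is only part of the session. A self-contained sketch of the same idea, with a and b as illustrative variable names, could look like this:

```r
library(methods)

student <- setRefClass("student",
  fields = list(name = "character", age = "numeric", GPA = "numeric"))

a <- student(name = "John", age = 21, GPA = 3.5)
b <- a$copy()       # an independent copy of a

b$name <- "Paul"    # modify the copy
a$name              # the original is unaffected
b$name
```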
Reference Methods
Methods are defined for a reference class and do not belong to generic functions as
in S3 and S4 classes.
All reference classes have some predefined methods because they all inherit
from the superclass envRefClass.
> student
Generator for class "student":
Class fields:
Class Methods:
"callSuper", "copy", "export", "field", "getClass",
"getRefClass",
"import", "initFields", "show", "trace", "untrace", "usingMethods"
Reference Superclasses:
"envRefClass"
We can see class methods like copy(), field() and show() in the above list. We can
create our own methods for the class.
This can be done during the class definition by passing a list of function definitions
to the methods argument of setRefClass().
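As a minimal sketch, the student class could be given two methods, inc_age() and dec_age(), that change the age field. The parameter name x is an assumption.

```r
library(methods)

student <- setRefClass("student",
  fields = list(name = "character", age = "numeric", GPA = "numeric"),
  methods = list(
    # increase the age field by x
    inc_age = function(x) {
      age <<- age + x
    },
    # decrease the age field by x
    dec_age = function(x) {
      age <<- age - x
    }
  )
)

s <- student(name = "John", age = 21, GPA = 3.5)
```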
Note that we have to use the non-local assignment operator <<- since age isn't in the
method's local environment. This is important.
Using the simple assignment operator <- would have created a local variable
called age, which is not what we want. R will issue a warning in such a case.
> s$inc_age(5)
> s$age
[1] 26
> s$dec_age(10)
> s$age
[1] 16
R Inheritance
In this article, you'll learn everything about inheritance in R. More specifically, how to
create inheritance in S3, S4 and Reference classes, and use them efficiently in
your program.
Inheritance means defining new classes out of existing ones. That is to say, we can derive
new classes from existing base classes and add new features. We don't have to write from
scratch. Hence, inheritance provides reusability of code.
Inheritance forms a hierarchy of classes just like a family tree. An important thing to note is
that the attributes defined for a base class will automatically be present in the derived
class.
Moreover, the methods for the base class will work for the derived class.
Below, we discuss how inheritance is carried out for the three different class systems
in R programming language.
Inheritance in S3 Class
S3 classes do not have any fixed definition. Hence attributes of S3 objects can be
arbitrary.
Derived classes, however, inherit the methods defined for the base class. Let us suppose
we have a function that creates new objects of class student.
Inheritance is done by assigning a character vector of class names, like class(obj) <-
c(child, parent).
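The following sketch puts these pieces together under some assumptions: the constructor's argument names, the layout printed by print.student() and the values for John are illustrative, while the class name InternationalStudent is the one used in the text below.

```r
# Constructor for the base class "student"
student <- function(n, a, g) {
  value <- list(name = n, age = a, GPA = g)
  attr(value, "class") <- "student"
  value
}

# A print method defined for the base class only
print.student <- function(x, ...) {
  cat(x$name, "\n")
  cat(x$age, "years old\n")
  cat("GPA:", x$GPA, "\n")
  invisible(x)
}

# An object whose class vector lists the child first, then the parent
s <- list(name = "John", age = 21, GPA = 3.5, country = "France")
class(s) <- c("InternationalStudent", "student")

print(s)   # no print.InternationalStudent() exists, so print.student() is used
```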
We can see above that, since we haven't defined any method of the
form print.InternationalStudent(), the method print.student() got called. This
method of class student was inherited.
Now let us define print.InternationalStudent().
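A minimal sketch of such a method, with the object rebuilt here so the block runs on its own:

```r
# Method for the derived class; it takes precedence over print.student()
print.InternationalStudent <- function(x, ...) {
  cat(x$name, "is from", x$country, "\n")
  invisible(x)
}

s <- structure(list(name = "John", age = 21, GPA = 3.5, country = "France"),
               class = c("InternationalStudent", "student"))

print(s)
inherits(s, "student")   # the object still counts as a student
```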
This will override the method defined for class student as shown below.
> s
John is from France
> inherits(s,"student")
[1] TRUE
> is(s,"student")
[1] TRUE
Inheritance in S4 Class
Since S4 classes have proper definition, derived classes will inherit both attributes
and methods of the parent class.
Let us define a class student with a method for the generic function show().
Inheritance is done during the derived class definition with the argument contains.
Here we add one attribute, country; the rest will be inherited from the parent.
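A minimal sketch of both definitions, assuming the same slots and show() layout as in the S4 article above and an illustrative object for John:

```r
library(methods)

# Base class with a method for the show() generic
setClass("student", slots = list(name = "character", age = "numeric",
                                 GPA = "numeric"))
setMethod("show", "student", function(object) {
  cat(object@name, "\n")
  cat(object@age, "years old\n")
  cat("GPA:", object@GPA, "\n")
})

# Derived class: one new slot, the rest come from "student"
setClass("InternationalStudent", slots = list(country = "character"),
         contains = "student")

s <- new("InternationalStudent", name = "John", age = 21, GPA = 3.5,
         country = "France")

show(s)            # the inherited method for "student" is used
is(s, "student")
```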
> show(s)
John
21 years old
GPA: 3.5
We see that the method defined for class student got called when we did show(s).
We can define methods for the derived class which will override methods of the
base class, as in the case of the S3 system.
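Inheritance in Reference Class
For reference classes, the parent class is named in the contains argument of
setRefClass(). Starting from a student reference class with inc_age() and dec_age()
methods, we will now inherit from this class. We also override the dec_age() method to
add an integrity check to make sure age is never negative.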
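A minimal sketch of the base and derived reference classes follows. The field country and the object's values are assumptions for illustration; the error message matches the sample run below.

```r
library(methods)

# Base class with two methods that change the age field
student <- setRefClass("student",
  fields = list(name = "character", age = "numeric", GPA = "numeric"),
  methods = list(
    inc_age = function(x) {
      age <<- age + x
    },
    dec_age = function(x) {
      age <<- age - x
    }
  )
)

# Derived class: adds a field and overrides dec_age() with a check
InternationalStudent <- setRefClass("InternationalStudent",
  fields = list(country = "character"),
  contains = "student",
  methods = list(
    dec_age = function(x) {
      if ((age - x) < 0) stop("Age cannot be negative")
      age <<- age - x
    }
  )
)

s <- InternationalStudent(name = "John", age = 21, GPA = 3.5,
                          country = "France")
```

Because the check runs before the assignment, a rejected call leaves the field unchanged.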
> s$dec_age(5)
> s$age
[1] 16
> s$dec_age(20)
Error in s$dec_age(20) : Age cannot be negative
> s$age
[1] 16