Object-oriented programming and programming style guidelines for R

Henrik Bengtsson

hb@maths.lth.se

(MSc Computer Science, PhD candidate in Statistics)


Mathematical Statistics

Centre for Mathematical Sciences

Lund University, Sweden


Outline

Objects and Classes
Concepts of object-oriented programming
A complete example in [R]: Shapes
References in [R]
[R] Programming Style Guidelines with a few coding conventions



Part I: Object-oriented programming in [R]

Objects and Classes

Class name: MicroarrayData

Fields:
  layout: Layout
  R: double[][]
  G: double[][]
  Rb: double[][]
  Gb: double[][]

Methods:
  nbrOfSlides(): int
  nbrOfSpots(): int
  swapDyes(...)
  append()
  as.data.frame(): data.frame
  getLayout(): Layout
  setLayout(layout)
  subtractBackground(...)
  normalizeWithinSlide(...)
  normalizeAcrossSlides(...)
  plot(...)
  plotSpatial(...)
  boxplot(...)
  hist(...)
  static read(...): MicroarrayData
  write(...)

A class is a data type; an object is an instance of a class.

[Figure: objects of different classes, labeled MicroarrayData, Layout, MicroarrayData, MicroarrayData.]

A class is the recipe for a certain cake, and the objects are the actual cakes of that kind.
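As a minimal sketch in plain S3 (not the Object/setClassS3() framework used later in these slides), a class can be written as a constructor function, and each call to it creates a new object. The cut-down MicroarrayData below, with only the R and G fields and with rows taken as spots and columns as slides, is an assumption made for illustration.

```r
# Hypothetical, cut-down MicroarrayData class: only the R and G fields.
# Assumption: rows are spots and columns are slides.
MicroarrayData <- function(R, G) {
  stopifnot(is.matrix(R), is.matrix(G), all(dim(R) == dim(G)))
  structure(list(R = R, G = G), class = "MicroarrayData")
}

# Generic functions and their methods for class MicroarrayData
nbrOfSlides <- function(this, ...) UseMethod("nbrOfSlides")
nbrOfSlides.MicroarrayData <- function(this, ...) ncol(this$R)

nbrOfSpots <- function(this, ...) UseMethod("nbrOfSpots")
nbrOfSpots.MicroarrayData <- function(this, ...) nrow(this$R)

# Two objects (cakes) made from the same class (recipe)
ma1 <- MicroarrayData(R = matrix(1:6, nrow = 3), G = matrix(6:1, nrow = 3))
ma2 <- MicroarrayData(R = matrix(1:8, nrow = 4), G = matrix(8:1, nrow = 4))
```

Both ma1 and ma2 share the same methods but carry their own field values.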

Encapsulation, Inheritance, and Polymorphism

Encapsulation means that a group of related properties, methods, and other members are treated as a single unit or object. Objects can control how properties are changed and methods are executed.

Why: Makes it easier to change your implementation at a later date by letting you hide implementation details of your objects, a practice called data hiding.
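A small sketch of data hiding in plain S3 follows. The Circle class, its .radius field, and the accessor names are illustrative assumptions; the point is only that callers go through getRadius() and setRadius(), so the internal representation can change later without breaking them.

```r
# Constructor: the field '.radius' is private by convention only.
Circle <- function(radius = 1) {
  setRadius(structure(list(), class = "Circle"), radius)
}

getRadius <- function(this, ...) UseMethod("getRadius")
getRadius.Circle <- function(this, ...) this$.radius

setRadius <- function(this, radius, ...) UseMethod("setRadius")
setRadius.Circle <- function(this, radius, ...) {
  # The object controls how its state may change
  stopifnot(is.numeric(radius), length(radius) == 1, radius >= 0)
  this$.radius <- radius
  this  # plain S3 is copy-by-value, so the modified copy is returned
}

# Uses only the accessor, never the private field directly
getArea <- function(this, ...) UseMethod("getArea")
getArea.Circle <- function(this, ...) pi * getRadius(this)^2

circle <- Circle(radius = 2)
circle <- setRadius(circle, 3)
```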

Inheritance describes the ability to create new classes based on an existing class. The new class inherits all the properties, methods, and events of the base class, and can be customized with additional properties and methods.

Why: Promotes code reuse, since the code for the inherited methods does not need to be rewritten in the subclasses.
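In plain S3, inheritance can be sketched by prepending the subclass name to the class vector. The Shape and Circle constructors below are illustrative assumptions, not the classes from the talk.

```r
# Base class
Shape <- function(color = "black") {
  structure(list(color = color), class = "Shape")
}

getColor <- function(this, ...) UseMethod("getColor")
getColor.Shape <- function(this, ...) this$color

# Subclass: extends a Shape object with a radius field
Circle <- function(radius = 1, ...) {
  this <- Shape(...)
  this$radius <- radius
  class(this) <- c("Circle", class(this))
  this
}

circle <- Circle(radius = 5, color = "green")
circleColor <- getColor(circle)  # dispatches to the inherited getColor.Shape()
```

No getColor.Circle() has to be written; the method defined for Shape is reused.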

Polymorphism means that you can have multiple classes that can be used interchangeably, even though each class implements the same properties or methods in different ways. Polymorphism is essential to object-oriented programming because it allows you to use items with the same names, no matter what type of object is in use at the moment.

Why: Inheritance becomes more flexible. Subclasses can keep some methods inherited from their superclasses and override others.
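A brief sketch of polymorphism with S3 dispatch: the same call, getArea(), is applied to objects of different classes. The class and field names are assumptions chosen for illustration.

```r
# Helper constructor shared by the subclasses
Shape <- function(..., class) structure(list(...), class = c(class, "Shape"))
Square <- function(side) Shape(side = side, class = "Square")
Circle <- function(radius) Shape(radius = radius, class = "Circle")

# One generic, one implementation per class
getArea <- function(this, ...) UseMethod("getArea")
getArea.Square <- function(this, ...) this$side^2
getArea.Circle <- function(this, ...) pi * this$radius^2

# The caller does not need to know which class each element has
allShapes <- list(Square(side = 3), Circle(radius = 1))
areas <- sapply(allShapes, getArea)
```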

Overloading and Overriding

Overloaded members are used to provide different versions of a property or method that have the same name, but that accept a different number of parameters, or parameters with different data types. Currently not supported in [R].

Overridden properties and methods are used to replace an inherited property or method that is not appropriate in a derived class. Overridden members must accept the same data type and number of arguments (not enforced in [R]). Derived classes inherit overridden members.
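Overriding can be sketched in plain S3 as follows. The describe() generic and the two classes are hypothetical; NextMethod() is the base R mechanism for calling the inherited implementation from the overriding one.

```r
Shape <- function(color) structure(list(color = color), class = "Shape")

Circle <- function(radius, color) {
  this <- Shape(color)
  this$radius <- radius
  class(this) <- c("Circle", class(this))
  this
}

describe <- function(this, ...) UseMethod("describe")

describe.Shape <- function(this, ...) {
  sprintf("%s shape", this$color)
}

# Override: replaces describe.Shape() for Circle objects,
# while still reusing it through NextMethod()
describe.Circle <- function(this, ...) {
  inherited <- NextMethod()
  sprintf("%s with radius %g", inherited, this$radius)
}

shapeText <- describe(Shape("red"))
circleText <- describe(Circle(2, "green"))
```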


Unified Modeling Language (UML) class diagram

[Figure: UML class diagram. Legend: abstract, static, and private members; association (“using”) with multiplicities 1 and 0..*; inheritance (“is a”).]

Interactive example

# Create different Shape objects and store them in a list
allShapes <- list(
  Rectangle(Point(0,0), width=5, height=8, color="blue"),
  Square(Point(-2,-5), side=3, color="red"),
  Triangle(Point(3,3), width=10, height=12, color="orange"),
  Triangle(Point(-4,-2.5), width=12, height=3, color="purple"),
  Circle(Point(-4,4), radius=5, color="green")
)

# Plot all shapes
for shape in allShapes
  paint(shape)

# Get first mouse click
click <- getFromClick(Point)

while click is inside plot region
  # Check with all shapes whether they contain the click coordinates.
  for shape in allShapes
    if contains(shape, click) then
      paint(click, col=getColor(shape), style="disc")
    else
      paint(click, style="circle")

  # Get another mouse click
  click <- getFromClick(Point)

Notes on the slide:
polymorphism: paint() and contains() work for any kind of shape.
static method call: getFromClick(Point).
A method can be called as either shape$contains(click) or contains(shape, click).

Code for a class

# Class definition (constructor)
setClassS3("Point", function(x=0, y=0) {
  extend(Object(), "Point",
    .x = x,  # private
    .y = y   # private
  );
})

# Methods
setMethodS3("getX", "Point", function(this) {
  this$.x;
})

setMethodS3("getY", "Point", function(this) {
  this$.y;
})

setMethodS3("getXY", "Point", function(this) {
  c(this$.x, this$.y);
})

setMethodS3("setX", "Point", function(this, newX) {
  this$.x <- newX;  # Using reference!
})

setMethodS3("setY", "Point", function(this, newY) {
  this$.y <- newY;
})

setMethodS3("setXY", "Point", function(this, newXY) {
  this$.x <- newXY[1];
  this$.y <- newXY[2];
})

# Static method
setMethodS3("getFromClick", "Point", function(this) {
  xy <- locator(n=1);  # Ask for one mouse click
  Point(x=xy$x, y=xy$y);
})

setMethodS3("print", "Point", function(this) {
  print(sprintf("%s at (%.3f,%.3f).",
                getClass(this), this$.x, this$.y));
})

setMethodS3("paint", "Point", function(this, ...) {
  points(this$.x, this$.y, ...);
})

References

One reference can only refer to one object.

One object can have one or several references referring to it.

c <- Circle(Point(0,0), radius=2);
c1 <- c;
setRadius(c1, 4);
getRadius(c);  # will give 4!

Results in more user-friendly packages for the end user!

Makes design and implementation much easier.

Saves memory.

References, how?

References are not supported by [R], since everything is copy-by-value. A function therefore has to return a new instance:

setValue <- function(list, value) {
  list$value <- value;
  return(list);
}

What you really want to do:

setValue <- function(list, value) {
  list$value <- value;
}
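The copy-by-value behavior can be checked with a short, self-contained sketch in base R. The variables a, b, and before are illustrative.

```r
setValue <- function(list, value) {
  list$value <- value  # modifies the function's local copy only
  list                 # so the modified copy has to be returned
}

a <- list(value = 1)
b <- setValue(a, 2)    # 'b' is a modified copy ...
before <- a$value      # ... while 'a' still holds the old value
a <- setValue(a, 2)    # the caller has to reassign to "update" 'a'
```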



Why: For example, each of the Shape objects can use (refer to) the same Point object to specify its position. By moving the Point object, all Shape objects will then move along. This is not possible without references.

However, references can be emulated by encapsulating such functionality in a root class Object, from which all classes are required to derive.

Contact me to get the code for Object, setClassS3() and setMethodS3().
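One way such an emulation can be sketched in base R is with environments, which are not copied on assignment. This is an assumption about the general technique and is not the author's code for Object, setClassS3(), or setMethodS3(); the Circle class here is a simplified stand-in.

```r
# Constructor: the object is an environment, so assigning it to another
# variable creates a second reference to the same object, not a copy.
Circle <- function(radius = 1) {
  this <- new.env()
  this$.radius <- radius
  class(this) <- "Circle"
  this
}

getRadius <- function(this, ...) UseMethod("getRadius")
getRadius.Circle <- function(this, ...) this$.radius

setRadius <- function(this, radius, ...) UseMethod("setRadius")
setRadius.Circle <- function(this, radius, ...) {
  this$.radius <- radius  # modifies the shared object in place
  invisible(this)
}

circle <- Circle(radius = 2)
circle1 <- circle          # second reference to the same object
setRadius(circle1, 4)
radius <- getRadius(circle)
```

Here the change made through circle1 is visible through circle, and no reassignment is needed.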

Part II: [R] Programming Style Guidelines

Programming Style Guidelines

80% of the lifetime cost of a piece of software goes to maintenance.

Hardly any software is maintained for its whole life by the original author.

Code conventions improve the readability of the software, allowing programmers to understand new code more quickly and thoroughly.

If you ship your source code as a product, you need to make sure it is as well packaged and clean as any other product you create.

[R] Coding Convention

Currently there is no [R] Coding Convention (RCC), and people invent their own conventions or use none at all.

We suggest adopting a modified version of the Java coding convention, which has proved to be successful and is a de facto standard.

Class names

Names representing classes must be nouns and written in mixed case starting with upper case.

Shape, Rectangle, Point, MicroarrayData, Layout

Avoid . (period) in class names, because it might lead to ambiguities, e.g. my.very.own.class is not a good name.

Field and variable names

Variable and field names must be in mixed case starting with lower case.

x, y, nbrOfSlides, locus

To maintain readability of the code, do not shorten variable names, e.g. nbrOfGrids (or ngrids) is much better than ngr.

Avoid using . (period) in variable names to make names more consistent with other naming conventions. However, private fields, e.g. layout., may contain periods for improving readability.

Method names

Names representing methods (functions) must be verbs and written in mixed case starting with lower case.

getLayout(), normalize(method, slides)

To maintain readability of the code, do not shorten method names, e.g. normalizeWithinSlides() is much better than normWSl().

For the same reasons as before, avoid using . (period) in method names, e.g. get.layout() is not good.
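Taken together, the naming rules for classes, fields, and methods might look as follows in plain S3. The Layout fields and the getNbrOfSpots() method are hypothetical and serve only to illustrate the conventions.

```r
# Class name: a noun in mixed case starting with upper case
Layout <- function(nbrOfGrids = 1, nbrOfSpotsPerGrid = 1) {
  # Field names: mixed case starting with lower case, not shortened
  structure(
    list(nbrOfGrids = nbrOfGrids, nbrOfSpotsPerGrid = nbrOfSpotsPerGrid),
    class = "Layout"
  )
}

# Method name: a verb in mixed case starting with lower case.
# The only period is the one S3 itself uses to separate generic and class.
getNbrOfSpots <- function(this, ...) UseMethod("getNbrOfSpots")
getNbrOfSpots.Layout <- function(this, ...) {
  this$nbrOfGrids * this$nbrOfSpotsPerGrid
}

layout <- Layout(nbrOfGrids = 4, nbrOfSpotsPerGrid = 100)
nbrOfSpots <- getNbrOfSpots(layout)
```

Because S3 reads a name such as getNbrOfSpots.Layout as generic plus class, extra periods inside either part make it harder to tell where one ends and the other begins.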


File names

Classes should be declared in individual files with the file name matching the class name.

Point.R, GenePixData.R

Results in a well-organized file structure and also gives quick access to the source code. Listing all *.R files in a source directory will give you an overview of all the classes.

For stand-alone functions one may adopt the same policy:

intToHex.R, col2rgb.R
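With one class per file, the overview mentioned above can be obtained with a few lines of base R. The directory and file names in this sketch are made up and created in a temporary location so that it runs on its own.

```r
# Hypothetical source directory, created here only so the sketch runs
srcDir <- file.path(tempdir(), "src")
dir.create(srcDir, showWarnings = FALSE)
invisible(file.create(
  file.path(srcDir, c("Point.R", "Shape.R", "Circle.R", "README.txt"))
))

# The class names are the *.R file names without the extension
files <- list.files(srcDir, pattern = "\\.R$")
classNames <- sub("\\.R$", "", files)
```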


Where to start

Tutorials and source code:

R Programming Style Guidelines
Programming with References in R
Implementing support for references in R

http://www.maths.lth.se/help/R/