Feb 5, 2013

An Implementation of Mostly-Copying GC on Ruby VM

Tomoharu Ugawa

The University of Electro-Communications, Japan


Background (1/2)

Scripting languages are now used in a wide range of settings
  Before: only for tiny applications
    Short lifetime
    Runs with little memory
    GC (Garbage Collection) was not important
  Now: for servers such as Rails, as well
    May have long lifetime
    May create a lot of objects
    GC has a great impact on total performance

Background (2/2)

Ruby’s GC
  Conservative Mark-Sweep GC
  Does not move objects
Once the heap has been expanded, it can hardly be shrunk
  A heap cannot be released unless it contains NO objects
  Such lucky cases rarely happen
Ex) Once a server uses a lot of memory for a heavy request, it will keep running with a large heap even after responding to the request.

[Figure: Initial Heap, Additional Heap 1, Additional Heap 2; live objects are marked]

Goal

Compact the heap so that Ruby can return unused memory to the OS.
  Use Mostly-Copying GC
    Modify the algorithm for Ruby
    Minimize the changes to C-libraries

Agenda

Shrinking the heap using Mostly-Copying GC
Modified Mostly-Copying algorithm
Evaluation
Related work
Conclusion


Why does Ruby not move objects?

move -> all pointers to the moved object have to be updated
Ruby’s GC does not recognize all pointers to Ruby objects
  In the C runtime stack
  In regions allocated using “malloc” by C-libraries
  Such pointers cannot be updated

[Figure: an ambiguous root (cloud mark), ambiguous pointers (blue arrows), exact pointers, and an object being moved]
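
To make the notion of an ambiguous pointer concrete, here is a minimal sketch of conservative scanning. The heap range, the slot size, and the name `ambiguous_pointer?` are illustrative assumptions and are not taken from Ruby's actual collector.

```ruby
# Hypothetical model: the heap occupies [HEAP_START, HEAP_END) and is
# divided into fixed-size slots. All constants are made up for illustration.
HEAP_START = 0x1000
HEAP_END   = 0x2000
SLOT_SIZE  = 40 # e.g. 5 words of 8 bytes each

# A conservative collector sees only raw machine words. Any word that
# could be the address of a slot has to be treated as a possible pointer.
def ambiguous_pointer?(word)
  word >= HEAP_START && word < HEAP_END &&
    ((word - HEAP_START) % SLOT_SIZE).zero?
end

# Words found on the C stack: some are real pointers, some may be integers
# that merely look like one. The collector cannot tell them apart.
stack_words = [0x1000, 0x1028, 42, 0x1001, 0x3000]
candidates = stack_words.select { |w| ambiguous_pointer?(w) }
```

Since a candidate such as `0x1028` might really be an integer, the collector must not overwrite it with a new address. The object it may refer to therefore has to stay where it is.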

Even so, we CAN move most objects

We can update pointers contained in Ruby objects
  Objects referred to only from Ruby objects can be moved
Most objects are referred to only from Ruby objects
  Most objects can be moved

This is the basic idea of the Mostly-Copying GC

Mostly-Copying GC [Bartlett ’88]

Objects referred to only by exact pointers
  Move them and update the referencing pointers
Objects referred to by ambiguous pointers (possibly by exact pointers as well)
  Do not move them
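
The rule above can be sketched as a simple classification. The reference list and the symbols `:a` to `:d` are made-up example data. A real collector discovers the references by tracing rather than reading them from a list.

```ruby
# Each reference is [kind, target]. kind is :exact (held in a Ruby object,
# so the collector can rewrite it) or :ambiguous (C stack, malloc-ed data).
references = [
  [:exact, :a], [:exact, :b], [:ambiguous, :b], [:ambiguous, :c], [:exact, :d]
]

# An object is pinned as soon as one ambiguous pointer may refer to it,
# even if exact pointers refer to it as well.
pinned  = references.select { |kind, _| kind == :ambiguous }.map { |_, t| t }.uniq
movable = references.map { |_, t| t }.uniq - pinned
```

Here `:b` is pinned although an exact pointer also refers to it, which corresponds to the "as well" case on the slide.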

The heap of Mostly-Copying GC

Break the heap into equal-sized blocks
The from-space of the copying GC is a set of blocks

[Figure: root and heap blocks labeled To, To, To, From, From, From]

Shrinking the heap

Free blocks are not contiguous in a mostly-copying collector
  Release memory block by block
  Block = hardware page
To release a block, do not access the block
  Because such a block has no live objects, all we have to do is not allocate new objects in it
  The virtual memory system automatically reuses the page frame assigned to the block
(optional) We can tell the OS that the page has no valid data
  madvise system call (Linux)
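
A minimal sketch of block-by-block release follows. The per-block live counts are invented example data, and the sketch only models the bookkeeping; it does not call madvise.

```ruby
# Hypothetical heap: block number => number of live objects after GC.
live_counts = { 0 => 12, 1 => 0, 2 => 3, 3 => 0, 4 => 0, 5 => 7 }

# A block with no live objects can be released on its own, even when its
# neighbours are still in use (free blocks need not be contiguous).
releasable = live_counts.select { |_, live| live.zero? }.keys

# The allocator simply stays away from the released blocks. In a real
# collector this is the point where the OS could additionally be told,
# for each block's page, that it holds no valid data.
allocatable = live_counts.keys - releasable
```

Blocks 1, 3, and 4 are released although block 2 sits between them, which is why the unit of release has to be the block rather than one contiguous tail of the heap.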

C-libraries

C-libraries wrap “malloc”-ed data so that it can be handled as Ruby objects. A wrapper object has:
  A pointer to the “malloc”-ed area
  A function that “marks” objects referred to from the data
  NO pointer-updating interface

traverse(data) {
  mark(data->p1);
  mark_location(…);
}

[Figure: “malloc”-ed data holding the pointer p1]

Treat all pointers from “malloc”-ed data as ambiguous pointers

Agenda

Shrinking the heap using Mostly-Copying GC
Modified Mostly-Copying algorithm
Evaluation
Related work
Conclusion

Mostly-Copying GC of Bartlett

Objects referred to only from exact pointers
  Copy them to to-space
Objects referred to from ambiguous pointers
  Move the containing block to to-space logically (they call this promotion)
The algorithm may encounter new ambiguous pointers, and the object pointed to may already have been copied.
  Bartlett’s algorithm copies all objects, even those pointed to by ambiguous pointers.
  Objects in promoted blocks are eventually written back from their copies.

Problem

Memory efficiency
  Objects are copied even when they are referred to by ambiguous pointers
  Garbage in promoted pages is not collected

[Figure: heap diagram with root]

Modify the algorithm

Mark-Sweep GC before Copying
  Mark: find the ambiguous roots
    Objects referred to by ambiguous pointers are no longer copied
  Sweep (only promoted blocks)
    Each block has a free-list
    All Ruby objects are 5 words
      => Does not cause (external) fragmentation
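
The point about fixed-size objects can be illustrated with a toy block that keeps its own free-list. The `Block` class and its methods are hypothetical and model slots as array indices rather than 5-word cells.

```ruby
# Hypothetical block of fixed-size slots. nil marks a free slot.
class Block
  attr_reader :slots, :free_list

  def initialize(size)
    @slots = Array.new(size)      # every slot starts free
    @free_list = (0...size).to_a  # indices of the free slots
  end

  # Any free slot fits any object, because all objects have the same size.
  def allocate(obj)
    index = @free_list.shift
    return nil if index.nil?      # the block is full
    @slots[index] = obj
    index
  end

  # Sweep: return every occupied but unmarked slot to the free-list.
  def sweep(marked_indices)
    @slots.each_index do |i|
      next if @slots[i].nil? || marked_indices.include?(i)
      @slots[i] = nil
      @free_list.push(i)
    end
  end
end

block = Block.new(4)
4.times { |n| block.allocate("obj#{n}") }
block.sweep([0, 2])               # only slots 0 and 2 are live
reused = block.allocate("new")    # lands in a hole left by the sweep
```

The new object goes straight into a hole left between two pinned objects, so a swept promoted block stays usable without compaction.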

Modified Algorithm (1/4)

Trace pointers from the root set
  Mark all visited objects
  Promote blocks containing objects referred to by ambiguous pointers

[Figure: heap diagram with root; promoted blocks have a thick border, visited objects carry a live mark]

Modified Algorithm (2/4)

Sweep promoted blocks
  Collect objects that are not marked

[Figure: heap diagram with root]

Modified Algorithm (3/4)

Copying GC (using the promoted blocks as the root set)
  Do not copy objects in promoted blocks

[Figure: heap diagram with root]

Modified Algorithm (4/4)

Scan promoted blocks to clear the mark of each object

[Figure: heap diagram with root; three blocks are labeled Free]
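
The four steps can be put together in a small simulation. This is a toy model under stated assumptions, not the implementation described in the talk: `Obj`, `collect`, and the block numbering are hypothetical, every pointer stored in `refs` is exact, and the only roots are ambiguous ones.

```ruby
# Toy object. Pointers stored in refs are exact pointers.
class Obj
  attr_accessor :name, :refs, :block, :marked

  def initialize(name, block)
    @name = name
    @block = block
    @refs = []
    @marked = false
  end
end

# blocks: { block number => array of objects }. Returns the freed blocks.
def collect(blocks, ambiguous_roots)
  # (1/4) Mark all reachable objects; promote the blocks holding objects
  # that ambiguous pointers refer to.
  promoted = ambiguous_roots.map(&:block).uniq
  stack = ambiguous_roots.dup
  until stack.empty?
    obj = stack.pop
    next if obj.marked
    obj.marked = true
    stack.concat(obj.refs)
  end

  # (2/4) Sweep promoted blocks only: unmarked objects are garbage.
  promoted.each { |b| blocks[b].select!(&:marked) }

  # (3/4) Copying GC with the objects in promoted blocks as the roots.
  to_block = blocks.keys.max + 1
  to_space = []
  forward = {} # original object => its copy in to-space
  pinned = promoted.flat_map { |b| blocks[b] }
  worklist = pinned.dup
  until worklist.empty?
    obj = worklist.shift
    obj.refs = obj.refs.map do |ref|
      next ref if promoted.include?(ref.block) # pinned: do not copy
      forward[ref] ||= begin
        copy = Obj.new(ref.name, to_block)
        copy.refs = ref.refs
        to_space << copy
        worklist << copy
        copy
      end
    end
  end

  # (4/4) Clear the marks left on objects in promoted blocks.
  pinned.each { |o| o.marked = false }

  # The remaining from-space blocks hold no live objects any more.
  freed = blocks.keys - promoted
  freed.each { |b| blocks.delete(b) }
  blocks[to_block] = to_space unless to_space.empty?
  freed
end

# a is referred to by an ambiguous root; b and d are garbage.
a = Obj.new("a", 0); b = Obj.new("b", 0)
c = Obj.new("c", 1); d = Obj.new("d", 1)
e = Obj.new("e", 2)
a.refs = [c]
c.refs = [e]
blocks = { 0 => [a, b], 1 => [c, d], 2 => [e] }
freed = collect(blocks, [a])
```

In this run block 0 is promoted, so `a` stays in place and `b` is swept from it. `c` and `e` are copied into a fresh block, the pointer inside `a` is updated to the copy, and blocks 1 and 2 become free.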

The only change to C-libraries

Mark-array
  An array that holds the same pointers as those held in “malloc”-ed data
  The C-library marks only the mark-array
    The collector can traverse further
    But it cannot recognize that they are ambiguous pointers
    Remember: all pointers from “malloc”-ed data are treated as ambiguous ones
Impact
  2 modules
  3 parts

Change C-libraries so that THEY scan the mark-array as ambiguous roots

Evaluation

Ruby VM
  YARV r590 (This is old but has essentially the same GC as Ruby 1.9)
Items
  Heap size
  Elapsed time
Environment
  CPU: Pentium 3GHz
  OS: Linux 2.6.22
  compiler: gcc 4.1.3 (-O2)


Benchmark Program

2.times {
  ary = Array.new
  # Increases live objects (processing a heavy request)
  10000.times { |i|
    ary[i] = Array.new
    (1..100).each { |j|
      ary[i][j - 1] = 1.to_f / j.to_f
    }
    if (i % 100 == 0) then CP() end
  }
  # Decreases live objects (end of the heavy request)
  10000.times { |i|
    ary[i] = nil
    if (i % 100 == 0) then CP() end
  }
  # Makes short-lived objects (series of ordinary requests)
  30000.times { |i|
    100.times { "" }
    if (i % 100 == 0) then CP() end
  }
}

CP(): checkpoint; the heap is profiled every 100 loop iterations
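
The slides do not show how CP() is defined. To try the benchmark on a stock CRuby, one possible stand-in is sketched below. The constant `CHECKPOINTS` and the choice of `ObjectSpace.count_objects` as the data source are assumptions, and the counts it reports are slot counts rather than the heap size in MB plotted in the talk.

```ruby
# Hypothetical stand-in for the CP() checkpoint used in the benchmark.
# It records the slot counts reported by ObjectSpace.count_objects (CRuby).
CHECKPOINTS = []

def CP()
  counts = ObjectSpace.count_objects
  CHECKPOINTS << { total: counts[:TOTAL], live: counts[:TOTAL] - counts[:FREE] }
end

# A scaled-down run of the first two phases of the benchmark.
ary = Array.new
1000.times { |i|
  ary[i] = Array.new(100) { |j| 1.0 / (j + 1) }
  if (i % 100 == 0) then CP() end
}
1000.times { |i|
  ary[i] = nil
  if (i % 100 == 0) then CP() end
}
```

Each entry of `CHECKPOINTS` then gives the total and the occupied number of object slots at one checkpoint, which is enough to see whether the heap shrinks again after the decreasing phase.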

Heap size

[Figure: heap size (MB) at each checkpoint for Our VM and Traditional VM; the black line shows the amount of live objects]

Relative elapsed time of our VM (relative to traditional VM)

[Figure: bar chart of relative elapsed time (%), axis from 0 to 140, split into GC and computation, for the benchmarks factorial, mandelbrot, raise, strconcat, concatenate, count_words, exception, lists, object, random, array, regexp, send, thread]

Average (except for thread): 102%

Related work

Customizable Memory Management Framework [Attardi et al. ’94]
  Collects garbage by sweeping promoted blocks
  Ambiguous pointers are found during copying
    If an object has already been copied when the collector recognizes that it should not be copied, its copy becomes garbage
  Our algorithm detects such objects before copying

Related work

MCC [Smith et al. ’98]
  Pins objects referred to from ambiguous roots
  Always manages the locations of ambiguous roots in a list
    C-libraries have to register/unregister ambiguous roots each time they “malloc”/“free”
  Our algorithm finds ambiguous roots by tracing at the beginning of GC


Related work

Ruby 1.9
  Reduces the size of an additional heap to 16KB (i.e., the heap is expanded in 16KB blocks)
    Increases the opportunity for releasing
  Objects become distributed all over the heap as execution advances
    We compact the heap

Conclusion

Implemented mostly-copying GC on Ruby VM
  Modified the algorithm for memory efficiency
Evaluated the implementation
  Shrank the heap after those phases of a program where it temporarily uses a lot of memory
  Elapsed time to execute benchmarks is comparable to traditional VM

Heap size (with Ruby 1.9)

[Figure: heap size (MB) at each checkpoint for Our VM, YARV, and Ruby 1.9; the black line shows the amount of live objects]

Increases as time passes (even with Ruby 1.9)

Benchmark Program 2

2.times {
  ary = Array.new
  10000.times { |i|
    ary[i] = Array.new
    (1..100).each { |j|
      ary[i][j - 1] = 1.to_f / j.to_f
    }
    if (i % 100 == 0) then CP() end
  }
  10000.times { |i|
    ary[i] = nil
    if (i % 100 == 0) then CP() end
  }
  30000.times { |i|
    100.times { "" }
    if (i % 100 == 0) then CP() end
  }
}

Callout:

  sum = 0
  ary[i].each {|x| sum+=x}
  ary[i] = sum

Makes some long-lifetime objects during the decreasing phase

Heap size (benchmark 2)

[Figure: heap size (MB) at each checkpoint for YARV, Our VM, and Ruby 1.9]

Relative elapsed time of the VM with Bartlett’s algorithm (relative to traditional VM)

[Figure: bar chart of relative elapsed time (%), axis from 0 to 140, split into GC and computation, for the benchmarks factorial, mandelbrot, raise, strconcat, concatenate, count_words, exception, lists, object, random, array, regexp, send, thread]

Related work

Generational GC for Ruby [Kiyama ’01]
  Generational Mark-Sweep GC
  Reduced GC time
  Uses much memory
    All objects have two extra words (doubly-linked list) for representing generations
Mostly-Copying GC can divide the space for generations [Bartlett et al. ’89]