Multithreading in Java (AW3)

Nov 18, 2013

Assignment for the M.Sc. Course on
Distributed Systems
Adrian Reber
Table of Contents
1. Cooperation
2. Introduction
2.1. Threads
2.2. Threads in Java
2.3. Monitors in Java
3. Aims and objectives
3.1. Thread-per-request architecture
3.2. Thread-pool architecture
4. Implementation
4.1. The common part
4.2. The thread-per-request architecture
4.3. The thread-pool architecture
4.4. The jobs
5. Evaluation
5.1. I/O job
5.2. Arithmetic job
5.3. Cryptographic job
6. Conclusion
6.1. Queues
6.2. Cooperation
Bibliography
Glossary
1. Cooperation
Some parts of this assignment were carried out in cooperation with Volker Heymann, Jörg
Seitter and David Vogler. In this team, it was agreed to use the same interface for the jobs
as well as for the jobserver. By using a common interface for the jobs and the jobserver, it
is possible to exchange jobserver and jobs very easily, thus making it very easy to compare
different architectures and different kinds of jobs.
2. Introduction
2.1. Threads
The traditional approach to concurrent processes is to spawn (fork()) a new process for
every request for the service this process is providing. A different approach to concurrency
is threads.
A thread in programming terms is a lightweight process that runs in parallel (or rather
quasi-parallel, at least on a single-processor machine) to others within the same address
space. The difference from spawning a new process is that the creation of a thread produces
much less overhead.
2.2. Threads in Java
There are two different ways to create a thread in Java. Either the new class extends the
class Thread or it implements the interface Runnable.
In both cases the code in the method run() is the part which is executed as a separate
thread of the process which starts it. The thread is started by calling the method start()
of the thread instance.
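The two ways of creating a thread can be sketched as follows. This is only a minimal illustration; the class names CountingThread and CountingTask and the helper runBoth() are made up for this example and are not part of the assignment code.

```java
class ThreadCreationDemo {
    // Variant 1: extend Thread and override run().
    static class CountingThread extends Thread {
        int sum;                       // read by the caller only after join()
        @Override
        public void run() {
            for (int i = 1; i <= 10; i++) sum += i;
        }
    }

    // Variant 2: implement Runnable and hand the object to a Thread.
    static class CountingTask implements Runnable {
        int sum;
        @Override
        public void run() {
            for (int i = 1; i <= 10; i++) sum += i;
        }
    }

    /** Runs both variants in their own threads and returns the two sums added together. */
    static int runBoth() {
        CountingThread first = new CountingThread();
        CountingTask task = new CountingTask();
        Thread second = new Thread(task);
        first.start();                 // start() makes run() execute in a new thread
        second.start();
        try {
            first.join();              // wait until both threads have finished
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return first.sum + task.sum;
    }

    public static void main(String[] args) {
        System.out.println(runBoth()); // prints 110
    }
}
```

Implementing Runnable is usually the more flexible choice, because the class remains free to extend some other class.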
2.3. Monitors in Java
Monitors are used to protect data. Mutual exclusion is assured in Java if the monitor concept
is implemented. The monitor concept guarantees that shared parts of a program are accessed
by only one thread at a time. Implementing a monitor is rather simple: the methods which
should be synchronized (only one of these methods can be executed on the same object at any
given time) are simply declared with the keyword synchronized.
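A minimal sketch of such a monitor is shown below. The class SharedCounter and the helper countWith() are hypothetical names chosen for this illustration; the point is only that the synchronized methods make concurrent increments safe.

```java
class MonitorDemo {
    /** Shared data guarded by the monitor of the SharedCounter object. */
    static class SharedCounter {
        private int value;

        // Only one thread at a time can be inside a synchronized method of this object.
        synchronized void increment() {
            value++;
        }

        synchronized int get() {
            return value;
        }
    }

    /** Starts the given number of threads; each increments the counter perThread times. */
    static int countWith(int threads, int perThread) {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) counter.increment();
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, 10000)); // prints 40000
    }
}
```

Without the keyword synchronized on increment(), updates from different threads could be lost and the final value could be smaller than expected.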
3. Aims and objectives
The goal of this assignment is to design and implement a jobserver using threads in Java.
A jobserver is a system which receives different kinds of jobs from different clients,
executes them and returns the results to the clients. Every job has to access some shared
data from the jobserver. This data has to be protected by a mechanism to ensure mutual
exclusion.
The following two jobserver architectures are studied:
• thread-per-request architecture
• thread-pool architecture
3.1. Thread-per-request architecture
This architecture creates a thread for every request reaching the jobserver. This architecture
has obvious drawbacks. One drawback is that, for every job the jobserver receives, a new
thread has to be created. Creation of a new thread or process is always a resource-consuming
act and takes a lot of time. Every thread which has finished executing its job just exits
and cannot be used for another job. A better solution would be to create a number of threads
at startup and only activate (rather than create) them when a job is received.
3.2. Thread-pool architecture
This architecture avoids the overhead of thread creation by creating a pool of threads at
startup and keeping them running as long as the jobserver is running. This requires a
queue to manage the incoming jobs. Every job submitted by a client is immediately accepted
by the server and put in an incoming queue. The running threads take jobs from this queue
whenever they have finished their current job and are idle again.
A possible extension to this architecture is to create more threads if there are many jobs
in the incoming queue. This would be a combination of both architectures. A given number
of threads is spawned at startup and, if there are many requests for the service, the
jobserver can increase the number of threads to react dynamically to the new circumstances.
In contrast to the thread-per-request architecture, these new threads are created once and
then used for many jobs. It is important to set a limit on the number of concurrent threads
in the system so that the workload is appropriate for the environment (e.g. hardware) the
jobserver is running in.
From a theoretical point of view, the thread-pool architecture seems much more intelligent
and efficient. However, this will be discussed later in more detail in the analysis of the two
systems. Another factor in the efficiency of the system is probably the type of job the
jobserver has to perform. Jobs which are very calculation-intensive probably behave
differently from jobs which are more I/O-intensive.
4. Implementation
Both architectures (thread-pool and thread-per-request) were implemented. The implementation
of both architectures was straightforward and some points of the implementation are
very similar.
4.1. The common part
As already mentioned in the beginning, some parts of this assignment were carried out in
cooperation. The following figure shows the class AbstractJob and the interface JobEngine
which are the basis of both implementations.
Class AbstractJob, public methods:
run(), execute(), getCommunicationTime(), getComputingTime(), getIncomingQueueTime(),
getOutgoingQueueTime(), getTotalTime(), getClientReceived(), getClientSubmitted(), isDone(),
getId(), getInQueueReceived(), getInQueueSubmitted(), getOutQueueReceived(),
getOutQueueSubmitted(), getProcessTimeStart(), getProcessTimeStop(), setClientReceived(),
setClientSubmitted(), setDone(), setId(), setInQueueReceived(), setInQueueSubmitted(),
setOutQueueReceived(), setOutQueueSubmitted(), setProcessTimeStart(), setProcessTimeStop(),
setContext(), dumpStat(), getOutQueueSize(), setOutQueueSize(), getInQueueSize(),
setInQueueSize()
Interface JobEngine, public methods:
submitJob(), pollResult(), pollResult(), getJobCount(), getAverageJobTime(), getMaxJobTime(),
setJobTime(), setJobCounter()
Figure 1. The common parts
The advantage of using a common and well-defined interface is that it is easy to exchange
jobserver and jobs between all members of the team to evaluate different jobserver
architectures and different kinds of job.
Because the interface JobEngine is designed with RMI¹ in mind, the jobserver
implementations are RMI-aware and the clients connect to the server via RMI. This was
not a requirement of this assignment, but by using RMI, the idea of a jobserver running on a
different machine than that of the client was realized.
Another common attribute of the jobservers is that the jobs are not returned to the clients.
The jobs are always put in a queue where they wait to be picked up by the client. This polling
architecture was chosen because, with this design, the job does not need to hold a reference
to the client which submitted it to the jobserver. In general, holding such a reference would
not be a bad idea, but our common design was chosen to be extremely flexible and extensible.
While using RMI, it would be no problem to use a reference to the client, but if the
communication between server and client changed, then it would get very difficult.
Another requirement was that the jobs have to access shared data at the jobserver. The
methods providing write access to the shared data are synchronized using the monitor concept
to ensure mutual exclusion.
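The polling idea can be sketched without RMI as follows. This is only an illustration under simplifying assumptions: results are plain strings, and the names ResultQueue, put(), pollResult() and submitAndPoll() as well as their signatures are chosen for this example and are not taken from the actual implementation.

```java
import java.util.LinkedList;

class PollingDemo {
    /** Outgoing queue: finished results wait here until a client polls for them. */
    static class ResultQueue {
        private final LinkedList<String> finished = new LinkedList<>();

        // Called by the server once a job is done.
        synchronized void put(String result) {
            finished.addLast(result);
        }

        // Called by the client; returns null when nothing is ready yet.
        synchronized String pollResult() {
            return finished.isEmpty() ? null : finished.removeFirst();
        }
    }

    /** Simulates a server thread finishing one job and a client polling for the result. */
    static String submitAndPoll(String payload) {
        ResultQueue queue = new ResultQueue();
        Thread server = new Thread(() -> queue.put("done:" + payload));
        server.start();
        String result;
        while ((result = queue.pollResult()) == null) {
            Thread.yield();            // keep polling; a real client would sleep in between
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(submitAndPoll("job-1")); // prints done:job-1
    }
}
```

The server never needs to know who submitted a job; it only fills the queue, which is what keeps the design independent of the communication mechanism.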
4.2. The thread-per-request architecture
This architecture is really simple. The jobserver receives a job from a client and immediately
creates a new thread. This thread executes the method execute of the job and waits until
the job has finished. After the job has been completed, it is put in a queue, where it waits to
be picked up by the client. The thread dies after the execution of the job and is not reused.
This implementation has no limitations regarding the number of threads running.
1. RMI - Remote Method Invocation
My expectation is that this architecture is slower due to the overhead caused by the fact that
for every job a new thread is created and then not reused.
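A minimal sketch of the thread-per-request idea, assuming a simplified Job interface with a single execute() method (the real jobs carry timestamps and shared-data access that are left out here). The names Job, Server, awaitAll() and run() are hypothetical and only serve this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ThreadPerRequestDemo {
    /** Hypothetical stand-in for a job: only execute() is needed here. */
    interface Job {
        void execute();
    }

    static class Server {
        private final List<Job> outgoing = new ArrayList<>();   // finished jobs
        private final List<Thread> created = new ArrayList<>(); // every thread ever created

        // One brand-new thread per submitted job; the thread ends with the job.
        void submitJob(Job job) {
            Thread t = new Thread(() -> {
                job.execute();
                synchronized (outgoing) {
                    outgoing.add(job);
                }
            });
            created.add(t);
            t.start();
        }

        // Helper for the demo: waits for all threads created so far.
        void awaitAll() {
            for (Thread t : created) {
                try {
                    t.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }

    /** Submits the given number of empty jobs; returns {threads created, jobs finished}. */
    static int[] run(int jobs) {
        Server server = new Server();
        for (int i = 0; i < jobs; i++) {
            server.submitJob(() -> { });
        }
        server.awaitAll();
        synchronized (server.outgoing) {
            return new int[] { server.created.size(), server.outgoing.size() };
        }
    }

    public static void main(String[] args) {
        int[] r = run(5);
        System.out.println(r[0] + " threads, " + r[1] + " finished jobs");
    }
}
```

The number of threads created always equals the number of jobs submitted, which is exactly the overhead this architecture is expected to suffer from.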
4.3. The thread-pool architecture
This architecture is somewhat more sophisticated. The main difference from the
thread-per-request architecture is that there are a number of threads which are always
running even if there is nothing to do. To prevent idle threads from using too much CPU time,
the threads are put in a wait state by calling the wait() method.
The jobs submitted by any number of clients are not directly executed by the waiting threads
but are put in an incoming queue. As soon as there are elements in this queue, the sleeping
(waiting) threads are woken up by a call to the method notify().
The thread then starts the job and, if there are no more jobs in the incoming queue, goes
back to its wait state until it is notified again. The finished jobs are put in an outgoing queue
just as in the thread-per-request architecture.
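The wait()/notify() mechanism described above can be sketched like this. It is an illustration only: jobs are plain Runnable objects, the incoming queue itself is assumed to serve as the monitor object, and the names Pool, workLoop() and shutdownAndCount() are invented for this example.

```java
import java.util.LinkedList;

class WaitNotifyPoolDemo {
    static class Pool {
        private final LinkedList<Runnable> incoming = new LinkedList<>();
        private final LinkedList<Runnable> outgoing = new LinkedList<>();
        private final Thread[] workers;
        private boolean shutdown;      // guarded by the monitor of 'incoming'

        Pool(int size) {
            workers = new Thread[size];
            for (int i = 0; i < size; i++) {
                workers[i] = new Thread(this::workLoop);
                workers[i].start();    // threads are created once, at startup
            }
        }

        void submit(Runnable job) {
            synchronized (incoming) {
                incoming.addLast(job);
                incoming.notify();     // wake up one waiting worker
            }
        }

        private void workLoop() {
            while (true) {
                Runnable job;
                synchronized (incoming) {
                    while (incoming.isEmpty() && !shutdown) {
                        try {
                            incoming.wait();   // idle: releases the lock, uses no CPU
                        } catch (InterruptedException e) {
                            return;
                        }
                    }
                    if (incoming.isEmpty()) return;  // shutdown and nothing left to do
                    job = incoming.removeFirst();
                }
                job.run();                           // executed outside the lock
                synchronized (outgoing) {
                    outgoing.addLast(job);
                }
            }
        }

        /** Lets the workers drain the queue, stops them and returns the finished-job count. */
        int shutdownAndCount() {
            synchronized (incoming) {
                shutdown = true;
                incoming.notifyAll();
            }
            for (Thread w : workers) {
                try {
                    w.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
            synchronized (outgoing) {
                return outgoing.size();
            }
        }
    }

    static int run(int threads, int jobs) {
        Pool pool = new Pool(threads);
        for (int i = 0; i < jobs; i++) pool.submit(() -> { });
        return pool.shutdownAndCount();
    }

    public static void main(String[] args) {
        System.out.println(run(3, 20)); // prints 20
    }
}
```

The wait() call sits inside a loop that re-checks the queue, so a worker that is woken up but finds the queue already emptied by another worker simply goes back to waiting.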
An additional feature of this implementation is that the jobserver creates more threads
if the incoming queue is rather large. The number of threads running at startup and
the total maximum are parameters which strongly affect the performance of this architecture
and have to be selected according to the environment the jobserver is running in and the
kinds of jobs the jobserver is executing.
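For comparison, the same two parameters exist in ThreadPoolExecutor from java.util.concurrent (available since Java 5, so not in the Java 1.4 API used for this assignment). The sketch below is not the implementation described here; it only shows how a start size and a total maximum interact in that library class, which adds threads beyond the start size once its bounded queue is full. The helper name largestPoolSize() and the concrete numbers are assumptions for this example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class GrowingPoolDemo {
    /**
     * Submits blocking jobs and reports how many threads the pool grew to.
     * The number of jobs must not exceed maxThreads + queueCapacity,
     * otherwise the executor rejects the surplus jobs with an exception.
     */
    static int largestPoolSize(int startThreads, int maxThreads, int queueCapacity, int jobs) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                startThreads, maxThreads, 30, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity));
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < jobs; i++) {
            pool.execute(() -> {
                try {
                    release.await();   // keep the thread busy until released
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int largest = pool.getLargestPoolSize();
        release.countDown();           // let all jobs finish
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return largest;
    }

    public static void main(String[] args) {
        // 2 threads at startup, at most 4, queue of 2: six blocking jobs force growth to 4.
        System.out.println(largestPoolSize(2, 4, 2, 6)); // prints 4
    }
}
```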
4.4. The jobs
There are three different kinds of job implemented:
• Arithmetic job: This job just calculates prime numbers. It starts at a given point and then
searches for a specified number of following prime numbers.
• Cryptographic job: This loads a given file and then encrypts it using the Blowfish
algorithm. This job combines an I/O job with a computing job.
• I/O job: This is not a real I/O job but just a simple simulation. As an I/O job means that
the thread just has to wait until the I/O operation has finished, this job just sleeps
for a given time.
As the jobs are not a very important part of this assignment, they were not implemented
from scratch but downloaded [SnipSfNet]. Only minor modifications were made so that they
were more suitable as jobs for the jobserver.
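To give an impression of the arithmetic job, the following sketch searches for a given number of primes starting at a given point. It is not the downloaded code; the method names isPrime() and primesFrom() and the simple trial-division test are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PrimeJobDemo {
    /** Trial division: sufficient for a sketch, deliberately CPU-bound. */
    static boolean isPrime(long n) {
        if (n < 2) return false;
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    /** Returns the first 'count' primes that are greater than or equal to 'start'. */
    static List<Long> primesFrom(long start, int count) {
        List<Long> found = new ArrayList<>();
        for (long n = start; found.size() < count; n++) {
            if (isPrime(n)) found.add(n);
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(primesFrom(100, 5)); // prints [101, 103, 107, 109, 113]
    }
}
```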
5. Evaluation
To evaluate the different architectures of threaded jobservers, the time each implementation
needs to execute the different jobs is measured. Each job has methods with which the server
can set timestamps at the different stages the job passes through in the jobserver. From
these timestamps the following times are derived:
• Communication time: This is the time the job needs from the client to the server and back.
• Computation time: The time the jobserver actually needs to execute the job.
• Incoming queue time: This is the time the job was in the incoming queue if the jobserver
implementation is using an incoming queue.
• Outgoing queue time: Same as above but for the time the job is in the outgoing queue
after it has finished.
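How such times can be derived from timestamps is sketched below. The six timestamp fields and their names are hypothetical (the real job class has its own set of getters and setters), and the sketch assumes that client and server clocks are comparable, for example because both run on the same machine.

```java
class JobTimingDemo {
    /** Hypothetical timestamps in milliseconds, set as the job passes each stage. */
    static class Stamps {
        long clientSubmitted;   // client hands the job to the server
        long inQueueEntered;    // job arrives in the incoming queue
        long processStart;      // a thread starts executing the job
        long processStop;       // execution finished, job enters the outgoing queue
        long outQueueLeft;      // client picks the job up from the outgoing queue
        long clientReceived;    // job is back at the client
    }

    static long communicationTime(Stamps s) {
        // way to the server plus way back to the client
        return (s.inQueueEntered - s.clientSubmitted) + (s.clientReceived - s.outQueueLeft);
    }

    static long incomingQueueTime(Stamps s) {
        return s.processStart - s.inQueueEntered;
    }

    static long computationTime(Stamps s) {
        return s.processStop - s.processStart;
    }

    static long outgoingQueueTime(Stamps s) {
        return s.outQueueLeft - s.processStop;
    }

    static long totalTime(Stamps s) {
        return s.clientReceived - s.clientSubmitted;
    }

    public static void main(String[] args) {
        Stamps s = new Stamps();
        s.clientSubmitted = 0;
        s.inQueueEntered = 5;
        s.processStart = 40;
        s.processStop = 140;
        s.outQueueLeft = 150;
        s.clientReceived = 158;
        // The four components add up to the total: 13 + 35 + 100 + 10 = 158
        System.out.println(communicationTime(s) + incomingQueueTime(s)
                + computationTime(s) + outgoingQueueTime(s));
        System.out.println(totalTime(s));
    }
}
```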
5.1. I/O job
The I/O job described above is the first to be analyzed. To put it more precisely, it is not the
job that is analyzed, but the behaviour of the two different jobserver architectures.
[Plot: thread-pool execution time (ms) and thread-per-request execution time (ms)]
Figure 2. I/O job execution time
The difference between the thread-per-request architecture and the thread-pool architecture
is obvious. The execution times of the thread-pool architecture are always very similar, in
contrast to the execution times of the thread-per-request architecture, where it is almost
impossible to predict how long the next job will take.
This difference between the two architectures is easy to explain. The thread-per-request
architecture has to create a new thread for every job it receives, and the thread creation time
is very dependent on the situation of the machine the jobserver is running on and cannot be
easily predicted. As the thread-pool architecture does not have a thread creation time
overhead, the execution time is equivalent to the real time the jobserver needs to execute
the job.
The total time (as opposed to the computing time) that the thread-pool architecture needs to
complete 1000 jobs is much greater than that required by the thread-per-request architecture.
The total time includes communication time, inqueue time, computing time and outqueue
time.
• thread-per-request architecture total time: 474,388 milliseconds
• thread-pool architecture total time: 3,528,618 milliseconds
This big difference (a factor of about 7.4) is due to the fact that the thread-per-request
architecture does not have an incoming queue; every job is executed immediately. This
difference can be seen in the following figure:
[Plot: thread-pool inqueue time (ms) and thread-per-request inqueue time (ms)]
Figure 3. I/O job inqueue time
With this kind of job, it is no problem if over a thousand jobs are executed simultaneously:
as soon as a job is started, it requires no additional resources to finish, as it only sleeps
for 100 microseconds. If the jobs are more computing-intensive, the result will probably be
different, as parallel execution of computing-intensive jobs extends the execution time
dramatically and an architecture with fewer parallel threads and an incoming queue becomes
much more effective.
5.2. Arithmetic job
The next job used to analyze the different architectures is a pure computing job which
strongly depends on the CPU of the machine running the jobserver. The following figure
shows the execution time of this job:
[Plot: thread-pool execution time (ms) and thread-per-request execution time (ms)]
Figure 4. Arithmetic job execution time
It can be seen very well that the thread-pool architecture performs much better. This can
again be explained by the overhead that the creation of each thread causes. Another reason
for the better performance of the thread-pool architecture is that there are fewer threads
running in parallel. With the thread-per-request architecture, there are a lot more threads
running in parallel and every thread is requesting the CPU, and so another overhead is
introduced: context switching. Each time a thread enters and leaves the CPU, the thread's
context has to be loaded onto the CPU and saved back to memory again, and the more active
processes (threads) are competing for the CPU, the more time is consumed by this task.
Again, the total time for 100 jobs is much greater for the thread-pool architecture than for
the thread-per-request architecture, whereas the execution time is less for the thread-pool
architecture.
• thread-per-request architecture total time: 694,955 milliseconds
thread-per-request architecture execution time: 663,027 milliseconds
• thread-pool architecture total time: 4,024,001 milliseconds
thread-pool architecture execution time: 489,525 milliseconds
What these numbers do not show is why the total time of the thread-pool architecture looks
so much greater than that of the thread-per-request architecture. As already mentioned, the
thread-per-request architecture does not have an incoming queue where the jobs have to wait
to be executed. In this implementation of the thread-per-request architecture, the jobserver
stops responding to client requests to submit a job when it has too much work to do. When
this happens, the jobs have to wait at the client until they are submitted to the server, and
this time is not recorded anywhere. So another difference between the architectures is that,
with the thread-pool architecture, all jobs can be submitted to the server whereas, with the
thread-per-request architecture, the incoming queue moves from the server to the client,
although it was never intended to be implemented that way. The effect of the incoming queue
can be seen very nicely in the following diagram:
[Plot: thread-pool inqueue time (ms) and thread-per-request inqueue time (ms); axis label: cpu time in milliseconds]
Figure 5. Arithmetic job inqueue time
The thread-pool architecture accepts all jobs from the clients and they are then held in the
incoming queue, as can be seen very well in this figure. The thread-per-request architecture
seems to have no incoming queue time. That is, as already mentioned, because it has no
incoming queue. Either all jobs are executed simultaneously or the jobserver just does not
accept the jobs and they are waiting on the client side.
5.3. Cryptographic job
The execution of this kind of job gives a very interesting outcome:
[Plot: thread-pool execution time (ms) and thread-per-request execution time (ms); axis label: cpu time in milliseconds]
Figure 6. Cryptographic job execution time
This result is really fantastic because it displays the effect which was always anticipated.
The thread-pool architecture seems much more effective than the thread-per-request
architecture.
Another nice effect which can be seen is that the first 6 jobs of the thread-pool architecture
take noticeably longer than the rest. This is rather easy to explain. This job loads a file from
the disk and encrypts it using the Blowfish algorithm. As the number of parallel threads is
six, these are the jobs waiting for the I/O to finish. The other jobs do not have to wait because
the operating system reads the file from the file cache and not from the real device. This
effect cannot be seen in the thread-per-request architecture, probably because this analysis
was run after the thread-pool architecture analysis and the file was already in the file cache.
A better analysis would have been to force the file to be read again from the real device.
One very interesting point cannot be seen in the figure. As the first task this job does is to
read a file, the CPU is not heavily loaded and the thread-per-request architecture jobserver
accepts all 100 jobs at once, which results in 100 cryptographic jobs running in parallel.
This is also the reason why there is such a big difference in the execution time of the two
architectures.
The queuing is almost the same as in the arithmetic job and is not investigated further here.
• thread-per-request architecture total time: 30,908,326 milliseconds
thread-per-request architecture execution time: 30,821,532 milliseconds
• thread-pool architecture total time: 15,205,009 milliseconds
thread-pool architecture execution time: 1,665,369 milliseconds
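The reported sums are easier to compare as ratios. The small sketch below only divides the figures given in this evaluation; the class and method names are made up for this purpose. It yields roughly 7.4 for the I/O job total time (thread-pool over thread-per-request), about 1.35 for the arithmetic job execution time and about 18.5 for the cryptographic job execution time (both thread-per-request over thread-pool).

```java
class ReportedTimesDemo {
    /** Ratio of two reported times in milliseconds. */
    static double ratio(long numeratorMs, long denominatorMs) {
        return (double) numeratorMs / denominatorMs;
    }

    public static void main(String[] args) {
        // I/O job, total time: thread-pool vs. thread-per-request
        System.out.println(ratio(3_528_618L, 474_388L));
        // Arithmetic job, execution time: thread-per-request vs. thread-pool
        System.out.println(ratio(663_027L, 489_525L));
        // Cryptographic job, execution time: thread-per-request vs. thread-pool
        System.out.println(ratio(30_821_532L, 1_665_369L));
    }
}
```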
6. Conclusion
The results of this analysis are very interesting. Some were as expected, some others were
totally different. One thing, for example, was that the thread-pool architecture was expected
to be much more efficient for every kind of job and the difference from the thread-per-request
architecture much bigger. This result may originate from the implementation of the two
architectures, or one type of job may fit better for one of the two architectures than for the
other. So when implementing a real-world jobserver application, the decision which
architecture should be implemented depends heavily on the type of job it has to perform.
The expected drawback of thread creation overhead for the thread-per-request architecture
was not as great as expected. At least for a small number of jobs, the difference between the
two architectures was sometimes so small that it could have been just a measuring error. The
big disadvantage of unlimited parallel threads was only discovered in the last analysis with
the cryptographic job because there, all jobs a client submitted were running at once. This
disadvantage does not only come from the architecture but also from the implementation.
It would have been no problem to implement a maximum number of active threads. This
change would introduce another problem: as this implementation has no incoming queue,
the server would just refuse the client request to submit more jobs. If an incoming queue is
implemented, then both implementations are very similar and the difference would not be
very obvious. It could only be measured with a huge number of very short jobs, so that the
thread creation time is the biggest part of the job execution. This can be seen in the analysis
of the I/O job.
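The variant discussed above, a thread-per-request server with a maximum number of active threads and no incoming queue, could look roughly like the following sketch. It was not part of the implementation; it uses java.util.concurrent.Semaphore (Java 5 and later), and the names Server, submitJob() and acceptedOutOf() are assumptions for this example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;

class BoundedThreadPerRequestDemo {
    static class Server {
        private final Semaphore slots;   // one permit per allowed active thread

        Server(int maxThreads) {
            slots = new Semaphore(maxThreads);
        }

        /** Returns false (job refused) when the thread limit is reached. */
        boolean submitJob(Runnable job) {
            if (!slots.tryAcquire()) {
                return false;            // no incoming queue: the client has to retry
            }
            new Thread(() -> {
                try {
                    job.run();
                } finally {
                    slots.release();     // free the slot when the thread ends
                }
            }).start();
            return true;
        }
    }

    /** Submits long-running jobs and counts how many the server accepts. */
    static int acceptedOutOf(int maxThreads, int jobs) {
        Server server = new Server(maxThreads);
        CountDownLatch release = new CountDownLatch(1);
        int accepted = 0;
        for (int i = 0; i < jobs; i++) {
            boolean ok = server.submitJob(() -> {
                try {
                    release.await();     // simulate a job that is still running
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            if (ok) accepted++;
        }
        release.countDown();             // let the accepted jobs finish
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(acceptedOutOf(3, 10)); // prints 3
    }
}
```

The refused jobs end up waiting at the client, which is the same effect that was observed when the unlimited server stopped responding under load.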
The main difference is the number of jobs which are running in parallel. The overhead of
thread creation gets significant if there are a large number of jobs. If the ratio of job time
and job number is large, then a thread-pool architecture will show its advantages.
6.1. Queues
The implementation of the queues is not very sophisticated. Elements are always put at the
end of the queue and fetched from the start. This may sound like a fair approach, but there
is no control over whether an item has just been submitted to the queue or has already been
waiting for a long time. In practice, most of the jobs were in the queues for a short time but
some jobs were in the queues extremely long. As queueing theory is a rather complex matter
and not part of this assignment, the strange behaviour of the queues was not investigated
further.
6.2. Cooperation
Cooperating with others in this assignment was a good idea but, unfortunately, it
did not work as well as expected. Due to the fact that everybody in the team had a different
schedule, I was not able to use any of the other implementations of a jobserver or a job to
analyze for this assignment. Only the design of the job and the interface for the jobserver
was shared as they were defined together in the beginning.
Bibliography
Internet
[SnipSfNet] Snippet Library: http://sourceforge.net/snippet/.
[JavaApi] Java 2 Platform, Standard Edition, v 1.4.1 API Specification:
http://java.sun.com/j2se/1.4.1/docs/api/.
Books
[Lea02] Doug Lea, Concurrent Programming in Java, Second Edition: Design Principles and
Patterns, Addison Wesley, 2000, ISBN 0-201-31009-0.
Glossary
C
CPU - Central Processing Unit
I
I/O - Input/Output
R
RMI - Remote Method Invocation
It is a mechanism that enables an object on one Java virtual machine to invoke methods
on an object in another Java virtual machine.