Performance Analysis of the
Objive Hive/Bee Framework






Sean A. Pfeifer

SE655

4/28/2008

Intended Audience


Software engineers/computer experts with
assumed knowledge


Stakeholders:


Professor Andrew Kornecki


Application of techniques and methods presented in
SE655.


Sean A. Pfeifer


Statistical evidence for improvements of network
architecture.

Outline


Five main sections


Description of the issue and project objective


Description of test environment


Experiment design


Result summary and analysis


Conclusions

Objective?


Analyze performance of Objive Hive/Bee
framework


Used in systems that handle large amounts
of simultaneous, frequent requests using
TCP


E.g., simulation: a client sends its position to the server, and
the server sends an update to all other clients


Is Java serialization a bottleneck?


Can we improve performance by going non-serialized?

Serialization


“The process of saving an object's state to a
sequence of bytes, as well as the process of
rebuilding those bytes into a live object at
some future time”


In these cases, serialization is done on
objects containing primitives only


Only performed when server communicates
to clients
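As a sketch of the quoted definition, a primitives-only message can be saved to bytes and rebuilt later. `PositionUpdate` is a hypothetical stand-in; the actual Hive/Bee message class is not shown in these slides.

```java
import java.io.*;

// Hypothetical message class: the real Hive/Bee payload is not shown
// in the slides, but it is described as containing primitives only.
class PositionUpdate implements Serializable {
    float x, y, z, vx, vy, vz;
    boolean active;
    PositionUpdate(float x, float y, float z,
                   float vx, float vy, float vz, boolean active) {
        this.x = x; this.y = y; this.z = z;
        this.vx = vx; this.vy = vy; this.vz = vz;
        this.active = active;
    }
}

public class SerializationDemo {
    public static void main(String[] args) throws Exception {
        PositionUpdate msg = new PositionUpdate(1f, 2f, 3f, 0.1f, 0.2f, 0.3f, true);

        // "Saving an object's state to a sequence of bytes..."
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(msg);
        }
        byte[] bytes = buf.toByteArray();

        // "...rebuilding those bytes into a live object at some future time."
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            PositionUpdate copy = (PositionUpdate) in.readObject();
            System.out.println(copy.x == 1f && copy.vz == 0.3f && copy.active);
        }
    }
}
```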

Project Summary


Experiment set up with a common-usage scenario in mind


Non-serialized data transfer was implemented


Measurement (timing) statements added


Test harness used to support runs


Recorded data


Analyzed


Conclusions?

SUT


Includes two systems (client and server)
connected over a closed network.


Client runs each version of Objive Bee


Server runs each version of Objive Hive


Two Bee/Hive versions:


Serialized data from server to client


Non-serialized data both ways


DataOutputStream

CUS


Communication portions of the Hive/Bee
packages


Specifically, use of serialization

Metrics


Round-trip time


Primary metric


From client preparation to send data to fully
receiving response from server


Emphasizes the speed of operation of the entire
system


Cuts down on measurement error from differing
clocks


Size of transmission


One measure for each alternative


Small, frequent requests = bandwidth concerns

Factors


Type of transmission


Serialization with ObjectOutputStream


Non-serialized with DataOutputStream


Number of simultaneous connections


1


5


10
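The two transmission alternatives can be sketched side by side. This is an illustrative comparison only, using the 6-float, 1-boolean payload described in the Parameters slide; the field values are arbitrary.

```java
import java.io.*;

public class TransmissionSizeDemo {
    public static void main(String[] args) throws IOException {
        float[] floats = {1f, 2f, 3f, 4f, 5f, 6f};
        boolean flag = true;

        // Alternative 1: DataOutputStream writes only the raw primitives.
        ByteArrayOutputStream rawBuf = new ByteArrayOutputStream();
        DataOutputStream data = new DataOutputStream(rawBuf);
        for (float f : floats) data.writeFloat(f);
        data.writeBoolean(flag);
        data.flush();

        // Alternative 2: ObjectOutputStream wraps the same values in a
        // stream header and class metadata.
        ByteArrayOutputStream objBuf = new ByteArrayOutputStream();
        try (ObjectOutputStream obj = new ObjectOutputStream(objBuf)) {
            obj.writeObject(floats);
            obj.writeBoolean(flag);
        }

        System.out.println(rawBuf.size());                  // 6*4 + 1 = 25 bytes
        System.out.println(objBuf.size() > rawBuf.size());  // serialization overhead
    }
}
```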

Parameters


Size of the data to be sent


6 floats, 1 boolean


Network latency


Closed LAN with only client+server


Data transmission rate


Once every 50 ms (20 Hz)


Measurement Methodology


Why measurement?


Software was available to be analyzed


Only minor modifications were necessary


Provides most accurate results, compared to
other methods


Two timing statements added:


One before data was prepared to be sent


One after response received


Size measured separately
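The two timing statements can be sketched as follows. `sendAndAwaitResponse` is a hypothetical stand-in for the client's actual communication call, which is not reproduced in the slides.

```java
public class RoundTripTimer {
    // Hypothetical stand-in for the client's send/receive code.
    static void sendAndAwaitResponse() throws InterruptedException {
        Thread.sleep(5);  // simulate a 5 ms round trip
    }

    public static void main(String[] args) throws InterruptedException {
        // Timing statement 1: before the data is prepared to be sent.
        long start = System.currentTimeMillis();
        sendAndAwaitResponse();
        // Timing statement 2: after the full response is received.
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(elapsed >= 0);
    }
}
```

Measuring the full round trip on one clock is what avoids the clock-difference error noted in the Metrics slide.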

Measurement Tools


Java's System.currentTimeMillis()



For measuring time


Avg. overhead over 1 million measurements: 0.002657 ms


Ethereal (http://www.ethereal.com)



Measuring transmission size


Microsoft Excel


Recording measurements


Performing ANOVA


Performing calculations on data

Initial Runs


One hundred runs


Interested in 95% confidence with error allowed +/- 3.5% (7% total error)



Resulted in a minimum of 762 required runs, so we used 1000


Allowed for possibility of removing values if they
were determined to be outliers
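The minimum run count comes from the standard sample-size formula n = (z·s / (e·x̄))², where z is the normal quantile for the confidence level, s/x̄ the coefficient of variation from the initial runs, and e the allowed relative error. A sketch with a hypothetical coefficient of variation (the slides do not give the initial-run statistics that produced 762):

```java
public class SampleSize {
    public static void main(String[] args) {
        double z = 1.96;    // two-sided 95% confidence
        double e = 0.035;   // allowed relative error (+/- 3.5%)
        double cov = 0.49;  // hypothetical coefficient of variation, s / mean

        // n = (z * cov / e)^2, rounded up to the next whole run
        long n = (long) Math.ceil(Math.pow(z * cov / e, 2));
        System.out.println(n);
    }
}
```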

Initial Runs (contd)


Results

Results (contd)


Size Results


Anomaly occurred during size
measurements


Server continued to send ACK packets to client
after main communication had completed


Same size other than extraneous ACKs

Possible Sources of Error


Implementation of non-serialized
communication


The strange behavior noticed may indicate a deeper
issue with the implementation


Use of the router


Routing algorithms/behavior may bias results to
one alternative or other


Perturbation due to timing statements


Overhead may have been insignificant, but could
affect the system in unforeseen ways

Analysis


Two-factor ANOVA with replication


Before-and-after comparison


Difference in means


Compare each level of simultaneous
connections between alternatives

ANOVA


Calculated F values exceed critical F values


The differences in the means are due to real
differences between alternatives

Difference in Means


Differences are significant even with one client,
where the confidence interval is closer to zero


We can say, with 99% confidence, there are
statistically significant differences between the
alternatives


The differences above are the increase in mean time after
the change to non-serialized communication
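A difference-in-means test of this kind builds the confidence interval (x̄2 − x̄1) ± z·sqrt(s1²/n1 + s2²/n2); if the interval excludes zero, the difference is significant at that confidence level. A sketch with hypothetical summary statistics (the actual measured means are in the project's data, not these slides):

```java
public class MeanDifferenceCI {
    public static void main(String[] args) {
        // Hypothetical summary statistics, for illustration only.
        double mean1 = 4.0, sd1 = 1.0; int n1 = 1000;  // serialized
        double mean2 = 5.0, sd2 = 1.2; int n2 = 1000;  // non-serialized

        double z = 2.576;  // two-sided 99% confidence
        double halfWidth = z * Math.sqrt(sd1 * sd1 / n1 + sd2 * sd2 / n2);
        double lo = (mean2 - mean1) - halfWidth;

        // Lower bound above zero: the interval excludes zero, so the
        // difference is significant at 99% confidence.
        System.out.println(lo > 0);
    }
}
```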

Alternative Approach: Queuing


Queuing analysis fits this type of system


Server only able to process one client at a
time


Capable of receiving multiple clients, but can
only process one request at a time


Size of queue is virtually unlimited


Clients send messages at a rate of 20 Hz


Service time depends on alternative


Seek to find the impact of the difference in
service time on overall performance
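Under the assumptions above, the server resembles a single-server queue. A minimal M/M/1 sketch with a hypothetical service rate (the slides do not report measured service times); with 5 clients each sending at 20 Hz, the aggregate arrival rate is 100 requests/s:

```java
public class QueueModel {
    public static void main(String[] args) {
        int clients = 5;
        double perClientRate = 20.0;             // 20 Hz: one message per 50 ms
        double lambda = clients * perClientRate; // aggregate arrival rate (req/s)
        double mu = 400.0;                       // hypothetical service rate (req/s)

        double rho = lambda / mu;       // server utilization
        double w = 1.0 / (mu - lambda); // M/M/1 mean time in system (s)

        System.out.println(rho);
        System.out.println(w < 0.005);  // under 5 ms with these assumed rates
    }
}
```

A faster per-request service time (the non-serialized alternative's goal) raises mu, which lowers both utilization and mean time in system.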

Conclusions


Object serialization is not a bottleneck in this
case


We statistically determined the previous
implementation of the framework was faster


Size results were inconclusive, but seemed
even

Lessons Learned


Plans may need to change for performance
analysis projects, as well


Originally planned to also do UDP vs TCP


Fell through due to lack of time to implement


Statistical techniques are not hard to
perform; the difficult part was:


Setup for the techniques, including recording
and manipulating data


Analysis of results


These statistical techniques can be applied
to real-world problems to give a certain
confidence in results!

Questions?