Group 18


Cellular Network Performance Measurement

Class Presentation for CS 234 - Advanced Networks

by Pramit Choudary, Balaji Raao & Ravindra Bhanot (Group 18)

Instructor: Professor Nalini Venkatasubramanian

05/10/2012

Papers considered


Paper 1: Understanding Traffic Dynamics in Cellular Data Networks, by U. Paul, A. Subramanian, M. Buddhikot, S. Das, IEEE INFOCOM 2011, Shanghai, China

Paper 2: An Untold Story of Middleboxes in Cellular Networks, SIGCOMM 2011, Toronto, Ontario, Canada


(NOTE: Please refer to the relevant papers listed above in place of ‘paper
1’ or ‘paper 2’ found in the presentation slides.)

Background - Internet/Data Access?

Dial-up connection

Broadband (DSL, Cable Internet, Fiber Optics)

Wi-Fi (IEEE 802.11 standard) & WiMAX (IEEE 802.16 standard)

Mobile Broadband using 2.5G, 3G, 4G technologies

Each claims to cater to different data rates, operating ranges, end user/application needs, energy savings, etc., using different protocol designs, business strategies, network deployments and more.

Background - Cellular Networks and interconnecting subsystems

4G: Fourth generation of cell phone mobile communications standard

3G: Third generation of cell phone mobile communications standard

Femtocell: Small cellular base station designed for use in a home or small business

IMS: IP Multimedia Subsystem, used to provide mobile and fixed multimedia services

Image courtesy: radisys.com

Background - Broadband Cellular Networks

E.g. HSPA - Mobile telephony protocols used in 3G cellular networks for mobile data access.

Broadband cellular access is becoming the most common and pervasive form of data access worldwide.

Fueled by the introduction of user-friendly smartphones, notebooks, tablets and eBook readers.

Background - A look at smartphone technology

Courtesy: Technology Review, Published by MIT, May 9th 2012

Background - Broadband Cellular Networks

Has led to innovative & flashy mobile applications like gaming, video streaming, social networking, etc.

Carriers use many types of middleboxes, e.g. Network Address Translation (NAT) boxes, firewalls, etc., to manage the scarce resources in the network (scarce because the same resources are mostly shared) and to protect them.

Background - On usage of middleboxes

Cellular network middleboxes (deployed by carriers like AT&T, T-Mobile) and mobile applications (written by application developers) are often managed independently.

Knowledge mismatch -> End-to-end performance degradation, increased energy consumption, and security vulnerabilities.

E.g. a carrier sets an aggressive timeout for inactive TCP connections in the firewall, disrupting the long-lived and occasionally idle connections maintained by applications like instant messaging, push-based email, etc.

Need for understanding the effects of middleboxes in cellular networks.

Paper 2 specifically focuses on NAT boxes and firewalls, and their policies.


Background - Broadband Cellular Networks (Contd.)

The volume of data is expected to increase exponentially.

Supporting such an increase requires a good understanding of traffic dynamics and their impact on resource allocation in the service provider’s network.

This leads to better resource planning, network designs, spectrum allocation and energy savings.

Background - Broadband Cellular Networks (Contd.)

For some exciting numbers, refer to a white paper by Cisco on global mobile data traffic forecast for 2011-2016:
http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-520862.html

Paper 1 - Short Summary

Discuss traffic dynamics specific to 3G cellular networks.

End user perspective: Study subscriber traffic patterns, number of distinct base stations visited by subscribers, relate mobility and traffic, subscriber temporal activity & relate subscriber activity and traffic.

Network component perspective: Study aggregated load at base stations, base station load distributions, spatial characteristics, temporal characteristics and spatio-temporal characteristics of network load at base stations.

Paper 1 - Short Summary (Contd.)

Provide implications of the measurements and observations made.

Measurements were conducted in 2007 for a week over a US nation-wide network with thousands of base stations and with the entire subscriber base (order of hundreds of thousands i.e. close to a million).

Measurements were performed on all generated data packet headers (not including payloads) and on signaling & accounting packets.

Paper 1 - Subscriber Traffic Dynamics

Subscriber Traffic Distribution:

KEY OBSERVATIONS

Heavy users: Users who generate as high as 10GB of traffic per day (10^5 times median).

Light users: Users who generate less than 1KB per day.

CDF shifts left over weekends.

INFERENCE

Less traffic on weekends relative to traffic on working days.

Fig. CDF of total traffic volume per subscriber per day.

Paper 1 - Subscriber Traffic Dynamics

Subscriber Traffic Distribution (Contd.):

KEY OBSERVATIONS

1% of the subscribers create more than 60% of the daily network traffic.

10% of subscribers create 90% of the daily network traffic.

INFERENCE

Imbalance in network usage with few subscribers (10%) using much of the network resources.

Fig. CDF of normalized traffic volume over the percentage of subscribers per day.
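
The concentration of traffic among a few subscribers can be checked on any list of per-subscriber daily volumes. The following is a minimal sketch, not the procedure used in Paper 1; the helper name `top_share` and the sample volumes are made up for illustration.

```python
def top_share(volumes, fraction):
    """Share of total traffic generated by the top `fraction` of subscribers.

    `volumes` is a list of per-subscriber daily byte counts (hypothetical input).
    """
    if not volumes or not 0 < fraction <= 1:
        raise ValueError("need non-empty volumes and 0 < fraction <= 1")
    ranked = sorted(volumes, reverse=True)
    # Always count at least one subscriber.
    k = max(1, round(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


# Made-up volumes in bytes: one heavy user and nine light users.
sample = [9_100_000] + [100_000] * 9
heavy_share = top_share(sample, 0.10)  # share generated by the top 10%
```

Evaluating `top_share` at several fractions gives points on a curve like the normalized CDF in the figure.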

Paper 1 - Subscriber Traffic Dynamics

Implications of Subscriber Traffic Distribution:

1) An unlimited data plan with flat rate pricing is not efficient both from the carrier’s perspective and subscriber’s perspective.

2) CDF graphs shown in previous two slides can be used to create a ‘tiered’ rate plan for data.

3) Tiered rate plan deals with providing different pricing options based on data usage.

Paper 1 - Subscriber Traffic Dynamics

Implications of Subscriber Traffic Distribution (Contd.):

4) To alleviate the problem of high volume subscribers creating poor experience for other subscribers, high volume subscribers can be provided with some incentives.

5) Paper doesn’t consider optimal pricing schemes based on subscriber usage and network capacity. It only provides heuristic implications for subscriber traffic distribution.

Paper 1 - Subscriber Traffic Dynamics

Subscriber Mobility (i.e. Base Stations Visited):

KEY OBSERVATIONS

Distribution is similar across weekdays and different on weekends.

60% of users are stationary (i.e. constrained within a cell) and over 95% of users travel across fewer than 10 base stations in a day.

Highly mobile users (who visit more than 50 distinct base stations in a day) are about 0.01%.

Fig. CDF of number of distinct base stations visited by a subscriber each day.

Paper 1 - Subscriber Traffic Dynamics

Subscriber Mobility (i.e. Base Stations Visited):

INFERENCE

Tendency of lesser degree of mobility on weekends.

In terms of the number of distinct base stations visited, the overall mobility is low.

Fig. CDF of number of distinct base stations visited by a subscriber each day.

Paper 1 - Subscriber Traffic Dynamics

Subscriber Mobility (Radius of Gyration):

Fig. CDF of radius of gyration.

Radius of Gyration is the linear size occupied by a subscriber’s trajectory. It is computed from the subscriber’s trajectory over a certain duration of time (t).

It is basically a root mean square value.

Calculated with respect to the center of mass point of the user’s trajectory.
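
The definition above (root mean square distance from the trajectory's center of mass) can be sketched as follows. This is an illustrative sketch only: it assumes each visited location is already given as planar (x, y) coordinates in miles, and the function name `radius_of_gyration` is hypothetical.

```python
import math


def radius_of_gyration(points):
    """RMS distance of the recorded positions from their center of mass.

    `points` is a list of (x, y) positions, one per observation
    (assumed planar coordinates in miles).
    """
    n = len(points)
    if n == 0:
        raise ValueError("trajectory must contain at least one point")
    # Center of mass of the trajectory.
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Mean squared distance from the center of mass, then the square root.
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    return math.sqrt(mean_sq)


# A subscriber observed at two locations 2 miles apart.
commuter = [(0.0, 0.0), (2.0, 0.0)]
# A subscriber always observed at the same location.
static_user = [(3.0, 4.0)] * 5
```

A subscriber who never leaves one location has a radius of gyration of zero, which corresponds to the "practically static" users in the CDF.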

Paper 1 - Subscriber Traffic Dynamics

Subscriber Mobility (Radius of Gyration):

Fig. CDF of radius of gyration.

KEY OBSERVATIONS:

53% of subscribers are practically static and almost 98% of the subscribers have radius of gyration less than 100 miles.

INFERENCE:

Shows the low level of mobility of the majority of subscribers (half of them).

Paper 1 - Subscriber Traffic Dynamics

Subscriber Mobility (Radius of Gyration):

Fig. Radius of gyration versus duration of computation for subscribers categorized into 4 groups according to their final rg at the end of the seven-day period.

KEY OBSERVATIONS:

Radius of gyration on average reaches a saturation point in just a few days (measured in number of hours). Saturation indicates that some sort of boundary in the movement area has been reached. The quick saturation is examined in terms of ‘return probability’ in the next slide.

Users with larger radius of gyration need a longer time to saturate.

Paper 1 - Subscriber Traffic Dynamics

Subscriber Mobility (Radius of Gyration):

Fig. Probability distribution of time to return to the same location.

KEY OBSERVATIONS:

Distribution has peaks at the 24th, 48th and 72nd hours.

INFERENCE:

Periodic nature of human mobility with a 24 hour period (like coming back home) and tendency to return to the same location periodically. This explains the saturation of radius of gyration.

Paper 1 - Subscriber Traffic Dynamics

Subscriber Mobility (Radius of Gyration):

Fig. Probability of finding a subscriber at different locations that are ranked on the basis of their frequency of visits. Shows four categories of subscribers who visit 5, 10, 30 and 50 distinct base stations.

KEY OBSERVATIONS:

Location with rank, L = 1 indicates the most visited base station for a subscriber.

Subscribers spend 30% of their time in the top two preferred locations.

INFERENCE:

Subscribers are found at their favorite location with high probability even when they are highly mobile.

Paper 1 - Subscriber Traffic Dynamics

Inferences on Subscriber Mobility so far

1) Large fraction of subscribers have limited mobility (roughly half of them are static, moving within just 1 mile).

2) Subscriber mobility also exhibits periodic behavior with high probability of returning to same base station at same time of the day.

3) Overall mobility is predictable.

4) More mobile users tend to generate more traffic.

Paper 1 - Subscriber Traffic Dynamics

Implications on Subscriber Mobility

1) Idea of caching content and delivering it to subscribers who exhibit a predictable mobility behavior - Innovative cloud-based content delivery applications.

2) Optimizing the location based services and targeted ad-services through predictable mobility pattern.

Paper 1 - Subscriber Traffic Dynamics

Relating subscriber mobility and traffic they generate:

Fig. CDF of traffic generated per day by subscribers based on number of locations (base stations) visited in a day.

Fig. CDF of traffic generated per day by subscribers based on radius of gyration.

Paper 1 - Subscriber Traffic Dynamics

Relating subscriber mobility and traffic they generate:

KEY OBSERVATIONS FROM PREVIOUS SLIDE:

Though the plot lines appear similar, they differ in traffic volume for different numbers of base stations visited and for different radii of gyration.

INFERENCE:

More traffic is generated by more mobile subscribers.

Median traffic generated by subscribers in the highest mobility category is roughly twice that of the subscribers in the lowest mobility category.

Paper 1 - Subscriber Traffic Dynamics

Implications relating to subscriber mobility and traffic they generate:

1) Planning resources dynamically based on the traffic subscribers generate and the timing of their movements.

2) Spectrum management based on the timing of the traffic generated in different cells.

Paper 1 - Subscriber Traffic Dynamics

Subscriber Temporal Activity:

Temporal activity is the number of days in a week (or hours in a day) in which subscribers generate traffic.

KEY OBSERVATIONS

About 28% of the subscribers generate traffic in only a single hour during the peak hours.

A typical subscriber (i.e. median) is active in 4 different hours during the peak hours. (Consider a straight line, the 50% line, across the graph.)

Fig. CDF of number of hours among peak hours (8 AM to 8 PM) in which subscribers generate traffic.

Paper 1 - Subscriber Traffic Dynamics

Subscriber Temporal Activity:

INFERENCE:

Large fraction of subscribers generate traffic only in a few hours within a day.

That is, most subscribers generate traffic for only a small part of the time (of the week / of a day).

Fig. CDF of number of hours among peak hours (8 AM to 8 PM) in which subscribers generate traffic.

Paper 1 - Subscriber Traffic Dynamics

Subscriber Temporal Activity:

Airtime: Amount of time a subscriber holds onto a radio channel regardless of whether it communicates or not.

KEY OBSERVATIONS:

Median usage is about 100 sec.

Very few subscribers (less than 1%) use the radio channel for all 24 hrs (86,400 sec).

Weekend usage again lower compared to weekday usage.

Fig. CDF of airtime among subscribers.

Paper 1 - Subscriber Traffic Dynamics

Relating subscriber temporal activity and traffic they generate:

KEY OBSERVATIONS:

A typical heavy user appears in 4 to 6 different hours during peak hours in the days they generate traffic.

Fig. CDF of occurrence for heavy users (within top 5000 in at least one day in the week with regard to traffic) in peak hours.

INFERENCE:

Most heavy users are actually quite sporadic in traffic generation and not habitual.

Paper 1 - Subscriber Traffic Dynamics

Relating subscriber temporal activity and traffic they generate:

Effective bit rate is the ratio of the amount of actual traffic generated by a subscriber to the airtime. It is a metric for efficient radio channel use.

KEY OBSERVATIONS:

Subscribers generating less traffic (<= 30 KB) have poorer effective bit rate compared to those generating more traffic. This may be due to the kind of application they use (next slide).

Fig. CDF of effective bit rate for subscribers categorized by traffic generated per day.
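
As a worked example of the ratio defined above, here is a minimal sketch. The function name `effective_bit_rate` and the input values are assumptions for illustration, not measurements from Paper 1.

```python
def effective_bit_rate(bytes_transferred, airtime_seconds):
    """Traffic actually generated divided by airtime, in bits per second."""
    if airtime_seconds <= 0:
        raise ValueError("airtime must be positive")
    return bytes_transferred * 8 / airtime_seconds


# Hypothetical subscriber: 30,000 bytes moved while holding the channel for 100 s.
rate_bps = effective_bit_rate(30_000, 100)
```

Because airtime counts the time the channel is held whether or not data flows, long idle periods pull this ratio down.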

Paper 1 - Subscriber Traffic Dynamics

Relating subscriber temporal activity and traffic they generate:

KEY OBSERVATIONS:

P2P and http:yahoo have the best channel efficiencies.

VPN, https and http for Google, Microsoft have the poorest efficiencies.

Fig. Effective bit rate for popular TCP based applications.

INFERENCE:

Enterprise applications generate less traffic compared to other applications for the same airtime.

All applications have significantly poorer effective bit rates compared to nominal rates (phy channel).

Paper 1 - Subscriber Traffic Dynamics

Relating subscriber temporal activity and traffic they generate:

REASONING for INFERENCE:

Enterprise applications (VPN) tend to use the network sporadically, e.g. for keep-alive messages, and are typically not high throughput applications.

Considering dormancy/sleep modes, effective bit rate is poor for VPN-like applications.

High throughput applications like P2P use the channel better.

Fig. Effective bit rate for popular TCP based applications.

Paper 1 - Subscriber Traffic Dynamics

Implications on effective bit rates:

1) Inefficiency in the usage of the radio channel airtime drives the need for an innovative protocol to use the wireless channel efficiently.

2) Inefficiency arises because wired-internet protocols are used to access the wireless channel, and hence better network protocols need to be designed.

BASE STATION TRAFFIC DYNAMICS

Aggregate Load

Base Station Load Distribution

Spatial Characteristics

Temporal Characteristics
- Load
- Auto-correlation

Spatiotemporal Characteristics

We focus on network behavior as a whole or in terms of network components (base stations) instead of focusing on subscribers.

BASE STATION TRAFFIC DYNAMICS - Contd.

Aggregate Load:

Total traffic split into upload and download for each day of the week.

Weekends see a lesser load.

Downloads dominate relative to uploads, with more than 75% of daily load coming from download traffic.

BASE STATION TRAFFIC DYNAMICS - Contd.

Aggregate Load:

Load on the network is relatively low in the early morning hours, and roughly similar during the day and the evening.

BASE STATION TRAFFIC DYNAMICS - Contd.

Base Station Load Distribution: Volume of daily traffic load for each base station

80% of the base stations are loaded in the range of 1-100MB per day and 10% of the base stations are highly loaded (more than 100MB per day).

The plot shows the CDF of daily base station loads normalized by the total network load.

10% of the base stations experience roughly 50-60% of the aggregate traffic load.

In both cases, weekend behavior is slightly different from weekday behavior. The load imbalance seems more pronounced on weekends.

The great imbalance of the base station loads indicates that more careful cell planning is possibly needed. Network providers may use smaller cells or microcells at the hotspots to even out the imbalance.

BASE STATION TRAFFIC DYNAMICS - Contd.

Spatial Characteristics

Goal is to identify whether, and how much, the network load is spatially correlated.

Estimates can potentially help the provider to allocate resources appropriately.

Use of Voronoi cells to conduct the experiments

Voronoi cell corresponds to the geographic region of each base station’s coverage.

E.g. 10 shops in a flat city and their Voronoi cells
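
A point belongs to the Voronoi cell of whichever site is closest to it, which is why the cells approximate base station coverage regions. The sketch below illustrates only that nearest-site rule; the function name `nearest_station` and the coordinates are hypothetical, and real coverage also depends on terrain and radio conditions.

```python
import math


def nearest_station(point, stations):
    """Index of the station whose Voronoi cell contains `point`.

    `point` is an (x, y) pair and `stations` a list of (x, y) pairs
    (assumed planar coordinates).
    """
    if not stations:
        raise ValueError("need at least one station")
    return min(range(len(stations)), key=lambda i: math.dist(point, stations[i]))


# Two hypothetical base stations 10 units apart.
stations = [(0.0, 0.0), (10.0, 0.0)]
cell_a = nearest_station((2.0, 1.0), stations)
cell_b = nearest_station((9.0, 5.0), stations)
```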

BASE STATION TRAFFIC DYNAMICS - Contd.

More on Voronoi cells:

Region1

Region2

Voronoi cells are smaller in certain areas (city centers), signifying some degree of cell planning.

We can readily see again that the cells are not uniformly loaded in space. The load differentials can extend several orders of magnitude.

There does appear to be some degree of negative correlation between the Voronoi cell size and load.

Large Voronoi cells mean sparsely located base stations, implying sparser population density.

No significant spatial correlation between adjacent cells is observed via visual inspection of similar plots for all days.

BASE STATION TRAFFIC DYNAMICS - Contd.

Temporal Characteristics: correlation or predictable relationship between signals observed at different moments in time.

1. Load:

Hourly aggregate load of the entire network and of highly loaded base stations.

Aggregate network load exhibits a nice periodic behavior with relatively high loads during the day and the lowest load during midnight.

Individual base station loads do not show that much periodicity.

The load curve varies significantly among individual base stations, with their peaks occurring at different times of the day.

BASE STATION TRAFFIC DYNAMICS - Contd.

Auto-correlation:

Rigorous analysis of the periodic behavior describing the network load is done using temporal correlation for a load metric.

Helps in understanding the underlying trends and seasonal variations better.

Auto-correlation function of the time series at different lags.

Notice the plot shows a high degree of temporal correlation.

High peaks occur at 24 hour intervals and low peaks at 12 hour intervals.

This is consistent with diurnal human activity patterns.
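
The auto-correlation at a given lag can be computed directly from an hourly load series. The following is a minimal sketch on synthetic data, not the analysis from Paper 1; the function name `autocorrelation` and the cosine-shaped load are assumptions chosen only to show a 24-hour period.

```python
import math


def autocorrelation(series, lag):
    """Sample auto-correlation of `series` at `lag` (in samples)."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        raise ValueError("series is constant")
    # Covariance between the series and a copy of itself shifted by `lag`.
    cov = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return cov / var


# Synthetic hourly load for one week with a 24-hour cycle.
load = [10 + 5 * math.cos(2 * math.pi * h / 24) for h in range(24 * 7)]
daily = autocorrelation(load, 24)
quarter_day = autocorrelation(load, 6)
```

On this synthetic series the value at lag 24 is much higher than at lag 6, which is the kind of peak a diurnal cycle produces.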

BASE STATION TRAFFIC DYNAMICS - Contd.

Spatiotemporal Characteristics:

Use of Moran's I to investigate spatial behavior.

Moran's I is a measure of spatial autocorrelation.

Spatial autocorrelation is characterized by a correlation in a signal among nearby locations in space. Spatial autocorrelation is more complex than one-dimensional autocorrelation because spatial correlation is multi-dimensional (i.e. 2 or 3 dimensions of space) and multi-directional.

It’s defined as

\[ I = \frac{N}{\sum_{i}\sum_{j} w_{ij}} \cdot \frac{\sum_{i}\sum_{j} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i} (x_i - \bar{x})^2} \]

x is the hourly load on a base station (random variable).

\bar{x} (x bar) is the mean of x.

x_i’s are the observations.

w_{ij} is the weight associated with each pair (x_i, x_j).

N is the number of observations.

BASE STATION TRAFFIC DYNAMICS - Contd.

More on Moran's I:

Binary weights: w_{ij} = 1 when the base stations are in close proximity (a threshold of 2 miles is used), else w_{ij} = 0.

Moran’s I metric is plotted for hourly loads of all base stations in the network on a temporal scale.

Periodic behavior with a diurnal cycle is interesting.

It appears that while temporal usage patterns of base stations may be very different, and might even lack periodicity, there is a general tendency for proximate base station loads to be more correlated when the loads are high.

Correlation is fairly small, rarely exceeding 0.15.

The minimum is close to zero, showing almost independent loading behavior around midnight when the loads are generally small.
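
Moran's I with binary proximity weights can be sketched as follows. This is an illustrative sketch under stated assumptions: positions are planar coordinates in miles, a station is not paired with itself, and the function name `morans_i` and the sample data are hypothetical rather than taken from Paper 1.

```python
import math


def morans_i(loads, positions, threshold=2.0):
    """Moran's I with w_ij = 1 if stations i and j (i != j) lie within
    `threshold` of each other, else w_ij = 0."""
    n = len(loads)
    if n != len(positions) or n == 0:
        raise ValueError("loads and positions must be non-empty and equal length")
    mean = sum(loads) / n
    dev = [x - mean for x in loads]
    denom = sum(d * d for d in dev)
    weight_sum = 0
    cross = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(positions[i], positions[j]) <= threshold:
                weight_sum += 1
                cross += dev[i] * dev[j]
    if weight_sum == 0 or denom == 0:
        raise ValueError("no neighboring pairs, or all loads identical")
    return (n / weight_sum) * (cross / denom)


# Four hypothetical stations: two close pairs, 10 miles apart from each other.
positions = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
clustered = morans_i([10, 10, 1, 1], positions)    # neighbors carry similar load
alternating = morans_i([10, 1, 10, 1], positions)  # neighbors carry opposite load
```

Values near +1 mean neighboring stations carry similar loads, values near -1 mean they carry opposite loads, and values near 0 mean the loads are close to spatially independent.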

Implication of variability in Base Station Load

High degree of variability in base station loads has important implications for spectrum allocation and energy saving schemes in the network.

Schemes for adaptively turning on/off certain carriers or radios in base stations based on the load experienced need to be developed.

Peak hours of different cells vary a lot.

Dynamic allocation of spectrum resources to highly loaded cells during their peak hours.

Future Work: model the demand characteristics on different cells in cellular data networks based on measurements for a long period of time and feed the model as inputs to dynamic spectrum allocation algorithms. Study the observation

Paper 2

NetPiculet - Untold Story of Middleboxes

Cellular networks are becoming more and more ubiquitous and pervasive.

Two major players are involved in such networks:
- Network providers
- Application developers

Cellular networks also face problems similar to their Internet counterparts, such as IP address space depletion and security loopholes.

Moreover, cellular networks have limited resources.

To make the best use of their limited resources, providers deploy a number of middleboxes to enforce policies.


NetPiculet

An Android application released to the market place in January 2011 in order to record policies.

Major policies tested are NAT and firewall.

Tested over 6 continents and 107 different carriers.

Made attractive to users by showing them their network's shortcomings and loopholes.


NetPiculet - System Architecture


NAT traversal

NAT traversal is a general term for techniques that establish and maintain Internet protocol connections traversing a NAT gateway.

IPv4 address space is depleted and the number of users is increasing.

NAT also hides end clients behind NAT routers and thus increases security.

Many filtering policies are implemented at NAT gateways; finding them out was the aim of NetPiculet.


NAT mapping schemes

NAT middlebox maps an external endpoint based on the TCP 5 tuple (protocol, local-addr, local-process, foreign-addr, foreign-process).

Mapping can be any one of the following:
- Independent: external endpoint remains the same
- Address and Port(delta): external endpoint changes when destination endpoint changes
- Connection(delta): external endpoint changes for each new connection
[delta: increment in external port number for every new connection]

Port number is predicted in order to test with a stream of packets for new connections.
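
The three mapping schemes can be told apart by opening several connections from one local endpoint and comparing the external ports seen by the servers. The sketch below is not NetPiculet's implementation; the function names, the observation format and the port values are all hypothetical, and port wraparound is ignored.

```python
from collections import defaultdict


def classify_nat_mapping(observations):
    """Guess the mapping scheme from (destination, external_port) pairs.

    The pairs are listed in connection order and all come from the same
    local endpoint. Returns a label, or "Ambiguous" when the observations
    cannot separate the schemes.
    """
    ports_by_dest = defaultdict(list)
    for dest, port in observations:
        ports_by_dest[dest].append(port)
    all_ports = [port for _, port in observations]
    repeated = any(len(p) > 1 for p in ports_by_dest.values())
    # Need at least two destinations and one repeated destination.
    if len(ports_by_dest) < 2 or not repeated:
        return "Ambiguous"
    if len(set(all_ports)) == 1:
        return "Independent"
    if all(len(set(p)) == 1 for p in ports_by_dest.values()):
        return "Address and Port"
    if len(set(all_ports)) == len(all_ports):
        return "Connection"
    return "Unknown"


def predict_next_port(ports):
    """Next external port if consecutive ports differ by a constant delta."""
    deltas = {b - a for a, b in zip(ports, ports[1:])}
    return ports[-1] + deltas.pop() if len(deltas) == 1 else None


# Hypothetical observations against two servers, "A" and "B".
same_port = [("A", 5000), ("A", 5000), ("B", 5000)]
per_dest = [("A", 5000), ("A", 5000), ("B", 5001)]
per_conn = [("A", 5000), ("A", 5001), ("B", 5002)]
```

When the delta is constant, `predict_next_port` gives the port to target for the next connection; a randomly assigned port yields `None`.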


NAT Policies

NAT properties:
- End point filtering
- TCP state tracking
- Filtering response
- Packet mangling

NAT characteristics:
- Time dependent NAT mapping: has advantages as well as disadvantages, and hence a compromise value has to be chosen depending upon the tradeoff
- Multiple NAT boxes: system complexity increases


Summary of results of NAT Policies

Discovered a previously unknown NAT mapping scheme and implemented a corresponding traversal scheme which succeeds with high probability.

A single client may encounter multiple NAT boxes due to load balancing, and hence care should be taken to maintain the mapping during the traversal.

Some of the carriers assign random ports for connections, which is the worst case for NAT mapping and traversal. The birthday paradox is used to resolve the mapping, but for P2P applications it is better to use a consistent mapping scheme.
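
The birthday-paradox idea is that guessing many ports against many open mappings makes at least one match likely even when each port is random. The sketch below is a generic probability calculation, not the exact procedure from Paper 2; the function name `hit_probability` and its parameters are hypothetical.

```python
def hit_probability(port_space, open_mappings, probes):
    """Probability that at least one of `probes` distinct guessed ports
    matches one of `open_mappings` randomly assigned external ports,
    out of `port_space` possible ports."""
    if port_space <= 0 or open_mappings < 0 or probes < 0:
        raise ValueError("invalid arguments")
    if open_mappings + probes > port_space:
        return 1.0  # pigeonhole: a match is guaranteed
    miss = 1.0
    for i in range(probes):
        # Chance that the next guess also avoids every open mapping.
        miss *= (port_space - open_mappings - i) / (port_space - i)
    return 1.0 - miss


# Tiny illustrative case: 4 possible ports, 2 open mappings, 2 guesses.
small_case = hit_probability(4, 2, 2)
```

The success probability grows quickly as both the number of open mappings and the number of probes increase, at the cost of extra connections and packets.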


Firewall

Required to protect end users from malicious attacks such as DoS, battery drain-out, etc.

Implemented at middleboxes inline with NAT.

Methodology used for testing:
- Testing IP spoofing
- Testing stateful firewall
- Testing TCP connection timeout
- Testing out-of-order packet buffering


Firewall Policies

Firewall Policies - Implications and Recommendations

Energy impact of TCP connection timeout

Performance and energy impact of buffering:
- Disabling TCP fast retransmit
- Bad interaction with Protect Against Wrapped Sequences (PAWS)
- Bad interaction with TCP Forward-RTO recovery

Exploiting large sequence number window

Flaws with closing TCP connections


Firewall Policies - Effect on Download time

Firewall Policies - Effect on Energy Consumption


Summary of Firewall Policies

4 out of 60 cellular networks allow IP spoofing, making their users vulnerable.

Nearly 15% of carriers set the TCP timeout to less than 10 minutes, increasing energy consumption. An SDK is suggested for developers to maintain uniformity.

TCP out-of-order buffering causes degraded performance and energy waste in some cases, so a tradeoff has to be decided between performance and security.