Web Caching on Smartphones: Ideal vs. Reality



Feng Qian¹, Kee Shen Quah¹, Junxian Huang¹, Jeffrey Erman², Alexandre Gerber², Z. Morley Mao¹, Subhabrata Sen², Oliver Spatscheck²

¹ University of Michigan
² AT&T Labs - Research

June 27 2012

Mobile Traffic: An Explosive Growth







- Deployment of cellular infrastructures: much slower
  - Spectrum shortage and economic issues
  - Cellular infrastructure spending in 2011 was expected to rise only 6.7% over 2010


Year                                             2011   2012   2013   2014   2015   2016
Global mobile data traffic per month (10^6 TB)   0.6    1.3    2.4    4.2    6.9    10.8
Avg. smartphone traffic per month (MB)           150                                2576

Source: Cisco Visual Networking Index (VNI) Global Mobile Data Traffic Forecast, 2011-2016

Avg. smartphone traffic per month: 1600% increase from 2011 to 2016


Web Caching on Cellular Devices


- The big picture: traffic redundancy elimination
- The first network-wide study of redundant transfers caused by inefficient HTTP caching on cellular devices
- HTTP: the dominant app-layer protocol for ~20 years
- Caching: huge benefits, but complex
- Caching on cellular devices:
  - Reduces redundant data transferred over the RAN (radio access network)
  - Improves performance due to reduced latency
  - Cuts cellular bills for customers



Background: Caching in HTTP/1.1

- Use Expiration and Revalidation to ensure caching consistency
- Before expiration: the client can safely assume that the cached file is fresh
- After expiration: the client must send a revalidation message to the server to query the freshness of the cache entry



Example exchange between client and server:

1. Server response: Last-Modified: Feb 1 2012 15:00:00, Expires: Feb 10 2012 15:00:00
2. Client revalidation after expiration: If-Modified-Since: Feb 1 2012 15:00:00
3. Server reply if the file is unchanged: 304 Not Modified
4. Server reply if the file has changed: Last-Modified: Feb 12 2012 15:00:00, Expires: Feb 15 2012 15:00:00

A well-known protocol for 20 years.
What is the state of the art in the context of cellular devices?
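
The expiration and revalidation rules can be sketched in a few lines of Python. This is a minimal illustration that reuses the Last-Modified and Expires values from the example; the function names and the two request dates (Feb 5 and Feb 11) are assumptions made for the sketch, not part of the talk.

```python
from datetime import datetime

def client_action(now, expires):
    """Decide what a spec-following client does with a cached copy."""
    # Before expiration the cached copy is assumed fresh: no network traffic.
    if now < expires:
        return "serve-from-cache"
    # After expiration the client must ask the server (If-Modified-Since).
    return "revalidate"

def server_reply(if_modified_since, last_modified):
    """Answer a revalidation request that carries If-Modified-Since."""
    # Unchanged file: a short 304 without a body. Changed file: a full 200.
    return 304 if last_modified <= if_modified_since else 200

cached_last_modified = datetime(2012, 2, 1, 15)
cached_expires = datetime(2012, 2, 10, 15)

print(client_action(datetime(2012, 2, 5, 15), cached_expires))        # serve-from-cache
print(client_action(datetime(2012, 2, 11, 15), cached_expires))       # revalidate
print(server_reply(cached_last_modified, datetime(2012, 2, 1, 15)))   # 304
print(server_reply(cached_last_modified, datetime(2012, 2, 12, 15)))  # 200
```

A client that downloads the whole file again in either of the first two situations, when the file has not changed, produces a redundant transfer.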

Measurement Goal


- Goal: understand the state of the art in HTTP caching on cellular devices
- What to study: redundant transfers caused by inefficient HTTP caching
- Potential cause: HTTP implementation related
  - Caching logic (client/server) not following the HTTP spec
  - Limited cache size
  - Non-persistent cache
- Potential cause: application semantics related
  - Server conservatively sets headers to make files uncacheable or expire too soon


Redundant transfers from the implementation-related causes account for about 20% of the total HTTP traffic volume!

Measurement Data

Name                  ISP                                        UMICH
Collection period     May 20 2011 (24 hours)                     May to Oct 12 2011 (5 months)
Collection location   Commercial cellular core network           Directly on user handsets
Data format           695 million records of HTTP transactions   Full packet trace with payload of all traffic
Traffic volume        24.3 TB                                    118 GB
Dataset size          271 GB                                     119 GB
# Users               About 2.9 million                          20 U of Michigan students
Platforms             Multiple (mainly iOS and Android)          Android 2.2

[Figure: user interface of the data collector/uploader software]


Methodology


- A simulator strictly follows HTTP/1.1 caching logic (RFC 2616)
  - Expiration and freshness calculation mechanism
  - Non-cacheable objects
  - Partial caching due to byte-range requests and broken connections
  - LRU cache replacement algorithm, and more …
- Feed each user's HTTP transactions to the cache simulator
- Redundant transfers are accurately identified in the simulation process
- HTTP caching is not simple: 2K C++ LoC even for the simulation core
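
To give a feel for the freshness calculation, here is a minimal Python sketch of the RFC 2616 precedence rules: `max-age` overrides `Expires`, and a heuristic based on `Last-Modified` applies when neither is present. It is not the authors' C++ simulator. The function names are hypothetical, header values are assumed to be parsed already, and the age is measured from the `Date` header only, which is simpler than the full age calculation in the RFC.

```python
from datetime import datetime, timedelta

HEURISTIC_FRACTION = 0.1  # RFC 2616 mentions 10% as a typical setting

def freshness_lifetime(date, max_age=None, expires=None, last_modified=None):
    """Return the freshness lifetime as a timedelta (zero if unknown)."""
    if max_age is not None:              # Cache-Control: max-age wins over Expires
        return timedelta(seconds=max_age)
    if expires is not None:
        return max(expires - date, timedelta(0))
    if last_modified is not None:        # heuristic freshness
        return (date - last_modified) * HEURISTIC_FRACTION
    return timedelta(0)

def is_fresh(now, date, **headers):
    # Simplification (assumption): age is measured from the Date header only.
    return (now - date) < freshness_lifetime(date, **headers)

date = datetime(2012, 2, 1, 15)
print(freshness_lifetime(date, max_age=3600, expires=datetime(2012, 2, 10, 15)))
print(is_fresh(datetime(2012, 2, 5, 15), date, expires=datetime(2012, 2, 10, 15)))
```

Even this reduced version has three branches, which hints at why the full simulation core needs far more code.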





Cacheability and Redundancy

- File cacheability: for both datasets
  - Most bytes (70% to 78%) and most files (66% to 72%) are cacheable.
- Traffic redundancy (assuming unlimited cache size)

Dataset   % Redundancy (HTTP only)   % Redundancy (HTTP + non-HTTP)
ISP       17.7%                      N/A
UMICH     20.3%                      17.3%

These figures are underestimates, due to HTTPS and app-semantic-related redundancy.

- Root causes of redundant transfers (within all HTTP traffic)

Origin of redundancy                                                                   ISP     UMICH   Issue
1. Handset issues a request before local copies expire                                 15.9%   16.3%   Client
2. Handset does not revalidate after local copies expire (the file unchanged)          1.8%    4.0%    Client
3. Server does not recognize revalidation after local copies expire (the file unchanged)   <0.1%   <0.1%   Server

Limited Cache Size and Non-persistent Cache

- Which factor has the main responsibility for redundancy?
  - Problematic caching logic
  - Limited cache size: cache size 4 MB, HTTP traffic savings 17%
    - The benefits are significant even for a small cache.
  - Non-persistent cache: 59% of the intervals between consecutive cache hits on the same entry are shorter than 1 min
    - It is unlikely that the handset is rebooted during such a short interval.
- How large does the cache need to be?
  - A cache of 50 MB achieves 90% of the gain (w.r.t. traffic reduction) compared to an unlimited cache

[Figure: distribution of intervals between consecutive cache hits on the same entry (ISP trace)]
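
The cache-size question can be explored with a small LRU simulation. The sketch below is a simplified stand-in for the real simulator: every object is assumed cacheable and never expiring, and the six-request trace is invented for illustration.

```python
from collections import OrderedDict

def bytes_saved(requests, capacity):
    """Bytes served from an LRU cache of `capacity` bytes.

    `requests` is a list of (url, size) pairs. Every object is treated as
    cacheable and never expiring, which is a simplification.
    """
    cache, used, saved = OrderedDict(), 0, 0
    for url, size in requests:
        if url in cache:
            cache.move_to_end(url)      # mark as most recently used
            saved += size
            continue
        if size > capacity:
            continue                    # object can never fit
        while used + size > capacity:   # evict least recently used entries
            _, evicted = cache.popitem(last=False)
            used -= evicted
        cache[url] = size
        used += size
    return saved

# Hypothetical trace: three objects of 4 bytes each.
trace = [("a", 4), ("b", 4), ("a", 4), ("c", 4), ("b", 4), ("a", 4)]
unlimited = bytes_saved(trace, float("inf"))
for capacity in (4, 8, 12):
    print(capacity, bytes_saved(trace, capacity) / unlimited)
```

Dividing by the unlimited-cache result gives the fraction of the achievable gain at each size, which is the quantity behind the "90% of the gain" statement.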

Quantifying the Resource Impact of Redundant Traffic

- In cellular networks, we also care about cellular resources
- Use our trace-driven RRC state machine simulator with a handset radio power model [Qian et al., MobiSys 11]
  - Applied only to cellular traffic within the UMICH dataset
- Three important metrics characterizing cellular resource consumption:
  - D: radio resource consumption
  - S: signaling load
  - E: handset radio energy consumption
- Compute the impact: $\Delta E = (E_0 - E_R) / E_0$
  - $E_0$: radio energy consumption in the original traces
  - $E_R$: radio energy consumption in modified traces with redundant transfers removed
  - $\Delta E$: radio energy impact of redundant transfers (a positive value)
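
The same relative-difference form presumably applies to all three metrics. The block below writes it out, assuming that $D_0, D_R$ and $S_0, S_R$ are defined like $E_0, E_R$ (those four symbols do not appear in the slides), and adds a worked example with invented numbers.

```latex
\Delta D = \frac{D_0 - D_R}{D_0}, \qquad
\Delta S = \frac{S_0 - S_R}{S_0}, \qquad
\Delta E = \frac{E_0 - E_R}{E_0}

% Hypothetical example: E_0 = 100 J and E_R = 93 J give
\Delta E = \frac{100 - 93}{100} = 0.07 = 7\%
```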


Quantifying the Resource Impact of Redundant Traffic

- When redundant and other traffic coexist, eliminating only the redundant traffic may not reduce resource consumption
  - As long as one of the concurrent transfers exists, the radio is on (i.e., consuming resources)
- Non-HTTP traffic plays a role (push notification and chatting)
  - Traffic volume: small (1%); resource impact: high (18%)
  - Resource release is controlled by fixed inactivity timers
  - Sending small data incurs high resource overhead

              ΔS (signaling load impact)   ΔE (radio energy impact)   ΔD (radio resource impact)
HTTP only     27%                          26%                        27%
All traffic   6%                           7%                         9%
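
Why removing a redundant transfer does not always free the radio can be seen with a toy model. The sketch below assumes a single "on" state and one inactivity timer (`tail`), which is much simpler than the RRC simulator used in the study; the transfer intervals are made up.

```python
def radio_on_time(transfers, tail):
    """Total time the radio stays on for a list of (start, end) transfers.

    Simplified model (an assumption): one 'on' state, and the radio turns off
    `tail` seconds after the last activity. Overlapping periods are merged.
    """
    total, current_start, current_end = 0.0, None, None
    for start, end in sorted(transfers):
        end += tail                       # radio lingers for the tail time
        if current_end is None or start > current_end:
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total

def impact(all_transfers, redundant, tail):
    """Relative drop in radio-on time when the redundant transfers are removed."""
    kept = [t for t in all_transfers if t not in redundant]
    before = radio_on_time(all_transfers, tail)
    return (before - radio_on_time(kept, tail)) / before

# Redundant transfer hidden inside another transfer: nothing is saved.
print(impact([(0, 10), (2, 6)], [(2, 6)], tail=5))
# Isolated 1-second redundant transfer: it costs 6 of 21 radio-on seconds.
print(impact([(0, 10), (100, 101)], [(100, 101)], tail=5))
```

In the second case the redundant transfer carries little data yet accounts for a large share of radio-on time, because the tail is paid regardless of the transfer size.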


Testing HTTP Libraries and Browsers


- Verify measurement findings by testing popular HTTP libraries and browsers on real handsets
- Design 13 controlled tests to cover all important aspects of caching implementation

Feature tests (is it well supported?)

1. Basic caching
2. Revalidation
3. Various non-caching directives
4. Various expiration directives
5. URL with query strings
6. Partial caching
7. Redirection caching

Attribute tests (infer the parameters)

1. Shared or non-shared?
2. Persistent or non-persistent?
3. Cache entry size limit
4. Total cache size
5. Cache entry replacement policy
6. Heuristic freshness lifetime



Testing HTTP Libraries and Browsers

- Revisit: which factor has the main responsibility for redundancy?
  - Problematic caching logic
  - Limited cache size
  - Non-persistent cache
- Basic caching test
  - Handset requests a small cacheable file f
  - Server transfers f with a proper Expires directive
  - Client requests f again before it expires
  - PASS iff the 2nd request does not incur any network traffic
- Cache size test: perform binary search
- Cache replacement policy test: try popular algorithms (LRU, LFU, FIFO)
- See paper for all 13 tests
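
The basic caching test and the binary-search idea can be sketched against a toy client. `ToyClient` below is a hypothetical stand-in for a library under test (a plain LRU byte cache), and the probing procedure is one plausible reading of "perform binary search", not necessarily the procedure used in the paper.

```python
from collections import OrderedDict

class ToyClient:
    """Hypothetical stand-in for a library under test: an LRU byte cache."""
    def __init__(self, capacity):
        self.capacity, self.cache = capacity, OrderedDict()

    def clear(self):
        self.cache.clear()

    def fetch(self, url, size):
        """Return True if the request generated network traffic."""
        if url in self.cache:
            self.cache.move_to_end(url)
            return False
        if size <= self.capacity:
            self.cache[url] = size
            while sum(self.cache.values()) > self.capacity:
                self.cache.popitem(last=False)
        return True

def basic_caching_test(client):
    client.clear()
    client.fetch("/f", 1)                 # first request populates the cache
    return not client.fetch("/f", 1)      # PASS iff no traffic the second time

def largest_cached_object(client, upper_bound):
    """Binary search for the largest object size the client still caches."""
    lo, hi = 0, upper_bound
    while lo < hi:
        mid = (lo + hi + 1) // 2
        client.clear()
        client.fetch("/probe", mid)
        if not client.fetch("/probe", mid):
            lo = mid                      # cached: the limit is at least mid
        else:
            hi = mid - 1                  # not cached: the limit is below mid
    return lo

print(basic_caching_test(ToyClient(8 * 1024)))
print(largest_cached_object(ToyClient(8 * 1024), 1 << 20))
```

On a real handset the "did this request hit the network?" signal would come from the test server's log or a packet capture rather than from a return value.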



Test Results

Smartphone HTTP library / browser    OS version        Supports caching?   Caching enabled by default?
java.net.URLConnection               Android 2.3       No                  No
java.net.HttpURLConnection           Android 2.3       No                  No
org.apache.http.client.HttpClient    Android 2.3       No                  No
android.webkit.WebView               Android 2.3       Yes                 No
android.net.http.HttpResponseCache   Android 4.0.2     Partially           No
Three20 (Version 1.0.6.2)            iOS 4.3.4         No                  No
NSURLRequest                         iOS 5.0.1         Partially           No
ASIHTTPRequest (Version 1.8.1)       iOS 4.3.4         Partially           No
Android Browser                      Android 2.3       Partially           Yes
iPhone Browser                       iOS 4.3.4/5.0.1   Partially           Yes
Chrome Browser                       Android 4.0.2     Yes                 Yes

Implementation issues of caching

- 4 out of 8 libraries do not support caching at all.
- For both browsers, when loading the same URL back-to-back, the second request is treated as a full reload from the remote server
- Android browser uses a small cache of 8 MB
- Partial caching is not supported
- Some do not properly handle Pragma: no-cache or Cache-Control: no-cache.

A huge gap between protocol specification and implementation, leading to significant redundancy of network traffic.
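
As an illustration of the directive handling that some libraries get wrong, the following Python sketch classifies a response by its caching headers. The function names and the three result strings are invented for the example. Treating `Pragma: no-cache` on a response like `Cache-Control: no-cache` is an assumption based on common practice, and the parser ignores quoted field lists such as `no-cache="Set-Cookie"`.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    # Simplification: splitting on commas breaks quoted, comma-separated lists.
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') or None
    return directives

def reuse_policy(headers):
    """Classify a response as 'do-not-store', 'revalidate-first' or 'normal'."""
    headers = {k.lower(): v for k, v in headers.items()}
    cc = parse_cache_control(headers.get("cache-control", ""))
    if "no-store" in cc:
        return "do-not-store"             # must not be written to the cache
    pragma = headers.get("pragma", "").lower()
    if "no-cache" in cc or "no-cache" in pragma:
        return "revalidate-first"         # may be stored, not reused blindly
    return "normal"                       # ordinary expiration rules apply

print(reuse_policy({"Cache-Control": "no-cache"}))
print(reuse_policy({"Pragma": "no-cache"}))
print(reuse_policy({"Cache-Control": "max-age=60, public"}))
```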


Summary


- The first network-wide study of cellular HTTP caching
- Redundant transfers are prevalent
  - 18% (ISP) and 20% (UMICH) of HTTP traffic volume
  - 17% of overall traffic volume (UMICH)
  - 6%~9% of cellular resource consumption (UMICH)
- The root cause: problematic caching logic on handsets
  - Validated by caching tests of popular libraries and browsers


Backup Slides


Diversity Among Applications


- Identifying smartphone applications
  - ISP: by user-agent fields in HTTP requests
  - UMICH: by the captured packet-process correspondence
- Diversity among top apps
  - HTTP redundancy ratios range from 0.0% to 100.0%
- Validate apps with high redundancy ratios (> 90%)
  - Analyze locally collected tcpdump traces
  - They do not cache HTTP responses
- Some apps have negligible redundant transfers
  - Almost all bytes are not cacheable, e.g., all requests are HTTP POST instead of HTTP GET
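
A per-application redundancy ratio could be computed along the following lines. The tuple layout, the app names and the byte counts are illustrative assumptions; the sketch only shows the grouping step, with the redundancy flag taken as given.

```python
from collections import defaultdict

def redundancy_by_app(transactions):
    """Map each app to redundant_bytes / total_bytes.

    `transactions` holds (user_agent, size_in_bytes, is_redundant) tuples.
    """
    total, redundant = defaultdict(int), defaultdict(int)
    for app, size, is_redundant in transactions:
        total[app] += size
        if is_redundant:
            redundant[app] += size
    return {app: redundant[app] / total[app] for app in total if total[app]}

# Hypothetical transactions keyed by User-Agent string.
sample = [
    ("NewsReader/1.0", 500, False),
    ("NewsReader/1.0", 500, True),
    ("ChatApp/2.3", 200, False),
]
print(redundancy_by_app(sample))  # {'NewsReader/1.0': 0.5, 'ChatApp/2.3': 0.0}
```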



The Cache Simulator (Simplified Version)

The simulation algorithm:

- Performs fine-grained caching simulation on a per-user basis
- Assigns to each HTTP transaction a label indicating its caching status.
- Red labels correspond to duplicated transfers.

Labels in the flowchart:

- The file contains "Cache-Control: no-store". It cannot be cached.
- Cache miss.
- Duplicated transfer: a request is issued before the file expires.
- The file has changed after the cache entry expires.
- The file has not changed after the cache entry expires, and a cache revalidation is properly performed.
- Duplicated transfer: the file has not changed after the cache entry expires, but the handset does not perform cache revalidation.
- Duplicated transfer: the file has not changed after the cache entry expires, but the server does not recognize the cache revalidation.
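
The seven labels can be read as a decision chain. The Python sketch below is one way to write that chain; the argument names and label strings are illustrative and are not taken from the simulator, which also has to derive these booleans from the trace.

```python
def label(cacheable, cached, expired, changed, revalidated, got_304):
    """Return (label, is_duplicated_transfer) for one HTTP transaction.

    All six arguments are booleans describing the transaction.
    """
    if not cacheable:
        return "not-cacheable", False          # e.g. Cache-Control: no-store
    if not cached:
        return "cache-miss", False
    if not expired:
        return "request-before-expiry", True   # full transfer of a fresh file
    if changed:
        return "file-changed", False
    if not revalidated:
        return "no-revalidation", True         # handset skipped revalidation
    if not got_304:
        return "revalidation-ignored", True    # server resent the whole file
    return "revalidated", False

# A cached, unexpired file is requested again over the network.
print(label(True, True, False, False, False, False))
```

Summing the bytes of transactions whose second field is true, per user, would give the redundant traffic volume.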

Background: Radio Resource Management in Cellular Networks

- RRC (Radio Resource Control) state machine [3GPP TS 25.331]
  - State promotions have promotion delay
  - State demotions incur tail times

RRC State   Channel                 Radio Power
IDLE        Not allocated           Almost zero
CELL_FACH   Shared, low speed       Low
CELL_DCH    Dedicated, high speed   High

[Figure: UMTS RRC state machine for a large US 3G carrier. The two promotions are labeled "Delay: 1.5s" and "Delay: 2s"; the two demotions are labeled "Tail Time".]

Background: Radio Resource Management in Cellular Networks

[Figure: radio power timeline. Promotion delay: 2 sec; DCH tail: 5 sec; FACH tail: 12 sec.]

- Tail time: waiting for inactivity timers to expire
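
The effect of the two tail timers can be sketched with a small state-time calculator that uses the 5 s and 12 s values above. The model is deliberately crude and rests on stated assumptions: every packet is served in CELL_DCH, promotion delays are ignored, packets are instantaneous, and `end_time` is not earlier than the last packet.

```python
DCH_TAIL, FACH_TAIL = 5.0, 12.0   # seconds, tail timers from the figure

def rrc_time(packet_times, end_time):
    """Seconds spent in CELL_DCH and CELL_FACH, plus promotions from IDLE."""
    times = sorted(packet_times)
    dch = fach = 0.0
    promotions = 0
    for i, t in enumerate(times):
        nxt = times[i + 1] if i + 1 < len(times) else end_time
        gap = nxt - t
        dch += min(gap, DCH_TAIL)                          # DCH until its tail expires
        fach += min(max(gap - DCH_TAIL, 0.0), FACH_TAIL)   # then FACH until its tail expires
        # A packet needs a promotion from IDLE if both tails had expired before it.
        if i == 0 or t - times[i - 1] > DCH_TAIL + FACH_TAIL:
            promotions += 1
    return dch, fach, promotions

print(rrc_time([0, 1, 2], 100))     # three packets in one burst
print(rrc_time([0, 30, 60], 100))   # the same three packets, spread out
```

Spreading the same three packets out more than doubles the time in the high-power state and triples the promotions, which is why small periodic transfers are costly under fixed inactivity timers.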
