High Performance Web Sites

CS193H: High Performance Web Sites

Lecture 16: Rule 13 - Configure ETags

Steve Souders

Google

souders@cs.stanford.edu

announcements

11/17 guest lecturer: Robert Johnson (Facebook), "Fast Data at Massive Scale - lessons learned at Facebook"

Expires

expiration date determines freshness

can also use Cache-Control: max-age

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip
Expires: Fri, 26 Sep 2008 22:00:00 GMT

XmoÛHþ\ÿFÖvã*wØoq...
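
The freshness check above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the function name is_fresh and its arguments are hypothetical, the age of the response is counted from the moment it was received (the Date and Age headers are ignored), and a response with no explicit expiration is treated as stale.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh(headers, received_at, now):
    """Return True if a cached response can be reused without asking the server.

    headers: dict of response header name -> value.
    received_at, now: timezone-aware datetimes.
    """
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    if match:
        # max-age is a lifetime in seconds; when present it wins over Expires.
        return now < received_at + timedelta(seconds=int(match.group(1)))
    expires = headers.get("Expires")
    if expires:
        # Expires is an absolute HTTP date.
        return now < parsedate_to_datetime(expires)
    return False  # no explicit expiration: treated as stale in this sketch


headers = {"Expires": "Fri, 26 Sep 2008 22:00:00 GMT"}
received = datetime(2008, 9, 22, 22, 0, tzinfo=timezone.utc)
print(is_fresh(headers, received, datetime(2008, 9, 25, 12, 0, tzinfo=timezone.utc)))  # True
print(is_fresh(headers, received, datetime(2008, 9, 27, 12, 0, tzinfo=timezone.utc)))  # False
```

While the response is fresh the browser makes no request at all; validation only starts once it has expired.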

Conditional GET (IMS)

IMS determines validity

does the browser's cached version match what's on the server?

the comparison is based on the resource's date

a 304 response is sent instead of all the data

IMS is used when Reload is pressed

sometime after 3pm PT 9/24/08:

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate
If-Modified-Since: Mon, 22 Sep 2008 21:14:35 GMT

HTTP/1.1 304 Not Modified
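
A rough sketch of the server side of an IMS conditional GET, assuming the server keeps one Last-Modified date per resource. The function name respond_to_ims and its return shape are made up for illustration; real servers handle more cases.

```python
from email.utils import parsedate_to_datetime


def respond_to_ims(request_headers, last_modified):
    """Return (status, send_body) for a GET, using only the resource's date.

    last_modified: HTTP-date string the server holds for the resource.
    """
    ims = request_headers.get("If-Modified-Since")
    if ims is not None:
        try:
            unchanged = parsedate_to_datetime(last_modified) <= parsedate_to_datetime(ims)
        except (TypeError, ValueError):
            unchanged = False  # unparseable date: fall back to a full response
        if unchanged:
            return 304, False  # headers only; the browser reuses its cached copy
    return 200, True  # full response, body included


last_modified = "Mon, 22 Sep 2008 21:14:35 GMT"
print(respond_to_ims({"If-Modified-Since": last_modified}, last_modified))  # (304, False)
print(respond_to_ims({}, last_modified))  # (200, True)
```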

ETag Response Header

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip
Expires: Fri, 26 Sep 2008 22:00:00 GMT
ETag: "19f1e-7920-4525b037f0440"

XmoÛHþ\ÿFÖvã*wØoq...

Conditional GET (INM)

alternative way to test validity

sometime after 3pm PT 9/24/08:

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate
If-Modified-Since: Mon, 22 Sep 2008 21:14:35 GMT
If-None-Match: "19f1e-7920-4525b037f0440"

HTTP/1.1 304 Not Modified

What is an ETag?

http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.11

added in HTTP/1.1

used by clients and servers to validate expired resources

more flexible than Last-Modified date

"An entity tag consists of an opaque quoted string"

"An entity tag MUST be unique across all versions of all entities associated with a particular resource."

If-None-Match (hit)

"If any of the entity tags match the entity tag of the entity that would have been returned in the response to a similar GET request (without the If-None-Match header) on that resource[…], then the server MUST NOT perform the requested method, unless required to do so because the resource's modification date fails to match that supplied in an If-Modified-Since header field in the request. Instead, if the request method was GET or HEAD, the server SHOULD respond with a 304 (Not Modified) response,…"

http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26

INM, IMS hit & miss

                        If-Modified-Since
                        hit      miss
If-None-Match   hit     304      full response
                miss

If-None-Match (miss)

"If none of the entity tags match, then the server MAY perform the requested method as if the If-None-Match header field did not exist, but MUST also ignore any If-Modified-Since header field(s) in the request. That is, if no entity tags match, then the server MUST NOT return a 304 (Not Modified) response."

INM, IMS hit & miss

                        If-Modified-Since
                        hit              miss
If-None-Match   hit     304              full response
                miss    full response    full response

if not managed properly, sending both IMS and INM lowers the chances of a simple, small 304 response
How could it not be managed properly?!

Apache ETags

"19f1e-7920-4525b037f0440"

"inode-size-timestamp"

inode: used by filesystems to store file type, owner, group, permissions, etc.

inode for the same file differs across servers even if file size, timestamp, and directory are the same
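
To make the inode problem concrete, here is a small sketch that builds an ETag in the "inode-size-timestamp" shape. Rendering each field as lowercase hex is an assumption based on how the examples look, and apache_style_etag is a hypothetical helper, not Apache code. The field values are taken from the two arrow-right-9x13.png ETags in this lecture.

```python
def apache_style_etag(inode, size, timestamp):
    """Format three integers as a quoted "inode-size-timestamp" ETag.

    Hex rendering of each field is an assumption for this sketch.
    """
    return '"{:x}-{:x}-{:x}"'.format(inode, size, timestamp)


# Same file (same size, same timestamp) stored on two servers with different inodes.
server_a = apache_style_etag(0x21f5315, 0xd4, 0x5d51f0c0)
server_b = apache_style_etag(0x1ee57ec, 0xd4, 0x5d51f0c0)

print(server_a)              # "21f5315-d4-5d51f0c0"
print(server_b)              # "1ee57ec-d4-5d51f0c0"
print(server_a == server_b)  # False
```

The two strings differ only in the inode field, so an If-None-Match built from one server's ETag misses on the other.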

http://stevesouders.com/images/arrow-right-9x13.png
ETag: "21f5315-d4-5d51f0c0"

http://1.cuzillion.com/images/arrow-right-9x13.png
ETag: "1ee57ec-d4-5d51f0c0"

IIS ETags

"b4f35327edac51:113f"

"timestamp:changenumber"

changenumber: counter to track IIS configuration changes

changenumber is rarely the same across servers

http://hp.msn.com/global/c/hpv10/favicon.ico
ETag: "b4f35327edac51:113f"
ETag: "b4f35327edac51:e6e"

example ETag miss

GET /global/c/hpv10/favicon.ico HTTP/1.1
Host: hp.msn.com
If-Modified-Since: Wed, 26 Oct 2005 22:39:58 GMT
If-None-Match: "b4f35327edac51:19bc"

HTTP/1.x 200 OK
Content-Length: 1406
Etag: "b4f35327edac51:d76"
Last-Modified: Wed, 26 Oct 2005 22:39:58 GMT
Expires: Wed, 06 Feb 2008 01:10:16 GMT

timestamp is the same

Last-Modified matches (but IMS misses)

changenumber differs, validation misses, entire body is resent

validation miss
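
The miss above can be reproduced by splitting the two ETags from the example into their "timestamp:changenumber" parts. parse_iis_etag is a hypothetical helper written for this sketch; the ETag values are the ones in the request and response shown.

```python
def parse_iis_etag(etag):
    """Split a quoted "timestamp:changenumber" ETag into its two parts."""
    timestamp, changenumber = etag.strip('"').split(":")
    return timestamp, changenumber


sent = '"b4f35327edac51:19bc"'    # If-None-Match sent by the browser
current = '"b4f35327edac51:d76"'  # ETag on the server that answered

print(parse_iis_etag(sent)[0] == parse_iis_etag(current)[0])  # True: same timestamp
print(parse_iis_etag(sent)[1] == parse_iis_etag(current)[1])  # False: changenumber differs
print(sent == current)  # False: INM misses, so the full body is sent
```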

the problem with ETags

the default ETag syntax in Apache and IIS makes it unlikely that INM will match across servers, even when the resource is the same

probability of an incorrect INM miss: (n-1)/n, where "n" is the number of servers

not an issue if you just have one server
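
A quick check of the (n-1)/n figure, assuming every server produces a different ETag for the same resource and that requests are spread uniformly at random across the n servers. Both assumptions, and the helper names, belong to this sketch rather than to the lecture.

```python
import random


def miss_probability(n):
    """Chance that a revalidation lands on a different server than the original fetch."""
    return (n - 1) / n


def simulate(n, trials=100_000, seed=0):
    """Estimate the same probability by picking two servers at random per trial."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        first = rng.randrange(n)   # server that supplied the cached ETag
        second = rng.randrange(n)  # server that receives the conditional GET
        if first != second:
            misses += 1
    return misses / trials


for n in (1, 2, 4, 10):
    print(n, miss_probability(n), round(simulate(n), 3))
```

With one server the probability is 0, and it climbs toward 1 as servers are added: 0.5 for two, 0.75 for four, 0.9 for ten.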

http://www.apacheweek.com/issues/02-01-18
"can cause an unnecessary performance hit as resources are fetched more often than is required"

http://support.microsoft.com/kb/922703
"IIS 6.0 sends a 200 response because it considers the different change numbers to mean that [the resources] are not the same versions"


the solution for ETags

if you're not leveraging ETags, turn them off

reduces size of requests and responses

reduces outbound traffic from your servers

increases proxy cache hit rate

Apache: FileETag none

IIS: synchronize changenumber across servers
http://support.microsoft.com/kb/922703/

ETags in the wild

site                      server      ETags?   default syntax?
www.aol.com               AOLserver   no
www.ebay.com              IIS         yes      yes
www.facebook.com          Apache      no
www.google.com/search     gws         no
search.live.com/results   ASP.NET     yes      no
www.msn.com               IIS         no
www.myspace.com           Apache      some     no
en.wikipedia.org/wiki     Apache      some     no
                          lighttpd    yes      ?
www.yahoo.com             YTS         no
www.youtube.com           btfe        no
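
A survey like the one above could be partly automated by matching an ETag value against the two default shapes from this lecture. The patterns below are heuristics inferred from the example ETags (hex fields separated by "-" or ":"), and classify_etag is a hypothetical helper; a match suggests a default syntax but does not prove which server produced the ETag.

```python
import re

# Shapes inferred from the lecture's examples; both are assumptions, not specifications.
APACHE_DEFAULT = re.compile(r'^"[0-9a-f]+-[0-9a-f]+-[0-9a-f]+"$')  # "inode-size-timestamp"
IIS_DEFAULT = re.compile(r'^"[0-9a-f]+:[0-9a-f]+"$')               # "timestamp:changenumber"


def classify_etag(etag):
    """Guess whether an ETag header value looks like a server default."""
    if etag is None:
        return "no ETag"
    if APACHE_DEFAULT.match(etag):
        return "Apache default syntax"
    if IIS_DEFAULT.match(etag):
        return "IIS default syntax"
    return "other syntax"


print(classify_etag('"19f1e-7920-4525b037f0440"'))  # Apache default syntax
print(classify_etag('"b4f35327edac51:113f"'))       # IIS default syntax
print(classify_etag(None))                          # no ETag
```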



possible uses for ETags

???

Homework

11/7 11:59pm: rules 4-10 applied to your "Improving a Top Site" class project

11/12 3:15pm: Web 100 Double Check
  look at your rows in Web 100 spreadsheet
  double-check your entries for any rows in red
  update incorrect entries
  enter "y" in "Double Checked" column

read HPWS Chapter 14

Questions

Why were ETags introduced in HTTP/1.1?

What do "IMS" and "INM" stand for?

How do IMS and INM interplay during resource validation?

What's the default syntax for ETags in Apache and IIS?

What component in each default syntax hurts performance, and why?

What are three performance gains you can achieve by turning off ETags?