Dynamics of Hot-Potato Routing in IP Networks


Renata Teixeira (UC San Diego)

http://www-cse.ucsd.edu/~teixeira

with Aman Shaikh (AT&T), Tim Griffin (Intel), and Jennifer Rexford (AT&T)

SIGMETRICS’04, New York, NY

Internet Routing Architecture

[Figure: ASes labeled UCSD, Sprint, AT&T, Verio, and AOL, with a User and a Web Server; labels: interdomain routing (BGP), intradomain routing (OSPF, IS-IS).]

- Changes in one AS may impact traffic and routing in other ASes
- End-to-end performance depends on all ASes along the path

Hot-Potato Routing

[Figure: ISP network with routers in San Francisco, Dallas, and New York; cost labels 9 and 10; destination dst; multiple connections to the same peer.]

Hot-potato routing = route to closest egress point when there is more than one route to destination
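
The selection rule on this slide can be sketched in a few lines. This is a minimal illustration, not a router implementation: the function name `hot_potato_egress` and the cost table are hypothetical, and the costs 9 and 10 are read off the figure labels under the assumption that they are the Dallas router's path costs to San Francisco and New York.

```python
# Hypothetical IGP path costs from the Dallas router to each egress point,
# assuming the figure labels 9 and 10 refer to San Francisco and New York.
igp_cost = {"San Francisco": 9, "New York": 10}


def hot_potato_egress(egresses, igp_cost):
    """Return the egress with the lowest IGP cost among the candidate exits.

    Ties are broken by name so the result is deterministic; real routers
    apply further BGP tie-breaking steps that are not modeled here.
    """
    return min(egresses, key=lambda e: (igp_cost[e], e))


# dst is reachable through both egress points, so the closest one wins.
best = hot_potato_egress(["San Francisco", "New York"], igp_cost)
print(best)  # San Francisco
```

The sketch only covers the case the slide describes, where several routes to the destination are otherwise equally good and the intradomain cost decides.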

Hot-Potato Routing Change

[Figure: ISP network with routers in San Francisco, Dallas, and New York; cost labels 9, 10, and 11; destination dst.]

Causes of an intradomain cost change:
- failure
- planned maintenance
- traffic engineering

Routes to thousands of destinations switch exit point!!!

Consequences:
- Transient forwarding instability
- Traffic shift
- Inter-domain routing changes
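
A small sketch of why one cost change moves many destinations at once. It assumes the figure labels mean that the cost toward San Francisco rises from 9 to 11 while New York stays at 10; the prefixes, the helper names `closest_egress` and `switched_prefixes`, and the egress sets are all hypothetical.

```python
def closest_egress(egresses, cost):
    """Pick the egress with the lowest IGP cost (ties broken by name)."""
    return min(egresses, key=lambda e: (cost[e], e))


def switched_prefixes(prefix_egresses, old_cost, new_cost):
    """Return prefixes whose closest egress differs under the two cost maps."""
    return sorted(
        prefix
        for prefix, egresses in prefix_egresses.items()
        if closest_egress(egresses, old_cost) != closest_egress(egresses, new_cost)
    )


# Hypothetical prefixes and the egress points through which they are reachable.
prefix_egresses = {
    "192.0.2.0/24": ["SF", "NY"],
    "198.51.100.0/24": ["SF", "NY"],
    "203.0.113.0/24": ["SF"],  # single exit point: cannot switch
}
before = {"SF": 9, "NY": 10}
after = {"SF": 11, "NY": 10}  # assumed change: cost to SF rises from 9 to 11

changed = switched_prefixes(prefix_egresses, before, after)
print(changed)  # ['192.0.2.0/24', '198.51.100.0/24']
```

Every prefix reachable through both exits switches together, which is the effect the slide highlights; a prefix with a single exit point is unaffected.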

Approach

- Understanding impact in real networks
  - How often do hot-potato changes happen?
  - How many destinations do they affect?
  - What are the convergence delays?
- Main contributions
  - Methodology for measuring hot-potato changes
  - Characterization on AT&T’s IP backbone


Challenges for Identifying Hot-Potato Changes

- Cannot collect data from all routers
  - OSPF: flooding gives complete view of topology
  - BGP: multi-hop sessions to several vantage points
- A single event may cause multiple messages
  - Group related routing messages in time
- Router implementation affects message timing
  - Controlled experiments of router in the lab
- Many BGP updates caused by external events
  - Classify BGP routing changes by possible causes

Measurement Methodology

[Figure: AT&T backbone with vantage points A and B; a BGP monitor collecting BGP updates and an OSPF monitor collecting OSPF messages.]

Replay routing decisions from vantage points A and B to identify hot-potato changes

Algorithm for Correlating Routing Changes

- Step 1: Process stream of OSPF messages
  - Group OSPF messages close in time
  - Transform OSPF messages into vantage point’s routing changes
- Step 2: Process stream of BGP updates from vantage point
  - Group updates close in time
  - Classify BGP routing changes by possible OSPF cause
- Step 3: Match BGP routing changes to OSPF changes in time
  - Determine causal relationship
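
The grouping and matching steps can be sketched as follows. This is only an illustration of the time-based correlation idea: the `Event` type, the function names, and the `gap` and `window` values are assumptions (the slides do not give thresholds), and the classification in Step 2 is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Event:
    time: float   # seconds since the start of the trace
    label: str    # free-form description of the message or change


def group_by_time(events, gap):
    """Group events so consecutive events in a group are at most `gap` seconds apart."""
    groups = []
    for ev in sorted(events, key=lambda e: e.time):
        if groups and ev.time - groups[-1][-1].time <= gap:
            groups[-1].append(ev)
        else:
            groups.append([ev])
    return groups


def match_bgp_to_ospf(ospf_groups, bgp_groups, window):
    """Pair each BGP group with the latest OSPF group that started
    at most `window` seconds before it (None if there is no such group)."""
    matches = []
    for bgp in bgp_groups:
        bgp_start = bgp[0].time
        candidates = [g for g in ospf_groups if 0 <= bgp_start - g[0].time <= window]
        cause = max(candidates, key=lambda g: g[0].time) if candidates else None
        matches.append((bgp, cause))
    return matches


# Hypothetical traces; gap and window are placeholder values.
ospf = [Event(100.0, "LSA a"), Event(101.0, "LSA b"), Event(500.0, "LSA c")]
bgp = [Event(130.0, "update p1"), Event(132.0, "update p2"), Event(900.0, "update p3")]

ospf_groups = group_by_time(ospf, gap=10)
bgp_groups = group_by_time(bgp, gap=10)
matches = match_bgp_to_ospf(ospf_groups, bgp_groups, window=180)

for bgp_group, cause in matches:
    print([e.label for e in bgp_group], "<-", [e.label for e in cause] if cause else None)
```

In this toy trace the first BGP group is attributed to the OSPF group at 100 s, while the update at 900 s has no OSPF change close enough in time and would be treated as having another cause.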

Characterization of AT&T Network

- Dataset
  - BGP updates from 9 routers
  - 176 days of data from February to July 2003
- Understanding impact of hot-potato changes
  - How often do hot-potato changes happen?
  - How many destinations do they affect?
  - What are the convergence delays?


Frequency of Hot-Potato Changes

[Figure: frequency of hot-potato changes at router A and router B.]

Need data from many vantage points and long duration

Variation across Routers

[Figure: router A reaches dst via NY (cost 10) or SF (cost 9); router B reaches dst via NY (cost 1000) or SF (cost 1).]

- Router A: small changes will make router A switch exit points to dst
- Router B: more robust to intradomain routing changes

Important factors:
- Location: relative distance to egresses
- Day: which events happen
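
One way to make the contrast between routers A and B concrete is to compute the gap between the two cheapest egress costs. This margin is an illustrative measure chosen for this sketch, not the sensitivity model referenced elsewhere in the talk; the function name `egress_margin` is hypothetical and the costs are the figure labels as read above.

```python
def egress_margin(costs):
    """Return the gap between the cheapest and second-cheapest egress cost."""
    ordered = sorted(costs.values())
    if len(ordered) < 2:
        return float("inf")  # a single exit point can never switch
    return ordered[1] - ordered[0]


# Costs as read from the figure labels (assumed mapping of numbers to cities).
router_a = {"NY": 10, "SF": 9}
router_b = {"NY": 1000, "SF": 1}

print(egress_margin(router_a))  # 1
print(egress_margin(router_b))  # 999
```

A cost change of 2 on the path to SF is enough to flip router A's choice, while router B would need a change of roughly 1000 before NY becomes the closer exit.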

Impact of an OSPF Change

[Figure: impact of an OSPF change at router A and router B.]

Delay for BGP Routing Change

[Figure: OSPF monitor and BGP monitor observing the router.]

- Steps between OSPF change and BGP update
  - OSPF message flooded through the network (t0)
  - OSPF updates path cost information
  - BGP decision process rerun (timer driven)
  - BGP update sent to another router (t)
  - First BGP update sent (t1)
- Metrics
  - Time for BGP to revisit decision: t1 - t0
  - Time for BGP update: t - t0
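
The two metrics can be computed directly from monitor timestamps. This sketch assumes that t0 is when the OSPF monitor sees the flooded message, that t is the time of each BGP update seen by the BGP monitor, and that t1 is the earliest of those; the function name `bgp_delays` and the sample timestamps are hypothetical.

```python
def bgp_delays(t0, update_times):
    """Return (t1 - t0, [t - t0 for each update]) for one OSPF-triggered change.

    t0: time the OSPF message was observed (seconds).
    update_times: times of the BGP updates attributed to that OSPF change.
    """
    if not update_times:
        raise ValueError("no BGP updates attributed to this OSPF change")
    t1 = min(update_times)  # first BGP update sent
    return t1 - t0, [t - t0 for t in update_times]


# Hypothetical timestamps in seconds.
revisit, per_update = bgp_delays(100.0, [128.0, 140.5, 171.0])
print(revisit)     # 28.0
print(per_update)  # [28.0, 40.5, 71.0]
```

The first value approximates how long BGP took to revisit its decision, and the list gives the delay of each individual update.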

BGP Reaction Time

[Figure: BGP reaction time plot; curves: first BGP update, all BGP updates; annotations: uniform 5 to 80 sec, transfer delay.]

Worst case scenario:
- 0 to 80 sec to revisit BGP decision
- 50 to 110 sec to send multiple updates
- Last prefix may take 3 minutes to converge!
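
The three-minute figure follows from adding the two upper bounds, assuming the two phases happen one after the other. The variable names below are hypothetical; the bounds are the ones listed above.

```python
# Upper bounds from the worst-case scenario, in seconds.
revisit_max_s = 80   # time to revisit the BGP decision
send_max_s = 110     # time to send multiple updates

# Assumption: the phases are sequential, so the bounds add up.
worst_case_s = revisit_max_s + send_max_s
minutes, seconds = divmod(worst_case_s, 60)

print(worst_case_s)      # 190
print(minutes, seconds)  # 3 10
```

That is 190 seconds, or a little over three minutes, for the last prefix.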

Data Plane Convergence

[Figure: routers R1 and R2, egress points E1 and E2, destination dst; cost labels 10, 100, 10, and 111.]

1 - BGP decision process runs in R2
2 - R2 starts using E1 to reach dst
3 - R1’s BGP decision can take up to 60 seconds to run

Packets to dst may be caught in a loop for 60 seconds!

Disastrous for interactive applications (VoIP, gaming, web)
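
The transient loop can be illustrated by tracing next hops before, during, and after convergence. The topology here is an assumption that is consistent with the figure labels but not stated in the slides: R1 sits between E1 and R2, and R2 sits between R1 and E2, so each router reaches the far egress through the other router. The function name `trace` is hypothetical.

```python
def trace(next_hop, start, max_hops=8):
    """Follow per-router next hops toward dst; return (path, looped)."""
    path = [start]
    node = start
    while node in next_hop and len(path) <= max_hops:
        node = next_hop[node]
        if node in path:
            return path + [node], True  # revisited a router: forwarding loop
        path.append(node)
    return path, False


# Assumed chain topology: E1 - R1 - R2 - E2 (egress points deliver to dst).
before = {"R1": "R2", "R2": "E2"}     # both routers exit via E2
transient = {"R1": "R2", "R2": "R1"}  # R2 switched to E1, R1 has not yet
after = {"R1": "E1", "R2": "R1"}      # both routers exit via E1

print(trace(before, "R1"))     # (['R1', 'R2', 'E2'], False)
print(trace(transient, "R1"))  # (['R1', 'R2', 'R1'], True)
print(trace(after, "R2"))      # (['R2', 'R1', 'E1'], False)
```

The loop lasts for as long as R1 keeps its old decision, which is the window of up to 60 seconds mentioned above.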

Conclusion

- Measured impact of hot-potato routing
  - Convergence delay (partially fixable)
  - Route changes and traffic shifts (fundamental property)
  - External routing updates
- What to do about it?
  - Router vendor: event-driven implementation
  - Network operator: operational practices to avoid changes
  - Network designer: designs that minimize sensitivity
    - Model of sensitivity to hot-potato disruptions (SIGCOMM’04)
  - Protocol designer: looser coupling of routing protocols

Hot-Potato Changes across Prefixes

[Figure: cumulative % BGP updates versus % prefixes; curves: non hot-potato changes, all, hot-potato changes; one region is annotated "prefixes with only one exit point".]

- OSPF-triggered BGP updates affect ~60% of prefixes uniformly
- Contrast with non-OSPF-triggered BGP updates

Algorithm for Correlating Routing Changes

[Figure: timeline (time axis) with a stream of OSPF messages and a stream of BGP updates from vantage point; example labels: costs from Dallas SF 9, NY 10, then SF 11, NY 10; destination labels dst and dst 2.]

- Transform OSPF msgs into vantage point’s routing changes
- Determine “stable” routing changes per dst and classify them according to possible OSPF cause
- Match path cost changes with BGP routing changes that happened close in time