Inter-Domain Routing

Oct 28, 2013

Don Fussell

CS 395T

Measuring Internet Performance

Internet Routing


Two-level architecture, two protocol classes


IGP: Interior Gateway Protocol


Within an organization’s network


Optimized protocol


Intra-domain routing protocol


EGP: Exterior Gateway Protocol


Between organizations’ networks


Policy routing


Inter-domain routing protocol

Interior Gateway Protocol


Runs within an Autonomous System (AS)


An AS is a collection of routers (not a collection
of IP addresses or prefixes)


Can provide optimal paths between nodes
(according to some cost metric)


Examples


RIP (Routing Information Protocol)


OSPF (Open Shortest Path First)


IS-IS (Intermediate System to Intermediate System)


IGRP, EIGRP (Cisco proprietary)

Exterior Gateway Protocol


Allows different ASs to exchange routing
information


Policy routing


Control can be exerted over the
information that crosses the border between ASs


Route choice is based on cost metrics, but EGPs do not
necessarily optimize paths the way IGPs do


Examples


BGP4 (Border Gateway Protocol, de facto standard)


EGP (Exterior Gateway Protocol, the specific protocol of that name, not the generic class)


GGP (Gateway to Gateway Protocol)


Hello


Distance Vector Protocols


Simple to understand and implement


Poor scalability, based on transmitting routing
tables between routers


Require periodic retransmission of routing
information as routing tables expire


Limited to small networks with simple topologies


Can exhibit “counting to infinity” behavior in the
presence of link failures


Example


RIP (Routing Information Protocol)
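
The table exchange and the "counting to infinity" behavior above can be sketched in a few lines. This is only an illustrative model, not RIP itself: the names `dv_update` and `count_to_infinity` are hypothetical, the three-router line topology A - B - C is an assumption, and split horizon (which real implementations use to limit this problem) is deliberately left out. The value 16 is RIP's "unreachable" metric.

```python
INFINITY = 16  # RIP treats a metric of 16 as unreachable


def dv_update(own_table, neighbor, neighbor_table, link_cost=1):
    """Merge a neighbor's advertised distance vector into our own table.

    Tables map destination -> (cost, next_hop). Returns True if anything changed.
    """
    changed = False
    for dest, (adv_cost, _) in neighbor_table.items():
        new_cost = min(adv_cost + link_cost, INFINITY)
        cur_cost, cur_hop = own_table.get(dest, (INFINITY, None))
        # Accept a cheaper route, or any change reported by the current
        # next hop (its view replaces ours even when the cost goes up).
        if new_cost < cur_cost or (cur_hop == neighbor and new_cost != cur_cost):
            own_table[dest] = (new_cost, neighbor)
            changed = True
    return changed


def count_to_infinity():
    """Assumed topology A - B - C with unit costs; the B-C link has just failed."""
    a = {"C": (2, "B")}            # A still believes C is 2 hops away via B
    b = {"C": (INFINITY, None)}    # B has noticed the failure
    rounds = 0
    while a["C"][0] < INFINITY or b["C"][0] < INFINITY:
        dv_update(b, "A", a)       # B hears A's stale route and adopts it
        dv_update(a, "B", b)       # A hears B's larger cost and follows it
        rounds += 1
    return rounds
```

In this model the two routers bounce the stale route back and forth, raising the cost by one on each exchange until both reach the unreachable metric.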



Link State Protocols


Routers exchange Link State Packets (LSPs), not routing
tables


LSP information from a router is flooded to the rest of the network


A router regenerates this information only when the topology
changes


Good scalability: the amount of information sent is proportional
to topology change, not to the number of IP prefixes


Each router maintains local map of entire network (AS),
called Link State Database (LSDB), and constructs shortest
path information using Dijkstra’s algorithm


Examples


OSPF, IS-IS
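
As a rough illustration of the shortest path computation mentioned above, here is a minimal Dijkstra sketch over an LSDB. The LSDB layout (a dict mapping each router to its neighbors and link costs), the function name `shortest_paths`, and the four-router example are assumptions made for this example, not the data structures of OSPF or IS-IS.

```python
import heapq


def shortest_paths(lsdb, source):
    """Run Dijkstra's algorithm over a link state database.

    lsdb maps router -> {neighbor: link_cost}.
    Returns {router: (total_cost, first_hop)}; first_hop is None for the source.
    """
    best = {}
    heap = [(0, source, None)]
    while heap:
        cost, node, first_hop = heapq.heappop(heap)
        if node in best:
            continue  # already settled via a path that is at least as cheap
        best[node] = (cost, first_hop)
        for nbr, link_cost in lsdb.get(node, {}).items():
            if nbr not in best:
                # The first hop is the neighbor itself when leaving the source.
                hop = nbr if node == source else first_hop
                heapq.heappush(heap, (cost + link_cost, nbr, hop))
    return best


# Hypothetical four-router AS used only for illustration.
example_lsdb = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
```

Calling `shortest_paths(example_lsdb, "A")` gives each destination's cost and the neighbor that A should forward to, which is the information a forwarding table needs.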

Classless Inter-Domain Routing (CIDR)


The Internet is a collection of networks, hence an IP address contains
two parts, a network identifier and a host identifier


Networks within the Internet have different numbers of hosts, hence
originally networks were divided into classes


Network classes


Class A


0 in high order bit, network id is in first octet, host address is in
the last three octets


128 class A networks each with 16.7 million host addresses


Class B


10 in high order two bits, network id is in first two octets, host
address is in the last two octets


16,384 class B networks each with 65,536 addresses (65,534 usable by hosts)


Class C


110 in high order three bits, network id is in the first three
octets, host address is in the last octet


2.1 million class C networks each with 256 addresses (254 usable by hosts)


Class D


for multicast


Class E


reserved and unused


This architecture is now obsolete
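
The class rules above amount to inspecting the leading bits of the first octet. A minimal sketch follows; the function name `classful_split` is hypothetical, and the leading bit patterns 1110 for class D and 1111 for class E come from the original classful scheme rather than from the slide.

```python
def classful_split(addr):
    """Return (class, network_id_octets, host_octets) for a dotted-quad address.

    Classes D and E have no network/host split, so both lists are None.
    """
    octets = [int(part) for part in addr.split(".")]
    first = octets[0]
    if (first & 0b10000000) == 0b00000000:    # leading bit 0
        cls, net_len = "A", 1
    elif (first & 0b11000000) == 0b10000000:  # leading bits 10
        cls, net_len = "B", 2
    elif (first & 0b11100000) == 0b11000000:  # leading bits 110
        cls, net_len = "C", 3
    elif (first & 0b11110000) == 0b11100000:  # leading bits 1110 (multicast)
        return "D", None, None
    else:                                     # leading bits 1111 (reserved)
        return "E", None, None
    return cls, octets[:net_len], octets[net_len:]
```

Because the netmask is implied by the address itself, no mask ever needs to be carried in a routing update, which is exactly the assumption CIDR removes.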

Classless Addressing


Rapid growth of Internet outpaced class based addressing


Routing tables growing too large


Running out of IP address space


CIDR primarily addresses routing table problem


Basic idea: get rid of implicit netmasks and pass explicit
netmasks in inter-domain routing protocols


CIDR allows service providers to aggregate classful
networks and provide single summarized routing
advertisements to other domains, thus controlling the
growth of routing tables


Prefixes can overlap, so forwarding must use the longest
matching prefix
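
Both ideas, aggregation and longest-prefix forwarding, can be tried out with Python's standard `ipaddress` module. This is a sketch under assumptions: the function names, the next-hop labels, and the example prefixes are made up for illustration, and a real router would use a trie rather than a linear scan.

```python
import ipaddress


def aggregate(prefixes):
    """Collapse adjacent or overlapping prefixes into the fewest summaries."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]


def longest_prefix_match(table, dest):
    """Pick the most specific matching entry.

    table maps prefix strings to next-hop labels.
    Returns (prefix, next_hop), or None when nothing matches.
    """
    addr = ipaddress.ip_address(dest)
    matches = []
    for prefix, hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net:
            matches.append((net, hop))
    if not matches:
        return None
    net, hop = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), hop


# Hypothetical forwarding table with overlapping entries.
example_table = {
    "0.0.0.0/0": "default",
    "192.168.0.0/16": "provider-A",
    "192.168.4.0/24": "customer-B",
}
```

Four contiguous /24 networks collapse into one /22 advertisement, while an address covered by both the /16 and the /24 entry is forwarded according to the /24.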

CIDR Advantages


Reduced the size of the Internet routing
table


Reduced the growth rate of the Internet
routing table


Allows current generation routers to handle
Internet addressing and forwarding


Extended the lifetime of IPv4 addressing

CIDR Issues


Address allocation must be done in such a way as
to allow aggregation


BGP4, which was created to support CIDR, must
also be configured to support aggregation


Multihoming (having more than one link to the
Internet): how to aggregate


Proxy aggregation: one AS performs
aggregation of addresses contained within another

BGP Outline


Based on Distance Vector algorithms (more precisely a path vector protocol: each route carries its AS path)


Uses TCP as transport protocol


A BGP session involves two nodes


Routers can be involved in several concurrent BGP sessions


BGP message types


Open session


Activate new routes to prefixes


Deactivate old routes to prefixes


Report unusual conditions


Close session


Advertised routes are actively being used by advertiser


Prefix advertisement attributes


Next hops


Route preference metrics


AS path of routing announcement


How the prefix entered the routing table of the source AS


BGP is extensible: new attributes can be added as needed

BGP State Machine

[Figure: BGP state machine diagram. States: Idle, Connect, Active, OpenSent, OpenConfirm, Established. Transition labels: TCP Connection Attempted, TCP Connection Established, TCP Connection Failed, Connection Accepted, Open Received, Connection Rejected or Error, Error.]
BGP Message Types


Open


Update


Notification


Keepalive

Open Message


Version (1 octet)


My Autonomous System (2 octets)


Hold time (2 octets)


BGP identifier (4 octets)


Optional parameters length (1 octet)


Optional parameters (variable length)


Type (1 octet)


Length (1 octet)


Value (variable)
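
To make the field layout concrete, the sketch below packs and unpacks the OPEN fields listed above with the standard `struct` module. It covers only the OPEN body, not the common message header, which is not described here. The function names are hypothetical, and each optional parameter value is treated as opaque bytes.

```python
import socket
import struct

OPEN_FIXED = "!BHH4sB"  # version, my AS, hold time, BGP identifier, opt. params length


def encode_open_body(my_as, hold_time, bgp_id, optional_params=(), version=4):
    """Pack the OPEN fields; optional_params is a sequence of (type, value_bytes)."""
    params = b"".join(
        struct.pack("!BB", ptype, len(value)) + value
        for ptype, value in optional_params
    )
    fixed = struct.pack(OPEN_FIXED, version, my_as, hold_time,
                        socket.inet_aton(bgp_id), len(params))
    return fixed + params


def decode_open_body(data):
    """Inverse of encode_open_body; returns (version, as, hold, id, params)."""
    fixed_len = struct.calcsize(OPEN_FIXED)  # 10 octets
    version, my_as, hold_time, raw_id, opt_len = struct.unpack(
        OPEN_FIXED, data[:fixed_len])
    params, offset = [], fixed_len
    while offset < fixed_len + opt_len:
        ptype, plen = struct.unpack("!BB", data[offset:offset + 2])
        params.append((ptype, data[offset + 2:offset + 2 + plen]))
        offset += 2 + plen
    return version, my_as, hold_time, socket.inet_ntoa(raw_id), params
```

For example, `encode_open_body(65001, 180, "192.0.2.1")` yields the 10 fixed octets, and each optional parameter adds its 2-octet type/length prefix plus its value.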

OPEN Optional Parameters


1: Authentication information (1-octet
authentication code and variable-length
information field; not really used)


2: Capability negotiation

Update Message


Withdrawn (unfeasible) routes length (2 octets)


Withdrawn (unfeasible) routes (variable)


IP prefix length in bits (1 octet)


IP prefix (variable)


Total path attributes length (2 octets)


Path attributes (variable)


Network layer reachability information (variable)


Attribute Encoding


Attribute Type (2 octets), consisting of:


Attribute Flags (1 octet)


Attribute Type Code (1 octet)


Attribute Length (1 or 2 octets)


Attribute Value (variable)
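
The prefix encoding used for withdrawn routes and for network layer reachability information is compact: one octet giving the prefix length in bits, followed by only as many octets of the prefix as that length requires. A small sketch, assuming IPv4 only; the function names are hypothetical.

```python
import ipaddress
import struct


def encode_prefix(prefix):
    """Encode one IPv4 prefix as <length in bits (1 octet)><prefix octets>."""
    net = ipaddress.ip_network(prefix)
    n_octets = (net.prefixlen + 7) // 8  # minimum octets that hold the prefix
    return bytes([net.prefixlen]) + net.network_address.packed[:n_octets]


def decode_prefixes(data):
    """Decode a run of encoded prefixes back into 'a.b.c.d/len' strings."""
    prefixes, offset = [], 0
    while offset < len(data):
        bits = data[offset]
        n_octets = (bits + 7) // 8
        raw = data[offset + 1:offset + 1 + n_octets].ljust(4, b"\x00")
        prefixes.append(f"{ipaddress.IPv4Address(raw)}/{bits}")
        offset += 1 + n_octets
    return prefixes


def encode_withdrawn(prefixes):
    """Withdrawn routes section: 2-octet length followed by encoded prefixes."""
    body = b"".join(encode_prefix(p) for p in prefixes)
    return struct.pack("!H", len(body)) + body
```

A /8 therefore costs two octets on the wire and a /22 costs four, rather than a fixed address plus mask for every route.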

Attribute Flags


Bit 1 (the high-order bit; RFC 4271 numbers it bit 0): Optional


0 = well-known, required in all BGP implementations


1 = optional



Bit 2: Transitive


0 = non-transitive, not passed to other peers


1 = transitive, must be passed on to others


Bit 3: Partial


1 = some router didn’t understand an optional transitive attribute


0 = otherwise; must be 0 for well-known and optional non-transitive
attributes


Bit 4: Extended Length


0 = attribute length represented in 1 octet


1 = attribute length represented in 2 octets
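
The flag bits determine how the rest of an attribute header is read, in particular whether the length field is one or two octets. The sketch below assumes the four flags occupy the four high-order bits of the flags octet, most significant first; the function name and the returned dictionary layout are hypothetical.

```python
import struct

OPTIONAL = 0x80         # highest-order bit
TRANSITIVE = 0x40
PARTIAL = 0x20
EXTENDED_LENGTH = 0x10


def decode_attribute(data):
    """Decode one path attribute: flags, type code, length, value."""
    flags, type_code = data[0], data[1]
    if flags & EXTENDED_LENGTH:
        (length,) = struct.unpack("!H", data[2:4])  # 2-octet length
        header_len = 4
    else:
        length = data[2]                            # 1-octet length
        header_len = 3
    return {
        "optional": bool(flags & OPTIONAL),
        "transitive": bool(flags & TRANSITIVE),
        "partial": bool(flags & PARTIAL),
        "extended_length": bool(flags & EXTENDED_LENGTH),
        "type_code": type_code,
        "length": length,
        "value": data[header_len:header_len + length],
    }
```

For instance, the octets `40 01 01 00` decode as a well-known, transitive attribute with type code 1 and a one-octet value.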



Notification Message


Error code (1 octet)


Error subcode (1 octet)


Data (variable)

Error Codes


1: Message Header Error


2: OPEN Message Error


3: UPDATE Message Error


4: Hold Timer Expired


5: Finite State Machine Error


6: Cease


Message Header Error Subcodes


1: Connection Not Synchronized


2: Bad Message Length


3: Bad Message Type

OPEN Message Error Subcodes


1: Unsupported Version Number


2: Bad Peer AS


3: Bad BGP Identifier


4: Unsupported Optional Parameter


5: Authentication Failure


6: Unacceptable Hold Time

UPDATE Message Error Subcodes


1: Malformed Attribute List


2: Unrecognized Well-known Attribute


3: Missing Well-known Attribute


4: Attribute Flags Error


5: Attribute Length Error


6: Invalid ORIGIN Attribute


7: AS Routing Loop


8: Invalid NEXT-HOP Attribute


9: Optional Attribute Error


10: Invalid Network Field


11: Malformed AS-PATH
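
The code and subcode tables above map directly onto a lookup when decoding a Notification message body (error code, error subcode, data). A minimal sketch; the function name and the table variable names are hypothetical, while the numeric values are the ones listed above.

```python
ERROR_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}

ERROR_SUBCODES = {
    1: {1: "Connection Not Synchronized", 2: "Bad Message Length",
        3: "Bad Message Type"},
    2: {1: "Unsupported Version Number", 2: "Bad Peer AS",
        3: "Bad BGP Identifier", 4: "Unsupported Optional Parameter",
        5: "Authentication Failure", 6: "Unacceptable Hold Time"},
    3: {1: "Malformed Attribute List", 2: "Unrecognized Well-known Attribute",
        3: "Missing Well-known Attribute", 4: "Attribute Flags Error",
        5: "Attribute Length Error", 6: "Invalid ORIGIN Attribute",
        7: "AS Routing Loop", 8: "Invalid NEXT-HOP Attribute",
        9: "Optional Attribute Error", 10: "Invalid Network Field",
        11: "Malformed AS-PATH"},
}


def decode_notification(body):
    """Split a Notification body into (error name, subcode name or None, data)."""
    code, subcode = body[0], body[1]
    name = ERROR_CODES.get(code, "Unknown")
    sub_name = ERROR_SUBCODES.get(code, {}).get(subcode)
    return name, sub_name, body[2:]
```

Codes without a subcode table, such as Hold Timer Expired or Cease, simply return `None` for the subcode name.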

Keepalive


Common header, no data

Model of Operation


Each BGP speaker keeps routes in three kinds of tables


Adj-RIB-In (Adjacent Routing Information Base In)


1 per peer (BGP session)


Contains prefixes learned from that peer


Loc-RIB (Local Routing Information Base)


1 per system


Contains prefixes selected for use


Adj-RIB-Out (Adjacent Routing Information Base Out)


1 per peer (BGP session)


Contains prefixes advertised to that peer

Standard Attributes


1: ORIGIN (well-known, mandatory)


Indicates how a given prefix came into BGP at
the AS originating the prefix announcement


0: IGP, the prefix was learned from an IGP


1: EGP, the prefix was learned through the EGP protocol


2: INCOMPLETE, the prefix was learned
through some mechanism other than IGP or
EGP; in practice these are the static routes

Standard Attributes


2: AS-PATH (well-known, mandatory)


Contains sequence of ASNs through which the
announcement has passed


Primarily used for loop detection/prevention


If the receiving peer’s own ASN appears in the AS-PATH, the
announcement is generally rejected, although some
implementations can be configured to accept such a
route for partition healing.


Encoded as a sequence of AS-PATH segments


Each has a TYPE (1 octet), LENGTH (1 octet), VALUE (list
of length LENGTH of 2-octet ASNs)


TYPE is either AS-SET or AS-SEQUENCE, which allows for
aggregation of routes received via different AS-PATHs
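
The segment encoding and the loop check can be sketched together. Assumptions in this example: the segment type codes 1 (AS-SET) and 2 (AS-SEQUENCE) are taken from RFC 4271 rather than from the slide, ASNs are 2 octets as stated above, and the function names are hypothetical.

```python
import struct

AS_SET = 1        # unordered set of ASNs (type code per RFC 4271)
AS_SEQUENCE = 2   # ordered list of ASNs (type code per RFC 4271)


def encode_as_path(segments):
    """segments is a list of (type, [asn, ...]); LENGTH counts ASNs, not octets."""
    out = b""
    for seg_type, asns in segments:
        out += struct.pack("!BB", seg_type, len(asns))
        out += struct.pack(f"!{len(asns)}H", *asns)
    return out


def decode_as_path(data):
    """Inverse of encode_as_path."""
    segments, offset = [], 0
    while offset < len(data):
        seg_type, count = struct.unpack("!BB", data[offset:offset + 2])
        asns = struct.unpack(f"!{count}H", data[offset + 2:offset + 2 + 2 * count])
        segments.append((seg_type, list(asns)))
        offset += 2 + 2 * count
    return segments


def has_loop(segments, local_asn):
    """True if our own ASN already appears anywhere in the path."""
    return any(local_asn in asns for _, asns in segments)
```

A receiver would typically run `has_loop` with its own ASN on every incoming announcement and drop the route when it returns `True`.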

Standard Attributes


3: NEXT-HOP (well-known, mandatory)


Address of the node to which packets should be sent to reach the
advertised prefix


Often the same as the speaker’s IP address


Can be different (third-party next hop); otherwise the attribute
would be redundant


Requires special configuration, need not be accepted by
listener


Can be useful when several routers are on a LAN but
only some of them speak BGP

Standard Attributes


4: MULTI-EXIT-DISCRIMINATOR (MED)
(optional, nontransitive, 4-octet unsigned integer)


Used when two ASs connect to each other at multiple
places


Carries a metric expressing a degree of preference for
the link in the advertisement for routing to a prefix


Sent by one AS, used by another, thus typically used in
provider-subscriber relationships

Standard Attributes


5: LOCAL-PREF (well-known,
discretionary, 4-octet unsigned integer)


Generally used locally by an AS to express
preferences for routes to a prefix when multiple
routes to different ASs are known


Different from MED in that it isn’t passed by
one AS to another, and doesn’t only apply to
multiple connections between a pair of ASs

Standard Attributes


6: ATOMIC-AGGREGATE (well-known,
discretionary, zero length, used as a flag)


Indicates that the advertised prefix has been aggregated


Some ASs on the paths to parts of the advertised aggregate
address space may not appear in the AS-PATH


The receiver of the advertisement should not
deaggregate the prefix into more specific BGP entries

Standard Attributes


7: AGGREGATOR (optional, transitive, 2-octet
ASN, 4-octet IP address)


Indicates the AS and router that performed the
aggregation of the announced prefix

Internal and External BGP


How do multiple routers speaking BGP within a single AS
exchange routing information?


Could use an IGP such as OSPF, but the volume of routing table
information and frequency of updates typically transmitted by BGP
would overwhelm the IGP's link state flooding


A preferred way is to use Internal BGP (I-BGP)


Strictly speaking, we should call the typical EGP use of BGP E-BGP


Basically, the two are the same, with the key difference that
prefixes learned from an E-BGP neighbor can be advertised to an
I-BGP neighbor and vice versa, but a prefix learned from an I-BGP
neighbor cannot be advertised to another I-BGP neighbor


This prevents looping routing announcements within an AS; the
AS-PATH attribute is useless for this within one AS


It also leads to the requirement of a full mesh of logical
connections between I-BGP peers within an AS
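
The advertisement rule and the cost of the resulting full mesh can be written down in a few lines. This is only a sketch: the function names and the "ebgp"/"ibgp" string labels are hypothetical, and allowing E-BGP to E-BGP re-advertisement is an assumption the slide does not state explicitly.

```python
def may_advertise(learned_from, advertise_to):
    """Apply the rule that I-BGP-learned prefixes are not passed to I-BGP peers.

    Both arguments are "ebgp" or "ibgp".
    """
    return not (learned_from == "ibgp" and advertise_to == "ibgp")


def full_mesh_sessions(n_routers):
    """Number of I-BGP sessions when every pair of routers peers directly."""
    return n_routers * (n_routers - 1) // 2
```

Since no router relays I-BGP routes, every I-BGP speaker must hear each route directly from the router that learned it, so the session count grows quadratically: 4 routers need 6 sessions, 10 routers need 45.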

BGP Route Selection


How does a system choose among multiple routes for the
same (identical, not overlapping) prefix?


The route with the highest LOCAL-PREF is selected first


If no unique route is found, then the route with the shortest
AS-PATH is selected from among those previously selected


If this does not produce a unique route, then if the system accepts
MED and the multiple routes were learned from a single
neighboring AS, the route with the lowest MED value is selected


If multiple routes are still available, then choose the route with the
minimum cost to the NEXT-HOP according to the IGP in use


If no unique route has been chosen, and exactly one of the routes
was learned by E-BGP, choose that one.


If no unique route has been chosen, and all routes were learned via
I-BGP, then choose the route learned from the I-BGP neighbor
with the lowest BGP ID
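
The selection steps above form a chain of tie-breakers, each narrowing the candidate set left by the previous one. A sketch of that chain follows; the `Route` fields and function names are hypothetical, the routes are assumed to be for the same prefix, and real implementations add further steps and vendor-specific rules.

```python
from dataclasses import dataclass


@dataclass
class Route:
    local_pref: int
    as_path: tuple      # sequence of ASNs
    med: int
    neighbor_as: int    # AS the route was learned from
    igp_cost: int       # IGP cost to reach the NEXT-HOP
    learned_via: str    # "ebgp" or "ibgp"
    peer_bgp_id: int    # BGP identifier of the advertising neighbor


def _keep_best(routes, key):
    """Keep only the routes that share the minimum value of key."""
    best = min(key(r) for r in routes)
    return [r for r in routes if key(r) == best]


def select_route(routes, use_med=True):
    """Apply the tie-breakers in order; routes must be non-empty."""
    candidates = _keep_best(list(routes), lambda r: -r.local_pref)  # highest wins
    candidates = _keep_best(candidates, lambda r: len(r.as_path))   # shortest path
    # MED is only comparable among routes from a single neighboring AS.
    if use_med and len({r.neighbor_as for r in candidates}) == 1:
        candidates = _keep_best(candidates, lambda r: r.med)
    candidates = _keep_best(candidates, lambda r: r.igp_cost)
    ebgp = [r for r in candidates if r.learned_via == "ebgp"]
    if len(candidates) > 1 and len(ebgp) == 1:
        candidates = ebgp
    if len(candidates) > 1 and all(r.learned_via == "ibgp" for r in candidates):
        candidates = _keep_best(candidates, lambda r: r.peer_bgp_id)
    return candidates[0]
```

Note that LOCAL-PREF is checked before AS-PATH length, so a locally preferred route wins even when its path is longer.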