Post on 18-Dec-2015

                        

                        

Crafting Confederations

An overview of the Confederation POP Approach to Network Architecture

Dan Golding
NetRail, Inc.
dan@netrail.net

Miguel Dimayuga
Earthlink, Inc.
mdimayuga@corp.earthlink.net

                        

                        

The Old Way…

Conventional Network Routing Architectures….

• Full Mesh iBGP or Route Reflectors

• A fully meshed Network via ATM PVCs.

                        

                        

What’s Wrong With The Old Way?

• It’s not adapted to the New Optical Network!
• POS is here in force; ATM’s value in the core is receding.
• It is far more fragile, and far less agile, than newer methods of Inter-domain Routing.
• The Old Way was prone to user error. The E-Commerce Revolution demands a New Way!

                        

                        

A Better Way

• Emphasizes Large Scale, IP Based, Fiber Ring Networks

• Optimized for Service Provider Needs

• Utilizes cutting edge routing technologies to provide far greater fault tolerance and usable traffic engineering.

• Implemented via advanced BGP techniques: Communities and Confederations.

                        

                        

How the Old Way worked… (Full Mesh iBGP)

• Every router must be fully meshed with all others.
• Works well in small systems
• Session count grows quadratically: n routers need n(n-1)/2 iBGP sessions
• Eventually consumes all CPU, memory, and engineering resources.

[Figure: Full iBGP Mesh, showing session growth as routers are added]
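
To put numbers on the scaling problem, here is a minimal sketch of the session count for a full iBGP mesh. A mesh of n routers needs n(n-1)/2 sessions, which is quadratic growth. The function name is illustrative and not from the slides.

```python
def full_mesh_sessions(routers: int) -> int:
    """Number of iBGP sessions needed to fully mesh `routers` routers."""
    if routers < 0:
        raise ValueError("router count cannot be negative")
    # Every pair of routers needs exactly one session.
    return routers * (routers - 1) // 2


for n in (10, 50, 100):
    print(n, full_mesh_sessions(n))
```

Going from 10 to 100 routers multiplies the router count by 10 but the session count by 110 (45 to 4950).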

                        

                        

How the Old Way worked… (Route Reflectors)

• Scaled Well
• Well suited to fully meshed ATM Networks – Star Topology.

but...

• Not Survivable in a Fiber Ring Network.

[Figure: Peer Isolation with BGP Route Reflection; labels: Peers, RR Server, RR Client]

                        

                        

How the Old Way worked…(Filtering)

• List of IP Prefixes and/or AS numbers set on all border routers to other ISPs. Only the access-list contents would be advertised.

• Worked well when most customers were single-homed and didn’t run BGP.

• Changes were VERY manpower intensive.

• With multi-homed e-commerce shops, no longer feasible.

                        

                        

How the New Way works… (Confederations)

• Routers peer with neighbors
• Highly Survivable
• Very Scalable
• Easily Configured
• Aids Troubleshooting

[Figure: BGP Confederations; labels: Peers, Routers Peer with Neighbors]

                        

                        

Confederation Overview

• BGP allows three types of peer relationships:
  – iBGP (Full iBGP mesh)
  – eBGP (External Peering or Transit)
  – Confederation eBGP (it’s iBGP with an eBGP look!)
• Confederation eBGP is like regular eBGP, except:
  – Next Hop, Local Preference and MEDs are preserved
  – Confederation elements in the AS-PATH are not counted for route selection purposes
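
The last point can be sketched in a few lines. This assumes the Cisco-style display in which confederation segments appear in parentheses, for example "(65012 65000)". The function name is hypothetical, and AS sets in braces are not handled.

```python
import re


def effective_as_path_length(as_path: str) -> int:
    """Count AS hops for route selection, skipping confederation segments.

    Assumes confederation sub-AS numbers are shown in parentheses.
    """
    # Drop every parenthesized confederation segment, then count what is left.
    without_confed = re.sub(r"\([^)]*\)", " ", as_path)
    return len(without_confed.split())


print(effective_as_path_length("(65400 65012 65000) 701 3703"))
```

A route that crossed three POPs inside the confederation therefore compares equal, on AS-path length, to the same route learned in the local POP.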

                        

                        

Confederation Overview

• Confederations allow groups of routers to form “sub-autonomous systems” to eliminate scaling problems with full mesh iBGP

• All Routers within a sub-AS must be fully meshed (or optionally in a route reflector cluster configuration)

• Confederations are most advantageous when there are few routers per sub-AS. There is no reason to limit the number of sub-AS’s you have – nothing is gained.

                        

                        

Confederation Overview

• Most confederation designs start out with only two or three sub-ASes. This offers few advantages over full mesh iBGP in a ring network topology.

• The more sub-ASes you add, the greater the advantage

• The final result: One sub-AS per POP

• The upper limit is roughly 1000 sub-ASes if they are numbered from the private AS range (64512-65535, per RFC 1930)

                        

                        

The Advantages of a Confederation of POPs

• The routers within each POP need only peer with each other, utilizing iBGP

• Neighboring POPs are peered with via POP border routers speaking confederation eBGP

• Next Hop, Local Pref and MEDs are preserved
• More survivable than Route Reflectors
• Far more scalable than full iBGP mesh

                        

                        

How to Make It Work

• Thoughtful use of sub-AS numbers
• Local Preference Hierarchy
• Useful and Descriptive Community Strings
• Meaningful MEDs
• Use of various policies (via access lists, community lists, etc.) as building blocks
• Use of Peer Groups whenever implementation allows.

                        

                        

Sub-AS Assignment

• Sub-AS’s become useful tools for debugging – show ip bgp, show route

• Suggested assignment is geographical
• Always remember to keep room for expansion!
• Put plenty of extra sub-AS’s in your configs – don’t count on adding them later!

                        

                        

Geographical Region as sub-AS

• Southeast 65000-65099
• Northeast 65100-65199
• Northcentral 65200-65299
• Southcentral 65300-65399
• Western 65400-65499
• Canadian 65500-65535
• Latin/South American 64512-64599
• European 64600-64699
• Asian 64700-64799
• Reserved 64800-64999
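
A geographical plan like this makes a sub-AS number in `show ip bgp` output readable at a glance. As a rough illustration, the table can be written as a lookup. The ranges are copied from the slide; the dictionary and function names are made up for this sketch.

```python
# Sub-AS ranges per region, as listed on the slide (inclusive bounds).
SUB_AS_RANGES = {
    "Southeast": (65000, 65099),
    "Northeast": (65100, 65199),
    "Northcentral": (65200, 65299),
    "Southcentral": (65300, 65399),
    "Western": (65400, 65499),
    "Canadian": (65500, 65535),
    "Latin/South American": (64512, 64599),
    "European": (64600, 64699),
    "Asian": (64700, 64799),
    "Reserved": (64800, 64999),
}


def region_for_sub_as(sub_as: int) -> str:
    """Return the region whose range contains the given sub-AS number."""
    for region, (low, high) in SUB_AS_RANGES.items():
        if low <= sub_as <= high:
            return region
    raise ValueError(f"sub-AS {sub_as} is outside the assignment plan")


print(region_for_sub_as(65407))
```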

                        

                        

Sample Community Assignments

msp

4xT1354

65407 OAK
65405 SEA
65100 NYC
65200 CHI
65406 DEN
65300 DAL
65000 ATL
65401 PHX
65400 LAX
65101 DC
65005 RTP
65203 CLE
65102 BOS

                        

                        

Community Strings are the Key

• Communities are “tags” or “post-it notes” attached to routes to help identify them.
  – There can be more than one community attached to a route.
• Communities are recommended to be set at the ingress point.
  – Communities need be applied only once, so administrative burden and complexity are greatly reduced.
• When routes egress, filtering can be based on one or more community strings.
• Sample Communities – Regional, by Peer, Customer, Internal, Transit

                        

                        

Communities Set at Ingress

[Figure: AS701 provides transit to AS4355]

Routes originated in AS4355:
207.69.0.0/16 i
198.99.146.0/24 i

router bgp 4355
network 207.69.0.0/16 route-map make-green
network 199.174.166.0/24 route-map make-red

Routes originated in AS701:
4.0.0.0/8 i
5.0.0.0/8 i

router bgp 4355
neighbor a.a.a.a remote-as 701
neighbor a.a.a.a route-map make-blue in

Routes as learned by AS4355 from AS701:
4.0.0.0/8 701 i
5.0.0.0/8 701 i

                        

                        

Communities Used to Filter on Egress

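[Figure: AS701 (transit), AS4355, AS3703 (customer)]

Routes originated in AS4355:
207.69.0.0/16 i
198.99.146.0/24 i

Routes originated in AS701:
4.0.0.0/8 i
5.0.0.0/8 i

Routes as learned by AS4355 from AS701:
4.0.0.0/8 701 i
5.0.0.0/8 701 i

router bgp 4355
neighbor b.b.b.b remote-as 3703
neighbor b.b.b.b route-map blue-green out

Routes as received by AS3703:
4.0.0.0/8 4355 701 i
5.0.0.0/8 4355 701 i
207.69.0.0/16 4355 i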
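
The tag-at-ingress, filter-at-egress pattern in these two slides can be modeled in a few lines. This is only a sketch of the idea, not router behavior. The color names follow the route-map names on the slides (make-green, make-red, make-blue, blue-green), the red prefix is taken from the `network` statement, and the `Route` class and helper functions are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    prefix: str
    as_path: tuple = ()
    communities: set = field(default_factory=set)


def tag(route, community):
    """Attach a community once, at the point where the route enters the network."""
    route.communities.add(community)
    return route


def egress_filter(routes, allowed):
    """Keep only routes carrying at least one of the allowed communities."""
    return [r for r in routes if r.communities & allowed]


rib = [
    tag(Route("207.69.0.0/16"), "green"),      # originated locally
    tag(Route("199.174.166.0/24"), "red"),     # originated locally, not for export
    tag(Route("4.0.0.0/8", (701,)), "blue"),   # learned from transit
    tag(Route("5.0.0.0/8", (701,)), "blue"),   # learned from transit
]

# Equivalent in spirit to "route-map blue-green out" toward the customer.
to_customer = egress_filter(rib, {"blue", "green"})
print([r.prefix for r in to_customer])
```

The egress policy never mentions a prefix. Adding a new route only requires tagging it correctly at ingress.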

                        

                        

Community Categories – Route Type

• Customer Routes 4006:65150
• Private Peering 4006:65140
• Transit 4006:65130
• Public Peering 4006:65120
• Internal Routes (OPN-visible) 4006:65110
• Internal Routes (Global-visible) 4006:65100

                        

                        

Other People’s Networks (OPNs)

• To expand our national coverage, MindSpring utilized third-party networks’ dialup facilities. These networks are what we term OPNs.
• Prefixes for Core Services which we want restricted to MindSpring customers and not visible to the rest of the world (e.g. news, radius, smtp) are announced to our OPNs alone.
  – This has the added advantage of protecting against abuse of our services by non-customers.

• With communities, we can tag routes for export to OPNs alone.

                        

                        

Community Categories – Route Ingress Location

• Field Peering 4006:65020
• Exchange Point Peer 4006:65010
• Northeast Region Peering (DC) 4006:65030
• Southeast Region Peering (Atlanta) 4006:65040
• Northcentral Region Peering (Chicago) 4006:65050
• West Peering Region (Palo Alto) 4006:65060
• Southcentral Region Peering (Dallas) 4006:65070

                        

                        

Community Categories – Specials

• No-Export: No Export to any external BGP peer
• No-Advertise: Do Not Advertise to any peer (Well Known)
• Prefer-Me (65535:65519): Always Prefer (proposed Well Known)
• Avoid-Me (65535:65504): Always Avoid (proposed Well Known)

                        

                        

Community Categories – Origin AS

Also add a community string for the origin AS

• If the route comes from UUNet, then add 4006:701
• If the route comes from Sprint, then add 4006:1239

                        

                        

Local Preference

[Figure: AS4355 learns 165.200.1.0/24, originated by AS3703, over three paths: directly from AS3703 (customer), via AS4006 (peering), and via AS701 (transit)]

router bgp 4355
neighbor a.a.a.a remote-as 3703
neighbor a.a.a.a route-map setlocpref100 in

router bgp 4355
neighbor b.b.b.b remote-as 4006
neighbor b.b.b.b route-map setlocpref90 in

router bgp 4355
neighbor c.c.c.c remote-as 701
neighbor c.c.c.c route-map setlocpref60 in

Routes as seen in AS4355 (prefix, local preference, AS path):
165.200.1.0/24 100 3703 i
165.200.1.0/24 90 4006 3703 i
165.200.1.0/24 60 701 3703 i

Other route labels in the figure:
165.200.1.0/24 i
165.200.1.0/24 1 3703 i
165.200.1.0/24 1239 3703 i

                        

                        

Local Preference Hierarchy

• The higher the Local Preference, the more desirable the route.
• Customers ALWAYS come first – we never want to send their traffic to a peer, regardless of AS-Path padding
• Private Peering is always more desirable than Public Peering
• Transit is less desirable than private peering for economic reasons

                        

                        

Local Preference Hierarchy

• Always Preferred 250
• Customer Routes 100
• Customer Backup Routes 90
• Private Peering 80
• Less Preferred Private Peering (congested) 70
• Paid Transit 60
• Less Preferred Paid Transit (congested) 50
• Public Peering (ATM NAPs) 40
• Less Preferred Public Peering (FDDI NAPs) 30
• Never Preferred 1
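
To see why customers win "regardless of AS-Path padding", here is a simplified sketch of route selection using the hierarchy above. Local preference is compared first and AS-path length only breaks ties. Real BGP has further tie-breakers that are omitted here, and the dictionary keys and function name are invented for the example.

```python
# Local preference per route type, values taken from the hierarchy slide.
LOCAL_PREF = {
    "always-preferred": 250,
    "customer": 100,
    "customer-backup": 90,
    "private-peering": 80,
    "private-peering-congested": 70,
    "transit": 60,
    "transit-congested": 50,
    "public-peering-atm": 40,
    "public-peering-fddi": 30,
    "never-preferred": 1,
}


def best_route(candidates):
    """Pick the best (route_type, as_path) pair for a single prefix.

    Highest local preference wins; a shorter AS path breaks ties.
    """
    return max(candidates, key=lambda c: (LOCAL_PREF[c[0]], -len(c[1])))


candidates = [
    ("transit", (701, 3703)),
    ("private-peering", (4006, 3703)),
    ("customer", (3703, 3703, 3703, 3703)),  # customer pads its own AS
]
print(best_route(candidates))
```

The padded customer path is the longest of the three, yet it is still selected because local preference is evaluated before AS-path length.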

                        

                        

Peer Types

• Local sub-AS Peer (within a POP)
• Confederation Peers (other POPs or sub-ASes)
• Transit Peers (we buy transit from them)
• Public/Private Peering
• Customer Peers

                        

                        

Local sub-AS Peers

• All peers within a POP are members of this group.
• The update source for these BGP sessions will be the loopback address of the router.
• Communities must be recognized.
• Option to use full-mesh or route-reflectors.

For Each Local Sub-AS Peer
neighbor <neigh-ip A> remote-as <neighbor-as A>
neighbor <neigh-ip A> description otherlocalroutername
neighbor <neigh-ip A> update-source loopback0
neighbor <neigh-ip A> send-community
neighbor <neigh-ip A> version 4

                        

                        

Update-Source Loopback Address

• The routers will use the loopback address as the source of the BGP packets.
  – Only one session needs to be created even with multiple paths between routers.
• Peering between loopback addresses increases the stability of the BGP sessions, since loopback addresses don’t go down.

[Figure: two routers joined by two links (207.69.132.1/24 to 207.69.132.2/24 and 207.69.133.1/24 to 207.69.133.2/24), with loopbacks 192.168.128.1/32 and 192.168.128.2/32]

                        

                        

Confederation Peers

• All peers that are POP border routers are members of this group.
• The update source for these BGP sessions will be the facing interface of the router.
• Inbound Soft Reconfiguration is not necessary.
  – Outbound soft reconfiguration can be done at the remote end
• Communities must be recognized.
• Filtering is done on egress, MEDs are set on ingress.

                        

                        

Soft Reconfiguration

• “clear ip bgp” drops the TCP session. Soft reconfiguration is much friendlier.

• “clear ip bgp <neighbor-ip> soft out” issues withdrawals for all advertised routes, recomputes and then resends the routes (low cpu)

• “clear ip bgp <neighbor-ip> soft in” reevaluates routes received from its peers stored in memory. (high memory requirements)

                        

                        

Confederation Peer Configuration

Peer-Group
neighbor internal peer-group
neighbor internal version 4
neighbor internal send-community

For Each Peer
neighbor <neigh-ip A> remote-as <neighbor-as A>
neighbor <neigh-ip A> description remotesitename
neighbor <neigh-ip A> route-map <site>-recv-<remotesite> in
neighbor <neigh-ip A> route-map <site>-send-<remotesite> out
neighbor <neigh-ip A> peer-group internal

route-map <site>-recv-<remotesite> permit 10
set metric +<metric>

route-map <site>-send-<remotesite> permit 10
match community <send-all-except-no-advertise-routes>

                        

                        

Confederation Peer Routes

• Don’t Send: No Advertise
• Send: Customer, Peer, Transit, Internal

                        

                        

Additive MEDs

• Why
  – Allows a tiebreaker based on optimum routing
  – Allows an alternate method to de-prefer routes in case of transit/peering congestion
• Possible Values
  – Mileage
  – Delay in ms
  – Fixed value per hop
• Supported by
  – Cisco IOS
  – Feature Request in JUNOS, Riverstone, Foundry IronWare

                        

                        

Additive MEDs in Confederations

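[Figure: four POPs, ATL (65000), BHAM (65012), DAL (65400) and HOU (65401), with per-link metrics 120, 580, 600 and 40]

Routes for 207.69.0.0/16 (prefix, MED, confederation AS path):
207.69.0.0/16 0 (originated here)
207.69.0.0/16 120 (65000)
207.69.0.0/16 700 (65012 65000)
207.69.0.0/16 720 (65012 65000)
207.69.0.0/16 740 (65400 65012 65000)
207.69.0.0/16 760 (65401 65012 65000)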
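
The MED values in this figure can be reproduced by adding a per-link metric at every confederation hop. The sketch below assumes one reading of the figure: ATL-BHAM 120, BHAM-DAL 580, BHAM-HOU 600, DAL-HOU 40. That link assignment is inferred from the arithmetic rather than stated on the slide, and the names are illustrative.

```python
# Per-link metrics, as inferred from the figure (an assumption).
LINK_METRIC = {
    frozenset(("ATL", "BHAM")): 120,
    frozenset(("BHAM", "DAL")): 580,
    frozenset(("BHAM", "HOU")): 600,
    frozenset(("DAL", "HOU")): 40,
}


def med_along(pops):
    """MED seen at the last POP for a route originated at the first POP.

    Each receiving POP adds the metric of the link the route arrived on,
    which is what "set metric +<metric>" does on ingress.
    """
    med = 0
    for a, b in zip(pops, pops[1:]):
        med += LINK_METRIC[frozenset((a, b))]
    return med


for path in (["ATL"], ["ATL", "BHAM"], ["ATL", "BHAM", "DAL"],
             ["ATL", "BHAM", "HOU"], ["ATL", "BHAM", "DAL", "HOU"],
             ["ATL", "BHAM", "HOU", "DAL"]):
    print(" -> ".join(path), med_along(path))
```

With these metrics HOU prefers the direct path from BHAM (720) over the path through DAL (740), so the MED acts as a distance-based tiebreaker.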

                        

                        

Transit Peers

• The update source for these BGP sessions will be the facing interface address of the router.
• Soft Reconfiguration should be used.
• Communities must be recognized.
• Send out only customer and internal routes.
• Apply an import ACL to the routes that prevents reception of martian routes, and assigns proper communities and local preference.
• Allows prepending certain subsets of routes with additional AS numbers.

                        

                        

Transit Peer Config

neighbor <neighbor-ip> send-community
neighbor <neighbor-ip> version 4
neighbor <neighbor-ip> next-hop-self
neighbor <neighbor-ip> soft-reconfiguration inbound
neighbor <neighbor-ip> distribute-list martians in
neighbor <neighbor-ip> remote-as <neighbor-as C>
neighbor <neighbor-ip> route-map <site>-recv-<provider> in
neighbor <neighbor-ip> route-map <site>-send-<provider> out
neighbor <neighbor-ip> description transitprovidername

route-map <site>-send-<provider> deny 10
match community 4

route-map <site>-send-<provider> permit 20
match community 1
set as-path prepend 4006 4006

route-map <site>-recv-<provider> permit 10
set local-preference 60
set metric 0 (if you don’t want to listen to others’ MEDs)
set community 4006:30 additive
set community 4006:20 additive
set community 4006:500 additive
set community 4006:<AS#> additive

                        

                        

Transit Peer Config

• Don’t Send: No Exports, No Advertise, Peers or Transit
• Send: Customers, Internal

                        

                        

Transit Tricks

• De-prefer routes for congested outbound
  – Set Local Pref normally for routes with AS-Path Length=1 or 2
  – Set Local Pref lower for all other routes
  – Effect: Only the most direct routes flow through that connection. Others flow through other transit, if available
• OPNs and sending OPN routes
  – Send special routes, usually for servers and services, only to your own network and OPNs
  – Have a special community list or policy specifying the routes.
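
The first trick amounts to a one-line policy decision per route. A minimal sketch follows, assuming the values 60 (Paid Transit) and 50 (Less Preferred Paid Transit) from the Local Preference Hierarchy slide. The function and parameter names are hypothetical.

```python
def transit_local_pref(as_path, normal=60, lowered=50, max_direct_len=2):
    """Local preference for a route learned over a congested transit link.

    Routes whose AS path has 1 or 2 hops keep the normal value;
    everything else is de-preferred so it can drain to other transit.
    """
    return normal if len(as_path) <= max_direct_len else lowered


print(transit_local_pref((701,)))             # provider's own route
print(transit_local_pref((701, 3703)))        # provider's direct customer
print(transit_local_pref((701, 1239, 3703)))  # further away
```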

                        

                        

Private/Public Peers

• The update source for these BGP sessions will be the facing interface address of the router.
• Soft Reconfiguration should be used.
• Communities must be recognized.
• Send out only customer and internal routes.
• Apply an import ACL to the routes that prevents reception of martian routes, and assigns proper communities and local preference.
• Option to use local preference to prefer unconditionally all or only some routes coming from a free peer.

                        

                        

Peer Configuration
neighbor free-peering peer-group
neighbor free-peering send-community
neighbor free-peering version 4
neighbor free-peering next-hop-self
neighbor free-peering soft-reconfiguration inbound
neighbor free-peering distribute-list martians in
neighbor free-peering route-map <peername>-in in
neighbor free-peering route-map cust-routes out

route-map cust-routes deny 5
match community 4

route-map cust-routes permit 10
match community 1

route-map <peername>-in permit 10
set local-preference 80
set community 4006:30 additive
set community 4006:20 additive
set community 4006:700 additive
set community 4006:<AS#> additive

Per-Peer
neighbor <neighbor-ip> remote-as <neighbor-as D>
neighbor <neighbor-ip> peer-group free-peering
neighbor <neighbor-ip> description Peer Name

                        

                        

Free Peering Routes

• Don’t Send: No Exports, No Advertise, Peers or Transit
• Send: Customers, Internal

                        

                        

Customer Peers

• The update source for these BGP sessions will be the facing interface address of the router.
• Soft Reconfiguration should be used.
• Communities must be recognized. This includes communities sent from customers.
• Send out selected routes, based on customer request.
• Apply an import ACL to the routes that prevents reception of martian routes, and assigns proper communities and local preference.
• The import filter must also accept only specific customer routes.
  – We recommend using RtConfig to query RADB and generate the ACLs.

                        

                        

What Type of Routes Can We Send?

• Full Routes
  – Customer, Peers, Internals, Transit.
  – AKA “A Full View”
• Customer Routes
  – Customer and Internal Routes.
  – Good for weaker routers (Cisco)
  – AKA “A Partial View”
• Default Route
  – Send only a default route, 0.0.0.0/0, pointed to the router interface
  – Limited utility

                        

                        

Special Considerations for Customers

• Carefully Filter routes – the farther downstream you get, the less clueful (generally)

• Filtering can be based on AS or Prefix
• The generally accepted practice is to filter by IP Access List at ingress (use radb tools if possible)
• Customers do not have to advertise the same routes everywhere – peers do!
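
As an illustration of prefix-based ingress filtering, the following sketch accepts a customer announcement only if it exactly matches a registered prefix. The two prefixes are the ones used in the sample customer policy in this deck; the function names are hypothetical, and a production filter would be generated from registry data rather than typed by hand.

```python
import ipaddress


def build_exact_filter(allowed_prefixes):
    """Return a predicate that accepts only exact matches of the allowed prefixes."""
    allowed = {ipaddress.ip_network(p) for p in allowed_prefixes}

    def accept(prefix):
        # More-specifics and unrelated prefixes are both rejected.
        return ipaddress.ip_network(prefix) in allowed

    return accept


accept = build_exact_filter(["209.49.143.0/24", "199.5.0.0/16"])
for announced in ("199.5.0.0/16", "199.5.1.0/24", "10.0.0.0/8"):
    print(announced, "accept" if accept(announced) else "reject")
```

Exact matching means a customer cannot leak more-specific routes out of an allowed block, at the cost of needing a filter update whenever they deaggregate deliberately.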

                        

                        

Customer Configuration – Full Routes

bgp {
    group <location-customername> {
        type external;
        description <peer-name>;
        peer-as <neighbor AS #>;
        neighbor <ip address>;
        import <customername>-in;
    }
}
policy-options {
    policy-statement <customername>-in {
        term term1 {
            from policy <location-customername>;
            then {
                local-preference 100;
                nexthop self;
                community + customer;
                community + field;
                community + ATL;
                community + <customername>;
            }
        }
    }
    policy-statement atl-myco {
        from {
            route-filter 209.49.143.0/24 exact accept;
            route-filter 199.5.0.0/16 exact accept;
        }
        then reject;
    }
}

                        

                        

Customer Configuration – Partial Routes

bgp {
    group <location-customername> {
        type external;
        description <peer-name>;
        peer-as <neighbor AS #>;
        neighbor <ip address>;
        import <customername>-in;
        export custroutes;
    }
}
policy-options {
    policy-statement <customername>-in {
        term term1 {
            from policy <location-customername>;
            then {
                local-preference 100;
                nexthop self;
                community + customer;
                community + field;
                community + ATL;
                community + <customername>;
            }
        }
    }
    policy-statement atl-myco {
        from {
            route-filter 209.49.143.0/24 exact accept;
            route-filter 199.5.0.0/16 exact accept;
        }
        then reject;
    }
    policy-statement custroutes {
        term term1 {
            from community [no-export no-advertise];
            then reject;
        }
        term term2 {
            from community [internal customer custback];
            then accept;
        }
    }
}

                        

                        

Default Route Only

• Cisco – neighbor a.b.c.d default-originate
• Juniper – A little more complex...

bgp {
    group <location-customername> {
        type external;
        description <peer-name>;
        peer-as <neighbor AS #>;
        neighbor <ip address>;
        import <customername>-in;
        export default-originate;
    }
}
routing-options {
    static {
        route 0.0.0.0/0 {
            nexthop <loopback address>;
            no-install;
        }
    }
}
policy-options {
    policy-statement default-originate {
        from route-filter 0.0.0.0/0 exact;
        then {
            nexthop self;
            accept;
        }
    }
}

                        

                        

Question and Answer

• Confederations
• General BGP Questions

                        

                        

The New Way gives us…

• Less complexity
• More stability
• More flexibility for traffic management
• Greater Survivability
• Lower Engineering and Administrative costs.
• Increased Uptime
• A Scalable, Next Generation IP Network

                        

                        

Bibliography

• RFC 1771, A Border Gateway Protocol 4 (BGP-4)
• RFC 1965, Autonomous System Confederations for BGP
• RFC 1930, Guidelines for creation, selection, and registration of an Autonomous System (AS)
• RFC 1997, BGP Communities Attribute
• Nussbacher, Rudnev, and Hares, Global BGP Community Values, Internet Draft, 12/99
• Halabi, Bassam, Internet Routing Architectures
• Freedman, Avi, Lecture Notes: January 1999 NANOG Conference Session: “BGP 102”

                        

                        

In Tribute to the Memory of...

• MindSpring Enterprises, Inc.

Very Special Thanks to…

• Brandon Ross, NetRail
• Avi Freedman, Akamai
• Khalid Raza, Cisco