BGP: From JUNOS

  • BGP: Uses TCP port number 179.
  • Peer type: An external BGP (EBGP) session runs between two BGP routers in different AS networks, with TTL set to 1. An internal BGP (IBGP) session runs between two routers in the same AS, with TTL set to the default value of 64.
  • BGP states:
    • Idle: Initial state when we configure BGP
    • Connect: Waiting for TCP session to be completed
    • Active: local router is trying to initiate a TCP connection with peer.
    • OpenSent: Local router sends BGP open msg and waits for an open msg from peer.
    • OpenConfirm: Local router has received a valid open msg. It sends a KA msg to the peer and waits for a KA msg in return.
    • Established: when a KA msg is received from peer. Fully operational state.
  • Routing information base:
    • Adjacency-RIB-in: Created for each neighbor; stores received routes after import policy is applied. Routes whose AS path contains a loop are not stored.
    • Local-RIB: Best path for each destination, selected from the adjacency-RIB-in tables.
    • Adjacency-RIB-out: Local-RIB is copied to each neighbor's table after export policy is applied.
  • Making the ‘Next-hop’ of EBGP-learned routes reachable inside an AS. Possible actions on the border router:
    • Setting next-hop attribute – best practice
    • Using IGP passive interface
    • Advertising connected routes inside IGP using export policy
    • IGP adjacency with remote AS router – worst case
    • Using static routes.
  • Basic Configuration:
    • Configure AS number inside [edit routing-options].
    • Best practice is to have peer groups and configure neighbors. Eg: “set group ebgp-peer”
    • Configure each neighbor and peer-as (if EBGP). Eg: “set neighbor 1.1.1.1 peer-as 1”
    • Set the type of BGP session. Eg: “set type external”
    • For IBGP set the local address to loopback. Eg: “set local-address 2.2.2.2”. Similar to “update-source” in IOS.
  • Show commands:
    • show bgp summary – to view a summary of neighbors, session status and route counts, shown as <#Active>/<#Received>/<#Damped>
    • show bgp group – to view configured peer groups
    • show bgp neighbor – to view a neighbor in detail
    • show route receive-protocol bgp <address> – to view routes received from a neighbor
    • show route advertising-protocol bgp <address> – to view routes advertised to a neighbor
    • show route protocol bgp – to view BGP routes in the local routing table
    • The “next-hop self” option has to be configured manually via an export policy for IBGP peer groups. Eg: “from protocol bgp; then next-hop self”

BGP Message Types and Packet Format:

  • Common BGP header:
    • Marker – 16 octets – All bits set to 1; used to detect a loss of synchronization. In an open message with authentication configured, it carries the authentication data.
    • Length – 2 octets – Total length of BGP message. Possible values: 19 to 4096
    • Type – 1 octet – Identifies the type of BGP message:
      • 1 for open message
      • 2 for update message
      • 3 for Notification
      • 4 for Keepalive
      • 5 for route-refresh
  • Open Message:
    • The two peers negotiate the parameters of the peer session. Fields are:
    • Version – set to constant value of 4
    • Local AS – 2 octets – Sender's AS value
    • Hold time – 2 octets – hold-time value proposed by the sender. The lower of the two peers' values is used. Default value: 90 seconds
    • BGP Identifier – 4 octets – Local router ID
    • Optional Parameters Length – 1 octet – total length of following field.
    • Optional Parameters – Variable – contains parameter in (Type, length, value) format.
  • Notification message:
    • If a BGP peer detects an error in a session, it sends a notification message to the remote router and immediately closes the BGP and TCP sessions.
    • Error Code – 1 octet – six error codes have been defined
      • 1 – message header error
      • 2 – open message error
      • 3 – update message error
      • 4 – hold time expired
      • 5 – finite state machine error
      • 6 – cease
    • Error-sub code – 1 octet – more specific information about the error.
    • Data – variable – content depends on error code.
  • Keepalive message:
    • Has only the 19-octet message header and no other data.
    • Periodic KAs are exchanged to maintain the adjacency.
  • Update Message:
    • After the common BGP header, the fields below follow:
    • Unfeasible routes length – 2 octets – specifies the length of the withdrawn routes field
    • Withdrawn routes – variable – Each route is encoded as a (Length, Prefix) tuple. The 1-octet length field gives the number of bits in the network mask. The prefix field holds the IPv4 prefix.
    • Total path attributes length – 2 octets – length of the path attributes that follow
    • Path attributes – variable – each attribute is encoded in TLV format.
    • NLRI – variable – Each route is encoded as a (Length, Prefix) tuple. The 1-octet length field gives the number of bits in the network mask. The prefix field holds the IPv4 prefix.
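As a rough illustration of the common header described above, this Python sketch unpacks the 19-octet header and checks the marker, length and type fields. The names `parse_bgp_header` and `MESSAGE_TYPES` are made up for this example, and it assumes a session without authentication, so the marker is all ones.

```python
import struct

BGP_HEADER_LEN = 19
# Type codes as listed in the notes above.
MESSAGE_TYPES = {
    1: "OPEN",
    2: "UPDATE",
    3: "NOTIFICATION",
    4: "KEEPALIVE",
    5: "ROUTE-REFRESH",
}


def parse_bgp_header(data: bytes):
    """Return (length, type_name) from the first 19 octets of a BGP message."""
    if len(data) < BGP_HEADER_LEN:
        raise ValueError("need at least 19 octets")
    # 16-octet marker, 2-octet length, 1-octet type, network byte order.
    marker, length, msg_type = struct.unpack("!16sHB", data[:BGP_HEADER_LEN])
    if marker != b"\xff" * 16:  # assumption: no authentication data in the marker
        raise ValueError("marker is not all ones")
    if not 19 <= length <= 4096:
        raise ValueError("length outside 19..4096")
    if msg_type not in MESSAGE_TYPES:
        raise ValueError("unknown message type")
    return length, MESSAGE_TYPES[msg_type]


# A keepalive is just the header: length 19, type 4.
keepalive = b"\xff" * 16 + struct.pack("!HB", 19, 4)
print(parse_bgp_header(keepalive))
```

The keepalive built at the end is the smallest valid BGP message, which is why 19 is the lower bound of the length field.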

Path Attributes: (Attribute name – attribute code – Attribute type)

  • Origin – 1 – well known mandatory
  • AS path – 2 – well known mandatory
  • Next hop – 3 – well known mandatory
  • MED – 4 – optional nontransitive
  • Local preference – 5 – well known discretionary
  • Atomic aggregate – 6 – well known discretionary
  • Aggregator – 7 – optional transitive
  • Community – 8 – optional transitive
  • Originator ID – 9 – optional nontransitive
  • Cluster List – 10 – optional nontransitive
  • Multiprotocol reachable NLRI – 14 – optional nontransitive
  • Multiprotocol unreachable NLRI – 15 – optional nontransitive
  • Extended community – 16 – optional transitive
  • All well-known attributes are transitive
  • Well known: All BGP implementations must recognize these attributes
    • Mandatory – every update packet must carry this attribute
    • Discretionary – need not be present in every update packet
  • Optional: BGP implementations need not recognize these attributes
    • Transitive – A BGP router should forward this attribute to peers in other AS networks even if it does not understand it.
    • Nontransitive – The attribute should not be advertised beyond the local AS.
  • Each attribute is encoded as Type-length-value format.
    • Type – 2 octets
      • Optional Bit – bit 0 – set to 1 if optional. Set to 0 if well known.
      • Transitive bit – bit 1 – set to 1 if transitive.
      • Partial bit – bit 2 – value of 0 implies all BGP routers in path recognized this attribute
      • Extended Length bit – bit 3 – set the size of Attribute length field to 1 octet (if this bit set to 0) or 2 octets (if this bit set to 1).
      • Unused – bits 4 to 7 – set to 0
      • Type Code – bits 8 to 15 – Attribute code value.
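The flag bits above can be decoded with a few masks. This is a minimal Python sketch, assuming bit 0 is the most significant bit of the first octet (the convention used by the BGP specification); `decode_attr_type` is a hypothetical helper name.

```python
def decode_attr_type(flags: int, type_code: int) -> dict:
    """Decode the 2-octet attribute type: one flags octet plus one type-code octet."""
    return {
        "optional": bool(flags & 0x80),         # bit 0
        "transitive": bool(flags & 0x40),       # bit 1
        "partial": bool(flags & 0x20),          # bit 2
        "extended_length": bool(flags & 0x10),  # bit 3
        # The extended-length bit selects a 1- or 2-octet attribute length field.
        "length_octets": 2 if flags & 0x10 else 1,
        "type_code": type_code,
    }


# Community (code 8) is optional transitive, so flags 0xC0 would be typical.
print(decode_attr_type(0xC0, 8))
```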
  • Origin:
    • Value field is 1 octet: Origin of IGP has value 0, EGP has value 1 and unknown origin (incomplete) has value of 2. Lowest value is preferred.
  • As-path:
    • The value field is encoded as segments. Segment type – 1 octet – value of 1 implies ‘AS set’ and value of 2 implies ‘AS sequence’, which is the default.
    • Segment length – 1 octet – Number of AS numbers in the segment value (not the number of octets).
    • Segment value – contains the AS numbers.
  • Next-hop:
    • Value field is 4 octets and contains the next-hop IPv4 address.
  • MED:
    • MED is 4 octets. JUNOS interprets the absence of this attribute as 0.
  • Local Preference:
    • LP is 4 octets. Default value is 100, and it is present only in IBGP updates.
  • Atomic Aggregate:
    • No value field. Presence of this attribute implies the route has been aggregated and alerts other routers that packets may not traverse the AS networks listed in the path.
  • Aggregator:
    • Value field is 6 octets- AS number: router ID
    • This attribute is attached to a route when routing policy advertises an aggregate route into BGP.
  • Community:
    • Value field may contain many community values, each occupying 4 octets (AS number: locally defined value)
    • Well-known communities: ‘No-export’: routes can be advertised to the neighboring AS, but routers in that AS should not advertise them to any other AS. Community value: 0xFFFFFF01
    • ‘No-advertise’: routes can be advertised to the immediate peer, but that peer should not advertise them to any other router. Community value: 0xFFFFFF02
    • ‘No-Export-Subconfed’: Same as ‘no-export’, but AS is replaced by sub-AS. Community value: 0xFFFFFF03
  • Originator ID:
    • Value field contains 4-octet router ID of the router that announced the route to first Route-reflector (RR). This attribute is set by this first RR.
  • Cluster list:
    • Used in RR scenario to prevent routing loops. Each RR is assigned unique 32-bit value and prepends this value when it advertises a route.
  • Multi protocol Reachable NLRI:
    • To advertise routing knowledge other than IPv4 unicast.
    • The well-known ‘Next-hop’ attribute may be absent, as the prefix and next-hop information is encoded inside this attribute. It contains the following fields:
    • AFI – 2 octets – type of network layer information. 1- IPv4, 2-IPv6 and 196 – Layer 2 VPN
    • Sub AFI – 1 octet – 1- Unicast, 2- multicast and 128- labeled VPN Unicast.
    • Length of next-hop – 1 octet – length of following field
    • Network address of next-hop – variable – has next hop address.
    • NLRI – variable – a 1-octet length field followed by a variable-length prefix field.
  • Multi protocol unreachable NLRI:
    • Counterpart of MP-reach-NLRI. Has AFI, Sub AFI and withdrawn routes alone
  • Extended community:
    • It is encoded as an 8-octet value. The first 2 octets form the ‘Type’ field.
    • The first octet defines the sizes of the administrator value and the assigned number. 0x00 implies the administrator field is 2 octets (AS number) and the assigned number is 4 octets. 0x01 implies the administrator field is 4 octets (IPv4 address) and the assigned number is 2 octets.
    • The second octet gives the actual type of community. Defined values are: 0x02 for route-target and 0x03 for route-origin
    • We can configure an extended community as “set community <name> members target:65010:1111” or “origin:1.1.1.1:2222”
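To make the 8-octet extended community layout concrete, here is a small Python sketch that packs the two example values from the notes. The helper names are hypothetical, and the byte layout follows only what is described above (type octet, subtype octet, administrator, assigned number).

```python
import ipaddress
import struct

ROUTE_TARGET = 0x02
ROUTE_ORIGIN = 0x03


def encode_as_form(subtype: int, asn: int, assigned: int) -> bytes:
    """Type 0x00: 2-octet AS administrator, 4-octet assigned number."""
    return struct.pack("!BBHI", 0x00, subtype, asn, assigned)


def encode_ip_form(subtype: int, address: str, assigned: int) -> bytes:
    """Type 0x01: 4-octet IPv4 administrator, 2-octet assigned number."""
    packed_ip = ipaddress.IPv4Address(address).packed
    return struct.pack("!BB4sH", 0x01, subtype, packed_ip, assigned)


target = encode_as_form(ROUTE_TARGET, 65010, 1111)      # target:65010:1111
origin = encode_ip_form(ROUTE_ORIGIN, "1.1.1.1", 2222)  # origin:1.1.1.1:2222
print(target.hex(), origin.hex())
```

Both forms come out at exactly 8 octets; only the split between administrator and assigned number changes.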

BGP decision algorithm:

  1. Next-hop should be reachable in inet.0 table.
  2. Highest LP. Only step which prefers higher value than lower
  3. Shortest AS path
  4. Lower origin value
  5. Lower MED value (when routes from same neighboring AS)
  6. Routes from EBGP peer preferred over IBGP.
  7. Lower IGP metric
  8. Shortest cluster list length
  9. Lowest router-ID
  10. Lowest peer-ID (interface address)
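The ordered steps above can be sketched as a pairwise comparison in Python. This is a simplified model and not how JUNOS implements it: the `Path` fields and function names are invented for illustration, and MED is compared only when both paths come from the same neighboring AS.

```python
from dataclasses import dataclass


@dataclass
class Path:
    local_pref: int = 100
    as_path_len: int = 0
    origin: int = 0            # 0 = IGP, 1 = EGP, 2 = incomplete
    med: int = 0
    neighbor_as: int = 0
    is_ebgp: bool = True
    igp_metric: int = 0
    cluster_list_len: int = 0
    router_id: int = 0
    peer_id: int = 0
    next_hop_reachable: bool = True


# Each key is written so that the lower value wins; local preference is negated.
STEPS = [
    ("local_pref", lambda p: -p.local_pref),
    ("as_path", lambda p: p.as_path_len),
    ("origin", lambda p: p.origin),
    ("med", lambda p: p.med),
    ("ebgp_over_ibgp", lambda p: 0 if p.is_ebgp else 1),
    ("igp_metric", lambda p: p.igp_metric),
    ("cluster_list", lambda p: p.cluster_list_len),
    ("router_id", lambda p: p.router_id),
    ("peer_id", lambda p: p.peer_id),
]


def prefer(a: Path, b: Path):
    """Return (winner, deciding_step) for two usable paths."""
    for name, key in STEPS:
        if name == "med" and a.neighbor_as != b.neighbor_as:
            continue  # MED is skipped for paths from different neighboring AS networks
        if key(a) != key(b):
            return (a if key(a) < key(b) else b), name
    return a, "tie"


def best_path(paths):
    """Step 1: drop paths with an unreachable next hop, then compare pairwise."""
    usable = [p for p in paths if p.next_hop_reachable]
    if not usable:
        return None
    best = usable[0]
    for candidate in usable[1:]:
        best, _ = prefer(best, candidate)
    return best
```

Returning the deciding step alongside the winner mirrors the inactive reason that the router reports for the losing path.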
  • Router ID and peer ID check will be skipped if we configure “multipath” command.
  • When there is more than one link between EBGP peers, the best approach is to peer between loopback addresses. Configure the “multihop” and “local-address” commands in the EBGP group.
  • Juniper by default uses ‘per-prefix load balancing’
  • Graceful restart:
    • The restarting router sends an open msg with the ‘restart state’ bit set in the graceful restart capability. The helper router marks all routes as stale and, once the restarting router comes back online, sends all its routing knowledge followed by an ‘End-of-RIB’ marker.
    • The ‘End-of-RIB’ marker is an empty update message (23 octets) that tells the restarted router to start decision making on the received routes, for faster convergence.
    • Default ‘restart time’ is 120 seconds and the minimum is negotiated.
    • Default ‘stale-routes-time’ is 300 seconds and it is locally significant.
  • Authentication:
    • MD5 authentication can be enabled at global, group or neighbor level.
    • Each TCP segment is transmitted with 16-octet MD5 digest.
    • Command: “authentication-key xxxx”
  • Avoiding connection collision:
    • ‘passive’ command: when configured, the local router does not initiate the TCP session and waits for the neighbor router to start the TCP connection.
    • ‘allow <n/w range>’ command: Allows the TCP connection to be originated by any router whose peering address falls within the n/w range.
  • Prefix-limit:
    • Inside peer-group mode, we can configure “prefix-limit maximum 100” to specify the maximum number of prefixes. On its own, this only logs a message.
    • To tear down the session, use “teardown <% value> idle-timeout x | forever”
  • Route Damping:
    • Applies only to EBGP learned routes. Will not apply for IBGP-learned routes.
    • When damping is enabled using the ‘damping’ CLI statement inside a group, all prefixes are assigned a default figure of merit value of 0.
    • 1000 points are added when a route is withdrawn. 1000 points are added when a route is re-advertised. 500 points are added for any path attribute change. (A full flap therefore adds 2000 points.)
    • Suppression threshold: when a route reaches this value, it is suppressed. Default: 3000
    • Reuse threshold: when the value drops below this threshold, the route is reused. Default: 750
    • Decay timer: Controls the rate at which the value decreases. Default: 15 mins (possible: 1 to 45)
    • Max. suppression time: Regardless of value, routes are reused after this time. Default: 60 minutes
    • ‘show route damping history’ – to see withdrawn routes that have a value above 0.
    • ‘<Same> decayed’ – to see active routes that have a value above 0 but below the suppression threshold.
    • ‘<Same> suppressed’ – to see inactive routes that have a value above the suppression threshold.
    • “clear bgp damping” to clear the damping value and reuse the routes immediately.
    • We can also use routing policy by configuring a damping profile using ‘damping <name>’
    • Using routing policy, damping can be disabled for some prefixes and enabled for others.
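A short worked example helps with the damping numbers above. The Python sketch below assumes the decay timer acts as a half-life, meaning the figure of merit halves every 15 minutes; the function names are illustrative only.

```python
import math


def merit_after(merit: float, minutes: float, half_life: float = 15.0) -> float:
    """Figure of merit after exponential decay, assuming a half-life model."""
    return merit * 0.5 ** (minutes / half_life)


def minutes_until_reuse(merit: float, reuse: float = 750.0,
                        half_life: float = 15.0,
                        max_suppress: float = 60.0) -> float:
    """Minutes until the merit decays to the reuse threshold, capped by max suppression."""
    if merit <= reuse:
        return 0.0
    return min(half_life * math.log2(merit / reuse), max_suppress)


# A route sitting exactly at the suppression threshold (3000) needs two
# half-lives to fall to the reuse threshold (750).
print(merit_after(3000.0, 15.0), minutes_until_reuse(3000.0))
```

Under these assumptions, a route at 3000 points is reusable after 30 minutes, and very large values are cut off at the 60-minute maximum suppression time.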

Advanced BGP concepts:

  • Modifying ‘Origin’ attribute:
    • Default action: Routes injected via routing policy receive origin value of IGP (I).
    • Modify using route policy. Eg: [from protocol isis; then origin incomplete; accept]
  • Modifying ‘AS path’: Can be modified via configuration statements or via routing policy
    • ‘remove-private’ command checks for private AS numbers (64512 to 65534) in AS path.
      • Private AS is removed before default local AS prepend action.
      • Only checks most recent AS value in the path until global AS is located.
      • Hence, it will not remove the buried private AS numbers in mid of AS path
      • Typically used to remove private AS numbers assigned to their customer site.
    • ‘local-as  <value>’ command uses assigned value as AS number for EBGP peer
      • Local AS number along with global AS number will appear in AS path.
      • To avoid above, use ‘local-as xxxx private’ command.
      • Typically used when migrating from old AS number to new AS number.
    • ‘as-override’ command:
      • When local router finds peer AS number in AS path, it replaces that AS with its own AS value. Typically used while providing backbone service for BGP peers.
    • ‘autonomous-system xxxx loops <value>’ command:
      • The above command under routing-options mode lets the local router accept routes in which its own AS number appears in the AS path more than once.
    • The ‘as-path-prepend “AS1 AS2”’ command in the route policy action prepends the given AS numbers after performing the default own-AS prepend. It is recommended to prepend only your own AS.
    • The ‘as-path-expand last-as count <value>’ command in a policy action repeats the last AS number in the path ‘value’ times (max 32) before the default own-AS prepend.
  • Modifying ‘MED’ attribute:
    • By default, MED values are compared only when update comes from same neighbor AS.
    • JUNOS automatically groups updates from same AS to compare MED(deterministic way)
    • ‘path-selection always-compare-med’ command at BGP global level makes the router compare MED values from all neighboring AS networks.
    • ‘path-selection cisco-non-deterministic’ command makes the router evaluate routes in the order they were received. The most recent route is compared with the previous one.
    • To set MED manually, use configuration statements in external peer-group mode:
      • ‘metric-out xxx’ sets the MED value for all routes advertised to the EBGP peer.
      • ‘metric-out igp’ sets MED to the IGP cost. The value follows the IGP cost when it increases or decreases. To make it change only when the cost decreases, use ‘minimum-igp’.
    • To set MED via routing policy, use ‘metric <value>’ / ‘metric igp’ / ‘metric minimum-igp’.
  • Modifying “Local preference”:
    • Default value advertised to IBGP peer is 100.
    • Use ‘local-preference’ command in neighbor mode or policy action statement.
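The ‘remove-private’ behaviour described in this section can be sketched in Python as follows. This is a simplified model under the assumptions in the notes (leading private AS numbers are stripped until the first global AS is found, then the local AS is prepended); the function names are made up.

```python
PRIVATE_AS = range(64512, 65535)  # 64512 to 65534 inclusive


def remove_private(as_path):
    """Strip leading (most recent) private AS numbers; stop at the first global AS."""
    index = 0
    while index < len(as_path) and as_path[index] in PRIVATE_AS:
        index += 1
    return as_path[index:]


def advertise_to_ebgp(as_path, local_as):
    """Private AS removal happens before the default local-AS prepend."""
    return [local_as] + remove_private(as_path)


# 65003 is buried behind global AS 100, so it survives.
print(advertise_to_ebgp([65001, 65002, 100, 65003], local_as=1111))
```

The example shows why private AS numbers buried in the middle of the path are left alone: the scan stops as soon as it meets AS 100.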

IBGP scaling methods:

  • Reason for full-mesh IBGP peering: loop avoidance. BGP uses the AS path to detect loops, and since IBGP does not change the AS path attribute, the designers decided that IBGP-learned routes must not be advertised to another IBGP peer. For full reachability, all IBGP peers therefore need a full mesh of sessions.
  • ‘Route reflection’ and ‘confederation’ are two methods to avoid scalability issues.

Route reflection:

  • A router is designated as a ‘route reflector’ (RR), which reflects IBGP-learned routes from one client to another. An RR together with its clients is called a ‘cluster’.
  • New attributes added to avoid loop with above scenario:
    • Cluster ID: A 32-bit unique value assigned to an RR to represent a cluster in the network. The router ID of the RR is used (when there is one RR in the cluster) or a unique value (when there are many RRs).
    • Cluster list: Similar to AS path. Cluster IDs are prepended by the RR when it advertises a route.
    • Originator ID: The RR sets this value to the router ID of the client from which it first receives the route. This attribute, along with the cluster list, is used to check for loops.
  • When RR receives a route from EBGP peer, it advertises that route to all clients in the cluster and to all IBGP non-client peers without adding cluster-list and originator ID
  • When RR receives a route from IBGP client peer, it advertises that route to all other clients in cluster and to IBGP non-client peers with RR attributes added. Also advertised to all EBGP peers.
  • When RR receives a route from IBGP non-client peer, it advertises that route to all clients in cluster with RR attributes added and also to all EBGP peers without any RR attributes.
  • Only the best path selected in RR will be advertised in above all cases.
  • ‘Hierarchical RRs’ is employed to address scalability issue with IBGP sessions between RRs.
  • A cluster can have more than one RR for redundancy purpose.
  • Configuration:
    • A cluster ID is configured inside a peer group using ‘cluster <value>’ command. Router assumes all IBGP neighbors in the groups as route reflection clients.
    • Same concept when configuring hierarchical route reflection.
    • In case of two route reflectors in a cluster, same cluster ID is configured on both RRs.
  • How loops are avoided: when RR receives a route with its own cluster ID available in cluster-list, those routes are rejected.
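The cluster-list loop check and attribute handling can be sketched in Python. This models only the reflection of an IBGP-learned route, with routes represented as plain dictionaries; the field names (`cluster_list`, `originator_id`, `learned_from`) are assumptions made for the example.

```python
def reflect(route: dict, cluster_id: str):
    """Return the route as an RR would reflect it, or None if the loop check fails."""
    if cluster_id in route["cluster_list"]:
        return None  # our own cluster ID is already in the list: reject
    reflected = dict(route)
    # Prepend this RR's cluster ID, much like an AS number on the AS path.
    reflected["cluster_list"] = [cluster_id] + route["cluster_list"]
    # Only the first RR sets the originator ID; later RRs leave it untouched.
    if reflected.get("originator_id") is None:
        reflected["originator_id"] = route["learned_from"]
    return reflected


route = {"prefix": "10.0.0.0/8", "cluster_list": [],
         "originator_id": None, "learned_from": "1.1.1.1"}
once = reflect(route, "10.10.10.10")
print(once["cluster_list"], once["originator_id"])
print(reflect(once, "10.10.10.10"))  # same cluster sees its own ID again
```

Routes learned from EBGP peers are passed on without these attributes, so they would bypass this function in a fuller model.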

Confederations:

  • Global AS network is divided into network sections called ‘sub-AS’ or ‘member AS’.
  • Modified form of EBGP is established between sub-AS called confederation BGP (CBGP).
  • Few terms in confederation:
    • AS confederation: the collection of all local sub-AS networks, which the rest of the Internet sees as a single AS.
    • AS confederation number: Globally unique AS number assigned to us.
    • Member AS and member AS number: other name for sub-AS and local AS number assigned to a Sub-AS. Usually private AS numbers are used.
  • CBGP operates like EBGP except for the modifications below:
    • Member AS numbers are added to the ‘AS path’ attribute in one of two AS segments:
      • AS confederation sequence: ordered list of member-AS numbers; this is the default segment used by JUNOS software.
      • AS confederation set: unordered list of member-AS numbers, generated by route aggregation. Neither segment is counted when calculating AS path length.
    • Other attributes like NH, LP, MED are not modified or deleted by default.
  • Routes received can be advertised to all EBGP, all IBGP and all CBGP sessions.
  • When advertising routes to routers in a neighboring AS (EBGP), JUNOS by default removes all member-AS numbers from the AS path, prepends its own global AS number and removes all non-transitive attributes. The ‘remove-private’ command is not required at the border router.
  • RR concept can be used inside sub-AS to address IBGP scalability issue within sub-AS.
  • Configuration:
    • Below commands are used inside routing-options configuration mode:
      • ‘autonomous-system  <member AS number>’
      • ‘confederation  <global AS number> members [ all member-AS numbers]’
    • As CBGP sessions are established between loopback addresses, configure the ‘multihop’ and ‘local-address’ commands inside CBGP peer groups.
  • How loops are avoided: The router checks whether its own member-AS value is present in the AS confederation segments. If so, the route is rejected.
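The confederation rules above can be sketched in Python with the AS path modelled as a list of (segment type, AS list) pairs. Segment type codes 3 and 4 for the confederation segments come from the confederation RFC rather than from these notes, counting an AS set as a single hop is likewise an assumption taken from the BGP specification, and all function names are illustrative.

```python
AS_SET, AS_SEQUENCE = 1, 2
AS_CONFED_SEQUENCE, AS_CONFED_SET = 3, 4  # assumption: codes from the confederation RFC
CONFED_TYPES = (AS_CONFED_SEQUENCE, AS_CONFED_SET)


def path_length(segments):
    """Confederation segments do not count toward AS path length."""
    total = 0
    for seg_type, asns in segments:
        if seg_type == AS_SEQUENCE:
            total += len(asns)
        elif seg_type == AS_SET:
            total += 1  # assumption: an AS set counts as one hop
    return total


def confed_loop(segments, member_as):
    """Reject the route if our member AS already appears in a confederation segment."""
    return any(member_as in asns
               for seg_type, asns in segments if seg_type in CONFED_TYPES)


def export_to_ebgp(segments, global_as):
    """Drop confederation segments and prepend the global AS number."""
    kept = [(t, list(asns)) for t, asns in segments if t not in CONFED_TYPES]
    if kept and kept[0][0] == AS_SEQUENCE:
        kept[0] = (AS_SEQUENCE, [global_as] + kept[0][1])
    else:
        kept.insert(0, (AS_SEQUENCE, [global_as]))
    return kept


path = [(AS_CONFED_SEQUENCE, [65001, 65002]), (AS_SEQUENCE, [100, 200])]
print(path_length(path), confed_loop(path, 65001), export_to_ebgp(path, 1111))
```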

Multiprotocol BGP:

  • BGP protocol can be used to exchange routes other than IPv4, called Multi protocol BGP (MBGP).
  • MBGP abilities are advertised by sending a capability option in the BGP open message. Fields are:
    • Capability type – 1 octet – set to 0x01 to represent MBGP.
    • Capability length – 1 octet – set to 0x04 for all MBGP negotiations.
    • Address family identifier (AFI) – 2 octets – encodes the type of network layer information. Possible values are 1 – IPv4, 2 – IPv6 and 196 – Layer 2 VPN.
    • Reserved – 1 octet – set to 0x00
    • Subsequent address family identifier (SAFI) – 1 octet – subdivision within AFI. Possible values are 1 – unicast, 2 – multicast, 4 – labeled unicast, 128 – labeled VPN unicast, 129 – labeled VPN multicast.
  • Except for IPv4 unicast, all other network layer reachability information (NLRI) is advertised and withdrawn using the MP-Reach-NLRI and MP-Unreach-NLRI attributes.
  • IPv4 unicast:
    • Routes received for this AFI/SAFI are placed in inet.0 routing table.
    • No configuration is required as this is default mode. Use ‘show bgp sum’ to see table.
  • IPv4 multicast:
    • These routes are used for RPF checks. Routes will be placed in the inet.2 routing table.
    • Configure ‘family inet multicast’ command to advertise this MBGP support.
  • IPv4 labeled unicast routes:
    • Used to advertise MPLS label of IPv4 routes. Separate routing table within inet.0 is used.
    • Configure ‘family inet labeled-unicast’ command to advertise this MBGP support.
  • IPv4 labeled VPN unicast routes:
    • Used in layer-3 VPN to advertise extended community and MPLS label between end LSR.
    • Configure ‘family inet-vpn unicast’ command. Routes will be placed in bgp.l3vpn.0 table.
  • IPv4 labeled VPN multicast routes:
    • Similar to IPv4 multicast. Used for RPF checks on customer multicast traffic in an L3 VPN.
    • Configure ‘family inet-vpn multicast’ command. Routes placed in bgp.l3vpn.2 table.
  • Layer-2 VPN:
    • Customer routes are not exchanged with ISP as in L3 VPN. ISP provides only L2 logical circuit between customer end points. Uses 196/128 as AFI/SAFI values.
    • Configure ‘family l2vpn unicast’ command. Routes will be placed in bgp.l2vpn.0 table.
  • To advertise multiple address families, all ‘family’ commands are configured for a group.
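The MBGP capability fields listed in this section fit in six octets. The Python sketch below packs and unpacks them using the AFI/SAFI values given in the notes; the helper names are hypothetical.

```python
import struct

MP_CAPABILITY_TYPE = 0x01
MP_CAPABILITY_LENGTH = 0x04


def encode_mp_capability(afi: int, safi: int) -> bytes:
    """Capability type, length, 2-octet AFI, reserved octet, 1-octet SAFI."""
    return struct.pack("!BBHBB", MP_CAPABILITY_TYPE, MP_CAPABILITY_LENGTH,
                       afi, 0x00, safi)


def decode_mp_capability(data: bytes):
    """Return (afi, safi) after checking the type and length octets."""
    cap_type, cap_len, afi, _reserved, safi = struct.unpack("!BBHBB", data)
    if cap_type != MP_CAPABILITY_TYPE or cap_len != MP_CAPABILITY_LENGTH:
        raise ValueError("not an MBGP capability")
    return afi, safi


# One capability per configured family, e.g. IPv4 unicast plus labeled VPN unicast.
families = [(1, 1), (1, 128)]
encoded = [encode_mp_capability(afi, safi) for afi, safi in families]
print([c.hex() for c in encoded])
```

A group configured with several ‘family’ statements would carry one such capability per AFI/SAFI pair in its open message.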

Inactive Reasons:

  • Inactive reason: Not Best in its group – Route Metric or MED comparison
  • Inactive reason: Not Best in its group – Router ID
  • Inactive reason: Unusable path  <<< when next-hop is unreachable. Hidden route.