Over the last several years, we have seen a large push towards MPLS from carriers, and it has been added to the topics for the CCIE R/S.
When I started studying MPLS, I had a bit of trouble wrapping my head around it. I think the biggest reason for that is a basic misunderstanding: the MPLS that we are familiar with from carriers is actually MPLS VPN, so I was expecting that sort of behavior out of the underlying technology. This is kind of like trying to understand IP while expecting it to act like IPsec.
At a basic level, MPLS is a “Layer 2.5” encapsulation that is used primarily for forwarding IP packets, although there are other uses. It inserts a label (sometimes called a tag, especially in older versions of IOS) between the L2 and L3 headers.
In this Wireshark screenshot, we can see the MPLS label is sandwiched between the L2 (PPP, in this case) and L3 (IP) headers.
The label consists of a label number (16, in this case); the Experimental bits, now used for QoS purposes; the Bottom of Stack (BoS) bit; and a Time to Live (TTL). The BoS bit marks the last label when the packet has more than one label applied, as it does with MPLS VPN. The TTL functions like the TTL in IP, getting decremented at each hop.
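To make the layout concrete, here is a minimal Python sketch that packs and unpacks one label entry. It assumes the standard 32-bit layout from RFC 3032 (20-bit label, 3 Experimental bits, 1 BoS bit, 8-bit TTL). The function names and the TTL of 255 are my own choices for illustration, not values taken from the capture.

```python
import struct


def pack_label(label: int, exp: int, bos: bool, ttl: int) -> bytes:
    """Pack one 32-bit MPLS label entry: label(20) | exp(3) | bos(1) | ttl(8)."""
    if not 0 <= label < 2**20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= exp < 8:
        raise ValueError("exp must fit in 3 bits")
    if not 0 <= ttl < 256:
        raise ValueError("ttl must fit in 8 bits")
    word = (label << 12) | (exp << 9) | (int(bos) << 8) | ttl
    return struct.pack("!I", word)  # network byte order


def unpack_label(data: bytes) -> dict:
    """Split the first 4 bytes of data back into the four fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "exp": (word >> 9) & 0x7,
        "bos": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }


# Label 16 as in the screenshot; the TTL of 255 is an assumed value.
entry = pack_label(label=16, exp=0, bos=True, ttl=255)
print(entry.hex())          # 000101ff
print(unpack_label(entry))
```

Since the whole entry is a fixed four bytes, a router can read the label without parsing anything else in the packet.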
Without anything like MPLS VPN running over it, the process goes something like this:
- The routing protocol on the router populates the routing table.
- Label Switch Routers use LDP to advertise a label value for each Forwarding Equivalence Class (FEC), such as an IP route or a BGP next hop, to their neighbors.
- Each router builds its forwarding table (the CEF table on Cisco routers) based on the IP routing table and the label binding received from the next hop router.
- When the router receives a packet on an interface with a specific label, it looks up that interface/label combination, and finds the outgoing interface, label value, and L2 information, like the MAC address, of the next router. The router then swaps the incoming label for the outgoing label of the next router, encapsulates the frame, and forwards it.
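The last step in the list can be sketched in a few lines of Python. This is a toy model, not how IOS implements it: the interface names, label values, and MAC address are hypothetical, and the table is keyed on the interface/label pair as described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LfibEntry:
    out_interface: str
    out_label: int
    next_hop_mac: str


# Hypothetical table: (incoming interface, incoming label) -> forwarding info.
LFIB = {
    ("Gi0/0", 16): LfibEntry("Gi0/1", 17, "00:11:22:33:44:55"),
}


def label_switch(in_interface: str, label: int, ttl: int,
                 payload: bytes) -> Optional[dict]:
    """Swap the label and rebuild the frame; return None if the packet is dropped."""
    entry = LFIB.get((in_interface, label))
    if entry is None or ttl <= 1:
        return None  # unknown label, or TTL would reach zero
    return {
        "out_interface": entry.out_interface,
        "dst_mac": entry.next_hop_mac,
        "label": entry.out_label,   # swapped label
        "ttl": ttl - 1,             # decremented like an IP TTL
        "payload": payload,         # the IP packet itself is not examined
    }


frame = label_switch("Gi0/0", 16, 64, b"ip-packet")
print(frame)
```

Note that the function never looks at the IP destination; everything it needs comes from the exact-match lookup on the label.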
The preceding is a simplified explanation of the steps, and in this very basic scenario, it doesn’t seem like there is much benefit to MPLS. In fact, with the speed at which modern routers can route IP, it seems to add unnecessary overhead. Generally, MPLS is deployed for additional applications such as MPLS VPN, or for the way it handles BGP-learned prefixes; both are covered below.
BGP learned prefixes
While most routes are added to the Label Forwarding Information Base (LFIB) by IP prefix, BGP routes are added by next hop. With this behavior, all the interior routers in a BGP network need is a route and label mapping to the Provider Edge (PE) routers, not to the individual prefixes.
For example, take a regional service provider and follow a path from a customer to an upstream network. PE1 is the router between the regional carrier and the upstream network, PE2 is the connection to the customer, and P1 is a router internal to the carrier. PE1 and PE2 are running BGP, and set next-hop-self on the routes that they advertise, using their loopbacks. P1 is not running BGP; all three are running OSPF for routes internal to the provider network. PE1 advertises a tag for its loopback to P1 (in practice, it would normally send an implicit-null instead, which results in Penultimate Hop Popping), and since P1 has a route to PE1’s loopback, it advertises a tag value to PE2. When PE2 receives a packet routed via BGP to PE1, it uses the tag for PE1’s loopback. When P1 receives a frame from PE2, it does not have a route to the final destination, but knows from the incoming label that the packet should be sent to PE1. It swaps the incoming label for the one that PE1 expects on traffic bound to its loopback, and forwards the frame.
The benefit to this is that the internal routers do not have to participate in BGP, saving resources on the routers, and administrative overhead in maintaining peerings. One disadvantage is that if a frame becomes unlabeled, the internal routers are not able to forward it.
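A small Python model of the PE2 and P1 behavior shows both the benefit and the disadvantage. All addresses, prefixes, labels, and interface names are made up for illustration, and the route lookup is a simple first-match loop rather than a real longest-prefix match.

```python
import ipaddress

PE1_LOOPBACK = "10.0.0.1"  # hypothetical loopback address

# PE2 runs BGP: external prefixes resolve to the BGP next hop (PE1's loopback).
pe2_bgp = {ipaddress.ip_network("203.0.113.0/24"): PE1_LOOPBACK}
# PE2 learned label 20 from P1 for PE1's loopback.
pe2_labels = {PE1_LOOPBACK: 20}

# P1 has no BGP routes, only the IGP route to the loopback and a label entry.
p1_routes = {ipaddress.ip_network("10.0.0.1/32"): "to-PE1"}
p1_lfib = {20: ("to-PE1", 30)}  # in label -> (out interface, out label)


def pe2_impose(dst: str):
    """Label the packet with the label for the BGP next hop, not the prefix."""
    addr = ipaddress.ip_address(dst)
    for prefix, next_hop in pe2_bgp.items():
        if addr in prefix:
            return {"label": pe2_labels[next_hop], "dst": dst}
    return None


def p1_forward(packet: dict):
    """Forward by label if there is one, otherwise fall back to the IP table."""
    label = packet.get("label")
    if label is not None:
        if label not in p1_lfib:
            return None
        out_if, out_label = p1_lfib[label]
        return out_if, {"label": out_label, "dst": packet["dst"]}
    addr = ipaddress.ip_address(packet["dst"])
    for prefix, out_if in p1_routes.items():
        if addr in prefix:
            return out_if, packet
    return None  # no route: the packet is dropped


labeled = pe2_impose("203.0.113.9")
print(p1_forward(labeled))                  # forwarded towards PE1
print(p1_forward({"dst": "203.0.113.9"}))   # None: unlabeled, no route on P1
```

P1 carries one label entry for PE1's loopback no matter how many prefixes PE1 learns from upstream, which is where the resource savings come from.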
Penultimate hop popping (PHP)
In the example above, PE1 would need to do a label lookup to determine that the packet was intended for its address, and then an IP lookup for the packet. P1 has to do a label lookup and swap. An optimization is to have P1 simply pop the label, then forward the packet to PE1 as an IP packet. The work on P1 is basically the same as a lookup and swap, but the label lookup on PE1 is eliminated. This is referred to as Penultimate Hop Popping.
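Here is a rough sketch of the difference on P1, assuming PE1 signals PHP by advertising the reserved implicit-null label (value 3). The other label values and the interface name are hypothetical.

```python
IMPLICIT_NULL = 3  # reserved label value: "pop the label before sending to me"


def p1_switch(stack: list, lfib: dict):
    """Process the top label on P1: pop it for implicit-null, otherwise swap."""
    top, rest = stack[0], stack[1:]
    out_if, out_label = lfib[top]
    if out_label == IMPLICIT_NULL:
        return out_if, rest              # PHP: forward without the top label
    return out_if, [out_label] + rest    # ordinary swap


without_php = {20: ("to-PE1", 30)}
with_php = {20: ("to-PE1", IMPLICIT_NULL)}

print(p1_switch([20], without_php))  # ('to-PE1', [30])
print(p1_switch([20], with_php))     # ('to-PE1', [])
```

With the empty stack, PE1 receives a plain IP packet and goes straight to its IP lookup.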
MPLS VPN
MPLS VPN adds to the behavior for BGP by adding another label on the ingress router. BGP shares a label value between the two edge routers for the VPN. The VPN ingress router adds this to a packet bound for the remote VPN destination, then adds a second label for the path to the loopback of the egress router. As the packet is passed between the internal routers, only the “outer” label is swapped, until the packet arrives at the egress router, which pops (removes) the transport label, then looks up the VPN label, pops it, and forwards the packet to the VPN for that label.
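The two-label behavior can be sketched as a list with the outer (transport) label first. The label values and the VPN name are invented for the example, and the sketch follows the simplified description above, where the egress router still receives the transport label.

```python
def ingress_push(vpn_label: int, transport_label: int) -> list:
    """Ingress PE: VPN label goes on first (bottom), transport label on top."""
    return [transport_label, vpn_label]


def core_swap(stack: list, lfib: dict) -> list:
    """Core router: swap only the outer label; the VPN label is untouched."""
    return [lfib[stack[0]]] + stack[1:]


def egress_deliver(stack: list, vpn_table: dict) -> str:
    """Egress PE: pop the transport label, then use the VPN label to pick the VPN."""
    _transport, vpn_label = stack
    return vpn_table[vpn_label]


stack = ingress_push(vpn_label=100, transport_label=20)
print(stack)                                  # [20, 100]
stack = core_swap(stack, {20: 30})
print(stack)                                  # [30, 100]
print(egress_deliver(stack, {100: "CUSTOMER_A"}))
```

The core routers never need to know that label 100 exists, let alone which customer it belongs to.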
The descriptions here are meant to be a high level look at MPLS, and are missing details and some possible options. Look for more MPLS posts here, or check out MPLS and VPN Architectures or MPLS Fundamentals.