Rack of Ethernet switches.

How Routing Works

IP Routing Logic

How do networked hosts route traffic to its destination? For most hosts, routing comes down to knowing just enough to make it someone else's problem. Hosts do not have complete routing tables describing the entire Internet. They generally know just enough to distinguish between "directly connected", meaning "on the same LAN", and "somewhere else", which makes it some router's problem. Routers know a little more about the topology a few hops away.

DS3 interfaces on a Cisco 7000 series router.

Rack of Cisco routers. Most are 3600 series routers; the brown one near the bottom is a 2500 series. The network connections are on the opposite side; the single cable to each router is a connection to its console port.

Every host must apply the IP routing logic for every packet it transmits. Somewhat simplified, that logic is:

  1. Is the destination directly connected?
    • If so, use ARP to find the MAC address and deliver the frame directly across the attached network.
    • If not, continue...
  2. Do I have a host-specific route?
    • If so, do what that route says.
    • If not, continue...
  3. Do I have a network-specific route?
    • If so, do what that route says.
    • If not, continue...
  4. Do I have a default route?
    • If so, do what that route says.
    • If not, continue...
  5. The packet is unroutable! Report an error!
    • Send an ICMP "Destination Unreachable" datagram to the sending host.

That is the logic, but how does it happen?

Simple binary logic is applied to the destination IP address, the netmask, the host's own IP addresses, and the routing table entries.

  1. Is the destination directly connected?
    Is this true, for one of my network interfaces?
    ( my_ip AND netmask ) = ( destination_IP AND netmask )
    • If so, use ARP to find the MAC address and deliver the frame directly across the attached network.
      Send an ARP request, a broadcast frame on the local network that will not be forwarded by routers. It takes the form, "I am looking for the device using IP address such-and-such. Who has it?"
      Expect to receive an ARP reply, a response from the requested host, of the form "I have that IP address, and here is my MAC address". Then send the packet, encapsulated as:

      MAC: destination MAC address
      IP: destination IP address
    • If not, continue...
  2. Do I have a host-specific route?
    Is there a route to exactly that one IP address? This will be a route with a netmask of /32, or 255.255.255.255, meaning "All of the address bits must be as specified."
    • If so, do what that route says.
      That route will specify forwarding the packet through some directly-connected router. So, first you have to find that router's MAC address so you can send the frame across the LAN.
    • If not, continue...
  3. Do I have a network-specific route?
    Do I have a routing table entry where the following is true, where both route and netmask are the values from that entry?
    ( route AND netmask ) = ( destination_IP AND netmask )
    • If so, do what that route says.
      That route will specify forwarding the packet through some directly-connected router. So, first you have to find that router's MAC address so you can send the frame across the LAN.
    • If not, continue...
  4. Do I have a default route?
    • If so, do what that route says.
      That route will specify forwarding the packet through some directly-connected router. So, first you have to find that router's MAC address so you can send the frame across the LAN.
    • If not, continue...
  5. The packet is unroutable! Report an error!
    Send an ICMP message to the originating host, type "Destination Unreachable" and specifically "Network Unreachable".

    The formal definition says that this should not be done if the unroutable packet is itself ICMP, because that would be using ICMP to report errors about ICMP, and that process might spiral out of control. However, Microsoft's tracert does not work the way the original traceroute does: it uses ICMP packets with artificially small TTL values rather than UDP datagrams. So, many implementations will report errors for unroutable ICMP to support Microsoft's implementation.
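
The five steps above can be sketched in a few lines of Python. This is only an illustration of the decision order, not how any real IP stack is written. It handles IPv4 only, the function name next_hop is made up for this example, and the host and network routes in the sample table are hypothetical addresses chosen just to exercise each step.

```python
import ipaddress

def next_hop(dest, interfaces, routes):
    """Apply the five-step logic to one destination address.

    interfaces: list of (name, "address/prefix") for this host's own interfaces
    routes:     list of ("network/prefix", gateway_ip) routing table entries
    Returns ("direct", interface_name) or ("via", gateway_ip).
    """
    dest = ipaddress.ip_address(dest)

    # Step 1: directly connected? This is the same test as
    # (my_ip AND netmask) = (destination_IP AND netmask).
    for name, addr in interfaces:
        if dest in ipaddress.ip_interface(addr).network:
            return ("direct", name)

    table = [(ipaddress.ip_network(net), gw) for net, gw in routes]

    # Step 2: host-specific route, a /32 entry for exactly this address.
    for net, gw in table:
        if net.prefixlen == 32 and dest in net:
            return ("via", gw)

    # Step 3: network-specific route. This sketch takes the first match;
    # a real stack prefers the most specific matching entry.
    for net, gw in table:
        if 0 < net.prefixlen < 32 and dest in net:
            return ("via", gw)

    # Step 4: default route, 0.0.0.0/0.
    for net, gw in table:
        if net.prefixlen == 0:
            return ("via", gw)

    # Step 5: unroutable. A real host would report ICMP "Destination Unreachable".
    raise LookupError("Destination Unreachable: Network Unreachable")


# Hypothetical host: one interface, plus made-up host and network routes.
interfaces = [("eth0", "10.1.1.1/24")]
routes = [
    ("192.168.5.7/32", "10.1.1.253"),
    ("192.168.0.0/16", "10.1.1.252"),
    ("0.0.0.0/0", "10.1.1.254"),
]
for dest in ("10.1.1.2", "192.168.5.7", "192.168.9.9", "213.24.76.9"):
    print(dest, next_hop(dest, interfaces, routes))
```

Each of the four sample destinations is resolved by a different step, in order: direct delivery, the host route, the network route, and finally the default route.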

An Example of IP Routing Logic

Here is a simple network, the 10.1.1.0/24 network, meaning that all hosts have IP addresses starting 10.1.1. There are three hosts connected: host1, host2, and router, with their IP addresses shown. The router has a second interface, which will have an entirely different IP address belonging to a different network. We will deal with that later.

Routing diagram: two hosts and a router on a LAN

Below is the routing table for host1, as displayed on a Linux system:

Linux% netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
10.1.1.0        0.0.0.0         255.255.255.0   U         0 0          0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
0.0.0.0         10.1.1.254      0.0.0.0         UG        0 0          0 eth0

Linux reports a gateway of 0.0.0.0 if it should send the packet directly. It says that anything on the 10.1.1.0/24 network is sent through Ethernet interface eth0, and anything on the 127.0.0.0/8 network refers to the local host and is sent through software loopback, emulated as interface lo.

It reports the default route as a destination of 0/0, meaning "I don't care about any of the bits, anything else would match this."
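
To see how the Destination and Genmask columns combine into a network, here is a small sketch that feeds the three rows of the Linux table above to Python's standard ipaddress module. The parsing is deliberately naive and assumes exactly the eight-column layout shown; the function name parse_rows is made up for this example.

```python
import ipaddress

# The three data rows from the Linux routing table shown above.
NETSTAT_ROWS = """\
10.1.1.0        0.0.0.0         255.255.255.0   U         0 0          0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
0.0.0.0         10.1.1.254      0.0.0.0         UG        0 0          0 eth0
"""

def parse_rows(text):
    """Turn each row into (network, gateway, flags, interface)."""
    parsed = []
    for line in text.splitlines():
        dest, gateway, genmask, flags, *_unused, iface = line.split()
        # Destination plus Genmask describes one network, e.g. 10.1.1.0/24.
        network = ipaddress.ip_network(f"{dest}/{genmask}")
        parsed.append((network, gateway, flags, iface))
    return parsed

parsed_routes = parse_rows(NETSTAT_ROWS)
for network, gateway, flags, iface in parsed_routes:
    print(network, "prefix length", network.prefixlen, "gateway", gateway, iface)

# A prefix length of 0 means no bits are compared, so any address matches.
default_network = parsed_routes[2][0]
print(ipaddress.ip_address("213.24.76.9") in default_network)
```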

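As for the flags: U means the route is up, and G means the route goes through a gateway, so the packet is handed to that router rather than delivered directly.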

Here is the routing table as reported on a BSD system:

OpenBSD% netstat -nr -f inet
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use    Mtu  Interface
default            10.1.1.254         UGS         5    23902      -   sis0
10.1.1/24          link#1             UC          1        0      -   sis0
10.1.1.254         00:0d:61:b1:86:53  UHLc        3     4006      -   sis0
10.1.1.1           127.0.0.1          UGHS        0        0  33208   lo0
127/8              127.0.0.1          UGRS        0        0  33208   lo0
127.0.0.1          127.0.0.1          UH          2       82  33208   lo0

This uses the same logic with a slightly different format of presentation.

The BSD routing table is also showing the ARP cache for the default router.

Host-specific routes are also reported, including one that means "If anything tries to send a packet to 10.1.1.1, that's really me, so use software loopback (127.0.0.1)."

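As for the flags, it's the same as Linux with the addition of:

  • H: the route is to a single host rather than to a network.
  • S: the route is static, that is, it was added manually.
  • C: cloning, meaning new routes are generated from this one as it is used.
  • c: a cloned route, generated from a cloning route.
  • L: the entry carries a link-layer (MAC) address.
  • R: reject, meaning the destination is treated as unreachable.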

So, let's say that host1 wants to send a packet to host2. It resolves the hostname host2 to the IP address 10.1.1.2, and then applies the logic.

In decimal:
                            ?
10.1.1.1 AND 255.255.255.0  =  10.1.1.2 AND 255.255.255.0

In binary, the left side is:
     00001010 00000001 00000001 00000001
     11111111 11111111 11111111 00000000
AND ------------------------------------
     00001010 00000001 00000001 00000000

And the right side is:
     00001010 00000001 00000001 00000010
     11111111 11111111 11111111 00000000
AND ------------------------------------
     00001010 00000001 00000001 00000000 

So, yes, it's directly connected! host1 sends an ARP request for 10.1.1.2. Both host2 and router receive that ARP request; router ignores it, but host2 should respond.

Given the MAC address for the IP host 10.1.1.2, host1 can send the frame directly across the LAN. And, host1 will keep the MAC address for 10.1.1.2 in its ARP cache so it doesn't have to ask the question again until some time has elapsed.
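
The AND comparison worked out by hand above can be reproduced with integer arithmetic. This is just a sketch of the same calculation; the helper names masked, dotted_binary, and same_network are made up for this example.

```python
import ipaddress

def masked(ip, netmask):
    """Return (ip AND netmask) as a 32-bit integer."""
    return int(ipaddress.IPv4Address(ip)) & int(ipaddress.IPv4Address(netmask))

def dotted_binary(value):
    """Format a 32-bit integer as four space-separated binary octets."""
    bits = format(value, "032b")
    return " ".join(bits[i:i + 8] for i in range(0, 32, 8))

def same_network(my_ip, dest_ip, netmask):
    """True if both addresses fall in the same network under this netmask."""
    return masked(my_ip, netmask) == masked(dest_ip, netmask)

print(dotted_binary(masked("10.1.1.1", "255.255.255.0")))
print(dotted_binary(masked("10.1.1.2", "255.255.255.0")))
print(same_network("10.1.1.1", "10.1.1.2", "255.255.255.0"))
```

Both masked values print as 00001010 00000001 00000001 00000000, so the comparison is true and the destination is directly connected.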

The same logic would apply if host1 were sending a packet to router. But what if host1 wants to send a packet to, say, 213.24.76.9?

Is it directly connected? No!

In decimal:
                            ?
10.1.1.1 AND 255.255.255.0  =  213.24.76.9 AND 255.255.255.0

In binary, the left side is:
     00001010 00000001 00000001 00000001
     11111111 11111111 11111111 00000000
AND ------------------------------------
     00001010 00000001 00000001 00000000

And the right side is:
     11010101 00011000 01001100 00001001
     11111111 11111111 11111111 00000000
AND ------------------------------------
     11010101 00011000 01001100 00000000 

Is there a host-specific route, a routing table entry for 213.24.76.9/32?

No!

Is there a network-specific route, a routing table entry for, say, 213.24.76.0/24, or 213.24.0.0/16, or similar?

No!

Is there a default route?

Yes! Good news, no routing error. Do what that route specifies, which means making it the problem of the router. Routers tend to know more about network topology, and they tend to have default routes. When you get to the core of the Internet, the backbone routers, they have enormous routing tables because they have to know (at least generally) where everything is.
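
One way to look at the sequence of questions above is that they ask for the most specific matching entry: a /32 host route beats a network route, and anything beats the /0 default. Here is a minimal sketch of that "longest prefix wins" view, using the three entries from the Linux routing table shown earlier. The function name is made up for this example, and using None to stand for "deliver directly" is an assumption of this sketch.

```python
import ipaddress

def most_specific_route(dest, routes):
    """Return the matching (network, gateway) entry with the longest prefix, or None."""
    dest = ipaddress.ip_address(dest)
    candidates = []
    for prefix, gateway in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            candidates.append((network, gateway))
    if not candidates:
        return None
    return max(candidates, key=lambda entry: entry[0].prefixlen)

# The Linux routing table shown earlier. A gateway of None stands for
# "deliver directly", which Linux prints as 0.0.0.0.
table = [
    ("10.1.1.0/24", None),
    ("127.0.0.0/8", None),
    ("0.0.0.0/0", "10.1.1.254"),
]
print(most_specific_route("10.1.1.2", table))
print(most_specific_route("213.24.76.9", table))
```

For 213.24.76.9 the only entry that matches is the default route, so the packet goes to 10.1.1.254.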

Putting It All Together, End to End

Here is a more realistic situation, where host1 wants to send a packet to remote host2. I'm not showing the ARP packets; let's assume that all the hosts and routers have already discovered each other's MAC addresses:

Network diagram: 4 networks, 3 routers, 2 hosts.

The important thing:

The IP addresses are always global, the end points: from the original sender to the final destination.

The MAC addresses are always local: the sender and destination for that hop only.
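
To make the global versus local distinction concrete, here is a sketch that builds the frame seen on each of the four networks between the two hosts. Everything specific in it is hypothetical: the MAC addresses are symbolic labels rather than real addresses, and 192.0.2.20 is just a stand-in for the address of the remote host.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Frame:
    src_mac: str   # local: rewritten on every hop
    dst_mac: str   # local: rewritten on every hop
    src_ip: str    # global: the original sender
    dst_ip: str    # global: the final destination

def frames_along_path(src_ip, dst_ip, links):
    """links: one (sender_mac, receiver_mac) pair per network crossed, in order."""
    return [Frame(tx, rx, src_ip, dst_ip) for tx, rx in links]

# Four networks, three routers; each router has one interface per network.
links = [
    ("mac-host1", "mac-r1-a"),
    ("mac-r1-b", "mac-r2-a"),
    ("mac-r2-b", "mac-r3-a"),
    ("mac-r3-b", "mac-host2"),
]
frames = frames_along_path("10.1.1.1", "192.0.2.20", links)
for hop, frame in enumerate(frames, start=1):
    print(hop, frame)
```

The IP pair is identical in all four frames, while no MAC pair appears twice.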

NAT

If one of the routers is doing Network Address Translation (NAT), also called IP Masquerading, then at some point a lie is told about the original sender's IP address.
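
Here is a rough sketch of the "lie" a masquerading router tells on outbound packets. It assumes the common case where the router also rewrites the source port so that it can match replies back to the right inside host. The packet layout, the public address 198.51.100.7, and the port numbering scheme are all made up for this example.

```python
def masquerade(packet, public_ip, nat_table):
    """Return a copy of an outbound packet with its source rewritten.

    nat_table remembers (inside_ip, inside_port) -> public_port so the
    same flow always gets the same translation.
    """
    key = (packet["src_ip"], packet["src_port"])
    if key not in nat_table:
        # Hypothetical allocation scheme: hand out ports starting at 40000.
        nat_table[key] = 40000 + len(nat_table)
    return {**packet, "src_ip": public_ip, "src_port": nat_table[key]}

nat_table = {}
original = {"src_ip": "10.1.1.1", "src_port": 51515,
            "dst_ip": "213.24.76.9", "dst_port": 80}
translated = masquerade(original, "198.51.100.7", nat_table)
print(original)
print(translated)
```

The destination is untouched; only the claimed sender changes.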

Sanity Checking, Ingress Filtering, and Egress Filtering

Cisco 2514 router, Cisco 2912XL Catalyst switch, Cisco 4500 router

Sanity checking, ingress filtering, egress filtering on a border router.

The description above is all you need to get IP routing working. Back in the day, when the Internet was a far more cooperative place, that's all you needed. But things have changed and security mechanisms must be added.

A set of packet filtering rules, sometimes called an Access Control List (ACL), is a common security function performed by routers. Another common function is something that goes by three different names: Sanity Checking, Ingress Filtering, or Egress Filtering. Sanity Checking is the general term. The other terms refer to doing it just in one direction.

As explained above, all any host needs to consider for next-hop delivery is the destination IP address. The filtering added by Sanity Checking also looks at the source IP address, in order to drop those packets obviously using a spoofed source address in an attempt to make the destination host trust them. A router does not necessarily know the location of the claimed source IP address, but if it does, and if the topology makes no sense (or is insane) based on the interface where the packet arrives compared to the known network location, that packet is dropped.

Look at the simple border router diagram above. We want to set up sanity checking on our border router. Let's assume that we use the 44.0.0.0/8 IP block inside and we are not running NAT, just to keep this simple. The table below shows packets that do and do not make sense at our border router.

Arriving at interface | Source | Destination | Valid?
--------------------- | ------ | ----------- | ------
exterior | 44.*.*.* | anything | NO! It makes no sense for a packet from our internal networks to arrive from the outside world! This must be a spoofed packet, so drop it! To be specific, this is ingress filtering since we are applying it to inbound packets.
exterior | anything other than 44.*.*.* | 44.*.*.* | Who knows? Our border router doesn't know where everything is, but this isn't obviously a lie so accept and forward it.
interior | 44.*.*.* | anything other than 44.*.*.* | Yes! Of course, our border router doesn't know if the destination is really routeable or not, but let's try.
interior | anything other than 44.*.*.* | anything | NO! It makes no sense because all we have inside is the 44/8 network. Some host inside our organization is either horribly misconfigured or else it is trying an IP spoofing attack against some outside host. Drop the packet! To be specific, this is egress filtering since we are applying it to outbound packets. The Internet would be a much better place if every organization and ISP did egress filtering. If an ISP does it, then their customers cannot launch IP spoofing attacks against other networks.
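
The four rows of the table reduce to a test on the source address and the arriving interface. Here is a minimal sketch of that test, assuming the same 44.0.0.0/8 inside block; the function name and the "forward" and "drop" return values are made up for this example.

```python
import ipaddress

INSIDE = ipaddress.ip_network("44.0.0.0/8")

def sanity_check(arriving_interface, src):
    """Decide whether the border router should forward or drop a packet."""
    src_is_inside = ipaddress.ip_address(src) in INSIDE
    if arriving_interface == "exterior":
        # Ingress filtering: an inside source arriving from outside is spoofed.
        return "drop" if src_is_inside else "forward"
    if arriving_interface == "interior":
        # Egress filtering: only inside sources may leave.
        return "forward" if src_is_inside else "drop"
    raise ValueError(f"unknown interface: {arriving_interface}")

# One sample source per row of the table, in the same order.
print(sanity_check("exterior", "44.1.2.3"))
print(sanity_check("exterior", "213.24.76.9"))
print(sanity_check("interior", "44.1.2.3"))
print(sanity_check("interior", "213.24.76.9"))
```

Note that the destination address never enters into it; it matters only for the ordinary routing decision made afterward.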

Cisco calls sanity checking Unicast Reverse Path Forwarding or Unicast RPF. To enable it, you would do something like the following in the startup configuration:

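! Unicast RPF relies on CEF, so CEF must be enabled globally
ip cef
!
interface Serial 0
  ! Placeholder address and mask; substitute the values for your link
  ip address 1.2.3.4 255.255.255.255
  ! Drop packets whose source address fails the reverse-path check
  ip verify unicast reverse-path
  no ip redirects
  no ip directed-broadcast
  no ip proxy-arp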
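
Conceptually, a strict reverse-path check asks one question: if I had to send a packet back to this source address, would it leave through the interface the packet just arrived on? The sketch below illustrates that idea only; it is not Cisco's implementation. The interface names and the two-entry routing table are hypothetical, chosen to match the 44.0.0.0/8 example above.

```python
import ipaddress

def passes_reverse_path_check(src, arriving_interface, routes):
    """routes: list of ("network/prefix", outgoing_interface) entries."""
    src = ipaddress.ip_address(src)
    matches = []
    for prefix, interface in routes:
        network = ipaddress.ip_network(prefix)
        if src in network:
            matches.append((network.prefixlen, interface))
    if not matches:
        return False          # no route back to the source at all
    # The most specific route back to the source decides.
    _, best_interface = max(matches, key=lambda match: match[0])
    return best_interface == arriving_interface

# Hypothetical border router: inside network on Ethernet0, default via Serial0.
rpf_routes = [("44.0.0.0/8", "Ethernet0"), ("0.0.0.0/0", "Serial0")]
print(passes_reverse_path_check("44.1.2.3", "Serial0", rpf_routes))
print(passes_reverse_path_check("213.24.76.9", "Serial0", rpf_routes))
print(passes_reverse_path_check("44.1.2.3", "Ethernet0", rpf_routes))
print(passes_reverse_path_check("213.24.76.9", "Ethernet0", rpf_routes))
```

With this table, the check rejects exactly the two "NO!" rows of the sanity checking table, without anyone having to write the 44/8 rule by hand.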
