Understanding NAT and the Shift to IPv6

In the last section, we spent quite a bit of time talking about IPv4 addressing and the limitations of a 32-bit IPv4 address space. The two topics we're going to cover in this section, network address translation and IPv6, have their origins in the mid to late 1990s, when the networking community really began to understand that the exhaustion of the IPv4 address space was going to be an important concern. As we'll see, both network address translation and IPv6 have a number of other important advantages as well and are seeing increasingly widespread deployment. So let's take a look.

Well, let's start with Network Address Translation, known as NAT. And here's how NAT works. The idea is pretty simple.

Within a local area network (a home network, an institutional network, an internet cafe), all of the devices within that network have an IP address from a special range of addresses known as private IP addresses. The range shown here is an example of such an address range.

Now, datagrams that are exchanged between hosts within this network use these addresses to exchange data, as always. There is no NAT involved. But what about communication outside of this local area network?

That's where NAT comes into play. In particular, all datagrams that are sent to hosts outside of the network from any of the hosts inside this network, and there could be tens, hundreds, or thousands of such devices, will all use the same single 32-bit IP address. So for this example here, although all hosts have their own private IP addresses, datagrams leaving the network all carry the same source IP address. The router that performs this translation is known as a NAT router, or sometimes just as a NAT box.

So for NAT, all devices in the local network have a 32-bit address that comes from one of three what are called private IP address ranges that you see here. If you've looked at the IP address on your laptop, tablet, or computer when you're attached to a home network, an institutional network, or a cellular network, you've maybe seen that it has an address in one of these ranges. Take a look.
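If you want to check an address yourself, here's a minimal sketch. The ranges aren't spelled out in this transcript, so the three used below are an assumption taken from RFC 1918, and the helper name `in_private_range` is just made up for illustration.

```python
import ipaddress

# Assumption: the three private IPv4 ranges are those reserved by RFC 1918.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def in_private_range(address: str) -> bool:
    """Return True if the IPv4 address falls in one of the three private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in PRIVATE_RANGES)


if __name__ == "__main__":
    # Sample addresses chosen only for illustration.
    for addr in ["10.0.0.1", "172.20.5.9", "192.168.1.23", "138.76.29.7"]:
        print(addr, in_private_range(addr))
```

Python's standard library also exposes an `is_private` attribute on address objects, but that covers more reserved ranges than just these three, which is why the sketch lists them explicitly.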

And NAT has a number of advantages. Only a single 32-bit source IP address will be used for all datagrams coming from all hosts behind the NAT router.

One can change the addresses of hosts within the local network without having to notify the outside world since they're all drawn from this private address range. The network can change ISPs without having to change addresses of devices in the local area network, and there are security benefits as well. Devices inside the local network are not directly addressable by or visible to the outside world.

So let's see how NAT is implemented. Well, there's three things a NATed router will need to do. First, for the outgoing datagrams, the NATed router will need to replace the source IP address and port number for every outgoing datagram with the NAT IP address and a new source port number.

It's important to know that NAT is transparent both to the local hosts and to the remote hosts. All a remote host is going to see is an arriving datagram. It's got an IP address and a port number, as usual, and the remote host is going to respond using that IP address and port number, as usual.

Second, the NAT router is going to need to remember every translation pair, mapping the local source IP address and local source port number to the NAT IP address and a new source port number. This translation will be stored in a NAT translation table. And third, when datagrams arrive from the external internet and are destined for a host within the local network, the NATed router will need to replace the destination IP address and port number for every incoming datagram with the corresponding IP address and source port number stored in the NAT table.

This will all become clear if we take a look at NAT in action. In this example, the local network addresses are in a private address range, and the NAT address used by the router is 138.76.29.7. We see the NAT translation table here, which is initially empty.

In step 1, a local host sends a datagram with source port 3345 to destination IP address 128.119.40.186, port 80. It's a web server. The datagram reaches the NAT router, which then changes the datagram's source IP address and source port from the host's private address and port 3345 to IP address 138.76.29.7 and source port number 5001, and updates the NAT table accordingly, as we see here. Note that the destination address and port number are both unchanged in the outgoing datagram. In step 3, the remote host replies.

Note that the reply arrives with a destination address of 138.76.29.7, that's the NAT IP address, and destination port 5001. So the NAT router is now going to have to perform the inverse mapping. When the datagram arrives, the router indexes into the NAT table using the destination IP address and destination port number to obtain the local IP address and the local destination port number 3345 for the host and the process on that host in the home network. The router then rewrites the datagram's destination address and destination port number and in step four forwards the datagram into the home network.

Well, let's wrap up our study of NAT by just noting that, at least initially, NAT was pretty controversial. We have a network layer device mucking around with port numbers, which are really an end host issue. And a purist might say, well, you know, if you really want to solve the IPv4 address space depletion problem, then let's do it with IPv6. After all, that's why IPv6 was even developed in the first place.

And there are complications created by NAT. What if an external host wants to initiate contact to a host behind a NATed router? This problem is known as NAT traversal. It can be done, but honestly, it's a pretty ugly hack.
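The three NAT router tasks and the example we walked through can be sketched in a few lines. This is a simplified illustration, not how any real router is implemented: the `NatRouter` and `Datagram` names are made up, the table is keyed only on address and port, and the local host address 10.0.0.1 in the usage is an assumption, since the transcript doesn't give it.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port number)


@dataclass
class Datagram:
    src: Endpoint
    dst: Endpoint


class NatRouter:
    """Toy NAT router: one public address, one translation table."""

    def __init__(self, public_ip: str, first_port: int = 5001) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map: Dict[Endpoint, int] = {}  # local (ip, port) -> NAT port
        self.in_map: Dict[int, Endpoint] = {}   # NAT port -> local (ip, port)

    def outgoing(self, d: Datagram) -> Datagram:
        """Tasks 1 and 2: rewrite the source and remember the translation."""
        port = self.out_map.get(d.src)
        if port is None:
            port = self.next_port
            self.next_port += 1
            self.out_map[d.src] = port
            self.in_map[port] = d.src
        # The destination address and port are left unchanged.
        return Datagram(src=(self.public_ip, port), dst=d.dst)

    def incoming(self, d: Datagram) -> Optional[Datagram]:
        """Task 3: rewrite the destination using the table, if an entry exists."""
        ip, port = d.dst
        local = self.in_map.get(port) if ip == self.public_ip else None
        if local is None:
            return None  # no table entry: nothing to translate to
        return Datagram(src=d.src, dst=local)


if __name__ == "__main__":
    nat = NatRouter("138.76.29.7")
    # 10.0.0.1 is an assumed private address for the local host.
    out = nat.outgoing(Datagram(("10.0.0.1", 3345), ("128.119.40.186", 80)))
    print(out)
    print(nat.incoming(Datagram(("128.119.40.186", 80), out.src)))
    # An external host initiating contact finds no entry in the table.
    print(nat.incoming(Datagram(("128.119.40.186", 80), ("138.76.29.7", 6000))))
```

The last call is the NAT traversal problem in miniature: with no earlier outgoing datagram, there's no table entry, so the router has nowhere to send the arriving datagram.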

So while NAT does have advantages, it has disadvantages as well. However, network operators have really voted with their feet. NAT's widely deployed and very much here to stay for quite a while. Well, next we're going to take a look at IPv6. And as we've noted, the primary motivation for IPv6 was the much larger 128-bit address space.

But there are a number of other important innovations in IPv6 as well. And we'll encounter one really important new idea here, known as tunneling, when we study IPv6. So let's get started.

So yes, absolutely. The main motivation for IPv6 was the need for a much larger address space, but there were some other motivations as well. First, IP headers need to be processed at nanosecond speeds. This absolutely wasn't true back in 1981 when IPv4 was standardized, but has been true for a while given link line rates.
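To get a feel for what nanosecond speeds means, here's a small back-of-the-envelope sketch. The 64-byte packet size and the line rates are illustrative assumptions, not figures from this section, and the helper name `packet_time_ns` is made up.

```python
def packet_time_ns(packet_bytes: int, link_rate_bps: float) -> float:
    """Time in nanoseconds for one packet to arrive at the given line rate."""
    return packet_bytes * 8 / link_rate_bps * 1e9


if __name__ == "__main__":
    # Assumed line rates: 10 Mbps, 1 Gbps, and 100 Gbps.
    for rate in (10e6, 1e9, 100e9):
        print(f"{rate:.0e} bps: {packet_time_ns(64, rate):.2f} ns per 64-byte packet")
```

At the assumed 100 Gbps rate, a 64-byte packet arrives in about 5 nanoseconds, so a router keeping up with the link has roughly that long per header.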

IPv6, as we'll see, enables fast IP forwarding by simplifying some of the more complicated aspects of IPv4 router processing: those arising from variable-length headers, from datagram fragmentation and reassembly, and from the need to recompute the checksum at every hop. And secondly, until the 1990s, the internet was really conceived of as more of a datagram-oriented network. The datagram was the primary abstraction. However, more recently, the notion of a flow, or we might say a connection between endpoints, has become an increasingly important abstraction.

There's been a desire, as we've seen, to provide services on a per-flow basis, not just on a per-datagram basis. IPv6 raises this notion of a flow to a first-class object by introducing the notion of a flow label into the IP header, as we'll see next. So let's take a quick look at the IPv6 datagram format.

We see here the 128-bit IPv6 source and destination addresses that we've been discussing. We also see the 20-bit flow label field that we just mentioned. And it's important to realize that this field's a mechanism for labeling flows, but that IPv6 doesn't mandate how a flow is defined or how this field's to be used. These are really policy issues that are up to the ISP. IPv6 provides mechanisms, but not policy, for flow handling.

The 8-bit traffic class field is like the type of service field in IPv4. It can be used to give priority to certain datagrams within a flow or to a particular class of traffic. And like the flow label, this field is really about mechanism, not policy.
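One way to see how these fields fit together is to pack the fixed header by hand. This is a minimal sketch: the function name `build_ipv6_header` and the sample addresses are made up for illustration, and the field widths (4-bit version, 8-bit traffic class, 20-bit flow label, 16-bit payload length, 8-bit next header, 8-bit hop limit) are taken from the IPv6 specification, RFC 8200, rather than from this transcript.

```python
import ipaddress
import struct


def build_ipv6_header(src: str, dst: str, payload_len: int, next_header: int,
                      hop_limit: int = 64, traffic_class: int = 0,
                      flow_label: int = 0) -> bytes:
    """Pack the fixed 40-byte IPv6 header in network byte order."""
    if not 0 <= traffic_class < 2 ** 8:
        raise ValueError("traffic class must fit in 8 bits")
    if not 0 <= flow_label < 2 ** 20:
        raise ValueError("flow label must fit in 20 bits")
    # First 32 bits: version (always 6), traffic class, flow label.
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return struct.pack(
        "!IHBB16s16s",
        first_word,
        payload_len,
        next_header,
        hop_limit,
        ipaddress.IPv6Address(src).packed,
        ipaddress.IPv6Address(dst).packed,
    )


if __name__ == "__main__":
    # 2001:db8::/32 is the documentation prefix; next header 6 means TCP.
    header = build_ipv6_header("2001:db8::1", "2001:db8::2",
                               payload_len=20, next_header=6,
                               flow_label=0xABCDE)
    print(len(header), header.hex())
```

Notice there's no header length, checksum, or fragmentation field to fill in, so the header always comes out at the same size.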

The version, payload length, hop limit, and next header (or upper layer) fields, and the payload, are just as we saw in IPv4. So all of this hopefully looks pretty familiar to you. And given that this is pretty familiar, perhaps the more interesting issue is what fields were in IPv4 but are not in IPv6. In particular, there's no checksum, no fragmentation and reassembly, and no options fields, and as we mentioned earlier, this makes the header a fixed length and allows for faster processing. Fragmentation and reassembly need to be done at the endpoints, and options can be accomplished by passing an IPv6 datagram payload up to an upper layer protocol at the router.

Let's wrap up our discussion of IPv6 by considering the following question.

If we currently have an IPv4 network, but eventually want to transition to an IPv6 network, how do we actually accomplish that transition? We've seen that IPv4 and IPv6 are very different protocols. Well, do we have a flag day, say, where everyone transitions from IPv4 to IPv6 globally, turning off all of their IPv4 equipment and turning on all of their IPv6 equipment? Well, that's difficult for a lot of reasons that you might want to think about.

Instead, we really want IPv4 and v6 to coexist, to interoperate, as routers and hosts continue to migrate to IPv6 as new equipment is introduced. So what we'll have, what we do have today, is an internet that has some routers that are IPv4, some routers that are IPv6, and some routers that are both IPv4 and v6.

But how do we provide this coexistence while keeping the internet running? It sort of seems like trying to change the engines on an airplane while the airplane is flying. Well, fortunately, it's not that hard.

And the key technique that's used to allow IPv4 and IPv6 networks to interoperate is known as tunneling. And we're going to take a look at that here. And the key to tunneling, as we'll see, is for a datagram, say an IPv4 datagram, to contain as its payload an IPv6 datagram. A datagram inside a datagram. But an IPv6 datagram inside an IPv4 datagram?

What's that about? That just somehow doesn't sound right. Well, it's actually not quite as odd as it might sound.

Let's see. And here's the way to think about tunneling. Remember what we learned earlier when we were looking at layering and encapsulation way back in our introduction to networking?

In this figure here, we see two IPv6 routers physically connected by Ethernet. The link layer Ethernet frame between these two routers carries an IPv6 datagram as its payload, as we see here. This is 100% business as usual, nothing new here. Well, now let's take a look at two IPv6 routers again. But assume that in addition to knowing how to do IPv6, they also know how to do IPv4, just like they know how to do Ethernet.

And now, rather than being connected by Ethernet, these two IPv6 routers are connected to each other by a network of IPv4 routers. How do these two IPv6 routers forward an IP datagram to each other?

Well, the answer, of course, is by using the IPv4 network that connects them. In this case, instead of sending their IPv6 datagram to each other in an Ethernet frame, they simply put their IPv6 datagram into an IPv4 datagram, and address and send that IPv4 datagram to the other.

This is the process that's known as tunneling. Let's take a look at an example. Well, in this example, routers A and F are IPv6 only, routers C and D are IPv4 only, and routers B and E can do both IPv6 and IPv4. Let's suppose that IPv6 router A over here on the left needs to send a datagram, an IPv6 datagram, to IPv6 router F. Let's walk through what happens.

Well, A's forwarding table says that IPv6 router B is the next hop, and so the IPv6 datagram here is forwarded to B. And note carefully that this is an IPv6 datagram. The source address is A, the destination address is F.

Those are both IPv6 addresses. And there's also a flow ID field here, which only exists in IPv6. So there's nothing new here in getting from A to B.

But now let's take a look at what happens at router B. And this is where it gets interesting. It receives the IPv6 datagram from A, and sees that the destination is F and that the next hop router in the IPv6 network is router E.

So that's critical. The next hop router in the IPv6 network is E. So B needs to say, well, how do I forward this IPv6 datagram to E?

Well, remember now, B and E are connected by an IPv4 network. And that's good, because B and E both speak IPv4 and IPv6 and are connected by an IPv4 tunnel. And in fact, in B's forwarding table there will be an entry that says: to get to F, forward this through the outgoing interface that's an IPv4 tunnel to E. So B creates an IPv4 datagram, addresses that datagram to E's IPv4 address (and that's critical), puts the IPv6 datagram as the payload into the IPv4 datagram, and forwards that datagram into the tunnel.

And let's take a careful look at this IPv4 datagram that's being forwarded from B to C into the tunnel. Note that for the IPv4 datagram, the source is router B and the destination address is router E. That's for the IPv4 datagram. And of course, inside that IPv4 datagram is the IPv6 datagram, the original datagram that started from A, and that's got a source of A and a destination of F. But that's just the payload inside this outermost IPv4 datagram, shown here in red.

The IPv4 datagram is then forwarded through the IPv4 network and arrives at E using the mechanisms that we already know really well.

Within the IPv4 network, this is really just another IPv4 datagram and it's addressed to E. There's nothing new here. At IPv4 destination E, E says, okay, I'm the destination for this IPv4 datagram, the red datagram, and so it looks inside and finds an IPv6 datagram.

It then extracts the IPv6 datagram, looks at the IPv6 destination address F, looks up F in its forwarding table, and forwards the IPv6 datagram on the link towards F. So in this example, we've seen that IPv4 is really sort of considered almost as a link layer technology that directly connects two IPv6 routers. The path through the IPv4 network can be thought of abstractly as a tunnel that directly connects the two IPv6 routers.

And in this way, through the use of tunneling, both IPv4 and IPv6 can coexist to forward datagrams end-to-end along a path of mixed IPv4 and IPv6 routers. The key thing, as we've seen here, is that at the boundaries between the two types of technologies, IPv4 and IPv6, we have routers that can do both IPv4 and IPv6, and this is true of all modern routers. We'll see later that tunneling is used extensively in cellular networks to support mobility, so it's a general and therefore an important concept.
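The walk from A to F can be sketched as a datagram inside a datagram. This is only an illustration: the class and function names are made up, the router names stand in for real addresses, and the one outside detail is that IPv4 protocol number 41 marks a payload that is an IPv6 datagram.

```python
from dataclasses import dataclass

IPPROTO_IPV6 = 41  # IPv4 protocol number: the payload is an IPv6 datagram


@dataclass
class IPv6Datagram:
    src: str
    dst: str
    flow: int
    data: bytes


@dataclass
class IPv4Datagram:
    src: str
    dst: str
    protocol: int
    payload: IPv6Datagram


def tunnel_entry(inner: IPv6Datagram, my_v4: str, peer_v4: str) -> IPv4Datagram:
    """What B does: wrap the IPv6 datagram, unchanged, in an IPv4 datagram."""
    return IPv4Datagram(src=my_v4, dst=peer_v4,
                        protocol=IPPROTO_IPV6, payload=inner)


def tunnel_exit(outer: IPv4Datagram, my_v4: str) -> IPv6Datagram:
    """What E does: check it is the IPv4 destination, then unwrap."""
    if outer.dst != my_v4 or outer.protocol != IPPROTO_IPV6:
        raise ValueError("not a tunneled IPv6 datagram addressed to this router")
    return outer.payload


if __name__ == "__main__":
    # Hypothetical labels standing in for addresses: "A" and "F" are IPv6
    # addresses, "B-v4" and "E-v4" are the IPv4 addresses of B and E.
    original = IPv6Datagram(src="A", dst="F", flow=7, data=b"hello")
    in_tunnel = tunnel_entry(original, my_v4="B-v4", peer_v4="E-v4")
    print(in_tunnel)               # what C and D forward: B-v4 to E-v4
    print(tunnel_exit(in_tunnel, my_v4="E-v4"))
```

Routers C and D only ever look at the outer IPv4 source and destination, and the IPv6 datagram that comes out at E is the same one that went in at B.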

But you know, I've found that sometimes it's a difficult concept for students to get their arms around at first, so you might want to think about it some, or listen to this section again, as it's an important technique that transcends just IPv4 and IPv6 interoperation. Well, we've seen that IPv6 was standardized in the late 1990s. How widely is it deployed today, say 25 years later? Well, Google reports that almost 30% of the clients that access its services do so using IPv6.

And in the US, the National Institute of Standards and Technology reports that about a third of all US government domains are IPv6 capable. So that's some progress, but IPv4, 25 years later, is still by far the more dominantly used technology. And you might think, well, why is that the case?

Well, certainly the widespread deployment of NAT has eased the pressure on the IPv4 address space and made the adoption of IPv6 less critical. But it's also interesting to think as a contrast about what's changed at the application layer in the last 25 years. The emergence of the web, social media, media streaming, gaming, telepresence, and more.

That's an amazing amount of change at the application layer, and it just shows how easy it is to build, to innovate, and to deploy at the application layer, while building, innovating, and deploying new, well, plumbing at the network layer takes a lot longer. In this section and the last, we've taken a deep dive into the Internet Protocol, IP, which lies at the heart of the Internet's network layer, and we've covered a huge amount of ground. We looked at the IPv4 datagram format.

We looked at IPv4 addressing. We looked at network address translation, and we looked at IP version six. And so maybe you're feeling almost a little exhausted now, but hopefully you've learned a lot. Coming up next, we're gonna take a look at generalized forwarding, which will really set the foundation nicely for looking at software defined networking when we take a look at the control plane.
