How does network address translation work?
I have a laptop in front of me and a server in the cloud. I establish a TCP connection between them by listening on the server,
jim@remote:~$ nc -l 0.0.0.0 12345
then connecting from my laptop:
jim@local:~$ nc 184.108.40.206 12345
Now a new TCP connection is established,
and we can see this connection from both machines using lsof.
First, from my laptop:
jim@local:~$ lsof -nPi TCP | awk 'NR==1 || /12345/'
COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
nc      36761  jim    3u  IPv4 0x86646b399e45805      0t0  TCP 192.168.1.4:64125->220.127.116.11:12345 (ESTABLISHED)
Then from the server:
jim@remote:~$ lsof -nPi TCP | awk 'NR==1 || /12345/'
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
nc      1878  jim    4u  IPv4  21055      0t0  TCP 10.142.0.2:12345->18.104.22.168:64125 (ESTABLISHED)
A TCP connection, traditionally, is identified by four things: the client IP address and server IP address, and the client port and server port. What are these values in the above connection?
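To make the four values concrete, here is a minimal sketch that pulls them out of the NAME column of the two lsof outputs. The helper name `parse_lsof_name` and the `Endpoints` type are my own inventions for illustration, not part of lsof, and the sketch assumes IPv4 addresses.

```python
from typing import NamedTuple


class Endpoints(NamedTuple):
    local_ip: str
    local_port: int
    remote_ip: str
    remote_port: int


def parse_lsof_name(name: str) -> Endpoints:
    """Split an lsof NAME field like 'a.b.c.d:p->e.f.g.h:q' into four values."""
    local, remote = name.split("->")
    local_ip, local_port = local.rsplit(":", 1)
    remote_ip, remote_port = remote.rsplit(":", 1)
    return Endpoints(local_ip, int(local_port), remote_ip, int(remote_port))


# NAME fields copied from the two lsof outputs.
laptop_view = parse_lsof_name("192.168.1.4:64125->220.127.116.11:12345")
server_view = parse_lsof_name("10.142.0.2:12345->18.104.22.168:64125")

print(laptop_view)
print(server_view)
```

The port numbers agree between the two views, but the IP addresses do not.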
There are at least a couple of oddities here.
My laptop thinks it’s connected to a public IP address,
but the server thinks its own IP address is 10.142.0.2.
And the server thinks it’s connected to a different public IP address,
but my laptop thinks its own IP address is 192.168.1.4.
The reason for this oddity is network address translation, or NAT. This is a process done by routers, machines which sit on the path between my laptop and my server. Each router doing NAT is in two networks, i.e. the router has two network interfaces, one in each network, each with its own assigned IP address.
My TCP connection involves at least three networks, two routers, and six IP addresses! Here they are diagrammed:
Machines              Networks
vvvvvvvv              vvvvvvvv

                      +-my local area network-+
                      |                       |
MY LAPTOP-----------192.168.1.4               |
                      |                       |
     +-------------192.168.1.254              |
     |                |                       |
     |                +-----------------------+
MY HOME ROUTER
     |                +-Internet--------------+
     |                |                       |
     +-------------22.214.171.124             |
                      |                       |
     +-------------126.96.36.199              |
     |                |                       |
     |                +-----------------------+
GCP ROUTER
     |                +-GCP subnet------------+
     |                |                       |
     +-------------10.142.0.1                 |
                      |                       |
MY SERVER-----------10.142.0.2                |
                      |                       |
                      +-----------------------+
The three networks are my local area network (LAN), the Internet proper, and a subnet on Google Cloud Platform (GCP). The routers are my home router (a Sagemcom something-or-other which my ISP provided) and an unknown router on Google Cloud Platform. Notice that each router spans two networks and translates packets between them. Both routers are doing NAT, but different types of NAT.
The GCP router’s behavior is the simpler of the two.
The GCP router is doing “basic”, or one-to-one NAT.
When the GCP router sees an IP packet on the Internet addressed to my server’s public IP address,
the GCP router puts a corresponding packet on the GCP subnet destined for 10.142.0.2,
the IP address of my server on that subnet.
That is, the router modifies the destination address of the IP packet.
Most other things about the IP packet are conserved,
such as TCP port numbers.
In the other direction,
when the GCP router sees an IP packet on the subnet from 10.142.0.2,
it copies the packet to the Internet,
modifying the source address to the server’s public IP address.
The GCP router’s policy thus creates a one-to-one relationship
between the server’s public IP address and 10.142.0.2.
The GCP router’s one-to-one NAT procedure is stateless
(except to remember the rule which binds those two addresses together).
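A minimal sketch of that one-to-one rule follows. It is not how GCP actually implements it. `PUBLIC_IP` and `CLIENT_IP` are placeholder addresses from the documentation ranges, and the `Packet` type is a toy I made up for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# Assumed values: PUBLIC_IP and CLIENT_IP are placeholders, not real addresses.
PUBLIC_IP = "203.0.113.10"
PRIVATE_IP = "10.142.0.2"
CLIENT_IP = "198.51.100.7"


def inbound(packet: Packet) -> Packet:
    """Internet -> subnet: rewrite only the destination address."""
    if packet.dst_ip == PUBLIC_IP:
        return replace(packet, dst_ip=PRIVATE_IP)
    return packet


def outbound(packet: Packet) -> Packet:
    """Subnet -> Internet: rewrite only the source address."""
    if packet.src_ip == PRIVATE_IP:
        return replace(packet, src_ip=PUBLIC_IP)
    return packet


delivered = inbound(Packet(CLIENT_IP, 64125, PUBLIC_IP, 12345))
reply = outbound(Packet(PRIVATE_IP, 12345, CLIENT_IP, 64125))
print(delivered)
print(reply)
```

Neither function keeps any per-connection state. The only thing remembered is the fixed pairing of the two addresses.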
By contrast, my home router is doing one-to-many NAT.
When my home router receives a packet on my LAN,
if the packet’s destination address is not in the LAN,
the home router puts a corresponding packet on the Internet,
with the source IP address modified to the router’s own address on the Internet,
so that the home router will receive the destination’s response.
So far, so similar.
But one-to-many NAT must do more than this, because when the home router receives a response packet on the Internet, there is no way to tell which host on my LAN to forward it to. The router’s modification of the outgoing packets’ source addresses must have an inverse. This is not possible at the IP level, because there are many IP addresses on my LAN, but the router only has one source IP address available on the Internet! To get more addresses, one-to-many NAT works at the TCP level, which has a 16-bit source port field. This effectively gives the home router 65,536 addresses on the Internet.
So my home router,
when copying the packet from 192.168.1.4 with source port 64125,
notes down in a “translation table”
that all responses on the Internet to port 64125
should be forwarded to 192.168.1.4.
When the home router receives a packet on the Internet,
the router looks up the packet’s destination TCP port
in the translation table
to find the host on the LAN to forward the packet to.
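Here is a minimal sketch of this first, simplest version of the table, mapping a port to a LAN address. The function names are made up, and `ROUTER_PUBLIC_IP` and `SERVER_IP` are placeholder addresses from the documentation ranges. Packets are modelled as plain `(src_ip, src_port, dst_ip, dst_port)` tuples.

```python
# Assumed placeholder addresses, not the real ones.
ROUTER_PUBLIC_IP = "203.0.113.1"
SERVER_IP = "198.51.100.20"

# port -> LAN address
translation_table = {}


def send_out(packet):
    """LAN -> Internet: record who owns the source port, then rewrite the source address."""
    src_ip, src_port, dst_ip, dst_port = packet
    translation_table[src_port] = src_ip
    return (ROUTER_PUBLIC_IP, src_port, dst_ip, dst_port)


def receive_in(packet):
    """Internet -> LAN: look up the destination port to find the LAN host."""
    src_ip, src_port, dst_ip, dst_port = packet
    lan_ip = translation_table.get(dst_port)
    if lan_ip is None:
        return None  # no matching entry; real router behaviour is not modelled here
    return (src_ip, src_port, lan_ip, dst_port)


outgoing = send_out(("192.168.1.4", 64125, SERVER_IP, 12345))
incoming = receive_in((SERVER_IP, 12345, ROUTER_PUBLIC_IP, 64125))
print(outgoing)
print(incoming)
```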
What happens if another host on my LAN, say 192.168.1.6,
opens a connection with the same source port, 64125?
I tried this by listening on port 12346 on the server,
then connecting to it from my Android phone,
telling nc to use a specific source port with the -p option.
The server shows two connections to the same IP address and port:
jim@remote:~$ lsof -nPi TCP | awk 'NR==1 || /1234/'
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
nc      13020  jim    4u  IPv4  40075      0t0  TCP 10.142.0.2:12345->188.8.131.52:64125 (ESTABLISHED)
nc      13021  jim    4u  IPv4  40078      0t0  TCP 10.142.0.2:12346->184.108.40.206:64125 (ESTABLISHED)
Both of these connections work - but how?
If the server sends the packets for both connections to the same address and port 64125,
how can my home router distinguish them?
The answer is the distinct source ports on those packets, 12345 and 12346.
My home router’s translation table is not as simple as
port -> ipaddr.
Actually the translation table is
(port,port,ipaddr) -> ipaddr,
and contains the entries:
(64125, 12345, 220.127.116.11) -> 192.168.1.4
(64125, 12346, 18.104.22.168) -> 192.168.1.6
Because this is a more specific lookup than just one port, the router can have more than 65,536 addresses to work with.
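The more specific lookup can be sketched with a dictionary keyed on all three values. `SERVER_IP` is a placeholder address from the documentation range, and `lan_host_for` is a made-up name. The key order follows the entries above: the port on the router, then the server's port, then the server's address.

```python
# Assumed placeholder for the server's public address.
SERVER_IP = "198.51.100.20"

# (port on the router, server port, server address) -> LAN host
translation_table = {
    (64125, 12345, SERVER_IP): "192.168.1.4",
    (64125, 12346, SERVER_IP): "192.168.1.6",
}


def lan_host_for(dst_port, src_port, src_ip):
    """Find the LAN host for an incoming packet, or None if no entry matches."""
    return translation_table.get((dst_port, src_port, src_ip))


laptop = lan_host_for(64125, 12345, SERVER_IP)
phone = lan_host_for(64125, 12346, SERVER_IP)
print(laptop, phone)
```

Both incoming packets are addressed to port 64125 on the router, yet they resolve to different hosts because the server's port differs.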
But still - what happens if two separate hosts on my LAN choose the same source port
and they connect to the same IP address and port?
I tried it, running
nc -l 0.0.0.0 12345 twice on the server,
then connecting from two hosts,
both using source port 42202.
Both connections still work as expected! How? My server reports these two connections:
jim@remote:~$ lsof -nPi TCP | awk 'NR==1 || /12345/'
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
nc      13020  jim    4u  IPv4  40075      0t0  TCP 10.142.0.2:12345->22.214.171.124:42202 (ESTABLISHED)
nc      13041  jim    4u  IPv4  40278      0t0  TCP 10.142.0.2:12345->126.96.36.199:1024 (ESTABLISHED)
Notice that for the second connection, the server is connected to remote port 1024, not 42202.
My home router noticed that there was a clash,
and rewrote the source port to a new free port!
So my translation table is still too simplified,
and it’s actually a
(port,port,ipaddr) -> (ipaddr,port),
with the entries:
(42202, 12345, 188.8.131.52) -> (192.168.1.4,42202)
(1024, 12345, 184.108.40.206) -> (192.168.1.6,42202)
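The clash handling can be sketched as below. This is a toy model, not my router's actual algorithm. The class name `Nat`, the placeholder `SERVER_IP`, and the policy of scanning upward from port 1024 for a free port are all assumptions. The last is only loosely suggested by the 1024 seen in the output above.

```python
# Assumed placeholder for the server's public address.
SERVER_IP = "198.51.100.20"


class Nat:
    def __init__(self):
        # (public port, remote port, remote address) -> (LAN address, LAN port)
        self.table = {}

    def outbound(self, lan_ip, lan_port, remote_ip, remote_port):
        """Record a new outgoing connection and return the public source port used."""
        public_port = lan_port
        if (public_port, remote_port, remote_ip) in self.table:
            # Clash: pick the first free port (assumed policy, scanning from 1024).
            public_port = next(
                (p for p in range(1024, 65536)
                 if (p, remote_port, remote_ip) not in self.table),
                None,
            )
            if public_port is None:
                raise RuntimeError("no free port for this destination")
        self.table[(public_port, remote_port, remote_ip)] = (lan_ip, lan_port)
        return public_port

    def inbound(self, public_port, remote_port, remote_ip):
        """Return the (LAN address, LAN port) for a response packet, or None."""
        return self.table.get((public_port, remote_port, remote_ip))


nat = Nat()
first = nat.outbound("192.168.1.4", 42202, SERVER_IP, 12345)
second = nat.outbound("192.168.1.6", 42202, SERVER_IP, 12345)
print(first, second)
print(nat.inbound(second, 12345, SERVER_IP))
```

The first host keeps its source port, while the second is given a rewritten one. The inbound lookup restores the original port before the packet goes back onto the LAN.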
There are many things I have not covered in this post. For instance, when do entries get removed from the translation table? What does my home router do with packets which don’t match an entry in the table?