TL;DR
- Tunneling TCP connections over a TCP-based VPN leads to conflict between the reliability mechanisms of the two connections, resulting in decreased bandwidth and stutter-y performance. Always tunnel TCP over UDP when you can.
- In rare cases, overly restrictive firewalls may block UDP traffic, in which case you should obfuscate the tunnel’s UDP traffic as TCP to bypass restrictions.
- The adoption of UDP-based QUIC will move more internet traffic onto UDP, pushing firewalls to be less restrictive towards UDP traffic.
The Setup
Modern tunneling (VPN) protocols1 like WireGuard use UDP for their data transfer, which may seem odd at first glance: don’t we want the reliable transfer that TCP claims to offer?
It turns out that the exact mechanisms that ensure delivery for TCP are what make it such a terrible tunneling protocol.
TCP is designed to operate over IP, which provides no delivery guarantees: any packet may be lost or dropped along its route. To guarantee delivery on lossy IP, TCP dictates that for every packet sent, an acknowledgement (an ACK packet) be sent by the receiver to tell the sender that the packet was successfully received (kinda like a “10-4” over the radio). But what if we never get a timely (read: within a certain timeout) acknowledgement? Then TCP will assume the packet is lost and do the following:
- Retransmit the packet in case it was lost along the path of transmission
- Increase the timeout, so that future acknowledgements have longer to arrive in case the network topology has changed and packets now take longer to reach their destination.
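To make the two steps above concrete, here is a minimal sketch of when a sender would retransmit a packet whose ACK never arrives. It assumes the doubling backoff described in RFC 6298; the function name and the 1-second and 60-second values are illustrative assumptions, not something taken from a particular TCP stack.

```python
def retransmission_schedule(initial_rto: float, max_rto: float, attempts: int) -> list[float]:
    """Times (seconds after the first send) at which a packet is retransmitted
    if no acknowledgement ever arrives."""
    times = []
    rto = initial_rto
    elapsed = 0.0
    for _ in range(attempts):
        elapsed += rto               # the timeout expires without an ACK
        times.append(elapsed)        # step 1: retransmit the packet
        rto = min(rto * 2, max_rto)  # step 2: raise the timeout (capped)
    return times


print(retransmission_schedule(initial_rto=1.0, max_rto=60.0, attempts=5))
# prints [1.0, 3.0, 7.0, 15.0, 31.0]
```

Each gap between retransmissions is twice the previous one, which is why a single lost packet can stall a connection for a surprisingly long time.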
This mechanism works well enough on its own, and countless internet-connected devices operate this way every day. However, the trouble for TCP-based VPN protocols comes from their need to nest an end-to-end TCP connection (say, between your browser and blog.carldong.me) inside a tunnel TCP connection (between your computer and the VPN server).
In this setup, when your browser connects to blog.carldong.me, it will start the end-to-end TCP connection and every TCP packet sent along this connection will be queued for the VPN to wrap it inside another TCP packet before actually going out on the wire (or over the air for you WiFi-using degenerates).
Here’s what that looks like in action (with ACKs and all):
The Drawback of Being Earnest
Now, let’s say the end-to-end connection sends a packet A that gets wrapped into packet w(A) and sent out over the tunnel connection, but somewhere along the way something goes wrong2 and the tunnel connection never receives an acknowledgement for w(A). What happens then? Well, the tunnel connection, being the well-behaved and delivery-ensuring TCP connection it is, will keep trying to retransmit w(A) until it receives an ACK.
Here comes the crux of the problem: the end-to-end TCP connection has no idea that it’s operating over a delivery-ensuring connection – it still assumes that it’s operating over lossy IP! Therefore, after its timeout, the end-to-end TCP connection mistakenly retransmits packet A, thinking that packet A has been lost, which is wholly unnecessary as the tunnel connection is still holding on to w(A) and trying its best to retransmit it. The tunnel connection, upon receiving this retransmission packet (let’s call it A'), will blindly wrap it into w(A') and try to deliver it, not knowing that delivering A' is, again, wholly unnecessary.
This whole interaction wastes CPU time, clogs up the tunnel connection’s queues, and most importantly lengthens the time until subsequent packets with new data can be sent. In fact, if the end-to-end connection has a much lower timeout than the tunnel one, the end-to-end connection may queue up multiple retransmissions (A', A'', A''', etc.), all of which are congestion inducing, and, say it with me this time: WHOLLY UNNECESSARY.
Here is the whole ordeal, animated:
This situation is what’s referred to as TCP-over-TCP meltdown (or simply TCP meltdown): TCP connections operating inside other TCP connections have no idea that the tunnel connection already ensures the delivery of its packets, and the end-to-end connection’s blind eagerness to ensure delivery causes congestion, unnecessary retransmissions, and degraded throughput.
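For a rough feel of how the queue fills up, here is a toy model (not a real TCP implementation). It assumes the tunnel is stuck retransmitting w(A) for `tunnel_stall` seconds while the end-to-end connection starts with a timeout of `inner_rto` seconds and doubles it after every retransmission. Both parameter names and the numbers are made up for illustration.

```python
def tunnel_queue_during_stall(inner_rto: float, tunnel_stall: float) -> list[str]:
    """Packets sitting in the tunnel's queue by the time w(A) finally gets through."""
    queue = ["w(A)"]  # the original packet the tunnel is still retransmitting
    elapsed = 0.0
    rto = inner_rto
    retransmissions = 0
    while elapsed + rto < tunnel_stall:
        elapsed += rto                # the end-to-end timeout fires
        retransmissions += 1
        # The end-to-end connection resends A; the tunnel blindly wraps it.
        queue.append("w(A" + "'" * retransmissions + ")")
        rto *= 2                      # end-to-end backoff
    return queue


print(tunnel_queue_during_stall(inner_rto=0.25, tunnel_stall=5.0))
# prints ['w(A)', "w(A')", "w(A'')", "w(A''')", "w(A'''')"]
```

Everything after the first entry is a duplicate that the tunnel will still dutifully deliver once it recovers. If the end-to-end timeout is longer than the stall, the loop never runs and the queue holds only w(A).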
UDP to the rescue
So, what can we do about this? Well, one insight comes from the fact that end-to-end connections only assume that they’re operating over lossy IP. What if we tunneled them inside an IP-like lossy connection?
This is precisely how TCP-over-UDP works: UDP is lossy, completely stateless, and doesn’t have any delivery-ensuring algorithms that can interact poorly with its end-to-end connections. Put simply, UDP leaves deliverability and other transport layer characteristics up to the end-to-end connection and acts as a dumb, lossy data wrapper.3
What’s more, UDP’s header is substantially smaller than TCP’s (64 bits vs. at least 160 bits), allowing more data to be delivered per packet.4
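The header sizes are easy to check by laying out the fixed fields of each header. The sketch below also works out the payload left in a single packet, assuming a 1500-byte MTU and a 20-byte IPv4 header without options. Both are common values but still assumptions, and any header the VPN protocol itself adds is ignored.

```python
import struct

# UDP: source port, destination port, length, checksum
UDP_HEADER = struct.Struct("!HHHH")
# TCP (no options): source port, destination port, sequence number, ACK number,
# data offset + flags, window, checksum, urgent pointer
TCP_MIN_HEADER = struct.Struct("!HHIIHHHH")


def payload_per_packet(mtu: int, ip_header: int, transport_header: int) -> int:
    """Bytes left for data after the IP and transport headers."""
    return mtu - ip_header - transport_header


print(UDP_HEADER.size * 8, TCP_MIN_HEADER.size * 8)        # prints 64 160
print(payload_per_packet(1500, 20, UDP_HEADER.size))       # prints 1472
print(payload_per_packet(1500, 20, TCP_MIN_HEADER.size))   # prints 1460
```

Under these assumptions the difference is 12 bytes per packet, and it grows whenever TCP options are present.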
The simplicity of UDP is what makes it a great tunneling protocol, and why modern VPN protocols have all flocked to it. Sidenote: The Linux kernel even has a builtin fou module that tunnels unmodified (read: unencrypted) IP packets over UDP.
A cruel world network
Before you start declaring yourself a UDP maximalist though (wait until you read about UDP’s NAT traversal potential!), there is one important real-world downside: network admins often configure their firewalls to be overly restrictive towards UDP traffic (“why would anyone use UDP for anything other than DNS port 53?”).
I’ve personally seen this in WeWork’s Wi-Fi, UC Berkeley’s Campus Wi-Fi, and certain nation states.
In such cases, you’d need to masquerade your UDP traffic as TCP traffic using something like udp2raw. udp2raw establishes a fake TCP connection that masquerades UDP packets as TCP packets5, but without any acknowledgements or retransmissions. This avoids the TCP-over-TCP meltdown problem while still having the packets appear as TCP packets to network observers.
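To illustrate the idea (this is not udp2raw’s actual code or wire format), here is a hypothetical sketch that prepends a TCP-shaped header to a payload and advances the sequence number the way a real sender would. The function names are made up, the checksum is left at zero as a placeholder, and actually putting such a segment on the wire would additionally require a raw socket.

```python
import struct

# Fixed 20-byte TCP header: source port, destination port, sequence number,
# ACK number, data offset + flags, window, checksum, urgent pointer
TCP_HEADER = struct.Struct("!HHIIHHHH")
PSH_ACK = 0x18  # flags commonly seen on ordinary data segments


def wrap_as_fake_tcp(payload: bytes, src_port: int, dst_port: int, seq: int) -> tuple[bytes, int]:
    """Return (segment, next_seq). No ACK tracking, no retransmission."""
    offset_and_flags = (5 << 12) | PSH_ACK  # header is 5 32-bit words long
    header = TCP_HEADER.pack(
        src_port, dst_port, seq & 0xFFFFFFFF, 0, offset_and_flags, 65535, 0, 0
    )  # checksum field left as 0 (placeholder)
    next_seq = (seq + len(payload)) & 0xFFFFFFFF  # simulated sequence increment
    return header + payload, next_seq


def unwrap_fake_tcp(segment: bytes) -> bytes:
    """Strip the fake header and return the original payload."""
    fields = TCP_HEADER.unpack_from(segment)
    header_len = (fields[4] >> 12) * 4
    return segment[header_len:]


segment, next_seq = wrap_as_fake_tcp(b"hello", 40000, 443, seq=1000)
print(len(segment), next_seq, unwrap_fake_tcp(segment))  # prints 25 1005 b'hello'
```

The point of the sketch is what is missing: nothing here waits for an acknowledgement or resends anything, so the end-to-end connection remains the only party doing retransmissions.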
A hopeful future
A reason for optimism comes from the next major version of the HTTP protocol: HTTP/3. It is now supported on all major browsers and uses QUIC, which basically multiplexes reliable streams over UDP (kinda like TCP-over-UDP!). Should HTTP/3 gain widespread adoption among clients and servers (read: websites), more and more of the world’s internet traffic will operate over UDP. Network admins the world over will be forced to be more lenient towards UDP traffic, hopefully obviating the need for tools like udp2raw!
It isn’t every day that a technical improvement also improves the free flow of information over the internet. And for that, I’m grateful!
My secret hope is that future HTTP/TLS versions will use a confidential, pseudorandom-appearing, and shapable transport protocol, like the one currently proposed in BIP324, to defeat deep packet inspection and make popular protocols like HTTP indistinguishable from every other protocol that uses this style of transport. But we’ll save that for another time.
1. To be precise, we’re only talking about layer-3 tunneling protocols here.
2. As noted by hobbified on lobste.rs, “nothing exceptional has to happen” to trigger dropped packets and dropping packets “is a perfectly normal response to fluctuations in bandwidth demand along the route”.
3. Why don’t we just tunnel over IP itself? Well, IPIP does exactly that, but it is easily detected and isn’t easily multiplexed.
4. Depending on the specific VPN protocol, the checksum field in UDP’s header may also be repurposed for additional data space in theory, yielding an additional 16 bits.
5. You can even have it simulate sequence number increments!