I have been taking a free networking class from Stanford University’s online “open source” education platform. I have really been enjoying the first unit of the course, as it has started filling in some gaps in the foundation of my understanding of networking, the internet, and TCP/IP. I highly recommend this to anyone who has been in IT for a while but has never taken a more “academic” approach to their work. Okay, so that is my plug for free education. You can check out more here if interested: https://lagunita.stanford.edu/

OSI 7-Layer Model, TCP/IP 4-Layer Model
One of the gaps in my understanding of networking has to do with the OSI 7-layer networking model and the simpler TCP/IP 4-layer model (which, as I understand it, grew out of DARPA-funded work and predates the OSI model). I didn’t even realize there was anything other than the 7-layer model until taking this class. I also didn’t realize that, while the OSI model gets talked about and referenced more frequently, academia (I think… and perhaps the industry) is shifting to the simpler 4-layer model for discussion, understanding, and development regarding networking. Please don’t take any of this as gospel truth; this is just my understanding based on coursework and reading. I also find it much easier to think about and reference the 4-layer model. If you are curious how the two compare, this TechNet article is an interesting read: TCP/IP Protocol Architecture. Okay, so for this article, I will stick with what I am most comfortable with at this point: I will reference the 4-layer TCP/IP model and discuss how VPN works.

Layering and Encapsulation

Without going into (too) lengthy an explanation, the whole premise of internet communication relies on the concepts of layering and encapsulation. The essential idea is that (in the case of the TCP/IP model) you have a four-layer “stack” of protocols or technologies that are in one sense independent of each other, but each receives a “service” from the layer below and provides a “service” to the layer above. The way this is done, in part, is via the concept of encapsulation. During end-to-end communication, each layer envelops (encapsulates) the data handed down by the layer above it. Furthermore, each layer is going to do its defined “job” regardless of whether or not the layer above it is working correctly.

A simple analogy is the postal service…

You write a letter in English. You put it into an envelope with an address. This is picked up by a postal worker and forwarded through the postal system. Then it is delivered to the recipient, who opens the letter and reads it. What if the recipient doesn’t speak English? Will the postal system fail to deliver the letter? No. It will still proceed through the postal system and be delivered. Something being done incorrectly at an “upper layer” isn’t going to affect the functionality of the lower service layers.

What happens if the postal service decided to replace all postal workers with flying robots? Does this change the content of your letter? Does this mean that your letter must take a different route as it gets forwarded through the postal network? No, and No. The “layers” of the postal service can be changed independently without necessarily impacting the layers above or below.
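To make the “swap a layer without touching the others” idea a bit more concrete, here is a minimal Python sketch. Everything in it is made up for illustration (the names postal_worker, flying_robot, and send_letter are mine, not from any real library); it only shows that the upper layer’s content comes out the same no matter which carrier sits underneath it.

```python
from typing import Callable

# For this sketch, a "lower layer" is any function that moves bytes from A to B.
Carrier = Callable[[bytes], bytes]


def postal_worker(data: bytes) -> bytes:
    """Hypothetical carrier #1: hands the bytes over unchanged."""
    return data


def flying_robot(data: bytes) -> bytes:
    """Hypothetical carrier #2: a different mechanism offering the same service."""
    return bytes(bytearray(data))  # makes a copy; the content is identical


def send_letter(letter: str, carrier: Carrier) -> str:
    """The upper layer hands its content down and gets it back on the far side."""
    delivered = carrier(letter.encode("utf-8"))
    return delivered.decode("utf-8")


letter = "Hello from the application layer"
via_worker = send_letter(letter, postal_worker)
via_robot = send_letter(letter, flying_robot)

# Swapping the carrier did not change the letter.
print(via_worker == via_robot == letter)
```

Note that neither carrier ever looks at what the letter says, which is also why a letter in the “wrong” language still gets delivered.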

So that is a very brief (and I am certain utterly inadequate…) explanation of layering.

Regarding encapsulation… let’s stick with the same analogy…

Your letter is put in an envelope. The envelope is picked up by a postal worker who puts it in a box with all mail destined for a certain zip code, and the box is kept on the postal truck. The truck travels to a distribution center where the box is unloaded and then forwarded on to another distribution center based on the zip code on the box. From here, the box (with your envelope containing your letter) is placed on another truck. The truck then goes to a specific address where a postal worker takes the envelope out of the box and delivers it to the recipient. The recipient then opens the envelope and reads the letter. Each “layer” encapsulates the other layers within it. The truck (part of the distribution network) holds the box that holds the envelope that holds the letter.

So how does VPN work???

VPN involves “encapsulation within encapsulation” (my terminology at least) + encryption to achieve a logically “local” connection to a network from a remote host or location. Based on my reading I created the following diagram to help me wrap my head around it.

I apologize if the diagram is ugly as sin and (possibly) contains errors… I created this quickly for myself and then found the whole thing so interesting I wanted to write an article about it…

To explain the diagram in brief, on the left is the 4-layer TCP/IP model with an example given of what you might find in each layer. So we have the following:

  • Layer 4: Application Layer (HTTP, FTP, etc… for ease of writing I am saying TLS goes in the App layer…)
  • Layer 3: Transport Layer (TCP, UDP, etc. + Port)
  • Layer 2: Network Layer (IP)
  • Layer 1: Link Layer (physical medium + MAC addresses)



On the right is a diagram showing how VPN encapsulation works, and if you look at it for a few minutes you will discover what I mean by “encapsulation within encapsulation.” For “non-VPN” network traffic you would only see one “round” of encapsulation: for example, an HTTP GET request (Application layer) encapsulated within one or more TCP segments (Transport layer), encapsulated within IP packets (Network layer), encapsulated within an Ethernet frame (Link layer), which is sent over the wire.
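Here is a rough Python sketch of that single “round” of encapsulation. It is a toy model, not a real network stack: the Layer class, the header fields, the port numbers, and the IP/MAC addresses are all placeholders I picked for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Layer:
    name: str      # which layer this wrapper belongs to
    header: dict   # illustrative header fields only
    payload: Any   # whatever the layer above handed down


def encapsulate(http_request: str) -> Layer:
    """Wrap an application-layer request in Transport, Network, and Link layers."""
    segment = Layer("transport/TCP", {"src_port": 50000, "dst_port": 80}, http_request)
    packet = Layer("network/IP", {"src": "192.0.2.10", "dst": "198.51.100.20"}, segment)
    frame = Layer(
        "link/Ethernet",
        {"src_mac": "02:00:00:00:00:01", "dst_mac": "02:00:00:00:00:02"},
        packet,
    )
    return frame


def layer_names(unit: Any) -> list:
    """Walk from the outermost wrapper inward, collecting layer names."""
    names = []
    while isinstance(unit, Layer):
        names.append(unit.name)
        unit = unit.payload
    names.append("application/HTTP")  # what is left is the raw request text
    return names


frame = encapsulate("GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
print(" > ".join(layer_names(frame)))
# link/Ethernet > network/IP > transport/TCP > application/HTTP
```

Reading the output left to right is like opening the truck, then the box, then the envelope, and finally reading the letter.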

However, VPN does something interesting: we are talking to a different network, but we want our traffic to look, logically, like it is on that same network. So we encapsulate our HTTP GET request (Application layer) within Transport and Network layers that are intended for the destination network, as if we were local to it. We then hand this to our VPN gateway (in this case, the VPN client on your computer), which uses TLS to encrypt all of this information and then packages it all up (encapsulates it) in another Transport layer and Network layer, this time for the purpose of getting everything routed to the destination network’s VPN gateway. This all gets sent out over (encapsulated by) the Link layer. Once it arrives at the destination VPN gateway, it is “unpacked” down to the TLS-encrypted data, which is then decrypted. The now-unencrypted inner Network layer packet (which contains the Transport layer and Application layer data) is routed through the destination network normally, like other local traffic. It all logically appears to be local because those inner Transport, Network, and Application layers were packaged up for the destination network.
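The “encapsulation within encapsulation” flow can be sketched in a few lines of Python. Please treat this as a toy under loud assumptions: the XOR function below is only a stand-in so the example runs with the standard library (it is NOT TLS and NOT secure), and the addresses, ports, key, and the function names client_send and gateway_receive are all invented for illustration.

```python
import json


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy placeholder for encryption. NOT TLS and NOT secure."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def wrap(layer: str, header: dict, payload) -> dict:
    """Encapsulate a payload inside one layer."""
    return {"layer": layer, "header": header, "payload": payload}


def client_send(http_request: str, key: bytes) -> dict:
    # Inner round: addressed as if the client sat on the destination network.
    inner = wrap("network", {"src": "10.0.0.50", "dst": "10.0.0.7"},
                 wrap("transport", {"dst_port": 80}, http_request))
    # The VPN client "encrypts" the whole inner packet...
    sealed = xor_bytes(json.dumps(inner).encode("utf-8"), key)
    # ...and wraps it in an outer round addressed gateway to gateway.
    return wrap("network", {"src": "203.0.113.5", "dst": "198.51.100.1"},
                wrap("transport", {"dst_port": 443}, sealed))


def gateway_receive(outer: dict, key: bytes) -> dict:
    """Unpack the outer round, 'decrypt', and recover the inner packet."""
    sealed = outer["payload"]["payload"]
    return json.loads(xor_bytes(sealed, key).decode("utf-8"))


key = b"demo-key"  # hypothetical shared secret for the toy cipher
outer = client_send("GET /intranet HTTP/1.1", key)
inner = gateway_receive(outer, key)

print(outer["header"])   # gateway-to-gateway addresses, used across the internet
print(inner["header"])   # "local" addresses, used inside the destination network
```

The point to notice is that there are two Network layer headers: the outer one gets the bundle across the public network, and the inner one only becomes visible after the destination gateway unpacks and decrypts it.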

Furthermore, anyone who might “intercept” this traffic “mid-stream” as it crosses a public network (your ISP’s network, your local coffee shop wifi network, etc.) could see the source and destination VPN gateway addresses, but beyond that would have no knowledge of the true source and destination clients/hosts, nor of the data being exchanged between them, because all of this is encrypted via TLS.
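As a small illustration of the interceptor’s point of view, here is a Python sketch of what can be read off a captured tunnel packet. The packet layout, the addresses, and the function name visible_to_observer are assumptions of mine, and random bytes stand in for the TLS-encrypted payload.

```python
import secrets


def visible_to_observer(outer_packet: dict) -> dict:
    """Return only what an on-path observer can read from a tunneled packet."""
    return {
        "outer_src": outer_packet["src"],
        "outer_dst": outer_packet["dst"],
        # The observer can count the bytes but cannot interpret them.
        "payload_size": len(outer_packet["payload"]),
    }


# A made-up captured packet: gateway addresses in the clear, payload opaque.
captured = {
    "src": "203.0.113.5",                 # client-side VPN gateway (placeholder)
    "dst": "198.51.100.1",                # destination VPN gateway (placeholder)
    "payload": secrets.token_bytes(64),   # stands in for encrypted inner packet
}

print(visible_to_observer(captured))
```

The inner source and destination hosts never show up in the result, because in this model they only exist inside the encrypted payload.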

Anyhow, I realize this article isn’t exactly exhaustive and I am hoping it is (at least) accurate. I just wanted to do a write-up of this as it helps cement a rather complex concept/framework in my head. I do hope it at least has shed some light on the topic for you and perhaps piques your interest and causes you to do a bit more digging yourself.

4 comments on: A little bit about how VPN actually works… – The 4-Layer TCP/IP Stack + TLS

  1. Jounghoo Lee

    I surfed the internet for an hour, this is the best diagram I have ever found. Thank you so much

    • nbeam

      I love diagramming stuff :). I am so glad this was helpful!

  2. Sascha Aegerter

    I agree, this is the best diagram I found as well and I liked the explanation! Just a side note, isn’t TLS part of the transport layer?

    • nbeam

      It’s been years since I wrote this and honestly I am not sure off the top of my head. What I can say is I was going through a free course via Stanford Online at the time and they referenced a 4-layer model vs. the 7-layer model I saw when studying for MS certs so that might be why the difference (or I was just wrong, or you are wrong 🙂 or somehow we are both wrong lol). Based on that little note on the left-hand side I wrote on the diagram, I would guess you are probably correct though. I think app layer is definitely more like your API calls and whatnot and it makes total sense that TLS happens below that.

      If I had free time, I would love to revisit all of this again – it was fun to be in learning mode for a while and not just work work work (which granted, involves a lot of learning but not like this)…
