Immerse Yourself Into Computer Networking

I am convinced that anyone who is starting to learn programming, regardless of their goal, will sooner or later need a basic understanding of networking.

I’ll cover some basics, how browsers work, and what the operating system’s role is in all of this. As we go deeper into the topic throughout this post, we will explore various aspects of networking in greater detail.

At the end, you will be able to implement a protocol such as HTTP!

Let’s start with your smartphone

Your phone has a WiFi module and a Bluetooth module. They’re little chips located somewhere on the device’s motherboard. The operating system powers them up and makes them available to your applications.
Apps that need an internet connection use the low-level API provided by the OS to send (and receive) data. That data goes through the WiFi module to your router and from there travels on to its final destination.

Data representation in network transmission

Of course, all digital information is nothing but a series of zeros and ones, so the sender and the receiver must agree on how to interpret these binary sequences.
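
As a tiny illustration of why interpretation matters, here’s a sketch in Rust where the same two bytes are read once as text and once as a number. The function names are made up for this example.

```rust
/// Interpret raw bytes as UTF-8 text (None if they are not valid UTF-8).
fn as_text(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

/// Interpret exactly two bytes as a big-endian ("network order") 16-bit number.
fn as_u16(bytes: [u8; 2]) -> u16 {
    u16::from_be_bytes(bytes)
}
```

Given the bytes 0x48 0x69, as_text returns "Hi" while as_u16 returns 18537. Same bits, different meaning.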

That’s where internet packets, AKA IP datagrams, come in. They contain the actual payload along with source and destination information in the form of IP addresses, as well as other header fields that I won’t cover in this post.
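
To make the header idea a bit more concrete, here’s a minimal sketch that pulls a few fields out of a raw IPv4 header. The byte offsets follow the standard IPv4 layout, but the struct and function names are invented for this post, and options, fragmentation and checksums are ignored on purpose.

```rust
use std::net::Ipv4Addr;

/// A few fields from an IPv4 header (many others are left out on purpose).
#[derive(Debug, PartialEq)]
struct Ipv4Summary {
    header_len: usize, // header length in bytes
    ttl: u8,           // remaining hops before the packet is dropped
    protocol: u8,      // 6 = TCP, 17 = UDP
    source: Ipv4Addr,
    destination: Ipv4Addr,
}

/// Parse the start of an IPv4 packet; returns the summary and the bytes after the header.
fn parse_ipv4(packet: &[u8]) -> Option<(Ipv4Summary, &[u8])> {
    // The high 4 bits of the first byte hold the IP version.
    if packet.len() < 20 || packet[0] >> 4 != 4 {
        return None; // too short, or not IPv4
    }
    // The low 4 bits give the header length in 32-bit words.
    let header_len = usize::from(packet[0] & 0x0f) * 4;
    if header_len < 20 || packet.len() < header_len {
        return None;
    }
    let summary = Ipv4Summary {
        header_len,
        ttl: packet[8],
        protocol: packet[9],
        source: Ipv4Addr::new(packet[12], packet[13], packet[14], packet[15]),
        destination: Ipv4Addr::new(packet[16], packet[17], packet[18], packet[19]),
    };
    Some((summary, &packet[header_len..]))
}
```

Feeding it 20 header bytes followed by a payload gives back the addresses as Ipv4Addr values and the payload as a slice, which is roughly what every device on the path does before deciding what to do with a packet.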

Internet Packet Breakdown

So, every packet can be read by any device along the way, such as a router, and forwarded toward a specific destination using the IP header… Routers in the network use this information to determine the next hop for the packet, and the process repeats until the packet reaches its final destination.
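
How a router picks the next hop is a big topic, but the core idea can be sketched as a table lookup. The toy below uses longest-prefix matching, which is the usual rule for IP routing. The table entries and hop names are hypothetical, and real routers do this with far more efficient data structures.

```rust
use std::net::Ipv4Addr;

/// One routing table entry: a network prefix and where to forward matching packets.
struct Route {
    network: Ipv4Addr,
    prefix_len: u8, // number of leading bits that must match (0..=32)
    next_hop: &'static str,
}

/// Does `addr` fall inside `network/prefix_len`?
fn matches(addr: Ipv4Addr, network: Ipv4Addr, prefix_len: u8) -> bool {
    let bits = u32::from(prefix_len.min(32));
    // Build a mask with `bits` leading ones; a /0 route matches everything.
    let mask = if bits == 0 { 0 } else { u32::MAX << (32 - bits) };
    (u32::from(addr) & mask) == (u32::from(network) & mask)
}

/// Pick the most specific (longest prefix) route for a destination address.
fn next_hop(table: &[Route], destination: Ipv4Addr) -> Option<&'static str> {
    table
        .iter()
        .filter(|r| matches(destination, r.network, r.prefix_len))
        .max_by_key(|r| r.prefix_len)
        .map(|r| r.next_hop)
}
```

With routes for 0.0.0.0/0, 10.0.0.0/8 and 10.1.0.0/16, a packet for 10.1.2.3 goes to the /16 hop, one for 10.9.9.9 goes to the /8 hop, and everything else falls through to the default route.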

The Vital Role of TCP in the Internet

There are many protocols on top of IP, but the most used ones are TCP and UDP.

TCP and UDP packets have a similar structure. Both travel inside IP packets, and each adds its own transport-layer header with the source and destination ports plus a checksum for error checking. The main difference between them is what the transport layer guarantees about delivery.

TCP provides a reliable, ordered delivery of data,
while UDP provides a fast, but unreliable delivery of data.

So, if latency is important, as in online games, voice chats and similar, UDP is usually considered the better option…
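
Here’s roughly what the “fire and forget” nature of UDP looks like in code. It’s a sketch using UdpSocket from Rust’s standard library, and the function name is made up. Both sockets live on the local machine, so the datagram will normally arrive, but nothing in UDP itself guarantees that.

```rust
use std::io;
use std::net::UdpSocket;
use std::time::Duration;

/// Send one datagram from one local socket to another and return what arrived.
fn udp_roundtrip(message: &[u8]) -> io::Result<Vec<u8>> {
    // Port 0 asks the OS to pick any free port.
    let receiver = UdpSocket::bind("127.0.0.1:0")?;
    let sender = UdpSocket::bind("127.0.0.1:0")?;

    // Don't wait forever if the datagram never shows up.
    receiver.set_read_timeout(Some(Duration::from_secs(2)))?;

    // No connection setup and no handshake: address the datagram and send it.
    sender.send_to(message, receiver.local_addr()?)?;

    // On a real network this datagram could be lost, duplicated or reordered,
    // and UDP would not tell us. Handling that is the application's job.
    let mut buf = [0u8; 1500];
    let (len, _from) = receiver.recv_from(&mut buf)?;
    Ok(buf[..len].to_vec())
}
```

Notice there’s no “connect, then wait for acknowledgement” step anywhere. That missing bookkeeping is exactly where the speed comes from.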

But the most used high-level protocols of all time are built on top of TCP:

HTTP (mostly used on the web for transferring HTML, static content, etc.)
FTP (File Transfer Protocol, obviously for transferring files)
SMTP (a protocol designed for sending mail)

Because TCP takes care of reliability, it’s a convenient base for defining your own protocol. This is why so many application-layer protocols are built on top of it: it’s a lot easier than using UDP, since you don’t have to worry about lost or duplicate packets.
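
To show how little is needed once the transport is reliable, here’s a sketch of a made-up “protocol”: every message is a 4-byte big-endian length followed by that many bytes. Length-prefixed framing is a common technique, not something specific to this post, and the function names are hypothetical. The code is written against the generic Read and Write traits, so it works on a TcpStream as well as on an in-memory buffer.

```rust
use std::io::{self, Read, Write};

/// Write one message: a 4-byte big-endian length, then the payload.
fn write_message<W: Write>(out: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "message too large"));
    }
    out.write_all(&(payload.len() as u32).to_be_bytes())?;
    out.write_all(payload)
}

/// Read one message written by `write_message`.
/// A real protocol would also cap `len` before allocating.
fn read_message<R: Read>(input: &mut R) -> io::Result<Vec<u8>> {
    let mut len_bytes = [0u8; 4];
    input.read_exact(&mut len_bytes)?;
    let len = u32::from_be_bytes(len_bytes) as usize;
    let mut payload = vec![0u8; len];
    input.read_exact(&mut payload)?;
    Ok(payload)
}
```

There’s no retransmission, ordering or duplicate detection here, because TCP already handles those. Over UDP you would have to build all of that yourself.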

You can actually see almost all the protocols on different layers:

Protocols

How do browsers work?

Let’s finally start with something useful lmao.
Every web browser must have an address bar where you can type in a URL and be taken directly to the page. If the input does not resemble a URL, the browser will redirect you to a search engine with your entered query (Google or whatever is in your browser settings).

Domains

For example, say you’re heading to https://google.com. The part before the :// names the protocol you want to use. Most likely it will be HTTPS (it could also be http, ftp and so on…).
Then your device looks up the IP address for the domain name google.com. Almost certainly the address is already cached somewhere, in which case no request to a DNS server is needed.
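
Those two steps can be sketched in a few lines of Rust. split_url is a deliberately naive, made-up helper (real browsers follow a much more detailed URL specification), while resolve simply hands the host name to the operating system through ToSocketAddrs from the standard library.

```rust
use std::io;
use std::net::{SocketAddr, ToSocketAddrs};

/// Split a URL into (scheme, host, path).
/// Naive on purpose: no ports, no query strings, no user info.
fn split_url(url: &str) -> Option<(&str, &str, &str)> {
    let (scheme, rest) = url.split_once("://")?;
    let (host, path) = match rest.find('/') {
        Some(i) => (&rest[..i], &rest[i..]),
        None => (rest, "/"), // no path given means the root path
    };
    if scheme.is_empty() || host.is_empty() {
        return None;
    }
    Some((scheme, host, path))
}

/// Ask the OS resolver for the addresses behind a host name.
/// Whether this hits a cache or a DNS server is up to the OS.
fn resolve(host: &str, port: u16) -> io::Result<Vec<SocketAddr>> {
    Ok((host, port).to_socket_addrs()?.collect())
}
```

For https://google.com, split_url yields the scheme https, the host google.com and the path /. Anything without a :// (like a plain search query) comes back as None.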

Security

Well, with the IP address obtained, the browser can open a TCP connection to the server and start making HTTP requests to it. There’s a security layer sitting in between TCP and HTTP called SSL or, in its improved modern form, TLS.
Without it, anyone on the path, including your internet provider, could read the contents of your HTTP traffic in plain text.

There are also many other important security mechanisms, like the same-origin policy and CORS, which control when a web page may access resources from a different domain, but that’s another topic.

Resources

When visiting a website, the browser starts with a single GET request to retrieve the main HTML file. I guess we all know what an HTML file looks like… There’s a head part and a body part.

We are particularly interested in the head section, as it contains various tags that tell the browser about additional resources required to fully and accurately render the page.
(E.g. the link tag, mostly for CSS and favicons, and the script tag with a src attribute for JS.)

Header Contents

Those extra resources are also requested by the browser in the form of HTTP requests, so it’s important to understand that the server can’t push any data to the client without the client asking!
(the newer versions of HTTP somewhat allow this operation)

Rendering

This is the pipeline your browser goes through to render everything from the requested HTML and CSS.

Rendering Pipeline

Also, here’s an awesome post by Mozilla if you want to learn more.

HTTP Breakdown

Now that you have gained a little bit of knowledge about networking (or perhaps you already had a good understanding of it), we can look in detail at how HTTP works.

Here’s the flow:

A TCP connection between client and server is opened.
HTTP data is transferred.
The connection is kept alive for possible future requests.

HTTP (up to version 1.1, at least) is a text-based protocol - it’s just a string!

HTTP Request
^^^
This is a request made by a browser.

Request line and request headers are just separated by \r\n, so the actual request looks like this:
GET /doc/test.html HTTP/1.1\r\nHost:...

Okay, on the first line you can notice the GET keyword. It’s the request method, a way of telling the server what the request is meant to do:

GET usually means we’re telling the server that we want to GET something, whether it be HTML, CSS, JS or images - doesn’t matter.
POST means we’re sending data to the server (could be login information, an image upload, etc.)

This isn’t a strict “rule”, so it depends on how a web server is written. Some websites do retrieve data with a POST request 🤦

Next comes the path where the resource is located, in this case /doc/test.html.
And finally the HTTP version. Here’s a good article on the differences between versions.

And that’s enough of what is required to make a legit HTTP request to a server!
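
Since a request is just a string, building one by hand is straightforward. Here’s a minimal sketch in Rust with a made-up helper name. The Connection: close header is my own addition to keep the example simple, and the host and path are whatever you pass in.

```rust
/// Build a minimal HTTP/1.1 GET request as plain text.
fn build_get_request(host: &str, path: &str) -> String {
    let mut request = String::new();
    // Request line: method, path, version.
    request.push_str(&format!("GET {} HTTP/1.1\r\n", path));
    // HTTP/1.1 requires a Host header.
    request.push_str(&format!("Host: {}\r\n", host));
    // Ask the server to close the connection after responding.
    request.push_str("Connection: close\r\n");
    // An empty line marks the end of the headers.
    request.push_str("\r\n");
    request
}
```

Calling build_get_request("example.com", "/doc/test.html") gives a string you could write straight into a TCP connection to that host on port 80.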

Writing a web server

You probably SHOULDN’T write a web server from scratch as there are so fucking many servers and server frameworks out there, just to name a few:

Apache HTTP Server, Nginx, Lighttpd, Cherokee, Caddy, Tomcat, Glassfish, WildFly, Node.js, Ruby on Rails, Django, Flask, Express.js, Litespeed, Yaws, Cowboy, Gunicorn, Hiawatha, Swarm and more.

Just use whichever you find the best for you.

But for learning purposes, let’s write one more!
To be honest, this won’t be a complete web server with multithreading, routing, authorization and the other goodies that the servers listed above provide.

So we’re writing a simple program that will listen for incoming connections, parse HTTP and return an HTML file.
Let’s do it in Rust, why not.

Create a socket

Most programming languages provide some kind of low-level API to create sockets, send data through them, etc. In Python there’s socket from the standard library; in Rust we have TcpListener from std::net.

So, let’s make it usable in our code as “TcpListener”:

use std::net::TcpListener;

TcpListener provides a bind method, which returns a new TcpListener bound to an address:

fn main() {
    let listener = TcpListener::bind("localhost:5550").unwrap();
}

Accepting new connections (clients)

As we can see in the docs, TcpListener has an incoming method which returns an Incoming iterator, so we can iterate over it to get incoming connections! Isn’t that awesome!?

for stream in listener.incoming() {
    let stream = stream.unwrap();
}

Great, we have established a bidirectional communication channel with the client that is both readable and writable.
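
To see the “both readable and writable” part in action without a browser, here’s a self-contained sketch that plays both roles in one program: a tiny echo server on a background thread and a client talking to it. The function name is made up, and port 0 just means “let the OS pick a free port”.

```rust
use std::io::{self, Read, Write};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::thread;

/// Start a one-shot echo server on a free local port, connect to it,
/// send `message`, and return what the server sent back.
fn echo_once(message: &[u8]) -> io::Result<Vec<u8>> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    // Server side: accept one connection, read everything, write it back.
    let server = thread::spawn(move || -> io::Result<()> {
        let (mut stream, _peer) = listener.accept()?;
        let mut received = Vec::new();
        stream.read_to_end(&mut received)?; // ends when the client stops writing
        stream.write_all(&received)
    });

    // Client side: the same kind of stream, used in the other direction.
    let mut client = TcpStream::connect(addr)?;
    client.write_all(message)?;
    client.shutdown(Shutdown::Write)?; // signal "I'm done sending"
    let mut reply = Vec::new();
    client.read_to_end(&mut reply)?; // ends when the server closes its end
    server.join().expect("server thread panicked")?;
    Ok(reply)
}
```

Both ends call read and write on the same stream object, which is all “bidirectional” really means here.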

Parsing HTTP

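As soon as a connection is opened, we need to read the HTTP request sent by the browser.
There are several ways to do it. I’ll use BufReader, which wraps a stream and provides more convenient reading methods, such as read_line and read_until. It lives in std::io, and read_line needs the BufRead trait in scope, so add use std::io::{prelude::*, BufReader}; at the top of the file.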

for stream in listener.incoming() {
    let mut stream = stream.unwrap(); // mut: BufReader borrows the stream mutably

    let reader = BufReader::new(&mut stream);
}

As we can see, it accepts a mutable reference to the stream we opened, so the stream binding has to be declared with let mut. Now we have a beautiful API to read those bytes >:)

To parse HTTP, the request has to be split into multiple lines, as I explained before. One parameter == one line…
We only care about the first line as it contains the path we need.

for stream in listener.incoming() {
    let mut stream = stream.unwrap();

    let mut reader = BufReader::new(&mut stream); // read_line needs a mutable reader
    
    let mut request_line = String::new();
    reader.read_line(&mut request_line).unwrap();
}
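
If you want the individual pieces of that first line rather than the raw string, a small helper does the job. This is a sketch with a made-up name. It only checks that there are exactly three space-separated parts and that the last one looks like an HTTP version.

```rust
/// Split a request line such as "GET /doc/test.html HTTP/1.1\r\n"
/// into (method, path, version). Returns None if it doesn't look right.
fn parse_request_line(line: &str) -> Option<(&str, &str, &str)> {
    // trim_end removes the trailing "\r\n" left by read_line.
    let mut parts = line.trim_end().split(' ');
    let method = parts.next()?;
    let path = parts.next()?;
    let version = parts.next()?;
    let malformed = parts.next().is_some()
        || method.is_empty()
        || path.is_empty()
        || !version.starts_with("HTTP/");
    if malformed {
        return None;
    }
    Some((method, path, version))
}
```

For the request line a browser sends to our server, this returns the method GET, the path / and the version HTTP/1.1.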

Let’s print request_line to stdout and try to connect to our “server” with a browser :D

println!("Request line: {:?}", request_line);

Aaaaand then type localhost:5550 in the browser to connect:

Request line: "GET / HTTP/1.1\r\n"
Request line: "GET / HTTP/1.1\r\n"
Request line: "GET / HTTP/1.1\r\n"
Request line: "GET / HTTP/1.1\r\n"
Request line: "GET / HTTP/1.1\r\n"
Request line: "GET / HTTP/1.1\r\n"
Request line: "GET / HTTP/1.1\r\n"

Yay!
Because we send nothing in return and just drop the connection, the browser keeps retrying the request until it gives up.
That’s the reason why we see multiple outputs.

So let’s fix that.

Returning appropriate HTML file

“The browser is like a hungry beast, anxiously pacing back and forth, eagerly awaiting its next meal in the form of a response.”

Uhm, alright. Time to write some HTML. Let’s make two files - index.html and 404.html.
Later we’ll check the path the browser requested: if it is /, then index.html will be returned, otherwise 404.html.

<!DOCTYPE html>
<html>
  <head>
    <title>A Simple HTML Page</title>
  </head>
  <body>
    <p>Hello World!</p>
  </body>
</html>

For 404.html, just put a 404 message or whatever into the <p>….

Let’s decide which file to send. That would be a simple if statement in our case.

let mut response = String::new(); //to be composed, will be sent later into stream

if request_line.trim() == "GET / HTTP/1.1" {
    // send HTTP with status code 200 and index.html file as HTTP body
} else {
    // status code 404 and 404.html file
}

.trim() is here to strip the CRLF, turning GET / HTTP/1.1\r\n into GET / HTTP/1.1.
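
Comparing the whole line works for this demo, but it’s a bit brittle: a client sending HTTP/1.0 would land on the 404 page even though it asked for /. A slightly more forgiving sketch looks only at the method and the path. The route function is a hypothetical helper, not part of the server code in this post.

```rust
/// Decide the status line and the file to serve for a request line.
fn route(request_line: &str) -> (&'static str, &'static str) {
    // split_whitespace also swallows the trailing "\r\n".
    let mut parts = request_line.split_whitespace();
    match (parts.next(), parts.next()) {
        (Some("GET"), Some("/")) => ("HTTP/1.1 200 OK", "./index.html"),
        _ => ("HTTP/1.1 404 NOT FOUND", "./404.html"),
    }
}
```

The returned pair can then be used to fill in the status line and to pick the file to read.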

Ok, let’s now compose a legit HTTP response!

Response structure:

The first line should be the HTTP version followed by the status code and a short reason phrase
Below that are the HTTP headers, which we don’t need for now
And after those, separated by an empty line, as you might have guessed, comes the HTTP body.
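
The three parts can be put together by one small function. This is a sketch with a made-up name. The two headers in it (Content-Type and Content-Length) are my own addition to show where headers go; the server in this post works in browsers without them.

```rust
/// Assemble an HTTP/1.1 response from a status code, reason phrase and body.
fn build_response(status: u16, reason: &str, body: &str) -> String {
    let mut response = String::new();
    // Status line: version, status code, reason phrase.
    response.push_str(&format!("HTTP/1.1 {} {}\r\n", status, reason));
    // Headers: one per line, each ending with CRLF.
    response.push_str("Content-Type: text/html; charset=utf-8\r\n");
    response.push_str(&format!("Content-Length: {}\r\n", body.len())); // length in bytes
    // An empty line separates the headers from the body.
    response.push_str("\r\n");
    response.push_str(body);
    response
}
```

For example, build_response(404, "Not Found", "<p>Nope</p>") produces a complete response that can be written to the stream as bytes.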

HTTP Response

Status line:

response.push_str("HTTP/1.1 200 OK\r\n");

HTTP header and body separator:

response.push_str("\r\n");

And index.html contents:

response.push_str(
    &fs::read_to_string("./index.html").unwrap()
);

It should look like this:

HTTP/1.1 200 OK

<!DOCTYPE html>
<html>
  <head>
    <title>A Simple HTML Page</title>
  </head>
  <body>
    <p>Hello World!</p>
  </body>
</html>

Finally, let’s write the response to the stream!!!

stream.write_all(response.as_bytes()).unwrap();

For the “Not Found” page it’s almost the same:

response.push_str("HTTP/1.1 404 NOT FOUND\r\n");
response.push_str("\r\n");
response.push_str(
    &fs::read_to_string("./404.html").unwrap()
);

Here’s the full working code:

use std::{
    io::{prelude::*, BufReader},
    net::{TcpListener}, fs,
};

fn main() {
    let listener = TcpListener::bind("127.0.0.1:5550").unwrap();

    for stream in listener.incoming() {
        let mut stream = stream.unwrap();

        let mut reader = BufReader::new(&mut stream);

        let mut request_line = String::new();
        reader.read_line(&mut request_line).unwrap();

        let mut response = String::new();

        if request_line.trim() == "GET / HTTP/1.1" {
            response.push_str("HTTP/1.1 200 OK\r\n");
            response.push_str("\r\n");
            response.push_str(
                &fs::read_to_string("./index.html").unwrap()
            );
        } else {
            response.push_str("HTTP/1.1 404 NOT FOUND\r\n");
            response.push_str("\r\n");
            response.push_str(
                &fs::read_to_string("./404.html").unwrap()
            );
        }

        stream.write_all(response.as_bytes()).unwrap();
    }
}

And that’s pretty much it.

Output

The end

In conclusion, understanding the fundamentals of networking and computer communication is crucial for anyone looking to work in the field of technology. By exploring the basics of routing, packets, and network protocols, you have gained a valuable foundation of knowledge that will serve as the basis for further learning and development.

ALSO, the process of writing your own HTTP server can seem daunting, but by learning about socket programming, HTTP request and response structures, and the handling of client-server communication, you have gained a deeper understanding of the mechanics behind the World Wide Web and the internet.

I hope it was an interesting read! Bye.