Today we're going to discuss the browser and how it works. It's important to have a good understanding of how browsers work if you're going to be a web developer. You need to know how a particular browser is going to interpret your code, how it's going to talk to the server that gives it your code, and how to debug your code when things go wrong. Luckily in 2019 the tools available to help you are numerous and feature-rich.
Note: Throughout this article I'm mostly going to be using Firefox as my browser example. However, the tools available in Firefox are practically identical in any modern browser. Chrome and Edge both have great developer tools that are pretty much on par with Firefox. I choose to primarily use Firefox simply because it has a smaller memory footprint on my computer and they tend to be quick to implement new developer features.
How does the internet work?
To learn how a browser works we first need to understand how the internet works from a high level. Most people already have some idea of how it works but let's touch on it briefly.
You're probably familiar with "The Cloud". It's an ominous sounding term but all it really means is "a computer somewhere else". The internet technically isn't really a thing on its own. It's just the term we give to the largest network in the world, but really there are tons of networks. Your house is probably a tiny network called a Local Area Network (LAN). Your computers, phones, tablets, and probably many other devices all talk to a router or gateway of some kind in your home. Either they are connected via a cable known as an Ethernet cable or they are connected via Wifi. Then from there your home router is connected to a server provided by your Internet Service Provider (ISP).
So how do all these devices know where to send data? Every device on a network has an address. You've probably heard the term IP Address. I'm not going to turn this into a networking tutorial, but suffice it to say every device is assigned an IP address. All the devices on your network get their IP addresses from your router. This is how the router tells them all apart. When your devices make requests to servers on the internet the router makes note of where that request came from on your home network and ensures any replies to that request are routed accordingly back to the device that is expecting the reply.
Your devices get IP addresses from your router, but your router gets its own IP address as well. That address is assigned by servers controlled by your ISP. This is what you pay them for. They maintain the servers that your home network talks to. Requests from your network are routed through your router, then through a server or servers owned by your ISP, then it's forwarded out to the public internet. Your request might bounce through several servers as it travels networks all over the globe until it reaches its destination. Take Facebook, for example. They have servers waiting to receive requests. They take in the request and then lots of things happen on their end to fulfill that request. If you were logging into Facebook then they'd forward your request to some authentication service on their own internal network. If you were sending a Facebook messenger message then that request would be forwarded to a different internal server to handle that request. You get the idea.
Once they've retrieved all the relevant internal data that is required to fulfill your request their servers will send a response. The response bounces around the same network path the request traveled, all the way until it's back at your ISP, who sends the response to your router, which sends that response to the device that made the request. It's kind of crazy when you really think about it: just how far your data travels to do something trivial. And it all happens in the blink of an eye. Okay, maybe longer if your internet is slow :P
Okay but how do browsers work?
The requests we talked about above have to start somewhere. That somewhere is a program on your computer or other device. Any program can make a network request, but a browser is especially designed to do so. When you use an app on your phone, that app is tailored to the service it's provided by. The Amazon app, for example, is tailored to allow you to browse products on Amazon. It knows exactly when to make requests to their servers so it can provide you with what you need. A browser is not tailored to a specific service. It needs to be able to handle all kinds of requests to all sorts of services.
Note: The truth is that today many apps/programs are just browsers wrapped in a simplified interface. Developers do this because it allows them to build a website that can be both loaded into a browser as well as stuffed into a customized container which most people know as an "app". Apps are nice because they are easier to access than opening a browser and typing a web address or accessing a bookmark. When you're using an app it doesn't show you an address bar or other browser features. The app can just focus on being a good interface for one single service without the user being confused.
A browser works by making a request to some server, receiving the response, and then processing that response. The processing of the response is where the magic happens. That's the part of the process that turns simple text into a working web application in your browser. When you put facebook.com into your address bar and press enter, your browser immediately reaches out to Facebook servers to ask for data. Facebook servers then respond with what is essentially a simple text file. It's too big to paste below but you can see the actual text response if you follow this link.
Right now that file should look like nonsense to you. Truth be told it is mostly nonsense, even to me. The reason is that we are not seeing the code in the form it was written. It has been condensed to remove whitespace and other data that only humans need. Computers don't care if your code is all on one line and super hard to read, so developers remove all that helpful stuff when putting their code on live servers. It helps reduce network lag because there is less data to send over the internet.
Even though we can't comprehend the response from Facebook.com, the browser can. It parses the HTML file and uses the code there as a road map for constructing a web page. The code instructs the browser on how to render elements on the page and style them properly so they look as the author intended them to. The file also contains instructions that tell the browser to make additional requests to the Facebook servers in order to fetch more data. Images for example. The HTML file sent down by the server won't contain any images themselves; it will just contain URLs to where an image lives on the internet. The browser knows to follow those URLs and make new requests to go get them.
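To make that a little more concrete, here's a toy JavaScript sketch of the kind of work the browser does after parsing: scanning HTML text for image URLs, each of which becomes its own follow-up request. The HTML and the example.com URLs are made up, and a real browser builds a full DOM tree rather than using a regular expression, so treat this purely as an illustration.

```javascript
// A toy sketch of what a browser does after parsing HTML: it scans the
// document for resources (like images) and queues a new request for each.
// Real browsers build a DOM tree; this regex version is only illustrative.
function extractImageUrls(html) {
  const urls = [];
  const imgTag = /<img[^>]+src="([^"]+)"/g;
  let match;
  while ((match = imgTag.exec(html)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// Made-up page: the HTML contains only the *addresses* of the images.
const page =
  '<html><body>' +
  '<img src="https://example.com/logo.png">' +
  '<img src="https://example.com/photo.jpg">' +
  '</body></html>';

// Each of these URLs would trigger its own new request to a server.
console.log(extractImageUrls(page));
```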
So how can we learn more about what a browser is doing?
It's time to introduce you to the developer tools (aka devtools)! Every modern browser has devtools but accessing them can vary from browser to browser. In Microsoft Edge you simply hit F12. In Chrome and Firefox you use ctrl + shift + I. If all else fails, devtools are always somehow accessible via the browser's own menu.
Once you have the browser's devtools open you should see a window pop up. Usually this window is docked to the bottom of the browser window.
Note: I set my devtools to default settings for the purposes of the above screenshot. I prefer darker themes when I'm programming as they are easier on my eyes. Further screenshots will show the same tools but with the dark theme enabled. If you also prefer a dark theme you can change that in the devtools settings by clicking the three dots in the top right corner of the devtools window.
There are a ton of useful features inside here. We'll touch on a lot of them but we aren't going to dive into anything in super high detail yet. Later as we learn to write code we'll be able to take better advantage of the tools provided to us. That said, you are free to play around inside the browser devtools as much as you like. Which brings me to my first point.
For the rest of this article we'll go over all the important devtools features one by one. I'm only going to be giving a brief overview of each feature. We will be diving into these tools much more deeply in the future. We will spend some additional time on the network tab though because I think that will help with understanding how the browser works when making requests to different servers.
The inspector tab is the most commonly used feature in devtools. It shows you the HTML of the page, but more than that it shows you a parsed view of the HTML. In other words, it turns the HTML document into an interactive model that updates in real time. If some code on the page updates the HTML you will see that reflected in the inspector. As you hover over items in the HTML you'll notice that different areas of the page become highlighted to show you which parts of the page that HTML element controls.
In addition to HTML markup you can also see all CSS applied to individual elements. The inspector is an amazing tool for debugging CSS because it can be hard to picture in your head exactly how certain HTML styles are going to look. When you run into problems in CSS it can be hard to know where you went wrong without a tool like this. With devtools you can see what the browser sees. If your CSS rule is being overridden by another CSS rule, the inspector can show you that.
The console is where all the errors go. If something went wrong while the browser was parsing your page then it will report it in the console. This includes things like not being able to load an image or a script. It also shows errors that occur in a running script. This view will come in handy as we learn to code.
We're going to skip over a few tabs and jump right to the storage tab. This view shows all the information your browser is storing for the page you're currently on. This includes cookies and local storage. There are other types of storage available in the browser as well but they are outside the scope of this quick example.
Cookies are bits of data that your browser stores on your behalf. It associates the data with a specific website and sends that data along with every request made to that website. Have you ever clicked a "remember me" checkbox when logging into a website? Doing that sets a cookie on your machine with a special ID number. When the server sees that cookie the next time you navigate to their website, it will know to log you right in and use that unique ID to find your account information. Not ticking the "remember me" box does the same thing except the cookie is set to expire after you close the browser so that it won't remember you the next time you go there.
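Here's a small JavaScript sketch of how a server might read a Cookie header, which on the wire is just name=value pairs in plain text. The cookie names and the session ID value below are invented for illustration.

```javascript
// Cookies travel as plain text in a header. A "Cookie" request header is
// just name=value pairs separated by "; ". This sketch parses one into an
// object the way a server might. The names and ID below are made up.
function parseCookies(header) {
  const cookies = {};
  for (const pair of header.split("; ")) {
    const eq = pair.indexOf("=");
    cookies[pair.slice(0, eq)] = pair.slice(eq + 1);
  }
  return cookies;
}

const jar = parseCookies("session_id=8f3a2c; remember_me=true");

// The server can now look this ID up to find your account information.
console.log(jar.session_id);
```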
In concept there is nothing inherently malicious about a cookie. It's just a useful piece of data a website can tell your browser to store, so that the next time you go to the website it can remember something about you. You've probably heard the term "tracking cookie" before. There's nothing special about a tracking cookie; it's just a cookie like any other. It becomes a "tracking cookie" when the server uses it to track things about you. You can think of it like a store putting a sticker on you as you walk in. The sticker just says some random number, but that random number is actually the number of a row on a spreadsheet they have somewhere. Whenever you walk into the store with your sticker on they can look you up on their spreadsheet. Depending on what they have on their spreadsheet they can probably tell a few things about you, such as what you bought last time, how long you spent in the store, etc. The more you shop there with your sticker on, the more data they can gather about you and your tastes.
It's already a subject of debate just how much information a company should know about you. The real moral quandary comes into play when that store you shop at sells their spreadsheet of data about you to other stores so those stores know stuff about you too. In our fictitious scenario they only know very public things about you and not much about your true identity. The most they could do is simply try to improve your shopping experience by tailoring things to your preferences. The digital world is more complicated though. Facebook and other sites do often know your true identity and more.
The network tab is the place we're going to spend the most time today. We're going to look at the anatomy of requests made by the browser, as well as examine the responses sent back from the server. The first thing to do after opening the network tab is refresh the page you're on. The devtools do not track requests made before opening the devtools window. In order to see all requests made during the loading of a page we must refresh the page while leaving devtools open. If you're on chevtek.io when you do that you should see the network tab fill up with requests similar to below.
When we load the page the browser first makes a single request to the URL you put in your address bar. You can find that request in the network tab by looking in the "Type" column for "HTML". The "File" column should also reflect the end of the URL that you put into the address bar. In this case, I was on the chevtek.io home page so the "File" that was requested is a simple slash (/).
Ghost is responsible for returning the HTML for the site. All the other requests made were prompted by the HTML document. Once the browser parsed that initial request the HTML contained instructions to make additional requests. Let's take a peek inside that initial request and see what it looks like.
One of the first things you'll see in the request details window is the Request URL. In this case I was on the chevtek.io homepage so the URL is literally just https://chevtek.io. The other thing you'll notice is the Request Method, which is set to GET. There are a handful of request methods but the big two are GET and POST.
The major difference between the two methods is that POST contains what is called a "request body" while GET does not. GET requests are meant for fetching data. A simple request is made to some URL and the server listening at that URL returns a response. In our example request above that's all that happened. POST requests are meant for sending data. For example, if you fill out a sign-up form on a website all the data you entered into the form fields will be sent to some URL as a POST request, along with a request body containing the data you entered.
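Since an HTTP request is ultimately just structured text, we can sketch what the browser actually sends over the wire. This JavaScript builds simplified raw request text for a GET and a POST so you can see the one big difference: the POST carries a body after a blank line. The /signup path and form fields are made up, and real requests include many more headers than this.

```javascript
// Builds a simplified raw HTTP request as text. Real requests carry more
// headers; this only shows the skeleton. Path and form data are made up.
function buildRequest(method, host, path, body) {
  const lines = [
    `${method} ${path} HTTP/1.1`,
    `Host: ${host}`,
  ];
  if (body) {
    lines.push(`Content-Length: ${body.length}`);
    lines.push("", body); // a blank line separates headers from the body
  } else {
    lines.push(""); // a GET ends after its headers: no body at all
  }
  return lines.join("\r\n");
}

const get = buildRequest("GET", "chevtek.io", "/");
const post = buildRequest("POST", "chevtek.io", "/signup",
  "username=chev&password=secret");

console.log(get);
console.log(post);
```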
There are other request methods as well, such as PUT, DELETE, and OPTIONS. OPTIONS are requests you don't traditionally make yourself from code, but PUT and DELETE are virtually identical to POST except they exist to provide a different semantic meaning. Semantically, POST requests are meant for creating data. Our sign-up form, for example, would likely be used to create a new account record for you in the website's database.
PUT is meant for modifying data. So filling out an "edit profile" form would generate a PUT request with a request payload containing the data you entered; then that data would be used to modify your existing account record. With that understanding, DELETE should be pretty obvious. That request method is used to signify that the request is intended to delete data on the server.
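If it helps, the four common methods line up with the four basic things you do to data, often called CRUD (create, read, update, delete). A quick sketch of that mapping:

```javascript
// The conventional mapping between data operations (CRUD) and HTTP
// request methods. The examples in the comments echo the ones above.
const methodFor = {
  create: "POST",   // e.g. submitting a sign-up form
  read:   "GET",    // e.g. loading a profile page
  update: "PUT",    // e.g. saving an "edit profile" form
  delete: "DELETE", // e.g. removing an account
};

console.log(methodFor.update);
```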
Many developers and programs do not respect the semantic differences between the various request methods. Oftentimes developers will stick to using only GET and POST, ignoring the rest. There's nothing technically wrong with that other than the fact that it's a bit lazy. New developers coming into a project will appreciate it if you take the time to craft requests that respect those semantics.
The next section of the request details window shows "Response Headers", but collapse that section for now and scroll down to "Request Headers". They put response headers first because, as you start developing, those will traditionally be what you concern yourself with the most during debugging. For our purposes though, we want to look at the anatomy of a request before we study the response.
Both requests and their subsequent responses contain headers. Headers are simple name/value pairs just like cookies which we described earlier in this article. They are used primarily to describe different aspects about the request. The headers in the above screenshot are formatted neatly and sorted alphabetically for easy viewing, but if we click "Raw headers" in the top right we can see the text version of our headers in the traditional order.
Host: chevtek.io
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Firefox/68.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://chevtek.io/p/20003bbe-92a3-48ea-8624-26ce82569346/
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Cache-Control: max-age=0
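Because headers really are just lines of "Name: value" text, parsing them is straightforward. Here's a JavaScript sketch that turns a raw header block like the one above into name/value pairs, roughly the way an HTTP library might:

```javascript
// Parses a raw block of header text into name/value pairs. Splitting on
// the *first* colon matters: values like "rv:68.0" contain colons too.
function parseHeaders(raw) {
  const headers = {};
  for (const line of raw.trim().split("\n")) {
    const colon = line.indexOf(":");
    headers[line.slice(0, colon)] = line.slice(colon + 1).trim();
  }
  return headers;
}

// A few of the real headers from the request above.
const raw =
  "Host: chevtek.io\n" +
  "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Firefox/68.0\n" +
  "Accept-Language: en-US,en;q=0.5";

const headers = parseHeaders(raw);
console.log(headers.Host);
```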
First we have Host. All requests must contain a Host header in order to be a proper HTTP request. It tells the server what website to serve up. You may wonder why we need the Host header when the request is already being made to a full URL such as https://chevtek.io/. The reason is that when you make a request to a domain name, that domain name gets resolved to an IP address that points to an actual server somewhere. You can use an online tool such as this one to type in a domain name and resolve it to an IP address.
Once that domain name is resolved to an IP address, the domain name or host name is lost. The server running at that IP address will receive the request, but if that server hosts multiple websites at different domains then it won't know which website to actually return data for. That's what the Host header is for. The server will be able to read that header and return data for that host name.
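Here's a tiny JavaScript sketch of that idea, often called virtual hosting: one server at one IP address holding several sites and using the Host header to pick between them. The site names and their content are made up.

```javascript
// A sketch of virtual hosting: one server, several sites, and the Host
// header deciding which one to serve. Sites and content are made up.
const sites = {
  "chevtek.io": "<h1>Chevtek</h1>",
  "example.com": "<h1>Example</h1>",
};

function handleRequest(headers) {
  // Without the Host header the server has no idea which site you wanted.
  return sites[headers.Host] ?? "404 Not Found";
}

console.log(handleRequest({ Host: "chevtek.io" }));
console.log(handleRequest({ Host: "unknown.net" }));
```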
The next header on our list is User-Agent. In my example request to the Chevtek homepage my user agent is set to Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Firefox/68.0. This giant string is unique to my specific browser. I made the request with Firefox on 64-bit Windows 10. The string will change from browser to browser and operating system to operating system. These days this string is rarely used for anything other than analytics. For example, I use Google Analytics on this blog and it logs all sorts of information about visitors that come here. Visitors' user agent strings allow me to look through that data and see how many people are using specific browsers.
You can see above that Chrome is far and away the most popular browser. My analytics is only able to know which browsers my visitors are using thanks to that user agent string.
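As a rough illustration, here's how an analytics tool might bucket visitors by browser using that string. Real user agent parsing is much messier than this (Chrome's string contains "Safari", for example, which is why the order of checks matters below), so treat it as a toy:

```javascript
// Naive browser detection from a User-Agent string. Order matters:
// Chrome's UA also contains "Safari", and Edge's contains "Chrome",
// so we check the more specific tokens first. A toy, not a real parser.
function browserFrom(userAgent) {
  if (userAgent.includes("Firefox/")) return "Firefox";
  if (userAgent.includes("Edg/")) return "Edge";
  if (userAgent.includes("Chrome/")) return "Chrome";
  if (userAgent.includes("Safari/")) return "Safari";
  return "Other";
}

// The actual User-Agent from the request in this article.
const ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Firefox/68.0";
console.log(browserFrom(ua));
```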
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
I don't want to spend too much time on these, but briefly: Accept tells the server what kind of response it is expecting to get; Accept-Language tells the server what language to return content in, in case the server is localized to return different content for different languages; and Accept-Encoding tells the server what kinds of compression protocols the browser making the request understands. This article is already super long, so if you'd like to learn more about those headers I encourage you to look them up :)
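Those ;q=0.5 bits, by the way, are "quality values": weights from 0 to 1 that rank the browser's preferences, with a missing q defaulting to 1. A small JavaScript sketch of how a server might use them to order the languages in an Accept-Language header:

```javascript
// Parses an Accept-Language header and orders the languages by their
// q-value weights. A missing q means q=1, the highest preference.
function preferredLanguages(header) {
  return header
    .split(",")
    .map((part) => {
      const [lang, q] = part.trim().split(";q=");
      return { lang, q: q ? parseFloat(q) : 1 };
    })
    .sort((a, b) => b.q - a.q)
    .map((entry) => entry.lang);
}

// The value from the request in this article: en-US is preferred,
// generic English is an acceptable fallback.
console.log(preferredLanguages("en-US,en;q=0.5"));
```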
The last one I'm going to go over is Referer. The remaining headers are outside the scope of this article, but you are welcome to look those up as well. The Referer header is something that your browser tracks for you as you browse. As you navigate from page to page your browser will remember the previous page you came from and will send that page's URL with the new request.
This allows the server to see where a visitor was "referred" from. For example, you may have noticed that in my sample request to the Chevtek homepage the Referer header is set to https://chevtek.io/p/20003bbe-92a3-48ea-8624-26ce82569346/. That URL is the address of this very article I'm writing before I publish it (once published it will be https://chevtek.io/part-6-how-do-browsers-work). This is because I navigated from this post to the home page in order to generate the request I used for this article. Firefox accurately kept track of the page I came from and sent it along to the server as the "referring address".
There is a lot to know when it comes to browsers, but we managed to cover a fair amount. I hope this introduction to browsers, devtools, and the anatomy of requests has given you some insight into how things work. It's perfectly fine if a lot of this went over your head. As I've said in previous posts, you're not meant to understand everything we're covering in these initial stages. Things like browser devtools will become more and more familiar as you use them when writing your actual code.