Introduction to HTTP Caching
You have probably noticed that a website takes a while to load the first time you open it, but loads much faster when you open it again later. Imagine how slow browsing would feel if a site took the same time to load on every visit. This speed-up is thanks to a nifty idea called HTTP caching. Let’s take a look at what it is, how it works, and how it makes browsing the web a speedier experience.
What is HTTP Caching?
Caching is the idea of storing commonly or frequently used data somewhere that is quick to access. With this, there is a good chance that the data you need most can be retrieved much faster, because the computer doesn’t have to reach as far to get it. HTTP caching applies this idea to the responses a web server sends.
In the case of web browsing, caching happens when your web browser, such as Chrome, stores a copy of a page’s resources (HTML, images, scripts, stylesheets) on local storage. Once a resource is cached, the browser doesn’t have to download it from the server again, which makes browsing a lot faster.
Cache Headers in HTTP
HTTP caching has two major cache headers: the first is “Cache-Control” and the second is “Expires”. Let’s take a look at both:
Cache-Control
You can think of Cache-Control as the main control for how a response is cached in the user’s browser. Adding this header gives every supporting browser explicit caching rules. If the header is not present, browsers do not simply skip caching: they may fall back to the Expires header or to a heuristic based on Last-Modified, so caching behavior becomes harder to predict.
Cache-Control has two directives that control who may cache a response: public and private.
With public, the response can be stored by any cache, including intermediate proxies such as Content Delivery Networks (CDNs).
With private, the response is intended for a single user: the browser may cache it, but intermediate proxies must not.
The “max-age” directive in the Cache-Control header sets how long the content stays cached. This time is given in seconds.
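To make the directives above concrete, here is a minimal Python sketch of how a cache might read a Cache-Control value. The function names are hypothetical, and the parser is deliberately simplified: it assumes directive values contain no commas, which a full parser would have to handle.

```python
def parse_cache_control(header_value: str) -> dict:
    """Split a Cache-Control value into a {directive: value-or-None} dict."""
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without "=" (public, private, no-store) map to None.
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def freshness_lifetime(header_value: str):
    """Return max-age in seconds, or None when the header does not set one."""
    max_age = parse_cache_control(header_value).get("max-age")
    return int(max_age) if max_age is not None and max_age.isdigit() else None


def shared_cache_may_store(header_value: str) -> bool:
    """A proxy or CDN must not store a response marked private or no-store."""
    directives = parse_cache_control(header_value)
    return "private" not in directives and "no-store" not in directives
```

For example, `freshness_lifetime("max-age=3600, public")` gives one hour in seconds, while `shared_cache_may_store("private, max-age=60")` reports that a CDN should leave the response alone.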
Expires
The Expires header is the older of the two and acts as a fallback: when Cache-Control with max-age is also present, max-age takes precedence and Expires is ignored. It is a simple HTTP cache header that sets a date after which a cached resource is considered stale. Once the cached copy has expired and the user loads the website, the browser requests the content from the server again.
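As a rough illustration of how a cache might evaluate an Expires value, the sketch below parses the HTTP date with Python’s standard library and compares it with the current time. The function name `is_expired` is hypothetical, and treating an unparseable date as already expired is an assumption that mirrors how caches commonly handle values such as “0”.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_expired(expires_value: str, now: datetime) -> bool:
    """Return True when the Expires date has passed (or cannot be parsed)."""
    try:
        expires_at = parsedate_to_datetime(expires_value)
    except (TypeError, ValueError):
        # Invalid dates are treated as "already expired".
        return True
    if expires_at.tzinfo is None:
        # Assume UTC when the parsed date carries no timezone information.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return now >= expires_at
```

A caller would normally pass `datetime.now(timezone.utc)` as `now`. Taking it as a parameter keeps the check easy to test.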
Conditional Requests
The headers discussed above tell the browser when to retrieve data from the web server. Conditional requests, on the other hand, tell it how: they let the browser ask the server whether the copy in its cache is outdated.
In this process, the browser sends some information about the resource it has cached, and the server uses that information to decide whether the cached copy is outdated.
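The decision a browser makes before each request can be sketched roughly as follows. This is a simplified model, not how any particular browser is implemented: the `CachedResponse` type and `plan_request` function are hypothetical, and the request headers it builds (If-None-Match and If-Modified-Since) are the standard ones used to carry cached validators back to the server.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedResponse:
    stored_at: float                     # time the copy was stored (epoch seconds)
    max_age: int                         # freshness lifetime in seconds
    etag: Optional[str] = None           # validator from the ETag header
    last_modified: Optional[str] = None  # validator from Last-Modified


def plan_request(entry: Optional[CachedResponse], now: float):
    """Return (action, extra_request_headers) for the next request."""
    if entry is None:
        return "fetch", {}            # nothing cached: plain request
    if now - entry.stored_at < entry.max_age:
        return "use_cache", {}        # still fresh: no network needed
    headers = {}
    if entry.etag:
        headers["If-None-Match"] = entry.etag
    if entry.last_modified:
        headers["If-Modified-Since"] = entry.last_modified
    # Stale with a validator: ask the server whether the copy is still good.
    return ("revalidate" if headers else "fetch"), headers
```

A stale entry with no validator at all has nothing to send, so the sketch falls back to a plain fetch.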
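In time-based requests, the server checks whether the requested resource has changed since the browser cached it. If the cached copy is still the latest one, the server returns status code 304 (Not Modified) with no body, and the browser reuses its copy.
To enable time-based conditional requests, send “Last-Modified” in the response header. The browser sends the same date back in an “If-Modified-Since” request header when it revalidates.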
Last-Modified: Sun, 08 Jul 2018 15:25:00 GMT
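On the server side, the time-based check might look something like the sketch below. The function name is hypothetical. The comparison is made at one-second resolution because HTTP dates carry no fractions of a second, and falling back to a full 200 response when the date cannot be parsed is an assumption chosen to err on the safe side.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def status_for_time_check(if_modified_since, last_modified: datetime) -> int:
    """Return 304 when the client's copy is current, otherwise 200."""
    if not if_modified_since:
        return 200  # unconditional request: send the full response
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unreadable date: ignore the condition
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # Drop microseconds so both sides are compared at one-second resolution.
    unchanged = last_modified.replace(microsecond=0) <= since
    return 304 if unchanged else 200
```

Here `last_modified` is expected to be a timezone-aware datetime, for example one built from the file’s modification time.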
In content-based requests, a fingerprint of the content, such as an MD5 hash (or any other viable option), is compared between the server copy and the cached copy. If the content is the same, the fingerprints match and the server returns 304; if the content differs, they do not match and the server sends a fresh copy of the resource.
This is done via the “ETag” response header, whose value is typically a digest of the resource. The browser sends the stored value back in an “If-None-Match” request header.
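The content-based check can be sketched in a few lines of Python. Note that this example uses SHA-256 rather than the MD5 hash mentioned above; any stable digest works, since the browser treats the ETag as an opaque string. The function names are hypothetical, and the comparison ignores the `W/` weak-validator prefix, which is the usual rule for If-None-Match.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Build a quoted ETag value from a digest of the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


def _strip_weak(tag: str) -> str:
    """Remove the W/ prefix that marks a weak validator."""
    return tag[2:] if tag.startswith("W/") else tag


def status_for_etag_check(if_none_match, current_etag: str) -> int:
    """Return 304 when any ETag sent by the client matches the current one."""
    if not if_none_match:
        return 200  # unconditional request
    if if_none_match.strip() == "*":
        return 304  # "*" matches any existing representation
    current = _strip_weak(current_etag)
    candidates = [tag.strip() for tag in if_none_match.split(",")]
    matched = any(_strip_weak(tag) == current for tag in candidates)
    return 304 if matched else 200
```

Because the ETag is derived from the body, any change to the content produces a different value and the next conditional request receives the full response.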
Almost all modern browsers include developer tools that let you inspect resources, source code, and other aspects of a web page. Among them is a tool that shows the headers returned for any request.
In Google Chrome, to see these headers, right-click any empty area of a web page and click “Inspect”, or press CTRL+SHIFT+I to open DevTools. Then click the Network tab, press CTRL+R to reload, and select a request to see its headers.
Use Cases in HTTP Caching
Below are some common use cases of HTTP caching:
For Static Assets
For static assets of a page, such as images, JS files, and CSS files, you can opt to cache the contents aggressively. Not having to reload these files results in a substantial performance improvement. For this use case, use the Cache-Control header with a max-age of a month or even a year.
For Dynamic Contents
For the dynamic content of a page, you will need to decide for yourself which files the browser should cache and for how long. If the content changes frequently, make sure the cache duration you pick does not leave users looking at outdated data.
Caching of Private Content
As discussed in the Cache-Control section, if the content of a page is private in nature, you can prevent it from being cached by intermediate proxies such as CDNs by adding “Cache-Control: private” to the response headers.
A safer approach is to not cache private content at all, for example by sending “Cache-Control: no-store”.
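The use cases above can be summarized as a small policy function. This is only a sketch under stated assumptions: the category names (“static”, “dynamic”, “private”, “sensitive”) and the five-minute default for short-lived content are invented for illustration, and real sites will want their own values.

```python
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # 31536000


def cache_control_for(kind: str, short_max_age: int = 300) -> str:
    """Pick a Cache-Control value for a (hypothetical) content category."""
    if kind == "static":
        # Images, JS, CSS: cache aggressively, including on CDNs.
        return f"public, max-age={ONE_YEAR_SECONDS}"
    if kind == "dynamic":
        # Frequently changing content: keep the lifetime short.
        return f"max-age={short_max_age}"
    if kind == "private":
        # Per-user content: browser only, never intermediate proxies.
        return f"private, max-age={short_max_age}"
    if kind == "sensitive":
        # Content that should not be cached anywhere.
        return "no-store"
    raise ValueError(f"unknown content kind: {kind!r}")
```

Keeping the policy in one place makes it easier to review how long each kind of content may live in a cache.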
Implementing HTTP Caching
Now that you know what HTTP caching is and how it works, let’s look at how you can implement it on your website. The implementation differs between server types. Here, let us take a look at implementing caching via the .htaccess file used by Apache servers.
To enable caching on your site, you can add the headers in the .htaccess file on your server (the Header directive requires Apache’s mod_headers module). For example:
<FilesMatch "\.(pdf|flv|jpg)$">
Header set Cache-Control "max-age=31536000, public"
</FilesMatch>
The above caches pdf, flv, and jpg files, along with any other formats you add to the FilesMatch pattern, for one year (31536000 seconds).
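Since the implementation differs between server types, here is a rough Python equivalent of the same rule using the standard library’s `http.server`. It is a sketch for local experimentation rather than a production setup. The names `cache_header_for`, `CachingHandler`, and `serve` are hypothetical, and the extension list only covers the formats named above. Unlike an Apache file match, this version also adds the header to error responses for matching paths.

```python
import re
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

# File types that get the one-year policy (extend the pattern as needed).
LONG_LIVED = re.compile(r"\.(pdf|flv|jpg)$")
ONE_YEAR = "max-age=31536000, public"


def cache_header_for(path: str):
    """Return the Cache-Control value for a request path, or None."""
    clean_path = path.split("?", 1)[0]  # ignore any query string
    return ONE_YEAR if LONG_LIVED.search(clean_path) else None


class CachingHandler(SimpleHTTPRequestHandler):
    """Serves files from the current directory with caching headers."""

    def end_headers(self):
        value = cache_header_for(self.path)
        if value:
            self.send_header("Cache-Control", value)
        super().end_headers()


def serve(port: int = 8000) -> None:
    """Start the server on localhost; blocks until interrupted."""
    ThreadingHTTPServer(("127.0.0.1", port), CachingHandler).serve_forever()
```

Calling `serve()` and then opening a matching file in the browser should show the Cache-Control header in the Network tab of DevTools.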
HTTP caching is one of the most important techniques for making your site faster for your visitors. Now that you have seen how it works, you can implement it on your sites and web apps to speed them up for your users and to save server bandwidth.