Static site generation and CDN: brothers and rivals?
Have you thought about using a static site generator or a CDN (Content Delivery Network) for your next website? If you are not familiar with these concepts, or simply haven’t given them much thought, take a few minutes to discover them, compare how they work and see how they can be used as part of a Drupal project. One of them could be just the right choice for your future website.
What is static site generation? A throwback to the past?
If you are picturing the earliest websites, these were indeed static websites, with no PHP execution on the server side, no search functions, no AJAX… Well, let’s just say they were pretty basic! They were nothing but manually written HTML code and CSS for the page layout. This was simply placed on a web server to make the content available and display it in a web browser. So, is today’s interest in static site generation some kind of retro fashion or hankering after a golden age? Or does this approach hold a fantastic secret?
Does this mean we want to return to the past and teach contributors to use HTML and CSS?
No, this is not the aim! CMSs are excellent tools for rich, structured content editing, including workflows, media management and more. What interests us in particular here is generating a static website from content authored in Drupal. This involves “saving” the HTML that Drupal generates when the website is displayed, once the content is complete. Only this carbon copy of the website is put online, leaving the Drupal system in the shadows, accessible to nobody but your editorial team.
In terms of technology, several tools are available:
The Tome module
The Tome module is fully integrated into the Drupal interface. It generates the carbon copy of the website using the Drupal render engine, which leaves your usual technical team entirely free to modify the theme it has already built. This module also has the advantage of connecting to hosting services such as AWS, GitPages and Netlify, so that deploying a new version of your carbon copy is straightforward.
The following solutions are all based on static generation tools that are not specific to Drupal, but which can be connected to it through modules.
Gatsby
Here, the idea is to use Drupal only as a content source and to leave the final rendering to Gatsby. We therefore leave our Drupal theme, as well as any associated features, completely to one side. The interface between the two tools is provided by Drupal’s API layer, while page formatting is performed by Gatsby. However, as Drupal is not limited to content alone, the API will also need to be used to retrieve blocks, views or menus, in order to rebuild the site’s navigation. Here again, a Drupal module simplifies the integration of the two tools and the hosting. Gatsby also documents its integration with the various CMSs on its website.
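To make the “Drupal as a content source” idea concrete, here is a minimal sketch in Python (not Gatsby itself) that turns an API response into HTML pages. The sample payload is hypothetical and heavily trimmed; it only imitates the general shape of a document returned by Drupal’s JSON:API module, and the function name `render_articles` is an assumption made for this illustration.

```python
import json
from html import escape

# Hypothetical, trimmed-down payload imitating the shape of a JSON:API
# collection of article nodes. A real response carries many more fields.
SAMPLE_RESPONSE = json.dumps({
    "data": [
        {
            "type": "node--article",
            "id": "a1",
            "attributes": {
                "title": "Hello & welcome",
                "body": {"processed": "<p>First post.</p>"},
            },
        }
    ]
})


def render_articles(raw_json: str) -> dict[str, str]:
    """Turn a JSON:API-style document into {filename: html} pairs."""
    document = json.loads(raw_json)
    pages = {}
    for item in document["data"]:
        attrs = item["attributes"]
        title = escape(attrs["title"])  # titles are plain text, so escape them
        # Assumption: "processed" already holds filtered HTML, inserted as-is.
        body = attrs["body"]["processed"]
        pages[f"{item['id']}.html"] = (
            f"<html><head><title>{title}</title></head>"
            f"<body><h1>{title}</h1>{body}</body></html>"
        )
    return pages


if __name__ == "__main__":
    for filename, html in render_articles(SAMPLE_RESPONSE).items():
        print(filename, html)
```

The Drupal theme plays no part here: all the markup comes from the template in the rendering tool, which is exactly the trade-off described above.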
HTTrack and wget
One last solution is available to us, which is very easy to imagine, but not quite so easy to use day-to-day: generating the carbon copy with HTTrack or wget, run on the server side. These two tools are totally agnostic about the technology used, and they offer no integration with your hosting or with the administration pages of your preferred CMS.
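One detail any mirroring approach has to settle is how page URLs become files on disk. The sketch below shows one common convention, in which extensionless URLs become directories containing an `index.html`. It is an illustration only, not necessarily the layout HTTrack or wget produce, and the function name `url_to_file_path` is hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def url_to_file_path(url: str) -> str:
    """Map a page URL to the relative file path a mirror could write.

    Query strings are ignored in this sketch.
    """
    path = urlsplit(url).path
    if path == "" or path.endswith("/"):
        # The home page and "directory" URLs get an index file.
        path += "index.html"
    elif PurePosixPath(path).suffix == "":
        # Clean URLs such as /about become about/index.html.
        path += "/index.html"
    return path.lstrip("/")


if __name__ == "__main__":
    for url in ("https://example.com/", "https://example.com/about",
                "https://example.com/css/site.css"):
        print(url, "->", url_to_file_path(url))
```

With this layout, a plain web server can serve `/about` without any rewrite rules beyond its default directory index.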
A miracle solution?
No, unfortunately! You should rather see static site generation as an additional tool in the development of your communication strategy and, as with any tool, you need to know what its strengths are:
Fast page loading
The page is served immediately by the web server, with no need for PHP processing, no database requests... Nothing! It’s sort of like the early days of the web, but with more style. This enables visitors to access content faster. From a page rendering perspective, it is similar to the way a Varnish server works: pages that have already been generated are kept in a cache, so that they can be served without PHP having to build them again.
Low hosting cost
Because an Apache-type web server is the only thing needed to run the website, the project’s infrastructure cost will be tiny. The Drupal project still has to run somewhere though, whether as a local installation, in the case of a single contributor, or on a server, which can be very small, as it will have no visitor load to handle.
Enhanced security
Because Drupal, PHP and MySQL, which are generally the weak points in a project, will not be publicly accessible, it will be difficult for a malicious user to find a flaw. Additionally, if the back office is not public, security updates for these tools become less critical, insofar as a missed update does not put the published website in danger. However, if you plan to develop your project over time, then you should still pay attention to them!
Less risky website updates that can be rolled back easily
Because website updates are limited to replacing HTML files with other HTML files, the risk of a script executing incorrectly in the production environment disappears, and rolling back changes is very simple. Keeping the generated files in a Git repository can simplify these processes further.
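One way to picture such a rollback is to keep each generated version in its own directory and let the web server follow a `current` link. The sketch below assumes a POSIX system (symbolic links and atomic renames); the directory layout and the function names `publish` and `activate` are hypothetical, chosen only for this illustration.

```python
import os
import tempfile
from pathlib import Path


def activate(root: Path, release: str) -> None:
    """Point the 'current' symlink at a release. Rolling back is the same call."""
    tmp_link = root / "current.tmp"
    if tmp_link.is_symlink():
        tmp_link.unlink()
    os.symlink(root / "releases" / release, tmp_link)
    # Renaming over the old link swaps versions in one step on POSIX.
    os.replace(tmp_link, root / "current")


def publish(root: Path, release: str, pages: dict[str, str]) -> None:
    """Write a release into its own directory, then make it live."""
    release_dir = root / "releases" / release
    release_dir.mkdir(parents=True)
    for name, html in pages.items():
        (release_dir / name).write_text(html, encoding="utf-8")
    activate(root, release)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        site = Path(tmp)
        publish(site, "v1", {"index.html": "<h1>Version 1</h1>"})
        publish(site, "v2", {"index.html": "<h1>Version 2</h1>"})
        activate(site, "v1")  # roll back to the previous version
        print((site / "current" / "index.html").read_text(encoding="utf-8"))
```

Because older releases stay on disk untouched, going back never requires regenerating anything.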
So, what about the weak points?
Forget about personalization
More technical “final” publishing of content
As we have seen, content writing and publishing are necessarily carried out in two stages, and the environment in which we write is not necessarily the one in which the audience reads the content. In short, there are a few more factors to take into account, and they are more complicated to manage than simply pressing the “Save” button on your content. We have also seen that a good deal of the workflow can be automated, depending on the tools you select. These steps do require a degree of attention though, depending on the level of technical knowledge of the team that will run your project.
Which projects lend themselves to a static website?
While producing a static website offers clear benefits, it will not be the right solution for all projects. It is a good choice if you are producing a “conventional” corporate website, a personal website to display your CV, a one-page website for a specific event, or a blog. However, if your aim is to create a sports betting website or the next Facebook, you will need to look at other solutions!
If your project lies somewhere between these two extremes, I would suggest thinking about a hybrid solution based on pages that are generated statically to cover non-authenticated users, and then switching to pages generated dynamically by Drupal for the sections dedicated to authenticated users.
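The routing decision behind such a hybrid setup can be sketched in a few lines. The example assumes that logged-in Drupal visitors carry a session cookie whose name starts with `SESS` or `SSESS`, and that account pages live under `/user`; check both against your own installation. The function names are hypothetical, and in practice this logic would usually live in the web server or CDN configuration rather than in Python.

```python
def has_drupal_session(cookie_header: str) -> bool:
    """Return True if the Cookie header contains a Drupal-style session cookie."""
    for part in cookie_header.split(";"):
        name = part.strip().split("=", 1)[0]
        # Assumption: session cookie names start with SESS (HTTP) or SSESS (HTTPS).
        if name.startswith(("SESS", "SSESS")):
            return True
    return False


def choose_backend(path: str, cookie_header: str = "") -> str:
    """Pick 'static' for anonymous page views and 'drupal' for everything else."""
    is_account_path = path == "/user" or path.startswith("/user/")
    if is_account_path or has_drupal_session(cookie_header):
        return "drupal"
    return "static"


if __name__ == "__main__":
    print(choose_backend("/about"))
    print(choose_backend("/about", "SSESSabc123=xyz; theme=dark"))
    print(choose_backend("/user/login"))
```

Login pages are sent to Drupal even for anonymous visitors, since that is where a session gets created in the first place.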
How does a CDN work?
A CDN (Content Delivery Network) can be pictured as a network of access points to your website. When you have set up a CDN and a visitor wants to consult your website, their request will be made automatically via the closest access point. Two scenarios can be envisaged:
A visitor requests a page via an access point for the first time: in this case, the CDN requests the page from the main server (the one where you perform your deployments), serves it to the visitor and, above all, keeps a copy in its cache.
A page has already been requested via an access point: the CDN serves the requested page directly from its cache, without having to communicate with the main server.
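These two scenarios correspond to what is usually called a cache miss and a cache hit. The following toy model of a single access point is only a sketch: the class name `EdgeCache` and the `fetch_from_origin` callback are hypothetical, and a real CDN also handles expiry, headers and much more.

```python
from typing import Callable


class EdgeCache:
    """Toy model of one CDN access point with an in-memory page cache."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch = fetch_from_origin
        self._store: dict[str, str] = {}

    def get(self, path: str) -> tuple[str, str]:
        """Return (body, 'HIT' or 'MISS') for the requested path."""
        if path in self._store:
            # Second scenario: answer from the cache, the main server is not contacted.
            return self._store[path], "HIT"
        # First scenario: ask the main server, then keep a copy for later visitors.
        body = self._fetch(path)
        self._store[path] = body
        return body, "MISS"


if __name__ == "__main__":
    edge = EdgeCache(lambda path: f"<h1>Page {path}</h1>")
    print(edge.get("/about")[1])
    print(edge.get("/about")[1])
```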
The value of a CDN lies in the various access points spread around the globe. It offers the huge advantage of physically placing content as close to the visitor as possible, in order to save precious display time, which we all look for when we are browsing. Transferring data between San Francisco and Paris, for example, will obviously take longer than between San Francisco and Los Angeles.
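A back-of-the-envelope calculation gives an idea of the difference. The sketch below computes great-circle distances and a theoretical lower bound on round-trip time, assuming signals travel at roughly 200 km per millisecond in optical fiber and follow the shortest path. Both are simplifying assumptions, and the city coordinates are approximate; real latencies are higher.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
FIBER_KM_PER_MS = 200.0  # assumption: roughly two thirds of the speed of light

# Approximate (latitude, longitude) pairs in degrees.
SAN_FRANCISCO = (37.7749, -122.4194)
LOS_ANGELES = (34.0522, -118.2437)
PARIS = (48.8566, 2.3522)


def great_circle_km(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Haversine distance between two (lat, lon) points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def min_round_trip_ms(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Lower bound on round-trip time: there and back along the great circle."""
    return 2 * great_circle_km(a, b) / FIBER_KM_PER_MS


if __name__ == "__main__":
    for name, city in (("Los Angeles", LOS_ANGELES), ("Paris", PARIS)):
        print(f"San Francisco <-> {name}: "
              f"{great_circle_km(SAN_FRANCISCO, city):.0f} km, "
              f"at least {min_round_trip_ms(SAN_FRANCISCO, city):.1f} ms per round trip")
```

Since loading a page involves several round trips, a difference of this order per exchange adds up quickly.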
So, what happens when we modify content on the website? The art of using a CDN is all about managing these updates. Bear in mind that when a web server generates a page, it attaches a TTL (Time To Live), typically through HTTP caching headers such as Cache-Control, which states how long the information on the page remains valid. Based on this information, the CDN knows whether the content it is caching is still relevant or not. If it is not, it simply queries the origin server again to obtain the updated information.
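As an illustration of how a cache might read that TTL, the sketch below extracts it from a `Cache-Control` header value and checks whether a stored copy is still fresh. It is deliberately simplified: it only looks at `max-age` and `s-maxage` (the latter taking precedence for shared caches such as a CDN) and ignores the other rules of HTTP caching. The function names are hypothetical.

```python
from typing import Optional


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Return the TTL in seconds from a Cache-Control value, or None if absent."""
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        directives[name.lower()] = value.strip('"')
    # s-maxage applies to shared caches and overrides max-age for them.
    for key in ("s-maxage", "max-age"):
        if directives.get(key, "").isdigit():
            return int(directives[key])
    return None


def is_fresh(stored_at: float, ttl: int, now: float) -> bool:
    """A cached copy is fresh while its age is strictly below the TTL."""
    return now - stored_at < ttl


if __name__ == "__main__":
    ttl = max_age_seconds("public, max-age=86400")  # one day
    print(ttl, is_fresh(stored_at=0, ttl=ttl, now=3600))
```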
It is obviously difficult to know in advance when you will need to update your content and, if your TTL has been set to one day, it may be difficult to explain that a correction to an error in your content will not appear until tomorrow. This is why it is possible to invalidate all or part of the content cached by the CDN.
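Conceptually, an invalidation just removes matching entries so that the next request goes back to the main server. The sketch below models the cache as a plain dictionary and purges by path prefix. Real CDNs expose this through their own APIs or dashboards, with their own matching rules, so the function `purge` is purely illustrative.

```python
def purge(cache: dict[str, str], prefix: str = "/") -> list[str]:
    """Remove every cached path starting with prefix and return what was purged.

    The default prefix "/" matches every path, i.e. a full purge.
    """
    purged = sorted(path for path in cache if path.startswith(prefix))
    for path in purged:
        del cache[path]
    return purged


if __name__ == "__main__":
    cached = {"/blog/a": "...", "/blog/b": "...", "/about": "..."}
    print(purge(cached, "/blog/"))  # partial invalidation
    print(purge(cached))            # full invalidation of what is left
```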
It is also important to remember that the CDN can be configured to cache only a specific type of data, for example images and CSS stylesheets, and that you must configure paths that are not to be cached by the CDN, such as your administration pages and visitor account management pages.
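Such rules boil down to a small decision function. The sketch below caches only images and stylesheets and always bypasses administration and account paths. The extension list and the `/admin` and `/user` prefixes are assumptions based on a typical Drupal site, and in practice these rules are expressed in the CDN’s own configuration rather than in code.

```python
from pathlib import PurePosixPath

# Assumption: cache stylesheets and common image formats only.
CACHEABLE_EXTENSIONS = {".css", ".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp"}
# Assumption: typical Drupal administration and account path prefixes.
BYPASS_PREFIXES = ("/admin", "/user")


def should_cache(path: str) -> bool:
    """Decide whether the CDN may cache the response for this path."""
    for prefix in BYPASS_PREFIXES:
        if path == prefix or path.startswith(prefix + "/"):
            return False  # never cache back-office or account pages
    return PurePosixPath(path).suffix.lower() in CACHEABLE_EXTENSIONS


if __name__ == "__main__":
    for p in ("/themes/site/style.css", "/admin/content", "/user/42/edit", "/about"):
        print(p, should_cache(p))
```

Checking the bypass list first means an image under `/admin` is still excluded, which errs on the side of not caching.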
With this in mind, we can see that the CDN itself creates carbon copies of the website (or part of the website) at each of the access points around the world. This both makes the website faster and lightens the load on the server, which has fewer requests to answer and fewer pages to generate.
Is a static site generator unnecessary if you have a CDN?
Of course I cannot give you such a black and white answer!
If you have a global audience, a CDN will always reduce the distance between your content and visitors, with or without a static site generator. The two solutions are by no means conflicting, so there is no need to choose between the two.
You could even envisage using a CDN for images, documents and stylesheets only (as these are the largest things to send), while the HTML pages are still served by the origin server. The two solutions offer complementary benefits and you should choose your tools based on your project’s environment. A CDN can be easily set up for various types of projects and rapidly save you precious seconds. A static site generator, on the other hand, will ensure maximum security at low infrastructure costs, but with an approach that differs from the usual content editing tools.