Nolan Nicholson

Static Website Compilation

Last time, I discussed some of the challenges of rendering code in a modern-looking webpage, using helpful features like syntax highlighting. The most immediately available solution, a JavaScript library from Google, yielded good-looking code, but it required a server call to Google and some JavaScript execution for a result that we know will be the same every time - and that would therefore be more efficient just to bake into the page HTML, if we can. As implemented, it also required us to put our code snippets in the HTML page itself, making it difficult to develop and test the code.

This, it turns out, is symptomatic of a broader issue: there is a lot of "work" on HTML that is useful to automate. Formatting code is just one example. Page templating is another: it lets much-reused components like headers and navigation bars live in one place instead of being copied and pasted into every page.

This task of programmatically modifying HTML is usually what JavaScript is for. But using JavaScript for these tasks means performing them every single time a user loads your page. If your page is viewed n times, then formatting the HTML once beforehand costs O(1) work, whereas formatting the page with JavaScript costs O(n). The n executions just happen on other people's computers.
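
To make that counting argument concrete, here is a minimal sketch. The CountingFormatter class and the two view_* functions are hypothetical names invented for illustration, and html.escape merely stands in for a real formatting step such as syntax highlighting:

```python
import html


class CountingFormatter:
    """Hypothetical stand-in for an expensive formatting step."""

    def __init__(self):
        self.calls = 0

    def format(self, source):
        # Count every invocation so the two strategies can be compared
        self.calls += 1
        return "<pre>" + html.escape(source) + "</pre>"


def view_prebuilt(formatter, source, n_views):
    # Build-time approach: format once, then hand out the same page
    page = formatter.format(source)
    return [page for _ in range(n_views)]


def view_formatting_each_time(formatter, source, n_views):
    # Load-time approach: every view repeats the formatting work
    return [formatter.format(source) for _ in range(n_views)]


build_time = CountingFormatter()
per_view = CountingFormatter()
prebuilt_views = view_prebuilt(build_time, "if x < y: pass", 1000)
live_views = view_formatting_each_time(per_view, "if x < y: pass", 1000)
```

Both strategies deliver identical pages to the reader; the only difference is how many times the formatting step runs.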

So, I've just finished reorganizing my website to use Jinja for automatic template compiling, and to perform some other automatic formatting (namely, syntax highlighting) through Python in the process. Jinja is a templating engine that generates HTML from templates. It is most commonly used from within the web framework Flask, to generate HTML for dynamic websites. But that doesn't mean it can't be used for static websites too! In this case, I use the Flask app only to obtain an application context, inside which I can call Flask's render_template() function and save its result to a file:

with app.app_context():
    # Write the Pygments stylesheet used by the highlighted snippets
    with open("highlighting.css", 'w') as f:
        f.write(HtmlFormatter().get_style_defs('.highlight'))

    # Render individual pages (but not the base template)
    for filename in os.listdir('templates'):
        if filename.endswith('.html') and filename != 'base.html':
            print("Compiling template {}...".format(filename))

            # Link back home (for pages other than index.html)
            link_home = filename != 'index.html'

            rendered_html = render_template(
                    filename, link_home=link_home)
            with open(filename, 'w') as f:
                f.write(rendered_html)
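
For comparison, the same idea can be sketched with the jinja2 package on its own, without a Flask app. This is not what my script does; it is an illustration under a few assumptions: the three in-memory templates and the render_pages() helper are hypothetical, and a real site would more likely load its templates from disk.

```python
from jinja2 import DictLoader, Environment

# Hypothetical in-memory templates standing in for the templates directory
templates = {
    "base.html": (
        "<html><body>"
        "{% if link_home %}<a href='index.html'>Home</a>{% endif %}"
        "{% block content %}{% endblock %}"
        "</body></html>"
    ),
    "index.html": (
        "{% extends 'base.html' %}"
        "{% block content %}<h1>Welcome</h1>{% endblock %}"
    ),
    "about.html": (
        "{% extends 'base.html' %}"
        "{% block content %}<h1>About</h1>{% endblock %}"
    ),
}

env = Environment(loader=DictLoader(templates))


def render_pages(env, names):
    """Render every page template except the shared base template."""
    pages = {}
    for name in names:
        if name.endswith(".html") and name != "base.html":
            # Link back home on every page other than index.html
            link_home = name != "index.html"
            pages[name] = env.get_template(name).render(link_home=link_home)
    return pages


pages = render_pages(env, env.list_templates())
```

Here pages maps each output filename to its finished HTML, which could then be written to disk just as in the loop above.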

Using Python to bake the HTML beforehand also has the side benefit of letting you load text from external files into your HTML - a task that is harder than one might think to do at page-load time, since it generally requires either server-side code (such as PHP) or an AJAX request. Here, we can simply load it using Python and bake it right into the rendered HTML:

def code_snippet(snippet_filename, n_start=1, n_end=0):
    # Open the code snippet and trim it to the desired lines
    with open('templates/code_snippets/' + snippet_filename, 'r') as f:
        lines = f.read().split('\n')
    n_end = len(lines) if n_end == 0 else n_end
    text = '\n'.join(lines[n_start-1:n_end])

    # Produce the highlight-formatted HTML
    hl_fmt = HtmlFormatter(linenos="table", linenostart=n_start)
    highlit_text = highlight(text,
            get_lexer_for_filename(snippet_filename),
            hl_fmt)
    return "<div class='codeholder'>\n" + highlit_text + "\n</div>"
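
The line-range convention is the part most likely to cause an off-by-one error, so here it is in isolation. This is a sketch only: trim_lines() is a hypothetical helper that copies the slicing from code_snippet() and leaves out the Pygments step.

```python
def trim_lines(text, n_start=1, n_end=0):
    """Keep lines n_start..n_end (1-based, inclusive); n_end=0 means 'to the end'."""
    lines = text.split('\n')
    n_end = len(lines) if n_end == 0 else n_end
    # Convert the 1-based start to a 0-based index; the slice end is exclusive,
    # so using n_end directly keeps line n_end in the result
    return '\n'.join(lines[n_start - 1:n_end])


sample = "line 1\nline 2\nline 3\nline 4"
middle = trim_lines(sample, 2, 3)
tail = trim_lines(sample, 3)
everything = trim_lines(sample)
```

With this sample, middle holds lines 2 and 3, tail holds lines 3 and 4, and everything is the unchanged input. code_snippet() applies the same slicing before handing the text to Pygments.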

With that defined, including external code in the template is as simple as:

{{ code_snippet('compile.py', 9, 22) | safe }}
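
The | safe filter matters because autoescaping is normally on for .html templates, so any HTML returned by a function would otherwise be escaped and shown as literal text. The sketch below demonstrates this with the jinja2 package directly; fake_snippet() and the two template names are hypothetical stand-ins, not part of my site.

```python
from jinja2 import DictLoader, Environment, select_autoescape


def fake_snippet(name):
    # Hypothetical stand-in for code_snippet(): returns a small piece of HTML
    return "<div class='codeholder'>" + name + "</div>"


env = Environment(
    loader=DictLoader({
        "escaped.html": "{{ fake_snippet('compile.py') }}",
        "trusted.html": "{{ fake_snippet('compile.py') | safe }}",
    }),
    # Turn on autoescaping for templates whose names end in .html
    autoescape=select_autoescape(["html"]),
)

# Make the function callable from inside templates
env.globals.update(fake_snippet=fake_snippet)

escaped = env.get_template("escaped.html").render()
trusted = env.get_template("trusted.html").render()
```

Without the filter, the angle brackets come out as &lt; and &gt; entities; with it, the markup passes through untouched. That is only appropriate here because the HTML is generated by the build script itself rather than taken from user input.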

Below is the full script that performs the templating. All of the page templates are stored in the directory templates, and they all extend a single template, base.html. Syntax highlighting is performed using the Python library Pygments.

import os
from flask import Flask, render_template
from pygments import highlight
from pygments.lexers import get_lexer_for_filename
from pygments.formatters import HtmlFormatter

app = Flask(__name__)

def code_snippet(snippet_filename, n_start=1, n_end=0):
    # Open the code snippet and trim it to the desired lines
    with open('templates/code_snippets/' + snippet_filename, 'r') as f:
        lines = f.read().split('\n')
    n_end = len(lines) if n_end == 0 else n_end
    text = '\n'.join(lines[n_start-1:n_end])

    # Produce the highlight-formatted HTML
    hl_fmt = HtmlFormatter(linenos="table", linenostart=n_start)
    highlit_text = highlight(text,
            get_lexer_for_filename(snippet_filename),
            hl_fmt)
    return "<div class='codeholder'>\n" + highlit_text + "\n</div>"

# Add code_snippet() to Jinja2's globals
# so it can be called from templates
app.jinja_env.globals.update(code_snippet=code_snippet)

with app.app_context():
    # Write the Pygments stylesheet used by the highlighted snippets
    with open("highlighting.css", 'w') as f:
        f.write(HtmlFormatter().get_style_defs('.highlight'))

    # Render individual pages (but not the base template)
    for filename in os.listdir('templates'):
        if filename.endswith('.html') and filename != 'base.html':
            print("Compiling template {}...".format(filename))

            # Link back home (for pages other than index.html)
            link_home = filename != 'index.html'

            rendered_html = render_template(
                    filename, link_home=link_home)
            with open(filename, 'w') as f:
                f.write(rendered_html)

Writing and hands-on development are done on the templates; then, a call to python3 compile.py puts together the "real" HTML files. I run this script manually, but it could easily be worked into a CI workflow and run automatically.
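
One way an automated check might decide whether the rendered pages are up to date is to compare file modification times. The following is a rough sketch under stated assumptions, not something my script does: stale_templates() is a hypothetical helper, it assumes the rendered pages share their templates' filenames, and it treats a change to base.html as making every page stale.

```python
import os


def stale_templates(template_dir, output_dir):
    """List page templates whose rendered output is missing or out of date."""
    base = os.path.join(template_dir, "base.html")
    base_mtime = os.path.getmtime(base) if os.path.exists(base) else 0.0

    stale = []
    for name in sorted(os.listdir(template_dir)):
        if not name.endswith(".html") or name == "base.html":
            continue
        source = os.path.join(template_dir, name)
        output = os.path.join(output_dir, name)
        # A page depends on its own template and on the shared base template
        newest_input = max(os.path.getmtime(source), base_mtime)
        if not os.path.exists(output) or os.path.getmtime(output) < newest_input:
            stale.append(name)
    return stale
```

A CI job could call stale_templates('templates', '.') and fail, or re-run the compile script, whenever the returned list is non-empty. Note that this sketch ignores other inputs, such as the files under code_snippets.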

Fun fact: as of right now, the templates are still accessible on this site. This page, for example, is made from two templates: base.html (for the formatting and objects common to all the pages on this site) and pagerendering.html (for the content specific to this page). But they are unrendered templates - so don't expect anything pretty!

Overall, the result is quite maintainable while still allowing a huge degree of freedom: if you want to, you can still write a page's HTML completely from scratch, with no templating, and just pass it through. It also performs all of the "busy work" just once, instead of delegating it to JavaScript on the reader's side on every single page load. This idea isn't specific to purely static websites, either; it could be used to lessen the redundant work done by a web server, or simply to reduce load times by trimming some JavaScript fat.

I hope this is helpful! I made it in part because I didn't see a solution out there for avoiding some of the busy work of static website composition while still staying close to simple, clean HTML. (That second part disqualified full-fledged CMSs like WordPress and Medium.) But if there is such a solution I missed, please drop me a line - I'd love to know about it, and I'd still be happy to have gotten the experience of putting this together.

Edit: After I posted this, a reader named Vitaly Potyarkin reached out to tell me about Pelican, a more full-featured static site generator written in Python. Pelican does everything my solution above does and more: it can accept input in reStructuredText or Markdown in addition to HTML, it can perform syntax highlighting through Pygments, it handles metadata and localization (topics I didn't even touch on here), and its architecture is open enough that you can easily extend its functionality to handle anything it doesn't natively cover. I'd likely recommend it over what I showed above - unless you are extremely interested in rolling a minimal solution yourself.

