Tell browsers to cache HTML content, but let them pick up updates immediately

Suppose we're building a mostly static website — let's say a blog. We want to make the best of whatever caching layers sit between the server and the reader's eyeballs; maybe the server is running on the author's home network and they want to make the most of Cloudflare's cache. But we also want to allow the author to modify the website at their leisure and have readers pick up the changes immediately.

For external resources referenced from an HTML document, such as a stylesheet, the typical way to achieve this is to serve each version of the resource from its own endpoint, instruct clients to cache that endpoint, and link to it from every HTML document that includes the resource. When the stylesheet is modified, serve the new version from a different endpoint and update all HTML documents to refer to the new endpoint.
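A minimal sketch of the versioned-endpoint idea, assuming a Node.js build step and a SHA-256 content hash. The helper names `versionedPath` and `stylesheetLink` and the `/blob/` prefix are illustrative assumptions, not an established API.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper: derive a URL path that changes whenever the content changes.
function versionedPath(content, extension) {
  const digest = createHash("sha256").update(content).digest("hex");
  return `/blob/${digest}.${extension}`;
}

// Hypothetical helper: the tag an HTML document would use to reference the current version.
function stylesheetLink(cssText) {
  return `<link rel="stylesheet" href="${versionedPath(cssText, "css")}">`;
}
```

Because the path is derived from the content, editing the stylesheet yields a new path automatically, and the old path can stay cached without ever serving stale data.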

That's great for external resources, but how do we cache the HTML document itself?

Approach 1: Using redirects

Serve each version of the HTML document at a unique endpoint, maybe content-addressed, and respond to requests for the document's canonical path with 302 Found redirects to the current version.
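As a rough sketch, the redirect could be computed like this. The function name `redirectToCurrent`, the lookup map, and the choice of `Cache-Control: no-cache` on the redirect itself are assumptions made for illustration.

```javascript
// Hypothetical lookup table: canonical path -> path of the current version.
const currentVersions = new Map([
  ["/page1.html", "/blob/36a333faa933f67db4fef10f03c9bc92dc85b9ee7b7ae5fc82ae706440fe0cfa.html"],
]);

// Build a plain description of the response for a request to a canonical path.
function redirectToCurrent(canonicalPath, versions) {
  const target = versions.get(canonicalPath);
  if (target === undefined) {
    return { status: 404, headers: {} };
  }
  return {
    status: 302,
    headers: {
      Location: target,
      // Assumption: ask caches to revalidate the redirect so a new version is picked up at once.
      "Cache-Control": "no-cache",
    },
  };
}
```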

Problem: The reader sees the URL for the current version of the page in their address bar. This URL is likely machine-generated and ugly. More importantly, if they copy the URL and link to it somewhere else, the link points to that specific version of the page, which might be outdated or even unavailable in the future.

Approach 1.1: Use window.history.replaceState()

Same as above, but have the "current version" endpoints include a script that uses window.history.replaceState() to change what's displayed in the browser's address bar back to the canonical path for the page.
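The script in each versioned page might look roughly like the following. `restoreCanonicalUrl` is a hypothetical helper, and the canonical path is assumed to be written into the page by the server.

```javascript
// Hypothetical helper: show the canonical path in the address bar without navigating.
// The second parameter exists only so the function can be exercised outside a browser.
function restoreCanonicalUrl(canonicalPath, historyObject = window.history) {
  // Keep whatever state the current entry already carries, and replace only its URL.
  historyObject.replaceState(historyObject.state, "", canonicalPath);
}
```

In a versioned copy of page 1, this would be called as `restoreCanonicalUrl("/page1.html")`. The new URL has to be same-origin with the current document, which holds here because both paths are served from the same site.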

Problem: Same problem as above, but only for users with JavaScript disabled. It's also unusual behavior that could cause confusion, and it could break pages that use the History API themselves (unlikely for a blog, but I'd like to find the best general approach).

Approach 2: Load content with JavaScript

Serve a barebones HTML document at the page's canonical endpoint that links to any external resources and then loads the content dynamically. Instruct clients not to cache this page. Serve each version of the content from a unique endpoint and instruct clients to cache it. Using the fetch streaming interface and document.write should achieve performance similar to letting the browser load a static page.
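A sketch of the streaming idea, assuming the versioned content is UTF-8 HTML. `streamInto` is a hypothetical name, and the `doc` and `fetchImpl` parameters are there only so the function can be tried outside a browser.

```javascript
// Hypothetical helper: stream a response body into a document chunk by chunk.
async function streamInto(url, doc = document, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Failed to load ${url}: ${response.status}`);
  }
  const reader = response.body.getReader();
  const decoder = new TextDecoder("utf-8");
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters that straddle chunk boundaries intact.
    doc.write(decoder.decode(value, { stream: true }));
  }
  doc.write(decoder.decode()); // flush any bytes still buffered in the decoder
  doc.close();
}
```

One caveat worth testing: calling `write` on a document whose parsing has already finished implicitly reopens it and replaces its current contents, so the streamed content may need to be a complete document rather than a fragment.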

Problem: Users with JavaScript disabled can't see the content. They could be shown a link to the current version of the content instead, but users who follow that link run into the same problems described under Approach 1.

Approach 3: <iframe>, <object>, etc.

Embed the content in the canonical page. As usual, the content is hosted at a version-specific endpoint with instructions to cache, and the canonical page is served with instructions not to cache.
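The two caching policies could be chosen per request path along these lines. `cacheControlFor`, the `/blob/` prefix for versioned content, and the specific directive values are assumptions. In particular, `no-cache` is used here as one reading of "not to cache": the response may be stored but has to be revalidated before reuse.

```javascript
// Hypothetical helper: pick a Cache-Control value based on the kind of endpoint.
function cacheControlFor(path) {
  // Versioned content never changes at a given URL, so it can be cached for a long time.
  if (path.startsWith("/blob/")) {
    return "public, max-age=31536000, immutable";
  }
  // Canonical pages must be checked with the server before every reuse.
  return "no-cache";
}
```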

Problem: Links inside the embedded document navigate only the nested browsing context, so following a link doesn't change what's displayed in the address bar.



Solution 1:[1]

I'm not entirely satisfied with this answer, but I think it's the best available solution for the general case at the time of writing. Loading content dynamically (approach 2) may be better in some contexts, for example if JavaScript is required for other reasons. If I'm right that there isn't a better approach, I don't find that acceptable: this style of caching is broadly useful and shouldn't require such an awkward workaround.

Please chime in with other approaches if I've missed something better, even if it's only better in a limited context!

Use <object> or <iframe> as in approach 3, and set target="_parent" on all included <a> tags

It's important to use target="_parent" rather than target="_top" so that if third parties embed the canonical page in an iframe, clicking on links doesn't escape the iframe and cause the user's browser to navigate away from the third party's page.
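If the content is generated by a build step, the attribute could be added automatically. The following is only a sketch: `addParentTarget` is a hypothetical helper, and the regular expression assumes simple markup in which attribute values never contain a `>` character. A real HTML parser would be more robust.

```javascript
// Hypothetical helper: add target="_parent" to every <a> tag that has no target yet.
function addParentTarget(html) {
  // The negative lookahead skips tags that already carry a target attribute.
  return html.replace(/<a\b(?![^>]*\starget\s*=)([^>]*)>/gi, '<a target="_parent"$1>');
}
```

Links that already specify a target are left alone, so an author can still opt individual links out.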

Example:

File: page1.html

<!doctype html>
<html lang=en>
    <head>
        <meta charset=utf-8>
        <title>Page 1</title>
        <style>
            html, body {
                margin: 0;
                padding: 0;
                overflow: hidden;
                height: 100%;
            }
            #include {
                margin: 0;
                padding: 0;
                width: 100%;
                height: 100%;
            }
        </style>
    </head>
    <body>
        <object id="include" type="text/html" data="/blob/36a333faa933f67db4fef10f03c9bc92dc85b9ee7b7ae5fc82ae706440fe0cfa.html"></object>
    </body>
</html>

File: page2.html

<!doctype html>
<html lang=en>
    <head>
        <meta charset=utf-8>
        <title>Page 2</title>
        <style>
            html, body {
                margin: 0;
                padding: 0;
                overflow: hidden;
                height: 100%;
            }
            #include {
                margin: 0;
                padding: 0;
                width: 100%;
                height: 100%;
            }
        </style>
    </head>
    <body>
        <object id="include" type="text/html" data="/blob/d5995a8f5f41837f97a53acdb942a41a843cc5c0801ea7d2842643d90d0fe924.html"></object>
    </body>
</html>

File: blob/36a333faa933f67db4fef10f03c9bc92dc85b9ee7b7ae5fc82ae706440fe0cfa.html

<!doctype html>
<html lang=en>
    <head>
        <meta charset=utf-8>
        <!-- this title will never be displayed -->
        <title>Page 1</title>
    </head>
    <body>
        <p>Page 1</p>
        <p><a target="_parent" href="/page2.html">Go to page 2</a></p>
    </body>
</html>

File: blob/d5995a8f5f41837f97a53acdb942a41a843cc5c0801ea7d2842643d90d0fe924.html

<!doctype html>
<html lang=en>
    <head>
        <meta charset=utf-8>
        <!-- this title will never be displayed -->
        <title>Page 2</title>
    </head>
    <body>
        <p>Page 2</p>
        <p><a target="_parent" href="/page1.html">Go to page 1</a></p>
    </body>
</html>
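Since the two canonical pages above differ only in their title and blob path, they could be produced from a template. This is a sketch under that assumption; `renderWrapper` and `escapeHtml` are hypothetical helpers, not part of the original answer.

```javascript
// Hypothetical helper: escape text for use in HTML content and attribute values.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: render a canonical page that embeds the given versioned blob.
function renderWrapper(title, blobPath) {
  return `<!doctype html>
<html lang=en>
    <head>
        <meta charset=utf-8>
        <title>${escapeHtml(title)}</title>
        <style>
            html, body {
                margin: 0;
                padding: 0;
                overflow: hidden;
                height: 100%;
            }
            #include {
                margin: 0;
                padding: 0;
                width: 100%;
                height: 100%;
            }
        </style>
    </head>
    <body>
        <object id="include" type="text/html" data="${escapeHtml(blobPath)}"></object>
    </body>
</html>
`;
}
```

Publishing a new version of a page then amounts to writing the new blob and re-rendering the small wrapper with the new blob path.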

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1