This repository has been archived by the owner on Jan 11, 2023. It is now read-only.

Cache busting with url versioning params #179

Open
arxpoetica opened this issue Mar 9, 2018 · 16 comments

@arxpoetica
Member

arxpoetica commented Mar 9, 2018

So I just ran into the favicon not updating unless I manually break the cache like so:

<link rel='icon' type='image/png' href='/favicon.png?v=2'>

Without ?v=2 it wasn't picking up my changed icon. I think this is something fundamental enough it ought to be built in to the platform, some sort of cache breaking mechanism. Potentially this sort of thing:

<link rel='icon' type='image/png' href='/favicon.png?v=%sapper.version%'>

Would it be as simple as picking up package.json major version (semver) upgrades 0.x.x? I suppose it ought to be a manual break, though, because I can imagine people wanting to be able to manage cache busting (images, css, js, etc.). Still, it would be convenient to have a single env var or some file or some mechanism somewhere in one place that we could update easily and have it ripple out through the whole app.

Thoughts?
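
The "single place to update" idea can be sketched as one version string that a helper appends to every asset URL. This is only an illustration: `ASSET_VERSION` and `withVersion` are hypothetical names, not Sapper APIs, and reading the value from an environment variable is an assumption.

```javascript
// Hypothetical helper: one app-wide version string appended to asset URLs.
// ASSET_VERSION and withVersion are illustrative names, not part of Sapper.
const ASSET_VERSION = process.env.ASSET_VERSION || '2';

function withVersion(url, version = ASSET_VERSION) {
  // Keep any existing query string and fragment intact.
  const [beforeHash, fragment] = url.split('#');
  const separator = beforeHash.includes('?') ? '&' : '?';
  const versioned = `${beforeHash}${separator}v=${encodeURIComponent(version)}`;
  return fragment === undefined ? versioned : `${versioned}#${fragment}`;
}

console.log(withVersion('/favicon.png', '2')); // /favicon.png?v=2
```

Bumping the one value changes every URL at once, which is convenient but also invalidates assets that did not change.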

@thgh
Contributor

thgh commented Mar 9, 2018

+1 for cachebusting utilities in sapper

%sapper.version% is not specific enough though. The ideal solution would provide a hash of every file. That way, only actual changes get invalidated. How about:

<link rel='icon' type='image/png' href='/favicon.png?v=%sapper.cachebust%'>

When the cachebusting directive is found, it tries to find the filename in question, loads the file, calculates the hash and fills it in.
If the array of assets that are provided in the sw manifest would include the hashes, the service worker would know which assets to redownload. Or is that the job of the browser cache?
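
To illustrate the manifest idea, here is a rough sketch of how two manifests that carry hashes could be compared to find the assets worth refetching. The manifest shape (`{ path: hash }`) and the name `changedAssets` are assumptions for illustration; Sapper's actual service worker manifest is a plain list of paths.

```javascript
// Hypothetical manifest shape: { '/path': 'content-hash' }.
// Returns the paths that are new or whose hash differs from the previous build.
function changedAssets(previous, next) {
  return Object.keys(next)
    .filter((path) => previous[path] !== next[path])
    .sort();
}

const before = { '/favicon.png': 'a1', '/global.css': 'b2' };
const after = { '/favicon.png': 'c3', '/global.css': 'b2', '/logo.svg': 'd4' };
console.log(changedAssets(before, after)); // [ '/favicon.png', '/logo.svg' ]
```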

@Rich-Harris
Member

Sapper isn't actually serving your favicon; that's (typically) done by serve-static. My understanding was that it won't supply cache headers unless you tell it to, and then it will use ETags to ensure that it doesn't serve stale assets.

Maybe favicons are treated slightly differently by browsers? So any cache-busting mechanism would be favicon-specific, rather than based on a hash of the file in question. Maybe a %sapper.timestamp% that's generated upon build?

@Conduitry
Member

Yeah I definitely recall stuff about different browsers caching favicons super aggressively, much more so than other files. Searching for 'favicon caching' yields a lot of results about people trying to purge the favicon cache in various browsers.

@arxpoetica
Member Author

arxpoetica commented Mar 10, 2018

@Rich-Harris I agree that serve-static is doing its thing correctly. But cache busting is a separate concern from browsers, CDNs, and servers putting the correct cache headers on assets and respecting those headers. That's almost an orthogonal discussion: it is about whether, and for how long, to cache something with (as you said) ETag, Cache-Control, and Expires headers. They do their job.

The point is there's no easy mechanism in place to deliberately break cache when one needs to, and the easiest method I have (traditionally) found for doing this is a simple ?v=x cache version busting method. Maybe you don't agree this is necessary, but when, for example, I replace an image with a newer version under the same name, browsers won't necessarily download the new version; it all depends on what the headers tell them to do. A cache bust says "forget it, redownload anyway" without having to change the names of css/js/img/svg/etc. files. Favicons are only one aspect of this.

I'm on the fence about what @thgh is proposing. If it's a super simple equation / hash to keep track of all those changes, fine. But if it's kind of an intense thing to build, I favor getting this done sooner with a simpler single cache update var across the board.

@Rich-Harris
Member

The %sapper.cachebust% idea would be a little tricky to implement — we go from doing a simple string replace to having to figure out where we are in the template, backtracking until we see something that looks like a filepath, resolving that file, reading it, computing a hash... there are lots of opportunities for bugs to creep in. Something like this...

<link rel='icon' type='image/png' href='/favicon.png?v=%sapper.cachebust("favicon.png")%'>

...would be slightly better, but it still basically means implementing a templating language rather than doing a nice simple (and fast!) string replace.
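
To make the trade-off concrete, here is a minimal sketch of what the explicit-argument form might involve. The directive syntax is the hypothetical one from the example above, `fillCachebust` is an invented name, and the file reader is injected so the sketch runs without a static/ directory; none of this is existing Sapper behavior.

```javascript
const crypto = require('crypto');

// Hypothetical directive syntax: %sapper.cachebust("file")%
const DIRECTIVE = /%sapper\.cachebust\("([^"]+)"\)%/g;

// readFile is injected (an assumption) so the sketch does not touch the disk.
function fillCachebust(template, readFile) {
  return template.replace(DIRECTIVE, (_match, file) =>
    crypto.createHash('sha256').update(readFile(file)).digest('hex').slice(0, 8)
  );
}

const template = `<link rel='icon' href='/favicon.png?v=%sapper.cachebust("favicon.png")%'>`;
const fakeFiles = { 'favicon.png': Buffer.from('icon bytes') };
console.log(fillCachebust(template, (file) => fakeFiles[file]));
```

Even in this small form the replacement is no longer a fixed string: it parses an argument and reads a file, which is the extra surface area described above.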

Even %sapper.timestamp% involves creating and tracking a new manifest. On reflection I'm not sure it's a good solution, since it feels like a bit of a hacky fix for the problem in question, with lots of false positives. False positives are something we probably want to avoid here, since they would mean people end up downloading data they don't need.

Isn't manually incrementing the cache busting number the right solution here? If you know that you need to forcibly cache-bust a specific asset, you can just append ?v=2, ?v=3 etc to that asset as necessary.

> If the array of assets that are provided in the sw manifest would include the hashes, the service worker would know which assets to redownload. Or is that the job of the browser cache?

My understanding is that when you create a new cache in the service worker (which in sapper-template happens on each new build, because of timestamp), it falls back to the regular HTTP cache that sits behind the service worker. So assets are redownloaded if the cache control headers/ETags say they should be.
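
The per-build cache naming can be sketched roughly as follows. The `cache${timestamp}` scheme comes from the discussion; the helper names and the idea of filtering by prefix are assumptions, and the real sapper-template service worker may differ.

```javascript
// Assumed naming scheme: one cache per build, named `cache${timestamp}`.
function cacheNameFor(timestamp) {
  return `cache${timestamp}`;
}

// Given all existing cache names, pick the ones an activate handler would delete.
function staleCaches(existingNames, currentName) {
  return existingNames.filter(
    (name) => name.startsWith('cache') && name !== currentName
  );
}

const current = cacheNameFor(1520640000000);
console.log(staleCaches(['cache1520550000000', current, 'other'], current));
// [ 'cache1520550000000' ]
```

In an actual service worker the names would come from `caches.keys()` and each stale one would be removed with `caches.delete(name)`.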

@arxpoetica
Member Author

arxpoetica commented Mar 10, 2018

> So assets are redownloaded if the cache control headers/ETags say they should be.

I'm not sure I follow, but it may also be that I need to understand exactly what the cache headers do; admittedly I'm not 100% schooled in it all. When I researched it prior for a project that needed tight control of caching, I was on a bit of a deadline that didn't allow a thorough investigation...I may need to brush up...

(Starting here: https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching)

@arxpoetica
Member Author

Well...here's an interesting philosophy:

> If a resource (especially a downloadable file) changes, change its name. That way, you can make it expire far in the future, and still guarantee that the correct version is served; the page that links to it is the only one that will need a short expiry time.
(https://www.mnot.net/cache_docs/#TIPS)

Arguably, ?v=x is a tried and true mechanism for "changing a name." In that sense, @Rich-Harris, this is in line with what you're saying about a manual change. (I admit I'm feeling bugged, ha, that there's not a simpler solution! The lazy developer in me doesn't want to have to remember that; it's certainly easy to forget.)

Would it be crazy to keep a manual manifest somewhere? That would at least keep things in one place...that admittedly is also a bit cray cray...

Maybe we should just document this somewhere and call it good? 🤔
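
For what it's worth, a manual manifest kept in one place could be as small as the sketch below. The object, its entries, and the `versioned` helper are all hypothetical; nothing like this ships with Sapper.

```javascript
// Hypothetical hand-maintained manifest: bump a number when an asset changes.
const assetVersions = {
  '/favicon.png': 2,
  '/css/global.css': 5,
};

function versioned(path, versions = assetVersions) {
  const v = versions[path];
  // Assets that are not listed are returned untouched.
  return v === undefined ? path : `${path}?v=${v}`;
}

console.log(versioned('/favicon.png')); // /favicon.png?v=2
console.log(versioned('/logo.svg'));    // /logo.svg
```

Unlike a single global version, only the assets whose numbers are bumped get new URLs, at the cost of having to remember to bump them.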

@Rich-Harris
Member

> I'm not sure I follow, but it may also be that I need to understand exactly what the cache headers do; admittedly I'm not 100% schooled in it all.

Don't take my word as gospel, but I think it goes like this: you open a page that registers a service worker, and the service worker is byte-different from the previously registered one, so the browser installs it. During installation, the service worker deletes the old cache and creates a new one (because it has a different name, cache${timestamp} or whatever).

It adds all the assets that are specified in the manifest to the new cache. But it doesn't go directly to the server, it goes to the HTTP cache (this part is transparent and out of our control). Say the service worker wants /goats.jpg. If the HTTP cache says 'we have an asset matching that URL, and the cache control headers say it's good for the next ten days', we don't download it again. Or it might say 'the headers [or lack thereof] say we may need to redownload it, but I just checked with the server and the ETag matches what we've got already, so we're okay to use this one'. In other words, by default the right thing will happen, but if you want more aggressive caching, you can get it by controlling the headers from serve-static.

The same logic applies after the cache has been populated, if the fetch handler says to hit the network instead of serving from cache${timestamp}.
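
The ETag check described above can be sketched as a small decision function. This is a simplification under stated assumptions: the ETag is derived from a hash of the body (real servers may derive theirs differently, for example from file size and modification time), and `If-None-Match` is treated as a single exact value rather than a list.

```javascript
const crypto = require('crypto');

// Assumption: a strong ETag built from the response body.
function etagFor(body) {
  const hash = crypto.createHash('sha256').update(body).digest('hex');
  return `"${hash.slice(0, 16)}"`;
}

// Decide how to answer a request given the client's If-None-Match header.
function respond(requestHeaders, body) {
  const etag = etagFor(body);
  if (requestHeaders['if-none-match'] === etag) {
    // The client's copy is still current: no body is sent.
    return { status: 304, headers: { ETag: etag }, body: null };
  }
  // 'no-cache' lets the client store the response but revalidate before reuse.
  return { status: 200, headers: { ETag: etag, 'Cache-Control': 'no-cache' }, body };
}

const first = respond({}, 'goat picture v1');
const repeat = respond({ 'if-none-match': first.headers.ETag }, 'goat picture v1');
const changed = respond({ 'if-none-match': first.headers.ETag }, 'goat picture v2');
console.log(first.status, repeat.status, changed.status); // 200 304 200
```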

I think documentation is the proper response here, but I'm slightly hesitant to let the Sapper docs become a repository for general advice about building web apps — there are lots of places that explain this stuff far more thoroughly (and correctly, probably) than we could attempt.

@arxpoetica
Member Author

I had the same thought about general documentation: send them elsewhere. However, if people (like me) keep bumping into (for example) the favicon not updating, etc., let's at least give them a "what for" (and why we don't self-manage it) and send 'em off to learn about it elsewhere. Since I'm the one spoutin' here, I'm happy to do that PR. ;)

@buhrmi

buhrmi commented Aug 24, 2019

Hi guys, I'm also running into this. Currently, whenever I change a static file, I tell the users to hard-refresh the page (Shift+F5) to clear caches. It would be cool if sapper could incorporate automatic cache busting for statically linked assets, similar to how nuxt.js does it: https://nuxtjs.org/guide/assets

Thoughts?

@flayks

flayks commented Mar 20, 2020

I'm facing the same issue with CSS invalidation for the bundle.css file generated by Svelte with Sapper. A solution or lead would be helpful 👌

@benmccann
Member

I believe the standard way to do cache busting is to ask the bundler to add a hash to the filename. I raised an issue in the template to add hashes to the static filenames and provided the code to do so there. If I don't hear any objections I'll send it as a PR.
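
As a rough illustration of putting the hash in the filename (not the code from the template issue, which isn't shown here), a build step could derive names like this. `hashedName` is a hypothetical helper; bundlers such as Rollup offer a `[hash]` placeholder in their output file name patterns for the files they emit.

```javascript
const crypto = require('crypto');
const path = require('path');

// Hypothetical build-time helper: css/global.css -> css/global.<hash>.css
function hashedName(file, contents) {
  const hash = crypto.createHash('sha256').update(contents).digest('hex').slice(0, 8);
  const { dir, name, ext } = path.posix.parse(file);
  return path.posix.join(dir, `${name}.${hash}${ext}`);
}

console.log(hashedName('css/global.css', 'body { margin: 0 }'));
```

Because the name changes only when the contents do, such files can be served with long cache lifetimes, as long as the HTML that references them is not cached as aggressively.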

@happycollision
Contributor

You could still do a simple string replacement if we are given a place to register which files need to be hashed.

<link rel='icon' type='image/png' href='%sapper.cachebust.favicon%'>
<link rel="stylesheet" href="%sapper.cachebust.globalCss%" />

I don't know where that'd be. A new sapper.config.js file?

module.exports = {
  // paths are from static root
  cachebust: {
    globalCss: "css/global.css",
    favicon: "favicon.png"
  }
}

@happycollision
Contributor

happycollision commented Nov 21, 2020

Or if there was a place to hook into the string replacement in general, at build time, we could add necessary build steps to populate it. (maybe this already exists?)

@happycollision
Contributor

happycollision commented Nov 21, 2020

As a POC, I added the following to src/node_modules/server.mjs (it gets blown away at first build, but a quick undo and it pops back)

import templateConfig from '../../../sapper.config' // This would have to change, but still...

/// snip

const body = template()
  .replace('%sapper.base%', () => `<base href="${req.baseUrl}/">`)
  .replace('%sapper.scripts%', () => `<script${nonce_attr}>${script}</script>`)
  .replace('%sapper.html%', () => html)
  .replace('%sapper.head%', () => head)
  .replace('%sapper.styles%', () => styles)
  .replace(/%sapper\.cspnonce%/g, () => nonce_value);
res.statusCode = status;
res.end(templateConfigReplace(body)); // <-- that is the new bit: templateConfigReplace

/// snip

function templateConfigReplace(body) {
  return templateConfig.replacements.reduce((acc, { find, replace }) => {
    return acc.replace(find, replace)
  }, body)
}

And then created the new file I referenced called sapper.config in the repo root.

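import fs from "fs"
import path from "path"
import crypto from "crypto"

// Append a content hash to the asset URL so the URL changes whenever the file does.
function hashFile(pathToFile) {
  // Read raw bytes (no encoding) so binary assets such as images hash correctly.
  const contents = fs.readFileSync(path.join("static", pathToFile))
  const hash = crypto.createHash("md5").update(contents).digest("hex")
  return pathToFile + "?v=" + hash
}

export default {
  replacements: [
    {
      find: "%sapper.cachebust.indexCss%",
      // Called on every render, so the file is re-read and re-hashed per request.
      replace: () => hashFile("index.css"),
    },
  ],
}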

Then in my template.html

    <link rel="stylesheet" href="%sapper.cachebust.indexCss%" />

It all worked just fine. I know it's been a while since this topic has been discussed, but giving a low-level way to do this might be nice. You could even drop it further and just allow us to provide templateConfigReplace directly from a guaranteed-to-exist config file. What do you think, @Rich-Harris?

@happycollision
Contributor

Even more control would be to provide body, html, head, styles, nonce_value, nonce_attr, req to a single function and require it return the template string. No need to force the api into a find/replace only kind of thing.
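
A single-function hook along those lines might look like the sketch below. Everything here is hypothetical: `defaultRender`, `customRender`, the shape of the argument object, and the `%app.assetVersion%` token are invented for illustration and are not part of Sapper.

```javascript
// Hypothetical default: what the framework might do if no hook is supplied.
function defaultRender({ template, html, head, styles }) {
  return template
    .replace('%sapper.head%', () => head)
    .replace('%sapper.styles%', () => styles)
    .replace('%sapper.html%', () => html);
}

// Hypothetical user hook: reuse the default, then fill in an app-specific token.
function customRender(parts) {
  return defaultRender(parts).replace('%app.assetVersion%', () => '3');
}

const page = customRender({
  template:
    '<head>%sapper.head%%sapper.styles%<link href="/x.css?v=%app.assetVersion%"></head>' +
    '<body>%sapper.html%</body>',
  head: '<title>Hi</title>',
  styles: '',
  html: '<h1>Hello</h1>',
});
console.log(page);
// <head><title>Hi</title><link href="/x.css?v=3"></head><body><h1>Hello</h1></body>
```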
