Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Do not merge, discuss after 1.0.0] Service Worker added for offline documentation #247

Closed
wants to merge 1 commit into from

Conversation

Nooshu
Copy link
Contributor

@Nooshu Nooshu commented Apr 20, 2018

In a recent document discussing the "Design System and Frontend documentation" there was a question about where the documentation should sit. One of the options was "All documentation lives in the Design System and the main Frontend README is just a link to the Get started guide in the Design System". The only risk listed with this approach was "Users who clone the repo or download the zip from GitHub won’t have access to offline documentation." This PR has been created to solve this issue for Design System users who require documentation while they are offline.

A service worker has been added with all the heavy lifting done using Workbox. Workbox is built into the metalsmith build pipeline. Workbox handles the file manifest and looks for changes between builds. Once a change is detected, the manifest is updated. Old cache is cleared and replaced with the newest version.

It is now possible to view all the DS docs while offline. There may be considerations around when this PR is merged as a service worker does add another level of caching that can sometimes complicate development. The level of debugging available in the console for 3.1.0 of Workbox has been vastly improved so a developer gets a much clearer picture of what is happening. We could hide all this behind a 'development' / 'production' check or even a simple boolean to disable when required.

Current issues

  • Service worker doesn't understand what to do with a URL not ending in '/' once offline
  • Duplication of many of the CSS files due to iframe usage and relative URL's
  • Base64 encoded font file is adding 18MB to the pre-cached assets (~75%)

Todo

  • Restrict the service worker to a specific domain since it will be sitting on *.gov.uk.
  • Add maximum file size limit for cached assets as a safeguard in future releases.
  • Add basic set of sanity checks to flag a large increase in the number of files or total filesize.
  • Add development / production check for the console debugging.
  • Remove the need for the Workbox CDN for workbox-sw.js, service from local filesystem.
  • Offline detection for the user.
  • Investigate the possibility of a 'kill switch'.

@govuk-design-system-ci
Copy link
Collaborator

govuk-design-system-ci commented Apr 20, 2018

@Nooshu Nooshu force-pushed the workbox-service-worker branch 2 times, most recently from 9e80a74 to 78af22c Compare April 20, 2018 13:49
@NickColley
Copy link
Contributor

NickColley commented Apr 20, 2018

Hey Matt, thanks for this awesome contribution.

What network strategy does workbox use?

I imagine we'd want a 'network falling back to cache' approach? This ensures people always see the most up-to-date content. Stale content could refer to older versions of GOV.UK Frontend which would be problematic.

@Nooshu
Copy link
Contributor Author

Nooshu commented Apr 20, 2018

Good point @NickColley, I'll add in the runtimeCaching caching option and set to network first. I can't see what the default is for Workbox so best to be explicit.

globPatterns: [
'**/*.{html,js,css,png}'
],
swDest: paths.public + 'sw.js',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I know it's common for 'sw' to be used in tutorials and such but we prefer to avoid abbreviations. Could anywhere there's SW could we use 'ServiceWorker' instead?

Copy link
Contributor

@alex-ju alex-ju left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯 This is a great feature for the Design System.

I'll leave a first comment about the file structure (I'm consciously being picky):

  • workbox-cli-config.js should be in /config
  • build-service-worker.js should be in /lib
    I'm happy to do the above changes myself if they're agreed.

I'll try and put more thoughts into the build integration (potentially with injectManifest) the next days.

@Nooshu
Copy link
Contributor Author

Nooshu commented Apr 23, 2018

Placement of files has now been updated. Both elements and frontend are served from the same port on localhost, so when switching between projects the service worker clashes. I've updated the frontend port for the moment, but we could consider changing elements port instead?

Now explicitly using networkFirst as the caching strategy for all URL's.

module.exports = {
globDirectory: paths.public,
globPatterns: [
'**/*.{html,js,css,png}'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jpg?

tasks/serve.js Outdated

// setup synchronised browser testing
metalsmith.use(browsersync({
open: false,
open: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this flag do, why is changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was originally true (in master). I have reverted it back to true here. It opens up a new browser window every time you npm start. This was becoming insanely annoying as I was testing so I disabled it.

@NickColley
Copy link
Contributor

I might be being too cautious but I'd like to try the following before approving:

  1. Run production build
  2. Go offline
  3. Confirm production build works offline
  4. Go online
  5. Make update to content
  6. Run production build
  7. Confirm production build is updated as expected

@NickColley
Copy link
Contributor

Looks good based on those tests above, we noticed that there's a runtime cache that caches Analytics.

If possible it would be good to avoid this, as it might make Google Tag Manager unpredictable.

@Nooshu
Copy link
Contributor Author

Nooshu commented Apr 24, 2018

I've been unable to exclude analytics from the caching. This looks to be something Workbox does automatically. What I have done is set the caching strategy on those files to networkFirst and set an expiry of 1 hour so they will always be up to date.

I asked in the Workbox repo about the runtime caching. Sounds like setting a short expiry for these files is a good strategy.

}
},
{
urlPattern: new RegExp('https://www.googletagmanager.com/gtm.js'),
Copy link
Contributor

@NickColley NickColley Apr 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if we could do /https:\/\/www\.google.*\.com\/.*\.js/ here instead, which matches any javascript file from a domain that starts with google?

If that's no good, this seems fine for me my main worry is that these URLS can change since they're injected at runtime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can do that. I was trying to be explicit as I'm unsure what other Google URL's we may be using in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go with this for now then. 👍

],
swDest: paths.public + 'service-worker.js',
importWorkboxFrom: 'local',
skipWaiting: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry didnt notice this earlier, should we have skipWaiting and clientsClaim here? I was under the impression these are used in development and break out of the service worker lifecycle?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think this may be OK since we're doing a network first approach

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The waiting phase is useful when you're only running one version of your site at once, but since we don't need that feature we can make the new service worker activate sooner, so I think setting it to true is good.

],
swDest: paths.public + 'service-worker.js',
importWorkboxFrom: 'local',
skipWaiting: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think this may be OK since we're doing a network first approach

@Nooshu
Copy link
Contributor Author

Nooshu commented Apr 25, 2018

Any feedback on this PR @36degrees? Don't want to merge until you've looked over it.

Copy link
Contributor

@36degrees 36degrees left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely have some reservations about merging this.

I am excited by the possibilities – and I think it could be really useful, but I don't think that it's a feature we would prioritise ourselves at this stage, and so if we were going to merge this I think we would want to be confident that it wasn't going to cause us any problems down the line.

It's adding a reasonable amount of extra complexity which I don't think is well understood by the team.

From testing, Chrome downloads 23.1MB of data on first page load (as tested in incognito mode). Would this also happen if a user visits from a mobile phone on their perhaps-limited data plan? I don't think that's OK.

screen shot 2018-04-26 at 09 40 54

I don't think we can be confident that this is 'just going to work' and I don't think we want the overhead of manually 'testing' it regularly either. The trailing slash issue I've raised as a comment is a good example, but to provide another – from reading the config I think that if we started using SVGs to illustrate patterns they would not be cached by the service worker, so the user would get a broken experience. I do not think we would catch this. I think we need to be more confident in the approach and/or that we would be able to catch problems (e.g. using automated testing)

We don't understand the implications that this will have on analytics - how will offline views be recorded? If users 'come back online' at a later date (e.g. they come online on a Monday after being offline whilst travelling on a Friday) will the statistics for the Friday then change 'retrospectively'? Is this behaviour something that's going to cause headaches for the performance analysts?

Personally, I don't know enough about Workbox or service workers to be sure that my concerns are exhaustive – I think there may be other implications (especially in terms of performance and security) that we haven't considered.

There's a lot of work still to do here and I don't think now is the right time to be doing it.

@@ -2,11 +2,11 @@
<ul class="app-mobile-nav__list">
{% for item in navigation %}
<li>
<a class="govuk-link js-mobile-nav-subnav-toggler" href="/{{ item.url }}">{{ item.label }}</a>
<a class="govuk-link js-mobile-nav-subnav-toggler" href="/{{ item.url }}/">{{ item.label }}</a>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am guessing that this change (and the other trailing slash changes) is attempting to fix an issue with resolving trailing-slash-less paths when falling back to the service worker's cache?

I don't think we can rely on this – as well as the navigation there are links from components and patterns to other related components and patterns, etc. As an example, there's a link from the error summary component to 'error messages' (under 'How it works') which doesn't end in a trailing slash, and which is broken when offline.

I don't think 'enforcing' trailing slashes is really enforceable, and I don't think we can rely on regularly testing this functionality either – is there a way we can modify the service worker to resolve either way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is certainly something that can be investigated further. Without a trailing slash the SW will assume /about is referring to /about.html. It looks for the trailing slash before appending the directoryIndex setting. There is a prettyUrl setting that didn't fix the issue. There is a promising comment here that we can look at.

I think as @alex-ju mentioned above we should switch to the injectManifest method, this gives us a lot more fine grained control over how the SW works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This issue has now been fixed in the service worker without the need for modification.

@Nooshu
Copy link
Contributor Author

Nooshu commented Apr 26, 2018

From testing, Chrome downloads 23.1MB of data on first page load (as tested in incognito mode). Would this also happen if a user visits from a mobile phone on their perhaps-limited data plan? I don't think that's OK.

Yes, very good point. And this raises questions on how SW's actually handle this issue. I haven't seen a setting that only allows downloading on a wi-fi network rather than a data plan. This also raises the question of what experience do we want an offline user to have? Is it better to have no offline experience at all than a slightly broken one (missing images for example)? Personally I was thinking it would be useful to have the full set of documentation to refer to while working remotely (commuting on a train for example), but maybe that isn't the case?

We have the ability to define routes and include / exclude any type of asset, as well as defining different caching strategies where required. It also raises the question of what is taking up the 23.1MB and can this be optimised in anyway.

We don't understand the implications that this will have on analytics - how will offline views be recorded? If users 'come back online' at a later date (e.g. they come online on a Monday after being offline whilst travelling on a Friday) will the statistics for the Friday then change 'retrospectively'? Is this behaviour something that's going to cause headaches for the performance analysts?

At the moment offline tracking isn't enabled. If this is a requirement then there is a module in Workbox that can be enabled. A custom dimension would be setup in GA so we could distinguish between and online / offline user.

@Nooshu
Copy link
Contributor Author

Nooshu commented Apr 26, 2018

Two main problems that are causing the increased cache size is:

  1. Images. CSS and JavaScript files are being duplicated multiple times
  2. Many of the duplicated CSS files contain their own copy of the Base64 encoded font files

The Base64 encoded fonts are 200KB. After investigation changing the font loading strategy reduces the cache size by 72%. Duplicated assets still need to be investigated.

@Nooshu Nooshu changed the title Service Worker added for offline documentation [DON'T MERGE] Service Worker added for offline documentation Apr 26, 2018
@Nooshu
Copy link
Contributor Author

Nooshu commented May 3, 2018

Closing for the moment. Will reopen again once I have an update.

@Nooshu Nooshu closed this May 3, 2018
@Nooshu
Copy link
Contributor Author

Nooshu commented May 24, 2018

Adding a small comment from x-gov slack. Gavin Wye from MoJ asked:

Is there a way I can get the design system to run locally so I can work when I don’t have an internet connection?

Looks like there is a user-need for this in the future.

@Nooshu
Copy link
Contributor Author

Nooshu commented Jun 12, 2018

There has been significant progress with this branch as the following changes made by the DS team have really helped reduce the overall size of the asset precaching:

The fonts are no longer included via Base64 encoding, and assets that were repeated across the codebase multiple times have been reduced almost completely.

The only assets that are now repeated across the codebase are:

  • images/govuk-crest-2x.png
  • assets/images/govuk-crest-2x.png
  • images/govuk-crest.png
  • assets/images/govuk-crest.png
  • images/govuk-logotype-crown.png
  • assets/images/govuk-logotype-crown.png
  • components/fieldset/address-wrapper/index.css
  • patterns/addresses/multiple/address-wrapper.css

The cost of these repeating assets is very small, only 14KB in total.

All these changes have had a huge effect on the amount of data being pre-cached. A reduction of 79%, from 23.1MB to 4.9MB. This is to store the whole of the Design System documentation and assets for offline usage.

new-storage

The largest single asset that is now being pre-cached is the application.js file which comes in at 320KB. See the image below for more information on the largest assets.

largest-assets

A preview of the Design System using the Service worker and offline mode can be viewed here.

@NickColley
Copy link
Contributor

@Nooshu it looks like the assets are not being GZIPed?

@NickColley
Copy link
Contributor

I think we'd want to have a simple test plugin that checks the assets in the workbox manifest for sanity and breaks the build if it is very odd.

@36degrees
Copy link
Contributor

36degrees commented Jun 12, 2018

The only assets that are now repeated across the codebase are:

images/govuk-crest-2x.png
assets/images/govuk-crest-2x.png
images/govuk-crest.png
assets/images/govuk-crest.png
images/govuk-logotype-crown.png
assets/images/govuk-logotype-crown.png
components/fieldset/address-wrapper/index.css
patterns/addresses/multiple/address-wrapper.css

Addressed the duplicated images in #315

@Nooshu
Copy link
Contributor Author

Nooshu commented Jun 12, 2018

Rebased #315 and can confirm the duplicate images have been removed from the service worker manifest. The CSS files still exist:

components/fieldset/address-wrapper/index.css
patterns/addresses/multiple/address-wrapper.css

@Nooshu Nooshu changed the title [DON'T MERGE] Service Worker added for offline documentation [Do not merge, discuss after 1.0.0] Service Worker added for offline documentation Jun 12, 2018
@NickColley
Copy link
Contributor

Since we're moving to GOV.UK/Design System we'd need to be very careful that this Service Worker is only scoped to /design-system

@Nooshu
Copy link
Contributor Author

Nooshu commented Jun 12, 2018

After discussion with @igloosi and testing in a new incognito window it can be seen that the data that is transferred across the network is 1.4MB in total (214 files transferred, 201 files cached):
data-transfer

This is then cached and uncompressed to around 7.6MB in the browser:
cache

These figures have been confirmed on multiple devices.

@Nooshu
Copy link
Contributor Author

Nooshu commented Jun 12, 2018

@NickColley good point about the URL, I'll add that to the list now.

Update: Scope of the SW has now been restricted to the path the service worker sits in.

@Nooshu
Copy link
Contributor Author

Nooshu commented Jun 13, 2018

500KB limit has now been added to workbox-config-cli.js, any file over this size will be ignored by the precache manifest.

@Nooshu
Copy link
Contributor Author

Nooshu commented Jun 13, 2018

I think we'd want to have a simple test plugin that checks the assets in the workbox manifest for sanity and breaks the build if it is very odd.

@NickColley I'e added a very simple set of checks that output a warning to the console if the total file size or the total number of files increases over a certain threshold. The previous state is written to a file in the filesystem, this seemed like a fairly simple way to store this state, but I'm open to ideas of how to improve this.

It doesn't fail the build at the moment, it just shows a warning (screenshot below).

warning

@Nooshu
Copy link
Contributor Author

Nooshu commented Jun 14, 2018

Removed the need to use the Google CDN to serve Workbox. Following the guide here and using the CLI command: workbox copyLibraries lib/workbox/ the required files are now copied to the deploy directory on build.

@Nooshu
Copy link
Contributor Author

Nooshu commented Jun 15, 2018

Started to investigate the use or routing to allow different caching strategies to be used and a different cache per type of asset. Commit investigation here Only the HTML files are being precached, with all other assets sitting in a separate cache.

  • Images in govuk-ds-cache-images using a staleWhileRevalidate strategy. Two requests are made one to the network one to the cache. Cache is used if available then replaced in the background if there is an update from the network.
  • CSS / JS in govuk-ds-cache-css-js using a networkFirst strategy. Asks the network for the request first then falls back to the cache if network unavailable. Useful for assets that update often.
  • Fonts in govuk-ds-cache-fonts using a cacheFirst strategy. Once cached the files won't be updated again, but since the fonts are revisioned a new version once added will have a totally different name.

UPDATE: Fonts cache now has a 7 day expiry added using the workbox.expiration.Plugin. Fonts will be cached for a long period of time then cleared out after 7 days if they are ever updated.

@Nooshu
Copy link
Contributor Author

Nooshu commented Jun 19, 2018

Added offline detection functionality. When a users connection is dropped a small red tag will be shown in the bottom right corner of the DS. This is to tell the user they are using the cached version of the DS. This tag will be removed when the connection is reestablished.

The design uses the same styles as the "beta" tag only with a different BG colour. I expect the design to be changed and the actual code structure to be changed in the future as required. See screenshot for a preview:

offline-tag

@Nooshu
Copy link
Contributor Author

Nooshu commented Jun 26, 2018

  • Rebased branch with all the changes from the 1.0.0 release week.
  • Upgraded WorkBox to 3.3.1 which includes a couple of new features for cache maintenance. I've included the purgeOnQuotaError setting for images and CSS and JS since they will most likely be the largest assets in the cache in the future.
  • Changed CSS / JS to use a cacheFirst strategy since we are fingerprinting these files so the cache will be updated every time a new version is released. Old cache expires after 7 days.

@Nooshu
Copy link
Contributor Author

Nooshu commented Jun 29, 2018

I've just reviewed the latest setup with all the 1.0.0 changes and HTML only being precached. With GZip compression enabled the total initial download with the SW enabled is 848KB. This setup allows a user to access all pages offline, all functional with the fonts and JS. There may be a few images missing since these aren't being precached.

html-precache

@Nooshu
Copy link
Contributor Author

Nooshu commented Jul 2, 2018

I've been investigating the possibility of implementing a "kill switch" for the service worker. This is in the unlikely event that something odd happens in the SW and there is a need to kill it in client browsers, as this isn't something we would expect a user to do.

Fortunately this requirement is already part of the service worker specification as of 2016. Essentially the service worker JS file sits outside of the standard browser cache. When a user navigates to a page that sits under the service worker scope or when the service worker is woken up, it will make a request to the server and look for an updated service worker JS file. If it finds an update it will update this in the background without the user even knowing it happened.

There is a great write-up here by Jeff Posnick:

...browsers will always go to the network if the age of the service-worker.js entry in the HTTP cache is greater than 24 hours. So, functionally, there's no difference in using a max-age of 1 day or 1 week or 1 year—they'll all be treated as if the max-age was 1 day.

Browser vendors want to ensure that developers don't accidentally roll out a "broken" or buggy service-worker.js that gets served with a max-age of 1 year, leaving users with what might be a persistent, broken web experience for a long period of time. (You can't rely on your users knowing to clear out their site data or to shift-reload the site.)

It is still recommended that a max-age Cache-Control header is set for the service-worker.js file of some very short time period, 1 minute or even 0, but this can be relaxed if there are no plans to update the service worker very often.

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants