Documentation search not returning relevant results. #9758

Closed
1 of 2 tasks
IshwarKanse opened this issue Aug 6, 2018 · 4 comments
@IshwarKanse

IshwarKanse commented Aug 6, 2018

This is a...

  • Feature Request
  • Bug Report

Problem:
Documentation search is not returning relevant results. For example, try searching for pods, daemonset, static pod, or ingress. This has been happening for the past few days.

Search results for pod.

screenshot from 2018-08-06 15-00-58

Search result for Daemonset.

screenshot from 2018-08-06 15-00-58

Proposed Solution:
Include the most relevant results first for the searched term.

Page to Update:
https://kubernetes.io/docs

@calind

calind commented Aug 6, 2018

curl -sI https://kubernetes.io/docs/home/:

HTTP/2 200
cache-control: public, max-age=0, must-revalidate
content-type: text/html; charset=UTF-8
date: Mon, 06 Aug 2018 15:12:06 GMT
etag: "89ba3d9fabf749f5b399d2a17554de59-ssl"
link: </images/favicon.png>; rel=preload; as=image, </css/callouts.css>; rel=preload; as=style, </css/styles.css>; rel=preload; as=style, </css/custom-jekyll/tags.css>; rel=preload; as=style, </js/custom-jekyll/tags.js>; rel=preload; as=script, </js/script.js>; rel=preload; as=script
strict-transport-security: max-age=31536000
x-robots-tag: noindex
age: 2996
content-length: 33459
server: Netlify

Looks like there is an x-robots-tag: noindex header, so the pages are not indexed by search engines.
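The same check can be scripted. Below is a minimal Python sketch that sends a HEAD request and looks for a noindex directive in the response headers. The function names `fetch_headers` and `is_noindex` are made up for illustration, and the parsing is deliberately simple: it does not handle user-agent-scoped values such as `googlebot: noindex`.

```python
from urllib.request import Request, urlopen


def is_noindex(headers):
    """Return True if any X-Robots-Tag header carries a noindex directive.

    `headers` is an iterable of (name, value) pairs, such as the list
    returned by http.client.HTTPResponse.getheaders().
    """
    for name, value in headers:
        if name.lower() == "x-robots-tag":
            # A single header may list several comma-separated directives.
            directives = [part.strip().lower() for part in value.split(",")]
            if "noindex" in directives:
                return True
    return False


def fetch_headers(url, timeout=10):
    """Send a HEAD request (like `curl -sI`) and return the response headers."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.getheaders()
```

Calling `is_noindex(fetch_headers("https://kubernetes.io/docs/home/"))` would then report whether the page currently asks search engines not to index it.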

@calind

calind commented Aug 6, 2018

Looks like it was introduced by bf4b937.

@IshwarKanse
Author

Yeah, looks like an indexing issue. Even in the Google search for Kubernetes, the homepage is missing. Getting some documentation links only.

@chenopis
Contributor

chenopis commented Aug 16, 2018

Postmortem

Owner: Jared Bhatti
Collaborators: Tom Van Waardhuizen, Zach Corleissen, Andrew Chen, Luc Perkins
Status: Action items in progress

Problem Summary

Impact: An unintended addition of an X-Robots-Tag: noindex header for pages on the Kubernetes website led to those pages being de-indexed by Google and thus opaque to Google searches.
Root Cause: At the moment, the X-Robots-Tag: noindex header is added to all Kubernetes website pages that are not part of the production build, to prevent pages intended for staging and other non-production environments from being indexed by Google. A pull request that added HTTP/2 server push headers for JavaScript and CSS assets inadvertently added the X-Robots-Tag: noindex header to pages in all environments, including the production site.
Duration of problem: K8s.io was removed from Google Search for approximately 48 hours.
User impact: The Kubernetes website remained fully available to all internal and external users, yet the loss of search indexing made documentation pages effectively unavailable to users.
Detection: External users filed GitHub issues alerting us that pages were unavailable via search.
Resolution: A later pull request ensured that the X-Robots-Tag: noindex header would no longer be added to pages in the production environment.

Background

In July 2017, when kubernetes.io hosting was moved from GitHub Pages to Netlify, it enabled SIG Docs to offer additional versions of the Kubernetes documentation on separate subdomains, such as https://v1-8.docs.kubernetes.io/. Consequently, the X-Robots-Tag: noindex header directive was added to the build commands in the Netlify deploy settings (see screenshot below) to prevent the versioned docs from polluting the Google search results related to the main https://kubernetes.io production site.

[Screenshot: Netlify deploy settings]

Then, after kubernetes.io was migrated in May 2018 from using the Jekyll static site generator to Hugo, the site build commands and variables were moved into the netlify.toml file. This was done to ensure that all Hugo-based versions of the website use the proper build commands, and also to unbury the build commands from the Netlify control panel and make the build control mechanisms more transparent.

To maintain the search index cleanliness with this setup, k/website PR #9150 added the noindex directive to the default build commands. This was overridden for the production website by using the Netlify context for the master branch ([context.master]). At the time, the reasoning for doing it that way was to protect against the general problem, which was search pollution caused by versioned branches or staging branches, such as the one used to preview the Hugo migration changes (https://hugo-migration.docs.kubernetes.io/, now defunct). The production build was treated as the exception because master is the only branch that should be in production and indexed.
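The default-plus-override arrangement described above can be modeled in a few lines of Python. This is only a sketch of the idea: the `noindex` key, the dictionaries, and `resolve_build_settings` are hypothetical and do not reproduce the contents of the real netlify.toml.

```python
# Hypothetical settings: these keys and values are illustrative only and are
# not copied from the real netlify.toml.
DEFAULT_BUILD = {"noindex": True}  # every branch is de-indexed by default

CONTEXT_OVERRIDES = {
    # Mirrors the idea of a [context.master] section overriding the default.
    "master": {"noindex": False},
}


def resolve_build_settings(branch):
    """Merge the default settings with any override for the given branch."""
    settings = dict(DEFAULT_BUILD)
    settings.update(CONTEXT_OVERRIDES.get(branch, {}))
    return settings
```

With this shape, only `master` is indexed, and any branch without an explicit override inherits noindex. The flip side is that losing the override silently de-indexes production.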


However, two separate and unrelated PRs, one meant to utilize the HTTP/2 server push functionality (#9225) and one upgrading the site's version of Hugo (#9703), inadvertently dismantled the noindex control mechanism and added the noindex header to the layout/index.headers file (lines 4,5), which affected the master branch and the production build. Consequently, Google search began complying with the noindex directive and dropping kubernetes.io links from search results (see screenshots below).

[Screenshots: Google search results for kubernetes.io]

Impact

Kubernetes site search is crucial to document discovery. The site’s navigation is complex enough that casual users are unlikely to quickly find what they’re looking for without it. In this case, de-indexing the site’s pages crucially handicapped site search. Furthermore, many users access Kubernetes documentation not via site search but rather via ordinary Google searches. When Kubernetes pages are no longer accessible via Google search, one would expect a sharp diminution in general site traffic, and that is precisely what happened in this case.

Root Causes and Trigger

Pull request #9225 inadvertently caused an X-Robots-Tag: noindex header to be added to all pages in the production environment using the _headers file mechanism supplied by Netlify, the site’s hosting provider. The pull request was intended to use Hugo’s getenv function to add the noindex header only in non-production environments (see this GitHub Gist for an example of how that logic would have looked). The pull request’s creator unintentionally failed to include this logic in the reviewed version of the pull request, which made per-page inclusion of the noindex header unconditional rather than dependent on deployment environment.
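The missing conditional can be illustrated outside of Hugo. The Python sketch below is not the template code from the pull request or the Gist. It only shows the intended shape of the logic: emit the noindex rule in the Netlify `_headers` format unless the build is for production. The variable name `HUGO_ENV` and the function `render_headers` are assumptions made for this example.

```python
import os

# A Netlify _headers rule: a path pattern followed by indented header lines.
NOINDEX_RULE = "/*\n  X-Robots-Tag: noindex\n"


def render_headers(environ=None):
    """Return the noindex rule for non-production builds, else an empty string.

    HUGO_ENV is an assumed variable name. A missing value is treated as
    production, so a misconfigured build leaves pages indexable.
    """
    env = os.environ if environ is None else environ
    environment = env.get("HUGO_ENV", "production")
    if environment == "production":
        return ""
    return NOINDEX_RULE
```

Treating a missing variable as production is a design choice in this sketch. It follows the failsafe lesson listed under Lessons Learned: indexing redundant content is preferable to indexing nothing.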

Lessons Learned

There are a handful of lessons to be derived from this experience:

  1. Page headers are extremely important and should be modified only when necessary and when properly vetted via environment-specific testing.
  2. Checks need to be put in place that:
    1. Notify SIG Docs maintainers if already-active pages have been improperly de-indexed so that appropriate actions can be taken, e.g. forcing a rollback of the site to a previous version with proper page indexing.
    2. Prevent pages bearing the X-Robots-Tag: noindex header from being deployed in the production environment in the first place.
  3. More broadly, new features should be added to the website in a more conservative fashion and when risks and benefits have been properly vetted. In this case, the wisdom of implementing Netlify’s HTTP/2 server push feature for the website should have been presented to SIG Docs for discussion. That discussion could have raised awareness of the feature amongst maintainers, which may have led to closer scrutiny of the original pull request.
  4. Knowledge of infrastructure mechanisms and how they behave should not reside in only one person.
  5. Default behavior should be set in terms of a failsafe state, not design simplicity -- indexing redundant content is preferable to indexing none of the content.
  6. All critical infrastructure should be documented and known to all of SIG Docs.

Things that went well:

  • Once the issue was appropriately identified, a fix was enacted and deployed very quickly (within a few hours)
  • The GitHub Issues mechanism for problem reporting worked as intended

Things that went poorly:

  • Even though there were comments in the netlify.toml file identifying the noindex mechanism, they were not clear about the importance of those lines and that they should not be modified without consultation.
  • The originator of the noindex mechanism (chenopis) was cc'ed on PR HTTP/2 server push #9225 but did not respond in a timely manner, so the PR was merged anyway. The author (lucperkins) had come on board at the CNCF just two weeks prior to submitting the PR and was thus not deeply familiar with some of the basic “nuts and bolts” of the site. The PR reviewer (zacharysarah) approved the PR without understanding the downstream impact or enforcing the gate check of a SIG Docs proposal.
  • The changes that dismantled the noindex mechanism happened in two separate and unrelated PRs (HTTP/2 server push #9225 and Update to Hugo 0.46 #9703).
  • Because Google search indexing changes happen on a timescale of weeks, it will be several weeks more before the kubernetes.io links in search results return to normal.
  • Even though the Google search index for kubernetes.io can be monitored in the Google Search Console, it was not regularly checked to identify the issue earlier.
  • On August 2, 2018, a user reported in Slack #sig-docs numerous broken links in search results related to https://kubernetes-v1-4.github.io, which was a very old version of the website. In retrospect, this only bubbled up in the Google search results because the regular kubernetes.io search results were disappearing. Hence, this was a signal that was not properly understood at the time.

Where we got lucky:

  • The issue was reported by an outside user who both had a GitHub account and knew which action to take in reporting the issue.

Action Items

  • Investigate this incident — At the moment, it appears that all important details are known, including the pull request that triggered the changes in question, the extent of the problem, and the changes that need to be made (already enacted).
  • Detect future incidents — A mechanism needs to be created that can periodically—daily, hourly, upon each build, or at some other interval—check to ensure that pages on the production site are being properly indexed (i.e. don’t bear the X-Robots-Tag: noindex header). If the check fails, and pages on the production site indeed bear that header, then maintainers need to be notified immediately (perhaps via Slack or email). Alternatively, use Google Analytics and see if there’s a way we can build in a warning.
  • Mitigate future incidents — If this issue is detected again, the site needs to be rolled back to a previous state, in which the X-Robots-Tag: noindex header is not present. Netlify, the hosting provider for the Kubernetes website, enables one-click rollbacks that should make this fairly easy.
  • Prevent future incidents — While detection/mitigation steps are outlined immediately above, website builds should fail if improper headers are included in the production environment. We could, for example, add a simple post-build script that returns a non-zero exit code if the _headers file—which, on the Netlify platform, determines which headers are added to pages—contains the X-Robots-Tag: noindex header in production.
  • Ensure institutional knowledge — Document critical infrastructure and hand off knowledge/ownership to CNCF to be the source of truth, so that it can be maintained and disseminated to multiple people.
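The prevention item above could be sketched as a small post-build check. This is one possible shape under stated assumptions, not the tooling that was eventually built: the function names are hypothetical, and the caller is assumed to pass the path of the generated `_headers` file and the deploy context (for example, from Netlify's `CONTEXT` build variable).

```python
import sys
from pathlib import Path


def find_noindex_lines(headers_text):
    """Return 1-based line numbers of X-Robots-Tag lines that contain noindex."""
    hits = []
    for number, line in enumerate(headers_text.splitlines(), start=1):
        name, sep, value = line.strip().partition(":")
        is_robots = sep and name.strip().lower() == "x-robots-tag"
        if is_robots and "noindex" in value.lower():
            hits.append(number)
    return hits


def check_build(headers_path, context):
    """Return an exit code: 1 if a production build would ship noindex, else 0."""
    if context != "production":
        return 0  # noindex is expected outside production
    path = Path(headers_path)
    if not path.exists():
        return 0  # no _headers file means no noindex header from it
    hits = find_noindex_lines(path.read_text(encoding="utf-8"))
    for number in hits:
        print(f"{headers_path}:{number}: X-Robots-Tag: noindex in production",
              file=sys.stderr)
    return 1 if hits else 0
```

A build step could end with something like `sys.exit(check_build("public/_headers", os.environ.get("CONTEXT", "")))`, where the `public/_headers` path is also an assumption, so that a production deploy fails before the header reaches the live site.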

Timeline

  • July 17, 2018 — Pull request HTTP/2 server push #9225 introduced the initial problem (noindex headers on all pages for the production site)
  • August 2, 2018 — User reports in Slack #sig-docs numerous broken links in search results related to https://kubernetes-v1-4.github.io, which was a very old version of the website and was not normally listed very high in Google search results.
  • August 7, 2018 — Issue Documentation search not returning relevant results. #9758 identified the search indexing issue (this issue was submitted by someone outside of SIG Docs). Later in the day, pull request Fix indexing for k8s.io #9767 provided a solution to the headers/indexing problem.
  • August 12, 2018 — The Google search indexing for kubernetes.io begins to recover.

Additional items in followup meeting

  • Luc Perkins and CNCF will own the site infrastructure tooling
  • Andrew Chen will work with Luc to document the site infrastructure
  • Content will be documented in the contribute section of the docs content: https://kubernetes.io/docs/contribute/
  • Luc will build a post-build hook that will verify that header files are present when the site is published, with additional tooling to come.
