Add the ability to reprocess old HARs #942

Open
tunetheweb opened this issue Oct 19, 2024 · 0 comments

Since the new way of uploading results into BigQuery happens as part of the crawl and no longer processes HARs after the fact, we are at risk of not being able to recover from a bad upload as well as we could before.

In theory we could run the old Python pipeline (or even the Java one before it). In reality that code is no longer maintained, is likely to break (and is already broken for the more recent, larger datasets), may not work on older datasets, and isn't on a tech stack we know or want to support.

It would be nice to have a "mini-crawler" which didn't actually run the tests, but instead just used the WPTAgent upload code to save a batch of historical HARs into BigQuery.

This would also allow us to handle missing code we never got round to fixing.
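
As a rough illustration of the "mini-crawler" shape, here is a minimal Python sketch. It only walks a folder of stored HARs and hands each one to an upload function, without running any tests. Everything named here is an assumption for illustration: `reprocess_hars`, `iter_har_paths`, `load_har`, the `.har` / `.har.gz` file layout, and especially the `upload` callable, which is a hypothetical stand-in for whatever entry point the WPTAgent upload code actually exposes.

```python
import gzip
import json
from pathlib import Path
from typing import Callable, Iterator, List, Tuple


def iter_har_paths(root: Path) -> Iterator[Path]:
    """Yield every .har or .har.gz file under root, in sorted order."""
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.name.endswith((".har", ".har.gz")):
            yield path


def load_har(path: Path) -> dict:
    """Parse a HAR file, transparently handling gzip compression."""
    opener = gzip.open if path.name.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


def reprocess_hars(
    root: Path,
    upload: Callable[[dict, str], None],
) -> List[Tuple[Path, str]]:
    """Feed each stored HAR to `upload` without running any tests.

    `upload` is a hypothetical placeholder for the real upload step
    (assumed here to take the parsed HAR and the file name).
    Returns (path, error) pairs for files that failed, so a batch
    can be retried later.
    """
    failures: List[Tuple[Path, str]] = []
    for path in iter_har_paths(root):
        try:
            upload(load_har(path), path.name)
        except Exception as exc:  # keep going so one bad HAR does not stop the batch
            failures.append((path, repr(exc)))
    return failures
```

Collecting failures rather than aborting on the first one is a deliberate choice in this sketch, since older HARs may not all parse cleanly and a partial batch is still useful.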
