feat: Automatically capture Winston logs per-snapshot #343
Purpose
To debug snapshots, support sometimes needs to ask the customer to rerun a build with LOG_LEVEL=debug and send the resulting logs. This isn't always feasible, since some issues are not easily reproducible. These logs are also interleaved with test runner logs, which can make them hard to parse.
We can tell Winston to send snapshot-specific logs to a file, and upload that log file along with other snapshot resources. This allows support to see logs when downloading a snapshot for debugging without asking the customer.
Approach
First, Winston needed to be updated to 3.x, as it is much more configurable and modular than Winston 2.x. This allows us to reuse the console transport and to use a separate logger for each snapshot's own logging. The biggest changes from 2.x are the format options and the lack of defaults. The old format was minimally recreated for the console transport, while the file transport is left as raw JSON.
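As a rough configuration sketch of the setup described above (not the code from this PR): one console transport with an explicit format is shared, and each snapshot gets its own logger with an additional JSON file transport. The helper name `createSnapshotLogger`, the `level: message` console layout, and the choice to always capture `debug` in the file are assumptions made for illustration.

```typescript
import * as winston from 'winston'

// Winston 3.x applies no format by default, so a minimal
// "level: message" line is spelled out for the console.
const consoleFormat = winston.format.printf(
  ({ level, message }) => `${level}: ${message}`,
)

// A single console transport instance, reused by every logger.
// Its own level follows LOG_LEVEL, so console output stays as quiet as before.
const consoleTransport = new winston.transports.Console({
  level: process.env.LOG_LEVEL || 'info',
  format: consoleFormat,
})

// Hypothetical helper: builds the logger for one snapshot.
// `filename` is the path of that snapshot's log file.
function createSnapshotLogger(filename: string): winston.Logger {
  return winston.createLogger({
    // The logger itself lets everything through (assumption); each
    // transport then filters by its own level.
    level: 'debug',
    transports: [
      consoleTransport,
      // The file transport keeps raw JSON, one entry per line.
      new winston.transports.File({
        filename,
        level: 'debug',
        format: winston.format.json(),
      }),
    ],
  })
}
```

With this shape, the file would contain debug-level entries even when the console is running at the default level, which is what would let support read them without asking for a rerun.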
When the agent process receives a snapshot, it creates the log file along with the snapshot-specific logger. The logger is then passed down the chain of services: from the snapshot service to the asset discovery service, then to the resource service, and finally to the response service. Each of these services is responsible for a different aspect of a snapshot, so each needs to be aware of the snapshot-specific logger.
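The hand-off described above could look roughly like the following. This is a simplified sketch: the class and method names are hypothetical stand-ins rather than the real service APIs, and the logger is reduced to a single `debug` method (which a Winston logger would satisfy).

```typescript
// Minimal logger shape used by this sketch.
interface SnapshotLogger {
  debug(message: string): void
}

// Last link in the chain: only logs and returns.
class ResponseService {
  process(url: string, logger: SnapshotLogger): string {
    logger.debug(`response: ${url}`)
    return url
  }
}

class ResourceService {
  private readonly responses = new ResponseService()

  create(url: string, logger: SnapshotLogger): string {
    logger.debug(`resource: ${url}`)
    return this.responses.process(url, logger)
  }
}

class AssetDiscoveryService {
  private readonly resources = new ResourceService()

  discover(url: string, logger: SnapshotLogger): string {
    logger.debug(`asset discovery: ${url}`)
    return this.resources.create(url, logger)
  }
}

class SnapshotService {
  private readonly discovery = new AssetDiscoveryService()

  // The logger arrives as an argument instead of being a shared global,
  // so every service writes into the log of the snapshot being handled.
  build(url: string, logger: SnapshotLogger): string {
    logger.debug(`snapshot: ${url}`)
    return this.discovery.discover(url, logger)
  }
}

// Usage: collect messages in memory to see which services logged.
const collected: string[] = []
const memoryLogger: SnapshotLogger = { debug: (m) => { collected.push(m) } }
new SnapshotService().build('http://localhost/page', memoryLogger)
```

Passing the logger per call, rather than storing it on a long-lived service, keeps concurrent snapshots from writing into each other's log files.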
Notes
Since the content sha of a resource needs to match the sha used when a snapshot is created, we cannot send logs after the fact for an entire build. Logs can only be sent when the snapshot is created, which is why only snapshot-specific logs are captured and uploaded.
Support may still need to debug test runner output in cases where the snapshot-specific logs don't show anything suspicious. We can automatically capture this output as well, but wouldn't be able to upload it with a build (security concerns aside). We would need a way to associate resources with a build during finalization, or a separate endpoint for uploading build logs.
TODO
Write a test that verifies a new log is created per snapshot. These logs include a timestamp in their name to avoid overwriting each other in the tmp directory, so testing this might involve stubbing Date so that the log is written to a stable filename.
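One possible shape for such a test, sketched under assumptions: `snapshotLogPath` and the `snapshot.<timestamp>.log` naming are hypothetical stand-ins for however the agent actually names its files, and `Date.now` is pinned by hand here, though a stubbing library could do the same job.

```typescript
import * as os from 'os'
import * as path from 'path'

// Hypothetical helper: a timestamp in the name keeps logs from
// different snapshots from overwriting each other in the tmp directory.
function snapshotLogPath(now: number = Date.now()): string {
  return path.join(os.tmpdir(), `snapshot.${now}.log`)
}

// Runs `fn` with Date.now() pinned to `timestamp`, then restores the
// original implementation even if `fn` throws.
function withFixedNow<T>(timestamp: number, fn: () => T): T {
  const original = Date.now
  Date.now = () => timestamp
  try {
    return fn()
  } finally {
    Date.now = original
  }
}

// With the clock pinned, the filename is stable and can be asserted on.
const stablePath = withFixedNow(1500000000000, () => snapshotLogPath())
```

A real test could then check that a file exists at `stablePath` after a snapshot is processed, and that a second snapshot taken at a different pinned time produces a second file.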