
feat: Add --max-comments-per-command configuration #3905

Merged
merged 19 commits into from
Jun 27, 2024

Conversation

glasser
Contributor

@glasser glasser commented Oct 31, 2023

what

Allow Atlantis administrators to cap the number of VCS comments that will be created for a single command. (The default value is 0, which means unlimited comments.)
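For illustration, enabling the cap at server start might look like this (the flag name comes from this PR's title; the environment-variable form is an assumption based on Atlantis's usual flag-to-`ATLANTIS_*` convention and should be checked against the released docs):

```shell
# Illustrative only: cap each command at 10 VCS comments.
# 0 (the default) means unlimited. --repo-allowlist is unrelated
# boilerplate shown for context.
atlantis server \
  --repo-allowlist='github.com/myorg/*' \
  --max-comments-per-command=10

# Assumed environment-variable form (verify against the released docs):
# ATLANTIS_MAX_COMMENTS_PER_COMMAND=10
```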

why

If you're trying something like TF_LOG=debug, Atlantis can produce so many comments that it becomes challenging to read your PR, or worse still, your VCS might rate-limit Atlantis, effectively breaking your ability to use it for an extended period of time. While this PR does not change the default behavior, it's probably a good idea to set this flag to something like 10.

tests

Unit tests for common.SplitComment. (There don't appear to be other tests of comment splitting.)

references

@glasser
Contributor Author

glasser commented Oct 31, 2023

Hi! This is my first Atlantis PR and is a basic attempt to fix #416. I'd love to have some high-level feedback as to whether this is a reasonable approach that would be accepted before I finish it.

So far I've just focused on the top level structure of how to add a new entry; before this could be merged I would write implementations for non-GitHub VCS implementations.

@glasser glasser changed the title Add --max-comments-per-command configuration feat: Add --max-comments-per-command configuration Oct 31, 2023
Fixes runatlantis#416.
@jamengual jamengual added the waiting-on-review Waiting for a review from a maintainer label Nov 1, 2023
@lukemassa
Contributor

@glasser thanks for the contribution! I think this is definitely a step in the right direction.

I've recently been a bit worried about the growing number of server configuration options. I wonder if a first pass could set a hard-coded value like 10, or even 100? We can always make it configurable later; I just feel like the marginal benefit of additional comments goes way down after even just a couple, and if you need to capture 327,680 (or 3M) characters, you can (and should) be using a "debug workflow" with a custom command. We could even write a brief "how to" and link to it in the footer. I'm interested in what the maintainers have to say about the tradeoff between not changing current behavior vs. stemming the rising tide of configuration options. And also, @glasser, whether there's a hard-coded number you think would be sufficient (or even, given you've implemented this, what you had planned to set it to for your setup)?

Another, unrelated concern I have is clipping the end, when likely the last few lines will have a lot more value than the lines in the middle. I'm not sure how to address that, and it could well be a future improvement, something akin to golang/go#7181. For example, if there are determined to be n comments' worth of text and the max is k < n, we could show the first k-2, one comment that says "elided pages of text", then one last comment with the nth part. Again, this is just an idea for a possible future improvement; I don't think it needs to complicate this PR.
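Sketching that keep-the-tail idea (purely illustrative; this is not Atlantis's implementation, and all names here are made up):

```go
package main

import "fmt"

// capKeepLast sketches the elision idea from the discussion: given n chunks
// and a cap of k < n, emit the first k-2 chunks, one marker comment noting
// how much was elided, and then the final chunk (which usually carries the
// error summary).
func capKeepLast(parts []string, k int) []string {
	if k <= 0 || len(parts) <= k {
		return parts // cap disabled, or already under the cap
	}
	if k < 3 {
		return parts[len(parts)-k:] // too few slots to elide the middle; keep the tail
	}
	elided := len(parts) - (k - 1) // chunks replaced by the marker
	out := append([]string{}, parts[:k-2]...)
	out = append(out, fmt.Sprintf("(%d comments elided; see the full log in the Atlantis UI)", elided))
	return append(out, parts[len(parts)-1])
}

func main() {
	parts := []string{"p1", "p2", "p3", "p4", "p5"}
	// With k=3: keep p1, insert a marker covering p2-p4, keep p5.
	fmt.Println(capKeepLast(parts, 3))
}
```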

All that said, certainly some logs are better than the crashing behavior we have now, so I definitely want to pursue this.

@glasser
Contributor Author

glasser commented Nov 4, 2023

I do like the idea of preserving (say) the last page. That said, you can also view the full log at the web UI anyway, right? Which in practice is what I'd do for any log that triggers this.

I was trying to avoid backwards-incompatible changes without any way to even revert it, thus the configuration. If you'd prefer an uncontrollable change, I'd get that. My project will probably limit to 10.

@lukemassa
Contributor

I do like the idea of preserving (say) the last page. That said, you can also view the full log at the web UI anyway, right? Which in practice is what I'd do for any log that triggers this.

Yeah good point. Let's leave this as "maybe do" for later.

I was trying to avoid backwards-incompatible changes without any way to even revert it, thus the configuration. If you'd prefer an uncontrollable change, I'd get that. My project will probably limit to 10.

That's my personal preference, others may differ. We can always make it configurable later if need be, I just want to defer adding a server flag if possible, since those are harder to remove.

I personally agree that getting back >10 comments, each with 30k characters of text, feels pretty cumbersome, and users should at that point be encouraged to pursue a different way of saving/shipping logs.

@glasser
Contributor Author

glasser commented Nov 4, 2023

Ok, cool. I'm not sure how the process works around here -- before I go and rip out the option, is this a relatively solid decision and you're likely to merge the PR, or should we wait for more thoughts?

@jamengual
Contributor

@GenPage, @nitrocode and I are the maintainers, and @lukemassa has been helping with a lot of contributions lately.

let's wait for us to catch up with this PR in a few weeks.

Thanks

@lukemassa
Contributor

Ah sorry, I should have been more clear; I was just offering my opinion, I don't have any authority on the matter :) I'll let @jamengual take it from here with a recommendation, apologies for not clarifying that.

@jamengual
Contributor

Ah sorry, I should have been more clear; I was just offering my opinion, I don't have any authority on the matter :) I'll let @jamengual take it from here with a recommendation, apologies for not clarifying that.

all good @lukemassa your input is super valuable to us and is always welcome.

I hope next week we can find more time to do reviews; plus, we are in the process of changing the release pipeline, so it is going to take some time.

@brandon-fryslie

brandon-fryslie commented Nov 8, 2023

Hi, I'm just a regular engineer and I absolutely do not mean to knock the work you've put in. I really appreciate every contributor to this project, and it's helped us so much. I think this PR is pretty useful functionality to have, maybe with some minor tweaks. I see a few issues:

  • The argument to GithubClient seems unrelated to the rest of the arguments. Could there be a better place to get this value, or otherwise a more generic pattern for passing user configuration to SCM providers? For example, where is CHECKOUT_DEPTH provided? Of all the arguments, it is certainly the odd one out.
  • I echo the previous concerns about truncating. Truncating the middle is vastly preferable. You'd miss a huge amount of context with this, like how many changes were made, what failed, any error messages. Unless the logs were saved somewhere else (like disk, an S3 bucket, etc.), the logs of the run would be pretty much useless.
  • I'd almost prefer some sort of "minification" before truncation. I find most of my "many comments" PRs come from changes to inline policy templates used across many roles. Terraform doesn't do any sort of comparison right now and lists the entire output of both the "removed" role and then the "new" role. Most of that is useless, so maybe diffs larger than 100-200-whatever lines get truncated in the middle. That would dovetail nicely with functionality to save these logs off to S3 or somewhere else, while hardly impacting usability in any way.
  • I'm not exactly sure how to achieve this (I'm not familiar with all of the SCM architecture in Atlantis), but I don't see why this would be implemented separately for each SCM (GitHub/GitLab/whatever the other one is). It doesn't seem like configuring these separately per SCM would be especially valuable.

These are just my 2c; we do run into this issue sometimes. We would never use TF_LOG=debug in Atlantis since we're admins and can run things locally, but honestly I don't think you'd ever want that output directly in a PR comment. It might contain tokens or sensitive values, and lots of people can see those PR comments. It would be far better to filter ALL debug output from the comments and have a secondary storage medium for unfiltered output (that is, of course, highly restricted like Atlantis itself should be). Disk is easiest at first, but then you have to think about cleanup. S3 is AWS-specific, but there are tools that allow mounting a bucket as a file path. Then you can push cleanup onto the users and avoid a whole mess of complexity around filling your disk within Atlantis itself.

@glasser
Contributor Author

glasser commented Nov 8, 2023

The argument to GithubClient seems unrelated to the rest of the arguments. Could there be a better place to get this value, or otherwise a more generic pattern for passing user configuration to SCM providers? For example, where is CHECKOUT_DEPTH provided? Of all the arguments, it is certainly the odd one out.

Sure, it would be reasonable to have a struct for common configuration for creating comments.

I'm not exactly sure how to achieve this (I'm not familiar with all of the SCM architecture in Atlantis), but I don't see why this would be implemented separately for each SCM (GitHub/GitLab/whatever the other one is). It doesn't seem like configuring these separately per SCM would be especially valuable.

It's not being configured separately per SCM, just passed through. (The different SCMs do have different formats for their headers/footers and length limits, which is why the split call is happening inside the SCM code.)

@brandon-fryslie

brandon-fryslie commented Nov 8, 2023

It's not being configured separately per SCM, just passed through. (The different SCMs do have different formats for their headers/footers and length limits.)

Totally missed that. Awesome, great work. It wouldn't especially help our team/teams (we have 10+ Atlantis instances running internally; my team runs 2, one for dev and one for prod), but what I am seeing is that teams are simply taking the output and persisting it elsewhere via a Python script.

I (highly) appreciate everyone who has contributed to this project. My ideas were mostly suggestions to spur future thoughts; I don't expect them to make it into your PR or want to complicate things. However, middle truncation is kind of a base-level functionality. It would almost be better to limit the maximum number of projects that can run, or to allow custom filtering inline.

Many different people are using Atlantis. If this were, e.g., a default of 10 comments per Atlantis command, e.g., atlantis plan, then nearly every single PR we have would be truncated. We have a monorepo (this was to get the team familiar with TF; now we're splitting new things out) with (these are made-up numbers) ~40 root modules and ~40 child modules. We have 8 separate environments. So if you were to make a change that impacted 4 modules used in all environments, that's 32 different runs of terraform alone. It's easy to make a change that requires 50 or 60 terraform executions. If you have even a single module that needs to split its output into multiple comments, you're automatically looking at 70+ comments for a SINGLE execution of atlantis plan. It's not uncommon to have 150+ comments from Atlantis by the time it's done.

I'm not expecting you to change anything on our behalf. Just trying to give some perspective on different use cases in case it's useful. Huge appreciation for the maintainers, along with everyone who contributes great, useful features like this!

@glasser
Contributor Author

glasser commented Nov 8, 2023

That seems like a good data point about the question @lukemassa raised as to whether it would be reasonable to have this be a non-configurable hardcoded value (with the answer "no").

@brandon-fryslie

brandon-fryslie commented Nov 8, 2023

That seems like a good data point about the question @lukemassa raised as to whether it would be reasonable to have this be a non-configurable hardcoded value (with the answer "no").

We would be stuck on whatever version of Atlantis allowed us to use it, I guess :)

If there were some sort of mechanism to persist the full logs to "offsite storage" (a bucket seems logical, since there are relatively few restrictions beyond access, and it removes the burden of cleanup from Atlantis entirely; let users clean up their own bucket), that would completely resolve any concerns about truncation, number of comments, whatever.

The logs in the "Detail Link" are kind of a joke unless you're just testing a single module to learn terraform and have time to sit around and watch it. 95% of the time the data is gone by the time we need it. Since Atlantis is in a completely isolated environment, non-admins can't get anywhere near that cluster. I don't think more advanced devops people would care if it was removed entirely, for the most part. Our users can't access it, and it would be empty if they did. Without persistence, pre/post workflow hooks, stuff like that, it's maybe 0-5% useful in my experience. It's certainly not close to a replacement for the PR comments, which are available for more than the 3-4 minutes it takes for the module to run. The single use case is "my module has been running for 45 minutes and I have no idea what's going on, let's see what's up". But out of thousands and thousands of deployments we've never actually needed those logs to debug that issue (hint: it's network connectivity, or you're replacing node groups on an EKS cluster).

The point there is that those live streaming logs are not a replacement for anything. Possibly "offsite" persistence via s3 could actually make those logs useful, who knows. I really apologize if one of you implemented those. Absolutely nothing personal and I'll buy you a drink next time you're in town. Sorry for being brutally honest

@brandon-fryslie

We have teams persisting to gist but I don't think that's generic enough for Atlantis code. There just needs to be an extremely simple adapter to optionally persist the full, unaltered logs to a second "offsite" location. Lifecycle rules, cleanup, whatever, could be entirely left to the user. No need to complicate Atlantis.

That would remove any pressure on changes to the PR comments: truncation, formatting, anything. No matter what, you'd always be able to refer to the actual unfiltered output. The PR comments are critical right now because you can't do anything without them. They're the source of truth. Full, unaltered text output could become the source of truth, and give more flexibility to alterations in the content of the comments.

And hey, you could probably read those files back in when someone wants to see the output of a job via the "Show Details" link. Then, rather than the constant disappointment that greets users when they see that empty, black screen, there's a chance it could help them solve a problem.

@brandon-fryslie

brandon-fryslie commented Nov 8, 2023

Just to recap, since my comments might be interpreted as off-topic, here's the gist:

  • This directly relates to the "comments issue" because it would always provide a fallback option if something got truncated in the comments. Otherwise, no matter what, truncation is losing some data.
  • Right now the PR comments are the ONLY viable way to see what happened during an atlantis execution.
  • The "Show Details" link is not helpful because it's empty most of the time; it seems to be emptied as soon as the build finishes.
  • With fully enabled and unfiltered "offsite" storage, comments could be experimented with and made much more flexible without breaking the single interface to your job results.
  • Offsite storage (vs. storing it on the volume) means that Atlantis doesn't need to implement ANY logic for cleanup. Make users do it: there are so many options; S3 has lifecycle rules for pretty much anything, and a tiny lambda could do the rest. Atlantis shouldn't need to worry about it.

There is no fallback right now if truncation is merged and breaks a 1-in-10,000 use case. Changes to the primary interface with zero fallbacks are inherently risky. People will come in hot with stuff like "". At least if the logs are in S3 or somewhere, you can tell them to use that instead. Like git plumbing vs. porcelain commands: offsite output is plumbing; PR comments are porcelain.

@jamengual
Contributor

@glasser @brandon-fryslie I'm a bit conflicted by this feature. It is trying to solve a problem that does not pertain to Atlantis's core features, and I will explain why.

  • Terraform debug: debug output is usually considered sensitive; it is not a good idea to expose it anywhere it can be saved, especially in a repo where it will be held forever.
  • Debug logs: debug/trace logs are usually too lengthy and are traditionally sent to other systems for aggregation, searching, etc.

Based on that, if you want to debug a run in Atlantis, you should enable debug and use the Detail live logs to see the logs, but I won't recommend running a debug run based on my earlier comment.

Atlantis support for external log storage would be better suited for this kind of thing, where you can send logs to S3, CloudWatch, fluentd or any other system instead of your PR comments. You could have a server-side repo config like:

```yaml
external-logging: enabled
external-log-facility: localhost:2020
external-log-strategy: debug  # options could be: debug, all
```

That could point to your log aggregator to then send the log somewhere else to be seen by whoever is trying to debug.

That feature does not exist yet and would need to be built.

@benoittoulme benoittoulme dismissed stale reviews from lukemassa and GenPage via 96c71a3 June 18, 2024 18:03
@benoittoulme
Contributor

The changes here don't seem to incorporate #3905 (comment); however, seeing as they are backwards compatible, if people are happy with the approach, this could be merged as is, and I could implement my proposal in a separate PR. That might make it clearer to separate the implementation from the "policy".

@lukemassa sorry for the back-and-forth. I addressed the first 2 points of your comment. For 3) and 4), I would argue that this would be better in its own PR, as AFAIK no such mechanism exists. Also, I would argue (and plan to do so in my company) that post-plan or post-apply hooks would allow someone to implement such a mechanism, for example by uploading the plan / command results to S3. But I doubt we can have a solution baked into Atlantis that works for all.
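As a sketch of that hook-based approach (hypothetical bucket name and log file path; the exact hook syntax and the environment variables available to hooks should be verified against your Atlantis version's server-side repo config docs), shipping command output to S3 after each run might look like:

```yaml
# Server-side repos.yaml: after each command, ship whatever log file your
# workflow wrote to object storage. Everything below is a placeholder
# sketch, not a tested configuration.
repos:
  - id: /.*/
    post_workflow_hooks:
      - run: |
          aws s3 cp ./output.log \
            "s3://my-atlantis-logs/$BASE_REPO_OWNER/$BASE_REPO_NAME/$PULL_NUM/output.log"
```

Bucket lifecycle rules can then handle retention, keeping cleanup logic out of Atlantis entirely.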

So please, could you do another review, and thank you for your feedback!

@lukemassa
Contributor

lukemassa commented Jun 26, 2024

@benoittoulme looks good to me! @GenPage if this looks good to you, I'm happy to go ahead and merge.

NOTE: If we accept this as is, it'll need to go out in a minor release, and there should be a callout in the release notes. This changes default functionality, such that if a user relies on plans longer than 100 comments, they will need to set this value to 0 to restore the current behavior.

Overall I still think we're more likely to fix problems than cause them with this change (famous last words...), so I'm still supportive of the non-backwards compatible approach in this instance, as long as it's properly announced with a description of how to keep prior behavior.

@GenPage GenPage added feature New functionality/enhancement breaking-change and removed waiting-on-response Waiting for a response from the user needs discussion Large change that needs review from community/maintainers labels Jun 26, 2024
Member

@GenPage GenPage left a comment


LGTM. I added the feature and breaking-change labels and removed the docs label, which should highlight this in the release notes.

@GenPage GenPage removed the docs Documentation label Jun 26, 2024
@lukemassa lukemassa merged commit 5a918e3 into runatlantis:main Jun 27, 2024
28 checks passed
@glasser
Contributor Author

glasser commented Jun 27, 2024

@benoittoulme Thanks for finishing this up!

(I removed you from my fork.)

@RodrigoMenezes-Vantage

Thanks for all the work everyone put into this.

Any idea when this will be getting shipped out? Just tried to use the flag only to realize it's not included in the latest release.

@benoittoulme
Contributor

Thanks for all the work everyone put into this.

Any idea when this will be getting shipped out? Just tried to use the flag only to realize it's not included in the latest release.

I believe this will be included in the v0.29.0 release, because of the breaking change (I am also waiting for this release :) )

@benoittoulme
Contributor

Thanks for all the work everyone put into this.
Any idea when this will be getting shipped out? Just tried to use the flag only to realize it's not included in the latest release.

I believe this will be included in the v0.29.0 release, because of the breaking change (I am also waiting for this release :) )

🥳 🎉 https://github.com/runatlantis/atlantis/releases/tag/v0.29.0
Thank you!

@mubarak-j

@benoittoulme Thank you for working on this awesome addition, much appreciated!
I filed #4920 which seems like a regression when combining this new flag with hide-prev-plan-comments. Assuming it’s not intentional, is this something you can replicate in your setup?


Successfully merging this pull request may close these issues.

Don't post comment if it's too large
9 participants