Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Comprehension / generator, opt-in feature to preserve line breaks #11753

Open
Martin19037 opened this issue Jun 5, 2024 · 3 comments
Open

Comprehension / generator, opt-in feature to preserve line breaks #11753

Martin19037 opened this issue Jun 5, 2024 · 3 comments
Labels
formatter Related to the formatter style How should formatted code look

Comments

@Martin19037
Copy link

Before I begin, here is some context:

I would love to incorporate ruff as a pre-commit formatter for one of our projects. We developed the habit of almost always separating comprehension target-expression, for-in statement(s), and optional if-expression(s) into separate lines, except for very short and trivial comprehensions. Naturally, this makes identifying these separate comprehension parts very easy at a glance, at the cost of a few more lines. Also, we have increased our line length somewhat.

In its current form, ruff will always collapse comprehensions to a single line if:

  • it is short enough to fit within the line length
  • there are no "magic trailing commas" in any of the comprehension parts

We do use trailing commas ourselves, so many of our hand-formatted comprehensions that contain trailing commas in any form remain more or less unchanged after running ruff format. The ones that do not, usually get collapsed to a single line mess (also due to our increased line length).

One toy-example for this would be:

[
    i + n
    for i in range(5)
    for n in range(5)
    if i < 3
]

which would be collapsed to a single line:

[i + n for i in range(5) for n in range(5) if i < 3]

which is not as easy to recognize at a glance as the original example.

Would it be possible to have an opt-in feature to always preserve line breaks in comprehensions and generators? The behavior would be the same as if there were a trailing comma anywhere in the comprehension, and each comprehension-part is given a separate line, maybe even expand the whole comprehension into separate lines if there is at least one line break in it.

@AlexWaygood AlexWaygood added the formatter Related to the formatter label Jun 5, 2024
@MichaReiser MichaReiser added the style How should formatted code look label Jun 5, 2024
@MichaReiser
Copy link
Member

Thanks for the detailed write up. I can understand that nested comprehensions can get hard to read.

Would it be possible to have an opt-in feature to always preserve line breaks in comprehensions and generators?

Respecting line breaks would set a new precedence for ruff/black. There are other formatters that rely on line breaks. For example, prettier doesn't collapse an object expression if it has a line break after the last property's value:

let a = { 
	a: 1,
  	b: 2
}

Whereas the following will be collapsed because it doesn't have a line break after b

let a = { 
	a: 1,
  	b: 2 }

I could see a similar rule where the comprehension wouldn't collapse if there's a line break before the closing ], but I'm very hesitant of adding it because:

  • It makes formatting changes non-reversible. A comprehension never collapses after the formatter expanded it once
  • It's non obvious why the formatter doesn't collapse some comprehensions but does collapse others
  • It will lead to less consistent code because users can now overrule the formatter.

Black's/Ruff's magic trailing comma suffers the same problem.

I'm more open if we could identify a formatting rule that makes comprehensions more readable in general without requiring manual intervention. Are there specific patterns in the body that should never collapse? E.g. if there's any nesting?

@Martin19037
Copy link
Author

Martin19037 commented Jun 6, 2024

@MichaReiser Thank you for your feedback. Your arguments sparked a discussion within our team, what the actual requirements of such a feature would be, and that the initial idea of relying on line breaks is not sufficient or feasible.

Initially, we though that the decision, whether to collapse or expand a given comprehension is mostly subjective. After checking a number of actual examples from our code base, we noticed a pattern that may describe what we are looking for. We usually collapse comprehensions that do not stray too far from the "visual span", as in, you can see and perceive the whole comprehension without having to scan your eyes along horizontally too much, if that makes sense to you.

Another toy example that may show what I mean. Here are two comprehensions that would produce the same AST:

[x for x in range(3) if x == 0]

[descriptive_name for descriptive_name in range(3) if descriptive_name == 0]

We would prefer the top one collapsed, since it is short enough. The bottom one we would expand, as it is more effort to scan/read along the line to know what is going on, even though it has the same complexity as the top one.

Using this knowledge, our suggestion would be to introduce a maximum comprehension (or generator) expression length (opt-in of course, to preserve Black compatibility). Unlike the line length parameter, it only describes the length of a collapsed comprehension expression itself, regardless of indentation or nesting. I do not know any implementation details, but I imagine it working like so:

  • If a comprehension can be collapsed using existing rules, the collapsed single-line form is measured against the max comprehension expression length
  • If we are below the expression length, the collapsed form can be kept
  • Otherwise, it has to be expanded

Regarding nesting, I imagine this also working for deeply nested and highly indented cases. If a nested comprehension expression is short enough, it may be collapsed within a larger, expanded one. The overall line-length still wins over this, if indentation or nesting is too high.

Does this make sense at all? Also is this technically feasible to implement, or are we dreaming too much here?

@Avasam
Copy link

Avasam commented Jun 11, 2024

This is something I've suggested on the formatter alpha/beta feedback as well, and probably my only reticence in using Black or Ruff as a formatter (still on autopep8 + add-trailing-comma ). And bad unwrapping is the main reason (amongst others) I don't use prettier.
#7310 (comment)

The tl;dr of my post above: there's magic commas for most of my needs where I want to force a vertical style, but nothing as such for comprehensions and conditional expressions. (unless you count using # fmt: skip on most of them). Although i could be happy enough with a single # ?


Similar comment from @Booplicate : #7310 (comment)


I feel like @Martin19037 's description fits pretty well. Having a conditional/ternary/comprehension-specific length (total line length? count just the comprehension itself?) could be weird at first, but it fits all the requirements established by @MichaReiser (consistent, reversible, no manual intervention).

Although there's also the question of whether the in operator in comprehensions should be unwrapped as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
formatter Related to the formatter style How should formatted code look
Projects
None yet
Development

No branches or pull requests

4 participants