Rewrite indentation code #31

axvr · 2023-04-27T14:01:55Z

This PR contains a rewrite of the whole Clojure indentation code with several aims:

Improve the (currently abysmal) indent performance (Improve indent performance #6),
Make indentation rules more configurable (such that Question: is there a way to configure Tonsky's indentation rule? #21 and more are possible),
Change the defaults to be more similar to the Clojure Style Guide, Emacs and VS Code defaults,
Simplify the indentation code as much as possible (currently so complex, it is near impossible to work on).

Along with these aims, this PR makes several further quality-of-life improvements to the indentation system, such as making it such that = does not alter the indentation of multi-line strings.

As of the latest change, the indentation is over twice as fast as before and implemented in about half the code.

To do

The objectives are: 1. Simplify the indentation code; previous implementation has become so complex it is impossible to maintain, 2. Significantly improve performance; previous indentation code was painfully slow, (see issue #6) 3. Maximum configurability; should be configured similarly to cljfmt and make previously impossible things possible (e.g. issue #21). As of this commit, objectives 1 and 2 have been met, but work on objective 3 has not yet begun. There will continue to be further improvements, particularly around performance and the "what if syntax highlighting is disabled?" scenario. These changes will unfortunately be backwards incompatible, but hopefully the improved performance and API will make up for it.

By utilising some ugly mutable state, this change more than doubles the performance of the indentation code. You can expect even further benefits in larger code blocks. This should completely eliminate any need for the `clojure_maxlines` configuration option.

Since maps are far more likely to across multiple lines than vectors (and when they are they tend to be much larger, e.g. EDN files!), we can further optimise indentation by checking if the line is inside a map *before* checking it is inside a vector. (This happens because of the performance improvement made with the addition of the optimised `CheckPair` function.)

The `=` operator will no longer alter the indent of lines within a Clojure multi-line string or regular expression. The previous behaviour was annoying for writing detailed doc-strings, as it made reformatting the file with `gg=G` not possible as it would screw up the indentation within the doc-strings. Now the behaviour matches that of VS Code and Emacs.

By setting the `clojure_align_multiline_strings` option to `-1` it will switch to traditional Lisp multi-line string indentation, of no indent.

Sometimes it seems that when `=`ing over a block, Vim will sometimes not re-highlight strings correctly until we have already ran `searchpairpos`. In this case, we implement a hacky workaround which detects the false-positive and recovers from it.

Still lots to do here, this is just the initial work.

axvr · 2023-04-30T19:23:30Z

I've been experimenting with a custom algorithm as per the check list item above. This version eliminates all syntax highlight checks (and therefore no longer requires the various hacks to make those checks work) plus it is approx the same speed as my first rewrite attempt and uses about the same amount of code.

However the core benefits of this version is that it works when syntax highlighting is switched off, and eliminates the performance bottle neck of syntax highlight checks. This means that it will be possible to write a version of the algorithm in Vim9 script, to get another significant performance improvement! (Maintaining both a legacy Vim script and Vim9 script versions should be feasible due to how much simpler this is than the original implementation.)

TL;DR: I'll push a new version to this branch soon which will be:

Simpler with no hacks,
Can be become even faster (via Vim9 script),
Works even when syntax highlighting is switched off.

Once this branch is feature complete, I'll share benchmarks comparing it to the original as is on master.

Not complete yet, but getting there.

Some refactoring should be possible here and further optimisations. Once all optimisations I can think of have been implemented, I'll try writing an alternate Vim9 script version. (The syntax highlight group checks used in previous implementations of the indentation code was the core bottleneck, so a Vim9 script version would not have been much faster.)

The performance of the new reader-like indentation algorithm has now surpassed the performance of my previous syntax highlighting approach with better accuracy and no hacky code required.

axvr · 2023-05-01T14:54:23Z

Performance benchmarks as of current change (+ unpushed Vim9 version). This is measuring the time to reindent a whole file on a Macbook Pro 2021, so expect much more noticable performance gains on slower hardware. 🚀

Code version	200 lines of Clojure	600 lines of EDN
Original	0.96s	4.6s (`clojure_maxlines = 0`), 3.5s (`clojure_maxlines = 300`)
New (legacy)	0.33s	2.6s
New (Vim9)	0.06s	0.46s

(I'll share new benchmarks (and the full reports generated by Vim) once this branch is feature complete, although I don't expect it to change much.)

Even with the Vim9 implementation in the code too (not pushed yet, still figuring out how best to integrate it in a maintainable way), we can expect the file size of indent/clojure.vim to be ~150 LOC less than the original!

Regarding Neovim, until Vim9 script support is added, the "New (legacy)" code will be used. If anyone would like to volunteer to write a Lua implementation of this code for Neovim please open an issue and we can discuss how to integrate it.

Previously backslashes were accidentally detected as tokens by the indentation tokeniser. This meant that character literals, would break indentation of everything after them.

SavageMessiah · 2024-07-14T16:47:56Z

Thanks for working on this, I've been trying clojure dev in neovim lately and was getting really frustrated by the fact that it would indent based on brackets in string literals. I just tried this branch and that problem is fixed.

Fixes indentation of reader conditional macros. May choose to remove this if function argument indentation is generally preferred by programmers when a keyword is in function position.

axvr · 2024-09-22T22:47:09Z

Thanks for working on this, I've been trying clojure dev in neovim lately and was getting really frustrated by the fact that it would indent based on brackets in string literals. I just tried this branch and that problem is fixed.

@SavageMessiah great to hear that! If you encounter any issues on this branch please let me know.

NoahTheDuke · 2024-09-26T14:24:01Z

Feedback:

(filter-test
 a b c)

(filter-suite
  a b c)

(letfn [(filter-test
         [test-case]
         ...)
        (filter-suite
          [suite]
          ...)]
  ...)

It seems the matching algorithm for function indentation is based on any portion of the function's name, not the whole thing, so test is indented with 1 space and others are indented with 2. Regardless of my preference (1 vs 2), these should not be indented differently, especially when they're both user-defined functions.

Additionally, letfn functions incorrectly have hanging indentation, not flow indentation, as seen below. The func body should be indented 2 spaces, not even with the brackets.

(letfn [(func [test-case]
              ...)]
  ...)

Thanks so much!

axvr · 2024-10-04T16:54:34Z

Hi @NoahTheDuke, thanks for raising and testing!

It seems the matching algorithm for function indentation is based on any portion of the function's name, not the whole thing, so test is indented with 1 space and others are indented with 2. [...]

Unfortunately I'm not seeing the same behaviour you are with filter-test vs. filter-suite, perhaps you have some config options set controlling the indentation for those? If you do and can share it, I'll happily take a further look into this issue.

Additionally, letfn functions incorrectly have hanging indentation, not flow indentation [...]

I am aware of this current deficiency in the indentation algorithm. It is possible to solve, but it will be tricky as letfn falls into what I call complex indentation, which includes others such as extend-protocol, extend-type, and method bodies of defrecord, deftype and defprotocol. I call it "complex indentation" as they alter the indentation rules of the lists within them.

I will fix it, but it will likely be after this PR is merged as their use isn't particularly common and hopefully the other improvements I've made will make up for it being a temporary issue. As a partial workaround, I'd recommend using let and fn instead of letfn where possible, as it's often no more code and is (at least in my opinion) more readable too. Alternatively you can run cljfmt before committing or hook it into Vim with a pre-write autocommand.

Asides

Why is indenting letfn so difficult? (Click this to find out.)

The challenge with the complex indentation rules is in making it both efficient within Vim script, and generic/customisable.

Emacs' clojure-mode avoids the first (by Emacs Lisp being much, much faster than Vim script 😭) and solves the latter nicely with a way of defining complex indentation rules (which cljfmt largely replicated). I may end up doing something similar, but do wonder if some flexibility could be sacrificed in favour of easier customisation and better performance...

(letfn is actually even worse than the other complex indentation rules as its rules are unique! In the old clojure.vim indentation code we had hardcoded rules just to handle it!)

Click this to read a complete tangent on my opinions about letfn overuse (that no one asked for). 🙂

Personally I feel that letfn is overused in Clojure code, as the only reason I can see to use it over plain let + fn is when the letfn'd functions need to call each other. If it were used only in that case, it could be useful as a signal that that is happening, much like how use of when over if signifies that there is no "else" branch.

(The equivalent of letfn in Scheme is letrec — i.e. let recursive — which I believe may have inspired letfn and kind of backs up my thoughts on its overuse.)

Vim script is infuriating sometimes! Turns out that the indentation tests locally passed with the previous code, but failed in CI and real life.

When the "uniform" indent style is selected, we no longer apply other list indentation rules/analysis. Unfortunately the reader conditional changes broke indentation for the following: (ns my-ns (:require [clojure.string :as str] [clojure.spec.alpha :as s])) Indenting it as this: (ns my-ns (:require [clojure.string :as str] [clojure.spec.alpha :as s])) Which is incorrect.

Makes indentation more robust against users toggling the `'magic'` option.

axvr added 15 commits April 21, 2023 01:29

A bit of code clean up and minor optimisations

5f083ff

Add fallback for when Vim was not compiled with searchpairpos

2627e1c

Fix string detection when entering a newline while in insert mode

4a3cab7

Minor performance improvement when file contains multi-line strings

d1ec02c

When clojure_align_multiline_strings is -1, no indentation

eb0463e

By setting the `clojure_align_multiline_strings` option to `-1` it will switch to traditional Lisp multi-line string indentation, of no indent.

Update indentation options sections of the README and vim help file

d69008f

Improve some of the comments in the indentation code

d73e408

Move string indent picker to a separate function

d177b3b

EDN files are unlikely to contain many lists, so check them last

e573da7

Yet more string detection fixes and slight code clean up

242dc9d

axvr force-pushed the indent-forms branch from 8279a89 to 242dc9d Compare April 28, 2023 14:21

axvr added 2 commits April 28, 2023 16:30

Initial work on list indentation

342f35f

Still lots to do here, this is just the initial work.

More indentation code clean up and readability improvements

51f5dcb

axvr added 4 commits May 1, 2023 00:58

Start of a small Clojure reader for indentation w/o syntax highlighting

cc9cda7

Not complete yet, but getting there.

Reader indent algorithm perf has surpassed the prev syntax highlight one

bdbc281

The performance of the new reader-like indentation algorithm has now surpassed the performance of my previous syntax highlighting approach with better accuracy and no hacky code required.

Small refactor to improve code clarity in indentation code

8ee73c5

axvr force-pushed the indent-forms branch from 437e727 to 8ee73c5 Compare May 1, 2023 13:26

NoahTheDuke mentioned this pull request May 2, 2023

Port performance improvements to Neovim/Lua #32

Open

Fix indentation bug during comment detection

a4beb52

axvr mentioned this pull request May 30, 2023

Compatibility with tree-sitter #33

Open

Fix accidental detection of backslashes as tokens

3d8197e

Previously backslashes were accidentally detected as tokens by the indentation tokeniser. This meant that character literals, would break indentation of everything after them.

axvr mentioned this pull request Jun 22, 2023

[Bug] incorrect indentation for curly braces with = #34

Open

axvr added 3 commits March 10, 2024 17:57

Fix escape character detection logic

cefb7bc

Simplify character column resolution

16026f7

Add some indent rules for more built in macros

d87f254

axvr added 2 commits September 22, 2024 23:33

Don't use fn arg alignment indent when a keyword is in function position

6a3d2d3

Fixes indentation of reader conditional macros. May choose to remove this if function argument indentation is generally preferred by programmers when a keyword is in function position.

Merge 'origin/master' into indent-forms

083f554

Add extra indentation test cases for records and protocols

f271dca

axvr force-pushed the indent-forms branch from 856ff1a to f271dca Compare September 22, 2024 23:14

axvr added 5 commits October 5, 2024 00:12

Begin refinements to configuration options

b86a4c0

Fix config option lookup

01edfec

Vim script is infuriating sometimes! Turns out that the indentation tests locally passed with the previous code, but failed in CI and real life.

Discovered a way to put comments within dict defs in Vim script

9b5e42c

Update README to better cover the new indentation options

3487e07

axvr force-pushed the indent-forms branch from c13ada8 to 688d5a4 Compare October 5, 2024 14:43

Replace the old multi-line string config option

3c5258c

axvr force-pushed the indent-forms branch from 688d5a4 to 3c5258c Compare October 5, 2024 14:44

axvr added 11 commits October 5, 2024 16:02

Remove the link to the #vim IRC as the link is now dead

3b4dad8

Fix typos and re-add the insert-mode completion info to the README

117b082

Update the "About" section within the Vim help doc

b26db95

Update indentation section of the Vim help doc

c4a25c5

Improve indentation style examples in Vim help doc

1742549

Minor documentation and comment refinements

9800368

In README, mention the replacements for each old indent config option

69a0b6f

Add indentation tests covering keywords in function position

058b4a7

Update indent regular expressions to set the required 'magic' mode

0a835d8

Makes indentation more robust against users toggling the `'magic'` option.

Compress comments on indent code

0d865c5

Compress more comments on indent code

052a498

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rewrite indentation code #31

Rewrite indentation code #31

axvr commented Apr 27, 2023 •

edited

Loading

axvr commented Apr 30, 2023 •

edited

Loading

axvr commented May 1, 2023 •

edited

Loading

SavageMessiah commented Jul 14, 2024

axvr commented Sep 22, 2024

NoahTheDuke commented Sep 26, 2024 •

edited

Loading

axvr commented Oct 4, 2024 •

edited

Loading

Rewrite indentation code #31

Are you sure you want to change the base?

Rewrite indentation code #31

Conversation

axvr commented Apr 27, 2023 • edited Loading

To do

axvr commented Apr 30, 2023 • edited Loading

axvr commented May 1, 2023 • edited Loading

SavageMessiah commented Jul 14, 2024

axvr commented Sep 22, 2024

NoahTheDuke commented Sep 26, 2024 • edited Loading

axvr commented Oct 4, 2024 • edited Loading

Asides

axvr commented Apr 27, 2023 •

edited

Loading

axvr commented Apr 30, 2023 •

edited

Loading

axvr commented May 1, 2023 •

edited

Loading

NoahTheDuke commented Sep 26, 2024 •

edited

Loading

axvr commented Oct 4, 2024 •

edited

Loading