filter programs #914

Merged
merged 6 commits into from
Jun 30, 2023


@Byron Byron commented Jun 27, 2023

Follow up of #913.

Driving of filter programs with two different protocols. State handling is interesting here as well when supporting long-running processes.
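
For context on the second of the two protocols: git's long-running filter process protocol frames its messages as pkt-lines, where each packet starts with four hex digits giving the total length (payload plus the four length bytes) and `0000` is a flush packet. A minimal sketch of that framing and of the client's opening handshake could look like the following. The function names `encode_pkt_line` and `client_handshake` are hypothetical and are not taken from this PR or from any gix API.

```rust
/// Encode one pkt-line: 4 lowercase hex digits for the total length
/// (payload plus the 4 length bytes), followed by the payload itself.
/// Returns `None` if the payload exceeds the 65516-byte pkt-line limit.
fn encode_pkt_line(payload: &[u8]) -> Option<Vec<u8>> {
    const MAX_PAYLOAD: usize = 65516;
    if payload.len() > MAX_PAYLOAD {
        return None;
    }
    let mut out = format!("{:04x}", payload.len() + 4).into_bytes();
    out.extend_from_slice(payload);
    Some(out)
}

/// A flush packet marks the end of a list of packets.
const FLUSH_PKT: &[u8] = b"0000";

/// The bytes a client sends to open a version 2 filter-process session
/// (hypothetical helper, for illustration only).
fn client_handshake() -> Vec<u8> {
    let mut out = Vec::new();
    for line in ["git-filter-client\n", "version=2\n"] {
        out.extend(encode_pkt_line(line.as_bytes()).expect("handshake lines are short"));
    }
    out.extend_from_slice(FLUSH_PKT);
    out
}
```

The one-off clean and smudge filters need none of this framing, as they only see raw content on stdin and stdout.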

Tasks

Generally, it's interesting whether we can handle big files here: streaming files to the filter when huge files are replaced by small ones, and the other way around when small files are replaced by big ones. It's a two-way streaming of input and output that should be reflected in the interface.

To support delays, it must be possible to run all standard filters except the filter process.

  • one-off clean and smudge filters
  • long-running processes
  • allow partial configuration and partial capabilities (without making this an error)
  • control whether or not to wait for ongoing processes, and initiate graceful shutdowns.
  • ability to restart long-running processes if they crash
  • delay support for long-running processes
  • a test to validate the ident filter against actual git
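
As a rough illustration of the first task and of the streaming concern above, a one-off filter can be driven by spawning the command, feeding its stdin from a separate thread and reading its stdout at the same time, so that neither side blocks the other. This is only a sketch using the standard library: `run_one_off_filter` is a hypothetical name, running the command through `sh -c` is an assumption about the environment, and the output is collected into a buffer for brevity where a real interface would probably hand back a reader.

```rust
use std::io::{self, Read};
use std::process::{Command, Stdio};
use std::thread;

/// Run `filter_cmd` through `sh -c`, streaming `input` to its stdin
/// and returning everything the filter writes to stdout.
fn run_one_off_filter(
    filter_cmd: &str,
    mut input: impl Read + Send + 'static,
) -> io::Result<Vec<u8>> {
    let mut child = Command::new("sh")
        .arg("-c")
        .arg(filter_cmd)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;

    let mut stdin = child.stdin.take().expect("stdin was piped");
    let mut stdout = child.stdout.take().expect("stdout was piped");

    // Feed the filter on its own thread: a filter that emits output
    // before it has consumed all of its input must not deadlock us.
    let writer = thread::spawn(move || -> io::Result<()> {
        io::copy(&mut input, &mut stdin)?;
        Ok(()) // `stdin` is dropped here, which signals EOF to the filter
    });

    let mut output = Vec::new();
    stdout.read_to_end(&mut output)?;

    let status = child.wait()?;
    writer.join().expect("writer thread panicked")?;
    if !status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("filter exited with {status}"),
        ));
    }
    Ok(output)
}
```

On a Unix-like system, calling it with the command `tr a-z A-Z` and a reader over `hello` would be expected to yield the upper-cased bytes. A filter that exits before reading all of its input surfaces as a write error in this sketch.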

Next PR

  • pipeline support to handle attribute logic with streaming support
  • integration with gix

Research

  • general directions are for checking out from git and for checking in to git
  • it's vital to see if filtering would apply before having to apply it. This could be native to the relevant functions.
  • optional round-trip checking
  • if the gitattribute is text=auto, git auto-detects whether each matching file is text.
  • to-git conversions are convert_to_git() namely: clean-filter | encode_to_git | crlf | ident
  • to-worktree conversions are convert_to_working_tree…(), namely: ident | crlf | encode_to_worktree | smudge-filter
  • normalization is special enough to be mentioned in the pipeline, but I don't see why it affects anything.
  • there are streaming versions only of apply-filter effectively, with encoding, crlf and ident changes operating on memory buffers
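
The two conversion orders noted above mirror each other, which can be sketched as one list of stages that is walked forwards for to-git and backwards for to-worktree. Everything here is hypothetical and for illustration only: the `Stage` type, the stage names, and the deliberately naive CRLF functions, which skip all of git's text detection and safety checks. The driver, encoding and ident stages are stand-in identity functions, and all stages operate on in-memory buffers.

```rust
/// One conversion step with a function for each direction.
struct Stage {
    name: &'static str,
    to_git: fn(Vec<u8>) -> Vec<u8>,
    to_worktree: fn(Vec<u8>) -> Vec<u8>,
}

/// Stand-in for stages that are not modelled in this sketch.
fn identity(buf: Vec<u8>) -> Vec<u8> {
    buf
}

/// Naive CRLF -> LF conversion.
fn crlf_to_git(buf: Vec<u8>) -> Vec<u8> {
    let mut out = Vec::with_capacity(buf.len());
    let mut bytes = buf.into_iter().peekable();
    while let Some(b) = bytes.next() {
        if b == b'\r' && bytes.peek() == Some(&b'\n') {
            continue; // drop the CR; the LF is pushed on the next iteration
        }
        out.push(b);
    }
    out
}

/// Naive LF -> CRLF conversion that leaves existing CRLF pairs alone.
fn crlf_to_worktree(buf: Vec<u8>) -> Vec<u8> {
    let mut out = Vec::with_capacity(buf.len());
    let mut prev = 0u8;
    for b in buf {
        if b == b'\n' && prev != b'\r' {
            out.push(b'\r');
        }
        out.push(b);
        prev = b;
    }
    out
}

/// Stages in to-git order: clean-filter | encode_to_git | crlf | ident.
fn default_stages() -> [Stage; 4] {
    [
        Stage { name: "driver", to_git: identity, to_worktree: identity },
        Stage { name: "encoding", to_git: identity, to_worktree: identity },
        Stage { name: "crlf", to_git: crlf_to_git, to_worktree: crlf_to_worktree },
        Stage { name: "ident", to_git: identity, to_worktree: identity },
    ]
}

fn convert_to_git(stages: &[Stage], buf: Vec<u8>) -> Vec<u8> {
    stages.iter().fold(buf, |b, s| (s.to_git)(b))
}

/// The worktree direction runs the same stages in reverse order.
fn convert_to_worktree(stages: &[Stage], buf: Vec<u8>) -> Vec<u8> {
    stages.iter().rev().fold(buf, |b, s| (s.to_worktree)(b))
}
```

Converting to the worktree and back to git and comparing the result with the original is one way to approach the optional round-trip check mentioned in the list.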

@Byron Byron force-pushed the filter-programs branch 9 times, most recently from 2376a01 to 0ae40b7 Compare June 28, 2023 19:50
Simple filters run in real-time and are piped their content to stdin
while we read it from stdout.
`git` simply ignores the filter if there is none configured for a given
operation or capability.
This isn't documented except in the code, but it is clear enough to implement
the same way for maximum compatibility.
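
The behaviour described in the commit message above, where a missing command for an operation means the content passes through untouched, can be sketched as follows. The `Driver` and `Operation` types and the `maybe_filter` helper are hypothetical names for illustration, not the types introduced by this PR, and the sketch does not model git's handling of drivers marked as required.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Operation {
    Clean,
    Smudge,
}

/// A partially configured driver: either command may be absent.
#[derive(Debug)]
struct Driver {
    clean: Option<String>,
    smudge: Option<String>,
}

impl Driver {
    /// The command configured for `op`, if any.
    fn command_for(&self, op: Operation) -> Option<&str> {
        match op {
            Operation::Clean => self.clean.as_deref(),
            Operation::Smudge => self.smudge.as_deref(),
        }
    }
}

/// Apply the filter if a command is configured for `op`; otherwise
/// return the input unchanged instead of treating it as an error.
/// `run` stands in for whatever actually executes the command.
fn maybe_filter(
    driver: &Driver,
    op: Operation,
    input: Vec<u8>,
    run: impl FnOnce(&str, Vec<u8>) -> Vec<u8>,
) -> Vec<u8> {
    match driver.command_for(op) {
        Some(cmd) => run(cmd, input),
        None => input,
    }
}
```

With only `clean` configured, a smudge request returns its input as-is, which matches the "ignore rather than fail" behaviour described above.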
@Byron Byron force-pushed the filter-programs branch 3 times, most recently from 84722f3 to 29b744b Compare June 30, 2023 15:43
@Byron Byron merged commit 97f8e96 into main Jun 30, 2023
16 checks passed
@Byron Byron deleted the filter-programs branch June 30, 2023 19:48
@Byron Byron mentioned this pull request Jul 1, 2023
8 tasks