Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tr: A [:lower:]/[:upper:] in set2 must be matched in set1 #6445

Merged
merged 1 commit into from
Jun 10, 2024

Conversation

cvonelm
Copy link
Contributor

@cvonelm cvonelm commented Jun 2, 2024

If there is a [:lower:] or [:upper:] in set2, then there must be a [:lower:] or [:upper:] at the same logical position in set1

So

$tr '[:upper:]' '[:lower:]' # works
$tr '1[:upper:]' '[:lower:]' # doesnt

To accomplish this change I have done the following things:

  • flatten error checks (take early exits from solve_set_characters instead of nesting ifs)

  • simplify CharStar ([c*]) handling code by simply replacing CharStar with a CharRepeat of appropriate length

  • switch flatten() from Sequence -> u8 to Sequence -> Char

The main idea here is that to correctly implement the tr behaviour, one has to flatten everything but [:upper:]/[:lower:] into the underlying string of chars and then check if the character class in the one set is at the same position in the other set:

This apparent behavior is supported by these two GNU tr invocations:

$tr [:blank:][:lower:] 12[:upper:] # works
$abcs="abcdefghijklmnopqrstuvwxyz"
$tr [:lower:][:upper:]  "${abcs}[:upper:]" # misaligned error

If there is a [:lower:] or [:upper:] in set2, then there must be a [:lower:] or [:upper:] at the
same logical position in set1

So

tr -t [:upper:] [:lower:] works
tr -t 1[:upper:] [:lower:] doesnt
Copy link

github-actions bot commented Jun 2, 2024

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

@@ -91,18 +95,22 @@ pub enum Sequence {
}

impl Sequence {
pub fn flatten(&self) -> Box<dyn Iterator<Item = u8>> {
pub fn flatten_non_lower_upper(&self) -> Box<dyn Iterator<Item = Self>> {
Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seem that you are duplicating a bunch of code with flatten?
maybe it could be deduplicated a bit, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logicwise, this code does not duplicate things. flatten_all calls flatten_non_upper_lower for everything that is not [:lower:]/[:upper:].

@cakebaker cakebaker merged commit 38344ed into uutils:main Jun 10, 2024
68 checks passed
@cakebaker
Copy link
Contributor

Thanks for your PR :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants