Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

did-you-mean for clap-rs #103

Merged
merged 14 commits into from
May 5, 2015
Merged

did-you-mean for clap-rs #103

merged 14 commits into from
May 5, 2015

Conversation

Byron
Copy link
Contributor

@Byron Byron commented May 5, 2015

Now that did you mean was implemented for the parts under control of google-apis-rs, for completeness it should also be available for clap-rs. As done in google-apis-rs, suggestions should only be made above a certain confidence level, e.g. foo will not result in did you mean 'baz'.

This improvement involves the following parts:

  • sub-command names
    • prog foo results in foo is not a valid subcommand. Did you mean 'fop'?.
  • long flags and options
    • prog --heln results in --heln is not a valid option. Did you mean '--help'?
    • prog --debug-leve=1 results in --debug-leve is not a valid option. Did you mean '--debug-level' ?
    • Why not for short flags ? I think the input string should be more than one character to get a useful result.
    • NOTE: We do not provide the given arguments (i.e. --debug-level=1) in our suggestion as this is also currently not the case with the standard messages in that area. It might be worth a separate issue.
  • possible values
    • prog --mode=foz results in 'foz' is not a valid mode. Please choose one of 'foo' and 'baz'. Did you mean '--mode=foo' ?
    • It only works for one item at a time.. Thus prog --modi=foz results in --modi is not a valid option. Did you mean '--mode=foz'. Thus it is not required to detect that the suggested flag has possible values, which allows this implementation to remain simple and 'stupid' in its first iteration.
  • feature-gate these changes as suggestions
    • should be enabled by default.

If the required confidence level is not reached, the output of the usage string will remain exactly as it was before.

* assure `make test` works on OSX as well
* simplified entire makefile, by basically removing sed invocations to
  manipulate the Cargo.toml file under source control.
* *works for me* predicate

This should probably be tested on another system as well, just to be
sure it makes sense for everyone.
If an argument is not understood as subcommand, but has a
high-confidence match in the list of all known subcommands, we will use
this one to print a customized error message.

Previously, it would say that a positional argument wasn't understood,
now it will say that a subcommand was unknown, and if the user meant
`high-confidence-candidate`.

If the argument doesn't sufficiently match any subcommand, the default
handling will take over and try to treat it as positional argument.

* added dependency to `strsym` crate
* new `did_you_mean` function uses `strsim::jaro_winkler(...)` to look
  for good candidates.

Related to #103
@kbknapp
Copy link
Member

kbknapp commented May 5, 2015

Excellent! I'm also good with all the proposed ideas. The only one that I'm not sure about is short names...as I guess we could just check for closest match alphabetically? I'm unsure about that though, because what would make more sense is to be closest keyboard locality-wise...which would be difficult to check for. So I'm actually good just leaving out short suggestions.

Yeah, I'd like to put this behind a cargo feature gate, perhaps call it "suggestions" and I'm also good if it's in the default. I just want the end developer to have the option to do without the extra dependency if they so choose. This might also require putting a quick note in the README.md about what the default features are, and how to exclude them.

Long arguments now have a special case when saying they are unknown, as
we will match it against all known long flags and suggest the best match
to be used instead.

TODO: refactor to just write a suffix with did-you-mean information.

Related to #103
@Byron
Copy link
Contributor Author

Byron commented May 5, 2015

Great, good to hear ! Keyboard-locality based matching for short flags would be perfect, but would certainly be tough to implement it correctly, considering there are a terzillion different layouts out there which are somewhat hard to differentiate programmatically. Maybe one day there is a library for that and the topic can be tackled (could be worth an issue just to keep track of this, if desired).

If it's about the extra dependency, I'd actually think it's easier to take the 20 lines or so of MIT licensed code into the project, and setup the credit accordingly.
The disadvantage of that is that one obviously can't deactivate it anymore. Usually I'd argue that everyone wants to have such a feature, after all we are now used to it from git, and also rustc. Only the latter had some critical voices as it can make low-confidence suggestions which won't make any sense. The implementation I use doesn't do that though, as it only uses high-confidence values.

What do you think about bringing in the code (the license is compatible) ?

@kbknapp
Copy link
Member

kbknapp commented May 5, 2015

I think in this instance the code is small enough that we could simply just copy/paste, but I'd prefer to use the dependency and a cargo feature as I think this gives better credit, allows one to disable the feature if desired, and also allows for upstream improvements. We could end up copying the code at a later point if it ends up diverging from what we need, also.

@kbknapp kbknapp mentioned this pull request May 5, 2015
Byron added 2 commits May 5, 2015 15:24
That way, we use the prefix previously used by clap, but add our
particular 'did-you-mean' phrase as a suffix.
Removing explicit typing makes the code more readable, which makes it
more maintainable, which probably helps everyone :).
@kbknapp
Copy link
Member

kbknapp commented May 5, 2015

Not necessarily related to 9cb4d13, but it reminds me that de-duplication is on my TODO list for the entire library...there is a ton of duplication/coupling/general cleanup/etc that I need to get complete. I just want to get to a default feature set / implementation first so I'm not just constantly re-doing this ;)

Byron added 8 commits May 5, 2015 16:31
To facilitate running different branches of code that looks
rather similar.
This is done in preparation for adding suggestions.

**NOTE**: We will now always use the ...
`"\"{}\" isn't a valid value for '{}'{}"`
... format, even though in one occasion, only a ...
`"\"{}\" isn't a valid value for {}{}"`
... format was used.

Hope that's alright and not breaking the world. At least, it doesn't
break any of the existing tests.
There now is a single method which deals with formatting the
'did-you-mean' message, supporting different styles to cater all the
various needs that have arisen thus far, with enough potential to be
easily extended in future through a little helper-enumeration whose
variants can possibly take values.

*NOTE*: We might still want to have a look at where the did-you-mean
message should be located for best effect.

Related to #103
More about this can be found [here](http://goo.gl/vt07f7).
It's somewhat unnecessary, but I thought it might be good in preparation
for the next commit.
Run the python based --release mode tests as well. They are mainly
concerned with user facing messages, which are as important as the
internal tests run `cargo test`.
You can now disable the did-you-mean feature entirely, which also
removes the additional dependency it comes with.
Learn more about features and how to use them here:
http://doc.crates.io/manifest.html#the-[features]-section

Related to #103
That way, it will run on travis as well, which comes with python 2
by default, and just doesn't have python 3 pre-installed.
It would probably take to much time to install, especially considering
it's super easy to have to script work in both worlds.
These comments show up on travis, which is not desired

[skip ci]
@Byron Byron changed the title WIP: did-you-mean for clap-rs [DO NOT MERGE] FOR REVIEW: did-you-mean for clap-rs May 5, 2015
@Byron
Copy link
Contributor Author

Byron commented May 5, 2015

@kbknapp Please have a look at the PR and let me know what you think. I will gladly make fixes based on your suggestions.
Something I am not sure about myself is the exact location of the did-you-mean message. Could be on a new line, right now it's on the same one.

@@ -35,6 +35,22 @@
help Prints this message
subcmd tests subcommands'''

_sc_dym_usage = '''Subcommand "subcm" is unknown. Did you mean "subcmd" ?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"is unknown" -> "isn't valid" to stick with previous errors

@kbknapp
Copy link
Member

kbknapp commented May 5, 2015

This looks excellent! Thanks a ton! I'd just change that one error message for subcommands to "ins't valid" since that's what we use elsewhere. And yeah, after looking at it both ways, if you could change the suggestion to a newline + tab, it'll be ready for a merge!

I also really like the cleanup you did, that's awesome! 👍

* unknown subcommand message altered to use similar language as is used
  everywhere around clap. Namely, we say 'invalid' instead of 'unknown'
* 'did-you-mean' message separator changed from '. ' to '\n\t'

Related to #103
@Byron
Copy link
Contributor Author

Byron commented May 5, 2015

@kbknapp Thanks for the review, changes are coming in and are being tested. If everything is green, they should be ready :).

@kbknapp
Copy link
Member

kbknapp commented May 5, 2015

Everything looks good, great work!

kbknapp added a commit that referenced this pull request May 5, 2015
@kbknapp kbknapp merged commit 6fabc23 into clap-rs:master May 5, 2015
@@ -213,6 +229,8 @@
'mult_valsmo x2-1: ': ['{} --multvalsmo some other --multvalsmo some'.format(_bin), _mult_vals_2m1],
'mult_valsmo x1: ': ['{} --multvalsmo some other'.format(_bin), _exact],
'F2(ss),O(s),P: ': ['{} value -f -f -o some'.format(_bin), _f2op],
'arg dym: ': ['{} --optio=foo'.format(_bin), _arg_dym_usage],
'pv dym: ': ['{} --Option slo'.format(_bin), _pv_dym_usage],
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, here I see a white-space error. My editor used spaces for a while, which is standard for python, whereas the file used tabs. This went unnoticed as it's apparently not an error to do that while defining a dict.
This might want to be adjusted at some point, maybe you can give it a quick-fix-commit.

@kbknapp
Copy link
Member

kbknapp commented May 5, 2015

No worries, I fix it on my next commit here shortly ;)

kbknapp pushed a commit that referenced this pull request May 5, 2015
If an argument is not understood as subcommand, but has a
high-confidence match in the list of all known subcommands, we will use
this one to print a customized error message.

Previously, it would say that a positional argument wasn't understood,
now it will say that a subcommand was unknown, and if the user meant
`high-confidence-candidate`.

If the argument doesn't sufficiently match any subcommand, the default
handling will take over and try to treat it as positional argument.

* added dependency to `strsym` crate
* new `did_you_mean` function uses `strsim::jaro_winkler(...)` to look
  for good candidates.

Related to #103
kbknapp pushed a commit that referenced this pull request May 5, 2015
Long arguments now have a special case when saying they are unknown, as
we will match it against all known long flags and suggest the best match
to be used instead.

TODO: refactor to just write a suffix with did-you-mean information.

Related to #103
kbknapp pushed a commit that referenced this pull request May 5, 2015
There now is a single method which deals with formatting the
'did-you-mean' message, supporting different styles to cater all the
various needs that have arisen thus far, with enough potential to be
easily extended in future through a little helper-enumeration whose
variants can possibly take values.

*NOTE*: We might still want to have a look at where the did-you-mean
message should be located for best effect.

Related to #103
kbknapp pushed a commit that referenced this pull request May 5, 2015
You can now disable the did-you-mean feature entirely, which also
removes the additional dependency it comes with.
Learn more about features and how to use them here:
http://doc.crates.io/manifest.html#the-[features]-section

Related to #103
kbknapp pushed a commit that referenced this pull request May 5, 2015
* unknown subcommand message altered to use similar language as is used
  everywhere around clap. Namely, we say 'invalid' instead of 'unknown'
* 'did-you-mean' message separator changed from '. ' to '\n\t'

Related to #103
@Byron Byron changed the title FOR REVIEW: did-you-mean for clap-rs did-you-mean for clap-rs May 6, 2015
@Byron
Copy link
Contributor Author

Byron commented May 6, 2015

@Byron Byron deleted the did-you-mean branch May 6, 2015 07:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants