Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] update truncation default & adding field output when input is truncated #79942

Merged

Conversation

benwtrent
Copy link
Member

This commit makes the two following changes (along with some refactoring)

  • Nlp results will now indicate if the input was truncated or not
  • The default truncation is now none instead of first

@elasticmachine elasticmachine added the Team:ML Meta label for the ML team label Oct 27, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

@benwtrent
Copy link
Member Author

@elasticmachine update branch

Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

PyTorchPassThroughResults that = (PyTorchPassThroughResults) o;
return Arrays.deepEquals(inference, that.inference) && Objects.equals(resultsField, that.resultsField);
}

@Override
public int hashCode() {
return Objects.hash(Arrays.deepHashCode(inference), resultsField);
int result = Objects.hash(super.hashCode(), resultsField);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int result = Objects.hash(super.hashCode(), resultsField);
return Objects.hash(super.hashCode(), Arrays.deepHashCode(inference), resultsField);

}

@Override
public int hashCode() {
return Objects.hash(Arrays.hashCode(inference), resultsField);
int result = Objects.hash(super.hashCode(), resultsField);
Copy link
Member

@davidkyle davidkyle Oct 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment; why not return Objects.hash(super.hashCode(), resultsField, Arrays.hashCode(inference))?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these were auto generated by Intellij :D. So I can update them.

if (numTokens > maxSequenceLength) {
isTruncated = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be tempted to put this is the switch case for FIRST & SECOND because NONE does not truncate it throws instead. Same result just a matter of preference

@benwtrent benwtrent added auto-backport-and-merge Automatically create backport pull requests and merge when ready auto-merge Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) labels Oct 28, 2021
@elasticsearchmachine elasticsearchmachine merged commit 375fc77 into elastic:master Oct 28, 2021
@benwtrent benwtrent deleted the feature/ml-update-nlp-truncation branch October 28, 2021 14:41
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Oct 28, 2021
…uncated (elastic#79942)

This commit makes the two following changes (along with some
refactoring)  - Nlp results will now indicate if the input was truncated
or not  - The default truncation is now `none` instead of `first`
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.0

weizijun added a commit to weizijun/elasticsearch that referenced this pull request Oct 28, 2021
…formance

* upstream/master: (153 commits)
  [ML] update truncation default & adding field output when input is truncated (elastic#79942)
  [ML] stop using isAllowedByLicense for model license checks (elastic#79908)
  [ML] Retain built-in ML roles granting Kibana privileges (elastic#80014)
  [Transform] remove old mixed cluster BWC layers, not required for 8x (elastic#79927)
  Increase test timeout for CoordinatorTests testAllSearchesExecuted
  [Transform] add rolling upgrade tests for upgrade endpoint (elastic#79721)
  [ML] Update trained model docs for truncate parameter for bert tokenization (elastic#79652)
  `CoordinatorTests` sometimes needs three term bumps (elastic#79574)
  [ML] Account for service being triggered twice in tests (elastic#80000)
  SearchContext: remove unused variable (elastic#79917)
  Revert "Deprecate resolution loss on date field (elastic#78921)" (elastic#79914)
  Re-enable GeoIpDownloaderIT#testStartWithNoDatabases() (elastic#79907)
  Fix SnapshotBasedIndexRecoveryIT#testSeqNoBasedRecoveryIsUsedAfterPrimaryFailOver (elastic#79469)
  Fix RecoverySourceHandlerTests (elastic#79546)
  SQL: stabilize SqlSearchPageTimeoutIT (elastic#79928)
  Wait 3 seconds for the server to reload trust (elastic#79778)
  Skip automatically preserved request headers when rewriting (elastic#79973)
  Check whether stdout is a real console (elastic#79882)
  Convert remote license checker to use LicensedFeature (elastic#79876)
  Miscellaneous fixes for LDAP SDK v6 upgrade (elastic#79891)
  ...

# Conflicts:
#	libs/x-content/src/main/java/org/elasticsearch/xcontent/support/filtering/FilterPath.java
#	libs/x-content/src/test/java/org/elasticsearch/xcontent/support/filtering/FilterPathGeneratorFilteringTests.java
#	libs/x-content/src/test/java/org/elasticsearch/xcontent/support/filtering/FilterPathTests.java
elasticsearchmachine pushed a commit that referenced this pull request Oct 28, 2021
…uncated (#79942) (#80022)

This commit makes the two following changes (along with some
refactoring)  - Nlp results will now indicate if the input was truncated
or not  - The default truncation is now `none` instead of `first`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport-and-merge Automatically create backport pull requests and merge when ready auto-merge Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) :ml Machine learning >non-issue Team:ML Meta label for the ML team v8.0.0-beta1 v8.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants