Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mleap Runtime serialization using wrong file name for DecisionTree models #843

Merged
merged 3 commits into from
Apr 7, 2023

Conversation

AllanaEvans
Copy link
Contributor

This PR introduced a bug in the Mleap Runtime serialization for DecisionTree-type models (DecisionTreeClassifier and DecisionTreeRegression). That PR intended to update how the trees in the DecisionTree are serialized, including switching the file name from "nodes" to "tree". However, two instances were missed, one in the runtime DecisionTreeClassifierOp and one in the runtime DecisionTreeRegressionOp. This bug means that any DecisionTreeClassifier or DecisionTreeRegression models that are serialized using Mleap Runtime will use the "nodes" file name and thus are unable to be deserialized because both Mleap Runtime and Mleap Spark expect the "tree" file name during deserialization (because they were correctly updated in that PR).

In this PR, I am fixing the two missed instances from the original PR and adding a test case for each of these instances. I have verified that the test case failed before the change and succeeds after the change.

Demonstrating that the two new tests failed before fixing the file name:

$ sbt "project mleap-runtime” test
…
[info] DecisionTreeRegressionSpec:
…
[info] save/load model
[info] - correctly reproduces the model when saved and loaded *** FAILED ***
[info]   java.nio.file.NoSuchFileException: /root/tree.json
[info]   at jdk.zipfs/jdk.nio.zipfs.ZipFileSystem.newInputStream(ZipFileSystem.java:572)
[info]   at jdk.zipfs/jdk.nio.zipfs.ZipPath.newInputStream(ZipPath.java:708)
[info]   at jdk.zipfs/jdk.nio.zipfs.ZipFileSystemProvider.newInputStream(ZipFileSystemProvider.java:273)
[info]   at java.base/java.nio.file.Files.newInputStream(Files.java:158)
[info]   at ml.combust.bundle.tree.decision.TreeSerializer.$anonfun$read$1(TreeSerializer.scala:107)
[info]   at resource.DefaultManagedResource.open(AbstractManagedResource.scala:110)
[info]   at resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:87)
[info]   at resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26)
[info]   at resource.ManagedResourceOperations.apply$(ManagedResourceOperations.scala:26)
[info]   at resource.AbstractManagedResource.apply(AbstractManagedResource.scala:50)
…
[info] DecisionTreeClassifierSpec:
…
[info] save/load model
[info] - correctly reproduces the model when saved and loaded *** FAILED ***
[info]   java.nio.file.NoSuchFileException: /root/tree.json
[info]   at jdk.zipfs/jdk.nio.zipfs.ZipFileSystem.newInputStream(ZipFileSystem.java:572)
[info]   at jdk.zipfs/jdk.nio.zipfs.ZipPath.newInputStream(ZipPath.java:708)
[info]   at jdk.zipfs/jdk.nio.zipfs.ZipFileSystemProvider.newInputStream(ZipFileSystemProvider.java:273)
[info]   at java.base/java.nio.file.Files.newInputStream(Files.java:158)
[info]   at ml.combust.bundle.tree.decision.TreeSerializer.$anonfun$read$1(TreeSerializer.scala:107)
[info]   at resource.DefaultManagedResource.open(AbstractManagedResource.scala:110)
[info]   at resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:87)
[info]   at resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26)
[info]   at resource.ManagedResourceOperations.apply$(ManagedResourceOperations.scala:26)
[info]   at resource.AbstractManagedResource.apply(AbstractManagedResource.scala:50)
…
[info] Run completed in 3 seconds, 652 milliseconds.
[info] Total number of tests run: 274
[info] Suites: completed 72, aborted 0
[info] Tests: succeeded 272, failed 2, canceled 0, ignored 0, pending 0
[info] *** 2 TESTS FAILED ***
[error] Failed tests:
[error] 	ml.combust.mleap.runtime.transformer.regression.DecisionTreeRegressionSpec
[error] 	ml.combust.mleap.runtime.transformer.classification.DecisionTreeClassifierSpec
[error] (Test / test) sbt.TestsFailedException: Tests unsuccessful

Demonstrating that the two new tests succeed after fixing the file name:

$ sbt "project mleap-runtime” test
…
[info] DecisionTreeRegressionSpec:
…
[info] save/load model
[info] - correctly reproduces the model when saved and loaded
…
[info] DecisionTreeClassifierSpec:
…
[info] save/load model
[info] - correctly reproduces the model when saved and loaded
…
[info] Run completed in 4 seconds, 104 milliseconds.
[info] Total number of tests run: 274
[info] Suites: completed 72, aborted 0
[info] Tests: succeeded 274, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.

@AllanaEvans
Copy link
Contributor Author

@jsleight @WeichenXu123 Hi! Could I get a review on this PR? This bug is impacting my team. Thanks!

Copy link
Contributor

@jsleight jsleight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Apologies for the delay.

@jsleight jsleight merged commit f750480 into combust:master Apr 7, 2023
@AllanaEvans
Copy link
Contributor Author

@jsleight Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants