
Add IoU3D as a custom c++ op (CPU) #890

Merged (36 commits, Oct 21, 2022)
Conversation

ianstenbit (Contributor)

No description provided.

@ianstenbit (Contributor, Author): /gcbrun

@ianstenbit (Contributor, Author): /gcbrun

@ianstenbit (Contributor, Author): /gcbrun

@ianstenbit (Contributor, Author): /gcbrun

@ianstenbit (Contributor, Author)

This is now working with CI across all of our platforms, and I've also manually tested this on macOS.

Some TODOs:

  • Move the custom op away from IoU3D and into some _internal object (I think it's worth getting this committed even before we have a real op, so that I can set up the necessary internal build configuration). Probably a ZeroOutTestLayer is appropriate; see the sketch after this list.
  • Update the README to show how to build from source.
  • Update the contributing guide to explain that building from source is required to run custom-op tests.
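
A hypothetical sketch of what such a ZeroOutTestLayer could look like, following TensorFlow's classic zero_out custom-op tutorial; the binary and op names here are illustrative assumptions, not this PR's actual code:

```python
import tensorflow as tf

# Assumed binary name for illustration; the real name would come from the
# internal build configuration described above.
zero_out_module = tf.load_op_library("_zero_out_op.so")

class ZeroOutTestLayer(tf.keras.layers.Layer):
    """Throwaway layer that exists only to exercise the custom-op plumbing."""

    def call(self, inputs):
        # The tutorial op keeps the first element and zeroes out the rest.
        return zero_out_module.zero_out(inputs)
```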

@ianstenbit changed the title from "[WIP] Example custom c++ op" to "Example custom c++ op" on Oct 7, 2022
@ianstenbit (Contributor, Author)

@tanzhenyu

@bhack (Contributor) commented Oct 7, 2022

Are we sure we want to go in the direction of limited hardware compatibility for the keras-* libraries, considering also the overhead of maintaining custom ops?

I assumed the scope of keras-* was to interact more closely with the compiler team and stress-test the technology, while still expressing our needs through TF/*HLO composability.

But I must also admit that recently, even on the "modern", (M)HLO-native JAX, something like custom ops is emerging again with google/jax#12632 (XLA's CustomCall).

/Cc @andrew-leaver @jpienaar @paynecl

@@ -0,0 +1,23 @@
# Copyright 2022 The KerasCV Authors. All Rights Reserved.
Review comment (Contributor):

How do Google TF3D's non-custom-op 3D losses perform?
https://github.com/google-research/google-research/blob/master/tf3d/losses/box_prediction_losses.py

@bhack (Contributor) commented Oct 7, 2022

From https://github.com/google/jax/pull/12632/files#diff-b8191ca86b51eea58dc118d756de46521d77321338163145aea6b1ef64d126c5R77

> JAX uses XLA to compile staged-out Python programs, which are represented with MHLO. MHLO offers common operations used in machine learning, such as the dot-product, exponential function, and so on. Unfortunately, sometimes these operations (and compositions of these operations) are not sufficiently expressive or performant and users look to external libraries (Triton, Numba) or hand-written code (CUDA, C/C++) to implement operations for their desired program.

So are we already at the same point with Keras-* where we were for many years with TF Addons: a proliferation of custom ops (often 1K lines of code per PR, for components that required custom ops with CPU- or CUDA-only support)? /cc @seanpmorgan

@ianstenbit changed the title from "Example custom c++ op" to "Example custom c++ op (CPU)" on Oct 7, 2022
@ianstenbit (Contributor, Author)

Hey @bhack -- thanks for the comments.

This is for an existing internal use case that requires this custom op (the TF3D losses are insufficient), as well as a few other custom ops for 3D spatial data.

I also worry about package accessibility, and I want to make sure we package this in a way that doesn't add friction to pip install and lets users who don't want the custom ops install without binaries. I'm not certain what the right solution is just yet.

Please let me know if you have any suggestions.

@bhack (Contributor) commented Oct 7, 2022

> I also worry about package accessibility, and I want to make sure we package this in a way that doesn't add friction to pip install and lets users who don't want the custom ops install without binaries. I'm not certain what the right solution is just yet.

I understand that there was no time to address this subject, and that now, on your side too, you just need to "deliver". But I have been trying to raise this topic since June 2021 and earlier.

Once you have introduced the C++ infrastructure into the project, what will we do when contributors start pushing their own custom kernels: for performance, for gaps in vectorized maps and the underlying API design of native TF ops, or for ops the XLA bridge doesn't cover (openxla/stablehlo#216 (comment))?

Are you going to maintain a proliferation of user kernels that will likely exclude some hardware vendors, e.g. AMD or Google Cloud TPU?
And what about exporting to TFLite, when sooner or later you do this for an inference component instead of a loss?

So this may solve something urgent for your internal delivery, but after a year and a half I still don't understand what the strategy is.

@ianstenbit (Contributor, Author)

@bhack I've updated the README and contributing guide to reflect a posture that we do not intend to invite user-contributed custom ops at this time, and that we only intend to include training-time CPU ops.

@ianstenbit (Contributor, Author): /gcbrun

@tanzhenyu (Contributor) left a comment:

This is great! Thanks for pushing it forward.

.github/CONTRIBUTING.md (resolved)
set -x

PLATFORM="$(uname -s | tr 'A-Z' 'a-z')"
function is_windows() {
  # Body completed per the TFA script this was adapted from.
  [[ "${PLATFORM}" =~ msys_nt*|mingw*|cygwin*|uwin* ]]
}
Review comment (Contributor):

Some high-level questions on this:

  1. Does this build script follow some other script? It would be nice to point out the reference so users can follow a pattern.
  2. Do we really want to support Windows and Mac builds? This adds maintenance burden without immediate benefit.

Reply (Contributor, Author):

  1. This is adapted from TFA. I'm adding a link to the original script and a brief blurb about what it does.

  2. We don't necessarily need to ship binaries for Windows and macOS, but including this makes it a lot easier for people to install from source if they want. I'm not sure it makes sense to offer pre-built wheels for these environments, as I agree that it adds maintenance cost for not a ton of benefit. But supporting them in the setup script doesn't cost much, IMO.

Review comment (Contributor):

I suppose these are ported from Addons:
https://github.com/tensorflow/addons/blob/master/build_deps/build_pip_pkg.sh

Review comment (Contributor):

The pain will be supporting those if TF drops its pre-built wheels for them.



# Writes variables to bazelrc file
def write(line):
Review comment (Contributor):

Same question as with build_pip_pkg.sh.

Reply (Contributor, Author):

Added a little doc blurb at the top -- is there anything else specifically you're looking for here?

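For reference, a minimal sketch of what a TFA-style configure helper like this typically looks like; the .bazelrc file name and the extra write_action_env helper are assumptions based on the Addons script, not necessarily this PR's exact code:

```python
_BAZELRC = ".bazelrc"  # assumed output file, as in the TFA configure script

# Writes variables to bazelrc file
def write(line):
    # Append one line to the generated bazelrc.
    with open(_BAZELRC, "a") as f:
        f.write(line + "\n")

def write_action_env(var_name, var):
    # Record an environment variable that Bazel should pass to build actions.
    write('build --action_env {}="{}"'.format(var_name, var))
```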

cc_library(
    name = "tf_header_lib",
    hdrs = [":tf_header_include"],
    includes = ["include"],  # remaining fields completed per the TFA template
    visibility = ["//visibility:public"],
)
Review comment (Contributor):

The header file inclusion and libtensorflow_framework inclusion seem correct to me.
Though it'd be better to point out where we're pulling this from. The main reason for this ask is that "tensorflow_framework" might not be a guaranteed binary in the long run, and we should know how to fix it once that happens.

Reply (Contributor, Author):

I'm not sure I follow -- do you mean comments specifying that this is enabling linking to the installed TF version, or something else?

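For context on linking against the installed TF version: TensorFlow itself exposes the header and library locations that a configure step can feed into these BUILD rules. A minimal sketch (not this PR's exact configure logic):

```python
import tensorflow as tf

# Locations of the installed TF headers and shared libraries, which targets
# like tf_header_lib and libtensorflow_framework are wired against.
print(tf.sysconfig.get_include())        # e.g. .../tensorflow/include
print(tf.sysconfig.get_lib())            # dir containing libtensorflow_framework
print(tf.sysconfig.get_compile_flags())  # compiler flags for building custom ops
print(tf.sysconfig.get_link_flags())     # linker flags for building custom ops
```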
cloudbuild/README.md (resolved)
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
Review comment (Contributor):

I see a _pairwise_iou_op.so being included:

  1. Should we just have a single custom_op.so that includes all the targets?
  2. If a user wants to change some op's implementation, maybe we should have documentation on how to do that?

Reply (Contributor, Author):

  1. I'm not sure what the right answer is. I figured that for now we can have a specific .so object for this op, and when we add further ops we can decide whether to merge them into a single binary or keep them split. I've seen packages take both approaches, and they both seem reasonable to me (e.g. TFA uses separate binaries but lingvo uses one)
  2. This PR includes docs for installing from source, which should cover the biggest difficulty of developing custom ops. I think we probably want to generally discourage contribution of custom ops, but if you think this is valuable I can write a blurb in CONTRIBUTING.md about the workflow of working on these ops.

Review comment (Contributor):

Yeah, please do so. My main ask here is how users can rebuild and update the *.so file once they change the implementation.
Having a single .so object seems better to me; even TensorFlow itself contains only a few .so files.

Reply (Contributor, Author):

Gotcha -- that is already covered under "Tests that require custom ops" in CONTRIBUTING.md. I've updated that file to refer to the instructions for building the binaries in the "Contributing Custom Ops" section.

I've also renamed the .so to _keras_cv_custom_ops.so to indicate that later ops should be included in this same binary.

Review comment (Contributor):

Sounds good!

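For illustration, a minimal sketch of how a single combined binary is typically consumed from Python; only the _keras_cv_custom_ops.so name comes from the thread above, and the op attribute shown is a hypothetical placeholder:

```python
import tensorflow as tf

# Load the shared object once; every op registered in the binary becomes an
# attribute of the returned module, so one .so can host many ops.
_custom_ops = tf.load_op_library("_keras_cv_custom_ops.so")

# Hypothetical op name, for illustration only.
pairwise_iou3d = _custom_ops.pairwise_iou3d
```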
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""IoU3D loss using a custom TF op."""
Review comment (Contributor):

Hmm...this is indeed very interesting...

keras_cv/losses/iou_3d_loss_test.py (outdated, resolved)
@bhack (Contributor) left a comment:

It would be nice if you could adapt our .devcontainer for C++.

You could port/copy the C++ entries from:
https://github.com/tensorflow/addons/blob/master/.devcontainer/devcontainer.json

@ianstenbit (Contributor, Author): /gcbrun

@ianstenbit (Contributor, Author): /gcbrun

Note that this is implemented using a custom TensorFlow op. Initializing an
IoU3D object will attempt to load the binary for that op.

Boxes should have the format [center_x, center_y, center_z, dimension_x,
@bhack (Contributor) commented Oct 19, 2022:

Can we document a bit of the rationale for this format choice in our Python API?

I don't have stats on format popularity across datasets, but if this is not the most popular native format, we need to extend our current 2D bbox converters to 3D.

E.g. PyTorch3D: # Assume inputs: boxes1 (M, 8, 3) and boxes2 (N, 8, 3)

Please also check the unified format in one of the largest 3D bbox datasets:
https://github.com/facebookresearch/omni3d/blob/main/DATA.md#data-usage

Reply (Contributor, Author):

I think it's likely that we'll want to add converters in the future. I've opened #947 for this and we can address it once we have a minimal 3D bbox pipeline working.

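A hypothetical usage sketch of the box format under discussion, assuming the truncated docstring line continues with dimension_y, dimension_z, and a heading angle (a common 7-DOF parameterization), and assuming a Keras-metric-style API as the PR title at this point suggests:

```python
import tensorflow as tf
from keras_cv.metrics import IoU3D  # assumed import path at this stage of the PR

# Boxes as [center_x, center_y, center_z, dim_x, dim_y, dim_z, heading].
# Two axis-aligned 2x2x2 cubes offset by 1 along x intersect in a 1x2x2 slab.
boxes_a = tf.constant([[0.0, 0.0, 0.0, 2.0, 2.0, 2.0, 0.0]])
boxes_b = tf.constant([[1.0, 0.0, 0.0, 2.0, 2.0, 2.0, 0.0]])

iou = IoU3D()
iou.update_state(boxes_a, boxes_b)
print(iou.result())  # 4 / (8 + 8 - 4) = 1/3 for this pair
```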
@ianstenbit changed the title from "Add IoU3DLoss with a custom c++ op (CPU)" to "Add IoU3D metric with a custom c++ op (CPU)" on Oct 19, 2022
@tanzhenyu (Contributor) left a comment:

LGTM, one minor comment

keras_cv/metrics/iou_3d.py (outdated, resolved)
@tanzhenyu (Contributor)

Also, I'm leaning towards only supporting Linux for now, until we know better.

@ianstenbit (Contributor, Author) commented Oct 21, 2022

> Also, I'm leaning towards only supporting Linux for now, until we know better.

SGTM. I'll leave the config code in for all platforms for now, but when we build our wheels we can build Linux-only (and other platforms can have a Python-only wheel).

@ianstenbit (Contributor, Author): /gcbrun

@ianstenbit (Contributor, Author): /gcbrun

@ianstenbit changed the title from "Add IoU3D metric with a custom c++ op (CPU)" to "Add IoU3D as a custom c++ op (CPU)" on Oct 21, 2022
@ianstenbit (Contributor, Author): /gcbrun

@ianstenbit merged commit 3d6467e into keras-team:master on Oct 21, 2022
@ianstenbit deleted the bin branch on October 21, 2022 at 18:05
ghost pushed a commit to y-vectorfield/keras-cv that referenced this pull request Nov 16, 2023
* Initial custom ops testing

* Switch to cpp14

* Add include_package_data flag

* cpp17

* Configure CI tests

* Formatting

* Call configure before running bazel build for CI

* Verbose failures for CI/Bazel

* Switch to tf-cpu for test flow

* Use build dep impls from TFA

* Update cloudbuild workflow to build binaries from source

* Update cloudbuild docs (updated docker image)

* Update GPU tests to TF 2.10

* Update copyright and docstrings

* Move loss to internal, update README and CONTRIBUTING.md

* Update to IoU op ported from Lingvo

* Remove cuda config

* Make iou3d a real keras loss

* Skip custom-ops tests in devcontainer

* Review comments

* Remove skipping of IoU test from devcontainer (it is now opt-in)

* Fix environment variable for windows

* Update environment variable setting

* Use env in yaml for environment variable

* Review comments

* Make IoU a metric, not a loss. Also update Cpp implementation to compute only major axis of pairwise IoU matrix

* typo

* Make test case more demonstrative

* Switch to a single binary, remove binary from git

* Move IoU3D from metrics to ops

* Switch to full matrix computation of iou

* Formatting