Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create RFC: Handling XLA Exceptions in Python #435

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

vinayburugu
Copy link

No description provided.

@google-cla
Copy link

google-cla bot commented Nov 8, 2022

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Copy link

@sherhut sherhut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am generally supportive if this. Making exceptions attributable to XLA and exposing them independently from TensorFlow is the right direction. I have some questions as to the details.


Design Proposal

This proposal would replace existing std:: and py:: exceptions thrown by the CPP layers with new custom exceptions which will all be extended from XlaError base class. All status objects originating from XLA will be replaced by XLA specific status codes. Equivalent python exceptions will be defined under a common module xla.exceptions which will contain translations for status objects (previously TensorFlow exceptions (https://github.com/tensorflow/tensorflow/blob/87462bfac761435a46641ff2f10ad0b6e5414a4b/tensorflow/python/framework/errors_impl.py)) as well as direct throws from XLA (previously std library exceptions).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify what you mean by All status objects originating from XLA will be replaced by XLA specific status codes? Do you want to expose the literals in a separate, XLA specific python module or modify them? Do you want a single exception that covers all status codes or sub-exceptions for each?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Status Object updates the error codes from errorcodes.proto which are common to both XLA and tensorflow. I wanted to use unique codes for XLA and create corresponding python Exceptions for those error codes so that we see unique XLA exceptions corresponding to those new error codes added. All this effort is being proposed just to identify whether the root cause is framework or XLA instantly based on the Python Error / Error code etc.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the error codes get wrapped into an 'XlaError' exception at the XLA API boundary, why would we need different error numbers? I agree we should expose the numbers from XLA, as well, but it is not clear to me why they could not be the same numbers?

}
}

We will create different exception classes based on the need by extending this base class. For example:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume initially "need" is defined as existing TensorFlow exceptions that are currently used? What is the intended use for this exception hierarchy? You could also achieve the goal you have defined by just having an XlaError as the only exception that is thrown at the interface boundary to XLA. That exception could carry a status code, message and stack trace to aid debugging.

I acknowledge that this is currently not very principled in which exception is thrown but if we make all thrown exceptions XLA specific anyway, we could unify the use now. Hence I'd like to better understand the use case of sub-exceptions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this change was propsed to covert std exceptions currently being thrown with unique XLA exceptions. If all exceptions throws XlaError, that would suffice.

except xla_exceptions.XlaInternalError:
pass

2. Replacing existing use of std:: and py:: exceptions in the CPP layer
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using standard exceptions has the advantage that they can be treated uniformly. I understand that goal of making errors attributable to XLA but in the specific instance called out below, I am not convinced this is the right approach.
The __getattr__ method is a standard python method defined on an object that, while returned from an XLA API, is potentially used in contexts independent of the XLA api. The contract for that method is to throw the standard exception so that the object behaves like one would expect of a python object.

So I suggest we only replace exceptions that are an exception of the XLA api in a narrow sense and keep standard exceptions when they model python behavior.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is your take on this @vinayburugu? It seems wrong to me to replace these exceptions from a python API perspective.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants