
Discussion: When to do checks of user inputs? #1417

Closed
gabrieldemarmiesse opened this issue Mar 26, 2020 · 16 comments

@gabrieldemarmiesse
Member

Describe the feature and the current behavior/state.

Currently, we don't have a policy specifying when to do checks on numerical values (user input checks). Sometimes it's done only in eager mode, sometimes in both eager and graph mode...

For example: #1338 (comment)

For maximum speed, it would be better if the checks did not run at all, but we also need a way to debug. Eager mode is not a guarantee of debug mode: some people use eager mode for flexibility because they can't use graph mode.

A strong policy would allow us to run very expensive checks and would improve the UX by letting users check their numerical values when they want to do a dry run.

I propose that we do the following:

if tf.executing_eagerly() and __debug__:
    # Do very expensive checks of user inputs here, such as comparing values
    # and raising exceptions.
    pass

The __debug__ constant is True by default in Python (yes, it's strange). If Python is run in optimized mode with python -O, __debug__ is set to False.
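
For reference, this is standard Python behaviour and easy to verify from the command line (a minimal sketch, nothing Addons-specific):

# __debug__ is a built-in constant: True by default, False when the
# interpreter is started with -O (which also strips assert statements).
#
#   $ python -c "print(__debug__)"      # prints True
#   $ python -O -c "print(__debug__)"   # prints False

if __debug__:
    print("Runtime checks enabled (default interpreter mode).")
else:
    print("Runtime checks disabled (python -O).")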

In the end, users have several choices:

  • graph mode and maximal performance.
  • eager mode and maximal performance: python -O my_script.py
  • eager mode and debug mode to monitor strange values: python my_script.py.

Of course this doesn't concern checks of user inputs done in __init__ functions, because those parts are never critical for performance.

Relevant information

  • Are you willing to contribute it (yes/no): yes
  • Are you willing to maintain it going forward? (yes/no): yes
  • Is there a relevant academic paper? (if so, where): no
  • Is there already an implementation in another framework? (if so, where): not that I'm aware of. If people know other libraries doing similar things, I'd like to know about it.
  • Was it part of tf.contrib? (if so, where): no

Which API type would this fall under (layer, metric, optimizer, etc.)

All APIs

Who will benefit with this feature?

Users, as they would benefit from many checks while keeping maximal performance in graph mode, where all checks would have been removed.

Any other info.

@fsx950223
Member

It sounds good.

@bhack
Contributor

bhack commented May 8, 2020

I don't know whether it would be possible, or safe enough, to check only sys.gettrace() to see if we are really under a debugger, and to enable the checks only in that case.
It doesn't seem very UX-friendly to require python -O my_script.py to get performance in eager mode if we want to be eager-friendly by default.
I haven't seen many other libraries document an explicit -O, so my point is only about the user experience and user habits.

@gabrieldemarmiesse
Member Author

I haven't seen many other libraries document an explicit -O

That's because it's not possible to turn off runtime checks in most libraries, so we're already better than most in this regard. Furthermore, many people debug using prints (not that it's good practice), and we need to support that use case; it's not possible with sys.gettrace(). -O turns off assert statements too, so it was made precisely for this use case.

From the zen of Python:

Errors should never pass silently.
Unless explicitly silenced.

Eager mode is completely aligned with Python here.

@bhack
Contributor

bhack commented May 8, 2020

I could agree with you if we want to support print debugging. In that case the alternative could be to use some kind of preprocessor.
At the UX level, the backwards interpretation of __debug__ was not so friendly for other users either. See the comments on the upvoted reply at https://stackoverflow.com/questions/482014/how-would-you-do-the-equivalent-of-preprocessor-directives-in-python.

Errors should never pass silently.
Unless explicitly silenced.

OK, but I think we are not adopting this philosophy, because we tell the user:

  • The error should pass silently if you are in graph mode
  • The error should not pass silently if you are in eager mode

@bhack
Contributor

bhack commented May 8, 2020

Also, could we have any significant side effects in AutoGraph loops with -O?

Note: If debug is True, AutoGraph limits the number of iterations in normal Python loops to prevent infinite loops and raise an error if the limits are exceeded. However, the iteration limits are very large and may take a while to trigger an error.

@guillaumekln
Contributor

Sorry for not reacting to this issue earlier.

we don't have a policy specifying when to do checks on numerical values (user input checks)

Can you clarify what you mean by "checks on numerical values" exactly? I would assume that most if not all input checks are related to checking the input shape.

To have maximum speed, it would be better if the checks are not running

This is true but when we talk about shape checks, they are an order of magnitude faster than the actual layer runtime. So I'm not sure performance is a concern here.


I would agree with @bhack that the mismatch between eager and graph mode would be confusing, if not error-prone. As a library, I think we should treat both modes the same way.

For example, I know that the error checking in the seq2seq module is a bit old-style but it is consistent in all modes:

  • use Python to check static values (e.g. static depth dimension, tensor rank)
  • use TensorFlow asserts to check dynamic values (e.g. batch size)
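
Not the actual seq2seq code, just a minimal sketch of those two styles; the function name and error messages below are made up for illustration:

import tensorflow as tf

def _check_logits_and_labels(logits, labels):
    # Static values known at trace time: plain Python checks.
    if logits.shape.rank != 2:
        raise ValueError(
            "Expected a rank-2 logits tensor, got rank %s." % logits.shape.rank
        )
    # Dynamic values only known at run time: TensorFlow asserts, which behave
    # the same way in eager and graph mode.
    assert_op = tf.debugging.assert_equal(
        tf.shape(logits)[0],
        tf.shape(labels)[0],
        message="logits and labels must have the same batch size",
    )
    with tf.control_dependencies([assert_op]):
        return tf.identity(logits)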

@gabrieldemarmiesse
Member Author

gabrieldemarmiesse commented May 8, 2020

I believe we should support other types of error checks to be able to provide a better user experience. For example:

  • Inputs are in the expected range, for example for functions expecting images between 0 and 255 or between 0 and 1.
  • In functions like resampler ops or cutout, where one tensor contains the indices of another tensor, we could check that all indices are in the expected range.
  • The sum of all inputs is 1 for losses that expect predictions, or even that all predictions are between 0 and 1.
  • In hamming_distance, we could check that all values are either 0 or 1.

Those should help greatly when a user wants to debug a neural network that doesn't work as expected. Shape checks help, but the most painful bugs in neural networks appear when the values are not in the expected range, or are not the ones expected, precisely because those errors are silent. We could provide a lot of value if we help the user detect those.

I believe the checks are light right now (mostly shape checking) because we couldn't turn them off, so we couldn't afford more expensive ones.
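
To make this concrete, here is a minimal sketch of such a value-range check using the eager + __debug__ guard proposed above (the helper name _check_image_range is hypothetical, not an existing Addons function):

import tensorflow as tf

def _check_image_range(images):
    # Only pay for the expensive check when debugging in eager mode.
    if tf.executing_eagerly() and __debug__:
        min_val = tf.reduce_min(images)
        max_val = tf.reduce_max(images)
        if min_val < 0.0 or max_val > 1.0:
            raise ValueError(
                "Expected image values in [0, 1], got range [%s, %s]."
                % (min_val.numpy(), max_val.numpy())
            )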

@guillaumekln
Contributor

So the policy you are describing in this issue is only for checking the actual input values and not the shapes?

@gabrieldemarmiesse
Member Author

Good question. When I opened the issue, I didn't think about which checks were negligible in running time; I had more in mind that we should have a mode with maximal performance where we drop all checks. Making a check technically stops the flow of computation, so I thought it was expensive. That may not be the case for shapes.

Overall, if we can unify the way we do checks, that's better though. Do you think we should distinguish between categories of checks?

@bhack
Contributor

bhack commented May 8, 2020

So isn't it better to handle these with a classical log-levels approach?

@guillaumekln
Contributor

Do you think we should distinguish between categories of checks?

I think shape checks should always run and work in both eager and graph mode.

On the other hand, it sounds reasonable to only run numerical checks in an eager debug mode. But it should probably be disabled by default, what do you think? It could be enabled with a global function such as tfa.enable_numerical_checks.
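
A minimal sketch of what such an opt-in switch could look like; tfa.enable_numerical_checks is only a proposed name in this discussion, not an existing API:

# Hypothetical module-level switch, disabled by default.
_NUMERICAL_CHECKS_ENABLED = False

def enable_numerical_checks():
    global _NUMERICAL_CHECKS_ENABLED
    _NUMERICAL_CHECKS_ENABLED = True

def numerical_checks_enabled():
    return _NUMERICAL_CHECKS_ENABLED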

@gabrieldemarmiesse
Member Author

But it should probably be disabled by default, what do you think?

It's a hard decision.

If it's disabled by default in eager mode, it's likely that only a very small subset of our userbase will know about it. Most users just google whatever layer/metric/loss they're looking for and grab it from TensorFlow Addons. They don't look at the README because they don't need to. So most of them will never activate it, even when debugging their model.

If we enable it by default in eager mode, it's good for a very big part of the userbase: those who use model.fit (graph mode by default there) and run eagerly only to debug. But it's bad for users running in full eager mode. That would seem weird though, because there is very little penalty for users to run Addons' code in graph mode. Why would they need eager mode inside our functions except for debugging?

Overall, it's a good point that we don't want the users running everything in eager mode to get a silent drop in performance.

What we could do is use the method we used when loading custom ops: we throw a warning the first time. It would look like this:

# e.g. in an op or layer implementation
def some_addons_function():
    if __debug__ and tf.executing_eagerly():
        tfa.utils.warn_numerical_checks()
        # do expensive checks here


# in utils.py
import warnings

already_warned = False

def warn_numerical_checks():
    global already_warned
    if not already_warned:
        warnings.warn(
            "Numerical checks are running because you are using eager mode. "
            "This impacts performance negatively but helps you debug your model "
            "by checking that values are in the expected range and such, and it "
            "will raise an error if something is wrong. For example, it can check "
            "that the sum of a prediction tensor is always 1, with all values "
            "between 0 and 1. If such a condition is not respected, an exception "
            "will be raised to help you debug what's wrong. You can turn these "
            "checks off by either running TensorFlow Addons functions in graph "
            "mode or running Python in optimized mode: `python -O my_file.py`."
        )
        already_warned = True

So a good compromise would be: enabled by default in eager mode, plus a warning the first time a numerical check is performed. Do you think it's a good idea?

@bhack
Contributor

bhack commented May 8, 2020

IMHO I'd prefer to have an API call plus an environment variable to control this, like [1], because I still think that the large, more or less entry-level part of the user base that doesn't read the documentation is not in the habit of using -O.

Then we could talk about what is the best default behavior in eager mode.

[1] https://github.com/tensorflow/tensorflow/blob/v2.2.0/tensorflow/python/autograph/utils/ag_logging.py#L40-L88
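
For illustration, a sketch of an environment-variable-driven switch in the spirit of the AutoGraph verbosity handling linked above; the variable name TFA_NUMERICAL_CHECKS is made up:

import os

def numerical_checks_enabled(default=False):
    # Read a hypothetical TFA_NUMERICAL_CHECKS environment variable and fall
    # back to the given default when it is unset.
    value = os.environ.get("TFA_NUMERICAL_CHECKS")
    if value is None:
        return default
    return value.strip().lower() not in ("0", "false", "")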

@bhack
Contributor

bhack commented May 8, 2020

Also, I want to open a new point in the discussion: how many numerical checks do we have in the main library?
I suppose users will expect a fairly consistent experience across the whole library (TF + Addons).
If you start to expect to be "safe" thanks to extra guards in Addons, at some point you could expect this "feature" to be available in the whole library too, because the APIs are regularly mixed in real use cases.
I understand that this point is probably more marginal than the other ones, but library consistency has its own weight in the user experience.

@guillaumekln
Contributor

So a good compromise would be: enabled by default in eager mode, plus a warning the first time a numerical check is performed. Do you think it's a good idea?

It seems reasonable to think that eager mode is mostly used for development where performance is not critical. Overall I don't have a strong opinion here as long as the condition is isolated in a function so that it can easily be changed if required.

@seanpmorgan
Member

TensorFlow Addons is transitioning to a minimal maintenance and release mode. New features will not be added to this repository. For more information, please see our public messaging on this decision:
TensorFlow Addons Wind Down

Please consider sending feature requests / contributions to other repositories in the TF community with a similar charter to TFA:
Keras
Keras-CV
Keras-NLP
