
Dense Image Warp tests are flaky #138

Closed
seanpmorgan opened this issue Apr 3, 2019 · 17 comments · Fixed by #187
Labels
bug Something isn't working image

Comments

@seanpmorgan
Member

Recently we've seen that #53 is causing flaky failures in the CI. See:
https://source.cloud.google.com/results/invocations/8f31faef-505a-440e-b75f-e6edf1071269/targets/tensorflow_addons%2Fubuntu%2Fgpu%2Fpy3%2Fpresubmit/log

Do you mind taking a look when time allows @WindQAQ ?

@seanpmorgan seanpmorgan added bug Something isn't working image labels Apr 3, 2019
@WindQAQ
Member

WindQAQ commented Apr 4, 2019

It seems that the behavior of tf.debugging.assert* that we investigated in #53 results in the error, but I'm not sure why the test fails even when it runs in eager mode only.

@WindQAQ
Member

WindQAQ commented Apr 4, 2019

Okay, I found that the following script also fails with tf==2.0.0-dev20190403 on my machine. CC @tomerk for visibility.

import tensorflow as tf
from tensorflow.python.framework import test_util


@tf.function
def foo(x):
    tf.debugging.assert_greater_equal(x.shape[0], 2, message="fail")
    y = x[1]
    return y


class TestAssert(tf.test.TestCase):

    def test_assert(self):
        with self.assertRaisesRegexp(tf.errors.InvalidArgumentError, "fail"):
            self.evaluate(foo(tf.random.uniform(shape=(1, 2, 3))))


if __name__ == "__main__":
    tf.test.main()

@seanpmorgan
Member Author

seanpmorgan commented Apr 5, 2019

@WindQAQ So while this does look like a bug... is there any reason we need to use tf.debugging for the checks in dense_image_warp? I believe those checks can all be done with the Python interpreter (or, in graph mode with tf.function, with ordinary Python syntax)?

@WindQAQ
Member

WindQAQ commented Apr 5, 2019

The checks are all about shape inference. If tf.function can handle None shapes, or if this is no longer a problem in TF 2.0, I think the checks can definitely use simple assert, if, and raise instead.

@seanpmorgan
Member Author

Could we try a PR that uses just if and raise, and make sure there is a test case that will fail if it doesn't properly trap the shape issue?
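A sketch of such a guard test (the helper is a hypothetical stand-in for the real check in dense_image_warp; the error message matches the one the existing test expects), which fails if the plain-Python check ever stops raising:

```python
import unittest


def _validate_grid_width(width):
    # Hypothetical stand-in for the if/raise check in dense_image_warp.
    if width < 2:
        raise ValueError("Grid width must be at least 2.")


class SizeExceptionTest(unittest.TestCase):

    def test_size_exception(self):
        # This test fails if the check no longer raises for a
        # too-small grid, trapping regressions in the shape check.
        with self.assertRaisesRegex(ValueError,
                                    "Grid width must be at least 2"):
            _validate_grid_width(1)
```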

@WindQAQ
Member

WindQAQ commented Apr 5, 2019

Sure, I'll create a PR in one or two days :-)

@WindQAQ WindQAQ mentioned this issue Apr 8, 2019
@WindQAQ
Member

WindQAQ commented Apr 8, 2019

In the latest Docker image (tf==2.0.0-dev20190405), I find that the error is caused by test_zero_flows. The workaround is to disable the test in graph mode. I'm still working on replacing tf.debugging with pure if and raise statements for future extension.

@armando-fandango
Contributor

@WindQAQ one more test is failing now; sorry, it was not failing earlier:

[  FAILED  ] DenseImageWarpTest.test_size_exception
[ RUN      ] DenseImageWarpTest.test_zero_flows
[       OK ] DenseImageWarpTest.test_zero_flows
======================================================================
FAIL: test_size_exception (__main__.DenseImageWarpTest)
test_size_exception (__main__.DenseImageWarpTest)
Make sure it throws an exception for images that are too small.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/absl/third_party/unittest3_backport/case.py", line 37, in testPartExecutor
    yield
  File "/usr/local/lib/python2.7/dist-packages/absl/third_party/unittest3_backport/case.py", line 162, in run
    testMethod()
  File "/root/.cache/bazel/_bazel_root/e051f2f195071208ea7f081f88730f4f/execroot/__main__/bazel-out/k8-opt/bin/tensorflow_addons/image/dense_image_warp_test.runfiles/__main__/tensorflow_addons/image/dense_image_warp_test.py", line 223, in test_size_exception
    self._check_interpolation_correctness(shape, "float32", "float32")
AssertionError: "Grid width must be at least 2." does not match "indices[0,0] = -1 is not in [0, 2)
	 [[{{node dense_image_warp/StatefulPartitionedCall/interpolate_bilinear/gather-top_left/GatherV2}}]] [Op:__inference_dense_image_warp_9536]"

----------------------------------------------------------------------
Ran 8 tests in 52.030s

FAILED (failures=1)
================================================================================

@WindQAQ
Member

WindQAQ commented Apr 9, 2019

@armando-fandango I cannot reproduce this error now. Is this on the CI build or your local machine? Thanks!

@seanpmorgan Hi Sean, I suppose it is not easy due to tf.function's state. It seems the following code snippet does not result in multiple distinct graphs being traced, and thus ValueError is raised even when shape=[3, 3, 3]. Sorry, I am not familiar with tf.function and not sure if this use case is correct.

import tensorflow as tf


@tf.function
def foo(x):
    if tf.shape(x)[0] < 3:
        raise ValueError("fail")
    return x[2]


for shape in [[1, 1, 1], [2, 2, 2], [3, 3, 3]]:
    try:
        foo(tf.random.uniform(shape=shape))
    except ValueError:
        print(shape)

for shape in [[3, 3, 3], [2, 2, 2], [1, 1, 1]]:
    try:
        foo(tf.random.uniform(shape=shape))
    except ValueError:
        print(shape)

'''
output: 
[1, 1, 1]
[2, 2, 2]
[3, 3, 3]
[3, 3, 3]
[2, 2, 2]
[1, 1, 1]
'''

@armando-fandango
Contributor

This is on the latest custom-ops container with the master branch. I have two MacBooks, and it fails on only one of them.

@armando-fandango
Contributor

Do we need to add make clean in case something is getting cached?

@WindQAQ
Member

WindQAQ commented Apr 10, 2019

It's weird. The behavior seems to vary from machine to machine. I think another workaround is to temporarily not use tf.function here, but that might not meet the requirements for Addons. Or we can just wait for tf.debugging.assert* to work as expected...

@seanpmorgan
Member Author

Yeah, I believe it'll fail on every machine given enough runs. I'm fine with removing tf.function for the time being... but we should look into removing tf.debugging as a long-term fix.

@seanpmorgan
Member Author

Hmmm, the tests appear to be passing repeatedly. Possibly there has been an upstream fix... such is life building against an alpha release.

Closing, but happy to re-open if we see issues again.

@seanpmorgan
Member Author

Welp... too quick on calling that (nightly fail). We should either remove tf.debugging or remove the tf.function decorator.

@khatchad

Welp... too quick on calling that (nightly fail). We should either remove tf.debugging or remove the tf.function decorator.

You should be able to use tf.debugging.assert_equal() with tf.function if you follow it up with a control dependency.
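A sketch of that pattern, reusing the toy foo from the earlier repro (not the actual dense_image_warp fix): the assertion op is created inside the tf.function and attached as a control dependency of the value being read, so graph execution cannot prune it away.

```python
import tensorflow as tf


@tf.function
def foo(x):
    # Build the assertion on the dynamic shape, then force it to run
    # before the read by making it a control dependency of the output.
    assert_op = tf.debugging.assert_greater_equal(
        tf.shape(x)[0], 2, message="fail")
    with tf.control_dependencies([assert_op]):
        return tf.identity(x[1])
```

Note that tf.shape(x)[0] (a tensor) is used rather than x.shape[0] (a Python int), since only a graph-level assertion op can be wired in as a control dependency.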

@khatchad

Seems that the behavior of tf.debugging.assert* that we investigated in #53 results in the error. But I'm not so sure why the test fails even if it runs in eager mode only.

I don't think you need tf.debugging.assert_equal() in eager mode; it returns None there. I believe you can use a normal Python assert statement in eager mode, as there is no graph involved.
