CIFAR-10 eval fails with error TypeError: Input 'predictions' of 'InTopKV2' Op has type float16 that does not match expected type of float32 #7225

chrismattmann · 2019-07-16T15:55:44Z

System information

What is the top-level directory of the model you are using:
tutorials/image/cifar10/cifar10_eval.py
Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
no
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux jupyter-mattmann-40usc-2eedu 4.15.15-1.el7.x86_64 #1 SMP Thu Oct 4 07:42:41 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
TensorFlow installed from (source or binary): binary (via pip)
TensorFlow version (use command below): 1.13.1, tensorflow-datasets 1.0.2
Bazel version (if compiling from source): N/A
CUDA/cuDNN version: N/A
GPU model and memory: 4 GPUs
Exact command to reproduce:
python3 cifar10_eval.py

== env ==========================================================
LD_LIBRARY_PATH /usr/local/nvidia/lib:/usr/local/nvidia/lib64
DYLD_LIBRARY_PATH is unset

== nvidia-smi ===================================================
Tue Jul 16 15:59:26 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.14       Driver Version: 430.14       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:1B:00.0 Off |                  N/A |
| 31%   32C    P0    85W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:1E:00.0 Off |                  N/A |
|  0%   30C    P8    22W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce RTX 208...  Off  | 00000000:61:00.0 Off |                  N/A |
|  0%   30C    P0    65W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce RTX 208...  Off  | 00000000:63:00.0 Off |                  N/A |
| 29%   29C    P0    62W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

== cuda libs  ===================================================
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcudart_static.a
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcudart.so.10.0.130

== tensorflow installed from info ==================

== python version  ==============================================
(major, minor, micro, releaselevel, serial)
(3, 7, 3, 'final', 0)

== bazel version  ===============================================
jovyan@jupyter-mattmann-40usc-2eedu:~/models/tutorials/image/cifar10$

Describe the problem

CIFAR-10 eval script fails. I think the error is related to tensorflow/tensorflow#165. I'll work up a PR to fix.

Source code / logs

Will send a PR.


chrismattmann · 2019-07-16T16:08:26Z

After fixing that one, another error shows up later in the same call:

Traceback (most recent call last):
  File "cifar10_eval.py", line 156, in <module>
    tf.app.run()
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "cifar10_eval.py", line 152, in main
    evaluate()
  File "cifar10_eval.py", line 128, in evaluate
    top_k_op = tf.nn.in_top_k(logits, labels, 1)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/nn_ops.py", line 4784, in in_top_k
    return gen_nn_ops.in_top_kv2(predictions, targets, k, name=name)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 5040, in in_top_kv2
    "InTopKV2", predictions=predictions, targets=targets, k=k, name=name)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 626, in _apply_op_helper
    param_name=input_name)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 60, in _SatisfiesTypeConstraint
    ", ".join(dtypes.as_dtype(x).name for x in allowed_list)))
TypeError: Value passed to parameter 'targets' has DataType float16 not in list of allowed values: int32, int64

Fixing this cast as well.
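
For readers following along, here is a minimal pure-Python sketch of what an in-top-k check does, to show why the targets have to be integer class indices and not floats. It is an illustration only, not the TensorFlow implementation and not the code from the PR: the function name in_top_k, the sample logits and the sample labels are all made up for this example. Ties are counted as hits, which is my understanding of how the TensorFlow op behaves. The comments at the end note what the analogous casts would presumably look like in the eval script, using tf.cast.

```python
def in_top_k(predictions, targets, k):
    """Illustrative reference for an in-top-k check (not TensorFlow code).

    predictions: list of rows, each a list of float scores, one per class.
    targets: list of integer class indices, one per row.
    Returns one boolean per row.
    """
    # Targets are used as indices into each row, so they must be integers.
    if not all(isinstance(t, int) and not isinstance(t, bool) for t in targets):
        raise TypeError("targets must be integer class indices")
    results = []
    for row, target in zip(predictions, targets):
        # The target is "in the top k" when fewer than k classes score
        # strictly higher than it does (so ties count as hits).
        higher = sum(1 for score in row if score > row[target])
        results.append(higher < k)
    return results


# Hypothetical data: labels that arrived as floats, as in the error above.
float_labels = [2.0, 0.0, 1.0]
logits = [[0.1, 0.2, 0.7], [0.3, 0.6, 0.1], [0.2, 0.5, 0.3]]

# Convert the labels to integers before the check.
int_labels = [int(x) for x in float_labels]
hits = in_top_k(logits, int_labels, 1)
print(hits)

# Assumption: in the TF 1.x eval script the analogous fix would be casts
# placed before the tf.nn.in_top_k call, along the lines of
#   logits = tf.cast(logits, tf.float32)
#   labels = tf.cast(labels, tf.int32)
```

Passing float_labels directly to this sketch raises a TypeError, which mirrors the 'targets' error in the traceback above.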


chrismattmann · 2019-07-16T16:24:42Z

Opened PR #7227, which fixes this.


chrismattmann · 2019-07-28T21:13:41Z

Committed by @tfboyd in 63605b9. Thanks!

chrismattmann added a commit to chrismattmann/models that referenced this issue Jul 16, 2019

Fix for tensorflow#7225: CIFAR-10 eval fails with error TypeError: In…

6183fbd

…put 'predictions' of 'InTopKV2' Op has type float16 that contributed by mattmann.

chrismattmann added a commit to chrismattmann/models that referenced this issue Jul 16, 2019

Fix for tensorflow#7225: CIFAR-10 eval fails with error TypeError: In…

d420757

…put 'predictions' of 'InTopKV2' Op has type float16 that contributed by mattmann.

tfboyd pushed a commit that referenced this issue Jul 18, 2019

Fix for #7225: CIFAR-10 eval fails with error TypeError: Input 'predi…

63605b9

…ctions' of 'InTopKV2' Op has type float16 that contributed by mattmann. (#7227)

chrismattmann closed this as completed Jul 28, 2019
