Fix Normalize.transform result when stdev is zero #993

Merged (1 commit) on Jul 27, 2021

Conversation

rjzamora
Collaborator

@rjzamora rjzamora commented Jul 27, 2021

The `Normalize.transform` method currently fails to return any column whose standard-deviation statistic is zero. This breaks any follow-on operator that depends on that column. This PR keeps such "problematic" columns in the output and sets all of their values to zero.
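The behavior described above can be sketched roughly as follows. This is a minimal standard-library illustration, not the NVTabular implementation: the function name `normalize_columns` and the dict-of-lists table layout are assumptions made for the example, and the statistics are computed inline rather than taken from a fitted operator.

```python
from statistics import fmean, pstdev


def normalize_columns(columns):
    """Standardize each column; constant columns (std == 0) become all zeros.

    `columns` is a hypothetical dict mapping column name -> list of floats.
    """
    out = {}
    for name, values in columns.items():
        mean = fmean(values)
        std = pstdev(values)  # population standard deviation
        if std > 0:
            out[name] = [(v - mean) / std for v in values]
        else:
            # Zero spread: keep the column in the output instead of dropping
            # it, and fill it with zeros so downstream steps still find it.
            out[name] = [0.0 for _ in values]
    return out


table = {"x": [1.0, 2.0, 3.0], "const": [5.0, 5.0, 5.0]}
result = normalize_columns(table)
```

Here `result` still contains both `x` and `const`, so a later step that selects `const` would not fail on a missing column.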

@nvidia-merlin-bot
Contributor

CI Results
GitHub pull request #993 of commit 52568e05ca48eba3471d59e7e1af664f4edf40ac, no merge conflicts.
Running as SYSTEM
Setting status of 52568e05ca48eba3471d59e7e1af664f4edf40ac to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/2985/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/993/*:refs/remotes/origin/pr/993/* # timeout=10
 > git rev-parse 52568e05ca48eba3471d59e7e1af664f4edf40ac^{commit} # timeout=10
Checking out Revision 52568e05ca48eba3471d59e7e1af664f4edf40ac (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 52568e05ca48eba3471d59e7e1af664f4edf40ac # timeout=10
Commit message: "correct normalize for case that std=0"
 > git rev-list --no-walk bf63c04535e4874da6d0f930713b2187428f5818 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins8184866626482963173.sh
Installing NVTabular
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pip in /var/jenkins_home/.local/lib/python3.8/site-packages (21.2.1)
Requirement already satisfied: setuptools in /var/jenkins_home/.local/lib/python3.8/site-packages (57.4.0)
Requirement already satisfied: wheel in /usr/local/lib/python3.8/dist-packages (0.36.2)
Requirement already satisfied: pybind11 in /var/jenkins_home/.local/lib/python3.8/site-packages (2.7.0)
running develop
running egg_info
creating nvtabular.egg-info
writing nvtabular.egg-info/PKG-INFO
writing dependency_links to nvtabular.egg-info/dependency_links.txt
writing requirements to nvtabular.egg-info/requires.txt
writing top-level names to nvtabular.egg-info/top_level.txt
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
running build_ext
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.8 -c flagcheck.cpp -o flagcheck.o -std=c++17
building 'nvtabular_cpp' extension
creating build
creating build/temp.linux-x86_64-3.8
creating build/temp.linux-x86_64-3.8/cpp
creating build/temp.linux-x86_64-3.8/cpp/nvtabular
creating build/temp.linux-x86_64-3.8/cpp/nvtabular/inference
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.5.3+71.g52568e0 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.5.3+71.g52568e0 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.5.3+71.g52568e0 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/categorify.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.5.3+71.g52568e0 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/fill.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -std=c++17 -fvisibility=hidden -g0
creating build/lib.linux-x86_64-3.8
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -o build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so -> 
Generating nvtabular/inference/triton/model_config_pb2.py from nvtabular/inference/triton/model_config.proto
Creating /var/jenkins_home/.local/lib/python3.8/site-packages/nvtabular.egg-link (link to .)
nvtabular 0.5.3+71.g52568e0 is already the active version in easy-install.pth

Installed /var/jenkins_home/workspace/nvtabular_tests/nvtabular
Processing dependencies for nvtabular==0.5.3+71.g52568e0
Searching for pyarrow==1.0.1
Best match: pyarrow 1.0.1
Adding pyarrow 1.0.1 to easy-install.pth file
Installing plasma_store script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for tdqm==0.0.1
Best match: tdqm 0.0.1
Adding tdqm 0.0.1 to easy-install.pth file

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for numba==0.53.1
Best match: numba 0.53.1
Adding numba 0.53.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for pandas==1.1.5
Best match: pandas 1.1.5
Adding pandas 1.1.5 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for distributed==2021.4.1
Best match: distributed 2021.4.1
Adding distributed 2021.4.1 to easy-install.pth file
Installing dask-ssh script to /var/jenkins_home/.local/bin
Installing dask-scheduler script to /var/jenkins_home/.local/bin
Installing dask-worker script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for dask==2021.4.1
Best match: dask 2021.4.1
Processing dask-2021.4.1-py3.8.egg
dask 2021.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg
Searching for PyYAML==5.4.1
Best match: PyYAML 5.4.1
Processing PyYAML-5.4.1-py3.8-linux-x86_64.egg
PyYAML 5.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/PyYAML-5.4.1-py3.8-linux-x86_64.egg
Searching for numpy==1.20.2
Best match: numpy 1.20.2
Adding numpy 1.20.2 to easy-install.pth file
Installing f2py script to /var/jenkins_home/.local/bin
Installing f2py3 script to /var/jenkins_home/.local/bin
Installing f2py3.8 script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for tqdm==4.61.2
Best match: tqdm 4.61.2
Processing tqdm-4.61.2-py3.8.egg
tqdm 4.61.2 is already the active version in easy-install.pth
Installing tqdm script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tqdm-4.61.2-py3.8.egg
Searching for llvmlite==0.36.0
Best match: llvmlite 0.36.0
Adding llvmlite 0.36.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for setuptools==57.4.0
Best match: setuptools 57.4.0
Adding setuptools 57.4.0 to easy-install.pth file

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for pytz==2021.1
Best match: pytz 2021.1
Adding pytz 2021.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for python-dateutil==2.8.2
Best match: python-dateutil 2.8.2
Adding python-dateutil 2.8.2 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for cloudpickle==1.6.0
Best match: cloudpickle 1.6.0
Processing cloudpickle-1.6.0-py3.8.egg
cloudpickle 1.6.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/cloudpickle-1.6.0-py3.8.egg
Searching for tornado==6.1
Best match: tornado 6.1
Processing tornado-6.1-py3.8-linux-x86_64.egg
tornado 6.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tornado-6.1-py3.8-linux-x86_64.egg
Searching for click==8.0.1
Best match: click 8.0.1
Processing click-8.0.1-py3.8.egg
click 8.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/click-8.0.1-py3.8.egg
Searching for msgpack==1.0.2
Best match: msgpack 1.0.2
Processing msgpack-1.0.2-py3.8-linux-x86_64.egg
msgpack 1.0.2 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/msgpack-1.0.2-py3.8-linux-x86_64.egg
Searching for tblib==1.7.0
Best match: tblib 1.7.0
Processing tblib-1.7.0-py3.8.egg
tblib 1.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tblib-1.7.0-py3.8.egg
Searching for toolz==0.11.1
Best match: toolz 0.11.1
Processing toolz-0.11.1-py3.8.egg
toolz 0.11.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/toolz-0.11.1-py3.8.egg
Searching for psutil==5.8.0
Best match: psutil 5.8.0
Processing psutil-5.8.0-py3.8-linux-x86_64.egg
psutil 5.8.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/psutil-5.8.0-py3.8-linux-x86_64.egg
Searching for zict==2.0.0
Best match: zict 2.0.0
Processing zict-2.0.0-py3.8.egg
zict 2.0.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/zict-2.0.0-py3.8.egg
Searching for sortedcontainers==2.4.0
Best match: sortedcontainers 2.4.0
Processing sortedcontainers-2.4.0-py3.8.egg
sortedcontainers 2.4.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/sortedcontainers-2.4.0-py3.8.egg
Searching for partd==1.2.0
Best match: partd 1.2.0
Processing partd-1.2.0-py3.8.egg
partd 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/partd-1.2.0-py3.8.egg
Searching for fsspec==2021.7.0
Best match: fsspec 2021.7.0
Processing fsspec-2021.7.0-py3.8.egg
fsspec 2021.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/fsspec-2021.7.0-py3.8.egg
Searching for six==1.15.0
Best match: six 1.15.0
Adding six 1.15.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for HeapDict==1.0.1
Best match: HeapDict 1.0.1
Processing HeapDict-1.0.1-py3.8.egg
HeapDict 1.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/HeapDict-1.0.1-py3.8.egg
Searching for locket==0.2.1
Best match: locket 0.2.1
Processing locket-0.2.1-py3.8.egg
locket 0.2.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/locket-0.2.1-py3.8.egg
Finished processing dependencies for nvtabular==0.5.3+71.g52568e0
Running black --check
All done! ✨ 🍰 ✨
109 files would be left unchanged.
Running flake8
Running isort
/usr/local/lib/python3.8/dist-packages/isort/main.py:141: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
warn(f"Likely recursive symlink detected to {resolved_path}")
/usr/local/lib/python3.8/dist-packages/isort/main.py:141: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/examples/scaling-criteo/imgs
warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 2 files
Running bandit
Running pylint
************* Module nvtabular.ops.categorify
nvtabular/ops/categorify.py:459:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
************* Module nvtabular.ops.fill
nvtabular/ops/fill.py:66:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
************* Module bench.datasets.tools.train_hugectr
bench/datasets/tools/train_hugectr.py:28:13: I1101: Module 'hugectr' has no 'solver_parser_helper' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
bench/datasets/tools/train_hugectr.py:41:16: I1101: Module 'hugectr' has no 'optimizer' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)


Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

Running flake8-nb
Building docs
make: Entering directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
2021-07-27 03:13:38.709067: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-07-27 03:13:40.070533: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-07-27 03:13:40.071629: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2021-07-27 03:13:40.072637: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 1 with properties:
pciBusID: 0000:08:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2021-07-27 03:13:40.072668: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-07-27 03:13:40.072714: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-07-27 03:13:40.072747: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-07-27 03:13:40.072780: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-07-27 03:13:40.072809: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-07-27 03:13:40.072853: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-07-27 03:13:40.072885: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-07-27 03:13:40.072921: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-07-27 03:13:40.076776: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0, 1
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.6) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
make: Leaving directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml
plugins: cov-2.12.1, forked-1.3.0, xdist-2.3.0
collected 1155 items

tests/unit/test_column_group.py .. [ 0%]
tests/unit/test_column_similarity.py ........................ [ 2%]
tests/unit/test_cpu_workflow.py ...... [ 2%]
tests/unit/test_dask_nvt.py ............................................ [ 6%]
...........................................................F......... [ 12%]
tests/unit/test_dataloader_backend.py . [ 12%]
tests/unit/test_io.py .................................................. [ 16%]
........................................................................ [ 23%]
........ssssssss.................................................. [ 28%]
tests/unit/test_notebooks.py ...... [ 29%]
tests/unit/test_ops.py ................................................. [ 33%]
........................................................................ [ 39%]
........................................................................ [ 46%]
........................................................................ [ 52%]
........................................................................ [ 58%]
........................................................................ [ 64%]
................................. [ 67%]
tests/unit/test_s3.py .. [ 67%]
tests/unit/test_tf_dataloader.py ....................................... [ 71%]
.................................s [ 74%]
tests/unit/test_tf_feature_columns.py . [ 74%]
tests/unit/test_tf_layers.py ........................................... [ 78%]
................................... [ 81%]
tests/unit/test_tools.py ...................... [ 82%]
tests/unit/test_torch_dataloader.py .................................... [ 86%]
.............................................. [ 90%]
tests/unit/test_triton_inference.py ....................... [ 92%]
tests/unit/test_workflow.py ............................................ [ 95%]
................................................ [100%]

=================================== FAILURES ===================================
_________ test_dask_preproc_cpu[None-Shuffle.PER_WORKER-csv-no-header] _________

client = <Client: 'tcp://127.0.0.1:32901' processes=2 threads=16, memory=125.83 GiB>
tmpdir = local('/tmp/pytest-of-jenkins/pytest-5/test_dask_preproc_cpu_None_Shu2')
datasets = {'cats': local('/tmp/pytest-of-jenkins/pytest-5/cats0'), 'csv': local('/tmp/pytest-of-jenkins/pytest-5/csv0'), 'csv-no... local('/tmp/pytest-of-jenkins/pytest-5/csv-no-header0'), 'parquet': local('/tmp/pytest-of-jenkins/pytest-5/parquet0')}
engine = 'csv-no-header', shuffle = <Shuffle.PER_WORKER: 1>, cpu = None

@pytest.mark.parametrize("engine", ["parquet", "csv", "csv-no-header"])
@pytest.mark.parametrize("shuffle", [Shuffle.PER_WORKER, None])
@pytest.mark.parametrize("cpu", [None, True])
def test_dask_preproc_cpu(client, tmpdir, datasets, engine, shuffle, cpu):
    paths = glob.glob(str(datasets[engine]) + "/*." + engine.split("-")[0])
    if engine == "parquet":
        df1 = cudf.read_parquet(paths[0])[mycols_pq]
        df2 = cudf.read_parquet(paths[1])[mycols_pq]
    elif engine == "csv":
        df1 = cudf.read_csv(paths[0], header=0)[mycols_csv]
        df2 = cudf.read_csv(paths[1], header=0)[mycols_csv]
    else:
        df1 = cudf.read_csv(paths[0], names=allcols_csv)[mycols_csv]
        df2 = cudf.read_csv(paths[1], names=allcols_csv)[mycols_csv]
    df0 = cudf.concat([df1, df2], axis=0)

    if engine in ("parquet", "csv"):
        dataset = Dataset(paths, part_size="1MB", cpu=cpu)
    else:
        dataset = Dataset(paths, names=allcols_csv, part_size="1MB", cpu=cpu)

    # Simple transform (normalize)
    cat_names = ["name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]
    conts = cont_names >> ops.FillMissing() >> ops.Normalize()
    workflow = Workflow(conts + cat_names + label_name, client=client)
    transformed = workflow.fit_transform(dataset)

    # Write out dataset
    output_path = os.path.join(tmpdir, "processed")
    transformed.to_parquet(output_path=output_path, shuffle=shuffle, out_files_per_proc=4)

    # Check the final result
    df_disk = dd_read_parquet(output_path, engine="pyarrow").compute()

tests/unit/test_dask_nvt.py:273:


../../../.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/base.py:285: in compute
(result,) = compute(self, traverse=False, **kwargs)
../../../.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/base.py:567: in compute
results = schedule(dsk, keys, **kwargs)
../../../.local/lib/python3.8/site-packages/distributed/client.py:2666: in get
results = self.gather(packed, asynchronous=asynchronous, direct=direct)
../../../.local/lib/python3.8/site-packages/distributed/client.py:1975: in gather
return self.sync(
../../../.local/lib/python3.8/site-packages/distributed/client.py:843: in sync
return sync(
../../../.local/lib/python3.8/site-packages/distributed/utils.py:353: in sync
raise exc.with_traceback(tb)
../../../.local/lib/python3.8/site-packages/distributed/utils.py:336: in f
result[0] = yield future
../../../.local/lib/python3.8/site-packages/tornado-6.1-py3.8-linux-x86_64.egg/tornado/gen.py:762: in run
value = future.result()
../../../.local/lib/python3.8/site-packages/distributed/client.py:1840: in _gather
raise exception.with_traceback(traceback)
../../../.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/io/parquet/core.py:381: in read_parquet_part
dfs = [
../../../.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/io/parquet/core.py:382: in <listcomp>
func(fs, rg, columns.copy(), index, **toolz.merge(kwargs, kw))
../../../.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/io/parquet/arrow.py:599: in read_partition
arrow_table = cls._read_table(
../../../.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/io/parquet/arrow.py:2007: in _read_table
return _read_table_from_path(
../../../.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/io/parquet/arrow.py:406: in _read_table_from_path
return pq.ParquetFile(fil).read_row_groups(
/usr/local/lib/python3.8/dist-packages/pyarrow/parquet.py:198: in __init__
self.reader.open(source, use_memory_map=memory_map,
pyarrow/_parquet.pyx:1020: in pyarrow._parquet.ParquetReader.open
???


???
E OSError: Couldn't deserialize thrift: TProtocolException: Invalid data

pyarrow/error.pxi:99: OSError
----------------------------- Captured stderr call -----------------------------
distributed.worker - WARNING - Compute Failed
Function: read_parquet_part
args: (<fsspec.implementations.local.LocalFileSystem object at 0x7faa1458ae80>, <bound method ArrowDatasetEngine.read_partition of <class 'dask.dataframe.io.parquet.arrow.ArrowLegacyEngine'>>, Empty DataFrame
Columns: [x, y, id, name-string, label]
Index: [], [(('/tmp/pytest-of-jenkins/pytest-5/test_dask_preproc_cpu_None_Shu2/processed/part_3.parquet', [0], []), {})], ['x', 'y', 'id', 'name-string', 'label'], None, {'partitions': <pyarrow.parquet.ParquetPartitions object at 0x7fa8e04b4a90>, 'categories': [], 'filters': None})
kwargs: {}
Exception: OSError("Couldn't deserialize thrift: TProtocolException: Invalid data\n")

=============================== warnings summary ===============================
tests/unit/test_io.py: 36 warnings
tests/unit/test_workflow.py: 44 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:86: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for execution. Please use the client argument to initialize a Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 52 warnings
tests/unit/test_workflow.py: 35 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dask.py:372: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for this write operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 20 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:476: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler is being used for this shuffle operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_ops.py::test_fill_missing[True-True-parquet]
tests/unit/test_ops.py::test_fill_missing[True-False-parquet]
tests/unit/test_ops.py::test_filter[parquet-0.1-True]
/usr/local/lib/python3.8/dist-packages/pandas/core/indexing.py:670: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
iloc._setitem_with_indexer(indexer, value)

tests/unit/test_ops.py::test_join_external[True-True-left-host-pandas-parquet]
tests/unit/test_ops.py::test_join_external[True-True-left-device-pandas-parquet]
tests/unit/test_ops.py::test_join_external[True-True-inner-host-pandas-parquet]
tests/unit/test_ops.py::test_join_external[True-True-inner-device-pandas-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/join_external.py:171: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
_ext.drop_duplicates(ignore_index=True, inplace=True)

tests/unit/test_ops.py::test_filter[parquet-0.1-True]
tests/unit/test_ops.py::test_filter[parquet-0.1-False]
tests/unit/test_ops.py::test_groupby_op[id-True]
tests/unit/test_ops.py::test_groupby_op[id-False]
/var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/core.py:6610: UserWarning: Insufficient elements for head. 1 elements requested, only 0 elements available. Try passing larger npartitions to head.
warnings.warn(msg.format(n, len(r)))

-- Docs: https://docs.pytest.org/en/stable/warnings.html

---------- coverage: platform linux, python 3.8.10-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

examples/multi-gpu-movielens/torch_trainer.py 65 0 6 1 99% 32->36
nvtabular/__init__.py 12 0 0 0 100%
nvtabular/column_group.py 157 18 82 5 87% 54, 87, 128, 152-165, 214, 301
nvtabular/dispatch.py 251 44 122 21 81% 35-38, 43-45, 51-61, 68-69, 86, 93, 101, 112, 118, 123->125, 136, 159-162, 201, 208, 224, 231, 262->267, 265, 268, 271->275, 308, 319-322, 349-352, 382, 386, 427, 451, 453, 460
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 132 78 88 15 38% 29, 98, 102, 113-129, 139, 142-157, 161, 165-166, 172-197, 206-216, 219-226, 228->231, 232, 237-277, 280
nvtabular/framework_utils/tensorflow/layers/__init__.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 12 85 6 91% 60, 68->49, 122, 179, 231-239, 335->343, 357->360, 363-364, 367
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 25 20 1 43% 49, 74-103, 106-110, 113
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 37-38, 41-60, 71-84, 87
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 30 1 12 1 95% 47
nvtabular/framework_utils/torch/models.py 45 0 28 0 100%
nvtabular/framework_utils/torch/utils.py 75 4 30 2 94% 64, 118-120
nvtabular/inference/__init__.py 0 0 0 0 100%
nvtabular/inference/triton/__init__.py 298 138 128 15 54% 80-84, 138-188, 233-277, 308, 310, 334-342, 350-357, 376, 398-414, 418-422, 455-459, 497-507, 531-553, 557-624, 633->636, 636->632, 665-675, 679-680, 684, 694, 715, 722, 728->731, 732
nvtabular/inference/triton/benchmarking_tools.py 52 52 10 0 0% 2-103
nvtabular/inference/triton/data_conversions.py 87 3 58 4 95% 32-33, 84
nvtabular/inference/triton/model.py 144 144 76 0 0% 27-280
nvtabular/inference/triton/model_config_pb2.py 299 0 2 0 100%
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 88 88 30 0 0% 16-189
nvtabular/io/csv.py 57 6 20 5 86% 22-23, 99, 103->107, 108, 110, 124
nvtabular/io/dask.py 183 8 72 11 93% 111, 114, 150, 398, 408, 425->428, 436, 440->442, 442->438, 447, 449
nvtabular/io/dataframe_engine.py 61 5 28 6 88% 19-20, 50, 69, 88->92, 92->97, 94->97, 97->116, 125
nvtabular/io/dataset.py 306 40 138 26 84% 44-45, 246, 248, 261, 270, 288-302, 405->475, 410-413, 418->428, 423-424, 435->433, 449->453, 464, 475->484, 534->538, 581, 700, 704-706, 708, 768-769, 796, 800->821, 803-804, 810, 816, 911-912, 1028-1033, 1039, 1089
nvtabular/io/dataset_engine.py 23 1 0 0 96% 45
nvtabular/io/hugectr.py 45 2 24 2 91% 34, 74->97, 101
nvtabular/io/parquet.py 492 23 156 14 94% 33-34, 92-100, 124->126, 213-215, 338-343, 381-386, 502->509, 570->575, 576-577, 697, 701, 705, 711, 743, 760, 764, 771->773, 881->exit, 891->896, 901->911, 916, 938
nvtabular/io/shuffle.py 31 6 16 5 77% 42, 44-45, 49, 59, 63
nvtabular/io/writer.py 173 13 66 5 92% 24-25, 51, 79, 125, 128, 207, 216, 219, 262, 283-285
nvtabular/io/writer_factory.py 18 2 8 2 85% 35, 60
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 327 12 138 9 95% 142-143, 233->235, 245-249, 295-296, 335->339, 410, 414-415, 445, 550, 558
nvtabular/loader/tensorflow.py 155 22 50 7 85% 57, 65-68, 78, 88, 296, 332, 347-349, 378-380, 390-398, 401-404
nvtabular/loader/tf_utils.py 55 10 20 5 80% 29->32, 32->34, 39->41, 43, 50-51, 58-60, 66-70
nvtabular/loader/torch.py 81 13 16 2 78% 25-27, 30-36, 111, 149-150
nvtabular/ops/__init__.py 21 0 0 0 100%
nvtabular/ops/bucketize.py 32 10 18 3 62% 52-54, 58, 61-64, 83-86
nvtabular/ops/categorify.py 573 67 323 47 85% 230, 232, 247, 251, 259, 267, 269, 296, 315-316, 331, 342->347, 350-357, 436-437, 455-456, 532->534, 655, 691, 720->723, 724-726, 733-734, 747-749, 750->718, 766, 774, 776, 783->exit, 806, 809->812, 820, 845, 850, 866->870, 877-880, 891, 895, 897, 909-912, 990, 992, 1021->1044, 1027->1044, 1045-1050, 1087, 1105->1110, 1109, 1119->1116, 1124->1116, 1132, 1140-1150
nvtabular/ops/clip.py 18 2 6 3 79% 43, 51->53, 54
nvtabular/ops/column_similarity.py 103 24 36 5 72% 19-20, 76->exit, 106, 178-179, 188-190, 198-214, 231->234, 235, 245
nvtabular/ops/data_stats.py 56 2 22 3 94% 91->93, 95, 97->87, 102
nvtabular/ops/difference_lag.py 25 0 8 1 97% 66->68
nvtabular/ops/dropna.py 8 0 0 0 100%
nvtabular/ops/fill.py 63 6 22 1 89% 62-66, 101, 127
nvtabular/ops/filter.py 20 1 6 1 92% 49
nvtabular/ops/groupby.py 92 4 56 6 92% 71, 80, 82, 92->94, 104->109, 180
nvtabular/ops/hash_bucket.py 29 2 18 2 87% 69, 99
nvtabular/ops/hashed_cross.py 28 3 13 4 83% 50, 63, 77->exit, 78
nvtabular/ops/join_external.py 89 7 38 6 90% 20-21, 113, 115, 117, 159, 176->178, 212
nvtabular/ops/join_groupby.py 84 5 30 2 94% 106, 109->118, 194-195, 198-199
nvtabular/ops/lambdaop.py 39 6 18 6 79% 59, 63, 77, 89, 94, 103
nvtabular/ops/list_slice.py 63 24 26 1 56% 21-22, 52-53, 100-114, 122-133
nvtabular/ops/logop.py 8 0 0 0 100%
nvtabular/ops/moments.py 65 0 20 0 100%
nvtabular/ops/normalize.py 71 8 14 1 87% 69, 77-78, 111-112, 134-135, 139
nvtabular/ops/operator.py 29 1 2 1 94% 25
nvtabular/ops/rename.py 23 3 14 3 84% 45, 66-68
nvtabular/ops/stat_operator.py 8 0 0 0 100%
nvtabular/ops/target_encoding.py 146 11 64 5 90% 147, 167->171, 174->183, 228-229, 232-233, 242-248, 339->342
nvtabular/tools/__init__.py 0 0 0 0 100%
nvtabular/tools/data_gen.py 236 1 62 1 99% 323
nvtabular/tools/dataset_inspector.py 49 7 18 1 79% 31-38
nvtabular/tools/inspector_script.py 46 46 0 0 0% 17-168
nvtabular/utils.py 102 43 46 8 52% 31-32, 36-37, 50, 61-62, 64-66, 69, 72, 78, 84, 90-126, 145, 149->153
nvtabular/worker.py 82 5 38 7 90% 24-25, 82->99, 91, 92->99, 99->102, 108, 110, 111->113
nvtabular/workflow.py 161 13 77 5 92% 28-29, 45, 141, 155-157, 261, 276-277, 295-296, 384

TOTAL 6352 1095 2534 293 80%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 80.47%
=========================== short test summary info ============================
SKIPPED [8] tests/unit/test_io.py:500: could not import 'uavro': No module named 'uavro'
SKIPPED [1] tests/unit/test_tf_dataloader.py:521: not working correctly in ci environment
===== 1 failed, 1145 passed, 9 skipped, 198 warnings in 938.11s (0:15:38) ======
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://github.com/gitapi/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins6486864595425670731.sh

@rjzamora
Collaborator Author

rerun tests

@nvidia-merlin-bot
Contributor

GitHub pull request #993 of commit 52568e05ca48eba3471d59e7e1af664f4edf40ac, no merge conflicts.
Running as SYSTEM
Setting status of 52568e05ca48eba3471d59e7e1af664f4edf40ac to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/2992/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/993/*:refs/remotes/origin/pr/993/* # timeout=10
 > git rev-parse 52568e05ca48eba3471d59e7e1af664f4edf40ac^{commit} # timeout=10
Checking out Revision 52568e05ca48eba3471d59e7e1af664f4edf40ac (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 52568e05ca48eba3471d59e7e1af664f4edf40ac # timeout=10
Commit message: "correct normalize for case that std=0"
 > git rev-list --no-walk 85f0563f801c87f2ca1c8213b49704d6606bfc8c # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins8875261578407297921.sh
Installing NVTabular
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pip in /var/jenkins_home/.local/lib/python3.8/site-packages (21.2.1)
Requirement already satisfied: setuptools in /var/jenkins_home/.local/lib/python3.8/site-packages (57.4.0)
Requirement already satisfied: wheel in /usr/local/lib/python3.8/dist-packages (0.36.2)
Requirement already satisfied: pybind11 in /var/jenkins_home/.local/lib/python3.8/site-packages (2.7.0)
running develop
running egg_info
creating nvtabular.egg-info
writing nvtabular.egg-info/PKG-INFO
writing dependency_links to nvtabular.egg-info/dependency_links.txt
writing requirements to nvtabular.egg-info/requires.txt
writing top-level names to nvtabular.egg-info/top_level.txt
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
running build_ext
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.8 -c flagcheck.cpp -o flagcheck.o -std=c++17
building 'nvtabular_cpp' extension
creating build
creating build/temp.linux-x86_64-3.8
creating build/temp.linux-x86_64-3.8/cpp
creating build/temp.linux-x86_64-3.8/cpp/nvtabular
creating build/temp.linux-x86_64-3.8/cpp/nvtabular/inference
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.5.3+71.g52568e0 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.5.3+71.g52568e0 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.5.3+71.g52568e0 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/categorify.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.5.3+71.g52568e0 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/fill.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -std=c++17 -fvisibility=hidden -g0
creating build/lib.linux-x86_64-3.8
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -o build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so -> 
Generating nvtabular/inference/triton/model_config_pb2.py from nvtabular/inference/triton/model_config.proto
Creating /var/jenkins_home/.local/lib/python3.8/site-packages/nvtabular.egg-link (link to .)
nvtabular 0.5.3+71.g52568e0 is already the active version in easy-install.pth

Installed /var/jenkins_home/workspace/nvtabular_tests/nvtabular
Processing dependencies for nvtabular==0.5.3+71.g52568e0
Searching for pyarrow==1.0.1
Best match: pyarrow 1.0.1
Adding pyarrow 1.0.1 to easy-install.pth file
Installing plasma_store script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for tdqm==0.0.1
Best match: tdqm 0.0.1
Adding tdqm 0.0.1 to easy-install.pth file

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for numba==0.53.1
Best match: numba 0.53.1
Adding numba 0.53.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for pandas==1.1.5
Best match: pandas 1.1.5
Adding pandas 1.1.5 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for distributed==2021.4.1
Best match: distributed 2021.4.1
Adding distributed 2021.4.1 to easy-install.pth file
Installing dask-ssh script to /var/jenkins_home/.local/bin
Installing dask-scheduler script to /var/jenkins_home/.local/bin
Installing dask-worker script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for dask==2021.4.1
Best match: dask 2021.4.1
Processing dask-2021.4.1-py3.8.egg
dask 2021.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg
Searching for PyYAML==5.4.1
Best match: PyYAML 5.4.1
Processing PyYAML-5.4.1-py3.8-linux-x86_64.egg
PyYAML 5.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/PyYAML-5.4.1-py3.8-linux-x86_64.egg
Searching for numpy==1.20.2
Best match: numpy 1.20.2
Adding numpy 1.20.2 to easy-install.pth file
Installing f2py script to /var/jenkins_home/.local/bin
Installing f2py3 script to /var/jenkins_home/.local/bin
Installing f2py3.8 script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for tqdm==4.61.2
Best match: tqdm 4.61.2
Processing tqdm-4.61.2-py3.8.egg
tqdm 4.61.2 is already the active version in easy-install.pth
Installing tqdm script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tqdm-4.61.2-py3.8.egg
Searching for setuptools==57.4.0
Best match: setuptools 57.4.0
Adding setuptools 57.4.0 to easy-install.pth file

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for llvmlite==0.36.0
Best match: llvmlite 0.36.0
Adding llvmlite 0.36.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for pytz==2021.1
Best match: pytz 2021.1
Adding pytz 2021.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for python-dateutil==2.8.2
Best match: python-dateutil 2.8.2
Adding python-dateutil 2.8.2 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for cloudpickle==1.6.0
Best match: cloudpickle 1.6.0
Processing cloudpickle-1.6.0-py3.8.egg
cloudpickle 1.6.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/cloudpickle-1.6.0-py3.8.egg
Searching for tornado==6.1
Best match: tornado 6.1
Processing tornado-6.1-py3.8-linux-x86_64.egg
tornado 6.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tornado-6.1-py3.8-linux-x86_64.egg
Searching for zict==2.0.0
Best match: zict 2.0.0
Processing zict-2.0.0-py3.8.egg
zict 2.0.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/zict-2.0.0-py3.8.egg
Searching for tblib==1.7.0
Best match: tblib 1.7.0
Processing tblib-1.7.0-py3.8.egg
tblib 1.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tblib-1.7.0-py3.8.egg
Searching for toolz==0.11.1
Best match: toolz 0.11.1
Processing toolz-0.11.1-py3.8.egg
toolz 0.11.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/toolz-0.11.1-py3.8.egg
Searching for sortedcontainers==2.4.0
Best match: sortedcontainers 2.4.0
Processing sortedcontainers-2.4.0-py3.8.egg
sortedcontainers 2.4.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/sortedcontainers-2.4.0-py3.8.egg
Searching for msgpack==1.0.2
Best match: msgpack 1.0.2
Processing msgpack-1.0.2-py3.8-linux-x86_64.egg
msgpack 1.0.2 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/msgpack-1.0.2-py3.8-linux-x86_64.egg
Searching for psutil==5.8.0
Best match: psutil 5.8.0
Processing psutil-5.8.0-py3.8-linux-x86_64.egg
psutil 5.8.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/psutil-5.8.0-py3.8-linux-x86_64.egg
Searching for click==8.0.1
Best match: click 8.0.1
Processing click-8.0.1-py3.8.egg
click 8.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/click-8.0.1-py3.8.egg
Searching for partd==1.2.0
Best match: partd 1.2.0
Processing partd-1.2.0-py3.8.egg
partd 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/partd-1.2.0-py3.8.egg
Searching for fsspec==2021.7.0
Best match: fsspec 2021.7.0
Processing fsspec-2021.7.0-py3.8.egg
fsspec 2021.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/fsspec-2021.7.0-py3.8.egg
Searching for six==1.15.0
Best match: six 1.15.0
Adding six 1.15.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for HeapDict==1.0.1
Best match: HeapDict 1.0.1
Processing HeapDict-1.0.1-py3.8.egg
HeapDict 1.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/HeapDict-1.0.1-py3.8.egg
Searching for locket==0.2.1
Best match: locket 0.2.1
Processing locket-0.2.1-py3.8.egg
locket 0.2.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/locket-0.2.1-py3.8.egg
Finished processing dependencies for nvtabular==0.5.3+71.g52568e0
Running black --check
All done! ✨ 🍰 ✨
109 files would be left unchanged.
Running flake8
Running isort
/usr/local/lib/python3.8/dist-packages/isort/main.py:141: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
warn(f"Likely recursive symlink detected to {resolved_path}")
/usr/local/lib/python3.8/dist-packages/isort/main.py:141: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/examples/scaling-criteo/imgs
warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 2 files
Running bandit
Running pylint
************* Module nvtabular.ops.categorify
nvtabular/ops/categorify.py:459:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
************* Module nvtabular.ops.fill
nvtabular/ops/fill.py:66:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
************* Module bench.datasets.tools.train_hugectr
bench/datasets/tools/train_hugectr.py:28:13: I1101: Module 'hugectr' has no 'solver_parser_helper' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
bench/datasets/tools/train_hugectr.py:41:16: I1101: Module 'hugectr' has no 'optimizer' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)


Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

Running flake8-nb
Building docs
make: Entering directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
2021-07-27 14:29:03.856349: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-07-27 14:29:05.228870: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-07-27 14:29:05.230138: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2021-07-27 14:29:05.231316: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 1 with properties:
pciBusID: 0000:08:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2021-07-27 14:29:05.231350: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-07-27 14:29:05.231404: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-07-27 14:29:05.231443: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-07-27 14:29:05.231482: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-07-27 14:29:05.231520: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-07-27 14:29:05.231571: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-07-27 14:29:05.231609: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-07-27 14:29:05.231653: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-07-27 14:29:05.236177: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0, 1
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.6) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
make: Leaving directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml
plugins: cov-2.12.1, forked-1.3.0, xdist-2.3.0
collected 1155 items

tests/unit/test_column_group.py .. [ 0%]
tests/unit/test_column_similarity.py ........................ [ 2%]
tests/unit/test_cpu_workflow.py ...... [ 2%]
tests/unit/test_dask_nvt.py ............................................ [ 6%]
..................................................................... [ 12%]
tests/unit/test_dataloader_backend.py . [ 12%]
tests/unit/test_io.py .................................................. [ 16%]
........................................................................ [ 23%]
........ssssssss.................................................. [ 28%]
tests/unit/test_notebooks.py ...... [ 29%]
tests/unit/test_ops.py ................................................. [ 33%]
........................................................................ [ 39%]
........................................................................ [ 46%]
........................................................................ [ 52%]
........................................................................ [ 58%]
........................................................................ [ 64%]
................................. [ 67%]
tests/unit/test_s3.py .. [ 67%]
tests/unit/test_tf_dataloader.py ....................................... [ 71%]
.................................s [ 74%]
tests/unit/test_tf_feature_columns.py . [ 74%]
tests/unit/test_tf_layers.py ........................................... [ 78%]
................................... [ 81%]
tests/unit/test_tools.py ...................... [ 82%]
tests/unit/test_torch_dataloader.py .................................... [ 86%]
.............................................. [ 90%]
tests/unit/test_triton_inference.py ....................... [ 92%]
tests/unit/test_workflow.py ............................................ [ 95%]
................................................ [100%]

=============================== warnings summary ===============================
tests/unit/test_io.py: 36 warnings
tests/unit/test_workflow.py: 44 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:86: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for execution. Please use the client argument to initialize a Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 52 warnings
tests/unit/test_workflow.py: 35 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dask.py:372: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for this write operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 20 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:476: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler is being used for this shuffle operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_ops.py::test_fill_missing[True-True-parquet]
tests/unit/test_ops.py::test_fill_missing[True-False-parquet]
tests/unit/test_ops.py::test_filter[parquet-0.1-True]
/usr/local/lib/python3.8/dist-packages/pandas/core/indexing.py:670: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
iloc._setitem_with_indexer(indexer, value)

tests/unit/test_ops.py::test_join_external[True-True-left-host-pandas-parquet]
tests/unit/test_ops.py::test_join_external[True-True-left-device-pandas-parquet]
tests/unit/test_ops.py::test_join_external[True-True-inner-host-pandas-parquet]
tests/unit/test_ops.py::test_join_external[True-True-inner-device-pandas-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/join_external.py:171: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
_ext.drop_duplicates(ignore_index=True, inplace=True)

tests/unit/test_ops.py::test_filter[parquet-0.1-True]
tests/unit/test_ops.py::test_filter[parquet-0.1-False]
tests/unit/test_ops.py::test_groupby_op[id-True]
tests/unit/test_ops.py::test_groupby_op[id-False]
/var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/core.py:6610: UserWarning: Insufficient elements for head. 1 elements requested, only 0 elements available. Try passing larger npartitions to head.
warnings.warn(msg.format(n, len(r)))

-- Docs: https://docs.pytest.org/en/stable/warnings.html

---------- coverage: platform linux, python 3.8.10-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

examples/multi-gpu-movielens/torch_trainer.py 65 0 6 1 99% 32->36
nvtabular/__init__.py 12 0 0 0 100%
nvtabular/column_group.py 157 18 82 5 87% 54, 87, 128, 152-165, 214, 301
nvtabular/dispatch.py 251 44 122 21 81% 35-38, 43-45, 51-61, 68-69, 86, 93, 101, 112, 118, 123->125, 136, 159-162, 201, 208, 224, 231, 262->267, 265, 268, 271->275, 308, 319-322, 349-352, 382, 386, 427, 451, 453, 460
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 132 78 88 15 38% 29, 98, 102, 113-129, 139, 142-157, 161, 165-166, 172-197, 206-216, 219-226, 228->231, 232, 237-277, 280
nvtabular/framework_utils/tensorflow/layers/__init__.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 12 85 6 91% 60, 68->49, 122, 179, 231-239, 335->343, 357->360, 363-364, 367
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 25 20 1 43% 49, 74-103, 106-110, 113
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 37-38, 41-60, 71-84, 87
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 30 1 12 1 95% 47
nvtabular/framework_utils/torch/models.py 45 0 28 0 100%
nvtabular/framework_utils/torch/utils.py 75 4 30 2 94% 64, 118-120
nvtabular/inference/__init__.py 0 0 0 0 100%
nvtabular/inference/triton/__init__.py 298 138 128 15 54% 80-84, 138-188, 233-277, 308, 310, 334-342, 350-357, 376, 398-414, 418-422, 455-459, 497-507, 531-553, 557-624, 633->636, 636->632, 665-675, 679-680, 684, 694, 715, 722, 728->731, 732
nvtabular/inference/triton/benchmarking_tools.py 52 52 10 0 0% 2-103
nvtabular/inference/triton/data_conversions.py 87 3 58 4 95% 32-33, 84
nvtabular/inference/triton/model.py 144 144 76 0 0% 27-280
nvtabular/inference/triton/model_config_pb2.py 299 0 2 0 100%
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 88 88 30 0 0% 16-189
nvtabular/io/csv.py 57 6 20 5 86% 22-23, 99, 103->107, 108, 110, 124
nvtabular/io/dask.py 183 8 72 11 93% 111, 114, 150, 398, 408, 425->428, 436, 440->442, 442->438, 447, 449
nvtabular/io/dataframe_engine.py 61 5 28 6 88% 19-20, 50, 69, 88->92, 92->97, 94->97, 97->116, 125
nvtabular/io/dataset.py 306 40 138 26 84% 44-45, 246, 248, 261, 270, 288-302, 405->475, 410-413, 418->428, 423-424, 435->433, 449->453, 464, 475->484, 534->538, 581, 700, 704-706, 708, 768-769, 796, 800->821, 803-804, 810, 816, 911-912, 1028-1033, 1039, 1089
nvtabular/io/dataset_engine.py 23 1 0 0 96% 45
nvtabular/io/hugectr.py 45 2 24 2 91% 34, 74->97, 101
nvtabular/io/parquet.py 492 23 156 14 94% 33-34, 92-100, 124->126, 213-215, 338-343, 381-386, 502->509, 570->575, 576-577, 697, 701, 705, 711, 743, 760, 764, 771->773, 881->exit, 891->896, 901->911, 916, 938
nvtabular/io/shuffle.py 31 6 16 5 77% 42, 44-45, 49, 59, 63
nvtabular/io/writer.py 173 13 66 5 92% 24-25, 51, 79, 125, 128, 207, 216, 219, 262, 283-285
nvtabular/io/writer_factory.py 18 2 8 2 85% 35, 60
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 327 12 138 9 95% 142-143, 233->235, 245-249, 295-296, 335->339, 410, 414-415, 445, 550, 558
nvtabular/loader/tensorflow.py 155 22 50 7 85% 57, 65-68, 78, 88, 296, 332, 347-349, 378-380, 390-398, 401-404
nvtabular/loader/tf_utils.py 55 10 20 5 80% 29->32, 32->34, 39->41, 43, 50-51, 58-60, 66-70
nvtabular/loader/torch.py 81 13 16 2 78% 25-27, 30-36, 111, 149-150
nvtabular/ops/__init__.py 21 0 0 0 100%
nvtabular/ops/bucketize.py 32 10 18 3 62% 52-54, 58, 61-64, 83-86
nvtabular/ops/categorify.py 573 67 323 47 85% 230, 232, 247, 251, 259, 267, 269, 296, 315-316, 331, 342->347, 350-357, 436-437, 455-456, 532->534, 655, 691, 720->723, 724-726, 733-734, 747-749, 750->718, 766, 774, 776, 783->exit, 806, 809->812, 820, 845, 850, 866->870, 877-880, 891, 895, 897, 909-912, 990, 992, 1021->1044, 1027->1044, 1045-1050, 1087, 1105->1110, 1109, 1119->1116, 1124->1116, 1132, 1140-1150
nvtabular/ops/clip.py 18 2 6 3 79% 43, 51->53, 54
nvtabular/ops/column_similarity.py 103 24 36 5 72% 19-20, 76->exit, 106, 178-179, 188-190, 198-214, 231->234, 235, 245
nvtabular/ops/data_stats.py 56 2 22 3 94% 91->93, 95, 97->87, 102
nvtabular/ops/difference_lag.py 25 0 8 1 97% 66->68
nvtabular/ops/dropna.py 8 0 0 0 100%
nvtabular/ops/fill.py 63 6 22 1 89% 62-66, 101, 127
nvtabular/ops/filter.py 20 1 6 1 92% 49
nvtabular/ops/groupby.py 92 4 56 6 92% 71, 80, 82, 92->94, 104->109, 180
nvtabular/ops/hash_bucket.py 29 2 18 2 87% 69, 99
nvtabular/ops/hashed_cross.py 28 3 13 4 83% 50, 63, 77->exit, 78
nvtabular/ops/join_external.py 89 7 38 6 90% 20-21, 113, 115, 117, 159, 176->178, 212
nvtabular/ops/join_groupby.py 84 5 30 2 94% 106, 109->118, 194-195, 198-199
nvtabular/ops/lambdaop.py 39 6 18 6 79% 59, 63, 77, 89, 94, 103
nvtabular/ops/list_slice.py 63 24 26 1 56% 21-22, 52-53, 100-114, 122-133
nvtabular/ops/logop.py 8 0 0 0 100%
nvtabular/ops/moments.py 65 0 20 0 100%
nvtabular/ops/normalize.py 71 8 14 1 87% 69, 77-78, 111-112, 134-135, 139
nvtabular/ops/operator.py 29 1 2 1 94% 25
nvtabular/ops/rename.py 23 3 14 3 84% 45, 66-68
nvtabular/ops/stat_operator.py 8 0 0 0 100%
nvtabular/ops/target_encoding.py 146 11 64 5 90% 147, 167->171, 174->183, 228-229, 232-233, 242-248, 339->342
nvtabular/tools/__init__.py 0 0 0 0 100%
nvtabular/tools/data_gen.py 236 1 62 1 99% 323
nvtabular/tools/dataset_inspector.py 49 7 18 1 79% 31-38
nvtabular/tools/inspector_script.py 46 46 0 0 0% 17-168
nvtabular/utils.py 102 43 46 8 52% 31-32, 36-37, 50, 61-62, 64-66, 69, 72, 78, 84, 90-126, 145, 149->153
nvtabular/worker.py 82 5 38 7 90% 24-25, 82->99, 91, 92->99, 99->102, 108, 110, 111->113
nvtabular/workflow.py 161 13 77 5 92% 28-29, 45, 141, 155-157, 261, 276-277, 295-296, 384

TOTAL 6352 1095 2534 293 80%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 80.47%
=========================== short test summary info ============================
SKIPPED [8] tests/unit/test_io.py:500: could not import 'uavro': No module named 'uavro'
SKIPPED [1] tests/unit/test_tf_dataloader.py:521: not working correctly in ci environment
========== 1146 passed, 9 skipped, 198 warnings in 948.51s (0:15:48) ===========
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://github.com/gitapi/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins4919202167961871118.sh

@rjzamora
Collaborator Author

rjzamora commented Jul 27, 2021

Not sure what might have caused the first CI failure - I cannot reproduce it locally, and I do not expect it to be related to this PR.

@rjzamora rjzamora merged commit b460687 into NVIDIA-Merlin:main Jul 27, 2021
@rjzamora rjzamora deleted the normalize-fix branch July 27, 2021 14:51
3 participants