Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sample_dtypes method to Dataset #1028

Merged
merged 7 commits into from
Aug 10, 2021

Conversation

rjzamora
Copy link
Collaborator

The current Dataset.fit logic uses a "dangerous" to_ddf().head(1) call to sample the "real" pandas/cudf data to infer the data types. It makes sense that we cannot trust the dtypes attribute of the underlying Dask collection (for cudf-based collections), but there is no guarentee that head(n) will return the first n rows of the dataset unless there are >=n rows in the first Dask partition. To avoid this problem, this PR adds a simple Dataset.sample_dtypes method that will iterate through the Dask partitions until a non-empty partition is found. Although this may be slow for a large Dask collection with many empty partitions, the common case should have the same performance (and we are clearly fixing a bug).

@rjzamora
Copy link
Collaborator Author

cc @oyilmaz-nvidia

@nvidia-merlin-bot
Copy link
Contributor

Click to view CI Results
GitHub pull request #1028 of commit ae1ab94871231e8c1de9b5501c394d50eb92ce95, no merge conflicts.
Running as SYSTEM
Setting status of ae1ab94871231e8c1de9b5501c394d50eb92ce95 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/3113/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/1028/*:refs/remotes/origin/pr/1028/* # timeout=10
 > git rev-parse ae1ab94871231e8c1de9b5501c394d50eb92ce95^{commit} # timeout=10
Checking out Revision ae1ab94871231e8c1de9b5501c394d50eb92ce95 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f ae1ab94871231e8c1de9b5501c394d50eb92ce95 # timeout=10
Commit message: "Merge remote-tracking branch 'upstream/main' into add-sample_dtypes"
 > git rev-list --no-walk 3fe5693d7bc55d4218824b8200b7d52fbc303628 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins127774214642998396.sh
Installing NVTabular
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pip in /var/jenkins_home/.local/lib/python3.8/site-packages (21.2.3)
Requirement already satisfied: setuptools in /var/jenkins_home/.local/lib/python3.8/site-packages (57.4.0)
Requirement already satisfied: wheel in /var/jenkins_home/.local/lib/python3.8/site-packages (0.37.0)
Requirement already satisfied: pybind11 in /var/jenkins_home/.local/lib/python3.8/site-packages (2.7.1)
running develop
running egg_info
creating nvtabular.egg-info
writing nvtabular.egg-info/PKG-INFO
writing dependency_links to nvtabular.egg-info/dependency_links.txt
writing requirements to nvtabular.egg-info/requires.txt
writing top-level names to nvtabular.egg-info/top_level.txt
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
running build_ext
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.8 -c flagcheck.cpp -o flagcheck.o -std=c++17
building 'nvtabular_cpp' extension
creating build
creating build/temp.linux-x86_64-3.8
creating build/temp.linux-x86_64-3.8/cpp
creating build/temp.linux-x86_64-3.8/cpp/nvtabular
creating build/temp.linux-x86_64-3.8/cpp/nvtabular/inference
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+7.gae1ab94 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+7.gae1ab94 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+7.gae1ab94 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/categorify.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+7.gae1ab94 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/fill.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -std=c++17 -fvisibility=hidden -g0
creating build/lib.linux-x86_64-3.8
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -o build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so -> 
Generating nvtabular/inference/triton/model_config_pb2.py from nvtabular/inference/triton/model_config.proto
Creating /var/jenkins_home/.local/lib/python3.8/site-packages/nvtabular.egg-link (link to .)
nvtabular 0.6.0+7.gae1ab94 is already the active version in easy-install.pth

Installed /var/jenkins_home/workspace/nvtabular_tests/nvtabular
Processing dependencies for nvtabular==0.6.0+7.gae1ab94
Searching for pyarrow==1.0.1
Best match: pyarrow 1.0.1
Adding pyarrow 1.0.1 to easy-install.pth file
Installing plasma_store script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for tqdm==4.61.2
Best match: tqdm 4.61.2
Processing tqdm-4.61.2-py3.8.egg
tqdm 4.61.2 is already the active version in easy-install.pth
Installing tqdm script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tqdm-4.61.2-py3.8.egg
Searching for numba==0.53.1
Best match: numba 0.53.1
Adding numba 0.53.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for pandas==1.1.5
Best match: pandas 1.1.5
Adding pandas 1.1.5 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for distributed==2021.4.1
Best match: distributed 2021.4.1
Adding distributed 2021.4.1 to easy-install.pth file
Installing dask-ssh script to /var/jenkins_home/.local/bin
Installing dask-scheduler script to /var/jenkins_home/.local/bin
Installing dask-worker script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for dask==2021.4.1
Best match: dask 2021.4.1
Processing dask-2021.4.1-py3.8.egg
dask 2021.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg
Searching for PyYAML==5.4.1
Best match: PyYAML 5.4.1
Processing PyYAML-5.4.1-py3.8-linux-x86_64.egg
PyYAML 5.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/PyYAML-5.4.1-py3.8-linux-x86_64.egg
Searching for numpy==1.20.2
Best match: numpy 1.20.2
Adding numpy 1.20.2 to easy-install.pth file
Installing f2py script to /var/jenkins_home/.local/bin
Installing f2py3 script to /var/jenkins_home/.local/bin
Installing f2py3.8 script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for llvmlite==0.36.0
Best match: llvmlite 0.36.0
Adding llvmlite 0.36.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for setuptools==57.4.0
Best match: setuptools 57.4.0
Adding setuptools 57.4.0 to easy-install.pth file

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for pytz==2021.1
Best match: pytz 2021.1
Adding pytz 2021.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for python-dateutil==2.8.2
Best match: python-dateutil 2.8.2
Adding python-dateutil 2.8.2 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for tblib==1.7.0
Best match: tblib 1.7.0
Processing tblib-1.7.0-py3.8.egg
tblib 1.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tblib-1.7.0-py3.8.egg
Searching for cloudpickle==1.6.0
Best match: cloudpickle 1.6.0
Processing cloudpickle-1.6.0-py3.8.egg
cloudpickle 1.6.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/cloudpickle-1.6.0-py3.8.egg
Searching for tornado==6.1
Best match: tornado 6.1
Processing tornado-6.1-py3.8-linux-x86_64.egg
tornado 6.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tornado-6.1-py3.8-linux-x86_64.egg
Searching for zict==2.0.0
Best match: zict 2.0.0
Processing zict-2.0.0-py3.8.egg
zict 2.0.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/zict-2.0.0-py3.8.egg
Searching for click==8.0.1
Best match: click 8.0.1
Processing click-8.0.1-py3.8.egg
click 8.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/click-8.0.1-py3.8.egg
Searching for msgpack==1.0.2
Best match: msgpack 1.0.2
Processing msgpack-1.0.2-py3.8-linux-x86_64.egg
msgpack 1.0.2 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/msgpack-1.0.2-py3.8-linux-x86_64.egg
Searching for sortedcontainers==2.4.0
Best match: sortedcontainers 2.4.0
Processing sortedcontainers-2.4.0-py3.8.egg
sortedcontainers 2.4.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/sortedcontainers-2.4.0-py3.8.egg
Searching for toolz==0.11.1
Best match: toolz 0.11.1
Processing toolz-0.11.1-py3.8.egg
toolz 0.11.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/toolz-0.11.1-py3.8.egg
Searching for psutil==5.8.0
Best match: psutil 5.8.0
Processing psutil-5.8.0-py3.8-linux-x86_64.egg
psutil 5.8.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/psutil-5.8.0-py3.8-linux-x86_64.egg
Searching for partd==1.2.0
Best match: partd 1.2.0
Processing partd-1.2.0-py3.8.egg
partd 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/partd-1.2.0-py3.8.egg
Searching for fsspec==2021.7.0
Best match: fsspec 2021.7.0
Processing fsspec-2021.7.0-py3.8.egg
fsspec 2021.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/fsspec-2021.7.0-py3.8.egg
Searching for six==1.15.0
Best match: six 1.15.0
Adding six 1.15.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for HeapDict==1.0.1
Best match: HeapDict 1.0.1
Processing HeapDict-1.0.1-py3.8.egg
HeapDict 1.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/HeapDict-1.0.1-py3.8.egg
Searching for locket==0.2.1
Best match: locket 0.2.1
Processing locket-0.2.1-py3.8.egg
locket 0.2.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/locket-0.2.1-py3.8.egg
Finished processing dependencies for nvtabular==0.6.0+7.gae1ab94
Running black --check
All done! ✨ 🍰 ✨
110 files would be left unchanged.
Running flake8
Running isort
/usr/local/lib/python3.8/dist-packages/isort/main.py:141: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
warn(f"Likely recursive symlink detected to {resolved_path}")
/usr/local/lib/python3.8/dist-packages/isort/main.py:141: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/examples/scaling-criteo/imgs
warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 2 files
Running bandit
Running pylint
************* Module nvtabular.ops.categorify
nvtabular/ops/categorify.py:459:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
************* Module nvtabular.ops.fill
nvtabular/ops/fill.py:66:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
************* Module bench.datasets.tools.train_hugectr
bench/datasets/tools/train_hugectr.py:28:13: I1101: Module 'hugectr' has no 'solver_parser_helper' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
bench/datasets/tools/train_hugectr.py:41:16: I1101: Module 'hugectr' has no 'optimizer' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)


Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

Running flake8-nb
Building docs
make: Entering directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
2021-08-10 17:25:45.887787: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-08-10 17:25:47.264163: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-08-10 17:25:47.265241: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2021-08-10 17:25:47.266245: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 1 with properties:
pciBusID: 0000:08:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2021-08-10 17:25:47.266275: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-08-10 17:25:47.266322: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-08-10 17:25:47.266356: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-08-10 17:25:47.266407: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-08-10 17:25:47.266443: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-08-10 17:25:47.266494: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-08-10 17:25:47.266530: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-08-10 17:25:47.266571: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-08-10 17:25:47.271020: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0, 1
/usr/lib/python3/dist-packages/requests/init.py:89: RequestsDependencyWarning: urllib3 (1.26.6) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
make: Leaving directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml
plugins: cov-2.12.1, forked-1.3.0, xdist-2.3.0
collected 1161 items

tests/unit/test_column_group.py .. [ 0%]
tests/unit/test_column_similarity.py ........................ [ 2%]
tests/unit/test_cpu_workflow.py ...... [ 2%]
tests/unit/test_dask_nvt.py ............................................ [ 6%]
..................................................................... [ 12%]
tests/unit/test_dataloader_backend.py . [ 12%]
tests/unit/test_io.py .................................................. [ 16%]
........................................................................ [ 23%]
........ssssssss.................................................. [ 28%]
tests/unit/test_notebooks.py ...... [ 29%]
tests/unit/test_ops.py ................................................. [ 33%]
........................................................................ [ 39%]
........................................................................ [ 45%]
........................................................................ [ 52%]
........................................................................ [ 58%]
........................................................................ [ 64%]
................................. [ 67%]
tests/unit/test_s3.py .. [ 67%]
tests/unit/test_tf_dataloader.py ....................................... [ 70%]
.................................s [ 73%]
tests/unit/test_tf_feature_columns.py . [ 73%]
tests/unit/test_tf_layers.py ........................................... [ 77%]
................................... [ 80%]
tests/unit/test_tools.py ...................... [ 82%]
tests/unit/test_torch_dataloader.py .................................... [ 85%]
.............................................. [ 89%]
tests/unit/test_torch_layers.py . [ 89%]
tests/unit/test_triton_inference.py ............................ [ 92%]
tests/unit/test_workflow.py ............................................ [ 95%]
................................................ [100%]

=============================== warnings summary ===============================
tests/unit/test_io.py: 36 warnings
tests/unit/test_workflow.py: 44 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:86: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for execution. Please use the client argument to initialize a Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 52 warnings
tests/unit/test_workflow.py: 35 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dask.py:372: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for this write operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 20 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:476: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler is being used for this shuffle operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_ops.py::test_fill_missing[True-True-parquet]
tests/unit/test_ops.py::test_fill_missing[True-False-parquet]
tests/unit/test_ops.py::test_filter[parquet-0.1-True]
/usr/local/lib/python3.8/dist-packages/pandas/core/indexing.py:670: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
iloc._setitem_with_indexer(indexer, value)

tests/unit/test_ops.py::test_join_external[True-True-left-host-pandas-parquet]
tests/unit/test_ops.py::test_join_external[True-True-left-device-pandas-parquet]
tests/unit/test_ops.py::test_join_external[True-True-inner-host-pandas-parquet]
tests/unit/test_ops.py::test_join_external[True-True-inner-device-pandas-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/join_external.py:171: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
_ext.drop_duplicates(ignore_index=True, inplace=True)

-- Docs: https://docs.pytest.org/en/stable/warnings.html

---------- coverage: platform linux, python 3.8.10-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

examples/multi-gpu-movielens/torch_trainer.py 65 0 6 1 99% 32->36
nvtabular/init.py 12 0 0 0 100%
nvtabular/column_group.py 157 18 82 5 87% 54, 87, 128, 152-165, 214, 301
nvtabular/dispatch.py 253 45 124 22 81% 35-38, 43-45, 51-61, 68-69, 86, 93, 101, 112, 118, 123->125, 136, 159-162, 201, 204, 210, 226, 233, 264->269, 267, 270, 273->277, 310, 321-324, 351-354, 384, 388, 429, 453, 455, 462
nvtabular/framework_utils/init.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/init.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 132 78 88 15 38% 29, 98, 102, 113-129, 139, 142-157, 161, 165-166, 172-197, 206-216, 219-226, 228->231, 232, 237-277, 280
nvtabular/framework_utils/tensorflow/layers/init.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 12 85 6 91% 60, 68->49, 122, 179, 231-239, 335->343, 357->360, 363-364, 367
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 25 20 1 43% 49, 74-103, 106-110, 113
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 37-38, 41-60, 71-84, 87
nvtabular/framework_utils/torch/init.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/init.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 32 1 14 1 96% 50
nvtabular/framework_utils/torch/models.py 45 0 28 0 100%
nvtabular/framework_utils/torch/utils.py 75 4 30 2 94% 64, 118-120
nvtabular/inference/init.py 0 0 0 0 100%
nvtabular/inference/triton/init.py 298 131 128 13 57% 80-84, 138-188, 233-277, 308, 334-342, 350-357, 376, 398-414, 455-459, 497-507, 531-553, 557-624, 633->636, 636->632, 665-675, 684, 694, 715, 722, 728->731, 732
nvtabular/inference/triton/benchmarking_tools.py 52 52 10 0 0% 2-103
nvtabular/inference/triton/data_conversions.py 87 3 58 4 95% 32-33, 84
nvtabular/inference/triton/model.py 160 160 80 0 0% 27-305
nvtabular/inference/triton/model_config_pb2.py 299 0 2 0 100%
nvtabular/io/init.py 4 0 0 0 100%
nvtabular/io/avro.py 88 88 30 0 0% 16-189
nvtabular/io/csv.py 57 6 20 5 86% 22-23, 99, 103->107, 108, 110, 124
nvtabular/io/dask.py 183 8 72 11 93% 111, 114, 150, 398, 408, 425->428, 436, 440->442, 442->438, 447, 449
nvtabular/io/dataframe_engine.py 61 5 28 6 88% 19-20, 50, 69, 88->92, 92->97, 94->97, 97->116, 125
nvtabular/io/dataset.py 322 43 150 28 84% 44-45, 246, 248, 261, 270, 288-302, 405->475, 410-413, 418->428, 423-424, 435->433, 449->453, 464, 475->484, 534->538, 581, 703, 705, 707, 713, 717-719, 721, 781-782, 809, 816-817, 823, 829, 924-925, 1041-1046, 1052, 1119
nvtabular/io/dataset_engine.py 23 1 0 0 96% 45
nvtabular/io/hugectr.py 45 2 24 2 91% 34, 74->97, 101
nvtabular/io/parquet.py 492 23 156 14 94% 33-34, 92-100, 124->126, 213-215, 338-343, 381-386, 502->509, 570->575, 576-577, 697, 701, 705, 711, 743, 760, 764, 771->773, 881->exit, 891->896, 901->911, 916, 938
nvtabular/io/shuffle.py 31 6 16 5 77% 42, 44-45, 49, 59, 63
nvtabular/io/writer.py 175 13 68 5 92% 24-25, 51, 79, 125, 128, 212, 221, 224, 267, 288-290
nvtabular/io/writer_factory.py 18 2 8 2 85% 35, 60
nvtabular/loader/init.py 0 0 0 0 100%
nvtabular/loader/backend.py 327 12 138 9 95% 142-143, 233->235, 245-249, 295-296, 335->339, 410, 414-415, 445, 550, 558
nvtabular/loader/tensorflow.py 155 22 50 7 85% 57, 65-68, 78, 88, 296, 332, 347-349, 378-380, 390-398, 401-404
nvtabular/loader/tf_utils.py 55 10 20 5 80% 29->32, 32->34, 39->41, 43, 50-51, 58-60, 66-70
nvtabular/loader/torch.py 81 13 16 2 78% 25-27, 30-36, 111, 149-150
nvtabular/ops/init.py 21 0 0 0 100%
nvtabular/ops/bucketize.py 32 10 18 3 62% 52-54, 58, 61-64, 83-86
nvtabular/ops/categorify.py 573 67 323 47 85% 230, 232, 247, 251, 259, 267, 269, 296, 315-316, 331, 342->347, 350-357, 436-437, 455-456, 532->534, 655, 691, 720->723, 724-726, 733-734, 747-749, 750->718, 766, 774, 776, 783->exit, 806, 809->812, 820, 845, 850, 866->870, 877-880, 891, 895, 897, 909-912, 990, 992, 1021->1044, 1027->1044, 1045-1050, 1087, 1105->1110, 1109, 1119->1116, 1124->1116, 1132, 1140-1150
nvtabular/ops/clip.py 18 2 6 3 79% 43, 51->53, 54
nvtabular/ops/column_similarity.py 103 24 36 5 72% 19-20, 76->exit, 106, 178-179, 188-190, 198-214, 231->234, 235, 245
nvtabular/ops/data_stats.py 56 2 22 3 94% 91->93, 95, 97->87, 102
nvtabular/ops/difference_lag.py 25 0 8 1 97% 66->68
nvtabular/ops/dropna.py 8 0 0 0 100%
nvtabular/ops/fill.py 63 6 22 1 89% 62-66, 101, 127
nvtabular/ops/filter.py 20 1 6 1 92% 49
nvtabular/ops/groupby.py 92 4 56 6 92% 71, 80, 82, 92->94, 104->109, 180
nvtabular/ops/hash_bucket.py 29 2 18 2 87% 69, 99
nvtabular/ops/hashed_cross.py 28 3 13 4 83% 50, 63, 77->exit, 78
nvtabular/ops/join_external.py 89 7 38 6 90% 20-21, 113, 115, 117, 159, 176->178, 212
nvtabular/ops/join_groupby.py 84 5 30 2 94% 106, 109->118, 194-195, 198-199
nvtabular/ops/lambdaop.py 39 6 18 6 79% 59, 63, 77, 89, 94, 103
nvtabular/ops/list_slice.py 63 24 26 1 56% 21-22, 52-53, 100-114, 122-133
nvtabular/ops/logop.py 8 0 0 0 100%
nvtabular/ops/moments.py 65 0 20 0 100%
nvtabular/ops/normalize.py 71 8 14 1 87% 69, 77-78, 111-112, 134-135, 139
nvtabular/ops/operator.py 29 1 2 1 94% 25
nvtabular/ops/rename.py 23 3 14 3 84% 45, 66-68
nvtabular/ops/stat_operator.py 8 0 0 0 100%
nvtabular/ops/target_encoding.py 146 11 64 5 90% 147, 167->171, 174->183, 228-229, 232-233, 242-248, 339->342
nvtabular/tools/init.py 0 0 0 0 100%
nvtabular/tools/data_gen.py 236 1 62 1 99% 323
nvtabular/tools/dataset_inspector.py 49 7 18 1 79% 31-38
nvtabular/tools/inspector_script.py 46 46 0 0 0% 17-168
nvtabular/utils.py 102 43 46 8 52% 31-32, 36-37, 50, 61-62, 64-66, 69, 72, 78, 84, 90-126, 145, 149->153
nvtabular/worker.py 82 5 38 7 90% 24-25, 82->99, 91, 92->99, 99->102, 108, 110, 111->113
nvtabular/workflow.py 161 13 77 5 92% 28-29, 45, 141, 155-157, 261, 276-277, 295-296, 384

TOTAL 6390 1108 2556 294 80%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 80.45%
=========================== short test summary info ============================
SKIPPED [8] tests/unit/test_io.py:500: could not import 'uavro': No module named 'uavro'
SKIPPED [1] tests/unit/test_tf_dataloader.py:521: not working correctly in ci environment
========== 1152 passed, 9 skipped, 194 warnings in 1015.38s (0:16:55) ==========
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://github.com/gitapi/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins3378048853657505890.sh

@nvidia-merlin-bot
Copy link
Contributor

Click to view CI Results
GitHub pull request #1028 of commit 170529f9935bd87ec5111ee699a793ea60840488, no merge conflicts.
Running as SYSTEM
Setting status of 170529f9935bd87ec5111ee699a793ea60840488 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/3114/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/1028/*:refs/remotes/origin/pr/1028/* # timeout=10
 > git rev-parse 170529f9935bd87ec5111ee699a793ea60840488^{commit} # timeout=10
Checking out Revision 170529f9935bd87ec5111ee699a793ea60840488 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 170529f9935bd87ec5111ee699a793ea60840488 # timeout=10
Commit message: "simplify logic a bit"
 > git rev-list --no-walk ae1ab94871231e8c1de9b5501c394d50eb92ce95 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins7105106784579247432.sh
Installing NVTabular
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pip in /var/jenkins_home/.local/lib/python3.8/site-packages (21.2.3)
Requirement already satisfied: setuptools in /var/jenkins_home/.local/lib/python3.8/site-packages (57.4.0)
Requirement already satisfied: wheel in /var/jenkins_home/.local/lib/python3.8/site-packages (0.37.0)
Requirement already satisfied: pybind11 in /var/jenkins_home/.local/lib/python3.8/site-packages (2.7.1)
running develop
running egg_info
creating nvtabular.egg-info
writing nvtabular.egg-info/PKG-INFO
writing dependency_links to nvtabular.egg-info/dependency_links.txt
writing requirements to nvtabular.egg-info/requires.txt
writing top-level names to nvtabular.egg-info/top_level.txt
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
running build_ext
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.8 -c flagcheck.cpp -o flagcheck.o -std=c++17
building 'nvtabular_cpp' extension
creating build
creating build/temp.linux-x86_64-3.8
creating build/temp.linux-x86_64-3.8/cpp
creating build/temp.linux-x86_64-3.8/cpp/nvtabular
creating build/temp.linux-x86_64-3.8/cpp/nvtabular/inference
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+8.g170529f -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+8.g170529f -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+8.g170529f -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/categorify.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+8.g170529f -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/fill.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -std=c++17 -fvisibility=hidden -g0
creating build/lib.linux-x86_64-3.8
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -o build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so -> 
Generating nvtabular/inference/triton/model_config_pb2.py from nvtabular/inference/triton/model_config.proto
Creating /var/jenkins_home/.local/lib/python3.8/site-packages/nvtabular.egg-link (link to .)
nvtabular 0.6.0+8.g170529f is already the active version in easy-install.pth

Installed /var/jenkins_home/workspace/nvtabular_tests/nvtabular
Processing dependencies for nvtabular==0.6.0+8.g170529f
Searching for pyarrow==1.0.1
Best match: pyarrow 1.0.1
Adding pyarrow 1.0.1 to easy-install.pth file
Installing plasma_store script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for tqdm==4.61.2
Best match: tqdm 4.61.2
Processing tqdm-4.61.2-py3.8.egg
tqdm 4.61.2 is already the active version in easy-install.pth
Installing tqdm script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tqdm-4.61.2-py3.8.egg
Searching for numba==0.53.1
Best match: numba 0.53.1
Adding numba 0.53.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for pandas==1.1.5
Best match: pandas 1.1.5
Adding pandas 1.1.5 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for distributed==2021.4.1
Best match: distributed 2021.4.1
Adding distributed 2021.4.1 to easy-install.pth file
Installing dask-ssh script to /var/jenkins_home/.local/bin
Installing dask-scheduler script to /var/jenkins_home/.local/bin
Installing dask-worker script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for dask==2021.4.1
Best match: dask 2021.4.1
Processing dask-2021.4.1-py3.8.egg
dask 2021.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg
Searching for PyYAML==5.4.1
Best match: PyYAML 5.4.1
Processing PyYAML-5.4.1-py3.8-linux-x86_64.egg
PyYAML 5.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/PyYAML-5.4.1-py3.8-linux-x86_64.egg
Searching for numpy==1.20.2
Best match: numpy 1.20.2
Adding numpy 1.20.2 to easy-install.pth file
Installing f2py script to /var/jenkins_home/.local/bin
Installing f2py3 script to /var/jenkins_home/.local/bin
Installing f2py3.8 script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for setuptools==57.4.0
Best match: setuptools 57.4.0
Adding setuptools 57.4.0 to easy-install.pth file

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for llvmlite==0.36.0
Best match: llvmlite 0.36.0
Adding llvmlite 0.36.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for pytz==2021.1
Best match: pytz 2021.1
Adding pytz 2021.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for python-dateutil==2.8.2
Best match: python-dateutil 2.8.2
Adding python-dateutil 2.8.2 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for tornado==6.1
Best match: tornado 6.1
Processing tornado-6.1-py3.8-linux-x86_64.egg
tornado 6.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tornado-6.1-py3.8-linux-x86_64.egg
Searching for click==8.0.1
Best match: click 8.0.1
Processing click-8.0.1-py3.8.egg
click 8.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/click-8.0.1-py3.8.egg
Searching for cloudpickle==1.6.0
Best match: cloudpickle 1.6.0
Processing cloudpickle-1.6.0-py3.8.egg
cloudpickle 1.6.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/cloudpickle-1.6.0-py3.8.egg
Searching for tblib==1.7.0
Best match: tblib 1.7.0
Processing tblib-1.7.0-py3.8.egg
tblib 1.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tblib-1.7.0-py3.8.egg
Searching for sortedcontainers==2.4.0
Best match: sortedcontainers 2.4.0
Processing sortedcontainers-2.4.0-py3.8.egg
sortedcontainers 2.4.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/sortedcontainers-2.4.0-py3.8.egg
Searching for zict==2.0.0
Best match: zict 2.0.0
Processing zict-2.0.0-py3.8.egg
zict 2.0.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/zict-2.0.0-py3.8.egg
Searching for msgpack==1.0.2
Best match: msgpack 1.0.2
Processing msgpack-1.0.2-py3.8-linux-x86_64.egg
msgpack 1.0.2 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/msgpack-1.0.2-py3.8-linux-x86_64.egg
Searching for psutil==5.8.0
Best match: psutil 5.8.0
Processing psutil-5.8.0-py3.8-linux-x86_64.egg
psutil 5.8.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/psutil-5.8.0-py3.8-linux-x86_64.egg
Searching for toolz==0.11.1
Best match: toolz 0.11.1
Processing toolz-0.11.1-py3.8.egg
toolz 0.11.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/toolz-0.11.1-py3.8.egg
Searching for partd==1.2.0
Best match: partd 1.2.0
Processing partd-1.2.0-py3.8.egg
partd 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/partd-1.2.0-py3.8.egg
Searching for fsspec==2021.7.0
Best match: fsspec 2021.7.0
Processing fsspec-2021.7.0-py3.8.egg
fsspec 2021.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/fsspec-2021.7.0-py3.8.egg
Searching for six==1.15.0
Best match: six 1.15.0
Adding six 1.15.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for HeapDict==1.0.1
Best match: HeapDict 1.0.1
Processing HeapDict-1.0.1-py3.8.egg
HeapDict 1.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/HeapDict-1.0.1-py3.8.egg
Searching for locket==0.2.1
Best match: locket 0.2.1
Processing locket-0.2.1-py3.8.egg
locket 0.2.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/locket-0.2.1-py3.8.egg
Finished processing dependencies for nvtabular==0.6.0+8.g170529f
Running black --check
All done! ✨ 🍰 ✨
110 files would be left unchanged.
Running flake8
Running isort
/usr/local/lib/python3.8/dist-packages/isort/main.py:141: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
warn(f"Likely recursive symlink detected to {resolved_path}")
/usr/local/lib/python3.8/dist-packages/isort/main.py:141: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/examples/scaling-criteo/imgs
warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 2 files
Running bandit
Running pylint
************* Module nvtabular.ops.categorify
nvtabular/ops/categorify.py:459:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
************* Module nvtabular.ops.fill
nvtabular/ops/fill.py:66:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
************* Module bench.datasets.tools.train_hugectr
bench/datasets/tools/train_hugectr.py:28:13: I1101: Module 'hugectr' has no 'solver_parser_helper' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
bench/datasets/tools/train_hugectr.py:41:16: I1101: Module 'hugectr' has no 'optimizer' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)


Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

Running flake8-nb
Building docs
make: Entering directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
2021-08-10 17:45:24.460869: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-08-10 17:45:25.826980: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-08-10 17:45:25.828068: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2021-08-10 17:45:25.829067: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 1 with properties:
pciBusID: 0000:08:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2021-08-10 17:45:25.829097: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-08-10 17:45:25.829146: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-08-10 17:45:25.829179: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-08-10 17:45:25.829213: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-08-10 17:45:25.829244: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-08-10 17:45:25.829289: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-08-10 17:45:25.829319: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-08-10 17:45:25.829356: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-08-10 17:45:25.833606: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0, 1
/usr/lib/python3/dist-packages/requests/init.py:89: RequestsDependencyWarning: urllib3 (1.26.6) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
make: Leaving directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml
plugins: cov-2.12.1, forked-1.3.0, xdist-2.3.0
collected 1161 items

tests/unit/test_column_group.py .. [ 0%]
tests/unit/test_column_similarity.py ........................ [ 2%]
tests/unit/test_cpu_workflow.py ...... [ 2%]
tests/unit/test_dask_nvt.py ............................................ [ 6%]
..................................................................... [ 12%]
tests/unit/test_dataloader_backend.py . [ 12%]
tests/unit/test_io.py .................................................. [ 16%]
........................................................................ [ 23%]
........ssssssss.................................................. [ 28%]
tests/unit/test_notebooks.py ...... [ 29%]
tests/unit/test_ops.py ................................................. [ 33%]
........................................................................ [ 39%]
........................................................................ [ 45%]
........................................................................ [ 52%]
........................................................................ [ 58%]
........................................................................ [ 64%]
................................. [ 67%]
tests/unit/test_s3.py .. [ 67%]
tests/unit/test_tf_dataloader.py ....................................... [ 70%]
.................................s [ 73%]
tests/unit/test_tf_feature_columns.py . [ 73%]
tests/unit/test_tf_layers.py ........................................... [ 77%]
................................... [ 80%]
tests/unit/test_tools.py ...................... [ 82%]
tests/unit/test_torch_dataloader.py ..................Build timed out (after 60 minutes). Marking the build as failed.
Terminated
Build was aborted
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://github.com/gitapi/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins8114337708872464826.sh

@nvidia-merlin-bot
Copy link
Contributor

Click to view CI Results
GitHub pull request #1028 of commit 6af2355e8e2037f8fe946ff452ff68a17203a135, no merge conflicts.
Running as SYSTEM
Setting status of 6af2355e8e2037f8fe946ff452ff68a17203a135 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/3117/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/1028/*:refs/remotes/origin/pr/1028/* # timeout=10
 > git rev-parse 6af2355e8e2037f8fe946ff452ff68a17203a135^{commit} # timeout=10
Checking out Revision 6af2355e8e2037f8fe946ff452ff68a17203a135 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 6af2355e8e2037f8fe946ff452ff68a17203a135 # timeout=10
Commit message: "Merge branch 'main' into add-sample_dtypes"
 > git rev-list --no-walk d89cde4a5e0201bb7b784de5ba9bb1964d0a2340 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins6256790787784737091.sh
Installing NVTabular
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pip in /var/jenkins_home/.local/lib/python3.8/site-packages (21.2.3)
Requirement already satisfied: setuptools in /var/jenkins_home/.local/lib/python3.8/site-packages (57.4.0)
Requirement already satisfied: wheel in /var/jenkins_home/.local/lib/python3.8/site-packages (0.37.0)
Requirement already satisfied: pybind11 in /var/jenkins_home/.local/lib/python3.8/site-packages (2.7.1)
running develop
running egg_info
creating nvtabular.egg-info
writing nvtabular.egg-info/PKG-INFO
writing dependency_links to nvtabular.egg-info/dependency_links.txt
writing requirements to nvtabular.egg-info/requires.txt
writing top-level names to nvtabular.egg-info/top_level.txt
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
running build_ext
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.8 -c flagcheck.cpp -o flagcheck.o -std=c++17
building 'nvtabular_cpp' extension
creating build
creating build/temp.linux-x86_64-3.8
creating build/temp.linux-x86_64-3.8/cpp
creating build/temp.linux-x86_64-3.8/cpp/nvtabular
creating build/temp.linux-x86_64-3.8/cpp/nvtabular/inference
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+11.g6af2355 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+11.g6af2355 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+11.g6af2355 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/categorify.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+11.g6af2355 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/fill.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -std=c++17 -fvisibility=hidden -g0
creating build/lib.linux-x86_64-3.8
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -o build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so -> 
Generating nvtabular/inference/triton/model_config_pb2.py from nvtabular/inference/triton/model_config.proto
Creating /var/jenkins_home/.local/lib/python3.8/site-packages/nvtabular.egg-link (link to .)
nvtabular 0.6.0+11.g6af2355 is already the active version in easy-install.pth

Installed /var/jenkins_home/workspace/nvtabular_tests/nvtabular
Processing dependencies for nvtabular==0.6.0+11.g6af2355
Searching for pyarrow==1.0.1
Best match: pyarrow 1.0.1
Adding pyarrow 1.0.1 to easy-install.pth file
Installing plasma_store script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for tqdm==4.61.2
Best match: tqdm 4.61.2
Processing tqdm-4.61.2-py3.8.egg
tqdm 4.61.2 is already the active version in easy-install.pth
Installing tqdm script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tqdm-4.61.2-py3.8.egg
Searching for numba==0.53.1
Best match: numba 0.53.1
Adding numba 0.53.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for pandas==1.1.5
Best match: pandas 1.1.5
Adding pandas 1.1.5 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for distributed==2021.4.1
Best match: distributed 2021.4.1
Adding distributed 2021.4.1 to easy-install.pth file
Installing dask-ssh script to /var/jenkins_home/.local/bin
Installing dask-scheduler script to /var/jenkins_home/.local/bin
Installing dask-worker script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for dask==2021.4.1
Best match: dask 2021.4.1
Processing dask-2021.4.1-py3.8.egg
dask 2021.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg
Searching for PyYAML==5.4.1
Best match: PyYAML 5.4.1
Processing PyYAML-5.4.1-py3.8-linux-x86_64.egg
PyYAML 5.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/PyYAML-5.4.1-py3.8-linux-x86_64.egg
Searching for numpy==1.20.2
Best match: numpy 1.20.2
Adding numpy 1.20.2 to easy-install.pth file
Installing f2py script to /var/jenkins_home/.local/bin
Installing f2py3 script to /var/jenkins_home/.local/bin
Installing f2py3.8 script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for setuptools==57.4.0
Best match: setuptools 57.4.0
Adding setuptools 57.4.0 to easy-install.pth file

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for llvmlite==0.36.0
Best match: llvmlite 0.36.0
Adding llvmlite 0.36.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for python-dateutil==2.8.2
Best match: python-dateutil 2.8.2
Adding python-dateutil 2.8.2 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for pytz==2021.1
Best match: pytz 2021.1
Adding pytz 2021.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for cloudpickle==1.6.0
Best match: cloudpickle 1.6.0
Processing cloudpickle-1.6.0-py3.8.egg
cloudpickle 1.6.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/cloudpickle-1.6.0-py3.8.egg
Searching for zict==2.0.0
Best match: zict 2.0.0
Processing zict-2.0.0-py3.8.egg
zict 2.0.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/zict-2.0.0-py3.8.egg
Searching for sortedcontainers==2.4.0
Best match: sortedcontainers 2.4.0
Processing sortedcontainers-2.4.0-py3.8.egg
sortedcontainers 2.4.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/sortedcontainers-2.4.0-py3.8.egg
Searching for click==8.0.1
Best match: click 8.0.1
Processing click-8.0.1-py3.8.egg
click 8.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/click-8.0.1-py3.8.egg
Searching for tblib==1.7.0
Best match: tblib 1.7.0
Processing tblib-1.7.0-py3.8.egg
tblib 1.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tblib-1.7.0-py3.8.egg
Searching for msgpack==1.0.2
Best match: msgpack 1.0.2
Processing msgpack-1.0.2-py3.8-linux-x86_64.egg
msgpack 1.0.2 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/msgpack-1.0.2-py3.8-linux-x86_64.egg
Searching for toolz==0.11.1
Best match: toolz 0.11.1
Processing toolz-0.11.1-py3.8.egg
toolz 0.11.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/toolz-0.11.1-py3.8.egg
Searching for psutil==5.8.0
Best match: psutil 5.8.0
Processing psutil-5.8.0-py3.8-linux-x86_64.egg
psutil 5.8.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/psutil-5.8.0-py3.8-linux-x86_64.egg
Searching for tornado==6.1
Best match: tornado 6.1
Processing tornado-6.1-py3.8-linux-x86_64.egg
tornado 6.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tornado-6.1-py3.8-linux-x86_64.egg
Searching for partd==1.2.0
Best match: partd 1.2.0
Processing partd-1.2.0-py3.8.egg
partd 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/partd-1.2.0-py3.8.egg
Searching for fsspec==2021.7.0
Best match: fsspec 2021.7.0
Processing fsspec-2021.7.0-py3.8.egg
fsspec 2021.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/fsspec-2021.7.0-py3.8.egg
Searching for six==1.15.0
Best match: six 1.15.0
Adding six 1.15.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for HeapDict==1.0.1
Best match: HeapDict 1.0.1
Processing HeapDict-1.0.1-py3.8.egg
HeapDict 1.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/HeapDict-1.0.1-py3.8.egg
Searching for locket==0.2.1
Best match: locket 0.2.1
Processing locket-0.2.1-py3.8.egg
locket 0.2.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/locket-0.2.1-py3.8.egg
Finished processing dependencies for nvtabular==0.6.0+11.g6af2355
Running black --check
All done! ✨ 🍰 ✨
110 files would be left unchanged.
Running flake8
Running isort
/usr/local/lib/python3.8/dist-packages/isort/main.py:141: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
warn(f"Likely recursive symlink detected to {resolved_path}")
/usr/local/lib/python3.8/dist-packages/isort/main.py:141: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/examples/scaling-criteo/imgs
warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 2 files
Running bandit
Running pylint
************* Module nvtabular.ops.categorify
nvtabular/ops/categorify.py:459:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
************* Module nvtabular.ops.fill
nvtabular/ops/fill.py:66:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
************* Module bench.datasets.tools.train_hugectr
bench/datasets/tools/train_hugectr.py:28:13: I1101: Module 'hugectr' has no 'solver_parser_helper' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
bench/datasets/tools/train_hugectr.py:41:16: I1101: Module 'hugectr' has no 'optimizer' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)


Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

Running flake8-nb
Building docs
make: Entering directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
2021-08-10 19:04:43.152028: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-08-10 19:04:44.522349: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-08-10 19:04:44.523443: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2021-08-10 19:04:44.524458: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 1 with properties:
pciBusID: 0000:08:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2021-08-10 19:04:44.524488: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-08-10 19:04:44.524535: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-08-10 19:04:44.524569: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-08-10 19:04:44.524601: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-08-10 19:04:44.524635: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-08-10 19:04:44.524681: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-08-10 19:04:44.524711: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-08-10 19:04:44.524747: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-08-10 19:04:44.528962: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0, 1
/usr/lib/python3/dist-packages/requests/init.py:89: RequestsDependencyWarning: urllib3 (1.26.6) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
make: Leaving directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml
plugins: cov-2.12.1, forked-1.3.0, xdist-2.3.0
collected 1163 items

tests/unit/test_column_group.py .. [ 0%]
tests/unit/test_column_similarity.py ........................ [ 2%]
tests/unit/test_cpu_workflow.py ...... [ 2%]
tests/unit/test_dask_nvt.py ............................................ [ 6%]
..................................................................... [ 12%]
tests/unit/test_dataloader_backend.py . [ 12%]
tests/unit/test_io.py .................................................. [ 16%]
........................................................................ [ 23%]
........ssssssss.................................................. [ 28%]
tests/unit/test_notebooks.py ...... [ 29%]
tests/unit/test_ops.py ................................................. [ 33%]
........................................................................ [ 39%]
........................................................................ [ 45%]
........................................................................ [ 52%]
........................................................................ [ 58%]
........................................................................ [ 64%]
................................. [ 67%]
tests/unit/test_s3.py .. [ 67%]
tests/unit/test_tf_dataloader.py ....................................... [ 70%]
.................................s [ 73%]
tests/unit/test_tf_feature_columns.py . [ 73%]
tests/unit/test_tf_layers.py ........................................... [ 77%]
................................... [ 80%]
tests/unit/test_tools.py ...................... [ 82%]
tests/unit/test_torch_dataloader.py .................................... [ 85%]
.............................................. [ 89%]
tests/unit/test_torch_layers.py . [ 89%]
tests/unit/test_triton_inference.py .............................. [ 92%]
tests/unit/test_workflow.py ............................................ [ 95%]
................................................ [100%]

=============================== warnings summary ===============================
tests/unit/test_io.py: 36 warnings
tests/unit/test_workflow.py: 44 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:86: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for execution. Please use the client argument to initialize a Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 52 warnings
tests/unit/test_workflow.py: 35 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dask.py:372: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for this write operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 20 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:476: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler is being used for this shuffle operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_ops.py::test_fill_missing[True-True-parquet]
tests/unit/test_ops.py::test_fill_missing[True-False-parquet]
tests/unit/test_ops.py::test_filter[parquet-0.1-True]
/usr/local/lib/python3.8/dist-packages/pandas/core/indexing.py:670: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
iloc._setitem_with_indexer(indexer, value)

tests/unit/test_ops.py::test_join_external[True-True-left-host-pandas-parquet]
tests/unit/test_ops.py::test_join_external[True-True-left-device-pandas-parquet]
tests/unit/test_ops.py::test_join_external[True-True-inner-host-pandas-parquet]
tests/unit/test_ops.py::test_join_external[True-True-inner-device-pandas-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/join_external.py:171: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
_ext.drop_duplicates(ignore_index=True, inplace=True)

tests/unit/test_ops.py::test_filter[parquet-0.1-True]
tests/unit/test_ops.py::test_filter[parquet-0.1-False]
tests/unit/test_ops.py::test_groupby_op[id-True]
tests/unit/test_ops.py::test_groupby_op[id-False]
/var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/core.py:6610: UserWarning: Insufficient elements for head. 1 elements requested, only 0 elements available. Try passing larger npartitions to head.
warnings.warn(msg.format(n, len(r)))

-- Docs: https://docs.pytest.org/en/stable/warnings.html

---------- coverage: platform linux, python 3.8.10-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

examples/multi-gpu-movielens/torch_trainer.py 65 0 6 1 99% 32->36
nvtabular/init.py 12 0 0 0 100%
nvtabular/column_group.py 157 18 82 5 87% 54, 87, 128, 152-165, 214, 301
nvtabular/dispatch.py 253 45 124 22 81% 35-38, 43-45, 51-61, 68-69, 86, 93, 101, 112, 118, 123->125, 136, 159-162, 201, 204, 210, 226, 233, 264->269, 267, 270, 273->277, 310, 321-324, 351-354, 384, 388, 429, 453, 455, 462
nvtabular/framework_utils/init.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/init.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 132 78 88 15 38% 29, 98, 102, 113-129, 139, 142-157, 161, 165-166, 172-197, 206-216, 219-226, 228->231, 232, 237-277, 280
nvtabular/framework_utils/tensorflow/layers/init.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 12 85 6 91% 60, 68->49, 122, 179, 231-239, 335->343, 357->360, 363-364, 367
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 25 20 1 43% 49, 74-103, 106-110, 113
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 37-38, 41-60, 71-84, 87
nvtabular/framework_utils/torch/init.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/init.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 32 1 14 1 96% 50
nvtabular/framework_utils/torch/models.py 45 0 28 0 100%
nvtabular/framework_utils/torch/utils.py 75 4 30 2 94% 64, 118-120
nvtabular/inference/init.py 0 0 0 0 100%
nvtabular/inference/triton/init.py 298 131 128 13 57% 80-84, 138-188, 233-277, 308, 334-342, 350-357, 376, 398-414, 455-459, 497-507, 531-553, 557-624, 633->636, 636->632, 665-675, 684, 694, 715, 722, 728->731, 732
nvtabular/inference/triton/benchmarking_tools.py 52 52 10 0 0% 2-103
nvtabular/inference/triton/data_conversions.py 87 3 58 4 95% 32-33, 84
nvtabular/inference/triton/model.py 160 160 80 0 0% 27-305
nvtabular/inference/triton/model_config_pb2.py 299 0 2 0 100%
nvtabular/io/init.py 4 0 0 0 100%
nvtabular/io/avro.py 88 88 30 0 0% 16-189
nvtabular/io/csv.py 57 6 20 5 86% 22-23, 99, 103->107, 108, 110, 124
nvtabular/io/dask.py 183 8 72 11 93% 111, 114, 150, 398, 408, 425->428, 436, 440->442, 442->438, 447, 449
nvtabular/io/dataframe_engine.py 61 5 28 6 88% 19-20, 50, 69, 88->92, 92->97, 94->97, 97->116, 125
nvtabular/io/dataset.py 320 43 150 28 84% 44-45, 246, 248, 261, 270, 288-302, 405->475, 410-413, 418->428, 423-424, 435->433, 449->453, 464, 475->484, 534->538, 581, 703, 705, 707, 713, 717-719, 721, 781-782, 809, 816-817, 823, 829, 924-925, 1041-1046, 1052, 1117
nvtabular/io/dataset_engine.py 23 1 0 0 96% 45
nvtabular/io/hugectr.py 45 2 24 2 91% 34, 74->97, 101
nvtabular/io/parquet.py 492 23 156 14 94% 33-34, 92-100, 124->126, 213-215, 338-343, 381-386, 502->509, 570->575, 576-577, 697, 701, 705, 711, 743, 760, 764, 771->773, 881->exit, 891->896, 901->911, 916, 938
nvtabular/io/shuffle.py 31 6 16 5 77% 42, 44-45, 49, 59, 63
nvtabular/io/writer.py 175 13 68 5 92% 24-25, 51, 79, 125, 128, 212, 221, 224, 267, 288-290
nvtabular/io/writer_factory.py 18 2 8 2 85% 35, 60
nvtabular/loader/init.py 0 0 0 0 100%
nvtabular/loader/backend.py 327 12 138 9 95% 142-143, 233->235, 245-249, 295-296, 335->339, 410, 414-415, 445, 550, 558
nvtabular/loader/tensorflow.py 155 22 50 7 85% 57, 65-68, 78, 88, 296, 332, 347-349, 378-380, 390-398, 401-404
nvtabular/loader/tf_utils.py 55 10 20 5 80% 29->32, 32->34, 39->41, 43, 50-51, 58-60, 66-70
nvtabular/loader/torch.py 81 13 16 2 78% 25-27, 30-36, 111, 149-150
nvtabular/ops/init.py 21 0 0 0 100%
nvtabular/ops/bucketize.py 32 10 18 3 62% 52-54, 58, 61-64, 83-86
nvtabular/ops/categorify.py 573 67 323 47 85% 230, 232, 247, 251, 259, 267, 269, 296, 315-316, 331, 342->347, 350-357, 436-437, 455-456, 532->534, 655, 691, 720->723, 724-726, 733-734, 747-749, 750->718, 766, 774, 776, 783->exit, 806, 809->812, 820, 845, 850, 866->870, 877-880, 891, 895, 897, 909-912, 990, 992, 1021->1044, 1027->1044, 1045-1050, 1087, 1105->1110, 1109, 1119->1116, 1124->1116, 1132, 1140-1150
nvtabular/ops/clip.py 18 2 6 3 79% 43, 51->53, 54
nvtabular/ops/column_similarity.py 103 24 36 5 72% 19-20, 76->exit, 106, 178-179, 188-190, 198-214, 231->234, 235, 245
nvtabular/ops/data_stats.py 56 2 22 3 94% 91->93, 95, 97->87, 102
nvtabular/ops/difference_lag.py 25 0 8 1 97% 66->68
nvtabular/ops/dropna.py 8 0 0 0 100%
nvtabular/ops/fill.py 63 6 22 1 89% 62-66, 101, 127
nvtabular/ops/filter.py 20 1 6 1 92% 49
nvtabular/ops/groupby.py 92 4 56 6 92% 71, 80, 82, 92->94, 104->109, 180
nvtabular/ops/hash_bucket.py 29 2 18 2 87% 69, 99
nvtabular/ops/hashed_cross.py 28 3 13 4 83% 50, 63, 77->exit, 78
nvtabular/ops/join_external.py 89 7 38 6 90% 20-21, 113, 115, 117, 159, 176->178, 212
nvtabular/ops/join_groupby.py 84 5 30 2 94% 106, 109->118, 194-195, 198-199
nvtabular/ops/lambdaop.py 39 6 18 6 79% 59, 63, 77, 89, 94, 103
nvtabular/ops/list_slice.py 63 24 26 1 56% 21-22, 52-53, 100-114, 122-133
nvtabular/ops/logop.py 8 0 0 0 100%
nvtabular/ops/moments.py 65 0 20 0 100%
nvtabular/ops/normalize.py 71 8 14 1 87% 69, 77-78, 111-112, 134-135, 139
nvtabular/ops/operator.py 29 1 2 1 94% 25
nvtabular/ops/rename.py 23 3 14 3 84% 45, 66-68
nvtabular/ops/stat_operator.py 8 0 0 0 100%
nvtabular/ops/target_encoding.py 146 11 64 5 90% 147, 167->171, 174->183, 228-229, 232-233, 242-248, 339->342
nvtabular/tools/init.py 0 0 0 0 100%
nvtabular/tools/data_gen.py 236 1 62 1 99% 323
nvtabular/tools/dataset_inspector.py 49 7 18 1 79% 31-38
nvtabular/tools/inspector_script.py 46 46 0 0 0% 17-168
nvtabular/utils.py 102 43 46 8 52% 31-32, 36-37, 50, 61-62, 64-66, 69, 72, 78, 84, 90-126, 145, 149->153
nvtabular/worker.py 82 5 38 7 90% 24-25, 82->99, 91, 92->99, 99->102, 108, 110, 111->113
nvtabular/workflow.py 161 13 77 5 92% 28-29, 45, 141, 155-157, 261, 276-277, 295-296, 384

TOTAL 6388 1108 2556 294 80%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 80.44%
=========================== short test summary info ============================
SKIPPED [8] tests/unit/test_io.py:500: could not import 'uavro': No module named 'uavro'
SKIPPED [1] tests/unit/test_tf_dataloader.py:521: not working correctly in ci environment
========== 1154 passed, 9 skipped, 198 warnings in 992.02s (0:16:32) ===========
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://github.com/gitapi/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins1960311383280698029.sh

@karlhigley
Copy link
Contributor

This looks handy for the upcoming schema work. 👍🏻

Copy link
Contributor

@oyilmaz-nvidia oyilmaz-nvidia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Rick! Just tested it and it worked.

@oyilmaz-nvidia oyilmaz-nvidia merged commit 86a8680 into NVIDIA-Merlin:main Aug 10, 2021
@rjzamora rjzamora deleted the add-sample_dtypes branch August 10, 2021 21:17
benfred pushed a commit that referenced this pull request Aug 11, 2021
* correct normalize for case that std=0

* fix partition_on-related bugs

* add sample_dtypes method to Dataset

* simplify logic a bit
mikemckiernan pushed a commit that referenced this pull request Nov 24, 2022
* correct normalize for case that std=0

* fix partition_on-related bugs

* add sample_dtypes method to Dataset

* simplify logic a bit
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants