Update the Merlin repos in the CI image build #558

Merged
jperez999 merged 1 commit into main from ci/update-merlin-repos on Aug 24, 2022

Conversation

karlhigley
Contributor

This avoids needing to rebuild the Merlin base image and the HugeCTR image in order to get changes in the Merlin repos into the CI images.

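The intent is for the CI image build itself to refresh the Merlin repositories, so repo-level changes reach CI without re-publishing the upstream images. As a hedged sketch of what such a build step could look like (the repository names, /opt paths, and branch below are illustrative assumptions, not the contents of this PR's diff):

# Illustrative sketch only: re-pull and reinstall each Merlin repo inside the
# CI image build so repo changes land in the CI images without rebuilding the
# Merlin base or HugeCTR images. Repo names and paths are assumptions.
for repo in core NVTabular models systems; do
    cd "/opt/${repo}" \
        && git fetch origin \
        && git checkout main \
        && git pull origin main \
        && pip install --no-deps --no-cache-dir .
done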
karlhigley added the chore (Infrastructure update) and ci labels on Aug 24, 2022
karlhigley added this to the Merlin 22.09 milestone on Aug 24, 2022
karlhigley self-assigned this on Aug 24, 2022
@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #558 of commit 542e317364fbc00b47c0cc1e9874551083996f94, no merge conflicts.
Running as SYSTEM
Setting status of 542e317364fbc00b47c0cc1e9874551083996f94 to PENDING with url https://10.20.13.93:8080/job/merlin_merlin/360/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_merlin
using credential systems-login
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/Merlin # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Merlin
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Merlin +refs/pull/558/*:refs/remotes/origin/pr/558/* # timeout=10
 > git rev-parse 542e317364fbc00b47c0cc1e9874551083996f94^{commit} # timeout=10
Checking out Revision 542e317364fbc00b47c0cc1e9874551083996f94 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 542e317364fbc00b47c0cc1e9874551083996f94 # timeout=10
Commit message: "Update the Merlin repos in the CI image build"
 > git rev-list --no-walk 5fa97b4cfe70c6e88523531ed4bc60cc9e3c63ff # timeout=10
[merlin_merlin] $ /bin/bash /tmp/jenkins18374647206690276077.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_merlin/merlin
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 3 items

tests/unit/test_version.py . [ 33%]
tests/unit/examples/test_building_deploying_multi_stage_RecSys.py s [ 66%]
tests/unit/examples/test_scaling_criteo_merlin_models.py F [100%]

=================================== FAILURES ===================================
__________________________________ test_func ___________________________________

def test_func():
    with testbook(
        REPO_ROOT / "examples" / "scaling-criteo" / "02-ETL-with-NVTabular.ipynb",
        execute=False,
        timeout=180,
    ) as tb1:
        tb1.inject(
            """
            import os
            os.environ["BASE_DIR"] = "/tmp/input/criteo/"
            os.environ["INPUT_DATA_DIR"] = "/tmp/input/criteo/"
            os.environ["OUTPUT_DATA_DIR"] = "/tmp/output/criteo/"
            os.system("mkdir -p /tmp/input/criteo")
            os.system("mkdir -p /tmp/output/criteo")

            from merlin.datasets.synthetic import generate_data

            train, valid = generate_data("criteo", int(1000000), set_sizes=(0.7, 0.3))

            train.to_ddf().compute().to_parquet('/tmp/input/criteo/day_0.parquet')
            valid.to_ddf().compute().to_parquet('/tmp/input/criteo/day_1.parquet')
            """
        )
      tb1.execute()

tests/unit/examples/test_scaling_criteo_merlin_models.py:33:


/usr/local/lib/python3.8/dist-packages/testbook/client.py:147: in execute
    super().execute_cell(cell, index)
/usr/local/lib/python3.8/dist-packages/nbclient/util.py:85: in wrapped
    return just_run(coro(*args, **kwargs))
/usr/local/lib/python3.8/dist-packages/nbclient/util.py:60: in just_run
    return loop.run_until_complete(coro)
/usr/lib/python3.8/asyncio/base_events.py:616: in run_until_complete
    return future.result()
/usr/local/lib/python3.8/dist-packages/nbclient/client.py:1025: in async_execute_cell
    await self._check_raise_for_error(cell, cell_index, exec_reply)


self = <testbook.client.TestbookNotebookClient object at 0x7f1e8de254c0>
cell = {'cell_type': 'code', 'execution_count': 7, 'metadata': {'jupyter': {'outputs_hidden': False}, 'execution': {'iopub.st...pool_size=(device_pool_size // 256) * 256\n )\n\n# Create the distributed client\nclient = Client(cluster)\nclient'}
cell_index = 12
exec_reply = {'buffers': [], 'content': {'ename': 'MemoryError', 'engine_info': {'engine_id': -1, 'engine_uuid': '75c3ed86-ee33-4cf...e, 'engine': '75c3ed86-ee33-4cff-817a-28217cfae8e5', 'started': '2022-08-24T15:47:24.263717Z', 'status': 'error'}, ...}

async def _check_raise_for_error(
    self, cell: NotebookNode, cell_index: int, exec_reply: t.Optional[t.Dict]
) -> None:

    if exec_reply is None:
        return None

    exec_reply_content = exec_reply['content']
    if exec_reply_content['status'] != 'error':
        return None

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors
        or exec_reply_content.get('ename') in self.allow_error_names
        or "raises-exception" in cell.metadata.get("tags", [])
    )
    await run_hook(
        self.on_cell_error, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
    if not cell_allows_errors:
      raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)

E nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
E ------------------
E # Dask dashboard
E dashboard_port = "8787"
E
E # Deploy a Single-Machine Multi-GPU Cluster
E protocol = "tcp"  # "tcp" or "ucx"
E if numba.cuda.is_available():
E     NUM_GPUS = list(range(len(numba.cuda.gpus)))
E else:
E     NUM_GPUS = []
E visible_devices = ",".join([str(n) for n in NUM_GPUS])  # Select devices to place workers
E device_limit_frac = 0.7  # Spill GPU-Worker memory to host at this limit.
E device_pool_frac = 0.8
E part_mem_frac = 0.15
E
E # Use total device size to calculate args.device_limit_frac
E device_size = device_mem_size(kind="total")
E device_limit = int(device_limit_frac * device_size)
E device_pool_size = int(device_pool_frac * device_size)
E part_size = int(part_mem_frac * device_size)
E
E # Check if any device memory is already occupied
E for dev in visible_devices.split(","):
E     fmem = pynvml_mem_size(kind="free", index=int(dev))
E     used = (device_size - fmem) / 1e9
E     if used > 1.0:
E         warnings.warn(f"BEWARE - {used} GB is already occupied on device {int(dev)}!")
E
E cluster = None  # (Optional) Specify existing scheduler port
E if cluster is None:
E     cluster = LocalCUDACluster(
E         protocol=protocol,
E         n_workers=len(visible_devices.split(",")),
E         CUDA_VISIBLE_DEVICES=visible_devices,
E         device_memory_limit=device_limit,
E         local_directory=dask_workdir,
E         dashboard_address=":" + dashboard_port,
E         rmm_pool_size=(device_pool_size // 256) * 256
E     )
E
E # Create the distributed client
E client = Client(cluster)
E client
E ------------------
E
E ---------------------------------------------------------------------------
E MemoryError                               Traceback (most recent call last)
E Input In [7], in <cell line: 29>()
E      28 cluster = None  # (Optional) Specify existing scheduler port
E      29 if cluster is None:
E ---> 30     cluster = LocalCUDACluster(
E      31         protocol=protocol,
E      32         n_workers=len(visible_devices.split(",")),
E      33         CUDA_VISIBLE_DEVICES=visible_devices,
E      34         device_memory_limit=device_limit,
E      35         local_directory=dask_workdir,
E      36         dashboard_address=":" + dashboard_port,
E      37         rmm_pool_size=(device_pool_size // 256) * 256
E      38     )
E      40 # Create the distributed client
E      41 client = Client(cluster)
E
E File /usr/local/lib/python3.8/dist-packages/dask_cuda/local_cuda_cluster.py:366, in LocalCUDACluster.__init__(self, CUDA_VISIBLE_DEVICES, n_workers, threads_per_worker, memory_limit, device_memory_limit, data, local_directory, shared_filesystem, protocol, enable_tcp_over_ucx, enable_infiniband, enable_nvlink, enable_rdmacm, rmm_pool_size, rmm_maximum_pool_size, rmm_managed_memory, rmm_async, rmm_log_directory, rmm_track_allocations, jit_unspill, log_spilling, worker_class, pre_import, **kwargs)
E     364 self.cuda_visible_devices = CUDA_VISIBLE_DEVICES
E     365 self.scale(n_workers)
E --> 366 self.sync(self._correct_state)
E
E File /usr/local/lib/python3.8/dist-packages/distributed/utils.py:309, in SyncMethodMixin.sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
E     307     return future
E     308 else:
E --> 309     return sync(
E     310         self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
E     311     )
E
E File /usr/local/lib/python3.8/dist-packages/distributed/utils.py:376, in sync(loop, func, callback_timeout, *args, **kwargs)
E     374 if error:
E     375     typ, exc, tb = error
E --> 376     raise exc.with_traceback(tb)
E     377 else:
E     378     return result
E
E File /usr/local/lib/python3.8/dist-packages/distributed/utils.py:349, in sync.<locals>.f()
E     347     future = asyncio.wait_for(future, callback_timeout)
E     348     future = asyncio.ensure_future(future)
E --> 349     result = yield future
E     350 except Exception:
E     351     error = sys.exc_info()
E
E File /usr/local/lib/python3.8/dist-packages/tornado/gen.py:769, in Runner.run(self)
E     766     exc_info = None
E     768 try:
E --> 769     value = future.result()
E     770 except Exception:
E     771     exc_info = sys.exc_info()
E
E File /usr/local/lib/python3.8/dist-packages/distributed/deploy/spec.py:352, in SpecCluster._correct_state_internal(self)
E     350 for w in workers:
E     351     w._cluster = weakref.ref(self)
E --> 352     await w  # for tornado gen.coroutine support
E     353 self.workers.update(dict(zip(to_open, workers)))
E
E File /usr/local/lib/python3.8/dist-packages/distributed/core.py:299, in Server.__await__.<locals>._()
E     293     raise TimeoutError(
E     294         "{} failed to start in {} seconds".format(
E     295             type(self).__name__, timeout
E     296         )
E     297     )
E     298 else:
E --> 299     await self.start()
E     300 self.status = Status.running
E     301 return self
E
E File /usr/local/lib/python3.8/dist-packages/distributed/nanny.py:347, in Nanny.start(self)
E     344     await self.plugin_add(plugin=plugin, name=name)
E     346 logger.info("        Start Nanny at: %r", self.address)
E --> 347 response = await self.instantiate()
E     348 if response == Status.running:
E     349     assert self.worker_address
E
E File /usr/local/lib/python3.8/dist-packages/distributed/nanny.py:430, in Nanny.instantiate(self)
E     428 else:
E     429     try:
E --> 430         result = await self.process.start()
E     431     except Exception:
E     432         await self.close()
E
E File /usr/local/lib/python3.8/dist-packages/distributed/nanny.py:685, in WorkerProcess.start(self)
E     683     return self.status
E     684 try:
E --> 685     msg = await self._wait_until_connected(uid)
E     686 except Exception:
E     687     self.status = Status.failed
E
E File /usr/local/lib/python3.8/dist-packages/distributed/nanny.py:803, in WorkerProcess._wait_until_connected(self, uid)
E     799 if "exception" in msg:
E     800     logger.error(
E     801         "Failed while trying to start worker process: %s", msg["exception"]
E     802     )
E --> 803     raise msg["exception"]
E     804 else:
E     805     return msg
E
E File /usr/local/lib/python3.8/dist-packages/distributed/nanny.py:869, in run()
E     865 """
E     866 Try to start worker and inform parent of outcome.
E     867 """
E     868 try:
E --> 869     await worker
E     870 except Exception as e:
E     871     logger.exception("Failed to start worker")
E
E File /usr/local/lib/python3.8/dist-packages/distributed/core.py:299, in _()
E     293     raise TimeoutError(
E     294         "{} failed to start in {} seconds".format(
E     295             type(self).__name__, timeout
E     296         )
E     297     )
E     298 else:
E --> 299     await self.start()
E     300 self.status = Status.running
E     301 return self
E
E File /usr/local/lib/python3.8/dist-packages/distributed/worker.py:1372, in start()
E    1370 for exc in plugins_exceptions:
E    1371     logger.error(repr(exc))
E -> 1372 raise plugins_exceptions[0]
E    1374 self._pending_plugins = ()
E    1376 await self._register_with_scheduler()
E
E File /usr/local/lib/python3.8/dist-packages/distributed/worker.py:3248, in plugin_add()
E    3246 if hasattr(plugin, "setup"):
E    3247     try:
E -> 3248         result = plugin.setup(worker=self)
E    3249     if isawaitable(result):
E    3250         result = await result
E
E File /usr/local/lib/python3.8/dist-packages/dask_cuda/utils.py:78, in setup()
E     74 import rmm
E     76 pool_allocator = False if self.initial_pool_size is None else True
E ---> 78 rmm.reinitialize(
E     79     pool_allocator=pool_allocator,
E     80     managed_memory=self.managed_memory,
E     81     initial_pool_size=self.initial_pool_size,
E     82     maximum_pool_size=self.maximum_pool_size,
E     83     logging=self.logging,
E     84     log_file_name=get_rmm_log_file_name(
E     85         worker, self.logging, self.log_directory
E     86     ),
E     87 )
E     88 if self.rmm_track_allocations:
E     89     import rmm
E
E File /usr/local/lib/python3.8/dist-packages/rmm/rmm.py:84, in reinitialize()
E     31 def reinitialize(
E     32     pool_allocator=False,
E     33     managed_memory=False,
E     (...)
E     38     log_file_name=None,
E     39 ):
E     40     """
E     41     Finalizes and then initializes RMM using the options passed. Using memory
E     42     from a previous initialization of RMM is undefined behavior and should be
E     (...)
E     82     corresponding to each device.
E     83     """
E ---> 84     rmm.mr._initialize(
E     85         pool_allocator=pool_allocator,
E     86         managed_memory=managed_memory,
E     87         initial_pool_size=initial_pool_size,
E     88         maximum_pool_size=maximum_pool_size,
E     89         devices=devices,
E     90         logging=logging,
E     91         log_file_name=log_file_name,
E     92     )
E
E File rmm/_lib/memory_resource.pyx:674, in rmm._lib.memory_resource._initialize()
E
E File rmm/_lib/memory_resource.pyx:734, in rmm._lib.memory_resource._initialize()
E
E File rmm/_lib/memory_resource.pyx:272, in rmm._lib.memory_resource.PoolMemoryResource.__cinit__()
E
E MemoryError: std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
E MemoryError: std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:919: CellExecutionError
----------------------------- Captured stderr call -----------------------------
2022-08-24 15:47:25,823 - distributed.preloading - INFO - Import preload module: dask_cuda.initialize
2022-08-24 15:47:25,901 - distributed.preloading - INFO - Import preload module: dask_cuda.initialize
2022-08-24 15:47:26,087 - distributed.utils - ERROR - std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/distributed/utils.py", line 693, in log_errors
yield
File "/usr/local/lib/python3.8/dist-packages/distributed/worker.py", line 3248, in plugin_add
result = plugin.setup(worker=self)
File "/usr/local/lib/python3.8/dist-packages/dask_cuda/utils.py", line 78, in setup
rmm.reinitialize(
File "/usr/local/lib/python3.8/dist-packages/rmm/rmm.py", line 84, in reinitialize
rmm.mr._initialize(
File "rmm/_lib/memory_resource.pyx", line 674, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 734, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 272, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
2022-08-24 15:47:26,089 - distributed.nanny - ERROR - Failed to start worker
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/distributed/nanny.py", line 869, in run
await worker
File "/usr/local/lib/python3.8/dist-packages/distributed/core.py", line 299, in _
await self.start()
File "/usr/local/lib/python3.8/dist-packages/distributed/worker.py", line 1372, in start
raise plugins_exceptions[0]
File "/usr/local/lib/python3.8/dist-packages/distributed/worker.py", line 3248, in plugin_add
result = plugin.setup(worker=self)
File "/usr/local/lib/python3.8/dist-packages/dask_cuda/utils.py", line 78, in setup
rmm.reinitialize(
File "/usr/local/lib/python3.8/dist-packages/rmm/rmm.py", line 84, in reinitialize
rmm.mr._initialize(
File "rmm/_lib/memory_resource.pyx", line 674, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 734, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 272, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
2022-08-24 15:47:27,532 - distributed.diskutils - INFO - Found stale lock file and directory '/tmp/output/criteo/test_dask/workdir/dask-worker-space/worker-5t_x72og', purging
2022-08-24 15:47:27,532 - distributed.preloading - INFO - Import preload module: dask_cuda.initialize
2022-08-24 15:47:27,803 - distributed.utils - ERROR - std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/distributed/utils.py", line 693, in log_errors
yield
File "/usr/local/lib/python3.8/dist-packages/distributed/worker.py", line 3248, in plugin_add
result = plugin.setup(worker=self)
File "/usr/local/lib/python3.8/dist-packages/dask_cuda/utils.py", line 78, in setup
rmm.reinitialize(
File "/usr/local/lib/python3.8/dist-packages/rmm/rmm.py", line 84, in reinitialize
rmm.mr._initialize(
File "rmm/_lib/memory_resource.pyx", line 674, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 734, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 272, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
2022-08-24 15:47:27,804 - distributed.nanny - ERROR - Failed to start worker
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/distributed/nanny.py", line 869, in run
await worker
File "/usr/local/lib/python3.8/dist-packages/distributed/core.py", line 299, in _
await self.start()
File "/usr/local/lib/python3.8/dist-packages/distributed/worker.py", line 1372, in start
raise plugins_exceptions[0]
File "/usr/local/lib/python3.8/dist-packages/distributed/worker.py", line 3248, in plugin_add
result = plugin.setup(worker=self)
File "/usr/local/lib/python3.8/dist-packages/dask_cuda/utils.py", line 78, in setup
rmm.reinitialize(
File "/usr/local/lib/python3.8/dist-packages/rmm/rmm.py", line 84, in reinitialize
rmm.mr._initialize(
File "rmm/_lib/memory_resource.pyx", line 674, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 734, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 272, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
2022-08-24 15:47:29,318 - distributed.diskutils - INFO - Found stale lock file and directory '/tmp/output/criteo/test_dask/workdir/dask-worker-space/worker-g4ahr4q1', purging
2022-08-24 15:47:29,319 - distributed.preloading - INFO - Import preload module: dask_cuda.initialize
2022-08-24 15:47:29,322 - distributed.preloading - INFO - Import preload module: dask_cuda.initialize
2022-08-24 15:47:29,633 - distributed.utils - ERROR - std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/distributed/utils.py", line 693, in log_errors
yield
File "/usr/local/lib/python3.8/dist-packages/distributed/worker.py", line 3248, in plugin_add
result = plugin.setup(worker=self)
File "/usr/local/lib/python3.8/dist-packages/dask_cuda/utils.py", line 78, in setup
rmm.reinitialize(
File "/usr/local/lib/python3.8/dist-packages/rmm/rmm.py", line 84, in reinitialize
rmm.mr._initialize(
File "rmm/_lib/memory_resource.pyx", line 674, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 734, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 272, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
2022-08-24 15:47:29,634 - distributed.nanny - ERROR - Failed to start worker
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/distributed/nanny.py", line 869, in run
await worker
File "/usr/local/lib/python3.8/dist-packages/distributed/core.py", line 299, in _
await self.start()
File "/usr/local/lib/python3.8/dist-packages/distributed/worker.py", line 1372, in start
raise plugins_exceptions[0]
File "/usr/local/lib/python3.8/dist-packages/distributed/worker.py", line 3248, in plugin_add
result = plugin.setup(worker=self)
File "/usr/local/lib/python3.8/dist-packages/dask_cuda/utils.py", line 78, in setup
rmm.reinitialize(
File "/usr/local/lib/python3.8/dist-packages/rmm/rmm.py", line 84, in reinitialize
rmm.mr._initialize(
File "rmm/_lib/memory_resource.pyx", line 674, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 734, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 272, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
2022-08-24 15:47:29,647 - distributed.utils - ERROR - std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/distributed/utils.py", line 693, in log_errors
yield
File "/usr/local/lib/python3.8/dist-packages/distributed/worker.py", line 3248, in plugin_add
result = plugin.setup(worker=self)
File "/usr/local/lib/python3.8/dist-packages/dask_cuda/utils.py", line 78, in setup
rmm.reinitialize(
File "/usr/local/lib/python3.8/dist-packages/rmm/rmm.py", line 84, in reinitialize
rmm.mr._initialize(
File "rmm/_lib/memory_resource.pyx", line 674, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 734, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 272, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
2022-08-24 15:47:29,647 - distributed.nanny - ERROR - Failed to start worker
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/distributed/nanny.py", line 869, in run
await worker
File "/usr/local/lib/python3.8/dist-packages/distributed/core.py", line 299, in _
await self.start()
File "/usr/local/lib/python3.8/dist-packages/distributed/worker.py", line 1372, in start
raise plugins_exceptions[0]
File "/usr/local/lib/python3.8/dist-packages/distributed/worker.py", line 3248, in plugin_add
result = plugin.setup(worker=self)
File "/usr/local/lib/python3.8/dist-packages/dask_cuda/utils.py", line 78, in setup
rmm.reinitialize(
File "/usr/local/lib/python3.8/dist-packages/rmm/rmm.py", line 84, in reinitialize
rmm.mr._initialize(
File "rmm/_lib/memory_resource.pyx", line 674, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 734, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 272, in rmm.lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
2022-08-24 15:47:31,034 - distributed.diskutils - INFO - Found stale lock file and directory '/tmp/output/criteo/test_dask/workdir/dask-worker-space/worker-tewq0b4', purging
2022-08-24 15:47:31,035 - distributed.diskutils - INFO - Found stale lock file and directory '/tmp/output/criteo/test_dask/workdir/dask-worker-space/worker-ow1s18rv', purging
2022-08-24 15:47:31,035 - distributed.preloading - INFO - Import preload module: dask_cuda.initialize
2022-08-24 15:47:31,267 - distributed.utils - ERROR - std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/distributed/utils.py", line 693, in log_errors
yield
File "/usr/local/lib/python3.8/dist-packages/distributed/worker.py", line 3248, in plugin_add
result = plugin.setup(worker=self)
File "/usr/local/lib/python3.8/dist-packages/dask_cuda/utils.py", line 78, in setup
rmm.reinitialize(
File "/usr/local/lib/python3.8/dist-packages/rmm/rmm.py", line 84, in reinitialize
rmm.mr._initialize(
File "rmm/_lib/memory_resource.pyx", line 674, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 734, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 272, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
2022-08-24 15:47:31,267 - distributed.nanny - ERROR - Failed to start worker
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/distributed/nanny.py", line 869, in run
await worker
File "/usr/local/lib/python3.8/dist-packages/distributed/core.py", line 299, in _
await self.start()
File "/usr/local/lib/python3.8/dist-packages/distributed/worker.py", line 1372, in start
raise plugins_exceptions[0]
File "/usr/local/lib/python3.8/dist-packages/distributed/worker.py", line 3248, in plugin_add
result = plugin.setup(worker=self)
File "/usr/local/lib/python3.8/dist-packages/dask_cuda/utils.py", line 78, in setup
rmm.reinitialize(
File "/usr/local/lib/python3.8/dist-packages/rmm/rmm.py", line 84, in reinitialize
rmm.mr._initialize(
File "rmm/_lib/memory_resource.pyx", line 674, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 734, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 272, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: out_of_memory: RMM failure at:/usr/include/rmm/mr/device/pool_memory_resource.hpp:183: Maximum pool size exceeded
/usr/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 30 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
=========================== short test summary info ============================
FAILED tests/unit/examples/test_scaling_criteo_merlin_models.py::test_func - ...
=================== 1 failed, 1 passed, 1 skipped in 20.67s ====================
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://github.com/gitapi/repos/NVIDIA-Merlin/Merlin/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_merlin] $ /bin/bash /tmp/jenkins16837858157436350681.sh
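
For context on the failure above: the notebook cell sizes its RMM pool from total device memory (device_pool_frac = 0.8 of device_mem_size(kind="total")), so if another process already holds GPU memory on the runner, LocalCUDACluster worker startup fails with "Maximum pool size exceeded", which is exactly what the traceback shows. A hedged pre-flight check one could run on the CI machine before retrying (assuming nvidia-smi is available there; none of this is part of the PR):

# Show per-GPU memory so an already-occupied device is visible before the
# notebook builds its RMM pool from *total* device memory.
nvidia-smi --query-gpu=index,memory.total,memory.used,memory.free --format=csv

# Re-run only the failing test on a specific device, mirroring the
# CUDA_VISIBLE_DEVICES=1 pinning already used by the post-build step above.
CUDA_VISIBLE_DEVICES=1 python -m pytest tests/unit/examples/test_scaling_criteo_merlin_models.py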

@github-actions

Documentation preview

https://nvidia-merlin.github.io/Merlin/review/pr-558

@karlhigley
Contributor Author

rerun tests

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #558 of commit 542e317364fbc00b47c0cc1e9874551083996f94, no merge conflicts.
Running as SYSTEM
Setting status of 542e317364fbc00b47c0cc1e9874551083996f94 to PENDING with url https://10.20.13.93:8080/job/merlin_merlin/361/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_merlin
using credential systems-login
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/Merlin # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Merlin
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Merlin +refs/pull/558/*:refs/remotes/origin/pr/558/* # timeout=10
 > git rev-parse 542e317364fbc00b47c0cc1e9874551083996f94^{commit} # timeout=10
Checking out Revision 542e317364fbc00b47c0cc1e9874551083996f94 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 542e317364fbc00b47c0cc1e9874551083996f94 # timeout=10
Commit message: "Update the Merlin repos in the CI image build"
 > git rev-list --no-walk 542e317364fbc00b47c0cc1e9874551083996f94 # timeout=10
[merlin_merlin] $ /bin/bash /tmp/jenkins5775396496536804410.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_merlin/merlin
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 3 items

tests/unit/test_version.py . [ 33%]
tests/unit/examples/test_building_deploying_multi_stage_RecSys.py s [ 66%]
tests/unit/examples/test_scaling_criteo_merlin_models.py . [100%]

=================== 2 passed, 1 skipped in 108.88s (0:01:48) ===================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://github.com/gitapi/repos/NVIDIA-Merlin/Merlin/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_merlin] $ /bin/bash /tmp/jenkins17910477551929570870.sh

jperez999 merged commit dc7f070 into main on Aug 24, 2022
benfred deleted the ci/update-merlin-repos branch on August 24, 2022 at 16:24