Revert CMake changes #538

Merged

karlhigley merged 3 commits into main from fix/revert-cmake on Aug 10, 2022
Conversation

karlhigley (Contributor)

No description provided.

@karlhigley added the chore (Infrastructure update) and ci labels on Aug 10, 2022
@karlhigley added this to the Merlin 22.08 milestone on Aug 10, 2022
@karlhigley self-assigned this on Aug 10, 2022
@karlhigley changed the title from Fix/revert cmake to Revert CMake changes on Aug 10, 2022
@github-actions

Documentation preview

https://nvidia-merlin.github.io/Merlin/review/pr-538

@karlhigley merged commit 05226e1 into main on Aug 10, 2022
@nv-alaiacano deleted the fix/revert-cmake branch on August 10, 2022 at 18:06
@nvidia-merlin-bot (Contributor)

CI Results
GitHub pull request #538 of commit e417fec111908e44b5ecfbaa75046c4d5ed7baba, no merge conflicts.
Running as SYSTEM
Setting status of e417fec111908e44b5ecfbaa75046c4d5ed7baba to PENDING with url https://10.20.13.93:8080/job/merlin_merlin/343/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_merlin
using credential systems-login
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/Merlin # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Merlin
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Merlin +refs/pull/538/*:refs/remotes/origin/pr/538/* # timeout=10
 > git rev-parse e417fec111908e44b5ecfbaa75046c4d5ed7baba^{commit} # timeout=10
Checking out Revision e417fec111908e44b5ecfbaa75046c4d5ed7baba (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f e417fec111908e44b5ecfbaa75046c4d5ed7baba # timeout=10
Commit message: "Revert "Install CMake in Merlin base image (instead of copying from build)""
 > git rev-list --no-walk eb0789cb30033872c2d0aa82b722fc6e94b24da8 # timeout=10
[merlin_merlin] $ /bin/bash /tmp/jenkins1381260785184156136.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_merlin/merlin
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 3 items

tests/unit/test_version.py . [ 33%]
tests/unit/examples/test_building_deploying_multi_stage_RecSys.py F [ 66%]
tests/unit/examples/test_scaling_criteo_merlin_models.py . [100%]

=================================== FAILURES ===================================
__________________________________ test_func ___________________________________

self = <testbook.client.TestbookNotebookClient object at 0x7f4e3c5d4b50>
cell = [53], kwargs = {}, cell_indexes = [53], executed_cells = [], idx = 53

def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:
          cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)

/usr/local/lib/python3.8/dist-packages/testbook/client.py:133:


args = (<testbook.client.TestbookNotebookClient object at 0x7f4e3c5d4b50>, {'id': '1d24bffc', 'cell_type': 'code', 'metadata'...ast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}, 53)
kwargs = {}

def wrapped(*args, **kwargs):
  return just_run(coro(*args, **kwargs))

/usr/local/lib/python3.8/dist-packages/nbclient/util.py:85:


coro = <coroutine object NotebookClient.async_execute_cell at 0x7f4d63df33c0>

def just_run(coro: Awaitable) -> Any:
    """Make the coroutine run, even if there is an event loop running (using nest_asyncio)"""
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        loop = None
    if loop is None:
        had_running_loop = False
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
    else:
        had_running_loop = True
    if had_running_loop:
        # if there is a running loop, we patch using nest_asyncio
        # to have reentrant event loops
        check_ipython()
        import nest_asyncio

        nest_asyncio.apply()
        check_patch_tornado()
  return loop.run_until_complete(coro)

/usr/local/lib/python3.8/dist-packages/nbclient/util.py:60:


self = <_UnixSelectorEventLoop running=False closed=False debug=False>
future = <Task finished name='Task-369' coro=<NotebookClient.async_execute_cell() done, defined at /usr/local/lib/python3.8/dis...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n\n')>

def run_until_complete(self, future):
    """Run until the Future is done.

    If the argument is a coroutine, it is wrapped in a Task.

    WARNING: It would be disastrous to call run_until_complete()
    with the same coroutine twice -- it would wrap it in two
    different Tasks and that can't be good.

    Return the Future's result, or raise its exception.
    """
    self._check_closed()
    self._check_running()

    new_task = not futures.isfuture(future)
    future = tasks.ensure_future(future, loop=self)
    if new_task:
        # An exception is raised if the future didn't complete, so there
        # is no need to log the "destroy pending task" message
        future._log_destroy_pending = False

    future.add_done_callback(_run_until_complete_cb)
    try:
        self.run_forever()
    except:
        if new_task and future.done() and not future.cancelled():
            # The coroutine raised a BaseException. Consume the exception
            # to not log a warning, the caller doesn't have access to the
            # local task.
            future.exception()
        raise
    finally:
        future.remove_done_callback(_run_until_complete_cb)
    if not future.done():
        raise RuntimeError('Event loop stopped before Future completed.')
  return future.result()

/usr/lib/python3.8/asyncio/base_events.py:616:


self = <testbook.client.TestbookNotebookClient object at 0x7f4e3c5d4b50>
cell = {'id': '1d24bffc', 'cell_type': 'code', 'metadata': {'execution': {'iopub.status.busy': '2022-08-10T18:04:27.719545Z',...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}
cell_index = 53, execution_count = None, store_history = True

async def async_execute_cell(
    self,
    cell: NotebookNode,
    cell_index: int,
    execution_count: t.Optional[int] = None,
    store_history: bool = True,
) -> NotebookNode:
    """
    Executes a single code cell.

    To execute all cells see :meth:`execute`.

    Parameters
    ----------
    cell : nbformat.NotebookNode
        The cell which is currently being processed.
    cell_index : int
        The position of the cell within the notebook object.
    execution_count : int
        The execution count to be assigned to the cell (default: Use kernel response)
    store_history : bool
        Determines if history should be stored in the kernel (default: False).
        Specific to ipython kernels, which can store command histories.

    Returns
    -------
    output : dict
        The execution output payload (or None for no output).

    Raises
    ------
    CellExecutionError
        If execution failed and should raise an exception, this will be raised
        with defaults about the failure.

    Returns
    -------
    cell : NotebookNode
        The cell which was just processed.
    """
    assert self.kc is not None

    await run_hook(self.on_cell_start, cell=cell, cell_index=cell_index)

    if cell.cell_type != 'code' or not cell.source.strip():
        self.log.debug("Skipping non-executing cell %s", cell_index)
        return cell

    if self.skip_cells_with_tag in cell.metadata.get("tags", []):
        self.log.debug("Skipping tagged cell %s", cell_index)
        return cell

    if self.record_timing:  # clear execution metadata prior to execution
        cell['metadata']['execution'] = {}

    self.log.debug("Executing cell:\n%s", cell.source)

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors or "raises-exception" in cell.metadata.get("tags", [])
    )

    await run_hook(self.on_cell_execute, cell=cell, cell_index=cell_index)
    parent_msg_id = await ensure_async(
        self.kc.execute(
            cell.source, store_history=store_history, stop_on_error=not cell_allows_errors
        )
    )
    await run_hook(self.on_cell_complete, cell=cell, cell_index=cell_index)
    # We launched a code cell to execute
    self.code_cells_executed += 1
    exec_timeout = self._get_timeout(cell)

    cell.outputs = []
    self.clear_before_next_output = False

    task_poll_kernel_alive = asyncio.ensure_future(self._async_poll_kernel_alive())
    task_poll_output_msg = asyncio.ensure_future(
        self._async_poll_output_msg(parent_msg_id, cell, cell_index)
    )
    self.task_poll_for_reply = asyncio.ensure_future(
        self._async_poll_for_reply(
            parent_msg_id, cell, exec_timeout, task_poll_output_msg, task_poll_kernel_alive
        )
    )
    try:
        exec_reply = await self.task_poll_for_reply
    except asyncio.CancelledError:
        # can only be cancelled by task_poll_kernel_alive when the kernel is dead
        task_poll_output_msg.cancel()
        raise DeadKernelError("Kernel died")
    except Exception as e:
        # Best effort to cancel request if it hasn't been resolved
        try:
            # Check if the task_poll_output is doing the raising for us
            if not isinstance(e, CellControlSignal):
                task_poll_output_msg.cancel()
        finally:
            raise

    if execution_count:
        cell['execution_count'] = execution_count
    await run_hook(
        self.on_cell_executed, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
  await self._check_raise_for_error(cell, cell_index, exec_reply)

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:1022:


self = <testbook.client.TestbookNotebookClient object at 0x7f4e3c5d4b50>
cell = {'id': '1d24bffc', 'cell_type': 'code', 'metadata': {'execution': {'iopub.status.busy': '2022-08-10T18:04:27.719545Z',...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}
cell_index = 53
exec_reply = {'buffers': [], 'content': {'ename': 'InferenceServerException', 'engine_info': {'engine_id': -1, 'engine_uuid': '13b2...e, 'engine': '13b2e983-e38b-40e0-bdd3-64a3c0777f24', 'started': '2022-08-10T18:04:27.719810Z', 'status': 'error'}, ...}

async def _check_raise_for_error(
    self, cell: NotebookNode, cell_index: int, exec_reply: t.Optional[t.Dict]
) -> None:

    if exec_reply is None:
        return None

    exec_reply_content = exec_reply['content']
    if exec_reply_content['status'] != 'error':
        return None

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors
        or exec_reply_content.get('ename') in self.allow_error_names
        or "raises-exception" in cell.metadata.get("tags", [])
    )
    await run_hook(
        self.on_cell_error, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
    if not cell_allows_errors:
      raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)

E nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
E ------------------
E
E import shutil
E from merlin.models.loader.tf_utils import configure_tensorflow
E configure_tensorflow()
E from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E response = run_ensemble_on_tritonserver(
E "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E )
E response = [x.tolist()[0] for x in response["ordered_ids"]]
E shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E ------------------
E
E ---------------------------------------------------------------------------
E InferenceServerException                 Traceback (most recent call last)
E Input In [32], in <cell line: 5>()
E       3 configure_tensorflow()
E       4 from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E ----> 5 response = run_ensemble_on_tritonserver(
E       6     "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E       7 )
E       8 response = [x.tolist()[0] for x in response["ordered_ids"]]
E       9 shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:93, in run_ensemble_on_tritonserver(tmpdir, output_columns, df, model_name)
E      91 response = None
E      92 with run_triton_server(tmpdir) as client:
E ---> 93     response = send_triton_request(df, output_columns, client=client, triton_model=model_name)
E      95 return response
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:141, in send_triton_request(df, outputs_list, client, endpoint, request_id, triton_model)
E     139 outputs = [grpcclient.InferRequestedOutput(col) for col in outputs_list]
E     140 with client:
E --> 141     response = client.infer(triton_model, inputs, request_id=request_id, outputs=outputs)
E     143 results = {}
E     144 for col in outputs_list:
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:1322, in InferenceServerClient.infer(self, model_name, inputs, model_version, outputs, request_id, sequence_id, sequence_start, sequence_end, priority, timeout, client_timeout, headers, compression_algorithm)
E    1320     return result
E    1321 except grpc.RpcError as rpc_error:
E ->  1322     raise_error_grpc(rpc_error)
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:62, in raise_error_grpc(rpc_error)
E      61 def raise_error_grpc(rpc_error):
E ---> 62     raise get_error_grpc(rpc_error) from None
E
E InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E     1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E
E Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E
E At:
E   /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute
E
E InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E     1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E
E Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E
E At:
E   /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:916: CellExecutionError

During handling of the above exception, another exception occurred:

def test_func():
    with testbook(
        REPO_ROOT
        / "examples"
        / "Building-and-deploying-multi-stage-RecSys"
        / "01-Building-Recommender-Systems-with-Merlin.ipynb",
        execute=False,
    ) as tb1:
        tb1.inject(
            """
            import os
            os.environ["DATA_FOLDER"] = "/tmp/data/"
            os.environ["NUM_ROWS"] = "10000"
            os.system("mkdir -p /tmp/examples")
            os.environ["BASE_DIR"] = "/tmp/examples/"
            """
        )
        tb1.execute()
        assert os.path.isdir("/tmp/examples/dlrm")
        assert os.path.isdir("/tmp/examples/feature_repo")
        assert os.path.isdir("/tmp/examples/query_tower")
        assert os.path.isfile("/tmp/examples/item_embeddings.parquet")
        assert os.path.isfile("/tmp/examples/feature_repo/user_features.py")
        assert os.path.isfile("/tmp/examples/feature_repo/item_features.py")

    with testbook(
        REPO_ROOT
        / "examples"
        / "Building-and-deploying-multi-stage-RecSys"
        / "02-Deploying-multi-stage-RecSys-with-Merlin-Systems.ipynb",
        execute=False,
    ) as tb2:
        tb2.inject(
            """
            import os
            os.environ["DATA_FOLDER"] = "/tmp/data/"
            os.environ["BASE_DIR"] = "/tmp/examples/"
            """
        )
        NUM_OF_CELLS = len(tb2.cells)
        tb2.execute_cell(list(range(0, NUM_OF_CELLS - 3)))
        top_k = tb2.ref("top_k")
        outputs = tb2.ref("outputs")
        assert outputs[0] == "ordered_ids"
      tb2.inject(
            """
            import shutil
            from merlin.models.loader.tf_utils import configure_tensorflow
            configure_tensorflow()
            from merlin.systems.triton.utils import run_ensemble_on_tritonserver
            response = run_ensemble_on_tritonserver(
                "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
            )
            response = [x.tolist()[0] for x in response["ordered_ids"]]
            shutil.rmtree("/tmp/examples/", ignore_errors=True)
            """
        )

tests/unit/examples/test_building_deploying_multi_stage_RecSys.py:57:


/usr/local/lib/python3.8/dist-packages/testbook/client.py:237: in inject
cell = TestbookNode(self.execute_cell(inject_idx)) if run else TestbookNode(code_cell)


self = <testbook.client.TestbookNotebookClient object at 0x7f4e3c5d4b50>
cell = [53], kwargs = {}, cell_indexes = [53], executed_cells = [], idx = 53

def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:
            cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)
        except CellExecutionError as ce:
          raise TestbookRuntimeError(ce.evalue, ce, self._get_error_class(ce.ename))

E testbook.exceptions.TestbookRuntimeError: An error occurred while executing the following cell:
E ------------------
E
E import shutil
E from merlin.models.loader.tf_utils import configure_tensorflow
E configure_tensorflow()
E from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E response = run_ensemble_on_tritonserver(
E "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E )
E response = [x.tolist()[0] for x in response["ordered_ids"]]
E shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E ------------------
E
E ---------------------------------------------------------------------------
E InferenceServerException                 Traceback (most recent call last)
E Input In [32], in <cell line: 5>()
E       3 configure_tensorflow()
E       4 from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E ----> 5 response = run_ensemble_on_tritonserver(
E       6     "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E       7 )
E       8 response = [x.tolist()[0] for x in response["ordered_ids"]]
E       9 shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:93, in run_ensemble_on_tritonserver(tmpdir, output_columns, df, model_name)
E      91 response = None
E      92 with run_triton_server(tmpdir) as client:
E ---> 93     response = send_triton_request(df, output_columns, client=client, triton_model=model_name)
E      95 return response
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:141, in send_triton_request(df, outputs_list, client, endpoint, request_id, triton_model)
E     139 outputs = [grpcclient.InferRequestedOutput(col) for col in outputs_list]
E     140 with client:
E --> 141     response = client.infer(triton_model, inputs, request_id=request_id, outputs=outputs)
E     143 results = {}
E     144 for col in outputs_list:
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:1322, in InferenceServerClient.infer(self, model_name, inputs, model_version, outputs, request_id, sequence_id, sequence_start, sequence_end, priority, timeout, client_timeout, headers, compression_algorithm)
E    1320     return result
E    1321 except grpc.RpcError as rpc_error:
E ->  1322     raise_error_grpc(rpc_error)
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:62, in raise_error_grpc(rpc_error)
E      61 def raise_error_grpc(rpc_error):
E ---> 62     raise get_error_grpc(rpc_error) from None
E
E InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E     1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E
E Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E
E At:
E   /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute
E
E InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E     1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E
E Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E
E At:
E   /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute

/usr/local/lib/python3.8/dist-packages/testbook/client.py:135: TestbookRuntimeError
----------------------------- Captured stdout call -----------------------------
Signal (2) received.
----------------------------- Captured stderr call -----------------------------
2022-08-10 18:02:48.661344: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-10 18:02:50.651043: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-08-10 18:02:50.651792: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 15153 MB memory: -> device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/usr/lib/python3.8/logging/__init__.py", line 2127, in shutdown
    h.close()
  File "/usr/local/lib/python3.8/dist-packages/absl/logging/__init__.py", line 934, in close
    self.stream.close()
  File "/usr/local/lib/python3.8/dist-packages/ipykernel/iostream.py", line 438, in close
    self.watch_fd_thread.join()
AttributeError: 'OutStream' object has no attribute 'watch_fd_thread'
WARNING clustering 244 points to 32 centroids: please provide at least 1248 training points
2022-08-10 18:04:20.840599: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-10 18:04:22.808297: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-08-10 18:04:22.809041: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 15153 MB memory: -> device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0
I0810 18:04:27.987115 19319 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f881e000000' with size 268435456
I0810 18:04:27.987889 19319 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0810 18:04:27.995060 19319 model_repository_manager.cc:1191] loading: 0_queryfeast:1
I0810 18:04:28.095370 19319 model_repository_manager.cc:1191] loading: 1_predicttensorflow:1
I0810 18:04:28.102490 19319 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 0_queryfeast (GPU device 0)
I0810 18:04:28.195701 19319 model_repository_manager.cc:1191] loading: 2_queryfaiss:1
I0810 18:04:28.295908 19319 model_repository_manager.cc:1191] loading: 3_queryfeast:1
I0810 18:04:28.396177 19319 model_repository_manager.cc:1191] loading: 4_unrollfeatures:1
I0810 18:04:28.496440 19319 model_repository_manager.cc:1191] loading: 5_predicttensorflow:1
I0810 18:04:28.596724 19319 model_repository_manager.cc:1191] loading: 6_softmaxsampling:1
I0810 18:04:30.375890 19319 model_repository_manager.cc:1345] successfully loaded '0_queryfeast' version 1
I0810 18:04:30.658497 19319 tensorflow.cc:2181] TRITONBACKEND_Initialize: tensorflow
I0810 18:04:30.658529 19319 tensorflow.cc:2191] Triton TRITONBACKEND API version: 1.9
I0810 18:04:30.658536 19319 tensorflow.cc:2197] 'tensorflow' TRITONBACKEND API version: 1.9
I0810 18:04:30.658541 19319 tensorflow.cc:2221] backend configuration:
{"cmdline":{"auto-complete-config":"false","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","version":"2","default-max-batch-size":"4"}}
I0810 18:04:30.658577 19319 tensorflow.cc:2281] TRITONBACKEND_ModelInitialize: 1_predicttensorflow (version 1)
I0810 18:04:30.661958 19319 tensorflow.cc:2281] TRITONBACKEND_ModelInitialize: 5_predicttensorflow (version 1)
I0810 18:04:30.663740 19319 tensorflow.cc:2330] TRITONBACKEND_ModelInstanceInitialize: 1_predicttensorflow (GPU device 0)
2022-08-10 18:04:31.006951: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-08-10 18:04:31.010333: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2022-08-10 18:04:31.010362: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-08-10 18:04:31.010470: I tensorflow/core/platform/cpu_feature_guard.cc:152] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE3 SSE4.1 SSE4.2 AVX
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-10 18:04:31.046373: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 12648 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-08-10 18:04:31.089872: I tensorflow/cc/saved_model/loader.cc:230] Restoring SavedModel bundle.
2022-08-10 18:04:31.169147: I tensorflow/cc/saved_model/loader.cc:214] Running initialization op on SavedModel bundle at path: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-08-10 18:04:31.193671: I tensorflow/cc/saved_model/loader.cc:321] SavedModel load for tags { serve }; Status: success: OK. Took 186741 microseconds.
I0810 18:04:31.193790 19319 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 2_queryfaiss (GPU device 0)
I0810 18:04:31.193883 19319 model_repository_manager.cc:1345] successfully loaded '1_predicttensorflow' version 1
I0810 18:04:33.564477 19319 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 3_queryfeast (GPU device 0)
I0810 18:04:33.566109 19319 model_repository_manager.cc:1345] successfully loaded '2_queryfaiss' version 1
I0810 18:04:35.894277 19319 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 4_unrollfeatures (GPU device 0)
I0810 18:04:35.894593 19319 model_repository_manager.cc:1345] successfully loaded '3_queryfeast' version 1
I0810 18:04:37.969004 19319 tensorflow.cc:2330] TRITONBACKEND_ModelInstanceInitialize: 5_predicttensorflow (GPU device 0)
I0810 18:04:37.969252 19319 model_repository_manager.cc:1345] successfully loaded '4_unrollfeatures' version 1
2022-08-10 18:04:37.970339: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-08-10 18:04:37.987152: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2022-08-10 18:04:37.987191: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-08-10 18:04:37.989441: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 12648 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-08-10 18:04:38.010485: I tensorflow/cc/saved_model/loader.cc:230] Restoring SavedModel bundle.
2022-08-10 18:04:38.163464: I tensorflow/cc/saved_model/loader.cc:214] Running initialization op on SavedModel bundle at path: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-08-10 18:04:38.214070: I tensorflow/cc/saved_model/loader.cc:321] SavedModel load for tags { serve }; Status: success: OK. Took 243742 microseconds.
I0810 18:04:38.214204 19319 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 6_softmaxsampling (GPU device 0)
I0810 18:04:38.214492 19319 model_repository_manager.cc:1345] successfully loaded '5_predicttensorflow' version 1
I0810 18:04:40.285347 19319 model_repository_manager.cc:1345] successfully loaded '6_softmaxsampling' version 1
I0810 18:04:40.288308 19319 model_repository_manager.cc:1191] loading: ensemble_model:1
I0810 18:04:40.389119 19319 model_repository_manager.cc:1345] successfully loaded 'ensemble_model' version 1
I0810 18:04:40.389291 19319 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0810 18:04:40.389404 19319 server.cc:583]
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| tensorflow | /opt/tritonserver/backends/tensorflow2/libtriton_tensorflow2.so | {"cmdline":{"auto-complete-config":"false","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","version":"2","default-max-batch-size":"4"}} |
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0810 18:04:40.389516 19319 server.cc:626]
+---------------------+---------+--------+
| Model | Version | Status |
+---------------------+---------+--------+
| 0_queryfeast | 1 | READY |
| 1_predicttensorflow | 1 | READY |
| 2_queryfaiss | 1 | READY |
| 3_queryfeast | 1 | READY |
| 4_unrollfeatures | 1 | READY |
| 5_predicttensorflow | 1 | READY |
| 6_softmaxsampling | 1 | READY |
| ensemble_model | 1 | READY |
+---------------------+---------+--------+

I0810 18:04:40.452798 19319 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0810 18:04:40.453645 19319 tritonserver.cc:2138]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.22.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /tmp/examples/poc_ensemble |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0810 18:04:40.454427 19319 grpc_server.cc:4589] Started GRPCInferenceService at 0.0.0.0:8001
I0810 18:04:40.454970 19319 http_server.cc:3303] Started HTTPService at 0.0.0.0:8000
I0810 18:04:40.496223 19319 http_server.cc:178] Started Metrics Service at 0.0.0.0:8002
W0810 18:04:41.469927 19319 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0810 18:04:41.469992 19319 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
W0810 18:04:42.470152 19319 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0810 18:04:42.470210 19319 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
E0810 18:04:43.250588 19576 pb_stub.cc:749] Failed to process the request(s) for model '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)

Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"

At:
/tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute

I0810 18:04:43.254793 19319 server.cc:257] Waiting for in-flight requests to complete.
I0810 18:04:43.254835 19319 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0810 18:04:43.254844 19319 model_repository_manager.cc:1223] unloading: ensemble_model:1
I0810 18:04:43.254894 19319 model_repository_manager.cc:1223] unloading: 6_softmaxsampling:1
I0810 18:04:43.254926 19319 model_repository_manager.cc:1223] unloading: 5_predicttensorflow:1
I0810 18:04:43.254955 19319 model_repository_manager.cc:1223] unloading: 4_unrollfeatures:1
I0810 18:04:43.255031 19319 model_repository_manager.cc:1223] unloading: 3_queryfeast:1
I0810 18:04:43.255053 19319 model_repository_manager.cc:1223] unloading: 2_queryfaiss:1
I0810 18:04:43.255077 19319 model_repository_manager.cc:1328] successfully unloaded 'ensemble_model' version 1
I0810 18:04:43.255103 19319 tensorflow.cc:2368] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0810 18:04:43.255107 19319 model_repository_manager.cc:1223] unloading: 1_predicttensorflow:1
I0810 18:04:43.255208 19319 model_repository_manager.cc:1223] unloading: 0_queryfeast:1
I0810 18:04:43.255228 19319 server.cc:288] All models are stopped, unloading models
I0810 18:04:43.255244 19319 server.cc:295] Timeout 30: Found 7 live models and 0 in-flight non-inference requests
I0810 18:04:43.255399 19319 tensorflow.cc:2368] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0810 18:04:43.255429 19319 tensorflow.cc:2307] TRITONBACKEND_ModelFinalize: delete model state
I0810 18:04:43.255594 19319 tensorflow.cc:2307] TRITONBACKEND_ModelFinalize: delete model state
I0810 18:04:43.271212 19319 model_repository_manager.cc:1328] successfully unloaded '1_predicttensorflow' version 1
I0810 18:04:43.278970 19319 model_repository_manager.cc:1328] successfully unloaded '5_predicttensorflow' version 1
W0810 18:04:43.489145 19319 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0810 18:04:43.489197 19319 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
I0810 18:04:44.255433 19319 server.cc:295] Timeout 29: Found 5 live models and 0 in-flight non-inference requests
I0810 18:04:44.620348 19319 model_repository_manager.cc:1328] successfully unloaded '6_softmaxsampling' version 1
I0810 18:04:44.708426 19319 model_repository_manager.cc:1328] successfully unloaded '4_unrollfeatures' version 1
I0810 18:04:44.870496 19319 model_repository_manager.cc:1328] successfully unloaded '2_queryfaiss' version 1
I0810 18:04:45.255595 19319 server.cc:295] Timeout 28: Found 2 live models and 0 in-flight non-inference requests
I0810 18:04:46.255722 19319 server.cc:295] Timeout 27: Found 2 live models and 0 in-flight non-inference requests
I0810 18:04:47.255852 19319 server.cc:295] Timeout 26: Found 2 live models and 0 in-flight non-inference requests
I0810 18:04:48.255989 19319 server.cc:295] Timeout 25: Found 2 live models and 0 in-flight non-inference requests
/usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py:15: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ValueType.FLOAT: (np.float, False, False),
I0810 18:04:48.834536 19319 model_repository_manager.cc:1328] successfully unloaded '0_queryfeast' version 1
I0810 18:04:49.256143 19319 server.cc:295] Timeout 24: Found 1 live models and 0 in-flight non-inference requests
I0810 18:04:50.256278 19319 server.cc:295] Timeout 23: Found 1 live models and 0 in-flight non-inference requests
I0810 18:04:51.256412 19319 server.cc:295] Timeout 22: Found 1 live models and 0 in-flight non-inference requests
I0810 18:04:52.256548 19319 server.cc:295] Timeout 21: Found 1 live models and 0 in-flight non-inference requests
I0810 18:04:53.256684 19319 server.cc:295] Timeout 20: Found 1 live models and 0 in-flight non-inference requests
I0810 18:04:54.256870 19319 server.cc:295] Timeout 19: Found 1 live models and 0 in-flight non-inference requests
I0810 18:04:55.257010 19319 server.cc:295] Timeout 18: Found 1 live models and 0 in-flight non-inference requests
/usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py:15: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ValueType.FLOAT: (np.float, False, False),
I0810 18:04:55.517263 19319 model_repository_manager.cc:1328] successfully unloaded '3_queryfeast' version 1
I0810 18:04:56.257151 19319 server.cc:295] Timeout 17: Found 0 live models and 0 in-flight non-inference requests
=========================== short test summary info ============================
FAILED tests/unit/examples/test_building_deploying_multi_stage_RecSys.py::test_func
=================== 1 failed, 2 passed in 236.46s (0:03:56) ====================
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://github.com/gitapi/repos/NVIDIA-Merlin/Merlin/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_merlin] $ /bin/bash /tmp/jenkins12207878488410213477.sh
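
A note for triage: the failure above bottoms out in two stacked problems that are both visible in the log. The transform in merlin/systems/dag/ops/feast.py (line 299) appears to pass a None value to int(), and the Python backend's error path then constructs InferenceResponse with a tensors= keyword even though the signature printed in the log only accepts output_tensors=, so the original TypeError surfaces as a constructor error instead of a clean error response. Below is a minimal sketch of both failure shapes, assuming nothing beyond the signature quoted in the log; FakeInferenceResponse is a hypothetical stand-in for c_python_backend_utils.InferenceResponse, not the real Triton binding.

from typing import List

# Hypothetical stand-in that mirrors only the signature quoted in the log:
#   InferenceResponse(output_tensors: List[Tensor], error: TritonError = None)
class FakeInferenceResponse:
    def __init__(self, output_tensors: List[object], error: object = None):
        self.output_tensors = output_tensors
        self.error = error

# Failure shape 1: int() applied to a None feature value, as raised from
# feast.py line 299 in transform.
try:
    int(None)
except TypeError as exc:
    print(exc)  # int() argument must be a string, a bytes-like object or a number, not 'NoneType'

# Failure shape 2: building the error response with the wrong keyword,
# matching "Invoked with: kwargs: tensors=[], error=..." in the log above.
try:
    FakeInferenceResponse(tensors=[], error="wrapped TypeError")
except TypeError as exc:
    print(exc)  # __init__() got an unexpected keyword argument 'tensors'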

@nvidia-merlin-bot (Contributor)

CI Results
GitHub pull request #538 of commit b09e3e75ed95ecfc8f8ae9ef0ca06088ee55d7f7, no merge conflicts.
Running as SYSTEM
Building on master in workspace /var/jenkins_home/workspace/merlin_merlin
using credential systems-login
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/Merlin # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Merlin
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Merlin +refs/pull/538/*:refs/remotes/origin/pr/538/* # timeout=10
 > git rev-parse b09e3e75ed95ecfc8f8ae9ef0ca06088ee55d7f7^{commit} # timeout=10
Checking out Revision b09e3e75ed95ecfc8f8ae9ef0ca06088ee55d7f7 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f b09e3e75ed95ecfc8f8ae9ef0ca06088ee55d7f7 # timeout=10
Commit message: "Revert "Remove duplicate CMake installs (#523)""
 > git rev-list --no-walk e417fec111908e44b5ecfbaa75046c4d5ed7baba # timeout=10
[merlin_merlin] $ /bin/bash /tmp/jenkins5161626323828587418.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_merlin/merlin
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 3 items

tests/unit/test_version.py . [ 33%]
tests/unit/examples/test_building_deploying_multi_stage_RecSys.py F [ 66%]
tests/unit/examples/test_scaling_criteo_merlin_models.py F [100%]

=================================== FAILURES ===================================
__________________________________ test_func ___________________________________

self = <testbook.client.TestbookNotebookClient object at 0x7f38467b3f40>
cell = [53], kwargs = {}, cell_indexes = [53], executed_cells = [], idx = 53

def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:
          cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)

/usr/local/lib/python3.8/dist-packages/testbook/client.py:133:


args = (<testbook.client.TestbookNotebookClient object at 0x7f38467b3f40>, {'id': '5dcd84e1', 'cell_type': 'code', 'metadata'...ast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}, 53)
kwargs = {}

def wrapped(*args, **kwargs):
  return just_run(coro(*args, **kwargs))

/usr/local/lib/python3.8/dist-packages/nbclient/util.py:85:


coro = <coroutine object NotebookClient.async_execute_cell at 0x7f3845be23c0>

def just_run(coro: Awaitable) -> Any:
    """Make the coroutine run, even if there is an event loop running (using nest_asyncio)"""
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        loop = None
    if loop is None:
        had_running_loop = False
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
    else:
        had_running_loop = True
    if had_running_loop:
        # if there is a running loop, we patch using nest_asyncio
        # to have reentrant event loops
        check_ipython()
        import nest_asyncio

        nest_asyncio.apply()
        check_patch_tornado()
  return loop.run_until_complete(coro)

/usr/local/lib/python3.8/dist-packages/nbclient/util.py:60:


self = <_UnixSelectorEventLoop running=False closed=False debug=False>
future = <Task finished name='Task-369' coro=<NotebookClient.async_execute_cell() done, defined at /usr/local/lib/python3.8/dis...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n\n')>

def run_until_complete(self, future):
    """Run until the Future is done.

    If the argument is a coroutine, it is wrapped in a Task.

    WARNING: It would be disastrous to call run_until_complete()
    with the same coroutine twice -- it would wrap it in two
    different Tasks and that can't be good.

    Return the Future's result, or raise its exception.
    """
    self._check_closed()
    self._check_running()

    new_task = not futures.isfuture(future)
    future = tasks.ensure_future(future, loop=self)
    if new_task:
        # An exception is raised if the future didn't complete, so there
        # is no need to log the "destroy pending task" message
        future._log_destroy_pending = False

    future.add_done_callback(_run_until_complete_cb)
    try:
        self.run_forever()
    except:
        if new_task and future.done() and not future.cancelled():
            # The coroutine raised a BaseException. Consume the exception
            # to not log a warning, the caller doesn't have access to the
            # local task.
            future.exception()
        raise
    finally:
        future.remove_done_callback(_run_until_complete_cb)
    if not future.done():
        raise RuntimeError('Event loop stopped before Future completed.')
  return future.result()

/usr/lib/python3.8/asyncio/base_events.py:616:


self = <testbook.client.TestbookNotebookClient object at 0x7f38467b3f40>
cell = {'id': '5dcd84e1', 'cell_type': 'code', 'metadata': {'execution': {'iopub.status.busy': '2022-08-10T18:08:29.559238Z',...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}
cell_index = 53, execution_count = None, store_history = True

async def async_execute_cell(
    self,
    cell: NotebookNode,
    cell_index: int,
    execution_count: t.Optional[int] = None,
    store_history: bool = True,
) -> NotebookNode:
    """
    Executes a single code cell.

    To execute all cells see :meth:`execute`.

    Parameters
    ----------
    cell : nbformat.NotebookNode
        The cell which is currently being processed.
    cell_index : int
        The position of the cell within the notebook object.
    execution_count : int
        The execution count to be assigned to the cell (default: Use kernel response)
    store_history : bool
        Determines if history should be stored in the kernel (default: False).
        Specific to ipython kernels, which can store command histories.

    Returns
    -------
    output : dict
        The execution output payload (or None for no output).

    Raises
    ------
    CellExecutionError
        If execution failed and should raise an exception, this will be raised
        with defaults about the failure.

    Returns
    -------
    cell : NotebookNode
        The cell which was just processed.
    """
    assert self.kc is not None

    await run_hook(self.on_cell_start, cell=cell, cell_index=cell_index)

    if cell.cell_type != 'code' or not cell.source.strip():
        self.log.debug("Skipping non-executing cell %s", cell_index)
        return cell

    if self.skip_cells_with_tag in cell.metadata.get("tags", []):
        self.log.debug("Skipping tagged cell %s", cell_index)
        return cell

    if self.record_timing:  # clear execution metadata prior to execution
        cell['metadata']['execution'] = {}

    self.log.debug("Executing cell:\n%s", cell.source)

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors or "raises-exception" in cell.metadata.get("tags", [])
    )

    await run_hook(self.on_cell_execute, cell=cell, cell_index=cell_index)
    parent_msg_id = await ensure_async(
        self.kc.execute(
            cell.source, store_history=store_history, stop_on_error=not cell_allows_errors
        )
    )
    await run_hook(self.on_cell_complete, cell=cell, cell_index=cell_index)
    # We launched a code cell to execute
    self.code_cells_executed += 1
    exec_timeout = self._get_timeout(cell)

    cell.outputs = []
    self.clear_before_next_output = False

    task_poll_kernel_alive = asyncio.ensure_future(self._async_poll_kernel_alive())
    task_poll_output_msg = asyncio.ensure_future(
        self._async_poll_output_msg(parent_msg_id, cell, cell_index)
    )
    self.task_poll_for_reply = asyncio.ensure_future(
        self._async_poll_for_reply(
            parent_msg_id, cell, exec_timeout, task_poll_output_msg, task_poll_kernel_alive
        )
    )
    try:
        exec_reply = await self.task_poll_for_reply
    except asyncio.CancelledError:
        # can only be cancelled by task_poll_kernel_alive when the kernel is dead
        task_poll_output_msg.cancel()
        raise DeadKernelError("Kernel died")
    except Exception as e:
        # Best effort to cancel request if it hasn't been resolved
        try:
            # Check if the task_poll_output is doing the raising for us
            if not isinstance(e, CellControlSignal):
                task_poll_output_msg.cancel()
        finally:
            raise

    if execution_count:
        cell['execution_count'] = execution_count
    await run_hook(
        self.on_cell_executed, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
  await self._check_raise_for_error(cell, cell_index, exec_reply)

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:1022:


self = <testbook.client.TestbookNotebookClient object at 0x7f38467b3f40>
cell = {'id': '5dcd84e1', 'cell_type': 'code', 'metadata': {'execution': {'iopub.status.busy': '2022-08-10T18:08:29.559238Z',...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}
cell_index = 53
exec_reply = {'buffers': [], 'content': {'ename': 'InferenceServerException', 'engine_info': {'engine_id': -1, 'engine_uuid': '883a...e, 'engine': '883a1b48-08f5-451b-adc9-400132eefb5b', 'started': '2022-08-10T18:08:29.559503Z', 'status': 'error'}, ...}

async def _check_raise_for_error(
    self, cell: NotebookNode, cell_index: int, exec_reply: t.Optional[t.Dict]
) -> None:

    if exec_reply is None:
        return None

    exec_reply_content = exec_reply['content']
    if exec_reply_content['status'] != 'error':
        return None

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors
        or exec_reply_content.get('ename') in self.allow_error_names
        or "raises-exception" in cell.metadata.get("tags", [])
    )
    await run_hook(
        self.on_cell_error, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
    if not cell_allows_errors:
      raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)

E nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
E ------------------
E
E import shutil
E from merlin.models.loader.tf_utils import configure_tensorflow
E configure_tensorflow()
E from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E response = run_ensemble_on_tritonserver(
E "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E )
E response = [x.tolist()[0] for x in response["ordered_ids"]]
E shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E ------------------
E
E ---------------------------------------------------------------------------
E InferenceServerException                 Traceback (most recent call last)
E Input In [32], in <cell line: 5>()
E       3 configure_tensorflow()
E       4 from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E ----> 5 response = run_ensemble_on_tritonserver(
E       6     "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E       7 )
E       8 response = [x.tolist()[0] for x in response["ordered_ids"]]
E       9 shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:93, in run_ensemble_on_tritonserver(tmpdir, output_columns, df, model_name)
E      91 response = None
E      92 with run_triton_server(tmpdir) as client:
E ---> 93     response = send_triton_request(df, output_columns, client=client, triton_model=model_name)
E      95 return response
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:141, in send_triton_request(df, outputs_list, client, endpoint, request_id, triton_model)
E     139 outputs = [grpcclient.InferRequestedOutput(col) for col in outputs_list]
E     140 with client:
E --> 141     response = client.infer(triton_model, inputs, request_id=request_id, outputs=outputs)
E     143 results = {}
E     144 for col in outputs_list:
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:1322, in InferenceServerClient.infer(self, model_name, inputs, model_version, outputs, request_id, sequence_id, sequence_start, sequence_end, priority, timeout, client_timeout, headers, compression_algorithm)
E    1320     return result
E    1321 except grpc.RpcError as rpc_error:
E -> 1322     raise_error_grpc(rpc_error)
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:62, in raise_error_grpc(rpc_error)
E      61 def raise_error_grpc(rpc_error):
E ---> 62     raise get_error_grpc(rpc_error) from None
E
E InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E     1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E
E Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E
E At:
E   /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute
E
E InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E     1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E
E Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E
E At:
E   /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:916: CellExecutionError

During handling of the above exception, another exception occurred:

def test_func():
    with testbook(
        REPO_ROOT
        / "examples"
        / "Building-and-deploying-multi-stage-RecSys"
        / "01-Building-Recommender-Systems-with-Merlin.ipynb",
        execute=False,
    ) as tb1:
        tb1.inject(
            """
            import os
            os.environ["DATA_FOLDER"] = "/tmp/data/"
            os.environ["NUM_ROWS"] = "10000"
            os.system("mkdir -p /tmp/examples")
            os.environ["BASE_DIR"] = "/tmp/examples/"
            """
        )
        tb1.execute()
        assert os.path.isdir("/tmp/examples/dlrm")
        assert os.path.isdir("/tmp/examples/feature_repo")
        assert os.path.isdir("/tmp/examples/query_tower")
        assert os.path.isfile("/tmp/examples/item_embeddings.parquet")
        assert os.path.isfile("/tmp/examples/feature_repo/user_features.py")
        assert os.path.isfile("/tmp/examples/feature_repo/item_features.py")

    with testbook(
        REPO_ROOT
        / "examples"
        / "Building-and-deploying-multi-stage-RecSys"
        / "02-Deploying-multi-stage-RecSys-with-Merlin-Systems.ipynb",
        execute=False,
    ) as tb2:
        tb2.inject(
            """
            import os
            os.environ["DATA_FOLDER"] = "/tmp/data/"
            os.environ["BASE_DIR"] = "/tmp/examples/"
            """
        )
        NUM_OF_CELLS = len(tb2.cells)
        tb2.execute_cell(list(range(0, NUM_OF_CELLS - 3)))
        top_k = tb2.ref("top_k")
        outputs = tb2.ref("outputs")
        assert outputs[0] == "ordered_ids"
      tb2.inject(
            """
            import shutil
            from merlin.models.loader.tf_utils import configure_tensorflow
            configure_tensorflow()
            from merlin.systems.triton.utils import run_ensemble_on_tritonserver
            response = run_ensemble_on_tritonserver(
                "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
            )
            response = [x.tolist()[0] for x in response["ordered_ids"]]
            shutil.rmtree("/tmp/examples/", ignore_errors=True)
            """
        )

tests/unit/examples/test_building_deploying_multi_stage_RecSys.py:57:


/usr/local/lib/python3.8/dist-packages/testbook/client.py:237: in inject
cell = TestbookNode(self.execute_cell(inject_idx)) if run else TestbookNode(code_cell)


self = <testbook.client.TestbookNotebookClient object at 0x7f38467b3f40>
cell = [53], kwargs = {}, cell_indexes = [53], executed_cells = [], idx = 53

def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:
            cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)
        except CellExecutionError as ce:
          raise TestbookRuntimeError(ce.evalue, ce, self._get_error_class(ce.ename))

E testbook.exceptions.TestbookRuntimeError: An error occurred while executing the following cell:
E ------------------
E
E import shutil
E from merlin.models.loader.tf_utils import configure_tensorflow
E configure_tensorflow()
E from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E response = run_ensemble_on_tritonserver(
E "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E )
E response = [x.tolist()[0] for x in response["ordered_ids"]]
E shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E ------------------
E
E ---------------------------------------------------------------------------
E InferenceServerException                 Traceback (most recent call last)
E Input In [32], in <cell line: 5>()
E       3 configure_tensorflow()
E       4 from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E ----> 5 response = run_ensemble_on_tritonserver(
E       6     "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E       7 )
E       8 response = [x.tolist()[0] for x in response["ordered_ids"]]
E       9 shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:93, in run_ensemble_on_tritonserver(tmpdir, output_columns, df, model_name)
E      91 response = None
E      92 with run_triton_server(tmpdir) as client:
E ---> 93     response = send_triton_request(df, output_columns, client=client, triton_model=model_name)
E      95 return response
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:141, in send_triton_request(df, outputs_list, client, endpoint, request_id, triton_model)
E     139 outputs = [grpcclient.InferRequestedOutput(col) for col in outputs_list]
E     140 with client:
E --> 141     response = client.infer(triton_model, inputs, request_id=request_id, outputs=outputs)
E     143 results = {}
E     144 for col in outputs_list:
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:1322, in InferenceServerClient.infer(self, model_name, inputs, model_version, outputs, request_id, sequence_id, sequence_start, sequence_end, priority, timeout, client_timeout, headers, compression_algorithm)
E    1320     return result
E    1321 except grpc.RpcError as rpc_error:
E -> 1322     raise_error_grpc(rpc_error)
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:62, in raise_error_grpc(rpc_error)
E      61 def raise_error_grpc(rpc_error):
E ---> 62     raise get_error_grpc(rpc_error) from None
E
E InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E     1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E
E Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E
E At:
E   /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute
E
E InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E     1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E
E Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E
E At:
E   /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute

/usr/local/lib/python3.8/dist-packages/testbook/client.py:135: TestbookRuntimeError
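Note: the root cause of this failure is the nested TypeError reported by the '3_queryfeast' Python backend. The Feast transform (feast.py, line 299 in the frame summary above) ends up calling int() on a feature value that came back as None, and the backend then constructs its InferenceResponse with unsupported keyword arguments. A minimal sketch of that inner failure mode; the variable names here are illustrative, not Merlin's actual code:

    # Minimal sketch of the inner error above: a feature value that the
    # online store returns as None cannot be cast to int. The names are
    # illustrative; the real cast happens inside merlin.systems' Feast op.
    feature_value = None  # hypothetical missing feature from the online store
    try:
        int(feature_value)
    except TypeError as exc:
        # Prints: int() argument must be a string, a bytes-like object or a
        # number, not 'NoneType' -- the message embedded in the Triton error.
        print(exc)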
----------------------------- Captured stdout call -----------------------------
Signal (2) received.
----------------------------- Captured stderr call -----------------------------
2022-08-10 18:06:48.220427: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-10 18:06:50.210235: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-08-10 18:06:50.211024: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14532 MB memory: -> device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib/python3.8/logging/init.py", line 2127, in shutdown
h.close()
File "/usr/local/lib/python3.8/dist-packages/absl/logging/init.py", line 934, in close
self.stream.close()
File "/usr/local/lib/python3.8/dist-packages/ipykernel/iostream.py", line 438, in close
self.watch_fd_thread.join()
AttributeError: 'OutStream' object has no attribute 'watch_fd_thread'
WARNING clustering 268 points to 32 centroids: please provide at least 1248 training points
2022-08-10 18:08:22.651716: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-10 18:08:24.652856: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-08-10 18:08:24.653591: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14532 MB memory: -> device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0
I0810 18:08:29.827606 24551 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7fa97e000000' with size 268435456
I0810 18:08:29.828420 24551 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0810 18:08:29.835839 24551 model_repository_manager.cc:1191] loading: 0_queryfeast:1
I0810 18:08:29.936144 24551 model_repository_manager.cc:1191] loading: 1_predicttensorflow:1
I0810 18:08:29.943250 24551 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 0_queryfeast (GPU device 0)
I0810 18:08:30.036495 24551 model_repository_manager.cc:1191] loading: 2_queryfaiss:1
I0810 18:08:30.136716 24551 model_repository_manager.cc:1191] loading: 3_queryfeast:1
I0810 18:08:30.237054 24551 model_repository_manager.cc:1191] loading: 4_unrollfeatures:1
I0810 18:08:30.337279 24551 model_repository_manager.cc:1191] loading: 5_predicttensorflow:1
I0810 18:08:30.437571 24551 model_repository_manager.cc:1191] loading: 6_softmaxsampling:1
I0810 18:08:32.378761 24551 model_repository_manager.cc:1345] successfully loaded '0_queryfeast' version 1
I0810 18:08:32.668568 24551 tensorflow.cc:2181] TRITONBACKEND_Initialize: tensorflow
I0810 18:08:32.668603 24551 tensorflow.cc:2191] Triton TRITONBACKEND API version: 1.9
I0810 18:08:32.668611 24551 tensorflow.cc:2197] 'tensorflow' TRITONBACKEND API version: 1.9
I0810 18:08:32.668616 24551 tensorflow.cc:2221] backend configuration:
{"cmdline":{"auto-complete-config":"false","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","version":"2","default-max-batch-size":"4"}}
I0810 18:08:32.668651 24551 tensorflow.cc:2281] TRITONBACKEND_ModelInitialize: 1_predicttensorflow (version 1)
I0810 18:08:32.672249 24551 tensorflow.cc:2281] TRITONBACKEND_ModelInitialize: 5_predicttensorflow (version 1)
I0810 18:08:32.673493 24551 tensorflow.cc:2330] TRITONBACKEND_ModelInstanceInitialize: 1_predicttensorflow (GPU device 0)
2022-08-10 18:08:33.014517: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-08-10 18:08:33.017894: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2022-08-10 18:08:33.017931: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-08-10 18:08:33.018035: I tensorflow/core/platform/cpu_feature_guard.cc:152] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE3 SSE4.1 SSE4.2 AVX
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-10 18:08:33.056360: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 11484 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-08-10 18:08:33.100244: I tensorflow/cc/saved_model/loader.cc:230] Restoring SavedModel bundle.
2022-08-10 18:08:33.160045: I tensorflow/cc/saved_model/loader.cc:214] Running initialization op on SavedModel bundle at path: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-08-10 18:08:33.196722: I tensorflow/cc/saved_model/loader.cc:321] SavedModel load for tags { serve }; Status: success: OK. Took 182224 microseconds.
I0810 18:08:33.196847 24551 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 2_queryfaiss (GPU device 0)
I0810 18:08:33.196942 24551 model_repository_manager.cc:1345] successfully loaded '1_predicttensorflow' version 1
I0810 18:08:35.648965 24551 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 3_queryfeast (GPU device 0)
I0810 18:08:35.651926 24551 model_repository_manager.cc:1345] successfully loaded '2_queryfaiss' version 1
I0810 18:08:38.045045 24551 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 4_unrollfeatures (GPU device 0)
I0810 18:08:38.045324 24551 model_repository_manager.cc:1345] successfully loaded '3_queryfeast' version 1
I0810 18:08:40.153527 24551 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 6_softmaxsampling (GPU device 0)
I0810 18:08:40.153760 24551 model_repository_manager.cc:1345] successfully loaded '4_unrollfeatures' version 1
I0810 18:08:42.221072 24551 tensorflow.cc:2330] TRITONBACKEND_ModelInstanceInitialize: 5_predicttensorflow (GPU device 0)
I0810 18:08:42.221322 24551 model_repository_manager.cc:1345] successfully loaded '6_softmaxsampling' version 1
2022-08-10 18:08:42.221989: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-08-10 18:08:42.241253: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2022-08-10 18:08:42.241300: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-08-10 18:08:42.243360: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 11484 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-08-10 18:08:42.265962: I tensorflow/cc/saved_model/loader.cc:230] Restoring SavedModel bundle.
2022-08-10 18:08:42.424432: I tensorflow/cc/saved_model/loader.cc:214] Running initialization op on SavedModel bundle at path: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-08-10 18:08:42.476820: I tensorflow/cc/saved_model/loader.cc:321] SavedModel load for tags { serve }; Status: success: OK. Took 254844 microseconds.
I0810 18:08:42.477220 24551 model_repository_manager.cc:1345] successfully loaded '5_predicttensorflow' version 1
I0810 18:08:42.478775 24551 model_repository_manager.cc:1191] loading: ensemble_model:1
I0810 18:08:42.579483 24551 model_repository_manager.cc:1345] successfully loaded 'ensemble_model' version 1
I0810 18:08:42.579617 24551 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0810 18:08:42.579677 24551 server.cc:583]
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| tensorflow | /opt/tritonserver/backends/tensorflow2/libtriton_tensorflow2.so | {"cmdline":{"auto-complete-config":"false","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","version":"2","default-max-batch-size":"4"}} |
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0810 18:08:42.579746 24551 server.cc:626]
+---------------------+---------+--------+
| Model | Version | Status |
+---------------------+---------+--------+
| 0_queryfeast | 1 | READY |
| 1_predicttensorflow | 1 | READY |
| 2_queryfaiss | 1 | READY |
| 3_queryfeast | 1 | READY |
| 4_unrollfeatures | 1 | READY |
| 5_predicttensorflow | 1 | READY |
| 6_softmaxsampling | 1 | READY |
| ensemble_model | 1 | READY |
+---------------------+---------+--------+

I0810 18:08:42.641929 24551 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0810 18:08:42.642745 24551 tritonserver.cc:2138]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.22.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /tmp/examples/poc_ensemble |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0810 18:08:42.643549 24551 grpc_server.cc:4589] Started GRPCInferenceService at 0.0.0.0:8001
I0810 18:08:42.644119 24551 http_server.cc:3303] Started HTTPService at 0.0.0.0:8000
I0810 18:08:42.685300 24551 http_server.cc:178] Started Metrics Service at 0.0.0.0:8002
W0810 18:08:43.661684 24551 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0810 18:08:43.661752 24551 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
W0810 18:08:44.661915 24551 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0810 18:08:44.661975 24551 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
W0810 18:08:45.680863 24551 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0810 18:08:45.680936 24551 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
E0810 18:08:46.164010 24888 pb_stub.cc:749] Failed to process the request(s) for model '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)

Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"

At:
/tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute

I0810 18:08:46.168559 24551 server.cc:257] Waiting for in-flight requests to complete.
I0810 18:08:46.168585 24551 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0810 18:08:46.168594 24551 model_repository_manager.cc:1223] unloading: ensemble_model:1
I0810 18:08:46.168669 24551 model_repository_manager.cc:1223] unloading: 6_softmaxsampling:1
I0810 18:08:46.168720 24551 model_repository_manager.cc:1223] unloading: 5_predicttensorflow:1
I0810 18:08:46.168810 24551 model_repository_manager.cc:1223] unloading: 4_unrollfeatures:1
I0810 18:08:46.168808 24551 model_repository_manager.cc:1328] successfully unloaded 'ensemble_model' version 1
I0810 18:08:46.168838 24551 model_repository_manager.cc:1223] unloading: 3_queryfeast:1
I0810 18:08:46.168855 24551 tensorflow.cc:2368] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0810 18:08:46.168882 24551 model_repository_manager.cc:1223] unloading: 2_queryfaiss:1
I0810 18:08:46.168905 24551 model_repository_manager.cc:1223] unloading: 1_predicttensorflow:1
I0810 18:08:46.168922 24551 tensorflow.cc:2307] TRITONBACKEND_ModelFinalize: delete model state
I0810 18:08:46.169019 24551 model_repository_manager.cc:1223] unloading: 0_queryfeast:1
I0810 18:08:46.169044 24551 server.cc:288] All models are stopped, unloading models
I0810 18:08:46.169054 24551 server.cc:295] Timeout 30: Found 7 live models and 0 in-flight non-inference requests
I0810 18:08:46.169205 24551 tensorflow.cc:2368] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0810 18:08:46.169421 24551 tensorflow.cc:2307] TRITONBACKEND_ModelFinalize: delete model state
I0810 18:08:46.183427 24551 model_repository_manager.cc:1328] successfully unloaded '5_predicttensorflow' version 1
I0810 18:08:46.185354 24551 model_repository_manager.cc:1328] successfully unloaded '1_predicttensorflow' version 1
I0810 18:08:47.169175 24551 server.cc:295] Timeout 29: Found 5 live models and 0 in-flight non-inference requests
I0810 18:08:47.804515 24551 model_repository_manager.cc:1328] successfully unloaded '4_unrollfeatures' version 1
I0810 18:08:47.828810 24551 model_repository_manager.cc:1328] successfully unloaded '6_softmaxsampling' version 1
I0810 18:08:47.856567 24551 model_repository_manager.cc:1328] successfully unloaded '2_queryfaiss' version 1
I0810 18:08:48.169356 24551 server.cc:295] Timeout 28: Found 2 live models and 0 in-flight non-inference requests
I0810 18:08:49.169478 24551 server.cc:295] Timeout 27: Found 2 live models and 0 in-flight non-inference requests
I0810 18:08:50.169613 24551 server.cc:295] Timeout 26: Found 2 live models and 0 in-flight non-inference requests
I0810 18:08:51.169751 24551 server.cc:295] Timeout 25: Found 2 live models and 0 in-flight non-inference requests
I0810 18:08:52.169880 24551 server.cc:295] Timeout 24: Found 2 live models and 0 in-flight non-inference requests
I0810 18:08:53.169997 24551 server.cc:295] Timeout 23: Found 2 live models and 0 in-flight non-inference requests
I0810 18:08:54.170130 24551 server.cc:295] Timeout 22: Found 2 live models and 0 in-flight non-inference requests
I0810 18:08:55.170268 24551 server.cc:295] Timeout 21: Found 2 live models and 0 in-flight non-inference requests
/usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py:15: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ValueType.FLOAT: (np.float, False, False),
I0810 18:08:55.554126 24551 model_repository_manager.cc:1328] successfully unloaded '3_queryfeast' version 1
I0810 18:08:56.170396 24551 server.cc:295] Timeout 20: Found 1 live models and 0 in-flight non-inference requests
/usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py:15: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ValueType.FLOAT: (np.float, False, False),
I0810 18:08:57.018423 24551 model_repository_manager.cc:1328] successfully unloaded '0_queryfeast' version 1
I0810 18:08:57.170546 24551 server.cc:295] Timeout 19: Found 0 live models and 0 in-flight non-inference requests
__________________________________ test_func ___________________________________

def test_func():
    with testbook(
        REPO_ROOT / "examples" / "scaling-criteo" / "02-ETL-with-NVTabular.ipynb",
        execute=False,
        timeout=180,
    ) as tb1:
        tb1.inject(
            """
            import os
            os.environ["BASE_DIR"] = "/tmp/input/criteo/"
            os.environ["INPUT_DATA_DIR"] = "/tmp/input/criteo/"
            os.environ["OUTPUT_DATA_DIR"] = "/tmp/output/criteo/"
            os.system("mkdir -p /tmp/input/criteo")
            os.system("mkdir -p /tmp/output/criteo")

            from merlin.datasets.synthetic import generate_data

            train, valid = generate_data("criteo", int(1000000), set_sizes=(0.7, 0.3))

            train.to_ddf().compute().to_parquet('/tmp/input/criteo/day_0.parquet')
            valid.to_ddf().compute().to_parquet('/tmp/input/criteo/day_1.parquet')
            """
        )
      tb1.execute()

tests/unit/examples/test_scaling_criteo_merlin_models.py:33:


/usr/local/lib/python3.8/dist-packages/testbook/client.py:147: in execute
super().execute_cell(cell, index)
/usr/local/lib/python3.8/dist-packages/nbclient/util.py:85: in wrapped
return just_run(coro(*args, **kwargs))
/usr/local/lib/python3.8/dist-packages/nbclient/util.py:60: in just_run
return loop.run_until_complete(coro)
/usr/lib/python3.8/asyncio/base_events.py:616: in run_until_complete
return future.result()
/usr/local/lib/python3.8/dist-packages/nbclient/client.py:1022: in async_execute_cell
await self._check_raise_for_error(cell, cell_index, exec_reply)


self = <testbook.client.TestbookNotebookClient object at 0x7f3845d05610>
cell = {'id': '2f821df7', 'cell_type': 'code', 'metadata': {'execution': {'iopub.status.busy': '2022-08-10T18:09:18.256954Z',...ry: CUDA error at: /usr/include/rmm/mr/device/cuda_memory_resource.hpp:70: cudaErrorMemoryAllocation out of memory']}]}
cell_index = 27
exec_reply = {'buffers': [], 'content': {'ename': 'MemoryError', 'engine_info': {'engine_id': -1, 'engine_uuid': '92a2eefd-1dda-4b0...e, 'engine': '92a2eefd-1dda-4b05-88c7-d5bbb4e572d3', 'started': '2022-08-10T18:09:18.257190Z', 'status': 'error'}, ...}

async def _check_raise_for_error(
    self, cell: NotebookNode, cell_index: int, exec_reply: t.Optional[t.Dict]
) -> None:

    if exec_reply is None:
        return None

    exec_reply_content = exec_reply['content']
    if exec_reply_content['status'] != 'error':
        return None

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors
        or exec_reply_content.get('ename') in self.allow_error_names
        or "raises-exception" in cell.metadata.get("tags", [])
    )
    await run_hook(
        self.on_cell_error, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
    if not cell_allows_errors:
      raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)

E nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
E ------------------
E
E import os
E os.environ["BASE_DIR"] = "/tmp/input/criteo/"
E os.environ["INPUT_DATA_DIR"] = "/tmp/input/criteo/"
E os.environ["OUTPUT_DATA_DIR"] = "/tmp/output/criteo/"
E os.system("mkdir -p /tmp/input/criteo")
E os.system("mkdir -p /tmp/output/criteo")
E
E from merlin.datasets.synthetic import generate_data
E
E train, valid = generate_data("criteo", int(1000000), set_sizes=(0.7, 0.3))
E
E train.to_ddf().compute().to_parquet('/tmp/input/criteo/day_0.parquet')
E valid.to_ddf().compute().to_parquet('/tmp/input/criteo/day_1.parquet')
E
E ------------------
E
E ---------------------------------------------------------------------------
E MemoryError                              Traceback (most recent call last)
E Input In [16], in <cell line: 10>()
E       6 os.system("mkdir -p /tmp/output/criteo")
E       8 from merlin.datasets.synthetic import generate_data
E ---> 10 train, valid = generate_data("criteo", int(1000000), set_sizes=(0.7, 0.3))
E      12 train.to_ddf().compute().to_parquet('/tmp/input/criteo/day_0.parquet')
E      13 valid.to_ddf().compute().to_parquet('/tmp/input/criteo/day_1.parquet')
E
E File /usr/local/lib/python3.8/dist-packages/merlin/datasets/synthetic.py:129, in generate_data(input, num_rows, set_sizes, min_session_length, max_session_length, device)
E     126     start_i = end_i
E     127     output_datasets.append(set_df)
E --> 129     return tuple([merlin.io.Dataset(d, schema=schema) for d in output_datasets])
E     131 return merlin.io.Dataset(df, schema=schema)
E
E File /usr/local/lib/python3.8/dist-packages/merlin/datasets/synthetic.py:129, in <listcomp>(.0)
E     126     start_i = end_i
E     127     output_datasets.append(set_df)
E --> 129     return tuple([merlin.io.Dataset(d, schema=schema) for d in output_datasets])
E     131 return merlin.io.Dataset(df, schema=schema)
E
E File /usr/local/lib/python3.8/dist-packages/merlin/io/dataset.py:262, in Dataset.__init__(self, path_or_source, engine, npartitions, part_size, part_mem_fraction, storage_options, dtypes, client, cpu, base_dataset, schema, **kwargs)
E     256 npartitions = npartitions or 1
E     257 if isinstance(path_or_source, dask.dataframe.DataFrame) or is_dataframe_object(
E     258     path_or_source
E     259 ):
E     260     # User is passing in a <dask.dataframe|cudf|pd>.DataFrame
E     261     # Use DataFrameDatasetEngine
E --> 262     _path_or_source = convert_data(
E     263         path_or_source, cpu=self.cpu, to_collection=True, npartitions=npartitions
E     264     )
E     265     # Check if this is a collection that has now moved between host <-> device
E     266     moved_collection = isinstance(path_or_source, dask.dataframe.DataFrame) and (
E     267         not isinstance(_path_or_source._meta, type(path_or_source._meta))
E     268     )
E
E File /usr/local/lib/python3.8/dist-packages/merlin/core/dispatch.py:567, in convert_data(x, cpu, to_collection, npartitions)
E     565     _x = cudf.DataFrame.from_arrow(x)
E     566 elif isinstance(x, pd.DataFrame):
E --> 567     _x = cudf.DataFrame.from_pandas(x)
E     568 # Output a collection if to_collection=True
E     569 return (
E     570     dask_cudf.from_cudf(_x, sort=False, npartitions=npartitions)
E     571     if to_collection
E     572     else _x
E     573 )
E
E File /usr/local/lib/python3.8/dist-packages/nvtx/nvtx.py:101, in annotate.__call__.<locals>.inner(*args, **kwargs)
E      98 @wraps(func)
E      99 def inner(*args, **kwargs):
E     100     libnvtx_push_range(self.attributes, self.domain.handle)
E --> 101     result = func(*args, **kwargs)
E     102     libnvtx_pop_range(self.domain.handle)
E     103     return result
E
E File /usr/local/lib/python3.8/dist-packages/cudf/core/dataframe.py:4386, in DataFrame.from_pandas(cls, dataframe, nan_as_null)
E    4382 for col_name, col_value in dataframe.items():
E    4383     # necessary because multi-index can return multiple
E    4384     # columns for a single key
E    4385     if len(col_value.shape) == 1:
E -> 4386         df[col_name] = column.as_column(
E    4387             col_value.array, nan_as_null=nan_as_null
E    4388         )
E    4389     else:
E    4390         vals = col_value.values.T
E
E File /usr/local/lib/python3.8/dist-packages/cudf/core/column/column.py:1983, in as_column(arbitrary, nan_as_null, dtype, length)
E    1981     data = as_column(pa.Array.from_pandas(arbitrary), dtype=arb_dtype)
E    1982 else:
E -> 1983     data = as_column(
E    1984         pa.array(
E    1985             arbitrary,
E    1986             from_pandas=True if nan_as_null is None else nan_as_null,
E    1987         ),
E    1988         nan_as_null=nan_as_null,
E    1989     )
E    1990 if dtype is not None:
E    1991     data = data.astype(dtype)
E
E File /usr/local/lib/python3.8/dist-packages/cudf/core/column/column.py:1776, in as_column(arbitrary, nan_as_null, dtype, length)
E    1770 if isinstance(arbitrary, pa.lib.HalfFloatArray):
E    1771     raise NotImplementedError(
E    1772         "Type casting from `float16` to `float32` is not "
E    1773         "yet supported in pyarrow, see: "
E    1774         "https://issues.apache.org/jira/browse/ARROW-3802"
E    1775     )
E -> 1776 col = ColumnBase.from_arrow(arbitrary)
E    1778 if isinstance(arbitrary, pa.NullArray):
E    1779     new_dtype = cudf.dtype(arbitrary.type.to_pandas_dtype())
E
E File /usr/local/lib/python3.8/dist-packages/cudf/core/column/column.py:302, in ColumnBase.from_arrow(cls, array)
E     297 elif isinstance(
E     298     array.type, pd.core.arrays._arrow_utils.ArrowIntervalType
E     299 ):
E     300     return cudf.core.column.IntervalColumn.from_arrow(array)
E --> 302 result = libcudf.interop.from_arrow(data, data.column_names)[0]["None"]
E     304 return result._with_type_metadata(cudf_dtype_from_pa_type(array.type))
E
E File cudf/_lib/interop.pyx:167, in cudf._lib.interop.from_arrow()
E
E MemoryError: std::bad_alloc: out_of_memory: CUDA error at: /usr/include/rmm/mr/device/cuda_memory_resource.hpp:70: cudaErrorMemoryAllocation out of memory
E MemoryError: std::bad_alloc: out_of_memory: CUDA error at: /usr/include/rmm/mr/device/cuda_memory_resource.hpp:70: cudaErrorMemoryAllocation out of memory

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:916: CellExecutionError
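Note: this second failure is a GPU out-of-memory error. generate_data builds the synthetic Criteo frames in pandas, and the cudf.DataFrame.from_pandas conversion exhausts device memory. One plausible mitigation on a memory-constrained CI GPU is simply to generate fewer rows; a hedged sketch using the same generate_data API as the failing cell (the row count is an assumption, not a tested value):

    # Sketch: shrink the synthetic Criteo dataset so the cudf conversion
    # fits in GPU memory. Same generate_data call as the failing cell,
    # only with fewer rows.
    from merlin.datasets.synthetic import generate_data

    num_rows = 100_000  # assumption: an order of magnitude fewer rows than the failing run
    train, valid = generate_data("criteo", num_rows, set_sizes=(0.7, 0.3))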
----------------------------- Captured stderr call -----------------------------
2022-08-10 18:09:10,092 - distributed.preloading - INFO - Import preload module: dask_cuda.initialize
2022-08-10 18:09:10,096 - distributed.preloading - INFO - Import preload module: dask_cuda.initialize
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(
/usr/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 18 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
=========================== short test summary info ============================
FAILED tests/unit/examples/test_building_deploying_multi_stage_RecSys.py::test_func
FAILED tests/unit/examples/test_scaling_criteo_merlin_models.py::test_func - ...
=================== 2 failed, 1 passed in 163.07s (0:02:43) ====================
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://github.com/gitapi/repos/NVIDIA-Merlin/Merlin/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_merlin] $ /bin/bash /tmp/jenkins1543561617175394840.sh
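Note: both failures surface through the cell_allows_errors check shown in the nbclient source above: a cell may raise without failing the run only when allow_errors is set, the exception name is in allow_error_names, or the cell carries a "raises-exception" tag. If a notebook cell were genuinely expected to raise, it could be tagged via ordinary cell metadata; a hedged sketch with nbformat (the notebook path is the one from the test, and tagging the last cell is purely illustrative):

    # Sketch: mark a cell so nbclient's cell_allows_errors check passes.
    # "raises-exception" is the same tag checked in the traceback above.
    import nbformat

    path = "01-Building-Recommender-Systems-with-Merlin.ipynb"  # illustrative
    nb = nbformat.read(path, as_version=4)
    nb.cells[-1].metadata.setdefault("tags", []).append("raises-exception")
    nbformat.write(nb, path)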
