
Adds integration tests to Merlin Models #438

Merged
merged 3 commits, Jul 20, 2022

Conversation


@gabrielspmoreira (Member) commented on Jul 7, 2022

This PR runs integration tests for Merlin Models:

  • Retrieval models (MF and Two-Tower), including accuracy and performance (runtime, throughput) metrics logged to W&B (see the sketch below)
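
A minimal sketch of what that accuracy/performance logging can look like, assuming the wandb client library; the project name, helper functions, and metric keys below are illustrative, not the actual Merlin Models test code (only wandb.init/log/finish are the real W&B API):

    # Sketch: log accuracy and performance metrics to W&B from an integration test.
    import time

    import wandb


    def run_and_log(model_name, train_fn, eval_fn, num_examples):
        run = wandb.init(project="merlin-models-integration", name=model_name)
        start = time.perf_counter()
        train_fn()  # e.g. fit an MF or Two-Tower retrieval model
        runtime_s = time.perf_counter() - start
        metrics = eval_fn()  # e.g. {"recall_at_10": 0.42}
        wandb.log(
            {
                **metrics,
                "runtime_s": runtime_s,
                "throughput_examples_per_s": num_examples / runtime_s,
            }
        )
        run.finish()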


github-actions bot commented Jul 7, 2022

Documentation preview

https://nvidia-merlin.github.io/Merlin/review/pr-438

@nvidia-merlin-bot (Contributor) commented:

CI Results:
GitHub pull request #438 of commit c2a14c46544f9f91730dddb0324e50b01b92d222, no merge conflicts.
Running as SYSTEM
Setting status of c2a14c46544f9f91730dddb0324e50b01b92d222 to PENDING with url https://10.20.13.93:8080/job/merlin_merlin/232/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_merlin
using credential systems-login
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/Merlin # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Merlin
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Merlin +refs/pull/438/*:refs/remotes/origin/pr/438/* # timeout=10
 > git rev-parse c2a14c46544f9f91730dddb0324e50b01b92d222^{commit} # timeout=10
Checking out Revision c2a14c46544f9f91730dddb0324e50b01b92d222 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f c2a14c46544f9f91730dddb0324e50b01b92d222 # timeout=10
Commit message: "Added integration tests to Merlin Models"
 > git rev-list --no-walk 43d5ab866d6e360ba7de67d50dd6e76d99473049 # timeout=10
[merlin_merlin] $ /bin/bash /tmp/jenkins15760467702778833485.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_merlin/merlin
plugins: anyio-3.5.0, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 2 items

tests/unit/test_version.py . [ 50%]
tests/unit/examples/test_building_deploying_multi_stage_RecSys.py . [100%]

======================== 2 passed in 140.17s (0:02:20) =========================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://github.com/gitapi/repos/NVIDIA-Merlin/Merlin/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_merlin] $ /bin/bash /tmp/jenkins13601818796945399604.sh

@@ -101,6 +101,10 @@ echo "#####################"
echo "Run integration tests for NVTabular"
/nvtabular/ci/test_integration.sh $container $devices --report 1

# Test Merlin Models
echo "Run integration tests for Merlin Models"
/models/ci/test_integration.sh $container $devices
@jperez999 (Collaborator) commented:

Is W&B set to stream metrics by default? If it is, can we invert it, so that reporting is off by default and has to be explicitly turned on?

@gabrielspmoreira (Member, Author) commented:

Yes, @jperez999. For integration tests, W&B is set to stream metrics by default. What would be a reason for disabling that?
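
For reference, a minimal sketch of the opt-in behavior suggested above, assuming the wandb client; the REPORT_TO_WANDB variable name is hypothetical, while mode="disabled" is the documented way to make wandb.log a no-op:

    # Sketch: W&B reporting disabled unless explicitly enabled via env var.
    import os

    import wandb

    report = os.environ.get("REPORT_TO_WANDB", "0") == "1"  # hypothetical flag
    run = wandb.init(
        project="merlin-models-integration",
        mode="online" if report else "disabled",
    )
    wandb.log({"recall_at_10": 0.42})  # dropped silently when disabled
    run.finish()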

@gabrielspmoreira (Member, Author) commented:

@jperez999 The PR introducing the retrieval integration tests into Merlin Models has already been merged, so we can now merge this PR to enable running those integration tests nightly on Blossom.

@nvidia-merlin-bot (Contributor) commented:

CI Results:
GitHub pull request #438 of commit 39677fa4eb9b65b20522c7e816b998eb2e3c31d3, no merge conflicts.
Running as SYSTEM
Setting status of 39677fa4eb9b65b20522c7e816b998eb2e3c31d3 to PENDING with url https://10.20.13.93:8080/job/merlin_merlin/258/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_merlin
using credential systems-login
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/Merlin # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Merlin
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Merlin +refs/pull/438/*:refs/remotes/origin/pr/438/* # timeout=10
 > git rev-parse 39677fa4eb9b65b20522c7e816b998eb2e3c31d3^{commit} # timeout=10
Checking out Revision 39677fa4eb9b65b20522c7e816b998eb2e3c31d3 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 39677fa4eb9b65b20522c7e816b998eb2e3c31d3 # timeout=10
Commit message: "Enabling calling Merlin Models integration tests after rebasing"
 > git rev-list --no-walk e636ce63a75841e6054c975ebdde60e104ad51f8 # timeout=10
[merlin_merlin] $ /bin/bash /tmp/jenkins2365010800849903378.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_merlin/merlin
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 2 items

tests/unit/test_version.py . [ 50%]
tests/unit/examples/test_building_deploying_multi_stage_RecSys.py . [100%]

======================== 2 passed in 111.26s (0:01:51) =========================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://github.com/gitapi/repos/NVIDIA-Merlin/Merlin/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_merlin] $ /bin/bash /tmp/jenkins2019134160873880517.sh

@nvidia-merlin-bot (Contributor) commented:

CI Results:
GitHub pull request #438 of commit dd36e3afd92da6d92cdab47fa4fbce43161d1c4b, no merge conflicts.
Running as SYSTEM
Setting status of dd36e3afd92da6d92cdab47fa4fbce43161d1c4b to PENDING with url https://10.20.13.93:8080/job/merlin_merlin/265/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_merlin
using credential systems-login
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/Merlin # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Merlin
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Merlin +refs/pull/438/*:refs/remotes/origin/pr/438/* # timeout=10
 > git rev-parse dd36e3afd92da6d92cdab47fa4fbce43161d1c4b^{commit} # timeout=10
Checking out Revision dd36e3afd92da6d92cdab47fa4fbce43161d1c4b (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f dd36e3afd92da6d92cdab47fa4fbce43161d1c4b # timeout=10
Commit message: "Merge branch 'main' into mm_integration_tests"
 > git rev-list --no-walk c34be4a49f5be131cd8b77577d3cfa050047d738 # timeout=10
[merlin_merlin] $ /bin/bash /tmp/jenkins11989105243338007253.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_merlin/merlin
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 2 items

tests/unit/test_version.py . [ 50%]
tests/unit/examples/test_building_deploying_multi_stage_RecSys.py F [100%]

=================================== FAILURES ===================================
__________________________________ test_func ___________________________________

self = <testbook.client.TestbookNotebookClient object at 0x7ff1defd1820>
cell = [53], kwargs = {}, cell_indexes = [53], executed_cells = [], idx = 53

def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:
          cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)

/usr/local/lib/python3.8/dist-packages/testbook/client.py:133:


args = (<testbook.client.TestbookNotebookClient object at 0x7ff1defd1820>, {'id': '4f980866', 'cell_type': 'code', 'metadata'...ast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}, 53)
kwargs = {}

def wrapped(*args, **kwargs):
  return just_run(coro(*args, **kwargs))

/usr/local/lib/python3.8/dist-packages/nbclient/util.py:85:


coro = <coroutine object NotebookClient.async_execute_cell at 0x7ff1de93dac0>

def just_run(coro: Awaitable) -> Any:
    """Make the coroutine run, even if there is an event loop running (using nest_asyncio)"""
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        loop = None
    if loop is None:
        had_running_loop = False
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
    else:
        had_running_loop = True
    if had_running_loop:
        # if there is a running loop, we patch using nest_asyncio
        # to have reentrant event loops
        check_ipython()
        import nest_asyncio

        nest_asyncio.apply()
        check_patch_tornado()
  return loop.run_until_complete(coro)

/usr/local/lib/python3.8/dist-packages/nbclient/util.py:60:


self = <_UnixSelectorEventLoop running=False closed=False debug=False>
future = <Task finished name='Task-365' coro=<NotebookClient.async_execute_cell() done, defined at /usr/local/lib/python3.8/dis...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n\n')>

def run_until_complete(self, future):
    """Run until the Future is done.

    If the argument is a coroutine, it is wrapped in a Task.

    WARNING: It would be disastrous to call run_until_complete()
    with the same coroutine twice -- it would wrap it in two
    different Tasks and that can't be good.

    Return the Future's result, or raise its exception.
    """
    self._check_closed()
    self._check_running()

    new_task = not futures.isfuture(future)
    future = tasks.ensure_future(future, loop=self)
    if new_task:
        # An exception is raised if the future didn't complete, so there
        # is no need to log the "destroy pending task" message
        future._log_destroy_pending = False

    future.add_done_callback(_run_until_complete_cb)
    try:
        self.run_forever()
    except:
        if new_task and future.done() and not future.cancelled():
            # The coroutine raised a BaseException. Consume the exception
            # to not log a warning, the caller doesn't have access to the
            # local task.
            future.exception()
        raise
    finally:
        future.remove_done_callback(_run_until_complete_cb)
    if not future.done():
        raise RuntimeError('Event loop stopped before Future completed.')
  return future.result()

/usr/lib/python3.8/asyncio/base_events.py:616:


self = <testbook.client.TestbookNotebookClient object at 0x7ff1defd1820>
cell = {'id': '4f980866', 'cell_type': 'code', 'metadata': {'execution': {'iopub.status.busy': '2022-07-18T15:44:13.023892Z',...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}
cell_index = 53, execution_count = None, store_history = True

async def async_execute_cell(
    self,
    cell: NotebookNode,
    cell_index: int,
    execution_count: t.Optional[int] = None,
    store_history: bool = True,
) -> NotebookNode:
    """
    Executes a single code cell.

    To execute all cells see :meth:`execute`.

    Parameters
    ----------
    cell : nbformat.NotebookNode
        The cell which is currently being processed.
    cell_index : int
        The position of the cell within the notebook object.
    execution_count : int
        The execution count to be assigned to the cell (default: Use kernel response)
    store_history : bool
        Determines if history should be stored in the kernel (default: False).
        Specific to ipython kernels, which can store command histories.

    Returns
    -------
    output : dict
        The execution output payload (or None for no output).

    Raises
    ------
    CellExecutionError
        If execution failed and should raise an exception, this will be raised
        with defaults about the failure.

    Returns
    -------
    cell : NotebookNode
        The cell which was just processed.
    """
    assert self.kc is not None

    await run_hook(self.on_cell_start, cell=cell, cell_index=cell_index)

    if cell.cell_type != 'code' or not cell.source.strip():
        self.log.debug("Skipping non-executing cell %s", cell_index)
        return cell

    if self.skip_cells_with_tag in cell.metadata.get("tags", []):
        self.log.debug("Skipping tagged cell %s", cell_index)
        return cell

    if self.record_timing:  # clear execution metadata prior to execution
        cell['metadata']['execution'] = {}

    self.log.debug("Executing cell:\n%s", cell.source)

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors or "raises-exception" in cell.metadata.get("tags", [])
    )

    await run_hook(self.on_cell_execute, cell=cell, cell_index=cell_index)
    parent_msg_id = await ensure_async(
        self.kc.execute(
            cell.source, store_history=store_history, stop_on_error=not cell_allows_errors
        )
    )
    await run_hook(self.on_cell_complete, cell=cell, cell_index=cell_index)
    # We launched a code cell to execute
    self.code_cells_executed += 1
    exec_timeout = self._get_timeout(cell)

    cell.outputs = []
    self.clear_before_next_output = False

    task_poll_kernel_alive = asyncio.ensure_future(self._async_poll_kernel_alive())
    task_poll_output_msg = asyncio.ensure_future(
        self._async_poll_output_msg(parent_msg_id, cell, cell_index)
    )
    self.task_poll_for_reply = asyncio.ensure_future(
        self._async_poll_for_reply(
            parent_msg_id, cell, exec_timeout, task_poll_output_msg, task_poll_kernel_alive
        )
    )
    try:
        exec_reply = await self.task_poll_for_reply
    except asyncio.CancelledError:
        # can only be cancelled by task_poll_kernel_alive when the kernel is dead
        task_poll_output_msg.cancel()
        raise DeadKernelError("Kernel died")
    except Exception as e:
        # Best effort to cancel request if it hasn't been resolved
        try:
            # Check if the task_poll_output is doing the raising for us
            if not isinstance(e, CellControlSignal):
                task_poll_output_msg.cancel()
        finally:
            raise

    if execution_count:
        cell['execution_count'] = execution_count
    await run_hook(
        self.on_cell_executed, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
  await self._check_raise_for_error(cell, cell_index, exec_reply)

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:1022:


self = <testbook.client.TestbookNotebookClient object at 0x7ff1defd1820>
cell = {'id': '4f980866', 'cell_type': 'code', 'metadata': {'execution': {'iopub.status.busy': '2022-07-18T15:44:13.023892Z',...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}
cell_index = 53
exec_reply = {'buffers': [], 'content': {'ename': 'InferenceServerException', 'engine_info': {'engine_id': -1, 'engine_uuid': '1d70...e, 'engine': '1d70992d-4b38-4105-a374-58e4c23a1c70', 'started': '2022-07-18T15:44:13.024203Z', 'status': 'error'}, ...}

async def _check_raise_for_error(
    self, cell: NotebookNode, cell_index: int, exec_reply: t.Optional[t.Dict]
) -> None:

    if exec_reply is None:
        return None

    exec_reply_content = exec_reply['content']
    if exec_reply_content['status'] != 'error':
        return None

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors
        or exec_reply_content.get('ename') in self.allow_error_names
        or "raises-exception" in cell.metadata.get("tags", [])
    )
    await run_hook(
        self.on_cell_error, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
    if not cell_allows_errors:
      raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)

E nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
E ------------------
E
E import shutil
E from merlin.models.loader.tf_utils import configure_tensorflow
E configure_tensorflow()
E from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E response = run_ensemble_on_tritonserver(
E "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E )
E response = [x.tolist()[0] for x in response["ordered_ids"]]
E shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E ------------------
E
E ---------------------------------------------------------------------------
E InferenceServerException                 Traceback (most recent call last)
E Input In [32], in <cell line: 5>()
E       3 configure_tensorflow()
E       4 from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E ----> 5 response = run_ensemble_on_tritonserver(
E       6     "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E       7 )
E       8 response = [x.tolist()[0] for x in response["ordered_ids"]]
E       9 shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:93, in run_ensemble_on_tritonserver(tmpdir, output_columns, df, model_name)
E      91 response = None
E      92 with run_triton_server(tmpdir) as client:
E ---> 93     response = send_triton_request(df, output_columns, client=client, triton_model=model_name)
E      95 return response
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:141, in send_triton_request(df, outputs_list, client, endpoint, request_id, triton_model)
E     139 outputs = [grpcclient.InferRequestedOutput(col) for col in outputs_list]
E     140 with client:
E --> 141     response = client.infer(triton_model, inputs, request_id=request_id, outputs=outputs)
E     143 results = {}
E     144 for col in outputs_list:
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:1322, in InferenceServerClient.infer(self, model_name, inputs, model_version, outputs, request_id, sequence_id, sequence_start, sequence_end, priority, timeout, client_timeout, headers, compression_algorithm)
E    1320     return result
E    1321 except grpc.RpcError as rpc_error:
E -> 1322     raise_error_grpc(rpc_error)
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:62, in raise_error_grpc(rpc_error)
E      61 def raise_error_grpc(rpc_error):
E ---> 62     raise get_error_grpc(rpc_error) from None
E
E InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E     1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E
E Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E
E At:
E   /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:916: CellExecutionError

During handling of the above exception, another exception occurred:

def test_func():
    with testbook(
        REPO_ROOT
        / "examples"
        / "Building-and-deploying-multi-stage-RecSys"
        / "01-Building-Recommender-Systems-with-Merlin.ipynb",
        execute=False,
    ) as tb1:
        tb1.inject(
            """
            import os
            os.environ["DATA_FOLDER"] = "/tmp/data/"
            os.environ["NUM_ROWS"] = "10000"
            os.system("mkdir -p /tmp/examples")
            os.environ["BASE_DIR"] = "/tmp/examples/"
            """
        )
        tb1.execute()
        assert os.path.isdir("/tmp/examples/dlrm")
        assert os.path.isdir("/tmp/examples/feature_repo")
        assert os.path.isdir("/tmp/examples/query_tower")
        assert os.path.isfile("/tmp/examples/item_embeddings.parquet")
        assert os.path.isfile("/tmp/examples/feature_repo/user_features.py")
        assert os.path.isfile("/tmp/examples/feature_repo/item_features.py")

    with testbook(
        REPO_ROOT
        / "examples"
        / "Building-and-deploying-multi-stage-RecSys"
        / "02-Deploying-multi-stage-RecSys-with-Merlin-Systems.ipynb",
        execute=False,
    ) as tb2:
        tb2.inject(
            """
            import os
            os.environ["DATA_FOLDER"] = "/tmp/data/"
            os.environ["BASE_DIR"] = "/tmp/examples/"
            """
        )
        NUM_OF_CELLS = len(tb2.cells)
        tb2.execute_cell(list(range(0, NUM_OF_CELLS - 3)))
        top_k = tb2.ref("top_k")
        outputs = tb2.ref("outputs")
        assert outputs[0] == "ordered_ids"
      tb2.inject(
            """
            import shutil
            from merlin.models.loader.tf_utils import configure_tensorflow
            configure_tensorflow()
            from merlin.systems.triton.utils import run_ensemble_on_tritonserver
            response = run_ensemble_on_tritonserver(
                "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
            )
            response = [x.tolist()[0] for x in response["ordered_ids"]]
            shutil.rmtree("/tmp/examples/", ignore_errors=True)
            """
        )

tests/unit/examples/test_building_deploying_multi_stage_RecSys.py:57:


/usr/local/lib/python3.8/dist-packages/testbook/client.py:237: in inject
cell = TestbookNode(self.execute_cell(inject_idx)) if run else TestbookNode(code_cell)


self = <testbook.client.TestbookNotebookClient object at 0x7ff1defd1820>
cell = [53], kwargs = {}, cell_indexes = [53], executed_cells = [], idx = 53

def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:
            cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)
        except CellExecutionError as ce:
          raise TestbookRuntimeError(ce.evalue, ce, self._get_error_class(ce.ename))

E testbook.exceptions.TestbookRuntimeError: An error occurred while executing the following cell:
E ------------------
E
E (cell source and InferenceServerException traceback identical to the CellExecutionError output above)

/usr/local/lib/python3.8/dist-packages/testbook/client.py:135: TestbookRuntimeError
----------------------------- Captured stdout call -----------------------------
Signal (2) received.
----------------------------- Captured stderr call -----------------------------
2022-07-18 15:43:01.244936: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-18 15:43:03.244179: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-18 15:43:03.244975: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 15153 MB memory: -> device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib/python3.8/logging/init.py", line 2127, in shutdown
h.close()
File "/usr/local/lib/python3.8/dist-packages/absl/logging/init.py", line 934, in close
self.stream.close()
File "/usr/local/lib/python3.8/dist-packages/ipykernel/iostream.py", line 438, in close
self.watch_fd_thread.join()
AttributeError: 'OutStream' object has no attribute 'watch_fd_thread'
WARNING clustering 252 points to 32 centroids: please provide at least 1248 training points
2022-07-18 15:44:06.073763: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-18 15:44:08.061615: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-18 15:44:08.062358: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 15153 MB memory: -> device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0
I0718 15:44:13.296922 18724 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7fb4ec000000' with size 268435456
I0718 15:44:13.297651 18724 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0718 15:44:13.304626 18724 model_repository_manager.cc:1191] loading: 1_predicttensorflow:1
I0718 15:44:13.404920 18724 model_repository_manager.cc:1191] loading: 0_queryfeast:1
I0718 15:44:13.505205 18724 model_repository_manager.cc:1191] loading: 2_queryfaiss:1
I0718 15:44:13.605517 18724 model_repository_manager.cc:1191] loading: 3_queryfeast:1
I0718 15:44:13.676469 18724 tensorflow.cc:2181] TRITONBACKEND_Initialize: tensorflow
I0718 15:44:13.676506 18724 tensorflow.cc:2191] Triton TRITONBACKEND API version: 1.9
I0718 15:44:13.676513 18724 tensorflow.cc:2197] 'tensorflow' TRITONBACKEND API version: 1.9
I0718 15:44:13.676519 18724 tensorflow.cc:2221] backend configuration:
{"cmdline":{"auto-complete-config":"false","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","version":"2","default-max-batch-size":"4"}}
I0718 15:44:13.676552 18724 tensorflow.cc:2281] TRITONBACKEND_ModelInitialize: 1_predicttensorflow (version 1)
I0718 15:44:13.681520 18724 tensorflow.cc:2330] TRITONBACKEND_ModelInstanceInitialize: 1_predicttensorflow (GPU device 0)
I0718 15:44:13.705864 18724 model_repository_manager.cc:1191] loading: 4_unrollfeatures:1
I0718 15:44:13.806184 18724 model_repository_manager.cc:1191] loading: 5_predicttensorflow:1
I0718 15:44:13.906509 18724 model_repository_manager.cc:1191] loading: 6_softmaxsampling:1
2022-07-18 15:44:14.022973: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-07-18 15:44:14.027296: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2022-07-18 15:44:14.027345: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-07-18 15:44:14.027451: I tensorflow/core/platform/cpu_feature_guard.cc:152] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE3 SSE4.1 SSE4.2 AVX
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-18 15:44:14.065250: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 12901 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-18 15:44:14.102465: I tensorflow/cc/saved_model/loader.cc:230] Restoring SavedModel bundle.
2022-07-18 15:44:14.161466: I tensorflow/cc/saved_model/loader.cc:214] Running initialization op on SavedModel bundle at path: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-07-18 15:44:14.186741: I tensorflow/cc/saved_model/loader.cc:321] SavedModel load for tags { serve }; Status: success: OK. Took 163786 microseconds.
I0718 15:44:14.186997 18724 model_repository_manager.cc:1345] successfully loaded '1_predicttensorflow' version 1
I0718 15:44:14.190549 18724 tensorflow.cc:2281] TRITONBACKEND_ModelInitialize: 5_predicttensorflow (version 1)
I0718 15:44:14.191867 18724 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 0_queryfeast (GPU device 0)
I0718 15:44:16.503240 18724 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 2_queryfaiss (GPU device 0)
I0718 15:44:16.503521 18724 model_repository_manager.cc:1345] successfully loaded '0_queryfeast' version 1
I0718 15:44:18.884878 18724 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 3_queryfeast (GPU device 0)
I0718 15:44:18.885016 18724 model_repository_manager.cc:1345] successfully loaded '2_queryfaiss' version 1
I0718 15:44:21.187944 18724 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 4_unrollfeatures (GPU device 0)
I0718 15:44:21.188199 18724 model_repository_manager.cc:1345] successfully loaded '3_queryfeast' version 1
I0718 15:44:23.275279 18724 tensorflow.cc:2330] TRITONBACKEND_ModelInstanceInitialize: 5_predicttensorflow (GPU device 0)
I0718 15:44:23.275610 18724 model_repository_manager.cc:1345] successfully loaded '4_unrollfeatures' version 1
2022-07-18 15:44:23.276929: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-07-18 15:44:23.292284: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2022-07-18 15:44:23.292325: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-07-18 15:44:23.294381: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 12901 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-18 15:44:23.316561: I tensorflow/cc/saved_model/loader.cc:230] Restoring SavedModel bundle.
2022-07-18 15:44:23.473227: I tensorflow/cc/saved_model/loader.cc:214] Running initialization op on SavedModel bundle at path: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-07-18 15:44:23.526622: I tensorflow/cc/saved_model/loader.cc:321] SavedModel load for tags { serve }; Status: success: OK. Took 249707 microseconds.
I0718 15:44:23.526774 18724 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 6_softmaxsampling (GPU device 0)
I0718 15:44:23.526854 18724 model_repository_manager.cc:1345] successfully loaded '5_predicttensorflow' version 1
I0718 15:44:25.608179 18724 model_repository_manager.cc:1345] successfully loaded '6_softmaxsampling' version 1
I0718 15:44:25.611060 18724 model_repository_manager.cc:1191] loading: ensemble_model:1
I0718 15:44:25.711864 18724 model_repository_manager.cc:1345] successfully loaded 'ensemble_model' version 1
I0718 15:44:25.712040 18724 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0718 15:44:25.712147 18724 server.cc:583]
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tensorflow | /opt/tritonserver/backends/tensorflow2/libtriton_tensorflow2.so | {"cmdline":{"auto-complete-config":"false","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","version":"2","default-max-batch-size":"4"}} |
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0718 15:44:25.712268 18724 server.cc:626]
+---------------------+---------+--------+
| Model | Version | Status |
+---------------------+---------+--------+
| 0_queryfeast | 1 | READY |
| 1_predicttensorflow | 1 | READY |
| 2_queryfaiss | 1 | READY |
| 3_queryfeast | 1 | READY |
| 4_unrollfeatures | 1 | READY |
| 5_predicttensorflow | 1 | READY |
| 6_softmaxsampling | 1 | READY |
| ensemble_model | 1 | READY |
+---------------------+---------+--------+

I0718 15:44:25.774995 18724 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0718 15:44:25.775908 18724 tritonserver.cc:2138]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.22.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /tmp/examples/poc_ensemble |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0718 15:44:25.776742 18724 grpc_server.cc:4589] Started GRPCInferenceService at 0.0.0.0:8001
I0718 15:44:25.777288 18724 http_server.cc:3303] Started HTTPService at 0.0.0.0:8000
I0718 15:44:25.818499 18724 http_server.cc:178] Started Metrics Service at 0.0.0.0:8002
W0718 15:44:26.796814 18724 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0718 15:44:26.796882 18724 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
W0718 15:44:27.797046 18724 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0718 15:44:27.797102 18724 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
W0718 15:44:28.815888 18724 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0718 15:44:28.815938 18724 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
E0718 15:44:29.562398 18981 pb_stub.cc:749] Failed to process the request(s) for model '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)

Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"

At:
/tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute

I0718 15:44:29.563390 18724 server.cc:257] Waiting for in-flight requests to complete.
I0718 15:44:29.563445 18724 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0718 15:44:29.563465 18724 model_repository_manager.cc:1223] unloading: ensemble_model:1
I0718 15:44:29.563565 18724 model_repository_manager.cc:1223] unloading: 6_softmaxsampling:1
I0718 15:44:29.563639 18724 model_repository_manager.cc:1223] unloading: 5_predicttensorflow:1
I0718 15:44:29.563706 18724 model_repository_manager.cc:1328] successfully unloaded 'ensemble_model' version 1
I0718 15:44:29.563747 18724 model_repository_manager.cc:1223] unloading: 4_unrollfeatures:1
I0718 15:44:29.563818 18724 model_repository_manager.cc:1223] unloading: 3_queryfeast:1
I0718 15:44:29.563837 18724 tensorflow.cc:2368] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0718 15:44:29.563888 18724 model_repository_manager.cc:1223] unloading: 2_queryfaiss:1
I0718 15:44:29.563947 18724 model_repository_manager.cc:1223] unloading: 1_predicttensorflow:1
I0718 15:44:29.564024 18724 model_repository_manager.cc:1223] unloading: 0_queryfeast:1
I0718 15:44:29.564074 18724 server.cc:288] All models are stopped, unloading models
I0718 15:44:29.564094 18724 tensorflow.cc:2307] TRITONBACKEND_ModelFinalize: delete model state
I0718 15:44:29.564102 18724 server.cc:295] Timeout 30: Found 7 live models and 0 in-flight non-inference requests
I0718 15:44:29.564208 18724 tensorflow.cc:2368] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0718 15:44:29.564384 18724 tensorflow.cc:2307] TRITONBACKEND_ModelFinalize: delete model state
I0718 15:44:29.580611 18724 model_repository_manager.cc:1328] successfully unloaded '1_predicttensorflow' version 1
I0718 15:44:29.589213 18724 model_repository_manager.cc:1328] successfully unloaded '5_predicttensorflow' version 1
I0718 15:44:30.564234 18724 server.cc:295] Timeout 29: Found 5 live models and 0 in-flight non-inference requests
/usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py:15: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ValueType.FLOAT: (np.float, False, False),
I0718 15:44:30.926519 18724 model_repository_manager.cc:1328] successfully unloaded '4_unrollfeatures' version 1
I0718 15:44:31.121236 18724 model_repository_manager.cc:1328] successfully unloaded '2_queryfaiss' version 1
I0718 15:44:31.177282 18724 model_repository_manager.cc:1328] successfully unloaded '6_softmaxsampling' version 1
I0718 15:44:31.207706 18724 model_repository_manager.cc:1328] successfully unloaded '0_queryfeast' version 1
I0718 15:44:31.564436 18724 server.cc:295] Timeout 28: Found 1 live models and 0 in-flight non-inference requests
I0718 15:44:32.564574 18724 server.cc:295] Timeout 27: Found 1 live models and 0 in-flight non-inference requests
/usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py:15: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ValueType.FLOAT: (np.float, False, False),
I0718 15:44:33.419959 18724 model_repository_manager.cc:1328] successfully unloaded '3_queryfeast' version 1
I0718 15:44:33.564710 18724 server.cc:295] Timeout 26: Found 0 live models and 0 in-flight non-inference requests
=========================== short test summary info ============================
FAILED tests/unit/examples/test_building_deploying_multi_stage_RecSys.py::test_func
=================== 1 failed, 1 passed in 106.00s (0:01:45) ====================
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://github.com/gitapi/repos/NVIDIA-Merlin/Merlin/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_merlin] $ /bin/bash /tmp/jenkins14636332293218900185.sh
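
The 3_queryfeast failure above comes down to the Triton Python-backend InferenceResponse constructor: per the error message, it accepts output_tensors plus an optional pb_utils.TritonError, but the generated model.py apparently passed a tensors= keyword and a plain string as error. A minimal sketch of an execute() that satisfies that contract; the run_dag helper is hypothetical, not the actual model.py code:

    # Sketch: report errors the way the Triton Python backend expects, i.e.
    # output_tensors plus a pb_utils.TritonError, never a raw string.
    import traceback

    import triton_python_backend_utils as pb_utils


    class TritonPythonModel:
        def execute(self, requests):
            responses = []
            for request in requests:
                try:
                    output_tensors = self.run_dag(request)  # hypothetical helper
                    responses.append(
                        pb_utils.InferenceResponse(output_tensors=output_tensors)
                    )
                except Exception:
                    responses.append(
                        pb_utils.InferenceResponse(
                            output_tensors=[],
                            error=pb_utils.TritonError(traceback.format_exc()),
                        )
                    )
            return responses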

@gabrielspmoreira (Member, Author) commented:

rerun tests

@nvidia-merlin-bot (Contributor) commented:

CI Results:
GitHub pull request #438 of commit dd36e3afd92da6d92cdab47fa4fbce43161d1c4b, no merge conflicts.
Running as SYSTEM
Setting status of dd36e3afd92da6d92cdab47fa4fbce43161d1c4b to PENDING with url https://10.20.13.93:8080/job/merlin_merlin/266/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_merlin
using credential systems-login
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/Merlin # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Merlin
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Merlin +refs/pull/438/*:refs/remotes/origin/pr/438/* # timeout=10
 > git rev-parse dd36e3afd92da6d92cdab47fa4fbce43161d1c4b^{commit} # timeout=10
Checking out Revision dd36e3afd92da6d92cdab47fa4fbce43161d1c4b (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f dd36e3afd92da6d92cdab47fa4fbce43161d1c4b # timeout=10
Commit message: "Merge branch 'main' into mm_integration_tests"
 > git rev-list --no-walk dd36e3afd92da6d92cdab47fa4fbce43161d1c4b # timeout=10
[merlin_merlin] $ /bin/bash /tmp/jenkins2016635391869878287.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_merlin/merlin
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 2 items

tests/unit/test_version.py . [ 50%]
tests/unit/examples/test_building_deploying_multi_stage_RecSys.py F [100%]

=================================== FAILURES ===================================
__________________________________ test_func ___________________________________

self = <testbook.client.TestbookNotebookClient object at 0x7f00c548aac0>
cell = [53], kwargs = {}, cell_indexes = [53], executed_cells = [], idx = 53

def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:
          cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)

/usr/local/lib/python3.8/dist-packages/testbook/client.py:133:


args = (<testbook.client.TestbookNotebookClient object at 0x7f00c548aac0>, {'id': '10300e5d', 'cell_type': 'code', 'metadata'...ast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}, 53)
kwargs = {}

def wrapped(*args, **kwargs):
  return just_run(coro(*args, **kwargs))

/usr/local/lib/python3.8/dist-packages/nbclient/util.py:85:


coro = <coroutine object NotebookClient.async_execute_cell at 0x7f00c53f9340>

def just_run(coro: Awaitable) -> Any:
    """Make the coroutine run, even if there is an event loop running (using nest_asyncio)"""
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        loop = None
    if loop is None:
        had_running_loop = False
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
    else:
        had_running_loop = True
    if had_running_loop:
        # if there is a running loop, we patch using nest_asyncio
        # to have reentrant event loops
        check_ipython()
        import nest_asyncio

        nest_asyncio.apply()
        check_patch_tornado()
  return loop.run_until_complete(coro)

/usr/local/lib/python3.8/dist-packages/nbclient/util.py:60:


self = <_UnixSelectorEventLoop running=False closed=False debug=False>
future = <Task finished name='Task-365' coro=<NotebookClient.async_execute_cell() done, defined at /usr/local/lib/python3.8/dis...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n\n')>

def run_until_complete(self, future):
    """Run until the Future is done.

    If the argument is a coroutine, it is wrapped in a Task.

    WARNING: It would be disastrous to call run_until_complete()
    with the same coroutine twice -- it would wrap it in two
    different Tasks and that can't be good.

    Return the Future's result, or raise its exception.
    """
    self._check_closed()
    self._check_running()

    new_task = not futures.isfuture(future)
    future = tasks.ensure_future(future, loop=self)
    if new_task:
        # An exception is raised if the future didn't complete, so there
        # is no need to log the "destroy pending task" message
        future._log_destroy_pending = False

    future.add_done_callback(_run_until_complete_cb)
    try:
        self.run_forever()
    except:
        if new_task and future.done() and not future.cancelled():
            # The coroutine raised a BaseException. Consume the exception
            # to not log a warning, the caller doesn't have access to the
            # local task.
            future.exception()
        raise
    finally:
        future.remove_done_callback(_run_until_complete_cb)
    if not future.done():
        raise RuntimeError('Event loop stopped before Future completed.')
  return future.result()

/usr/lib/python3.8/asyncio/base_events.py:616:


self = <testbook.client.TestbookNotebookClient object at 0x7f00c548aac0>
cell = {'id': '10300e5d', 'cell_type': 'code', 'metadata': {'execution': {'iopub.status.busy': '2022-07-18T16:11:45.316769Z',...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}
cell_index = 53, execution_count = None, store_history = True

async def async_execute_cell(
    self,
    cell: NotebookNode,
    cell_index: int,
    execution_count: t.Optional[int] = None,
    store_history: bool = True,
) -> NotebookNode:
    """
    Executes a single code cell.

    To execute all cells see :meth:`execute`.

    Parameters
    ----------
    cell : nbformat.NotebookNode
        The cell which is currently being processed.
    cell_index : int
        The position of the cell within the notebook object.
    execution_count : int
        The execution count to be assigned to the cell (default: Use kernel response)
    store_history : bool
        Determines if history should be stored in the kernel (default: False).
        Specific to ipython kernels, which can store command histories.

    Returns
    -------
    output : dict
        The execution output payload (or None for no output).

    Raises
    ------
    CellExecutionError
        If execution failed and should raise an exception, this will be raised
        with defaults about the failure.

    Returns
    -------
    cell : NotebookNode
        The cell which was just processed.
    """
    assert self.kc is not None

    await run_hook(self.on_cell_start, cell=cell, cell_index=cell_index)

    if cell.cell_type != 'code' or not cell.source.strip():
        self.log.debug("Skipping non-executing cell %s", cell_index)
        return cell

    if self.skip_cells_with_tag in cell.metadata.get("tags", []):
        self.log.debug("Skipping tagged cell %s", cell_index)
        return cell

    if self.record_timing:  # clear execution metadata prior to execution
        cell['metadata']['execution'] = {}

    self.log.debug("Executing cell:\n%s", cell.source)

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors or "raises-exception" in cell.metadata.get("tags", [])
    )

    await run_hook(self.on_cell_execute, cell=cell, cell_index=cell_index)
    parent_msg_id = await ensure_async(
        self.kc.execute(
            cell.source, store_history=store_history, stop_on_error=not cell_allows_errors
        )
    )
    await run_hook(self.on_cell_complete, cell=cell, cell_index=cell_index)
    # We launched a code cell to execute
    self.code_cells_executed += 1
    exec_timeout = self._get_timeout(cell)

    cell.outputs = []
    self.clear_before_next_output = False

    task_poll_kernel_alive = asyncio.ensure_future(self._async_poll_kernel_alive())
    task_poll_output_msg = asyncio.ensure_future(
        self._async_poll_output_msg(parent_msg_id, cell, cell_index)
    )
    self.task_poll_for_reply = asyncio.ensure_future(
        self._async_poll_for_reply(
            parent_msg_id, cell, exec_timeout, task_poll_output_msg, task_poll_kernel_alive
        )
    )
    try:
        exec_reply = await self.task_poll_for_reply
    except asyncio.CancelledError:
        # can only be cancelled by task_poll_kernel_alive when the kernel is dead
        task_poll_output_msg.cancel()
        raise DeadKernelError("Kernel died")
    except Exception as e:
        # Best effort to cancel request if it hasn't been resolved
        try:
            # Check if the task_poll_output is doing the raising for us
            if not isinstance(e, CellControlSignal):
                task_poll_output_msg.cancel()
        finally:
            raise

    if execution_count:
        cell['execution_count'] = execution_count
    await run_hook(
        self.on_cell_executed, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
  await self._check_raise_for_error(cell, cell_index, exec_reply)

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:1022:


self = <testbook.client.TestbookNotebookClient object at 0x7f00c548aac0>
cell = {'id': '10300e5d', 'cell_type': 'code', 'metadata': {'execution': {'iopub.status.busy': '2022-07-18T16:11:45.316769Z',...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}
cell_index = 53
exec_reply = {'buffers': [], 'content': {'ename': 'InferenceServerException', 'engine_info': {'engine_id': -1, 'engine_uuid': '66b3...e, 'engine': '66b35789-4999-480a-a323-91eff08aebd7', 'started': '2022-07-18T16:11:45.317086Z', 'status': 'error'}, ...}

async def _check_raise_for_error(
    self, cell: NotebookNode, cell_index: int, exec_reply: t.Optional[t.Dict]
) -> None:

    if exec_reply is None:
        return None

    exec_reply_content = exec_reply['content']
    if exec_reply_content['status'] != 'error':
        return None

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors
        or exec_reply_content.get('ename') in self.allow_error_names
        or "raises-exception" in cell.metadata.get("tags", [])
    )
    await run_hook(
        self.on_cell_error, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
    if not cell_allows_errors:
      raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)

E nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
E ------------------
E
E import shutil
E from merlin.models.loader.tf_utils import configure_tensorflow
E configure_tensorflow()
E from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E response = run_ensemble_on_tritonserver(
E "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E )
E response = [x.tolist()[0] for x in response["ordered_ids"]]
E shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E ------------------
E
E ---------------------------------------------------------------------------
E InferenceServerException                  Traceback (most recent call last)
E Input In [32], in <cell line: 5>()
E       3 configure_tensorflow()
E       4 from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E ----> 5 response = run_ensemble_on_tritonserver(
E       6     "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E       7 )
E       8 response = [x.tolist()[0] for x in response["ordered_ids"]]
E       9 shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:93, in run_ensemble_on_tritonserver(tmpdir, output_columns, df, model_name)
E      91 response = None
E      92 with run_triton_server(tmpdir) as client:
E ---> 93     response = send_triton_request(df, output_columns, client=client, triton_model=model_name)
E      95 return response
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:141, in send_triton_request(df, outputs_list, client, endpoint, request_id, triton_model)
E     139 outputs = [grpcclient.InferRequestedOutput(col) for col in outputs_list]
E     140 with client:
E --> 141     response = client.infer(triton_model, inputs, request_id=request_id, outputs=outputs)
E     143 results = {}
E     144 for col in outputs_list:
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:1322, in InferenceServerClient.infer(self, model_name, inputs, model_version, outputs, request_id, sequence_id, sequence_start, sequence_end, priority, timeout, client_timeout, headers, compression_algorithm)
E    1320     return result
E    1321 except grpc.RpcError as rpc_error:
E -> 1322     raise_error_grpc(rpc_error)
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:62, in raise_error_grpc(rpc_error)
E      61 def raise_error_grpc(rpc_error):
E ---> 62     raise get_error_grpc(rpc_error) from None
E
E InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E 1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E
E Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E
E At:
E /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute
E
E InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E 1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E
E Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E
E At:
E /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:916: CellExecutionError

During handling of the above exception, another exception occurred:

def test_func():
    with testbook(
        REPO_ROOT
        / "examples"
        / "Building-and-deploying-multi-stage-RecSys"
        / "01-Building-Recommender-Systems-with-Merlin.ipynb",
        execute=False,
    ) as tb1:
        tb1.inject(
            """
            import os
            os.environ["DATA_FOLDER"] = "/tmp/data/"
            os.environ["NUM_ROWS"] = "10000"
            os.system("mkdir -p /tmp/examples")
            os.environ["BASE_DIR"] = "/tmp/examples/"
            """
        )
        tb1.execute()
        assert os.path.isdir("/tmp/examples/dlrm")
        assert os.path.isdir("/tmp/examples/feature_repo")
        assert os.path.isdir("/tmp/examples/query_tower")
        assert os.path.isfile("/tmp/examples/item_embeddings.parquet")
        assert os.path.isfile("/tmp/examples/feature_repo/user_features.py")
        assert os.path.isfile("/tmp/examples/feature_repo/item_features.py")

    with testbook(
        REPO_ROOT
        / "examples"
        / "Building-and-deploying-multi-stage-RecSys"
        / "02-Deploying-multi-stage-RecSys-with-Merlin-Systems.ipynb",
        execute=False,
    ) as tb2:
        tb2.inject(
            """
            import os
            os.environ["DATA_FOLDER"] = "/tmp/data/"
            os.environ["BASE_DIR"] = "/tmp/examples/"
            """
        )
        NUM_OF_CELLS = len(tb2.cells)
        tb2.execute_cell(list(range(0, NUM_OF_CELLS - 3)))
        top_k = tb2.ref("top_k")
        outputs = tb2.ref("outputs")
        assert outputs[0] == "ordered_ids"
      tb2.inject(
            """
            import shutil
            from merlin.models.loader.tf_utils import configure_tensorflow
            configure_tensorflow()
            from merlin.systems.triton.utils import run_ensemble_on_tritonserver
            response = run_ensemble_on_tritonserver(
                "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
            )
            response = [x.tolist()[0] for x in response["ordered_ids"]]
            shutil.rmtree("/tmp/examples/", ignore_errors=True)
            """
        )

tests/unit/examples/test_building_deploying_multi_stage_RecSys.py:57:


/usr/local/lib/python3.8/dist-packages/testbook/client.py:237: in inject
cell = TestbookNode(self.execute_cell(inject_idx)) if run else TestbookNode(code_cell)


self = <testbook.client.TestbookNotebookClient object at 0x7f00c548aac0>
cell = [53], kwargs = {}, cell_indexes = [53], executed_cells = [], idx = 53

def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:
            cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)
        except CellExecutionError as ce:
          raise TestbookRuntimeError(ce.evalue, ce, self._get_error_class(ce.ename))

E testbook.exceptions.TestbookRuntimeError: An error occurred while executing the following cell:
E ------------------
E
E import shutil
E from merlin.models.loader.tf_utils import configure_tensorflow
E configure_tensorflow()
E from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E response = run_ensemble_on_tritonserver(
E "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E )
E response = [x.tolist()[0] for x in response["ordered_ids"]]
E shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E ------------------
E
E ---------------------------------------------------------------------------
E InferenceServerException                  Traceback (most recent call last)
E Input In [32], in <cell line: 5>()
E       3 configure_tensorflow()
E       4 from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E ----> 5 response = run_ensemble_on_tritonserver(
E       6     "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E       7 )
E       8 response = [x.tolist()[0] for x in response["ordered_ids"]]
E       9 shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:93, in run_ensemble_on_tritonserver(tmpdir, output_columns, df, model_name)
E      91 response = None
E      92 with run_triton_server(tmpdir) as client:
E ---> 93     response = send_triton_request(df, output_columns, client=client, triton_model=model_name)
E      95 return response
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:141, in send_triton_request(df, outputs_list, client, endpoint, request_id, triton_model)
E     139 outputs = [grpcclient.InferRequestedOutput(col) for col in outputs_list]
E     140 with client:
E --> 141     response = client.infer(triton_model, inputs, request_id=request_id, outputs=outputs)
E     143 results = {}
E     144 for col in outputs_list:
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:1322, in InferenceServerClient.infer(self, model_name, inputs, model_version, outputs, request_id, sequence_id, sequence_start, sequence_end, priority, timeout, client_timeout, headers, compression_algorithm)
E    1320     return result
E    1321 except grpc.RpcError as rpc_error:
E -> 1322     raise_error_grpc(rpc_error)
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:62, in raise_error_grpc(rpc_error)
E      61 def raise_error_grpc(rpc_error):
E ---> 62     raise get_error_grpc(rpc_error) from None
E
E InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E 1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E
E Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E
E At:
E /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute
E
E InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E 1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E
E Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E
E At:
E /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute

/usr/local/lib/python3.8/dist-packages/testbook/client.py:135: TestbookRuntimeError
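
The two chained tracebacks above reduce to one root cause: inside the exported /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, the Feast transform raises a TypeError ("int() argument must be a string, a bytes-like object or a number, not 'NoneType'"), and the generated error handler then calls InferenceResponse(tensors=[], error=...), a keyword the Triton Python backend does not accept; per the message, only output_tensors plus an optional error is supported. Below is a minimal sketch of the supported error-reporting path, assuming the triton_python_backend_utils module that Triton provides inside its Python backend; the tensor name and lookup helper are hypothetical stand-ins:

import numpy as np
import triton_python_backend_utils as pb_utils  # available only inside Triton's Python backend


def query_feature_store(request):
    # Hypothetical stand-in for the Feast lookup that model.py performs;
    # the real code failed here when a feature came back as None.
    return np.array([[0]], dtype=np.int32)


class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            try:
                out = pb_utils.Tensor("output__0", query_feature_store(request))
                # Supported constructor: output_tensors, plus an optional error.
                responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
            except Exception as exc:
                # The failing model.py passed `tensors=[]` at this point, which
                # the pybind11 constructor rejects with the TypeError above.
                responses.append(
                    pb_utils.InferenceResponse(
                        output_tensors=[],
                        error=pb_utils.TritonError(str(exc)),
                    )
                )
        return responses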
----------------------------- Captured stdout call -----------------------------
Signal (2) received.
----------------------------- Captured stderr call -----------------------------
2022-07-18 16:10:30.183629: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-18 16:10:32.200327: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-18 16:10:32.201247: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14532 MB memory: -> device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib/python3.8/logging/__init__.py", line 2127, in shutdown
h.close()
File "/usr/local/lib/python3.8/dist-packages/absl/logging/__init__.py", line 934, in close
self.stream.close()
File "/usr/local/lib/python3.8/dist-packages/ipykernel/iostream.py", line 438, in close
self.watch_fd_thread.join()
AttributeError: 'OutStream' object has no attribute 'watch_fd_thread'
WARNING clustering 237 points to 32 centroids: please provide at least 1248 training points
2022-07-18 16:11:38.347550: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-18 16:11:40.342520: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-18 16:11:40.343267: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14532 MB memory: -> device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0
I0718 16:11:45.579192 5482 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7fe616000000' with size 268435456
I0718 16:11:45.579963 5482 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0718 16:11:45.587231 5482 model_repository_manager.cc:1191] loading: 0_queryfeast:1
I0718 16:11:45.687638 5482 model_repository_manager.cc:1191] loading: 1_predicttensorflow:1
I0718 16:11:45.694736 5482 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 0_queryfeast (GPU device 0)
I0718 16:11:45.788023 5482 model_repository_manager.cc:1191] loading: 2_queryfaiss:1
I0718 16:11:45.888232 5482 model_repository_manager.cc:1191] loading: 3_queryfeast:1
I0718 16:11:45.988527 5482 model_repository_manager.cc:1191] loading: 4_unrollfeatures:1
I0718 16:11:46.088848 5482 model_repository_manager.cc:1191] loading: 5_predicttensorflow:1
I0718 16:11:46.189143 5482 model_repository_manager.cc:1191] loading: 6_softmaxsampling:1
I0718 16:11:48.023559 5482 model_repository_manager.cc:1345] successfully loaded '0_queryfeast' version 1
I0718 16:11:48.293480 5482 tensorflow.cc:2181] TRITONBACKEND_Initialize: tensorflow
I0718 16:11:48.293518 5482 tensorflow.cc:2191] Triton TRITONBACKEND API version: 1.9
I0718 16:11:48.293525 5482 tensorflow.cc:2197] 'tensorflow' TRITONBACKEND API version: 1.9
I0718 16:11:48.293531 5482 tensorflow.cc:2221] backend configuration:
{"cmdline":{"auto-complete-config":"false","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","version":"2","default-max-batch-size":"4"}}
I0718 16:11:48.293568 5482 tensorflow.cc:2281] TRITONBACKEND_ModelInitialize: 1_predicttensorflow (version 1)
I0718 16:11:48.297086 5482 tensorflow.cc:2281] TRITONBACKEND_ModelInitialize: 5_predicttensorflow (version 1)
I0718 16:11:48.299118 5482 tensorflow.cc:2330] TRITONBACKEND_ModelInstanceInitialize: 1_predicttensorflow (GPU device 0)
2022-07-18 16:11:48.641569: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-07-18 16:11:48.645682: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2022-07-18 16:11:48.645708: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-07-18 16:11:48.645807: I tensorflow/core/platform/cpu_feature_guard.cc:152] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE3 SSE4.1 SSE4.2 AVX
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-18 16:11:48.683022: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 11497 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-18 16:11:48.718149: I tensorflow/cc/saved_model/loader.cc:230] Restoring SavedModel bundle.
2022-07-18 16:11:48.796348: I tensorflow/cc/saved_model/loader.cc:214] Running initialization op on SavedModel bundle at path: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-07-18 16:11:48.821082: I tensorflow/cc/saved_model/loader.cc:321] SavedModel load for tags { serve }; Status: success: OK. Took 179531 microseconds.
I0718 16:11:48.821205 5482 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 2_queryfaiss (GPU device 0)
I0718 16:11:48.821302 5482 model_repository_manager.cc:1345] successfully loaded '1_predicttensorflow' version 1
I0718 16:11:51.195764 5482 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 3_queryfeast (GPU device 0)
I0718 16:11:51.197407 5482 model_repository_manager.cc:1345] successfully loaded '2_queryfaiss' version 1
I0718 16:11:53.570422 5482 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 4_unrollfeatures (GPU device 0)
I0718 16:11:53.570697 5482 model_repository_manager.cc:1345] successfully loaded '3_queryfeast' version 1
I0718 16:11:55.727076 5482 tensorflow.cc:2330] TRITONBACKEND_ModelInstanceInitialize: 5_predicttensorflow (GPU device 0)
I0718 16:11:55.727392 5482 model_repository_manager.cc:1345] successfully loaded '4_unrollfeatures' version 1
2022-07-18 16:11:55.728901: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-07-18 16:11:55.749619: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2022-07-18 16:11:55.749668: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-07-18 16:11:55.751726: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 11497 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-18 16:11:55.774961: I tensorflow/cc/saved_model/loader.cc:230] Restoring SavedModel bundle.
2022-07-18 16:11:55.932932: I tensorflow/cc/saved_model/loader.cc:214] Running initialization op on SavedModel bundle at path: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-07-18 16:11:55.985684: I tensorflow/cc/saved_model/loader.cc:321] SavedModel load for tags { serve }; Status: success: OK. Took 256796 microseconds.
I0718 16:11:55.985863 5482 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 6_softmaxsampling (GPU device 0)
I0718 16:11:55.985982 5482 model_repository_manager.cc:1345] successfully loaded '5_predicttensorflow' version 1
I0718 16:11:58.084130 5482 model_repository_manager.cc:1345] successfully loaded '6_softmaxsampling' version 1
I0718 16:11:58.087004 5482 model_repository_manager.cc:1191] loading: ensemble_model:1
I0718 16:11:58.187807 5482 model_repository_manager.cc:1345] successfully loaded 'ensemble_model' version 1
I0718 16:11:58.187998 5482 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0718 16:11:58.188109 5482 server.cc:583]
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| tensorflow | /opt/tritonserver/backends/tensorflow2/libtriton_tensorflow2.so | {"cmdline":{"auto-complete-config":"false","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","version":"2","default-max-batch-size":"4"}} |
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0718 16:11:58.188223 5482 server.cc:626]
+---------------------+---------+--------+
| Model | Version | Status |
+---------------------+---------+--------+
| 0_queryfeast | 1 | READY |
| 1_predicttensorflow | 1 | READY |
| 2_queryfaiss | 1 | READY |
| 3_queryfeast | 1 | READY |
| 4_unrollfeatures | 1 | READY |
| 5_predicttensorflow | 1 | READY |
| 6_softmaxsampling | 1 | READY |
| ensemble_model | 1 | READY |
+---------------------+---------+--------+

I0718 16:11:58.250769 5482 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0718 16:11:58.251616 5482 tritonserver.cc:2138]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.22.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /tmp/examples/poc_ensemble |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0718 16:11:58.253056 5482 grpc_server.cc:4589] Started GRPCInferenceService at 0.0.0.0:8001
I0718 16:11:58.253282 5482 http_server.cc:3303] Started HTTPService at 0.0.0.0:8000
I0718 16:11:58.294189 5482 http_server.cc:178] Started Metrics Service at 0.0.0.0:8002
W0718 16:11:59.267570 5482 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0718 16:11:59.267635 5482 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
W0718 16:12:00.267789 5482 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0718 16:12:00.267848 5482 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
W0718 16:12:01.288510 5482 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0718 16:12:01.288571 5482 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
E0718 16:12:03.855857 5739 pb_stub.cc:749] Failed to process the request(s) for model '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)

Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"

At:
/tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute

I0718 16:12:03.860333 5482 server.cc:257] Waiting for in-flight requests to complete.
I0718 16:12:03.860383 5482 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0718 16:12:03.860402 5482 model_repository_manager.cc:1223] unloading: ensemble_model:1
I0718 16:12:03.860501 5482 model_repository_manager.cc:1223] unloading: 6_softmaxsampling:1
I0718 16:12:03.860570 5482 model_repository_manager.cc:1223] unloading: 5_predicttensorflow:1
I0718 16:12:03.860675 5482 model_repository_manager.cc:1328] successfully unloaded 'ensemble_model' version 1
I0718 16:12:03.860688 5482 model_repository_manager.cc:1223] unloading: 4_unrollfeatures:1
I0718 16:12:03.860775 5482 tensorflow.cc:2368] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0718 16:12:03.860814 5482 model_repository_manager.cc:1223] unloading: 3_queryfeast:1
I0718 16:12:03.860883 5482 model_repository_manager.cc:1223] unloading: 2_queryfaiss:1
I0718 16:12:03.860938 5482 model_repository_manager.cc:1223] unloading: 1_predicttensorflow:1
I0718 16:12:03.860962 5482 tensorflow.cc:2307] TRITONBACKEND_ModelFinalize: delete model state
I0718 16:12:03.860999 5482 model_repository_manager.cc:1223] unloading: 0_queryfeast:1
I0718 16:12:03.861045 5482 server.cc:288] All models are stopped, unloading models
I0718 16:12:03.861068 5482 server.cc:295] Timeout 30: Found 7 live models and 0 in-flight non-inference requests
I0718 16:12:03.861145 5482 tensorflow.cc:2368] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0718 16:12:03.861338 5482 tensorflow.cc:2307] TRITONBACKEND_ModelFinalize: delete model state
I0718 16:12:03.877659 5482 model_repository_manager.cc:1328] successfully unloaded '1_predicttensorflow' version 1
I0718 16:12:03.882185 5482 model_repository_manager.cc:1328] successfully unloaded '5_predicttensorflow' version 1
I0718 16:12:04.861379 5482 server.cc:295] Timeout 29: Found 5 live models and 0 in-flight non-inference requests
I0718 16:12:05.425707 5482 model_repository_manager.cc:1328] successfully unloaded '4_unrollfeatures' version 1
I0718 16:12:05.453380 5482 model_repository_manager.cc:1328] successfully unloaded '6_softmaxsampling' version 1
I0718 16:12:05.540578 5482 model_repository_manager.cc:1328] successfully unloaded '2_queryfaiss' version 1
I0718 16:12:05.861583 5482 server.cc:295] Timeout 28: Found 2 live models and 0 in-flight non-inference requests
/usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py:15: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ValueType.FLOAT: (np.float, False, False),
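
The np.float DeprecationWarning above is unrelated to the failure: np.float is only an alias for the builtin float, so the mapping in merlin/systems/dag/ops/feast.py can follow the NumPy suggestion without changing behavior. A sketch, assuming a NumPy version that still ships the alias (it was removed in NumPy 1.24):

import numpy as np

assert np.float is float  # the alias, deprecated since NumPy 1.20

# deprecated entry: ValueType.FLOAT: (np.float, False, False)
# suggested entry:  ValueType.FLOAT: (np.float64, False, False)  # or plain `float`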
I0718 16:12:06.197324 5482 model_repository_manager.cc:1328] successfully unloaded '3_queryfeast' version 1
I0718 16:12:06.861710 5482 server.cc:295] Timeout 27: Found 1 live models and 0 in-flight non-inference requests
I0718 16:12:07.861841 5482 server.cc:295] Timeout 26: Found 1 live models and 0 in-flight non-inference requests
/usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py:15: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ValueType.FLOAT: (np.float, False, False),
I0718 16:12:08.861977 5482 server.cc:295] Timeout 25: Found 1 live models and 0 in-flight non-inference requests
I0718 16:12:08.944392 5482 model_repository_manager.cc:1328] successfully unloaded '0_queryfeast' version 1
I0718 16:12:09.862115 5482 server.cc:295] Timeout 24: Found 0 live models and 0 in-flight non-inference requests
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib/python3.8/logging/__init__.py", line 2127, in shutdown
h.close()
File "/usr/local/lib/python3.8/dist-packages/absl/logging/__init__.py", line 934, in close
self.stream.close()
File "/usr/local/lib/python3.8/dist-packages/ipykernel/iostream.py", line 438, in close
self.watch_fd_thread.join()
AttributeError: 'OutStream' object has no attribute 'watch_fd_thread'
=========================== short test summary info ============================
FAILED tests/unit/examples/test_building_deploying_multi_stage_RecSys.py::test_func
=================== 1 failed, 1 passed in 111.76s (0:01:51) ====================
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://github.com/gitapi/repos/NVIDIA-Merlin/Merlin/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_merlin] $ /bin/bash /tmp/jenkins6467646715249025791.sh
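
For anyone reproducing the failure outside CI: the frames in the log above come from two helpers in merlin.systems.triton.utils, and the failing call path condenses to the sketch below. The request DataFrame and outputs list are objects the notebook builds, so the values here are assumed placeholders:

import pandas as pd

from merlin.systems.triton.utils import run_triton_server, send_triton_request

outputs = ["ordered_ids"]                 # per `assert outputs[0] == "ordered_ids"` in the test
request = pd.DataFrame({"user_id": [1]})  # hypothetical single-row inference request

# run_triton_server starts tritonserver on the exported ensemble and yields a
# gRPC client; send_triton_request wraps client.infer(...), which is where the
# InferenceServerException from the '3_queryfeast' Python model surfaces.
with run_triton_server("/tmp/examples/poc_ensemble") as client:
    response = send_triton_request(
        request, outputs, client=client, triton_model="ensemble_model"
    )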

@gabrielspmoreira
Member Author

rerun tests

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #438 of commit dd36e3afd92da6d92cdab47fa4fbce43161d1c4b, no merge conflicts.
Running as SYSTEM
Setting status of dd36e3afd92da6d92cdab47fa4fbce43161d1c4b to PENDING with url https://10.20.13.93:8080/job/merlin_merlin/268/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_merlin
using credential systems-login
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/Merlin # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Merlin
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Merlin +refs/pull/438/*:refs/remotes/origin/pr/438/* # timeout=10
 > git rev-parse dd36e3afd92da6d92cdab47fa4fbce43161d1c4b^{commit} # timeout=10
Checking out Revision dd36e3afd92da6d92cdab47fa4fbce43161d1c4b (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f dd36e3afd92da6d92cdab47fa4fbce43161d1c4b # timeout=10
Commit message: "Merge branch 'main' into mm_integration_tests"
 > git rev-list --no-walk 3bbcca6fd9a613df6b23c9934340d056e04c13d1 # timeout=10
[merlin_merlin] $ /bin/bash /tmp/jenkins14588817199850278494.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_merlin/merlin
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 2 items

tests/unit/test_version.py . [ 50%]
tests/unit/examples/test_building_deploying_multi_stage_RecSys.py F [100%]

=================================== FAILURES ===================================
__________________________________ test_func ___________________________________

self = <testbook.client.TestbookNotebookClient object at 0x7f5bed8cc9a0>
cell = [53], kwargs = {}, cell_indexes = [53], executed_cells = [], idx = 53

def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:
          cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)

/usr/local/lib/python3.8/dist-packages/testbook/client.py:133:


args = (<testbook.client.TestbookNotebookClient object at 0x7f5bed8cc9a0>, {'id': 'cb4a73ad', 'cell_type': 'code', 'metadata'...ast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}, 53)
kwargs = {}

def wrapped(*args, **kwargs):
  return just_run(coro(*args, **kwargs))

/usr/local/lib/python3.8/dist-packages/nbclient/util.py:85:


coro = <coroutine object NotebookClient.async_execute_cell at 0x7f5bed238d40>

def just_run(coro: Awaitable) -> Any:
    """Make the coroutine run, even if there is an event loop running (using nest_asyncio)"""
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        loop = None
    if loop is None:
        had_running_loop = False
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
    else:
        had_running_loop = True
    if had_running_loop:
        # if there is a running loop, we patch using nest_asyncio
        # to have reentrant event loops
        check_ipython()
        import nest_asyncio

        nest_asyncio.apply()
        check_patch_tornado()
  return loop.run_until_complete(coro)

/usr/local/lib/python3.8/dist-packages/nbclient/util.py:60:


self = <_UnixSelectorEventLoop running=False closed=False debug=False>
future = <Task finished name='Task-365' coro=<NotebookClient.async_execute_cell() done, defined at /usr/local/lib/python3.8/dis...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n\n')>

def run_until_complete(self, future):
    """Run until the Future is done.

    If the argument is a coroutine, it is wrapped in a Task.

    WARNING: It would be disastrous to call run_until_complete()
    with the same coroutine twice -- it would wrap it in two
    different Tasks and that can't be good.

    Return the Future's result, or raise its exception.
    """
    self._check_closed()
    self._check_running()

    new_task = not futures.isfuture(future)
    future = tasks.ensure_future(future, loop=self)
    if new_task:
        # An exception is raised if the future didn't complete, so there
        # is no need to log the "destroy pending task" message
        future._log_destroy_pending = False

    future.add_done_callback(_run_until_complete_cb)
    try:
        self.run_forever()
    except:
        if new_task and future.done() and not future.cancelled():
            # The coroutine raised a BaseException. Consume the exception
            # to not log a warning, the caller doesn't have access to the
            # local task.
            future.exception()
        raise
    finally:
        future.remove_done_callback(_run_until_complete_cb)
    if not future.done():
        raise RuntimeError('Event loop stopped before Future completed.')
  return future.result()

/usr/lib/python3.8/asyncio/base_events.py:616:


self = <testbook.client.TestbookNotebookClient object at 0x7f5bed8cc9a0>
cell = {'id': 'cb4a73ad', 'cell_type': 'code', 'metadata': {'execution': {'iopub.status.busy': '2022-07-19T15:58:37.270535Z',...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}
cell_index = 53, execution_count = None, store_history = True

async def async_execute_cell(
    self,
    cell: NotebookNode,
    cell_index: int,
    execution_count: t.Optional[int] = None,
    store_history: bool = True,
) -> NotebookNode:
    """
    Executes a single code cell.

    To execute all cells see :meth:`execute`.

    Parameters
    ----------
    cell : nbformat.NotebookNode
        The cell which is currently being processed.
    cell_index : int
        The position of the cell within the notebook object.
    execution_count : int
        The execution count to be assigned to the cell (default: Use kernel response)
    store_history : bool
        Determines if history should be stored in the kernel (default: False).
        Specific to ipython kernels, which can store command histories.

    Returns
    -------
    output : dict
        The execution output payload (or None for no output).

    Raises
    ------
    CellExecutionError
        If execution failed and should raise an exception, this will be raised
        with defaults about the failure.

    Returns
    -------
    cell : NotebookNode
        The cell which was just processed.
    """
    assert self.kc is not None

    await run_hook(self.on_cell_start, cell=cell, cell_index=cell_index)

    if cell.cell_type != 'code' or not cell.source.strip():
        self.log.debug("Skipping non-executing cell %s", cell_index)
        return cell

    if self.skip_cells_with_tag in cell.metadata.get("tags", []):
        self.log.debug("Skipping tagged cell %s", cell_index)
        return cell

    if self.record_timing:  # clear execution metadata prior to execution
        cell['metadata']['execution'] = {}

    self.log.debug("Executing cell:\n%s", cell.source)

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors or "raises-exception" in cell.metadata.get("tags", [])
    )

    await run_hook(self.on_cell_execute, cell=cell, cell_index=cell_index)
    parent_msg_id = await ensure_async(
        self.kc.execute(
            cell.source, store_history=store_history, stop_on_error=not cell_allows_errors
        )
    )
    await run_hook(self.on_cell_complete, cell=cell, cell_index=cell_index)
    # We launched a code cell to execute
    self.code_cells_executed += 1
    exec_timeout = self._get_timeout(cell)

    cell.outputs = []
    self.clear_before_next_output = False

    task_poll_kernel_alive = asyncio.ensure_future(self._async_poll_kernel_alive())
    task_poll_output_msg = asyncio.ensure_future(
        self._async_poll_output_msg(parent_msg_id, cell, cell_index)
    )
    self.task_poll_for_reply = asyncio.ensure_future(
        self._async_poll_for_reply(
            parent_msg_id, cell, exec_timeout, task_poll_output_msg, task_poll_kernel_alive
        )
    )
    try:
        exec_reply = await self.task_poll_for_reply
    except asyncio.CancelledError:
        # can only be cancelled by task_poll_kernel_alive when the kernel is dead
        task_poll_output_msg.cancel()
        raise DeadKernelError("Kernel died")
    except Exception as e:
        # Best effort to cancel request if it hasn't been resolved
        try:
            # Check if the task_poll_output is doing the raising for us
            if not isinstance(e, CellControlSignal):
                task_poll_output_msg.cancel()
        finally:
            raise

    if execution_count:
        cell['execution_count'] = execution_count
    await run_hook(
        self.on_cell_executed, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
  await self._check_raise_for_error(cell, cell_index, exec_reply)

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:1022:


self = <testbook.client.TestbookNotebookClient object at 0x7f5bed8cc9a0>
cell = {'id': 'cb4a73ad', 'cell_type': 'code', 'metadata': {'execution': {'iopub.status.busy': '2022-07-19T15:58:37.270535Z',...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}
cell_index = 53
exec_reply = {'buffers': [], 'content': {'ename': 'InferenceServerException', 'engine_info': {'engine_id': -1, 'engine_uuid': 'cc1d...e, 'engine': 'cc1dc4ac-1192-4c1e-a386-b69e6e770c63', 'started': '2022-07-19T15:58:37.270831Z', 'status': 'error'}, ...}

async def _check_raise_for_error(
    self, cell: NotebookNode, cell_index: int, exec_reply: t.Optional[t.Dict]
) -> None:

    if exec_reply is None:
        return None

    exec_reply_content = exec_reply['content']
    if exec_reply_content['status'] != 'error':
        return None

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors
        or exec_reply_content.get('ename') in self.allow_error_names
        or "raises-exception" in cell.metadata.get("tags", [])
    )
    await run_hook(
        self.on_cell_error, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
    if not cell_allows_errors:
      raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)

E nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
E ------------------
E
E import shutil
E from merlin.models.loader.tf_utils import configure_tensorflow
E configure_tensorflow()
E from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E response = run_ensemble_on_tritonserver(
E "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E )
E response = [x.tolist()[0] for x in response["ordered_ids"]]
E shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E ------------------
E
E ---------------------------------------------------------------------------
E InferenceServerException                  Traceback (most recent call last)
E Input In [32], in <cell line: 5>()
E       3 configure_tensorflow()
E       4 from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E ----> 5 response = run_ensemble_on_tritonserver(
E       6     "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E       7 )
E       8 response = [x.tolist()[0] for x in response["ordered_ids"]]
E       9 shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:93, in run_ensemble_on_tritonserver(tmpdir, output_columns, df, model_name)
E      91 response = None
E      92 with run_triton_server(tmpdir) as client:
E ---> 93     response = send_triton_request(df, output_columns, client=client, triton_model=model_name)
E      95 return response
E
E File /usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:141, in send_triton_request(df, outputs_list, client, endpoint, request_id, triton_model)
E     139 outputs = [grpcclient.InferRequestedOutput(col) for col in outputs_list]
E     140 with client:
E --> 141     response = client.infer(triton_model, inputs, request_id=request_id, outputs=outputs)
E     143 results = {}
E     144 for col in outputs_list:
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:1322, in InferenceServerClient.infer(self, model_name, inputs, model_version, outputs, request_id, sequence_id, sequence_start, sequence_end, priority, timeout, client_timeout, headers, compression_algorithm)
E    1320     return result
E    1321 except grpc.RpcError as rpc_error:
E -> 1322     raise_error_grpc(rpc_error)
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:62, in raise_error_grpc(rpc_error)
E      61 def raise_error_grpc(rpc_error):
E ---> 62     raise get_error_grpc(rpc_error) from None
E
E InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E 1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E
E Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E
E At:
E /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute
E
E InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: init(): incompatible constructor arguments. The following argument types are supported:
E 1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E
E Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E
E At:
E /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:916: CellExecutionError

During handling of the above exception, another exception occurred:

def test_func():
    with testbook(
        REPO_ROOT
        / "examples"
        / "Building-and-deploying-multi-stage-RecSys"
        / "01-Building-Recommender-Systems-with-Merlin.ipynb",
        execute=False,
    ) as tb1:
        tb1.inject(
            """
            import os
            os.environ["DATA_FOLDER"] = "/tmp/data/"
            os.environ["NUM_ROWS"] = "10000"
            os.system("mkdir -p /tmp/examples")
            os.environ["BASE_DIR"] = "/tmp/examples/"
            """
        )
        tb1.execute()
        assert os.path.isdir("/tmp/examples/dlrm")
        assert os.path.isdir("/tmp/examples/feature_repo")
        assert os.path.isdir("/tmp/examples/query_tower")
        assert os.path.isfile("/tmp/examples/item_embeddings.parquet")
        assert os.path.isfile("/tmp/examples/feature_repo/user_features.py")
        assert os.path.isfile("/tmp/examples/feature_repo/item_features.py")

    with testbook(
        REPO_ROOT
        / "examples"
        / "Building-and-deploying-multi-stage-RecSys"
        / "02-Deploying-multi-stage-RecSys-with-Merlin-Systems.ipynb",
        execute=False,
    ) as tb2:
        tb2.inject(
            """
            import os
            os.environ["DATA_FOLDER"] = "/tmp/data/"
            os.environ["BASE_DIR"] = "/tmp/examples/"
            """
        )
        NUM_OF_CELLS = len(tb2.cells)
        tb2.execute_cell(list(range(0, NUM_OF_CELLS - 3)))
        top_k = tb2.ref("top_k")
        outputs = tb2.ref("outputs")
        assert outputs[0] == "ordered_ids"
>       tb2.inject(
            """
            import shutil
            from merlin.models.loader.tf_utils import configure_tensorflow
            configure_tensorflow()
            from merlin.systems.triton.utils import run_ensemble_on_tritonserver
            response = run_ensemble_on_tritonserver(
                "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
            )
            response = [x.tolist()[0] for x in response["ordered_ids"]]
            shutil.rmtree("/tmp/examples/", ignore_errors=True)
            """
        )

tests/unit/examples/test_building_deploying_multi_stage_RecSys.py:57:


/usr/local/lib/python3.8/dist-packages/testbook/client.py:237: in inject
cell = TestbookNode(self.execute_cell(inject_idx)) if run else TestbookNode(code_cell)


self = <testbook.client.TestbookNotebookClient object at 0x7f5bed8cc9a0>
cell = [53], kwargs = {}, cell_indexes = [53], executed_cells = [], idx = 53

def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:
            cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)
        except CellExecutionError as ce:
>           raise TestbookRuntimeError(ce.evalue, ce, self._get_error_class(ce.ename))

E testbook.exceptions.TestbookRuntimeError: An error occurred while executing the following cell:
E ------------------
E
E import shutil
E from merlin.models.loader.tf_utils import configure_tensorflow
E configure_tensorflow()
E from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E response = run_ensemble_on_tritonserver(
E "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E )
E response = [x.tolist()[0] for x in response["ordered_ids"]]
E shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E ------------------
E
E     [InferenceServerException traceback identical to the one shown above, ending at /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute]

/usr/local/lib/python3.8/dist-packages/testbook/client.py:135: TestbookRuntimeError
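The chain above bottoms out in a plain Python TypeError inside the '3_queryfeast' instance: feast.py (line 299 in transform) gets no value back for a requested feature and then tries to cast None to int. A minimal sketch of that failure mode, using hypothetical variable names rather than the Merlin code:

    # Sketch: a feature lookup that returns no row yields None, and casting
    # None to int raises the same TypeError that the server wraps into the
    # InferenceServerException above.
    feature_value = None  # hypothetical: missing feature for the requested id

    try:
        encoded = int(feature_value)
    except TypeError as exc:
        print(f"caught: {exc}")
        # caught: int() argument must be a string, a bytes-like object
        # or a number, not 'NoneType'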
----------------------------- Captured stdout call -----------------------------
Signal (2) received.
----------------------------- Captured stderr call -----------------------------
2022-07-19 15:57:26.944646: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-19 15:57:28.923446: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-19 15:57:28.924235: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 15153 MB memory: -> device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib/python3.8/logging/__init__.py", line 2127, in shutdown
h.close()
File "/usr/local/lib/python3.8/dist-packages/absl/logging/__init__.py", line 934, in close
self.stream.close()
File "/usr/local/lib/python3.8/dist-packages/ipykernel/iostream.py", line 438, in close
self.watch_fd_thread.join()
AttributeError: 'OutStream' object has no attribute 'watch_fd_thread'
WARNING clustering 247 points to 32 centroids: please provide at least 1248 training points
2022-07-19 15:58:30.406939: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-19 15:58:32.391407: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-19 15:58:32.392142: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 15153 MB memory: -> device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0
I0719 15:58:37.543515 28041 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7fb3d6000000' with size 268435456
I0719 15:58:37.544271 28041 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0719 15:58:37.551429 28041 model_repository_manager.cc:1191] loading: 1_predicttensorflow:1
I0719 15:58:37.651774 28041 model_repository_manager.cc:1191] loading: 0_queryfeast:1
I0719 15:58:37.752037 28041 model_repository_manager.cc:1191] loading: 2_queryfaiss:1
I0719 15:58:37.852310 28041 model_repository_manager.cc:1191] loading: 3_queryfeast:1
I0719 15:58:37.927586 28041 tensorflow.cc:2181] TRITONBACKEND_Initialize: tensorflow
I0719 15:58:37.927628 28041 tensorflow.cc:2191] Triton TRITONBACKEND API version: 1.9
I0719 15:58:37.927635 28041 tensorflow.cc:2197] 'tensorflow' TRITONBACKEND API version: 1.9
I0719 15:58:37.927640 28041 tensorflow.cc:2221] backend configuration:
{"cmdline":{"auto-complete-config":"false","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","version":"2","default-max-batch-size":"4"}}
I0719 15:58:37.927676 28041 tensorflow.cc:2281] TRITONBACKEND_ModelInitialize: 1_predicttensorflow (version 1)
I0719 15:58:37.932634 28041 tensorflow.cc:2330] TRITONBACKEND_ModelInstanceInitialize: 1_predicttensorflow (GPU device 0)
I0719 15:58:37.952641 28041 model_repository_manager.cc:1191] loading: 4_unrollfeatures:1
I0719 15:58:38.052957 28041 model_repository_manager.cc:1191] loading: 5_predicttensorflow:1
I0719 15:58:38.153284 28041 model_repository_manager.cc:1191] loading: 6_softmaxsampling:1
2022-07-19 15:58:38.276456: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-07-19 15:58:38.280447: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2022-07-19 15:58:38.280493: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-07-19 15:58:38.280598: I tensorflow/core/platform/cpu_feature_guard.cc:152] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE3 SSE4.1 SSE4.2 AVX
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-19 15:58:38.328992: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 12901 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-19 15:58:38.369504: I tensorflow/cc/saved_model/loader.cc:230] Restoring SavedModel bundle.
2022-07-19 15:58:38.451269: I tensorflow/cc/saved_model/loader.cc:214] Running initialization op on SavedModel bundle at path: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-07-19 15:58:38.474947: I tensorflow/cc/saved_model/loader.cc:321] SavedModel load for tags { serve }; Status: success: OK. Took 198510 microseconds.
I0719 15:58:38.475194 28041 model_repository_manager.cc:1345] successfully loaded '1_predicttensorflow' version 1
I0719 15:58:38.479528 28041 tensorflow.cc:2281] TRITONBACKEND_ModelInitialize: 5_predicttensorflow (version 1)
I0719 15:58:38.481477 28041 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 0_queryfeast (GPU device 0)
I0719 15:58:40.787890 28041 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 2_queryfaiss (GPU device 0)
I0719 15:58:40.788199 28041 model_repository_manager.cc:1345] successfully loaded '0_queryfeast' version 1
I0719 15:58:43.192630 28041 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 3_queryfeast (GPU device 0)
I0719 15:58:43.194333 28041 model_repository_manager.cc:1345] successfully loaded '2_queryfaiss' version 1
I0719 15:58:45.486609 28041 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 4_unrollfeatures (GPU device 0)
I0719 15:58:45.486825 28041 model_repository_manager.cc:1345] successfully loaded '3_queryfeast' version 1
I0719 15:58:47.552551 28041 tensorflow.cc:2330] TRITONBACKEND_ModelInstanceInitialize: 5_predicttensorflow (GPU device 0)
I0719 15:58:47.552794 28041 model_repository_manager.cc:1345] successfully loaded '4_unrollfeatures' version 1
2022-07-19 15:58:47.554226: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-07-19 15:58:47.578557: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2022-07-19 15:58:47.578602: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-07-19 15:58:47.580642: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 12901 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-19 15:58:47.603270: I tensorflow/cc/saved_model/loader.cc:230] Restoring SavedModel bundle.
2022-07-19 15:58:47.760179: I tensorflow/cc/saved_model/loader.cc:214] Running initialization op on SavedModel bundle at path: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-07-19 15:58:47.823037: I tensorflow/cc/saved_model/loader.cc:321] SavedModel load for tags { serve }; Status: success: OK. Took 268825 microseconds.
I0719 15:58:47.823188 28041 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 6_softmaxsampling (GPU device 0)
I0719 15:58:47.823401 28041 model_repository_manager.cc:1345] successfully loaded '5_predicttensorflow' version 1
I0719 15:58:49.908781 28041 model_repository_manager.cc:1345] successfully loaded '6_softmaxsampling' version 1
I0719 15:58:49.911629 28041 model_repository_manager.cc:1191] loading: ensemble_model:1
I0719 15:58:50.012433 28041 model_repository_manager.cc:1345] successfully loaded 'ensemble_model' version 1
I0719 15:58:50.012615 28041 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0719 15:58:50.012729 28041 server.cc:583]
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tensorflow | /opt/tritonserver/backends/tensorflow2/libtriton_tensorflow2.so | {"cmdline":{"auto-complete-config":"false","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","version":"2","default-max-batch-size":"4"}} |
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0719 15:58:50.012872 28041 server.cc:626]
+---------------------+---------+--------+
| Model | Version | Status |
+---------------------+---------+--------+
| 0_queryfeast | 1 | READY |
| 1_predicttensorflow | 1 | READY |
| 2_queryfaiss | 1 | READY |
| 3_queryfeast | 1 | READY |
| 4_unrollfeatures | 1 | READY |
| 5_predicttensorflow | 1 | READY |
| 6_softmaxsampling | 1 | READY |
| ensemble_model | 1 | READY |
+---------------------+---------+--------+

I0719 15:58:50.076027 28041 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0719 15:58:50.076904 28041 tritonserver.cc:2138]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.22.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /tmp/examples/poc_ensemble |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0719 15:58:50.077736 28041 grpc_server.cc:4589] Started GRPCInferenceService at 0.0.0.0:8001
I0719 15:58:50.078088 28041 http_server.cc:3303] Started HTTPService at 0.0.0.0:8000
I0719 15:58:50.119100 28041 http_server.cc:178] Started Metrics Service at 0.0.0.0:8002
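With the three endpoints above up, server and model state can be checked from the client side before any inference request is sent. A minimal sketch, assuming tritonclient[grpc] is installed and the server from this log is still listening on localhost:

    import tritonclient.grpc as grpcclient

    # Probe the gRPC endpoint started above (port 8001).
    client = grpcclient.InferenceServerClient(url="localhost:8001")
    print("server ready:", client.is_server_ready())
    print("ensemble ready:", client.is_model_ready("ensemble_model"))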
W0719 15:58:51.098301 28041 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0719 15:58:51.098361 28041 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
W0719 15:58:52.098526 28041 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0719 15:58:52.098582 28041 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
W0719 15:58:53.123001 28041 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0719 15:58:53.123059 28041 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
0719 15:58:53.786537 28298 pb_stub.cc:749] Failed to process the request(s) for model '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)

Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"

At:
/tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute

I0719 15:58:53.790944 28041 server.cc:257] Waiting for in-flight requests to complete.
I0719 15:58:53.790971 28041 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0719 15:58:53.790979 28041 model_repository_manager.cc:1223] unloading: ensemble_model:1
I0719 15:58:53.791029 28041 model_repository_manager.cc:1223] unloading: 6_softmaxsampling:1
I0719 15:58:53.791067 28041 model_repository_manager.cc:1223] unloading: 5_predicttensorflow:1
I0719 15:58:53.791100 28041 model_repository_manager.cc:1223] unloading: 4_unrollfeatures:1
I0719 15:58:53.791134 28041 model_repository_manager.cc:1223] unloading: 3_queryfeast:1
I0719 15:58:53.791182 28041 model_repository_manager.cc:1223] unloading: 2_queryfaiss:1
I0719 15:58:53.791203 28041 model_repository_manager.cc:1223] unloading: 1_predicttensorflow:1
I0719 15:58:53.791228 28041 model_repository_manager.cc:1328] successfully unloaded 'ensemble_model' version 1
I0719 15:58:53.791262 28041 tensorflow.cc:2368] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0719 15:58:53.791274 28041 model_repository_manager.cc:1223] unloading: 0_queryfeast:1
I0719 15:58:53.791361 28041 server.cc:288] All models are stopped, unloading models
I0719 15:58:53.791373 28041 server.cc:295] Timeout 30: Found 7 live models and 0 in-flight non-inference requests
I0719 15:58:53.791545 28041 tensorflow.cc:2368] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0719 15:58:53.791616 28041 tensorflow.cc:2307] TRITONBACKEND_ModelFinalize: delete model state
I0719 15:58:53.791656 28041 tensorflow.cc:2307] TRITONBACKEND_ModelFinalize: delete model state
I0719 15:58:53.799692 28041 model_repository_manager.cc:1328] successfully unloaded '1_predicttensorflow' version 1
I0719 15:58:53.816374 28041 model_repository_manager.cc:1328] successfully unloaded '5_predicttensorflow' version 1
I0719 15:58:54.791531 28041 server.cc:295] Timeout 29: Found 5 live models and 0 in-flight non-inference requests
I0719 15:58:55.152476 28041 model_repository_manager.cc:1328] successfully unloaded '6_softmaxsampling' version 1
I0719 15:58:55.206269 28041 model_repository_manager.cc:1328] successfully unloaded '4_unrollfeatures' version 1
I0719 15:58:55.369325 28041 model_repository_manager.cc:1328] successfully unloaded '2_queryfaiss' version 1
/usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py:15: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ValueType.FLOAT: (np.float, False, False),
I0719 15:58:55.791715 28041 server.cc:295] Timeout 28: Found 2 live models and 0 in-flight non-inference requests
I0719 15:58:55.917770 28041 model_repository_manager.cc:1328] successfully unloaded '0_queryfeast' version 1
/usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py:15: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ValueType.FLOAT: (np.float, False, False),
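The fix this warning asks for is mechanical: use the builtin float (or np.float64 when a NumPy scalar type is really intended) instead of the deprecated np.float alias. A sketch:

    import numpy as np

    value = float("1.5")      # instead of np.float("1.5")
    scalar = np.float64(1.5)  # if a NumPy scalar type is specifically wanted
    print(value, scalar)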
I0719 15:58:56.505409 28041 model_repository_manager.cc:1328] successfully unloaded '3_queryfeast' version 1
I0719 15:58:56.791851 28041 server.cc:295] Timeout 27: Found 0 live models and 0 in-flight non-inference requests
=========================== short test summary info ============================
FAILED tests/unit/examples/test_building_deploying_multi_stage_RecSys.py::test_func
=================== 1 failed, 1 passed in 103.35s (0:01:43) ====================
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://github.com/gitapi/repos/NVIDIA-Merlin/Merlin/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_merlin] $ /bin/bash /tmp/jenkins10183754583790379217.sh
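For anyone reproducing this failure outside CI, the testbook pattern the test uses (inject setup code, execute a range of cells, pull kernel variables out with ref) can be exercised standalone. A minimal sketch; the notebook path and environment variable are hypothetical placeholders:

    from testbook import testbook

    @testbook("examples/demo.ipynb", execute=False)  # hypothetical notebook
    def test_notebook(tb):
        # Inject setup code before any notebook cell runs.
        tb.inject('import os; os.environ["NUM_ROWS"] = "10000"')
        # Execute every cell except the last one.
        tb.execute_cell(list(range(0, len(tb.cells) - 1)))
        # Pull a variable out of the running kernel and assert on it.
        outputs = tb.ref("outputs")
        assert outputs[0] == "ordered_ids"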

@gabrielspmoreira
Member Author

rerun tests

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #438 of commit dd36e3afd92da6d92cdab47fa4fbce43161d1c4b, no merge conflicts.
Running as SYSTEM
Setting status of dd36e3afd92da6d92cdab47fa4fbce43161d1c4b to PENDING with url https://10.20.13.93:8080/job/merlin_merlin/269/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_merlin
using credential systems-login
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/Merlin # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Merlin
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Merlin +refs/pull/438/*:refs/remotes/origin/pr/438/* # timeout=10
 > git rev-parse dd36e3afd92da6d92cdab47fa4fbce43161d1c4b^{commit} # timeout=10
Checking out Revision dd36e3afd92da6d92cdab47fa4fbce43161d1c4b (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f dd36e3afd92da6d92cdab47fa4fbce43161d1c4b # timeout=10
Commit message: "Merge branch 'main' into mm_integration_tests"
 > git rev-list --no-walk dd36e3afd92da6d92cdab47fa4fbce43161d1c4b # timeout=10
[merlin_merlin] $ /bin/bash /tmp/jenkins2714096223939441451.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_merlin/merlin
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 2 items

tests/unit/test_version.py . [ 50%]
tests/unit/examples/test_building_deploying_multi_stage_RecSys.py F [100%]

=================================== FAILURES ===================================
__________________________________ test_func ___________________________________

self = <testbook.client.TestbookNotebookClient object at 0x7fdc0f3b8b20>
cell = [53], kwargs = {}, cell_indexes = [53], executed_cells = [], idx = 53

def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:
>           cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)

/usr/local/lib/python3.8/dist-packages/testbook/client.py:133:


args = (<testbook.client.TestbookNotebookClient object at 0x7fdc0f3b8b20>, {'id': '4defd03b', 'cell_type': 'code', 'metadata'...ast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}, 53)
kwargs = {}

def wrapped(*args, **kwargs):
>   return just_run(coro(*args, **kwargs))

/usr/local/lib/python3.8/dist-packages/nbclient/util.py:85:


coro = <coroutine object NotebookClient.async_execute_cell at 0x7fdc0f3283c0>

def just_run(coro: Awaitable) -> Any:
    """Make the coroutine run, even if there is an event loop running (using nest_asyncio)"""
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        loop = None
    if loop is None:
        had_running_loop = False
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
    else:
        had_running_loop = True
    if had_running_loop:
        # if there is a running loop, we patch using nest_asyncio
        # to have reentrant event loops
        check_ipython()
        import nest_asyncio

        nest_asyncio.apply()
        check_patch_tornado()
>   return loop.run_until_complete(coro)

/usr/local/lib/python3.8/dist-packages/nbclient/util.py:60:


self = <_UnixSelectorEventLoop running=False closed=False debug=False>
future = <Task finished name='Task-365' coro=<NotebookClient.async_execute_cell() done, defined at /usr/local/lib/python3.8/dis...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n\n')>

def run_until_complete(self, future):
    """Run until the Future is done.

    If the argument is a coroutine, it is wrapped in a Task.

    WARNING: It would be disastrous to call run_until_complete()
    with the same coroutine twice -- it would wrap it in two
    different Tasks and that can't be good.

    Return the Future's result, or raise its exception.
    """
    self._check_closed()
    self._check_running()

    new_task = not futures.isfuture(future)
    future = tasks.ensure_future(future, loop=self)
    if new_task:
        # An exception is raised if the future didn't complete, so there
        # is no need to log the "destroy pending task" message
        future._log_destroy_pending = False

    future.add_done_callback(_run_until_complete_cb)
    try:
        self.run_forever()
    except:
        if new_task and future.done() and not future.cancelled():
            # The coroutine raised a BaseException. Consume the exception
            # to not log a warning, the caller doesn't have access to the
            # local task.
            future.exception()
        raise
    finally:
        future.remove_done_callback(_run_until_complete_cb)
    if not future.done():
        raise RuntimeError('Event loop stopped before Future completed.')
>   return future.result()

/usr/lib/python3.8/asyncio/base_events.py:616:


self = <testbook.client.TestbookNotebookClient object at 0x7fdc0f3b8b20>
cell = {'id': '4defd03b', 'cell_type': 'code', 'metadata': {'execution': {'iopub.status.busy': '2022-07-20T18:04:53.756452Z',...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}
cell_index = 53, execution_count = None, store_history = True

async def async_execute_cell(
    self,
    cell: NotebookNode,
    cell_index: int,
    execution_count: t.Optional[int] = None,
    store_history: bool = True,
) -> NotebookNode:
    """
    Executes a single code cell.

    To execute all cells see :meth:`execute`.

    Parameters
    ----------
    cell : nbformat.NotebookNode
        The cell which is currently being processed.
    cell_index : int
        The position of the cell within the notebook object.
    execution_count : int
        The execution count to be assigned to the cell (default: Use kernel response)
    store_history : bool
        Determines if history should be stored in the kernel (default: False).
        Specific to ipython kernels, which can store command histories.

    Returns
    -------
    output : dict
        The execution output payload (or None for no output).

    Raises
    ------
    CellExecutionError
        If execution failed and should raise an exception, this will be raised
        with defaults about the failure.

    Returns
    -------
    cell : NotebookNode
        The cell which was just processed.
    """
    assert self.kc is not None

    await run_hook(self.on_cell_start, cell=cell, cell_index=cell_index)

    if cell.cell_type != 'code' or not cell.source.strip():
        self.log.debug("Skipping non-executing cell %s", cell_index)
        return cell

    if self.skip_cells_with_tag in cell.metadata.get("tags", []):
        self.log.debug("Skipping tagged cell %s", cell_index)
        return cell

    if self.record_timing:  # clear execution metadata prior to execution
        cell['metadata']['execution'] = {}

    self.log.debug("Executing cell:\n%s", cell.source)

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors or "raises-exception" in cell.metadata.get("tags", [])
    )

    await run_hook(self.on_cell_execute, cell=cell, cell_index=cell_index)
    parent_msg_id = await ensure_async(
        self.kc.execute(
            cell.source, store_history=store_history, stop_on_error=not cell_allows_errors
        )
    )
    await run_hook(self.on_cell_complete, cell=cell, cell_index=cell_index)
    # We launched a code cell to execute
    self.code_cells_executed += 1
    exec_timeout = self._get_timeout(cell)

    cell.outputs = []
    self.clear_before_next_output = False

    task_poll_kernel_alive = asyncio.ensure_future(self._async_poll_kernel_alive())
    task_poll_output_msg = asyncio.ensure_future(
        self._async_poll_output_msg(parent_msg_id, cell, cell_index)
    )
    self.task_poll_for_reply = asyncio.ensure_future(
        self._async_poll_for_reply(
            parent_msg_id, cell, exec_timeout, task_poll_output_msg, task_poll_kernel_alive
        )
    )
    try:
        exec_reply = await self.task_poll_for_reply
    except asyncio.CancelledError:
        # can only be cancelled by task_poll_kernel_alive when the kernel is dead
        task_poll_output_msg.cancel()
        raise DeadKernelError("Kernel died")
    except Exception as e:
        # Best effort to cancel request if it hasn't been resolved
        try:
            # Check if the task_poll_output is doing the raising for us
            if not isinstance(e, CellControlSignal):
                task_poll_output_msg.cancel()
        finally:
            raise

    if execution_count:
        cell['execution_count'] = execution_count
    await run_hook(
        self.on_cell_executed, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
>   await self._check_raise_for_error(cell, cell_index, exec_reply)

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:1022:


self = <testbook.client.TestbookNotebookClient object at 0x7fdc0f3b8b20>
cell = {'id': '4defd03b', 'cell_type': 'code', 'metadata': {'execution': {'iopub.status.busy': '2022-07-20T18:04:53.756452Z',...ps/feast.py, line 299 in transform>]"\n\nAt:\n /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}
cell_index = 53
exec_reply = {'buffers': [], 'content': {'ename': 'InferenceServerException', 'engine_info': {'engine_id': -1, 'engine_uuid': '0dc4...e, 'engine': '0dc40953-7dc1-4ae9-8838-c7ce0e180cef', 'started': '2022-07-20T18:04:53.756748Z', 'status': 'error'}, ...}

async def _check_raise_for_error(
    self, cell: NotebookNode, cell_index: int, exec_reply: t.Optional[t.Dict]
) -> None:

    if exec_reply is None:
        return None

    exec_reply_content = exec_reply['content']
    if exec_reply_content['status'] != 'error':
        return None

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors
        or exec_reply_content.get('ename') in self.allow_error_names
        or "raises-exception" in cell.metadata.get("tags", [])
    )
    await run_hook(
        self.on_cell_error, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
    if not cell_allows_errors:
>       raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)

E nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
E ------------------
E
E import shutil
E from merlin.models.loader.tf_utils import configure_tensorflow
E configure_tensorflow()
E from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E response = run_ensemble_on_tritonserver(
E "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E )
E response = [x.tolist()[0] for x in response["ordered_ids"]]
E shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E ------------------
E
E     [InferenceServerException traceback identical to the one shown in the previous CI log above, ending at /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute]

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:916: CellExecutionError

During handling of the above exception, another exception occurred:

def test_func():
    with testbook(
        REPO_ROOT
        / "examples"
        / "Building-and-deploying-multi-stage-RecSys"
        / "01-Building-Recommender-Systems-with-Merlin.ipynb",
        execute=False,
    ) as tb1:
        tb1.inject(
            """
            import os
            os.environ["DATA_FOLDER"] = "/tmp/data/"
            os.environ["NUM_ROWS"] = "10000"
            os.system("mkdir -p /tmp/examples")
            os.environ["BASE_DIR"] = "/tmp/examples/"
            """
        )
        tb1.execute()
        assert os.path.isdir("/tmp/examples/dlrm")
        assert os.path.isdir("/tmp/examples/feature_repo")
        assert os.path.isdir("/tmp/examples/query_tower")
        assert os.path.isfile("/tmp/examples/item_embeddings.parquet")
        assert os.path.isfile("/tmp/examples/feature_repo/user_features.py")
        assert os.path.isfile("/tmp/examples/feature_repo/item_features.py")

    with testbook(
        REPO_ROOT
        / "examples"
        / "Building-and-deploying-multi-stage-RecSys"
        / "02-Deploying-multi-stage-RecSys-with-Merlin-Systems.ipynb",
        execute=False,
    ) as tb2:
        tb2.inject(
            """
            import os
            os.environ["DATA_FOLDER"] = "/tmp/data/"
            os.environ["BASE_DIR"] = "/tmp/examples/"
            """
        )
        NUM_OF_CELLS = len(tb2.cells)
        tb2.execute_cell(list(range(0, NUM_OF_CELLS - 3)))
        top_k = tb2.ref("top_k")
        outputs = tb2.ref("outputs")
        assert outputs[0] == "ordered_ids"
>       tb2.inject(
            """
            import shutil
            from merlin.models.loader.tf_utils import configure_tensorflow
            configure_tensorflow()
            from merlin.systems.triton.utils import run_ensemble_on_tritonserver
            response = run_ensemble_on_tritonserver(
                "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
            )
            response = [x.tolist()[0] for x in response["ordered_ids"]]
            shutil.rmtree("/tmp/examples/", ignore_errors=True)
            """
        )

tests/unit/examples/test_building_deploying_multi_stage_RecSys.py:57:


/usr/local/lib/python3.8/dist-packages/testbook/client.py:237: in inject
cell = TestbookNode(self.execute_cell(inject_idx)) if run else TestbookNode(code_cell)


self = <testbook.client.TestbookNotebookClient object at 0x7fdc0f3b8b20>
cell = [53], kwargs = {}, cell_indexes = [53], executed_cells = [], idx = 53

def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:
            cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)
        except CellExecutionError as ce:
>           raise TestbookRuntimeError(ce.evalue, ce, self._get_error_class(ce.ename))

E testbook.exceptions.TestbookRuntimeError: An error occurred while executing the following cell:
E ------------------
E
E import shutil
E from merlin.models.loader.tf_utils import configure_tensorflow
E configure_tensorflow()
E from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E response = run_ensemble_on_tritonserver(
E "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E )
E response = [x.tolist()[0] for x in response["ordered_ids"]]
E shutil.rmtree("/tmp/examples/", ignore_errors=True)
E
E ------------------
E
E     [InferenceServerException traceback identical to the one shown above]

/usr/local/lib/python3.8/dist-packages/testbook/client.py:135: TestbookRuntimeError
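Reading the exception above, the root cause is two-layered: the `3_queryfeast` Python-backend model hits a `TypeError` (`int()` called on `None`) while transforming features via `feast.py`, and its error-handling path then passes the formatted traceback as a plain string to `InferenceResponse`, whose `error` argument must be a `TritonError` instance per the constructor signature quoted in the log. Below is a minimal sketch of the error-handling shape Triton's Python backend expects, assuming the standard `triton_python_backend_utils` module the backend injects at runtime; `_transform` is a hypothetical stand-in for the model's per-request logic, not the actual method in `model.py`.

```python
import traceback

import triton_python_backend_utils as pb_utils  # provided by the Triton runtime


class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            try:
                # `_transform` is a hypothetical per-request feature lookup.
                output_tensors = self._transform(request)
                responses.append(pb_utils.InferenceResponse(output_tensors=output_tensors))
            except Exception:
                # Wrap the traceback in a TritonError; passing the raw string
                # (as the failing model does) raises the TypeError seen above.
                err = pb_utils.TritonError(traceback.format_exc())
                responses.append(pb_utils.InferenceResponse(output_tensors=[], error=err))
        return responses
```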
----------------------------- Captured stdout call -----------------------------
Signal (2) received.
----------------------------- Captured stderr call -----------------------------
2022-07-20 18:03:37.861905: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-20 18:03:39.850956: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-20 18:03:39.851704: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 15153 MB memory: -> device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib/python3.8/logging/init.py", line 2127, in shutdown
h.close()
File "/usr/local/lib/python3.8/dist-packages/absl/logging/init.py", line 934, in close
self.stream.close()
File "/usr/local/lib/python3.8/dist-packages/ipykernel/iostream.py", line 438, in close
self.watch_fd_thread.join()
AttributeError: 'OutStream' object has no attribute 'watch_fd_thread'
WARNING clustering 246 points to 32 centroids: please provide at least 1248 training points
2022-07-20 18:04:46.856305: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-20 18:04:48.848933: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-20 18:04:48.849703: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 15153 MB memory: -> device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0
I0720 18:04:54.048010 30654 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7fbe04000000' with size 268435456
I0720 18:04:54.048793 30654 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0720 18:04:54.056094 30654 model_repository_manager.cc:1191] loading: 1_predicttensorflow:1
I0720 18:04:54.156404 30654 model_repository_manager.cc:1191] loading: 0_queryfeast:1
I0720 18:04:54.256663 30654 model_repository_manager.cc:1191] loading: 2_queryfaiss:1
I0720 18:04:54.356937 30654 model_repository_manager.cc:1191] loading: 3_queryfeast:1
I0720 18:04:54.437341 30654 tensorflow.cc:2181] TRITONBACKEND_Initialize: tensorflow
I0720 18:04:54.437377 30654 tensorflow.cc:2191] Triton TRITONBACKEND API version: 1.9
I0720 18:04:54.437384 30654 tensorflow.cc:2197] 'tensorflow' TRITONBACKEND API version: 1.9
I0720 18:04:54.437389 30654 tensorflow.cc:2221] backend configuration:
{"cmdline":{"auto-complete-config":"false","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","version":"2","default-max-batch-size":"4"}}
I0720 18:04:54.437424 30654 tensorflow.cc:2281] TRITONBACKEND_ModelInitialize: 1_predicttensorflow (version 1)
I0720 18:04:54.442433 30654 tensorflow.cc:2330] TRITONBACKEND_ModelInstanceInitialize: 1_predicttensorflow (GPU device 0)
I0720 18:04:54.457215 30654 model_repository_manager.cc:1191] loading: 4_unrollfeatures:1
I0720 18:04:54.557507 30654 model_repository_manager.cc:1191] loading: 5_predicttensorflow:1
I0720 18:04:54.657878 30654 model_repository_manager.cc:1191] loading: 6_softmaxsampling:1
2022-07-20 18:04:54.788287: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-07-20 18:04:54.792467: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2022-07-20 18:04:54.792516: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-07-20 18:04:54.792622: I tensorflow/core/platform/cpu_feature_guard.cc:152] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE3 SSE4.1 SSE4.2 AVX
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-20 18:04:54.839238: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 12901 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-20 18:04:54.888048: I tensorflow/cc/saved_model/loader.cc:230] Restoring SavedModel bundle.
2022-07-20 18:04:54.972681: I tensorflow/cc/saved_model/loader.cc:214] Running initialization op on SavedModel bundle at path: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-07-20 18:04:54.996636: I tensorflow/cc/saved_model/loader.cc:321] SavedModel load for tags { serve }; Status: success: OK. Took 208369 microseconds.
I0720 18:04:54.996890 30654 model_repository_manager.cc:1345] successfully loaded '1_predicttensorflow' version 1
I0720 18:04:55.001358 30654 tensorflow.cc:2281] TRITONBACKEND_ModelInitialize: 5_predicttensorflow (version 1)
I0720 18:04:55.002680 30654 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 0_queryfeast (GPU device 0)
I0720 18:04:57.283545 30654 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 2_queryfaiss (GPU device 0)
I0720 18:04:57.283862 30654 model_repository_manager.cc:1345] successfully loaded '0_queryfeast' version 1
I0720 18:04:59.664708 30654 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 3_queryfeast (GPU device 0)
I0720 18:04:59.666447 30654 model_repository_manager.cc:1345] successfully loaded '2_queryfaiss' version 1
I0720 18:05:01.969638 30654 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 4_unrollfeatures (GPU device 0)
I0720 18:05:01.969838 30654 model_repository_manager.cc:1345] successfully loaded '3_queryfeast' version 1
I0720 18:05:04.078166 30654 tensorflow.cc:2330] TRITONBACKEND_ModelInstanceInitialize: 5_predicttensorflow (GPU device 0)
I0720 18:05:04.078432 30654 model_repository_manager.cc:1345] successfully loaded '4_unrollfeatures' version 1
2022-07-20 18:05:04.079810: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-07-20 18:05:04.100970: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2022-07-20 18:05:04.101018: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-07-20 18:05:04.103295: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 12901 MB memory: -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-20 18:05:04.125809: I tensorflow/cc/saved_model/loader.cc:230] Restoring SavedModel bundle.
2022-07-20 18:05:04.286472: I tensorflow/cc/saved_model/loader.cc:214] Running initialization op on SavedModel bundle at path: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-07-20 18:05:04.336108: I tensorflow/cc/saved_model/loader.cc:321] SavedModel load for tags { serve }; Status: success: OK. Took 256312 microseconds.
I0720 18:05:04.336253 30654 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 6_softmaxsampling (GPU device 0)
I0720 18:05:04.336397 30654 model_repository_manager.cc:1345] successfully loaded '5_predicttensorflow' version 1
I0720 18:05:06.450841 30654 model_repository_manager.cc:1345] successfully loaded '6_softmaxsampling' version 1
I0720 18:05:06.453322 30654 model_repository_manager.cc:1191] loading: ensemble_model:1
I0720 18:05:06.554133 30654 model_repository_manager.cc:1345] successfully loaded 'ensemble_model' version 1
I0720 18:05:06.554299 30654 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0720 18:05:06.554397 30654 server.cc:583]
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tensorflow | /opt/tritonserver/backends/tensorflow2/libtriton_tensorflow2.so | {"cmdline":{"auto-complete-config":"false","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","version":"2","default-max-batch-size":"4"}} |
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0720 18:05:06.554503 30654 server.cc:626]
+---------------------+---------+--------+
| Model | Version | Status |
+---------------------+---------+--------+
| 0_queryfeast | 1 | READY |
| 1_predicttensorflow | 1 | READY |
| 2_queryfaiss | 1 | READY |
| 3_queryfeast | 1 | READY |
| 4_unrollfeatures | 1 | READY |
| 5_predicttensorflow | 1 | READY |
| 6_softmaxsampling | 1 | READY |
| ensemble_model | 1 | READY |
+---------------------+---------+--------+

I0720 18:05:06.617118 30654 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0720 18:05:06.617969 30654 tritonserver.cc:2138]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.22.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /tmp/examples/poc_ensemble |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0720 18:05:06.618820 30654 grpc_server.cc:4589] Started GRPCInferenceService at 0.0.0.0:8001
I0720 18:05:06.619335 30654 http_server.cc:3303] Started HTTPService at 0.0.0.0:8000
I0720 18:05:06.660539 30654 http_server.cc:178] Started Metrics Service at 0.0.0.0:8002
W0720 18:05:07.637266 30654 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0720 18:05:07.637330 30654 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
W0720 18:05:08.637489 30654 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0720 18:05:08.637544 30654 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
0720 18:05:09.290165 30911 pb_stub.cc:749] Failed to process the request(s) for model '3_queryfeast', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)

Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"

At:
/tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute

I0720 18:05:09.294746 30654 server.cc:257] Waiting for in-flight requests to complete.
I0720 18:05:09.294816 30654 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0720 18:05:09.294834 30654 model_repository_manager.cc:1223] unloading: ensemble_model:1
I0720 18:05:09.294923 30654 model_repository_manager.cc:1223] unloading: 6_softmaxsampling:1
I0720 18:05:09.294991 30654 model_repository_manager.cc:1223] unloading: 5_predicttensorflow:1
I0720 18:05:09.295066 30654 model_repository_manager.cc:1223] unloading: 4_unrollfeatures:1
I0720 18:05:09.295088 30654 model_repository_manager.cc:1328] successfully unloaded 'ensemble_model' version 1
I0720 18:05:09.295122 30654 model_repository_manager.cc:1223] unloading: 3_queryfeast:1
I0720 18:05:09.295219 30654 tensorflow.cc:2368] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0720 18:05:09.295257 30654 model_repository_manager.cc:1223] unloading: 2_queryfaiss:1
I0720 18:05:09.295308 30654 model_repository_manager.cc:1223] unloading: 1_predicttensorflow:1
I0720 18:05:09.295336 30654 tensorflow.cc:2307] TRITONBACKEND_ModelFinalize: delete model state
I0720 18:05:09.295381 30654 model_repository_manager.cc:1223] unloading: 0_queryfeast:1
I0720 18:05:09.295430 30654 server.cc:288] All models are stopped, unloading models
I0720 18:05:09.295465 30654 server.cc:295] Timeout 30: Found 7 live models and 0 in-flight non-inference requests
I0720 18:05:09.295560 30654 tensorflow.cc:2368] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0720 18:05:09.295782 30654 tensorflow.cc:2307] TRITONBACKEND_ModelFinalize: delete model state
I0720 18:05:09.310762 30654 model_repository_manager.cc:1328] successfully unloaded '1_predicttensorflow' version 1
I0720 18:05:09.316485 30654 model_repository_manager.cc:1328] successfully unloaded '5_predicttensorflow' version 1
W0720 18:05:09.656038 30654 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0720 18:05:09.656086 30654 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
I0720 18:05:10.295579 30654 server.cc:295] Timeout 29: Found 5 live models and 0 in-flight non-inference requests
I0720 18:05:10.776344 30654 model_repository_manager.cc:1328] successfully unloaded '6_softmaxsampling' version 1
I0720 18:05:10.800218 30654 model_repository_manager.cc:1328] successfully unloaded '4_unrollfeatures' version 1
I0720 18:05:10.970323 30654 model_repository_manager.cc:1328] successfully unloaded '2_queryfaiss' version 1
I0720 18:05:11.295732 30654 server.cc:295] Timeout 28: Found 2 live models and 0 in-flight non-inference requests
I0720 18:05:12.295879 30654 server.cc:295] Timeout 27: Found 2 live models and 0 in-flight non-inference requests
I0720 18:05:13.296011 30654 server.cc:295] Timeout 26: Found 2 live models and 0 in-flight non-inference requests
I0720 18:05:14.296142 30654 server.cc:295] Timeout 25: Found 2 live models and 0 in-flight non-inference requests
I0720 18:05:15.296279 30654 server.cc:295] Timeout 24: Found 2 live models and 0 in-flight non-inference requests
I0720 18:05:16.296417 30654 server.cc:295] Timeout 23: Found 2 live models and 0 in-flight non-inference requests
I0720 18:05:17.296556 30654 server.cc:295] Timeout 22: Found 2 live models and 0 in-flight non-inference requests
I0720 18:05:18.296691 30654 server.cc:295] Timeout 21: Found 2 live models and 0 in-flight non-inference requests
I0720 18:05:19.296829 30654 server.cc:295] Timeout 20: Found 2 live models and 0 in-flight non-inference requests
/usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py:15: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ValueType.FLOAT: (np.float, False, False),
/usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py:15: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ValueType.FLOAT: (np.float, False, False),
I0720 18:05:19.773349 30654 model_repository_manager.cc:1328] successfully unloaded '3_queryfeast' version 1
I0720 18:05:19.835036 30654 model_repository_manager.cc:1328] successfully unloaded '0_queryfeast' version 1
I0720 18:05:20.296963 30654 server.cc:295] Timeout 19: Found 0 live models and 0 in-flight non-inference requests
=========================== short test summary info ============================
FAILED tests/unit/examples/test_building_deploying_multi_stage_RecSys.py::test_func
=================== 1 failed, 1 passed in 116.07s (0:01:56) ====================
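Separately, the server log above repeats a `DeprecationWarning` from `merlin/systems/dag/ops/feast.py` line 15, where the Feast value-type mapping still uses the `np.float` alias deprecated in NumPy 1.20. A minimal sketch of the substitution NumPy recommends, assuming Feast's `ValueType` enum and keeping the tuple layout quoted in the warning (`FEAST_VALUE_TYPE_MAP` is an illustrative name, not the variable in `feast.py`):

```python
import numpy as np
from feast import ValueType

FEAST_VALUE_TYPE_MAP = {
    # was: ValueType.FLOAT: (np.float, False, False)
    ValueType.FLOAT: (np.float64, False, False),  # np.float64 keeps the NumPy scalar type
}
```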
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://github.com/gitapi/repos/NVIDIA-Merlin/Merlin/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_merlin] $ /bin/bash /tmp/jenkins2302420003527094502.sh
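For anyone reproducing this failure outside CI, this is the shape of the client-side call the failing test makes, using the signature visible in the traceback (`run_ensemble_on_tritonserver(tmpdir, output_columns, df, model_name)`). The repository path and ensemble name match the log; the output column and request frame are illustrative placeholders.

```python
import pandas as pd

from merlin.systems.triton.utils import run_ensemble_on_tritonserver

request_df = pd.DataFrame({"user_id": [1]})  # placeholder raw request features

response = run_ensemble_on_tritonserver(
    "/tmp/examples/poc_ensemble",  # model repository path from the log
    ["ordered_ids"],               # hypothetical output column list
    request_df,
    "ensemble_model",              # ensemble name from the Triton model table
)
```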

@gabrielspmoreira gabrielspmoreira merged commit c72bcbf into main Jul 20, 2022
@viswa-nvidia viswa-nvidia added this to the Merlin 22.08 milestone Jul 22, 2022