Remove drop_duplicates() from SAR method fix #1464 #1588

Merged: 10 commits into staging from miguelgfierro-patch-2, Feb 28, 2022

Conversation

miguelgfierro
Collaborator

Description

Related Issues

Fix #1464

Checklist:

  • I have followed the contribution guidelines and code style for this project.
  • I have added tests covering my contributions.
  • I have updated the documentation accordingly.
  • This PR is being made to staging branch and not to main branch.

@miguelgfierro
Collaborator Author

Error from the unit tests:

tests/unit/recommenders/models/test_sar_singlenode.py ......FFFFFF...... [ 74%]
________________ test_sar_item_similarity[1-cooccurrence-count] ________________

threshold = 1, similarity_type = 'cooccurrence', file = 'count'
demo_usage_data =                  UserId    MovieId     Timestamp  Rating
0      0003000098E85347  DQF-00358  1.433879e+09     1.0
1   ...BAE  DQF-00248  1.416292e+09     1.0
11837  00030000822E3BAE  DAF-00448  1.416292e+09     1.0

[11838 rows x 4 columns]
sar_settings = {'ATOL': 1e-08, 'FILE_DIR': 'https://recodatasets.z20.web.core.windows.net/sarunittest/', 'TEST_USER_ID': '0003000098E85347'}
header = {'col_item': 'MovieId', 'col_rating': 'Rating', 'col_timestamp': 'Timestamp', 'col_user': 'UserId'}

    @pytest.mark.parametrize(
        "threshold,similarity_type,file",
        [
            (1, "cooccurrence", "count"),
            (1, "jaccard", "jac"),
            (1, "lift", "lift"),
            (3, "cooccurrence", "count"),
            (3, "jaccard", "jac"),
            (3, "lift", "lift"),
        ],
    )
    def test_sar_item_similarity(
        threshold, similarity_type, file, demo_usage_data, sar_settings, header
    ):
    
        model = SARSingleNode(
            similarity_type=similarity_type,
            timedecay_formula=False,
            time_decay_coefficient=30,
            threshold=threshold,
            **header
        )
    
        model.fit(demo_usage_data)
    
        true_item_similarity, row_ids, col_ids = read_matrix(
            sar_settings["FILE_DIR"] + "sim_" + file + str(threshold) + ".csv"
        )
    
        if similarity_type == "cooccurrence":
            test_item_similarity = _rearrange_to_test(
                model.item_similarity.todense(),
                row_ids,
                col_ids,
                model.item2index,
                model.item2index,
            )
>           assert np.array_equal(
                true_item_similarity.astype(test_item_similarity.dtype),
                test_item_similarity,
            )
E           AssertionError: assert False
E            +  where False = <function array_equal at 0x7f5b7d41dd40>(array([[ 31.,   0.,   0., ...,   0.,  16.,   0.],\n       [  0.,  29.,   5., ...,   0.,   0.,   0.],\n       [  0.,   5.... 60.,  12.,   1.],\n       [ 16.,   0.,   0., ...,  12., 149.,   0.],\n       [  0.,   0.,   0., ...,   1.,   0.,  29.]]), matrix([[ 55.,   0.,   0., ...,   0.,  20.,   0.],\n        [  0.,  35.,   6., ...,   0.,   0.,   0.],\n        [  0.,  ...3.,  15.,   1.],\n        [ 20.,   0.,   0., ...,  15., 243.,   0.],\n        [  0.,   0.,   0., ...,   1.,   0., 103.]]))
E            +    where <function array_equal at 0x7f5b7d41dd40> = np.array_equal
E            +    and   array([[ 31.,   0.,   0., ...,   0.,  16.,   0.],\n       [  0.,  29.,   5., ...,   0.,   0.,   0.],\n       [  0.,   5.... 60.,  12.,   1.],\n       [ 16.,   0.,   0., ...,  12., 149.,   0.],\n       [  0.,   0.,   0., ...,   1.,   0.,  29.]]) = <built-in method astype of numpy.ndarray object at 0x7f5b407a5b10>(dtype('float64'))
E            +      where <built-in method astype of numpy.ndarray object at 0x7f5b407a5b10> = array([['31', '0', '0', ..., '0', '16', '0'],\n       ['0', '29', '5', ..., '0', '0', '0'],\n       ['0', '5', '55', ...... '12', '1'],\n       ['16', '0', '0', ..., '12', '149', '0'],\n       ['0', '0', '0', ..., '1', '0', '29']], dtype='<U4').astype
E            +      and   dtype('float64') = matrix([[ 55.,   0.,   0., ...,   0.,  20.,   0.],\n        [  0.,  35.,   6., ...,   0.,   0.,   0.],\n        [  0.,  ...3.,  15.,   1.],\n        [ 20.,   0.,   0., ...,  15., 243.,   0.],\n        [  0.,   0.,   0., ...,   1.,   0., 103.]]).dtype

tests/unit/recommenders/models/test_sar_singlenode.py:175: AssertionError
___________________ test_sar_item_similarity[1-jaccard-jac] ____________________

threshold = 1, similarity_type = 'jaccard', file = 'jac'
demo_usage_data =                  UserId    MovieId     Timestamp  Rating
0      0003000098E85347  DQF-00358  1.433879e+09     1.0
1   ...BAE  DQF-00248  1.416292e+09     1.0
11837  00030000822E3BAE  DAF-00448  1.416292e+09     1.0

[11838 rows x 4 columns]
sar_settings = {'ATOL': 1e-08, 'FILE_DIR': 'https://recodatasets.z20.web.core.windows.net/sarunittest/', 'TEST_USER_ID': '0003000098E85347'}
header = {'col_item': 'MovieId', 'col_rating': 'Rating', 'col_timestamp': 'Timestamp', 'col_user': 'UserId'}

    @pytest.mark.parametrize(
        "threshold,similarity_type,file",
        [
            (1, "cooccurrence", "count"),
            (1, "jaccard", "jac"),
            (1, "lift", "lift"),
            (3, "cooccurrence", "count"),
            (3, "jaccard", "jac"),
            (3, "lift", "lift"),
        ],
    )
    def test_sar_item_similarity(
        threshold, similarity_type, file, demo_usage_data, sar_settings, header
    ):
    
        model = SARSingleNode(
            similarity_type=similarity_type,
            timedecay_formula=False,
            time_decay_coefficient=30,
            threshold=threshold,
            **header
        )
    
        model.fit(demo_usage_data)
    
        true_item_similarity, row_ids, col_ids = read_matrix(
            sar_settings["FILE_DIR"] + "sim_" + file + str(threshold) + ".csv"
        )
    
        if similarity_type == "cooccurrence":
            test_item_similarity = _rearrange_to_test(
                model.item_similarity.todense(),
                row_ids,
                col_ids,
                model.item2index,
                model.item2index,
            )
            assert np.array_equal(
                true_item_similarity.astype(test_item_similarity.dtype),
                test_item_similarity,
            )
        else:
            test_item_similarity = _rearrange_to_test(
                model.item_similarity, row_ids, col_ids, model.item2index, model.item2index
            )
>           assert np.allclose(
                true_item_similarity.astype(test_item_similarity.dtype),
                test_item_similarity,
                atol=sar_settings["ATOL"],
            )
E           AssertionError: assert False
E            +  where False = <function allclose at 0x7f5b7d41d8c0>(array([[1.        , 0.        , 0.        , ..., 0.        , 0.09756098,\n        0.        ],\n       [0.        , 1.  ...  ,\n        0.        ],\n       [0.        , 0.        , 0.        , ..., 0.01136364, 0.        ,\n        1.        ]]), array([[1.        , 0.        , 0.        , ..., 0.        , 0.07194245,\n        0.        ],\n       [0.        , 1.  ...  ,\n        0.        ],\n       [0.        , 0.        , 0.        , ..., 0.00273973, 0.        ,\n        1.        ]]), atol=1e-08)
E            +    where <function allclose at 0x7f5b7d41d8c0> = np.allclose
E            +    and   array([[1.        , 0.        , 0.        , ..., 0.        , 0.09756098,\n        0.        ],\n       [0.        , 1.  ...  ,\n        0.        ],\n       [0.        , 0.        , 0.        , ..., 0.01136364, 0.        ,\n        1.        ]]) = <built-in method astype of numpy.ndarray object at 0x7f5b400eb090>(dtype('float64'))
E            +      where <built-in method astype of numpy.ndarray object at 0x7f5b400eb090> = array([['1', '0', '0', ..., '0', '0.0975609756097561', '0'],\n       ['0', '1', '0.0632911392405063', ..., '0', '0', '0...., '0.0609137055837563', '1',\n        '0'],\n       ['0', '0', '0', ..., '0.0113636363636364', '0', '1']], dtype='<U20').astype
E            +      and   dtype('float64') = array([[1.        , 0.        , 0.        , ..., 0.        , 0.07194245,\n        0.        ],\n       [0.        , 1.  ...  ,\n        0.        ],\n       [0.        , 0.        , 0.        , ..., 0.00273973, 0.        ,\n        1.        ]]).dtype

tests/unit/recommenders/models/test_sar_singlenode.py:183: AssertionError
____________________ test_sar_item_similarity[1-lift-lift] _____________________

threshold = 1, similarity_type = 'lift', file = 'lift'
demo_usage_data =                  UserId    MovieId     Timestamp  Rating
0      0003000098E85347  DQF-00358  1.433879e+09     1.0
1   ...BAE  DQF-00248  1.416292e+09     1.0
11837  00030000822E3BAE  DAF-00448  1.416292e+09     1.0

[11838 rows x 4 columns]
sar_settings = {'ATOL': 1e-08, 'FILE_DIR': 'https://recodatasets.z20.web.core.windows.net/sarunittest/', 'TEST_USER_ID': '0003000098E85347'}
header = {'col_item': 'MovieId', 'col_rating': 'Rating', 'col_timestamp': 'Timestamp', 'col_user': 'UserId'}

    @pytest.mark.parametrize(
        "threshold,similarity_type,file",
        [
            (1, "cooccurrence", "count"),
            (1, "jaccard", "jac"),
            (1, "lift", "lift"),
            (3, "cooccurrence", "count"),
            (3, "jaccard", "jac"),
            (3, "lift", "lift"),
        ],
    )
    def test_sar_item_similarity(
        threshold, similarity_type, file, demo_usage_data, sar_settings, header
    ):
    
        model = SARSingleNode(
            similarity_type=similarity_type,
            timedecay_formula=False,
            time_decay_coefficient=30,
            threshold=threshold,
            **header
        )
    
        model.fit(demo_usage_data)
    
        true_item_similarity, row_ids, col_ids = read_matrix(
            sar_settings["FILE_DIR"] + "sim_" + file + str(threshold) + ".csv"
        )
    
        if similarity_type == "cooccurrence":
            test_item_similarity = _rearrange_to_test(
                model.item_similarity.todense(),
                row_ids,
                col_ids,
                model.item2index,
                model.item2index,
            )
            assert np.array_equal(
                true_item_similarity.astype(test_item_similarity.dtype),
                test_item_similarity,
            )
        else:
            test_item_similarity = _rearrange_to_test(
                model.item_similarity, row_ids, col_ids, model.item2index, model.item2index
            )
>           assert np.allclose(
                true_item_similarity.astype(test_item_similarity.dtype),
                test_item_similarity,
                atol=sar_settings["ATOL"],
            )
E           AssertionError: assert False
E            +  where False = <function allclose at 0x7f5b7d41d8c0>(array([[0.03225806, 0.        , 0.        , ..., 0.        , 0.00346395,\n        0.        ],\n       [0.        , 0.03...41,\n        0.        ],\n       [0.        , 0.        , 0.        , ..., 0.00057471, 0.        ,\n        0.03448276]]), array([[1.81818182e-02, 0.00000000e+00, 0.00000000e+00, ...,\n        0.00000000e+00, 1.49644594e-03, 0.00000000e+00],\n...\n       [0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ...,\n        3.69153531e-05, 0.00000000e+00, 9.70873786e-03]]), atol=1e-08)
E            +    where <function allclose at 0x7f5b7d41d8c0> = np.allclose
E            +    and   array([[0.03225806, 0.        , 0.        , ..., 0.        , 0.00346395,\n        0.        ],\n       [0.        , 0.03...41,\n        0.        ],\n       [0.        , 0.        , 0.        , ..., 0.00057471, 0.        ,\n        0.03448276]]) = <built-in method astype of numpy.ndarray object at 0x7f5b40e726f0>(dtype('float64'))
E            +      where <built-in method astype of numpy.ndarray object at 0x7f5b40e726f0> = array([['0.032258064516129', '0', '0', ..., '0', '0.00346395323663131',\n        '0'],\n       ['0', '0.0344827586206897...39597315', '0'],\n       ['0', '0', '0', ..., '0.000574712643678161', '0',\n        '0.0344827586206897']], dtype='<U20').astype
E            +      and   dtype('float64') = array([[1.81818182e-02, 0.00000000e+00, 0.00000000e+00, ...,\n        0.00000000e+00, 1.49644594e-03, 0.00000000e+00],\n...\n       [0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ...,\n        3.69153531e-05, 0.00000000e+00, 9.70873786e-03]]).dtype

tests/unit/recommenders/models/test_sar_singlenode.py:183: AssertionError
________________ test_sar_item_similarity[3-cooccurrence-count] ________________

threshold = 3, similarity_type = 'cooccurrence', file = 'count'
demo_usage_data =                  UserId    MovieId     Timestamp  Rating
0      0003000098E85347  DQF-00358  1.433879e+09     1.0
1   ...BAE  DQF-00248  1.416292e+09     1.0
11837  00030000822E3BAE  DAF-00448  1.416292e+09     1.0

[11838 rows x 4 columns]
sar_settings = {'ATOL': 1e-08, 'FILE_DIR': 'https://recodatasets.z20.web.core.windows.net/sarunittest/', 'TEST_USER_ID': '0003000098E85347'}
header = {'col_item': 'MovieId', 'col_rating': 'Rating', 'col_timestamp': 'Timestamp', 'col_user': 'UserId'}

    @pytest.mark.parametrize(
        "threshold,similarity_type,file",
        [
            (1, "cooccurrence", "count"),
            (1, "jaccard", "jac"),
            (1, "lift", "lift"),
            (3, "cooccurrence", "count"),
            (3, "jaccard", "jac"),
            (3, "lift", "lift"),
        ],
    )
    def test_sar_item_similarity(
        threshold, similarity_type, file, demo_usage_data, sar_settings, header
    ):
    
        model = SARSingleNode(
            similarity_type=similarity_type,
            timedecay_formula=False,
            time_decay_coefficient=30,
            threshold=threshold,
            **header
        )
    
        model.fit(demo_usage_data)
    
        true_item_similarity, row_ids, col_ids = read_matrix(
            sar_settings["FILE_DIR"] + "sim_" + file + str(threshold) + ".csv"
        )
    
        if similarity_type == "cooccurrence":
            test_item_similarity = _rearrange_to_test(
                model.item_similarity.todense(),
                row_ids,
                col_ids,
                model.item2index,
                model.item2index,
            )
>           assert np.array_equal(
                true_item_similarity.astype(test_item_similarity.dtype),
                test_item_similarity,
            )
E           AssertionError: assert False
E            +  where False = <function array_equal at 0x7f5b7d41dd40>(array([[ 31.,   0.,   0., ...,   0.,  16.,   0.],\n       [  0.,  29.,   5., ...,   0.,   0.,   0.],\n       [  0.,   5.... 60.,  12.,   0.],\n       [ 16.,   0.,   0., ...,  12., 149.,   0.],\n       [  0.,   0.,   0., ...,   0.,   0.,  29.]]), matrix([[ 55.,   0.,   0., ...,   0.,  20.,   0.],\n        [  0.,  35.,   6., ...,   0.,   0.,   0.],\n        [  0.,  ...3.,  15.,   0.],\n        [ 20.,   0.,   0., ...,  15., 243.,   0.],\n        [  0.,   0.,   0., ...,   0.,   0., 103.]]))
E            +    where <function array_equal at 0x7f5b7d41dd40> = np.array_equal
E            +    and   array([[ 31.,   0.,   0., ...,   0.,  16.,   0.],\n       [  0.,  29.,   5., ...,   0.,   0.,   0.],\n       [  0.,   5.... 60.,  12.,   0.],\n       [ 16.,   0.,   0., ...,  12., 149.,   0.],\n       [  0.,   0.,   0., ...,   0.,   0.,  29.]]) = <built-in method astype of numpy.ndarray object at 0x7f5b400ea270>(dtype('float64'))
E            +      where <built-in method astype of numpy.ndarray object at 0x7f5b400ea270> = array([['31', '0', '0', ..., '0', '16', '0'],\n       ['0', '29', '5', ..., '0', '0', '0'],\n       ['0', '5', '55', ...... '12', '0'],\n       ['16', '0', '0', ..., '12', '149', '0'],\n       ['0', '0', '0', ..., '0', '0', '29']], dtype='<U4').astype
E            +      and   dtype('float64') = matrix([[ 55.,   0.,   0., ...,   0.,  20.,   0.],\n        [  0.,  35.,   6., ...,   0.,   0.,   0.],\n        [  0.,  ...3.,  15.,   0.],\n        [ 20.,   0.,   0., ...,  15., 243.,   0.],\n        [  0.,   0.,   0., ...,   0.,   0., 103.]]).dtype

tests/unit/recommenders/models/test_sar_singlenode.py:175: AssertionError
___________________ test_sar_item_similarity[3-jaccard-jac] ____________________

threshold = 3, similarity_type = 'jaccard', file = 'jac'
demo_usage_data =                  UserId    MovieId     Timestamp  Rating
0      0003000098E85347  DQF-00358  1.433879e+09     1.0
1   ...BAE  DQF-00248  1.416292e+09     1.0
11837  00030000822E3BAE  DAF-00448  1.416292e+09     1.0

[11838 rows x 4 columns]
sar_settings = {'ATOL': 1e-08, 'FILE_DIR': 'https://recodatasets.z20.web.core.windows.net/sarunittest/', 'TEST_USER_ID': '0003000098E85347'}
header = {'col_item': 'MovieId', 'col_rating': 'Rating', 'col_timestamp': 'Timestamp', 'col_user': 'UserId'}

    @pytest.mark.parametrize(
        "threshold,similarity_type,file",
        [
            (1, "cooccurrence", "count"),
            (1, "jaccard", "jac"),
            (1, "lift", "lift"),
            (3, "cooccurrence", "count"),
            (3, "jaccard", "jac"),
            (3, "lift", "lift"),
        ],
    )
    def test_sar_item_similarity(
        threshold, similarity_type, file, demo_usage_data, sar_settings, header
    ):
    
        model = SARSingleNode(
            similarity_type=similarity_type,
            timedecay_formula=False,
            time_decay_coefficient=30,
            threshold=threshold,
            **header
        )
    
        model.fit(demo_usage_data)
    
        true_item_similarity, row_ids, col_ids = read_matrix(
            sar_settings["FILE_DIR"] + "sim_" + file + str(threshold) + ".csv"
        )
    
        if similarity_type == "cooccurrence":
            test_item_similarity = _rearrange_to_test(
                model.item_similarity.todense(),
                row_ids,
                col_ids,
                model.item2index,
                model.item2index,
            )
            assert np.array_equal(
                true_item_similarity.astype(test_item_similarity.dtype),
                test_item_similarity,
            )
        else:
            test_item_similarity = _rearrange_to_test(
                model.item_similarity, row_ids, col_ids, model.item2index, model.item2index
            )
>           assert np.allclose(
                true_item_similarity.astype(test_item_similarity.dtype),
                test_item_similarity,
                atol=sar_settings["ATOL"],
            )
E           AssertionError: assert False
E            +  where False = <function allclose at 0x7f5b7d41d8c0>(array([[1.        , 0.        , 0.        , ..., 0.        , 0.09756098,\n        0.        ],\n       [0.        , 1.  ...  ,\n        0.        ],\n       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,\n        1.        ]]), array([[1.        , 0.        , 0.        , ..., 0.        , 0.07194245,\n        0.        ],\n       [0.        , 1.  ...  ,\n        0.        ],\n       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,\n        1.        ]]), atol=1e-08)
E            +    where <function allclose at 0x7f5b7d41d8c0> = np.allclose
E            +    and   array([[1.        , 0.        , 0.        , ..., 0.        , 0.09756098,\n        0.        ],\n       [0.        , 1.  ...  ,\n        0.        ],\n       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,\n        1.        ]]) = <built-in method astype of numpy.ndarray object at 0x7f5b40b61f90>(dtype('float64'))
E            +      where <built-in method astype of numpy.ndarray object at 0x7f5b40b61f90> = array([['1', '0', '0', ..., '0', '0.0975609756097561', '0'],\n       ['0', '1', '0.0632911392405063', ..., '0', '0', '0...61', '0', '0', ..., '0.0609137055837563', '1',\n        '0'],\n       ['0', '0', '0', ..., '0', '0', '1']], dtype='<U19').astype
E            +      and   dtype('float64') = array([[1.        , 0.        , 0.        , ..., 0.        , 0.07194245,\n        0.        ],\n       [0.        , 1.  ...  ,\n        0.        ],\n       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,\n        1.        ]]).dtype

tests/unit/recommenders/models/test_sar_singlenode.py:183: AssertionError
____________________ test_sar_item_similarity[3-lift-lift] _____________________

threshold = 3, similarity_type = 'lift', file = 'lift'
demo_usage_data =                  UserId    MovieId     Timestamp  Rating
0      0003000098E85347  DQF-00358  1.433879e+09     1.0
1   ...BAE  DQF-00248  1.416292e+09     1.0
11837  00030000822E3BAE  DAF-00448  1.416292e+09     1.0

[11838 rows x 4 columns]
sar_settings = {'ATOL': 1e-08, 'FILE_DIR': 'https://recodatasets.z20.web.core.windows.net/sarunittest/', 'TEST_USER_ID': '0003000098E85347'}
header = {'col_item': 'MovieId', 'col_rating': 'Rating', 'col_timestamp': 'Timestamp', 'col_user': 'UserId'}

    @pytest.mark.parametrize(
        "threshold,similarity_type,file",
        [
            (1, "cooccurrence", "count"),
            (1, "jaccard", "jac"),
            (1, "lift", "lift"),
            (3, "cooccurrence", "count"),
            (3, "jaccard", "jac"),
            (3, "lift", "lift"),
        ],
    )
    def test_sar_item_similarity(
        threshold, similarity_type, file, demo_usage_data, sar_settings, header
    ):
    
        model = SARSingleNode(
            similarity_type=similarity_type,
            timedecay_formula=False,
            time_decay_coefficient=30,
            threshold=threshold,
            **header
        )
    
        model.fit(demo_usage_data)
    
        true_item_similarity, row_ids, col_ids = read_matrix(
            sar_settings["FILE_DIR"] + "sim_" + file + str(threshold) + ".csv"
        )
    
        if similarity_type == "cooccurrence":
            test_item_similarity = _rearrange_to_test(
                model.item_similarity.todense(),
                row_ids,
                col_ids,
                model.item2index,
                model.item2index,
            )
            assert np.array_equal(
                true_item_similarity.astype(test_item_similarity.dtype),
                test_item_similarity,
            )
        else:
            test_item_similarity = _rearrange_to_test(
                model.item_similarity, row_ids, col_ids, model.item2index, model.item2index
            )
>           assert np.allclose(
                true_item_similarity.astype(test_item_similarity.dtype),
                test_item_similarity,
                atol=sar_settings["ATOL"],
            )
E           AssertionError: assert False
E            +  where False = <function allclose at 0x7f5b7d41d8c0>(array([[0.03225806, 0.        , 0.        , ..., 0.        , 0.00346395,\n        0.        ],\n       [0.        , 0.03...41,\n        0.        ],\n       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,\n        0.03448276]]), array([[0.01818182, 0.        , 0.        , ..., 0.        , 0.00149645,\n        0.        ],\n       [0.        , 0.02...23,\n        0.        ],\n       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,\n        0.00970874]]), atol=1e-08)
E            +    where <function allclose at 0x7f5b7d41d8c0> = np.allclose
E            +    and   array([[0.03225806, 0.        , 0.        , ..., 0.        , 0.00346395,\n        0.        ],\n       [0.        , 0.03...41,\n        0.        ],\n       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,\n        0.03448276]]) = <built-in method astype of numpy.ndarray object at 0x7f5b3f8e9f30>(dtype('float64'))
E            +      where <built-in method astype of numpy.ndarray object at 0x7f5b3f8e9f30> = array([['0.032258064516129', '0', '0', ..., '0', '0.00346395323663131',\n        '0'],\n       ['0', '0.0344827586206897...9463',\n        '0.00671140939597315', '0'],\n       ['0', '0', '0', ..., '0', '0', '0.0344827586206897']], dtype='<U20').astype
E            +      and   dtype('float64') = array([[0.01818182, 0.        , 0.        , ..., 0.        , 0.00149645,\n        0.        ],\n       [0.        , 0.02...23,\n        0.        ],\n       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,\n        0.00970874]]).dtype

tests/unit/recommenders/models/test_sar_singlenode.py:183: AssertionError
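
For readers following the log above: the computed co-occurrence counts are larger than the reference ones (for example 55 vs. 31 on the first diagonal entry), which is what one would expect if repeated (user, item) rows are no longer dropped before counting. The snippet below is a minimal sketch of that effect, not the SAR implementation itself. The column names `user` and `item`, the toy `events` frame, and the helper functions are all hypothetical, and the Jaccard and lift formulas are the usual textbook definitions, which happen to reproduce the numbers printed in the log.

```python
import numpy as np
import pandas as pd


def cooccurrence(df, n_users, n_items):
    """Item-item co-occurrence from integer (user, item) index pairs.

    Repeated (user, item) rows are accumulated, so a duplicated event
    produces a hit count greater than 1 for that pair.
    """
    hits = np.zeros((n_users, n_items))
    np.add.at(hits, (df["user"].to_numpy(), df["item"].to_numpy()), 1.0)
    return hits.T @ hits


def jaccard(c):
    """Jaccard similarity: c_ij / (c_ii + c_jj - c_ij). Assumes no zero diagonal."""
    d = np.diag(c)
    return c / (d[:, None] + d[None, :] - c)


def lift(c):
    """Lift similarity: c_ij / (c_ii * c_jj). Assumes no zero diagonal."""
    d = np.diag(c)
    return c / (d[:, None] * d[None, :])


# Toy data (hypothetical): user 0 has item 0 recorded twice.
events = pd.DataFrame({"user": [0, 0, 0, 1], "item": [0, 0, 1, 0]})
with_dups = cooccurrence(events, n_users=2, n_items=2)
deduped = cooccurrence(events.drop_duplicates(["user", "item"]), n_users=2, n_items=2)

# Two entries taken from the log: reference values vs. the values the test computed.
expected = np.array([[31.0, 16.0], [16.0, 149.0]])
observed = np.array([[55.0, 20.0], [20.0, 243.0]])

print(with_dups)                 # counts inflated by the duplicated row
print(deduped)                   # counts after drop_duplicates
print(jaccard(expected)[0, 1])   # 16 / 164, the reference 0.09756098
print(jaccard(observed)[0, 1])   # 20 / 278, the computed 0.07194245
print(lift(expected)[0, 0])      # 1 / 31, the reference 0.03225806
print(lift(observed)[0, 0])      # 1 / 55, the computed 0.01818182
```

Because Jaccard and lift are both derived from the co-occurrence counts, a change in how duplicates are handled would shift all three similarity types at once, which is consistent with all six parametrized cases failing together.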

@miguelgfierro
Collaborator Author

I think @simonzhaoms was looking into removing drop_duplicates. Should I close this PR?

@simonzhaoms
Collaborator

simonzhaoms commented Feb 23, 2022

@miguelgfierro I am working on this PR and will close it after I fix the test failures.

Collaborator Author

@miguelgfierro miguelgfierro left a comment

LGTM

@miguelgfierro
Collaborator Author

@simonzhaoms is this PR ready to be merged?

@simonzhaoms
Collaborator

Yes. @miguelgfierro

@simonzhaoms simonzhaoms merged commit 96b5053 into staging Feb 28, 2022
@miguelgfierro miguelgfierro deleted the miguelgfierro-patch-2 branch February 28, 2022 14:34