FIX-#5829: fix ndarray assignment via loc #5847

anmyachev · 2023-03-23T17:19:41Z

What do these changes do?

first commit message and PR title follow format outlined here

NOTE: If you edit the PR title to match this format, you need to add another commit (even if it's empty) or amend your last commit for the CI job that checks the PR title to pick up the new PR title.
passes flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
passes black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
signed commit with git commit -s
Resolves BUG: Syntax Error in modin > pandas > indexing.py:829 #5829
tests added and passing
module layout described at docs/development/architecture.rst is up-to-date

anmyachev · 2023-03-24T12:22:00Z

modin/pandas/indexing.py

@@ -821,22 +821,23 @@ def _setitem_with_new_columns(self, row_loc, col_loc, item):
                        "Must have equal len keys and value when setting with an iterable"
                    )
            else:
-                if item.shape != (len(self.qc.index, len(col_loc))):
+                if item.shape != (len(row_loc), len(col_loc)):


I believe we need to compare the shape of the object being inserted with the lengths of the row and column keys.
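A minimal sketch of the corrected check, for illustration only. The helper name `check_item_shape` and its error message are hypothetical and not part of Modin. It also shows that the removed line, which passed two arguments to `len()`, fails at runtime with a `TypeError`.

```python
import numpy as np


def check_item_shape(item, row_loc, col_loc):
    """Hypothetical helper: compare a 2D array's shape with the key lengths."""
    expected = (len(row_loc), len(col_loc))
    if item.shape != expected:
        # Illustrative message, not the one Modin raises.
        raise ValueError(f"expected shape {expected}, got {item.shape}")


# The removed line had the form len(x, len(y)); len() accepts one argument only.
try:
    len([0, 1, 2], 2)
    old_check_error = None
except TypeError as exc:
    old_check_error = exc

item = np.arange(6).reshape(3, 2)
check_item_shape(item, row_loc=[0, 1, 2], col_loc=["a", "b"])  # shapes match

try:
    check_item_shape(item, row_loc=[0, 1], col_loc=["a", "b"])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
```

Comparing against `len(row_loc)` instead of the full index length means a partial row selection is validated against the rows actually being assigned.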

anmyachev · 2023-03-24T12:22:57Z

modin/pandas/indexing.py

        if not all(common_label_loc):
            # In this case we have some new cols and some old ones
            columns = self.qc.columns
            for i in range(len(common_label_loc)):
                if not common_label_loc[i]:
                    columns = columns.insert(len(columns), col_loc[i])
-            self.qc = self.qc.reindex(labels=columns, axis=1, fill_value=0)
+            self.qc = self.qc.reindex(labels=columns, axis=1, fill_value=np.NaN)


Apparently the behavior in pandas has been changed.
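The effect of the `fill_value` change can be sketched with plain pandas instead of the Modin query compiler. The frame, the labels, and the use of `np.isin` to build `common_label_loc` are assumptions made for this example. The sketch uses `np.nan` because the `np.NaN` alias was removed in NumPy 2.0.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
col_loc = ["a", "c"]  # "a" already exists, "c" is new

# Assumption: a boolean mask marking which requested labels already exist.
common_label_loc = np.isin(col_loc, df.columns)

# Append each new label at the end, as the loop in the diff does.
columns = df.columns
for i in range(len(common_label_loc)):
    if not common_label_loc[i]:
        columns = columns.insert(len(columns), col_loc[i])

# New columns are filled with NaN (new behavior) or 0 (old behavior).
filled_nan = df.reindex(labels=columns, axis=1, fill_value=np.nan)
filled_zero = df.reindex(labels=columns, axis=1, fill_value=0)
```

With `fill_value=np.nan`, column `"c"` starts out as missing values until the assignment overwrites it. With `fill_value=0` it would start out as zeros.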

anmyachev · 2023-03-24T12:27:18Z

modin/pandas/indexing.py

-                if len(item.shape) > 1
-                else item[common_label_loc]
-            )
+            if not isinstance(item, np.ndarray):


At least when the keys are strings, we get an empty array here, which is incorrect.

Could you elaborate please? I still don't understand why the check for numpy array is needed

The previous comment is not entirely correct.

This part of the code was responsible for dropping those columns of the assigned value that were not present in the query compiler. However, that behavior is incorrect for a loc operation, since the missing columns are added a few lines later via the reindex op.

Also, since only a numpy array can reach this code branch, I deleted this code completely.
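The pandas behavior described above can be sketched as follows, assuming a recent pandas version. The frame and values are made up for illustration. Assigning a 2D numpy array through `loc` with a mix of existing and new column labels keeps every column of the assigned value and adds the new label to the frame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# "a" exists and "c" does not; both columns of the array are assigned.
df.loc[:, ["a", "c"]] = np.array([[10, 20], [30, 40]])

result_columns = list(df.columns)
a_values = df["a"].tolist()
b_values = df["b"].tolist()
c_values = df["c"].tolist()
```

Filtering the assigned array down to the already existing labels before this step would discard the values meant for `"c"`.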

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

anmyachev · 2023-03-30T17:40:29Z

@dchigarev anything else?

anmyachev marked this pull request as ready for review March 23, 2023 18:36

anmyachev requested a review from a team as a code owner March 23, 2023 18:36

anmyachev commented Mar 24, 2023


anmyachev added the Ready for review label Mar 24, 2023

anmyachev added 2 commits March 27, 2023 19:17

FIX-modin-project#5829: fix ndarray assignment via loc

6e9a16c

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

update

77cee29

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

anmyachev force-pushed the issue5829 branch from fbf3d52 to 77cee29 Compare March 27, 2023 17:17

fixes

c97e528

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

dchigarev approved these changes Apr 3, 2023


dchigarev merged commit 18d4738 into modin-project:master Apr 3, 2023

anmyachev deleted the issue5829 branch April 3, 2023 21:01

