Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MDEV-34466 XA prepare don't release unmodified records for some cases #3369

Open
wants to merge 2 commits into
base: 10.6
Choose a base branch
from

Conversation

vlad-lesin
Copy link
Contributor

@vlad-lesin vlad-lesin commented Jun 27, 2024

There is no need to exclude exclusive non-gap locks from the procedure
of locks releasing on XA PREPARE execution in
lock_release_on_prepare_try() after commit
17e59ed3aad481918f1a01d8afbc071c316d5930 (MDEV-33454), because
lock_rec_unlock_unmodified() should check if the record was modified
with the XA, and release the lock if it was not.

lock_release_on_prepare_try(): don't skip X-locks, let
lock_rec_unlock_unmodified() to process them.

lock_sec_rec_some_has_impl(): add template parameter for not acquiring
trx_t::mutex for the case if a caller already holds the mutex, don't
crash if lock's bitmap is clean.

row_vers_impl_x_locked(), row_vers_impl_x_locked_low(): add new argument
to skip trx_t::mutex acquiring.

rw_trx_hash_t::validate_element(): don't acquire trx_t::mutex if the
current thread already holds it.

@vlad-lesin vlad-lesin requested a review from dr-m June 27, 2024 15:26
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@vlad-lesin vlad-lesin force-pushed the 10.6-MDEV-34466 branch 2 times, most recently from c681c21 to ff16839 Compare June 28, 2024 13:09
Copy link
Contributor

@dr-m dr-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look OK, I only have some minor nitpicks. I would suggest to write
commit 17e59ed3aad481918f1a01d8afbc071c316d5930 (MDEV-33454)
in the commit message so that it is a little more convenient to find the change that it is referring to. In the GitHub web user interface the 17e59ed should be displayed as a hyperlink.

Once you have addressed these, @mariadb-DebarunBanerjee should review this.

storage/innobase/include/trx0sys.h Outdated Show resolved Hide resolved
storage/innobase/include/row0vers.h Outdated Show resolved Hide resolved
storage/innobase/lock/lock0lock.cc Outdated Show resolved Hide resolved
storage/innobase/lock/lock0lock.cc Outdated Show resolved Hide resolved
storage/innobase/row/row0vers.cc Outdated Show resolved Hide resolved
storage/innobase/row/row0vers.cc Outdated Show resolved Hide resolved
storage/innobase/lock/lock0lock.cc Outdated Show resolved Hide resolved
storage/innobase/lock/lock0lock.cc Outdated Show resolved Hide resolved
@vlad-lesin vlad-lesin force-pushed the 10.6-MDEV-34466 branch 3 times, most recently from 1f515af to 2384ea2 Compare July 22, 2024 11:54
Copy link
Contributor

@mariadb-DebarunBanerjee mariadb-DebarunBanerjee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Vladislav,
Thanks for the patch. I have reviewed the code and included my comments. The design level comments are from my general DB/MySQL understanding and I may not be completely correct with MariaDB. I still think it is worth to point them out for your consideration.

As this code is enabling the major part of the base MDEV work, we should consider testing the feature well for impacts.

It could have significant performance impact in Primary while checking for 2ndary index implicit lock especially with concurrent load modifying index records.

Since this early release of locks hardly helps the primary (rather impacts the performance/concurrency), we should evaluate how much it indeed helps the replica with scenarios where it is effective.

For row level replication, the replica should be updating the rows that are modified directly and the locks acquired during range scanning for update in Primary should not be there while replaying. So, I think there are only very specific cases, like unique index insert S locks where the optimization for releasing S lock early is really useful.

For statement level replication, I think releasing S locks early has a functional issue related to transaction ordering. Statement level replication is essentially dependent on Serializability (Note that RR is serializable when only DMLs are considered) and releasing S lock at XA PREPARE impacts the ordering.I would try to repeat it with mtr and update.

storage/innobase/lock/lock0lock.cc Outdated Show resolved Hide resolved
storage/innobase/row/row0sel.cc Outdated Show resolved Hide resolved
storage/innobase/row/row0vers.cc Outdated Show resolved Hide resolved
Comment on lines 4586 to 4611
bool rec_granted_exclusive_not_gap=
lock->is_rec_granted_exclusive_not_gap();
if (!supremum_bit && rec_granted_exclusive_not_gap)
continue;
if (UNIV_UNLIKELY(lock->type_mode & (LOCK_PREDICATE | LOCK_PRDT_PAGE)))
continue; /* SPATIAL INDEX locking is broken. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the crux of the patch. To make sure we are on same plane, let me first state the logic, as it stands today, of,releasing locks at XA prepare.

  1. Release all record locks in bitmap only if it is GAP only (X + LOCK_GAP). [!rec_granted_exclusive_not_gap]
  2. Otherwise release only the supremum record lock if supremum is part of bitmap.

This is the condition in lock_release_on_prepare_try, which actually can release only very few locks excluding all the S locks. The function lock_rec_unlock_unmodified() is completely unreachable from lock_release_on_prepare_try.

The lock_rec_unlock_unmodified is reachable from "lock_release_on_prepare()" where we do call it when [1] & [2] are not satisfied Unfortunately, lock_release_on_prepare_try() would also set all_released =T after releasing only minimal number of locks unless it fails to acquire the lock sys latch.

I have two points to include here.

  1. With this context, the patch makes lock_release_on_prepare_try() logic similar to lock_release_on_prepare() and should be correct assuming the base design is ok.

  2. Assuming we do intend to release all locks for which we haven't actually modified the record, even after the patch, we would miss the set of records for which the the lock also includes supremum record.

I will add a separate point on the base design which we need to evaluate too as the core part was vastly disabled before this patch.

Copy link
Contributor

@dr-m dr-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments are about the MDEV-34690 patch.

I see that there is some rather similar code between lock_release_on_prepare_try() and lock_release_on_prepare() for releasing the record locks, but there seem to be enough differences so that the similar code can’t be refactored to a separate function.

storage/innobase/include/btr0btr.h Outdated Show resolved Hide resolved
storage/innobase/lock/lock0lock.cc Outdated Show resolved Hide resolved
storage/innobase/include/trx0trx.h Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
mysql-test/suite/innodb/t/xa_unlock_unmodified.test Outdated Show resolved Hide resolved
@@ -535,6 +535,18 @@ NUMERIC_BLOCK_SIZE NULL
ENUM_VALUE_LIST OFF,ON
READ_ONLY YES
COMMAND_LINE_ARGUMENT NONE
VARIABLE_NAME INNODB_ENABLE_XAP_UNLOCK_UNMODIFIED_FOR_PRIMARY_DEBUG
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be a better idea to remove have_debug.inc from the test file and filter out all debug-only parameters from the output. In that way, this test could be run on all targets.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to run the test for all targets? It just checks the presence of system variables, have_debug.inc gives more coverage for that. We don't have release-only variables, so we don't need a separate run for release build.

storage/innobase/lock/lock0lock.cc Outdated Show resolved Hide resolved
storage/innobase/btr/btr0btr.cc Outdated Show resolved Hide resolved
storage/innobase/include/lock0priv.h Outdated Show resolved Hide resolved
storage/innobase/include/srv0srv.h Outdated Show resolved Hide resolved
storage/innobase/lock/lock0lock.cc Outdated Show resolved Hide resolved
storage/innobase/lock/lock0lock.cc Outdated Show resolved Hide resolved
storage/innobase/lock/lock0lock.cc Outdated Show resolved Hide resolved
There is no need to exclude exclusive non-gap locks from the procedure
of locks releasing on XA PREPARE execution in
lock_release_on_prepare_try() after commit
17e59ed (MDEV-33454), because
lock_rec_unlock_unmodified() should check if the record was modified
with the XA, and release the lock if it was not.

lock_release_on_prepare_try(): don't skip X-locks, let
lock_rec_unlock_unmodified() to process them.

lock_sec_rec_some_has_impl(): add template parameter for not acquiring
trx_t::mutex for the case if a caller already holds the mutex, don't
crash if lock's bitmap is clean.

row_vers_impl_x_locked(), row_vers_impl_x_locked_low(): add new argument
to skip trx_t::mutex acquiring.

rw_trx_hash_t::validate_element(): don't acquire trx_t::mutex if the
current thread already holds it.

Thanks to Andrei Elkin for finding the bug.
Reviewed by Marko Mäkelä, Debarun Banerjee.
lock_rec_unlock_unmodified() is executed either under lock_sys.wr_lock()
or under a combination of lock_sys.rd_lock() + record locks hash table
cell latch. It also requests page latch to check if locked records were
changed by the current transaction or not.

Usually InnoDB requests page latch to find the certain record on the
page, and then requests lock_sys and/or record lock hash cell latch to
request record lock. lock_rec_unlock_unmodified() requests the latches
in the opposite order, what causes deadlocks. One of the possible
scenario for the deadlock is the following:

thread 1 - lock_rec_unlock_unmodified() is invoked under locks hash table
           cell latch, the latch is acquired;
thread 2 - purge thread acquires page latch and tries to remove
           delete-marked record, it invokes lock_update_delete(), which
           requests locks hash table cell latch, held by thread 1;
thread 1 - requests page latch, held by thread 2.

To fix it we need to release lock_sys.latch and/or lock hash cell latch,
acquire page latch and re-acquire lock_sys related latches.

When lock_sys.latch and/or lock hash cell latch are released in
lock_release_on_prepare() and lock_release_on_prepare_try(), the page on
which the current lock is held, can be merged. In this case the bitmap
of the current lock must be cleared, and the new lock must be added to
the end of trx->lock.trx_locks list, or bitmap of already existing lock
must be changed.

The new field trx_lock_t::set_nth_bit_calls indicates if new locks
(bits in existing lock bitmaps or new lock objects) were created during
the period when lock_sys was released in trx->lock.trx_locks list
iteration loop in lock_release_on_prepare() or
lock_release_on_prepare_try(). And, if so, we traverse the list again.

The block can be freed during pages merging, what causes assertion
failure in buf_page_get_gen(), as btr_block_get() passes BUF_GET as page
get mode to it. That's why page_get_mode parameter was added to
btr_block_get() to pass BUF_GET_POSSIBLY_FREED from
lock_release_on_prepare() and lock_release_on_prepare_try() to
buf_page_get_gen().

As searching for id of trx, which modified secondary index record, is
quite expensive operation, restrict its usage for master. System variable
was added to remove the restriction for testing simplifying. The
variable exists only either for debug build or for build with
-DINNODB_ENABLE_XAP_UNLOCK_UNMODIFIED_FOR_PRIMARY option to increase the
probability of catching bugs for release build with RQG.

Reviewed by Marko Mäkelä.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants