MDEV-34466 XA prepare don't release unmodified records for some cases #3369
base: 10.6
Conversation
vlad-lesin
commented
Jun 27, 2024
Force-pushed c681c21 → ff16839
The changes look OK; I only have some minor nitpicks. I would suggest writing
commit 17e59ed3aad481918f1a01d8afbc071c316d5930 (MDEV-33454)
in the commit message, so that it is a little more convenient to find the change that it is referring to. In the GitHub web user interface, the 17e59ed should be displayed as a hyperlink.
Once you have addressed these, @mariadb-DebarunBanerjee should review this.
Force-pushed 1f515af → 2384ea2
Hi Vladislav,
Thanks for the patch. I have reviewed the code and included my comments. The design-level comments are from my general DB/MySQL understanding, and I may not be completely correct with MariaDB; I still think it is worth pointing them out for your consideration.
As this code enables the major part of the base MDEV work, we should consider testing the feature well for impacts.
It could have a significant performance impact on the primary while checking secondary index records for implicit locks, especially with concurrent load modifying index records.
Since this early release of locks hardly helps the primary (rather, it impacts performance/concurrency), we should evaluate how much it indeed helps the replica, with scenarios where it is effective.
For row-level replication, the replica should be updating the directly modified rows, and the locks acquired during range scanning for update on the primary should not be there while replaying. So I think there are only very specific cases, like unique index insert S locks, where releasing the S lock early is really useful.
For statement-level replication, I think releasing S locks early has a functional issue related to transaction ordering. Statement-level replication essentially depends on serializability (note that REPEATABLE READ is serializable when only DMLs are considered), and releasing S locks at XA PREPARE impacts the ordering. I would try to repeat it with mtr and update.
bool rec_granted_exclusive_not_gap=
  lock->is_rec_granted_exclusive_not_gap();
if (!supremum_bit && rec_granted_exclusive_not_gap)
  continue;
if (UNIV_UNLIKELY(lock->type_mode & (LOCK_PREDICATE | LOCK_PRDT_PAGE)))
  continue; /* SPATIAL INDEX locking is broken. */
This seems to be the crux of the patch. To make sure we are on the same plane, let me first state the logic, as it stands today, of releasing locks at XA prepare:
- Release all record locks in the bitmap only if the lock is GAP-only (X + LOCK_GAP). [!rec_granted_exclusive_not_gap]
- Otherwise release only the supremum record lock, if the supremum is part of the bitmap.
This is the condition in lock_release_on_prepare_try(), which can actually release only very few locks, excluding all the S locks. The function lock_rec_unlock_unmodified() is completely unreachable from lock_release_on_prepare_try().
lock_rec_unlock_unmodified() is reachable from lock_release_on_prepare(), which calls it when [1] and [2] are not satisfied. Unfortunately, lock_release_on_prepare_try() would also set all_released = true after releasing only a minimal number of locks, unless it fails to acquire the lock_sys latch.
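The two-bullet release rule above can be sketched as a small predicate. This is a toy model with assumed names and semantics, not real InnoDB types or code:

```cpp
#include <cassert>

// Toy model of the pre-patch release rule summarized in the two bullets
// above (assumed semantics, not real InnoDB code).
struct LockModel
{
  bool gap_only;          // the lock is GAP-only (e.g. X + LOCK_GAP)
  bool supremum_bit;      // this bitmap bit refers to the page supremum
};

// True when XA PREPARE would release this bitmap bit under the old rule.
bool released_before_patch(const LockModel &l)
{
  if (l.gap_only)
    return true;          // gap-only lock: the whole bitmap is released
  return l.supremum_bit;  // otherwise only the supremum bit is released
}
```

Everything else, including S locks on actual records, survives PREPARE under this rule, which is why so few locks are released by lock_release_on_prepare_try().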
I have two points to include here.
- With this context, the patch makes the lock_release_on_prepare_try() logic similar to lock_release_on_prepare(), and should be correct assuming the base design is OK.
- Assuming we do intend to release all locks for records we haven't actually modified, even after the patch we would miss the set of records for which the lock also includes the supremum record.
I will add a separate point on the base design, which we need to evaluate too, as the core part was largely disabled before this patch.
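The post-patch intent can be sketched the same way: in addition to the old gap-only/supremum rule, a bit is released when the transaction did not actually modify the record, which is the check lock_rec_unlock_unmodified() performs. Again a toy model with assumed semantics, not InnoDB code:

```cpp
#include <cassert>

// Toy model (assumed semantics, not InnoDB code) of the post-patch
// release rule at XA PREPARE: the old cases still apply, and a bit is
// additionally released when this transaction did not modify the record.
bool release_bit_after_patch(bool gap_only, bool supremum_bit,
                             bool modified_by_this_trx)
{
  if (gap_only || supremum_bit)
    return true;                 // released exactly as before the patch
  return !modified_by_this_trx;  // new: unmodified records are unlocked
}
```

In this model an S lock on a record the transaction only read is now released at PREPARE, while X locks on modified records are kept until COMMIT.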
Force-pushed c9f644a → 36d174e
Force-pushed 5565788 → 5232882
My comments are about the MDEV-34690 patch.
I see that there is some rather similar code between lock_release_on_prepare_try() and lock_release_on_prepare() for releasing the record locks, but there seem to be enough differences that the similar code can’t be refactored into a separate function.
Force-pushed 03e31cf → 4b5c176 → 428144f → 88abf45
@@ -535,6 +535,18 @@ NUMERIC_BLOCK_SIZE NULL
ENUM_VALUE_LIST OFF,ON
READ_ONLY YES
COMMAND_LINE_ARGUMENT NONE
VARIABLE_NAME INNODB_ENABLE_XAP_UNLOCK_UNMODIFIED_FOR_PRIMARY_DEBUG
It could be a better idea to remove have_debug.inc from the test file and filter out all debug-only parameters from the output. In that way, this test could be run on all targets.
Why do we need to run the test on all targets? It just checks the presence of system variables, and have_debug.inc gives more coverage for that. We don't have release-only variables, so we don't need a separate run for release builds.
Force-pushed 88abf45 → 8d3d85e
There is no need to exclude exclusive non-gap locks from the procedure of releasing locks on XA PREPARE execution in lock_release_on_prepare_try() after commit 17e59ed (MDEV-33454), because lock_rec_unlock_unmodified() should check whether the record was modified by the XA transaction, and release the lock if it was not.

lock_release_on_prepare_try(): don't skip X-locks; let lock_rec_unlock_unmodified() process them.

lock_sec_rec_some_has_impl(): add a template parameter for not acquiring trx_t::mutex when the caller already holds the mutex; don't crash if the lock's bitmap is clean.

row_vers_impl_x_locked(), row_vers_impl_x_locked_low(): add a new argument to skip acquiring trx_t::mutex.

rw_trx_hash_t::validate_element(): don't acquire trx_t::mutex if the current thread already holds it.

Thanks to Andrei Elkin for finding the bug.

Reviewed by Marko Mäkelä, Debarun Banerjee.
lock_rec_unlock_unmodified() is executed either under lock_sys.wr_lock() or under a combination of lock_sys.rd_lock() and a record locks hash table cell latch. It also requests a page latch to check whether locked records were changed by the current transaction or not. Usually InnoDB requests a page latch to find a certain record on the page, and then requests lock_sys and/or a record lock hash cell latch to request a record lock. lock_rec_unlock_unmodified() requests the latches in the opposite order, which causes deadlocks. One of the possible scenarios for the deadlock is the following:

thread 1: lock_rec_unlock_unmodified() is invoked under a locks hash table cell latch; the latch is acquired;
thread 2: a purge thread acquires a page latch and tries to remove a delete-marked record; it invokes lock_update_delete(), which requests the locks hash table cell latch held by thread 1;
thread 1: requests the page latch held by thread 2.

To fix it, we need to release lock_sys.latch and/or the lock hash cell latch, acquire the page latch, and re-acquire the lock_sys-related latches.

When lock_sys.latch and/or the lock hash cell latch are released in lock_release_on_prepare() and lock_release_on_prepare_try(), the page on which the current lock is held can be merged. In this case the bitmap of the current lock must be cleared, and the new lock must be added to the end of the trx->lock.trx_locks list, or the bitmap of an already existing lock must be changed.

The new field trx_lock_t::set_nth_bit_calls indicates whether new locks (bits in existing lock bitmaps or new lock objects) were created during the period when lock_sys was released in the trx->lock.trx_locks list iteration loop in lock_release_on_prepare() or lock_release_on_prepare_try(). If so, we traverse the list again.

The block can be freed during page merging, which causes an assertion failure in buf_page_get_gen(), as btr_block_get() passes BUF_GET as the page get mode to it. That's why the page_get_mode parameter was added to btr_block_get(), to pass BUF_GET_POSSIBLY_FREED from lock_release_on_prepare() and lock_release_on_prepare_try() to buf_page_get_gen().

As searching for the id of the trx which modified a secondary index record is quite an expensive operation, its usage is restricted to the master. A system variable was added to remove the restriction, to simplify testing. The variable exists only for debug builds, or for builds with the -DINNODB_ENABLE_XAP_UNLOCK_UNMODIFIED_FOR_PRIMARY option, to increase the probability of catching bugs with RQG on release builds.

Reviewed by Marko Mäkelä.
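The release-and-reacquire pattern described above can be sketched with plain mutexes. This is a toy illustration with assumed names (real InnoDB latches are rw-latches and hash cell latches, not std::mutex), not the actual implementation:

```cpp
#include <cassert>
#include <mutex>

// Toy sketch of the latch-reordering fix: instead of requesting the
// page latch while holding lock_sys-related latches (the inverted
// order that can deadlock with e.g. the purge thread), drop the
// lock_sys latch, take the page latch, re-acquire the lock_sys latch,
// and revalidate, because locks may have changed in between.
std::mutex lock_sys_latch;  // stands in for lock_sys.latch
std::mutex page_latch;      // stands in for the buffer page latch
int revalidations= 0;       // counts re-latch/revalidate rounds

bool check_record_unmodified()
{
  std::unique_lock<std::mutex> sys(lock_sys_latch);
  // The wrong order would be: lock page_latch here while `sys` is held.
  sys.unlock();                                  // 1. drop lock_sys latch
  std::lock_guard<std::mutex> page(page_latch);  // 2. take page latch
  sys.lock();                                    // 3. re-take lock_sys
  ++revalidations;           // 4. state may have changed: revalidate here
  return true;               // pretend the record was found unmodified
}
```

The revalidation step is why a counter such as trx_lock_t::set_nth_bit_calls is needed: after the window in which no lock_sys latch was held, the lock list being iterated may have grown or changed, so the iteration must be restarted.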
Force-pushed 8d3d85e → 737dd09