Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#32217 Change the file locking on large uploads #32226

Merged
merged 1 commit into from
Aug 14, 2018

Conversation

micbar
Copy link
Contributor

@micbar micbar commented Aug 2, 2018

Description

When uploading large files >10GB, the assembly process on the server runs very long (>5minutes). This causes a timeout on the client side (Version 2.4.2). The setup is a clustered web server behind a F5 load balancer.
New move Requests by the client, let the running assembly processes fail.

Related Issue

Motivation and Context

Created PR to test if we can make the file locking process more robust

How Has This Been Tested?

  • test environment: Local docker images with Desktop client 2.4.2
  • test case 1: Upload 10GB file, while slowing down the assembly process (sleep()) to produce timout at the client side

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Changed Unit Tests

Checklist:

  • Code changes

Open tasks:

  • Backport (if applicable set "backport-request" label and remove when the backport was done)

@codecov
Copy link

codecov bot commented Aug 2, 2018

Codecov Report

Merging #32226 into master will decrease coverage by <.01%.
The diff coverage is 75%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #32226      +/-   ##
============================================
- Coverage     64.01%      64%   -0.01%     
- Complexity    18560    18563       +3     
============================================
  Files          1171     1171              
  Lines         69837    69845       +8     
  Branches       1267     1267              
============================================
+ Hits          44706    44707       +1     
- Misses        24761    24768       +7     
  Partials        370      370
Flag Coverage Δ Complexity Δ
#javascript 52.84% <ø> (ø) 0 <ø> (ø) ⬇️
#phpunit 65.28% <75%> (-0.01%) 18563 <0> (+3)
Impacted Files Coverage Δ Complexity Δ
apps/dav/lib/Connector/Sabre/File.php 82.52% <75%> (-1.87%) 115 <0> (+3)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 23c9335...8ee1d3f. Read the comment docs.

Copy link
Member

@DeepDiver1975 DeepDiver1975 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix unit tests - rest 👍

@@ -188,9 +188,25 @@ public function put($data) {
// because we have no clue about the cause we can only throw back a 500/Internal Server Error
throw new Exception('Could not write file contents');
}

try {
$this->changeLock(ILockingProvider::LOCK_EXCLUSIVE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please move this before fopen to avoid having two concurrent processes opening the file at the same time

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you sure? If i do this, tho whole operations fails.
Is it possible to lock a file before opening?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how exactly does it fail ? it shouldn't fail as the storage fopen shouldn't lock anything. Maybe there is something else that is suspicious there...

in general if we leave it like you did then a concurrent process would be able to lock the target file, but would fail after fopen and before starting stream copy. so the error handler would need to close the file again, which I think will happen anyway.

would still be good to investigate why moving the lock before makes the operation fail, in case there are some ghosts hidden somewhere...

@micbar micbar force-pushed the file-locking-large-uploads branch 2 times, most recently from 42dbf9b to c9d30d6 Compare August 13, 2018 15:24
@micbar
Copy link
Contributor Author

micbar commented Aug 13, 2018

@DeepDiver1975 @PVince81
I had to adjust the testing of the put method.

As seen in

$node->acquireLock(ILockingProvider::LOCK_SHARED);

the put() is executed with a shared lock.

Please review and ping back, i get the impression, that this is a very sensitive part :-)

Changed locking to avoid exclusive lock during hooks
@DeepDiver1975 DeepDiver1975 added this to the development milestone Aug 14, 2018
@DeepDiver1975 DeepDiver1975 merged commit f68df37 into master Aug 14, 2018
@DeepDiver1975 DeepDiver1975 deleted the file-locking-large-uploads branch August 14, 2018 13:39
@phil-davis
Copy link
Contributor

Backport stable10 #32334

@PVince81
Copy link
Contributor

PVince81 commented Sep 10, 2018

@micbar do you remember where exactly you put the sleep statements ? this is for expanding the reproduction steps for manual QA to verify

$target = $partStorage->fopen($internalPartPath, 'wb');
if (!\is_resource($target)) {
\OCP\Util::writeLog('webdav', '\OC\Files\Filesystem::fopen() failed', \OCP\Util::ERROR);
// because we have no clue about the cause we can only throw back a 500/Internal Server Error
throw new Exception('Could not write file contents');
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a sleep() statement after setting the exclusive lock and before the final move operation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@micbar which value did you use ? how high to set it to have the desktop client time out on its own ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

`sleep(300)' in both locations

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The desktop client resends the MOVE request after 300sec.

@lock lock bot locked as resolved and limited conversation to collaborators Sep 17, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change the file locking for chunked uploads
4 participants