-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix same file upload #4746
Fix same file upload #4746
Conversation
740820c
to
67b3f12
Compare
} | ||
|
||
// a revision with this mtime does already exist. | ||
// If the blobs are the same we can just delete the old one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disclaimer: I am not very familiar with the upload code, so I might be missing something obvious.
Why can we be sure here that this is really the same blob, when we only look at the mtimes and blobsizes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. I asked myself the same. But what else CAN we check? The etag
is just a hash over id and mtime, which are both the same here.
At the moment we are just silently discarding the old blob without any further checks. So this is at least a improvement. However if you have any good idea how we can savely say that this is the same blob I am very happy to check it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already generate checksums for the content of the upload at
reva/pkg/storage/utils/decomposedfs/upload/upload.go
Lines 118 to 145 in 67b3f12
if session.info.MetaData["checksum"] != "" { | |
var err error | |
parts := strings.SplitN(session.info.MetaData["checksum"], " ", 2) | |
if len(parts) != 2 { | |
return errtypes.BadRequest("invalid checksum format. must be '[algorithm] [checksum]'") | |
} | |
switch parts[0] { | |
case "sha1": | |
err = checkHash(parts[1], sha1h) | |
case "md5": | |
err = checkHash(parts[1], md5h) | |
case "adler32": | |
err = checkHash(parts[1], adler32h) | |
default: | |
err = errtypes.BadRequest("unsupported checksum algorithm: " + parts[0]) | |
} | |
if err != nil { | |
session.store.Cleanup(ctx, session, true, false, false) | |
return err | |
} | |
} | |
// update checksums | |
attrs := node.Attributes{ | |
prefixes.ChecksumPrefix + "sha1": sha1h.Sum(nil), | |
prefixes.ChecksumPrefix + "md5": md5h.Sum(nil), | |
prefixes.ChecksumPrefix + "adler32": adler32h.Sum(nil), | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we use the checksums? For the upload I think the caller of updateExistingNode
(CreateNodeForUpdate
) already has the checksums. For the old I guess we can get them as well somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Good idea 👍 I managed to get the checksums from the xattrs. Also got rid of the blobsize check as it isn't needed any more.
BUT: At the moment I am only checking md5
sum. Should I check another one? Should I check all three?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you only want to check one I think I'd choose sha1
. 🤷♂️
I guess at some point we need to switch to some less out-dated hashing algorithm, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can check two or all three if that makes sense. Just not sure if it makes sense 😃
I'll switch to sha1
meanwhile
67b3f12
to
76b5cbd
Compare
76b5cbd
to
ab0531f
Compare
Signed-off-by: jkoberg <jkoberg@owncloud.com>
Signed-off-by: jkoberg <jkoberg@owncloud.com>
7af0f3f
to
61f8b5a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
Fixes a bug where uploading the same file multiple times leads to orphaned blobs in the blobstore.
Reva part of owncloud/ocis#9498