-
-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Typed blobs #729
Typed blobs #729
Conversation
Hum, not sure i understand some of the changes on first reading -- will see to look again more thoughtfully. One thing i am concerned about is that if we raise the level of Rust-understanding required to read and make changes that we reduce the circle of people actually doing it. The other thing is that i prefer to touch/rustify areas where we have troubles/issues or plans. Blobfiles have not been reported or discussed as problematic since the last refactoring. And at my first (admittedly very quick) glance it didn't feel like this PR simplifies code. |
src/context.rs
Outdated
self.inner.kind | ||
} | ||
|
||
fn new_create_failure( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from_failure
would be a better name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same for the others down below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced here, these functions create instances with specific error kinds set, in this case BlobErrorKind::CreateFailure
. Using from_failure()
suggests that the struct is made based on data from another struct, which it is not, everything is passed in as args, i don't think the cause
counts truly as something that warrants a from_*()
e2b2c5a
to
f82807f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review @dignifiedquire ! I addressed the comments, not rebased onto master yet because master seems currently broken for me. but once that works rebasing shouldn't be a problem.
src/context.rs
Outdated
self.inner.kind | ||
} | ||
|
||
fn new_create_failure( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced here, these functions create instances with specific error kinds set, in this case BlobErrorKind::CreateFailure
. Using from_failure()
suggests that the struct is made based on data from another struct, which it is not, everything is passed in as args, i don't think the cause
counts truly as something that warrants a from_*()
@hpk42 I don't think this uses any new rust construct that aren't used already in the codebase. More generally I'm also not convinced it's worth to avoid idiomatic code to make it simpler, but than what is idiomatic is probably a tenuous debate anyway :) wrt to whether this should be refactored and whether it should be done now... who knows. I did this because I never knew how to handle blobs and never knew what expectations I could have from them. It doesn't (yet) make the code shorter, but it does make it stricter. And all the tests pass ;) |
FWIW i like this now better :) General sidenote: why do we even store $BLOBDIR at all into the DB? All file paths are relative to BLOBDIR so it's totally redundant and just complicating code. |
f82807f
to
483edd9
Compare
This creates a specific type for blobs, with well defined conversions at the borders. It also introduces a strong type for the Param::File value since that param is often used used by the public API to set filenames using absolute paths, but then core changes the param to a blob before it gets to the database. This eliminates a few more functions with very mallable C-like arguments behaviour which combine a number of operations in one. Because blob filenames are stored so often in arbitrary strings this does add more code when receiving those, until the storage is fixed. File name sanitisation is now deletated to the sanitize-filename crate which should do a slightly better job at this.
Turns out that anyone that uses these either justs wants a file or wants a blob. Consolidate those patterns into one place and simplify all the callers.
483edd9
to
e6b8097
Compare
Agree with that storing of the $BLOBDIR in the database. Am not convinced it's worh the churn to change it though, code would have to accept both, and/or we'd have to migrate the database. But if we did want to do that we now have a single clearly defined interface to do so ;) I think this can be considered for another review round/approval/merging (if test pass - the ones I ran locally did) |
for historical reasons :) before 2018-08, external paths were stored in the database, normally absolute but theoretically also relative. the idea was to avoid duplicated files, eg. if a larger video is sent, this generally makes some sense. however, at some point we gave this up as it was too confusing when a message is sent out and the sent file is later deleted - or even modified. on 2018-08 and database migration v41, the external paths were copied and paths were converted to $BLOBDIR, we use the prefix to be sure, the switch is done and not to copy a file twice accidentally (aborted conversion or so) (i've just learnt the word for that: Idempotence iirc ;) however, today, yip, might be that $BLOBDIR is superfluous. not totally sure as it is used in several places, eg. also in the config-section for the self-avatar or for group-images. |
I never knew what kind of stuff was in one of those strings referring to blobs, so I thought fixing it with types would be a good thing. Turns out that was more involved than hoped for.
Anyway, this tries to be strict about blobs by creating a
BlobObject
. This object should also make it easier to in the future not store all blobs into one directory or even into an sqlite store. Furthermore since blobs get passed through the params system a lot, theParamsFile
was added, basically any filename read from params should be parsable into that enum, allowing you to know if you're dealing with a pathname or already with a blob.This doesn't always make things less verbose on call sites, I apologise for that. In the future some of this could improve by making the storage more explicit, e.g. Params could have FromParams/ToParams traits. OTOH maybe Params should be moved to columns in the db directly, anyway I'm not making any calls on those things yet. The upshot of this is that you need to manually remember to parse a filename retrieved from params with ParamsFile for now.
Some places still exists where filenames are retrieved from Params without the ParamsFile bit. They don't have access to a context object and after checking them they already are safe and I felt like adding the complexities of pushing context into those APIs was not a good idea. Again I believe later improvements might end up bypassing some of this, or maybe they wont.
Lastly I must admit I experimented with errors here. I wanted a bespoke error object and wanted to have the cause carried along so you can see where errors come from as well as have backtraces. Again, currently nothing uses the causes and kinds because everything gets dumped in the global Error type. But I think in the future more things could have custom errors and causes and backtraces, I expect this will become especially nice once the core has it's own log it can write.
Finally, (phew, you're still reading), I realise this is bending the "freeze for a release" a bit. But than it's not the only improvement which hasn't been directly related to a bug before. And it's been semi-frozen for ages...
Thanks for reviewing!
The diff is a lot larger than I'd like so here some review pointers:
BlobObject
in context.rs is important. If you don't care about error handling much skip all the BlobError/BlobErrorKind/BlobErrorData stuff, you can see how it's used which is what matters. SecondlyParamsFile
in params.rs is important.After that the rest is mostly applying this stuff to current call sites, sometimes care must be taken not to change the logic.