Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Guidance on versioning implementation 1 #362

Merged
merged 5 commits into from
Aug 27, 2019

Conversation

neilsjefferies
Copy link
Member

@neilsjefferies neilsjefferies commented Jun 4, 2019

This should work regardless of where the temp files are located. There needs to be some temp files when we write to the object inventory anyway. Fixes #348

<li>
Temporary Version Directories: New versions should be created in a temporary version directory.
This could reside in the OCFL Storage Root workspace directory ("deposit"), the same location as
other version directories in an OCFL Object or elsewhere. In any case, only one such directory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what the does "the same location as other version directories in an OCFL Object or elsewhere" mean.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not saying where the workspace is....just that there has to be one and the contents are predictable and therefore identifiable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that i would change the sentence to say this, because right now i don't really know what that sentence says.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seems really hard to discuss a semi-blessed deposit directory inside objects in a PR. May I suggest creating a new issue (that points back to #320) that proposes guidance on this issue in the implementation notes while not changing the normative specification?

Note also should use code font <code>deposit</code> instead of quoting literals

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...this is guidance in the implementation notes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not being clear: my point is that it would be good to merge other parts of this PR but I don't think we can introduce a new in-object deposit directory without more discussion on a separate issue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not being clear: my point is that it would be good to merge other parts of this PR but I don't think we can introduce a new in-object deposit directory without more discussion on a separate issue

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Limited discussion to "deposit" directory in Storage Root.

Temporary Version Directories: New versions should be created in a temporary version directory.
This could reside in the OCFL Storage Root workspace directory ("deposit"), the same location as
other version directories in an OCFL Object or elsewhere. In any case, only one such directory
can exist per OCFL Object at any time and updates to it should be managed by a single
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only one deposit directory can exist per object? only one temporary version directory can exist per object? this isn't clear.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. The temporary version directory is called "deposit..."

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hoopefully clarified.

other version directories in an OCFL Object or elsewhere. In any case, only one such directory
can exist per OCFL Object at any time and updates to it should be managed by a single
controlling process to avoid conflicts. It is advised that object updates should be given
a unique transaction identifier to simplify any overlying process control logic. For
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is way to into the weeds on how to architect a repository.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bingo! I refer you to Fedora 4 and DPN - apparently this is something some people find rather hard. This is just a guideline on how not to screw it up too badly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm suggesting that you remove this from the language because it shouldn't be on us to tell folks how to architect their repository.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scrapped.

example, the identifier could be used as an an eTag for a REST API implementation. A
good temporary directory name should be "deposit_transaction ID_OCFL Object ID".
If there is guaranteed to only be a single process controlling updates to the contents
of an OCFL storage root, then it is permissible to use "deposit_OCFL Object ID" as a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I have a top level deposit directory, then I don't need to name a directory this because I know everything in that directory is for deposit into the storage root.

Copy link
Member Author

@neilsjefferies neilsjefferies Jun 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be multiple transactions going on in that directory so they need separate distinguishable subdirectories. Embedding a transaction ID also gives protection against multiple writes to the same object by different processes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's one way to do it, but another is to just not let people write to the same object while another version is being deposited. see comment above about this being way too into the weeds and we shouldn't be telling folks what to do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the point - the only way to do any form of locking is to stick a flag somewhere. Since OCFL aims to do away with non-file-based data, this is an OCFL-ish way to do it. This is guidance, as stated above, not the spec so it's just a suggestion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scrapped.

If there is guaranteed to only be a single process controlling updates to the contents
of an OCFL storage root, then it is permissible to use "deposit_OCFL Object ID" as a
temporary version directory name. If temporary directories are stored within OCFL
Objects then the OCFL Object ID part may be omitted since it is implicit.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so you're saying that there would be just a V2 directory? how would you know that a version hasn't been completed in the directory?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but in any case, version completeness can be determined from the inventory.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scrapped.

@rosy1280
Copy link
Contributor

rosy1280 commented Jun 5, 2019

@neilsjefferies I haven't finished reviewing this, but I would suggest that we re-visit the previous conversations we had around deposit directories and identify what this section's purpose is because right now this doesn't necessarily seem like implementation guidance.

</p>

<section>
<h2>Temporary Names</h2>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Temporary Names" sounds odd to me, should this be "Temporary Files and Directories"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed...done

example, the identifier could be used as an an eTag for a REST API implementation. A
good temporary directory name should be "deposit_transaction ID_OCFL Object ID".
If there is guaranteed to only be a single process controlling updates to the contents
of an OCFL storage root, then it is permissible to use "deposit_OCFL Object ID" as a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't recommend directory names with spaces but maybe that is just old-school of me.

I think we also have to be careful around language like permissible as this is non-normative, the issue here is that with just a single process it is adequate without the transaction id.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, shouldn't be spaces - will fix that. How often will we ever have a single process controlling access since it is likely that any overlying API will be RESTful.

This is in the implementation notes so not part of the spec and therefore shouldn't it all be non-normative. Adequate is adequate however!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly scrapped.

does not specify in detail update and file locking mechanisms since these are implementation
dependent features. Nevertheless, this section includes some basic guidance on how to
update OCFL objects in a manner that tries to ensure that updates are limited to a single
transaction and that failures are detectable and recoverable.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to point out that advice here will result in objects that cannot be validated during update, validation should be performed only when updates are complete?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted.

Update object as described earlier.
</li>
<li>
Ideally, update the local inventory after each file change operation, not at the end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure whether this is ideal. If one thinks of an atomic update of many files then it is just a waste of time

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Advantages both ways depending on use case. Scrapped as we don;t really need to say anything about it.

julianmorley
julianmorley previously approved these changes Aug 20, 2019
<li>
Temporary Version Directories: New versions should be created in a temporary version directory.
This could reside in the OCFL Storage Root workspace directory ("deposit"), the same location as
other version directories in an OCFL Object or elsewhere. In any case, only one such directory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not being clear: my point is that it would be good to merge other parts of this PR but I don't think we can introduce a new in-object deposit directory without more discussion on a separate issue

@rosy1280 rosy1280 merged commit 20e792a into master Aug 27, 2019
@rosy1280 rosy1280 deleted the Guidance-on-versioning-implementation-1 branch October 3, 2019 21:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide versioning implementation guidance
5 participants