Is it OK to have a missing version directory? #540

Open
zimeon opened this issue Apr 20, 2021 · 21 comments

@zimeon
Contributor

zimeon commented Apr 20, 2021

Fixture OCFL/fixtures#79 / E010_missing_versions brings up an interesting question for me. Is it really necessary to have a version directory for every version? There will be a version directory if there is an inventory for every version, but a per-version inventory is not required. If an implementation isn't storing an inventory for every version and a version doesn't add any new content files, is an empty version directory required?
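
To make the question concrete, here is a minimal sketch of how a checker might spot the situation being described. It assumes the object root holds an `inventory.json` whose `versions` block is keyed by version directory name; the function name `missing_version_dirs` is hypothetical and is not taken from any OCFL client or validator.

```python
import json
from pathlib import Path


def missing_version_dirs(object_root):
    """List versions named in the root inventory that have no directory on disk.

    Hypothetical helper for illustration only.
    """
    root = Path(object_root)
    inventory = json.loads((root / "inventory.json").read_text(encoding="utf-8"))
    missing = [
        name for name in inventory.get("versions", {})
        if not (root / name).is_dir()
    ]
    # Sort by length first so that v2 comes before v10.
    return sorted(missing, key=lambda name: (len(name), name))
```

Whether a non-empty result should count as an error is exactly what this issue is asking.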

@pwinckles

I brought this issue up previously (#535) and at the time was told that you must always have a version and if you aren't storing an inventory for every version you're doing it wrong.

@awoods
Member

awoods commented Apr 25, 2021

Getting back to the core OCFL principles, I would argue that the first paragraph of "3.3 Version Directories" defines an important characteristic of OCFL. However, given the loophole raised in #535, I would suggest we add wording to the middle of the first paragraph of "3.7 Version Inventory and Inventory Digest" along the lines of:

In the case where no files have been added or updated in a given version, which would result in an empty and therefore absent "content" directory (see https://ocfl.io/1.0/spec/#content-directory), such a version directory MUST include an inventory file.

@zimeon zimeon added this to the 1.1 milestone May 4, 2021
@awoods awoods added the Ready for Review label May 23, 2021
@zimeon
Contributor Author

zimeon commented Nov 16, 2021

I feel uncomfortable with the idea that we might require an inventory.json just as a way to keep the version directory in implementations that choose otherwise not to have an inventory in the version directories.

@rosy1280
Contributor

rosy1280 commented Nov 30, 2021

Suggested change to 3.3 Version Directories. Paragraph 1 should read (changes are highlighted):

OCFL Object content MUST be stored as a sequence of one or more versions. The sequence of version numbers is the sequence of positive, base-ten integers: 1, 2, 3, etc., and the version directory name is constructed by adding the prefix v. The version number sequence MUST start at 1 and MUST be continuous without missing integers. Each object version MUST be stored in a version directory under the object root.

and then the last paragraph should read (changes are highlighted):

There MUST be no other files as children of a version directory, other than an inventory file, an inventory digest, or a .no_content file. The version directory SHOULD NOT contain any directories other than the designated content sub-directory. Once created, the contents of a version directory are expected to be immutable.

I don't think the suggested changes indicate that you can't have an empty version directory which is of course the whole point of this ticket.
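
The continuity requirement in the proposed paragraph 1 ("MUST start at 1 and MUST be continuous without missing integers") can be sketched as a simple gap check. This is an illustration under assumptions: `version_gaps` is a hypothetical name, the input is just a list of directory names found under the object root, and the sketch ignores any rules about zero-padding.

```python
import re

# A version directory name is the prefix "v" followed by a base-ten integer.
VERSION_DIR = re.compile(r"v(\d+)")


def version_gaps(names):
    """Return the version numbers missing between 1 and the highest one seen.

    Hypothetical helper; names that do not look like version directories
    (for example inventory.json) are ignored.
    """
    numbers = set()
    for name in names:
        match = VERSION_DIR.fullmatch(name)
        if match:
            numbers.add(int(match.group(1)))
    if not numbers:
        return []
    return [n for n in range(1, max(numbers) + 1) if n not in numbers]
```

For example, `version_gaps(["v1", "v2", "v4"])` reports version 3 as missing, which is the case a validator would have to flag if every version must have a directory.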

@rosy1280
Contributor

rosy1280 commented Nov 30, 2021

We need additional language that lets readers know a .no_content file should exist when (1) you don't store an inventory file in your version directories AND (2) your version does not have any content to be stored (e.g. the version was created to document a file name change).

Suggestions on language to use are welcome.

cc: @zimeon @awoods

@zimeon
Contributor Author

zimeon commented Nov 30, 2021

Per Slack discussion with @neilsjefferies, I don't see any benefit in making the no_content file a "dot"/hidden file.

Questions:

  1. Is it allowed to have a no_content file AND an inventory? (I think YES)
  2. Is it allowed to have a no_content file AND a content sub-directory? (I think NO)
  3. Is it preferred to have a no_content file in the case that there is no content sub-directory, even if there is an inventory? (I think YES)
  4. What should the content of the no_content file be? (I suggest empty, SHOULD?)

Assuming my answers to the above, I think we could write something like the following, although it ends up as a bit of a mouthful:

There MUST be no files as children of a version directory except an inventory file, an inventory digest, or a no_content file. The version directory SHOULD NOT contain any directories other than the designated content sub-directory. The version directory MUST NOT be empty and in the case that there is no content sub-directory there SHOULD be a no_content file. If present, the no_content file SHOULD be empty. Once created, the contents of a version directory are expected to be immutable.

@pwinckles

That language does not enforce point 2

@neilsjefferies
Member

neilsjefferies commented Nov 30, 2021

Is there any harm in making a no_content file mandatory in the absence of a content subdirectory?

@rosy1280
Contributor

We also discussed whether or not 'no_content' should have content and felt that for validation it didn't matter: we would just check for presence. Therefore we would remain silent on whether or not there is content in the no_content file. @zimeon can you explain why we need to dictate that no_content is zero length?

@zimeon
Contributor Author

zimeon commented Nov 30, 2021

@pwinckles re. #540 (comment) - yes indeed, good point

@neilsjefferies re. #540 (comment) - if we make it mandatory then we don't have backward compatibility with 1.0... but now I think about it, requiring no_content when there isn't a content directory is also not backwards compatible so maybe this whole change has to wait for 2.0?

@rosy1280 re. #540 (comment) - no particular reason why no_content should have no content (though you have to admit it is kinda cute). I do think it is better to recommend something as that avoids someone having to make an arbitrary implementation decision.

Taking the above into account, a revised proposal might be:

There MUST be no files as children of a version directory except an inventory file, an inventory digest, or a no_content file. The version directory SHOULD NOT contain any directories other than the designated content sub-directory. The version directory MUST NOT be empty. In the case that there is no designated content sub-directory there [SHOULD|MUST] be a file named no_content, and there MUST NOT be a file or directory named no_content otherwise. If present, the no_content file [SHOULD be empty|MAY be empty or have any content]. Once created, the contents of a version directory are expected to be immutable.
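
For anyone weighing what this draft wording would mean for a validator, here is a rough sketch of the checks it implies. It illustrates the draft text only, not any released OCFL rule. Assumptions: the function name `check_version_dir` is hypothetical, the content sub-directory is called `content`, the inventory digest sidecar is `inventory.json.sha512`, and the unresolved [SHOULD|MUST] is exposed as the `no_content_required` flag. The open question about the contents of no_content is left unchecked.

```python
from pathlib import Path


def check_version_dir(version_dir, content_dir="content",
                      digest_algorithm="sha512", no_content_required=False):
    """Apply the draft no_content wording to one version directory.

    Hypothetical sketch; returns (errors, warnings) as lists of strings.
    """
    errors, warnings = [], []
    entries = list(Path(version_dir).iterdir())
    allowed_files = {
        "inventory.json",
        f"inventory.json.{digest_algorithm}",
        "no_content",
    }

    if not entries:
        errors.append("version directory is empty")

    has_content = any(e.name == content_dir and e.is_dir() for e in entries)
    has_marker = any(e.name == "no_content" and e.is_file() for e in entries)

    for entry in entries:
        if entry.is_dir():
            # Extra directories are SHOULD NOT, so only a warning.
            if entry.name != content_dir:
                warnings.append(f"unexpected directory: {entry.name}")
        elif entry.name not in allowed_files:
            errors.append(f"unexpected file: {entry.name}")

    if has_content and any(e.name == "no_content" for e in entries):
        errors.append("no_content present alongside content directory")
    if not has_content and not has_marker:
        message = "no content directory and no no_content file"
        (errors if no_content_required else warnings).append(message)

    return errors, warnings
```

With the flag left at its default, a directory holding only an inventory produces a warning rather than an error; setting `no_content_required=True` turns that into an error, which is the reading that raises the backward-compatibility concern with 1.0.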

@rosy1280
Contributor

@zimeon 👍🏼 to it may be a breaking change. I've been wondering that as we drafted this. Should have said something sooner.

@awoods
Member

awoods commented Dec 1, 2021

Given the fact that we do not want to introduce any breaking changes in a 1.1 release, would a softening of my earlier suggestion to a SHOULD instead of MUST be sufficient guidance for this release?

In the case where no files have been added or updated in a given version, which would result in an empty and therefore absent "content" directory (see https://ocfl.io/1.0/spec/#content-directory), such a version directory SHOULD include an inventory file.

@neilsjefferies
Member

I am now leaning towards @awoods's suggestion as the minimal change to the spec required to resolve the issue. It is a little untidy only if you are taking the NOT RECOMMENDED route of not having version inventories.

@zimeon
Contributor Author

zimeon commented Dec 1, 2021

I do not think we should adopt the earlier suggestion with SHOULD instead of MUST, because it doesn't solve the problem: not having a version directory would still only produce warnings, albeit now two of them (no inventory and no directory).

I think we should punt this to v2.0 with the understanding that in v1.0 (and v1.1) it is possible (though not recommended) to not have a version directory in the case of no files updated and no version inventories stored. I don't see a non-breaking correction/fix without other implications.

@awoods
Member

awoods commented Dec 1, 2021

I agree that my updated suggestion does not solve the problem. It does, however, provide clear guidance on how to address the empty version directory scenario.

If that guidance is less helpful than not, I am happy to leave the text as-is, and punt to 2.0.

@pwinckles

Is it spelled out somewhere what the compatibility between 1.0 and 1.1 is supposed to be? Is 1.1 supposed to just be 1.0, but with a few validations made explicit?

It's true that the no_content change would make the representation on disk of 1.0 and 1.1 versions substantively different so that some 1.0 versions would be invalid per 1.1 and some 1.1 versions would be invalid per 1.0. However, this is only true depending on how validators are intended to behave.

If an object is created as 1.0 and is later "upgraded" to 1.1, should the 1.0 versions be validated against the 1.0 spec or the 1.1 spec?

For my validators, I had originally planned on simply updating everything to validate to 1.1, because the majority of the changes were providing clarity to constraints that could already be inferred from the 1.0 spec.

If 1.1 were to include the no_content change then things become more complicated, and I was thinking of validating versions based on the spec version that they were created under. It additionally introduces the complication for OCFL clients that would need to create versions slightly differently depending on the current spec version the object conforms to.

All of that to say, I think you could put the no_content change in 1.1 if you wanted. It would make clients and validators more complicated, but it wouldn't "break" anything. Personally, I'm just as happy punting on it because it means I have less work to do, and this is a very niche edge case.

@zimeon
Contributor Author

zimeon commented Dec 1, 2021

@pwinckles : per your comment about validating versions: The spec is clear that versions should be validated against the version they were written to conform to, but this isn't actually clear without a version inventory... I have created #569 to discuss
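
A small sketch of why the version inventory matters here. It assumes the spec version of a version can be read from the `type` value of that version's own `inventory.json` (a URI of the form `https://ocfl.io/1.0/spec/#inventory`); the function name `spec_version_of` is hypothetical. When the version directory has no inventory, the sketch has nothing to go on and returns `None`, which is the ambiguity described above.

```python
import json
import re
from pathlib import Path

# Inventory "type" values embed the spec version, e.g. .../1.0/spec/#inventory
TYPE_PATTERN = re.compile(r"https://ocfl\.io/(\d+\.\d+)/spec/#inventory")


def spec_version_of(version_dir):
    """Return the spec version declared by a version's inventory, or None.

    Hypothetical helper: None means there is no version inventory, or its
    type value is not recognised, so the spec version cannot be determined
    from the version directory alone.
    """
    inventory_path = Path(version_dir) / "inventory.json"
    if not inventory_path.is_file():
        return None
    inventory = json.loads(inventory_path.read_text(encoding="utf-8"))
    match = TYPE_PATTERN.fullmatch(str(inventory.get("type", "")))
    return match.group(1) if match else None
```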

@pwinckles

Requiring a namaste file in all versions would solve the empty directory problem. :D

@neilsjefferies
Member

+1 Punt this one to 2.0

@neilsjefferies
Member

...is there any mileage in putting something about this in the Implementation Notes?

@rosy1280 rosy1280 modified the milestones: 1.1, 2.0 Dec 8, 2021
@zimeon
Contributor Author

zimeon commented Dec 9, 2021

Agreement in community call to delay until 2.0 (@rosy1280 @awoods @julianmorley @zimeon present). Removing 1.1 tag

@rosy1280 rosy1280 added the Needs Discussion label and removed the Question, Ready for Review, and OCFL Object labels Sep 22, 2023

5 participants