What if my content includes files called inventory.json or inventory.json.sha512? #230

Closed
zimeon opened this issue Oct 12, 2018 · 20 comments

@zimeon
Contributor

zimeon commented Oct 12, 2018

I see two options:

  1. With the spec as currently written, there would need to be a note explaining that these are special cases where the existing file path cannot be the original file path; the file must be renamed inside the OCFL object, e.g.:
[object root]
|- v1
|    |- inventory.json   <-- the actual inventory, not the file so named in v1 content
|    |- inventory.json.sha512   <-- the actual digest sidecar
|    |- inventory.json_moved   <-- the content called inventory.json in v1 state
|    \- inventory.json.sha512_moved  <-- the content called inventory.json.sha512 in v1 state
\...
  2. Use another directory, as is done in BagIt (where it is called data), so that there is clean separation:
[object root]
|- v1
|    |- inventory.json   <-- the actual inventory, not the file so named in v1 content
|    |- inventory.json.sha512   <-- the actual digest sidecar
|    \- data
|          |- inventory.json  <-- the content called inventory.json in v1 state
|          \- inventory.json.sha512  <-- the content called inventory.json.sha512 in v1 state
\...
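
The difference between the two options can be sketched in Python. This is only an illustration under assumptions: the function names `content_path_option1` and `content_path_option2` are hypothetical helpers, the `_moved` suffix is taken from the example above, and `data` is the BagIt-style directory name; none of this is defined by the OCFL spec.

```python
from pathlib import PurePosixPath

# File names that OCFL uses at the root of a version directory (from this issue).
RESERVED = {"inventory.json", "inventory.json.sha512"}

def content_path_option1(version: str, logical_path: str) -> str:
    """Option 1: content shares the version root, so clashing names are renamed."""
    path = PurePosixPath(logical_path)
    # Only a file directly at the version root can clash with the reserved names.
    if len(path.parts) == 1 and path.name in RESERVED:
        return f"{version}/{path.name}_moved"  # hypothetical renaming scheme
    return f"{version}/{logical_path}"

def content_path_option2(version: str, logical_path: str, content_dir: str = "data") -> str:
    """Option 2: content lives in its own directory, so no renaming is needed."""
    return f"{version}/{content_dir}/{logical_path}"
```

Under option 1 the stored name differs from the original name for the clashing files, while under option 2 every file keeps its original name beneath the extra directory.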
@zimeon zimeon added the Question and OCFL Object labels Oct 12, 2018
@ahankinson
Contributor

ahankinson commented Oct 12, 2018

I think this is a problem, but I'm not sure how big a problem it is.

  1. The inventory files are only valid in the root of the version directory. Any other inventory files are treated as content.
  2. Inventory files in the version directories are optional (but recommended)

If we have to address it, though, I would go with option 2.

@ahankinson
Contributor

Maybe call it content, instead of data, since that's what we seem to be calling it anyway?

@zimeon
Contributor Author

zimeon commented Oct 12, 2018

I agree that if we were to adopt option 2 then content would be a better name.

@neilsjefferies
Member

...or name the files ocfl_inventory.json etc. to minimize the chance of collision, and just say those names are reserved and you can't have them.

@ahankinson
Contributor

@neilsjefferies that might get a bit messy if we're wanting to handle arbitrary content from someone's HD. Especially if they've adopted OCFL as a way of organizing their content. :)

@neilsjefferies
Member

That would be archivists just being silly. If that is what you are doing then packaging is really the way to go.

@awoods
Member

awoods commented Oct 12, 2018

It is probably important that we support the case of someone's content having the same name as OCFL administrative metadata files.
That being the case, I would also prefer option no.2 for its avoidance of renaming files.

@awoods awoods added Alpha and removed Alpha labels Oct 12, 2018
@ahankinson ahankinson added this to the Alpha milestone Oct 12, 2018
@zimeon zimeon modified the milestones: Alpha, Beta Oct 12, 2018
@zimeon
Contributor Author

zimeon commented Oct 13, 2018

I lean toward having an extra content directory to make it clean/clear

@neilsjefferies
Member

Since the inventory already has a mechanism for separating the logical path in the object from the actual path on disk, isn't this really just an Implementation Note? Since you can't have those names on disk, rename them; they can still have the original name in the inventory. The system we have can handle it already, so why introduce a change for a small corner case?
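
The logical-versus-actual path separation mentioned here can be sketched as follows. This is a simplified illustration, assuming an inventory with a digest-keyed manifest of on-disk paths and a per-version state of logical paths; the digest value is a placeholder, the `_moved` name comes from option 1 above, and `resolve` is a hypothetical helper.

```python
# Simplified, hypothetical inventory; "digest-aaa" is a placeholder, not a real SHA-512 value.
inventory = {
    "manifest": {  # digest -> actual paths on disk, relative to the object root
        "digest-aaa": ["v1/inventory.json_moved"],
    },
    "versions": {
        "v1": {
            "state": {  # digest -> logical paths as the depositor named them
                "digest-aaa": ["inventory.json"],
            }
        }
    },
}

def resolve(inventory: dict, version: str, logical_path: str) -> str:
    """Return the on-disk path that holds a logical path in the given version."""
    state = inventory["versions"][version]["state"]
    for digest, logical_paths in state.items():
        if logical_path in logical_paths:
            # Any manifest entry for the digest holds the same bytes; take the first.
            return inventory["manifest"][digest][0]
    raise KeyError(logical_path)
```

With a mapping like this, the file is still called inventory.json in the object's logical state even though it is stored under a different name on disk.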

@ahankinson
Contributor

TBH, having the inventory in the same directory root as content has never really sat well with me. I just wasn't able to say why. It seemed like we were mixing administrative and content data.

@ahankinson
Contributor

@rosy1280 @julianmorley pretty please could we have some input on this so we can move it along?

@rosy1280
Contributor

i think it would be cleaner and more human readable if there was a content directory and we put the content files in it. there is a reason that BagIt and Moab do this. so let's not reject use cases because we don't think people should do them.

@ahankinson ahankinson modified the milestones: Beta, Alpha Oct 17, 2018
@julianmorley
Contributor

Yeah, having a content directory makes a great deal of sense. It enables a clean nesting of OCFL objects inside OCFL objects.

@rosy1280
Contributor

also, when you start to think about distributed digital preservation (which i don't want to broach, but...): if Emory's repository implements OCFL and sends its content to Chronopolis, which has also implemented OCFL, you may end up with clashing inventory.json and inventory.json.sha512 files.

@ahankinson ahankinson self-assigned this Oct 17, 2018
@zimeon
Contributor Author

zimeon commented Oct 17, 2018

Per https://github.com/OCFL/spec/wiki/2018.10.17-Editors-Meeting, we agreed to use a content directory.

@zimeon zimeon removed the Question label Oct 17, 2018
@ahankinson
Contributor

If it wasn't clear from the assignments, I'm currently working on a PR to this effect.

@ahankinson
Contributor

Q: Do the 'logical' filepaths also omit the 'content' part?

@awoods
Member

awoods commented Oct 17, 2018

@ahankinson : it would make sense to me that content would be excluded from logical file paths.
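
A minimal sketch of what excluding content from logical paths could look like, assuming on-disk paths of the form `<version>/content/<path>`. The helper name `logical_path` is hypothetical, and in practice the logical paths would be recorded in the inventory rather than derived from the directory layout.

```python
from pathlib import PurePosixPath

def logical_path(content_path: str, content_dir: str = "content") -> str:
    """Strip the leading '<version>/<content_dir>/' from an on-disk path."""
    parts = PurePosixPath(content_path).parts
    # Expect at least: version directory, content directory, file name.
    if len(parts) < 3 or parts[1] != content_dir:
        raise ValueError(f"not inside a {content_dir!r} directory: {content_path}")
    return str(PurePosixPath(*parts[2:]))
```

Here `v1/content/inventory.json` maps to the logical path `inventory.json`, while `v1/inventory.json` is rejected because it is the administrative file at the version root, not content.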

@ahankinson
Contributor

👍

@ahankinson
Contributor

Also, are content directories a MUST?

ahankinson added a commit that referenced this issue Oct 17, 2018
awoods pushed a commit that referenced this issue Oct 17, 2018
Fixes #230

Per speedy-mode... merge on 3!

6 participants