2018.07.20 Community Meeting

Andrew Woods edited this page Sep 13, 2018 · 22 revisions

Call-in Details

Attendees

  • Andrew Hankinson (Bodleian Libraries, University of Oxford)
  • Andrew Woods (Duraspace)
  • Julian Morley (Stanford)
  • Steve Liu
  • Ben Cail (Brown)
  • John Kunze (CDL)
  • Simeon Warner (Cornell)
  • Debra Hanken Kurtz
  • Rosalyn Metz (Emory)

Regrets

  • ...

Agenda

  1. Community updates / points of discussion?
  2. Quick review of where the specification stands (Andrew Hankinson)
  3. Editorial Team Decision Point.
  4. Unresolved issues. Please review and comment (Rosalyn Metz)
  5. Call timing -- is 11am ET on Fridays a good time?
  6. Community experiences that may inform OCFL process

Action Items

  • Use of the term bitstream in the definition of OCFL.
    • If the name includes file in the title, we should include file in the definition. Issue #30

Notes

Audio recording

  1. Community updates / points of discussion?
    • None
  2. Quick review of where the specification stands (Andrew Hankinson)
    • Note rapid update policy through beginning of September
    • Edits are made as PRs and go live at https://ocfl.io/ as soon as they are merged
    • Editors holding a meeting in-person in early September
    • Seeking feedback on the need statement: 👍 from Ben, John may comment in writing
  3. Editorial Team Decision Point.
    • Decouple object from storage location Issue #22
      • Clear need to support both filesystems and object stores. Decision for now is not to support references to resources found elsewhere (à la fetch.txt in BagIt).
      • Julian notes that the ability to reference remote content is important to Stanford, so they want to have pointers on the local filesystem to something like Glacier. But happy not to have this in v1.
      • Andrew Hankinson notes discussion was illuminating in that it opens up many questions in the spec and we will have scoping work to do
      • John notes that versions are front-and-center in OCFL. Notes an issue with lexical sort and non-zero-padded directories. Notes the option of zero-padding to, say, 3 digits and then extending without zero padding.
      • Julian notes Stanford highest version is 20, average 2.6.
      • Current spec https://ocfl.io/#versions-directories and discussion https://github.com/OCFL/spec/issues/2
      • John suggests that question should be motivated by expected number of versions and understanding of who one is optimizing for
      • John - question about the rationale for always having a v0 prefix if any directory is zero-padded. What about the case where the chosen zero padding runs out: is it OK to then switch over to non-padded? NO, not in the current spec. Another option would be to say that the spec is entirely agnostic on the amount of zero padding, and order should be obtained by numeric sort after stripping the v
      • Ben - no legacy data issue, so would follow recommendations and use un-padded
    • Inventory sidecar file with hash
      • Have put in notes on sidecar files for digests; note issue with the calculation and encoding method
      • John - in BagIt have found that the list of algorithms changes with time. Have to understand that in time old objects won't have the best/good algorithms, and need to understand flexibility (registries/updates)
      • John - might want provision for updating objects with recomputed checksums using newer algorithms, would be good if spec permits such a change in objects
      • Julian - notes Moab doesn't allow an easy way to update checksums. Already have MD5, SHA1 and SHA256, and might want to do updates
      • Andrew Hankinson - thinks of spec as "object at rest" and perhaps add implementation notes for "object in motion" or "updates" which might cover processes such as changing the checksum
      • Andrew Woods - note difference between changing algorithms as new versions are added and the case of recomputing/updating checksums for old versions
      • John - second case would be useful in order to understand new digest addition as a normal operation on archival objects
      • Andrew - need to support immutability of versions
      • Simeon - should support though not necessarily require immutability of version, could perhaps change inventory file to use a new digest if data is stored in a way that permits this
      • John - is it required to have a digest sidecar? YES - by current https://ocfl.io/#digest-sidecar-file
      • Rosy - current expectation that every file will have a digest as well as the inventory
    • Object definition to refer to "file" as opposed to "bitstream", from previous community meeting (WIP pull-request)
      • No discussion
  4. Unresolved issues. Please review and comment (Rosalyn Metz)
  5. Call timing -- is 11am ET on Fridays a good time?
  6. Community experiences that may inform OCFL process
    • No comments
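
Illustration of the version-ordering point raised under item 3 (lexical sort vs. numeric sort after stripping the v). This is only a minimal sketch, not text from the spec: the helper names `version_number` and `sort_versions` are hypothetical, and it assumes version directory names are a lowercase `v` followed by decimal digits, with or without zero padding.

```python
import re

# Assumed naming pattern: 'v' followed by one or more decimal digits.
_VERSION_DIR = re.compile(r"v(\d+)")


def version_number(name: str) -> int:
    """Return the integer encoded in a name such as 'v3' or 'v003'."""
    match = _VERSION_DIR.fullmatch(name)
    if match is None:
        raise ValueError(f"not a version directory name: {name!r}")
    return int(match.group(1))


def sort_versions(names):
    """Order version directory names numerically, ignoring zero padding."""
    return sorted(names, key=version_number)


# Un-padded names: a plain lexical sort puts v10 before v2.
unpadded = ["v1", "v2", "v10", "v20"]
print(sorted(unpadded))         # ['v1', 'v10', 'v2', 'v20']
print(sort_versions(unpadded))  # ['v1', 'v2', 'v10', 'v20']

# Padding to 3 digits that later "runs out": lexical sort breaks again.
mixed = ["v001", "v002", "v999", "v1000"]
print(sorted(mixed))            # ['v001', 'v002', 'v1000', 'v999']
print(sort_versions(mixed))     # ['v001', 'v002', 'v999', 'v1000']
```

With a numeric key the amount of padding no longer affects the order, which is the "agnostic on zero padding" option John described. Whether the spec should allow that is the open question in https://github.com/OCFL/spec/issues/2.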