Update mixs-templates/README.md #786

Open
wants to merge 1 commit into
base: main

Conversation

turbomam
Member

@turbomam turbomam commented Apr 2, 2024

No description provided.

@@ -1,11 +1,17 @@
## MIxS (meta)data collection spreadsheet templates

This folder contains the MIxS schema (meta)data collection templates in the Excel spreadsheet (.xlsx) format. Each of the templates in this folder are blank spreadsheet templates with only the header row populated with the names of the terms that are associated with a particular [checklist](https://sujaypatil96.github.io/mixs/#checklists), [extension](https://sujaypatil96.github.io/mixs/#extensions), or [combination](https://sujaypatil96.github.io/mixs/combinations/) (of one checklist plus one extension). The templates offer pulldown menus for filling out columns for which there is a brief [enumeration](https://sujaypatil96.github.io/mixs/enumerations/) of permissible values, like [window_type](https://sujaypatil96.github.io/mixs/0000856/), but this [feature is disabled](https://linkml.io/linkml/generators/excel.html) if the total character length of the permissible values is more than 255, like [fao_class](https://sujaypatil96.github.io/mixs/0001083/).
In this folder, the GSC provides Microsoft Excel (.xlsx) spreadsheets for collecting metadata that is roughly aligned with the MIxS standard. THe GSC appreciates that spreadsheets like these can be very useful for preparing data to be submitted to an INSDC database. Users should be aware of the limitations of these spreadsheets, listed below.
Collaborator

@sujaypatil96 sujaypatil96 Apr 2, 2024

  • What does "roughly aligned" mean?
  • Typo: "THe" --> "The"


Note: The [linkml-validate](https://linkml.io/linkml/data/validating-data.html#the-linkml-validate-cli) command-line tool can be used to check the validity of a completed template, but no validation is provided in the template itself.
Each of the templates in this folder are blank spreadsheet templates with only the header row populated with the names of the terms that are associated with a particular [checklist](https://sujaypatil96.github.io/mixs/#checklists), [extension](https://sujaypatil96.github.io/mixs/#extensions), or [combination](https://sujaypatil96.github.io/mixs/combinations/) (of one checklist plus one extension).
Collaborator

Here we start using the word "templates" without clarifying what a "template" is exactly. I think we should use either "spreadsheet files" or "templates" uniformly throughout the text.


* There is one template for each *checklist*, *extension* and *combination* in the [schema](src/mixs/schema/mixs.yaml).

* The organization of files in this folder is such that there is one checklist subfolder per checklist, and an extensions subfolder.
* The checklist subfolders contain the template for that checklist, as well as the templates for the combinations that have been derived from that checklist.
* The extensions subfolder contains the templates for all the extensions in the schema.

### Limitations:
- The use of Excel spreadsheets is a violation of the 5-star linked data standards, which require the use of an open format like TSV or CSV
Collaborator

Provide a link to the 5-star linked data standards?

Found this after googling: https://www.w3.org/2011/gld/wiki/5_Star_Linked_Data


- Excel files do not play well with GitHub: it is very difficult to view them or diff them within GH, and they take up more disk space than TSV or CSV
Collaborator

GH --> GitHub

- These spreadsheets confuse the **specification of a standard** and an **application that uses the standard** to collect data. THe spreadsheets do not apply any validation to the entered metadata. They do provide one data entry convenience: pulldown menus are provided for filling out columns when there is a brief [enumeration](https://sujaypatil96.github.io/mixs/enumerations/) of permissible values, like [window_type](https://sujaypatil96.github.io/mixs/0000856/), but this [feature is disabled](https://linkml.io/linkml/generators/excel.html) if the total character length of the permissible values is more than 255, like [fao_class](https://sujaypatil96.github.io/mixs/0001083/).
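The 255-character cutoff described in the bullet above can be illustrated with a small check. This is only a sketch under stated assumptions, not the LinkML Excel generator's actual code: the function names are hypothetical, the example enumerations are made up rather than taken from the MIxS schema, and counting the comma separators toward the limit is an assumption (the README text only says "total character length of the permissible values").

```python
# Limit taken from the README text: pulldown menus are disabled when the
# permissible values total more than 255 characters.
MAX_DROPDOWN_CHARS = 255


def dropdown_length(permissible_values):
    """Length of the values joined with commas.

    Assumption: the comma separators count toward the limit.
    """
    return len(",".join(permissible_values))


def supports_dropdown(permissible_values):
    """Return True if the enumeration is short enough for a pulldown menu."""
    return dropdown_length(permissible_values) <= MAX_DROPDOWN_CHARS


# Hypothetical enumerations, for illustration only (not from the MIxS schema).
short_enum = ["single-hung sash window", "horizontal sash window", "fixed window"]
long_enum = [f"soil class {i:03d}" for i in range(40)]

print(supports_dropdown(short_enum))  # short list: pulldown can be offered
print(supports_dropdown(long_enum))   # long list: pulldown would be disabled
```

A check like this could be run over every enumeration in a schema to list which columns will lack a pulldown menu, so that the limitation can be documented per template.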
Member Author

Great review, @sujaypatil96. Apologies for leaving so much junk in there.

2 participants