Make a quick reference that maps typecode to schema class name #2285
Comments
https://api.microbiomedata.org/docs#/metadata/get_nmdc_schema_typecodes_nmdcschema_typecodes_get retrieves the typecode-to-class relationships. I don't understand the values in the
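For anyone who wants to try the endpoint from a script, here is a minimal sketch of turning its response into a typecode lookup. The request path (`/nmdcschema/typecodes`) is inferred from the operation name in the docs link above, and the record keys (`name`, `schema_class`) and the sample records are assumptions made for illustration, so check them against a real response before relying on this.

```python
import json
from urllib.request import urlopen

# Assumed URL, inferred from the operation name in the docs link above.
TYPECODES_URL = "https://api.microbiomedata.org/nmdcschema/typecodes"


def fetch_typecode_records(url: str = TYPECODES_URL) -> list:
    """Download the typecode records as parsed JSON (performs a network call)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def build_lookup(records: list) -> dict:
    """Map each typecode to a schema class name.

    Assumes each record has a "name" key (the typecode) and a "schema_class"
    key (the class name, possibly prefixed with "nmdc:"). Both key names are
    assumptions, not confirmed against the API.
    """
    lookup = {}
    for record in records:
        class_name = record["schema_class"].removeprefix("nmdc:")
        lookup[record["name"]] = class_name
    return lookup


# Hypothetical sample payload, for illustration only (no network call here).
sample_records = [
    {"name": "bsm", "schema_class": "nmdc:Biosample"},
    {"name": "sty", "schema_class": "nmdc:Study"},
]
typecode_to_class = build_lookup(sample_records)
```

Calling `build_lookup(fetch_typecode_records())` would do the same against the live API, if the assumptions about the path and keys hold.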
The source file can be obtained from the
Thanks for sharing your initial thoughts about this, @turbomam. Speaking of Runtime endpoints, an alternative to including a table like this in the schema documentation could be (depending on the use cases @SamuelPurvine and @kheal have in mind) to implement a Runtime API endpoint that either returns the JSON "equivalent" of the full table or performs a lookup, using the "table" under the hood. That said, the fact that they built this example table leads me to suspect that one-by-one lookups would not satisfy the use case(s) they have in mind.
Hi folx! This came out of Camilo wondering (in a meeting) where in the heck he needed to go to get ahold of what the ID typecodes mean, without digging through all of the YAML files, since the typecodes are distributed across them. I started trying to cull this info and quickly realized it was a silly thing to do by hand (which is why the table above is ad hoc and incomplete), especially as things change. Moreover, I can see the utility of having a one-stop place in the documentation (i.e. not having to dig it out of the API with various arcane commands) that references these ID structures, not just for schema maintainers and newbies, but also for the general public who come to the site and are confronted with ID structures, and who, again, may not be facile with dredging data from an API. Would love to see it as part of https://microbiomedata.github.io/nmdc-schema/identifiers/, maybe as an extension of the "IDs minted for use within NMDC" section, as a quick guide to what each of the typecodes relates to. And yes, it would be great if it ran every so often to keep up to date. Of course, if there's no desire to have such info (easily) available in the documentation, then by all means ignore the request!
@turbomam - thanks for pointing out this endpoint, I had missed it! Between this endpoint and the https://api.microbiomedata.org/docs#/metadata/get_by_id_nmdcschema_ids__doc_id__get:~:text=Get%20By-,Id,-If%20the%20identifier endpoint my needs are satisfied, but I see value in transparent documentation of typecode mapping in the schema documentation. I think a simple table of typecode | class (with link to documentation of that class) | collection is what users might benefit from.
Thanks, @kheal and @SamuelPurvine. @SamuelPurvine, @turbomam, @aclum, and I discussed this during today's metadata meeting.
Action plan
I will prototype a "page" (or something page-like) in the schema documentation that shows a mapping from typecode to schema class name and, if practical, a link to documentation for that schema class. The "page" will be auto-generated from the schema. It will not exist as a file in the repository, but will be generated whenever the schema documentation gets built (that's also how the inter-collection diagram gets generated). I will prototype it in a PR and then invite people to review it.
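To make the "auto-generated from the schema" idea concrete, here is a rough sketch of pulling typecodes out of class definitions. It assumes each class declares its identifier pattern under `slot_usage` → `id` → `structured_pattern` → `syntax`, and that the typecode sits between the prefix colon and the first hyphen. That layout, the function name, and the sample schema below are assumptions made for illustration, not a description of the actual prototype.

```python
import re

# Matches ":bsm-" or ":(bsm)-" or ":(aaa|bbb)-" and captures the typecode(s).
TYPECODE_PATTERN = re.compile(r":\(?(?P<codes>[a-z0-9]+(?:\|[a-z0-9]+)*)\)?-")


def extract_typecodes(schema: dict) -> list:
    """Return sorted (typecode, class_name) pairs found in a schema dict."""
    pairs = []
    for class_name, class_def in schema.get("classes", {}).items():
        # Assumed location of the ID pattern; classes without one are skipped.
        syntax = (
            (class_def or {})
            .get("slot_usage", {})
            .get("id", {})
            .get("structured_pattern", {})
            .get("syntax", "")
        )
        match = TYPECODE_PATTERN.search(syntax)
        if match is None:
            continue
        for code in match.group("codes").split("|"):
            pairs.append((code, class_name))
    return sorted(pairs)


# Hypothetical, heavily trimmed schema, for illustration only.
sample_schema = {
    "classes": {
        "Study": {
            "slot_usage": {
                "id": {
                    "structured_pattern": {
                        "syntax": "{id_nmdc_prefix}:sty-{id_shoulder}-{id_blade}$"
                    }
                }
            }
        },
        "Biosample": {
            "slot_usage": {
                "id": {
                    "structured_pattern": {
                        "syntax": "{id_nmdc_prefix}:(bsm)-{id_shoulder}-{id_blade}$"
                    }
                }
            }
        },
        "NamedThing": {},
    }
}
typecode_pairs = extract_typecodes(sample_schema)
```

Working from a plain dict keeps the sketch free of third-party dependencies; a real build step would load the schema with whatever tooling the documentation build already uses.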
...next sprint.
Here's a link to a Python notebook in a "private" GitHub Gist (accessible to anyone who has this link, so it is effectively public), in which I implemented an algorithm for generating a Markdown table of typecodes, schema class names, and schema class documentation URLs. https://gist.github.com/eecavanna/dce7c1672ad7d65f265972649d248342 We can use this notebook as a conversation piece at the next metadata meeting.
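For readers who have not opened the notebook, the rendering step for such a table might look roughly like the sketch below. This is not the notebook's code. The per-class documentation URL pattern (`<base>/<ClassName>/`), the function name, and the sample pairs are assumptions made for illustration.

```python
# Base URL taken from the identifiers page linked earlier in this thread.
DOCS_BASE_URL = "https://microbiomedata.github.io/nmdc-schema"


def render_markdown_table(pairs: list) -> str:
    """Render (typecode, class_name) pairs as a Markdown table with doc links."""
    lines = ["| Typecode | Schema class |", "| --- | --- |"]
    for typecode, class_name in sorted(pairs):
        # Assumed URL pattern for a class's documentation page.
        url = f"{DOCS_BASE_URL}/{class_name}/"
        lines.append(f"| `{typecode}` | [{class_name}]({url}) |")
    return "\n".join(lines)


# Hypothetical sample pairs, for illustration only.
table = render_markdown_table([("sty", "Study"), ("bsm", "Biosample")])
print(table)
```

Sorting by typecode keeps the generated table stable between documentation builds, which makes diffs easier to review.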
@SamuelPurvine reached out to me today to suggest that a table like the following be included in the schema documentation:
He sent it to me as a TSV file that he created by hand in collaboration with @kheal. Here are the TSV lines corresponding to the above table rows:
I think he and @kheal want there to be an easier way to gather this information than looking at the schema.
Here are my thoughts from a technical maintenance perspective:
I will defer to @SamuelPurvine and @kheal regarding the use cases they have in mind, and an explanation of the data in the table.
Note: GitHub's "Discussions" feature may be a better fit for this conversation than a GitHub Issue, at this point.