EBI-GOA to generate single GAF & GPAD & GPI files for each species #1521
Comments
To clarify: this only applies to GAFs where there is no MOD that assigns gene IDs. For a MOD like dictyBase, even though the GAFs are currently managed at EBI-GOA, we just want a traditional GAF; the quad distinction (isoform, complex, etc.) is only relevant when UniProt IDs are the primary IDs. This is the current dicty file, which is what we want
The GAFs that are here are UniProt ID-based: We shouldn't consume or promote the MOD ones (dicty, zfin, etc.), as this will lead to confusion. Of course, we still want the consolidated ones for human, cow, etc.
Updating this issue to clarify what the actual requirements are for human, etc.
The so-called isoform file should not be included, as it confuses things, messes up counts for enrichment, etc. This is in fact the contents of the files at https://ftp.ebi.ac.uk/pub/contrib/goa/grcp_plus_test/, as indicated in the header:
This is all good, but it is important to state the requirements / spec precisely in the issue. My previous point still stands: for dicty we want gene IDs.
Why don't we want annotations to isoforms? We are missing a lot of annotations because we exclude these.
Clarification: there are two ways people use 'isoforms'; curators mean splice variants, and UniProt means any entry with small differences. The files now generated by GOA only contain splice-variant-type isoforms.
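For anyone who wants to check how many annotations are at stake, here is a minimal sketch that counts isoform-level rows in a GAF. It assumes GAF 2.x, where the optional column 17 (Gene Product Form ID) carries the isoform accession; the function name `count_isoform_annotations` and the sample row are made up for illustration and are not part of any GOA tooling.

```python
def count_isoform_annotations(gaf_lines):
    """Return (total, isoform) annotation counts for an iterable of GAF 2.x lines.

    A row is counted as isoform-level when column 17 (Gene Product Form ID)
    is present and non-empty. Header/comment lines start with '!'.
    """
    total = 0
    isoform = 0
    for line in gaf_lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        total += 1
        # Column 17 is index 16; it is optional, so guard the length.
        if len(cols) > 16 and cols[16].strip():
            isoform += 1
    return total, isoform


def make_row(form_id=""):
    """Build a placeholder 17-column GAF row (all values are invented)."""
    cols = [
        "UniProtKB", "P00001", "GENE1", "", "GO:0005515", "PMID:0",
        "IPI", "", "F", "", "", "protein", "taxon:9606", "20200101",
        "UniProt", "", form_id,
    ]
    return "\t".join(cols) + "\n"


sample = [
    "!gaf-version: 2.2\n",
    make_row(),                       # gene-product-level annotation
    make_row("UniProtKB:P00001-2"),   # isoform-level annotation
]
print(count_isoform_annotations(sample))
```

To run it on a real file, pass an open file handle instead of `sample`, e.g. `with open(path) as fh: count_isoform_annotations(fh)`.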
This is replaced by #2341 (comment)
And for what it's worth / historical note, Bill Pearson strongly preferred the UniProt broad sense: any variant polypeptide encoded by a gene as a result of alternative splicing or alternative transcriptional start sites, but not, I think, variants due to any form of post-translational modification, including peptide bond cleavage.
Hello,
Right now we have 4 files for several species (human, chicken, cow, pig, dog), which was requested here #156 (with little explanation as to why we wanted that).
@alexsign will generate a single file for those species.
We will also need to update the GOA YAML file.
Thanks, Pascale