Show all evidence when more than one piece of evidence is associated with a standard annotation #49

Open
krchristie opened this issue Sep 16, 2024 · 8 comments
Labels: bug (Something isn't working), high priority

Comments

@krchristie

krchristie commented Sep 16, 2024

@LiNiMGI noticed that the Standard Annotations editor fails to show all evidence when there is more than one piece of evidence associated with an annotation. It is clearly not a problem with use of the IKR evidence code, as I was allowed to add a test annotation (to the term "RNA polymerase II activity") without any problem.

Compare the screenshots of the annotation to "sequence-specific DNA binding" (highlighted in both screenshots below):
In the Form Editor, this annotation is associated with two pieces of evidence:

  • IDA = direct assay evidence used in manual assertion
  • IKR = phylogenetic determination of loss of key residues evidence used in manual assertion

In the Standard Annotations editor, only ONE evidence line is associated:

  • IDA = direct assay evidence used in manual assertion

I also notice that the Standard Annotations editor does NOT have an option to allow the curator to add additional evidence to the same annotation.

Considering that Noctua has allowed, even encouraged, adding a second piece of evidence to the same annotation for years, I think the best solution would be to add this functionality to the Standard Annotations editor as well. This would certainly produce the best workflow for annotators, as it is quicker to add an additional piece of evidence to data that is already in the form than to re-enter all of the information in order to add additional evidence.

[Screenshot: SSDB-FormEditor]

[Screenshot: SSDB-StandardAnnotations]

@vanaukenk vanaukenk added the bug Something isn't working label Sep 16, 2024
@vanaukenk vanaukenk changed the title failure to show all evidence when more than one piece of evidence is associated Show all evidence when more than one piece of evidence is associated Sep 16, 2024
@vanaukenk vanaukenk changed the title Show all evidence when more than one piece of evidence is associated Show all evidence when more than one piece of evidence is associated with a standard annotation Sep 16, 2024
@thomaspd

I realize that we made the decision, in the initial Noctua standard annotation "conversion" and load, to merge multiple standard annotations to the same term into a single GO-CAM statement with multiple pieces of evidence. However, now that we're supporting standard annotations as a distinct type versus GO-CAM models, it would be much cleaner to store each standard annotation as a separate statement with one piece of evidence, rather than merging them. This way, all standard annotations would have a consistent structure and would be less demanding on the UI (which wouldn't have to handle different structures). To do this, we would want to make the conversion on the back-end so that the change wouldn't impact curators. I would suggest that we talk to @dustine32 and @kltm about this possibility, as one side effect would be that "gene-centric" files for highly annotated genes would likely get somewhat larger.
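
A minimal sketch of the split described above, assuming a simplified in-memory representation. The `Annotation` dataclass, its field names, and the gene ID are hypothetical and are not Noctua's actual data model. Each merged annotation with N pieces of evidence becomes N annotations with one piece each.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Annotation:
    # Hypothetical, simplified stand-in for a standard annotation.
    gene: str
    term: str
    evidence: Tuple[str, ...]  # evidence codes, e.g. ("IDA", "IKR")


def split_by_evidence(annotations: List[Annotation]) -> List[Annotation]:
    """Return one annotation per piece of evidence, preserving order."""
    result = []
    for ann in annotations:
        for code in ann.evidence:
            # Copy the annotation, keeping only this one evidence code.
            result.append(replace(ann, evidence=(code,)))
    return result


# "GENE:0001" is a placeholder ID used only for illustration.
merged = [Annotation("GENE:0001", "sequence-specific DNA binding", ("IDA", "IKR"))]
for ann in split_by_evidence(merged):
    print(ann.gene, ann.term, ann.evidence[0])
```

Annotations that already carry a single piece of evidence pass through unchanged, so the same function could be run over every standard annotation to get a uniform structure.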

@vanaukenk

vanaukenk commented Oct 29, 2024

Thanks @thomaspd
Yes, that is exactly the discussion we need to have with @dustine32 @kltm @tmushayahama @balhoff
Splitting the standard annotations out to have a single piece of evidence would definitely help; we just need to think through exactly how we would implement this, since I don't think we want to do the same for evidence on the GO-CAMs.

@dustine32

We can isolate only the MOD import models via the dc:dateAccepted property. This property was explicitly added to note import date and should only be present on the MOD import models. Ex:

<http://model.geneontology.org/MGI_MGI_99702> <http://purl.org/dc/terms/dateAccepted> "2022-03-07" ;

Only other issues I can initially think of:

  1. Obviously this would inflate the model TTL file sizes, and I believe the existing growth has already been affecting the GO pipeline by occasionally timing out on noctua-models repo checkouts. @kltm could confirm that.
  2. We would need to mint new individual instance IDs when duplicating nodes for entities and terms. Maybe we can get away with just adding digits (e.g., -0001, -0002) to existing IDs?
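
The suffix idea in point 2 could be sketched roughly as follows. This is only an illustration: the function name, the `existing` collision check, and the example ID are assumptions, not part of any current tooling.

```python
from typing import FrozenSet, List


def mint_instance_ids(base_id: str, count: int,
                      existing: FrozenSet[str] = frozenset(),
                      width: int = 4) -> List[str]:
    """Derive `count` new IDs from `base_id` by appending -0001, -0002, ...

    Candidates already present in `existing` are skipped so that a
    re-run does not reuse an ID that was minted earlier.
    """
    minted = []
    n = 1
    while len(minted) < count:
        candidate = f"{base_id}-{n:0{width}d}"
        if candidate not in existing:
            minted.append(candidate)
        n += 1
    return minted


# "example-individual" is a placeholder, not a real instance ID.
print(mint_instance_ids("example-individual", 2))
# ['example-individual-0001', 'example-individual-0002']
```

Checking against the set of IDs already in the model keeps the scheme safe if the conversion is ever run more than once on the same model.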

@vanaukenk

From 2024-11-14 workbenches call:

Query the data store to figure out how many 'annotations' have multiple pieces of evidence so we can get a handle on the scope of the problem.

We are particularly interested in the imported models and could use the dc:dateAccepted property:

We can isolate only the MOD import models via the dc:dateAccepted property. This property was explicitly added to note import date and should only be present on the MOD import models. Ex:

<http://model.geneontology.org/MGI_MGI_99702> <http://purl.org/dc/terms/dateAccepted> "2022-03-07" ;

@balhoff @kltm

@vanaukenk

First reports:
There are >7500 models with nodes connected to more than one piece of evidence:
approx. half MGI, half SGD and a smattering of WB

@balhoff

balhoff commented Nov 14, 2024

@dustine32 here is the query I ran. Does that look like the right check?

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX lego: <http://geneontology.org/lego/>
SELECT DISTINCT ?model 
WHERE {
  ?model a owl:Ontology .
  FILTER EXISTS {
    ?model <http://purl.org/dc/terms/dateAccepted> ?accepted .
  }
  FILTER EXISTS {
    GRAPH ?model {
      ?ann lego:evidence ?evidence1 .
      ?ann lego:evidence ?evidence2 .
      FILTER(?evidence1 != ?evidence2)
    }
  }
}
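
For spot-checking, the logic of the query can be mirrored in plain Python over in-memory data. This is a sketch under assumptions: the `(model, subject, predicate, object)` tuple layout and the sample values are made up for illustration, and the real store keeps the `dateAccepted` triple outside the named graph rather than in the same tuple stream.

```python
from collections import defaultdict

DATE_ACCEPTED = "http://purl.org/dc/terms/dateAccepted"
EVIDENCE = "http://geneontology.org/lego/evidence"


def models_with_multi_evidence(quads):
    """quads: iterable of (model, subject, predicate, object) tuples.

    Returns the sorted models that have a dateAccepted value and at least
    one node linked to two or more distinct evidence nodes.
    """
    imported = set()
    evidence = defaultdict(set)  # (model, annotated node) -> evidence nodes
    for model, s, p, o in quads:
        if p == DATE_ACCEPTED and s == model:
            imported.add(model)
        elif p == EVIDENCE:
            evidence[(model, s)].add(o)
    return sorted({m for (m, _), ev in evidence.items()
                   if m in imported and len(ev) > 1})


# Hypothetical sample data: only m1 is both imported and multi-evidence.
sample = [
    ("m1", "m1", DATE_ACCEPTED, "2022-03-07"),
    ("m1", "a1", EVIDENCE, "e1"),
    ("m1", "a1", EVIDENCE, "e2"),
    ("m2", "m2", DATE_ACCEPTED, "2022-03-07"),
    ("m2", "a2", EVIDENCE, "e3"),
    ("m3", "a3", EVIDENCE, "e4"),
    ("m3", "a3", EVIDENCE, "e5"),
]
print(models_with_multi_evidence(sample))  # ['m1']
```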

@dustine32

@balhoff Yup! This looks about right. I'd be confident with the results after spot-checking a couple of them.

@balhoff

balhoff commented Nov 15, 2024

@dustine32 here are some:

<http://model.geneontology.org/SGD_S000002792>
<http://model.geneontology.org/SGD_S000005943>
<http://model.geneontology.org/SGD_S000006292>
<http://model.geneontology.org/MGI_MGI_2446237>
<http://model.geneontology.org/SGD_S000006286>
<http://model.geneontology.org/MGI_MGI_107555>
<http://model.geneontology.org/SGD_S000004491>

I spot-checked the first four.
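
A breakdown by source, like the MGI/SGD/WB split reported earlier, can be tallied from the model IRIs themselves. The sketch below assumes that the part of the local name before the first underscore identifies the source, which matches the IRIs listed above but is not a documented guarantee; the function name is made up.

```python
from collections import Counter

MODEL_PREFIX = "http://model.geneontology.org/"


def count_by_source(model_iris):
    """Count models per source, taken from the local name before the first '_'."""
    counts = Counter()
    for iri in model_iris:
        local = iri.strip("<>")
        if local.startswith(MODEL_PREFIX):
            local = local[len(MODEL_PREFIX):]
        counts[local.split("_", 1)[0]] += 1
    return counts


iris = [
    "<http://model.geneontology.org/SGD_S000002792>",
    "<http://model.geneontology.org/SGD_S000005943>",
    "<http://model.geneontology.org/SGD_S000006292>",
    "<http://model.geneontology.org/MGI_MGI_2446237>",
    "<http://model.geneontology.org/SGD_S000006286>",
    "<http://model.geneontology.org/MGI_MGI_107555>",
    "<http://model.geneontology.org/SGD_S000004491>",
]
print(count_by_source(iris).most_common())  # [('SGD', 5), ('MGI', 2)]
```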
