- id: GO_REF:0000001
title: OBSOLETE GO Consortium unpublished data
description: No abstract available.
authors: GO curators
is_obsolete: true
year: 1998
- id: GO_REF:0000002
title: Gene Ontology annotation through association of InterPro records with GO
terms.
description: |-
InterPro (http://www.ebi.ac.uk/interpro/) is an integrated resource from the EBI of protein families, domains and sites which are combined from a number
of different protein signature databases, including CATH-Gene3D, CDD, HAMAP, NCBIfam, Panther, Pfam, PIRSF, PRINTS, ProSite, SMART, and SUPERFAMILY. Signatures describing the same protein
family or domain are grouped into unique InterPro entries. When appropriate, the InterPro team maps InterPro entries to GO terms (Molecular Function, Biological Process and/or Cellular Component).
InterPro runs InterProScan on the entire set of UniProt proteins. InterPro hits are assigned the corresponding GO annotations by the GOA pipeline. The mapping file is available at http://current.geneontology.org/ontology/external2go/interpro2go.
comments:
- Note that some groups filter GO annotations based on InterPro-to-GO transitive assignment, e.g. to remove annotations redundant with manual curation.
alt_id:
- GO_REF:0000007
- GO_REF:0000014
- GO_REF:0000016
- GO_REF:0000017
authors: GO Central curators, 2024.
external_accession:
- MGI:2152098
- J:72247
- ZFIN:ZDB-PUB-020724-1
- FB:FBrf0174215
- dictyBase_REF:10157
- SGD_REF:S000124036
is_obsolete: false
year: 2001
- id: GO_REF:0000003
title: Gene Ontology annotation based on Enzyme Commission mapping
description: |-
In UniProt, proteins with enzymatic activities have traditionally been annotated using reference vocabularies such as the hierarchical enzyme classification of the Enzyme Nomenclature Committee of the IUBMB
(often referred to as Enzyme Commission or EC numbers; https://enzyme.expasy.org/). EC numbers are manually curated in Swiss-Prot ('UniProt Reviewed') entries and added automatically to
TrEMBL ('UniProt Unreviewed') entries. A mapping between EC numbers and GO Molecular Function terms is maintained by GO editors. UniProt entries assigned an EC number
that has a GO mapping are automatically annotated with the corresponding GO term, as described in PMID:30395331. GO annotations using this method receive the evidence code
Inferred from Electronic Annotation (IEA). This method has been evaluated as up to 100% accurate (Camon et al. 2005). The EC2GO mapping file is available at
http://current.geneontology.org/ontology/external2go/ec2go.
alt_id:
- GO_REF:0000005
authors: GO Central curators, 2024
citation: PMID:31688925
is_obsolete: false
year: 2001
- id: GO_REF:0000004
title: Gene Ontology annotation based on UniProtKB keyword mapping.
description: |-
Transitive assignments using UniProtKB keywords. The UniProtKB keyword
controlled vocabulary has been created and used by the UniProt Knowledgebase
(UniProtKB) to supply 10 different categories of information to UniProtKB entries.
Further information on the UniProtKB keyword resource can be found at
http://www.uniprot.org/docs/keywlist.
Further information on the UniProt annotation methods is available at
https://www.uniprot.org/help/manual_curation and
https://www.uniprot.org/help/automatic_annotation.
When a UniProtKB keyword describes a concept that is within the scope of the Gene
Ontology, it is investigated to determine whether it is appropriate to map the
keyword to an equivalent term in GO. The mapping between UniProtKB keywords and
GO terms is carried out manually. Definitions and hierarchies of the terms in the
two resources are compared and the mapping generated will reflect the most correct
correspondence. The translation table between GO terms and UniProtKB keywords is
maintained by the UniProt-GOA team and available at
http://www.geneontology.org/external2go/uniprotkb_kw2go.
comments:
- Formerly GOA:spkw.
alt_id:
- GO_REF:0000009
- GO_REF:0000013
authors: GOA curators
external_accession:
- MGI:1354194
- J:60000
- ZFIN:ZDB-PUB-020723-1
- SGD_REF:S000124038
is_obsolete: false
year: 2000
- id: GO_REF:0000006
title: OBSOLETE Gene Ontology annotation by the MGI curatorial staff, Mouse Locus
Catalog
description: |-
For annotations documented via this citation, curators used the information in
the Mouse Locus Catalog in MGI to assign GO terms. The GO terms were assigned
based on MLC textual descriptions of genes that could not be traced to the primary
literature. Details of this strategy can be found in Hill et al, Genomics (2001)
74:121-128.
authors: Mouse Genome Informatics scientific curators
citation: PMID:11374909
external_accession:
- MGI:2152097
- J:72246
is_obsolete: true
year: 2001
- id: GO_REF:0000008
title: Gene Ontology annotation by the MGI curatorial staff, curated orthology
description: |-
The sequence conservation that permits the establishment of orthology between
mouse and rat or mouse and human genes is a strong predictor of the conservation
of function for the gene product across these species. Therefore, in instances
where a mouse gene product has not been functionally characterized, but its human
or rat orthologs have, Mouse Genome Informatics (MGI) curators append the GO terms
associated with the orthologous gene(s) to the mouse gene. Only those GO terms
assigned by experimental determination to the ortholog of the mouse gene will
be adopted by MGI. GO terms that are assigned to the ortholog of the mouse gene
computationally (i.e. IEA), will not be transferred to the mouse ortholog. The
evidence code represented by this citation is Inferred by Sequence Orthology (ISO).
authors: Mouse Genome Informatics scientific curators
external_accession:
- MGI:2154458
- J:73065
is_obsolete: false
year: 2001
- id: GO_REF:0000010
title: OBSOLETE Gene Ontology annotation by the MGI curatorial staff, mouse gene
nomenclature
description: |-
For annotations documented via this citation, curators designed queries based
on their knowledge of mouse gene nomenclature to group genes that shared common
molecular functions, biological processes or cellular components. GO annotations
were assigned to these genes in groups. Details of this strategy can be found
in Hill et al., Genomics (2001) 74:121-128.
authors: Mouse Genome Informatics scientific curators
citation: PMID:11374909
external_accession:
- MGI:1347124
- J:56000
is_obsolete: true
year: 1999
- id: GO_REF:0000011
title: Hidden Markov Models (TIGR)
description: |-
A Hidden Markov Model (HMM) is a statistical representation of patterns found
in a data set. When using HMMs with proteins, the HMM is a statistical model of
the patterns of the amino acids found in a multiple alignment of a set of proteins
called the "seed". Seed proteins are chosen based on sequence similarity to each
other. Seed members can be chosen with different levels of relationship to each
other. They can be members of a superfamily (ex. ABC transporter, ATP-binding
proteins), they can all share the same exact specific function (ex. biotin synthase)
or they could share another type of relationship of intermediate specificity (ex.
subfamily, domain). New proteins can be scored against the model generated from
the seed according to how closely the patterns of amino acids in the new proteins
match those in the seed. There are two scores assigned to the HMM which allow
annotators to judge how well any new protein scores to the model. Proteins scoring
above the "trusted cutoff" score can be assumed to be part of the group defined
by the seed. Proteins scoring below the "noise cutoff" score can be assumed to
NOT be a part of the group. Proteins scoring between the trusted and noise cutoffs
may or may not be part of the group. One of the important features of HMMs is
that they are built from a multiple alignment of protein sequences, not a pairwise
alignment. This is significant, since shared similarity between many proteins
is much more likely to indicate shared functional relationship than sequence similarity
between just two proteins. The usefulness of an HMM is directly related to the
amount of care that is taken in choosing the seed members, building a good multiple
alignment of the seed members, assessing the level of specificity of the model,
and choosing the cutoff scores correctly. In order to properly assess what functional
relevance an above-trusted scoring HMM match has to a query, one must carefully
determine what the functional scope of the HMM is. If the HMM models proteins
that all share the same function then it is likely possible to assign a specific
function to high-scoring match proteins based on the HMM. If the HMM models proteins
that have a wide variety of functions, then it will not be possible to assign
a specific function to the query based on the HMM match; however, depending on
the nature of the HMM in question, it may be possible to assign a more general
(family or subfamily level) function. In order to determine the functional scope
of an HMM, one must carefully read the documentation associated with the HMM.
The annotator must also consider whether the function attributed to the proteins
in the HMM makes sense for the query based on what is known about the organism
in which the query protein resides and in light of any other information that
might be available about the query protein. After carefully considering all of
these issues the annotator makes an annotation.
authors: Michelle Gwinn, TIGR curators
is_obsolete: false
year: 2003
- id: GO_REF:0000012
title: Pairwise alignment (TIGR)
description: |-
Pairwise alignments are generated by taking two sequences and aligning them so
that the maximum number of amino acids in each protein match, or are similar to,
each other. Tools such as BLAST work by comparing a protein-of-interest individually
with every protein in a database of known protein sequences and retaining only
those matches with a high probability of being significant. Basic BLAST generates
local alignments between proteins for regions of high similarity. Other pairwise
alignment tools attempt to generate global (full-length) protein alignments. A
tool called Blast_Extend_repraze (BER, http://ber.sourceforge.net) has some benefits
over basic BLAST. Input into the BER tool includes the underlying DNA sequence
for each protein as well as 300 nucleotides upstream and downstream of the predicted
boundaries of the protein coding sequence. This allows annotators to see the DNA
sequence that underlies the query protein as part of the alignment. In addition,
the BER tool is able to look for continuation of regions of similarity through
frameshifts and in-frame stop codons. If such regions are found the alignment
is continued. BER searches are done in a two-step process: step one is a BLAST
search against a non-redundant protein database, with significant BLAST hits stored
in a mini-database for each query protein; step two is a modified Smith-Waterman
alignment between the query and the proteins in its mini-database. In order to
assess whether a given BER alignment is good enough to assert that the query shares
the function of the match protein, one must look at several factors. First of
all, the match protein must itself be experimentally characterized in order to
avoid transitive annotation errors. In addition, any residues or secondary structures
known to be important for function in the match protein must be conserved in the
query. The alignment should be visually inspected to look for any areas of lesser
quality that might indicate the two proteins do not share the same function. Although
it is impossible to set cutoff values for percent identity and length of match
that will apply for every alignment, there are some guidelines. In general at
least 40% identity that extends over the full lengths of both proteins is required
in order to even consider functional equivalence. However, this percentage is
highly dependent on the length and complexity of the proteins. 40% identity between
two proteins 500 amino acids long is much more significant than 40% identity between
two proteins that are only 100 amino acids long. Therefore, the annotator's experience
and knowledge of what is considered significant for the organism and protein family
in question is very important. Some sets of proteins are much more highly conserved
than others and therefore tolerances for percent identity may have to be adjusted.
Finally, the alignment must be considered in the context of what else is known
about the query protein and the organism as a whole.
authors: Michelle Gwinn, TIGR curators
is_obsolete: false
year: 2003
- id: GO_REF:0000015
title: Use of the ND evidence code for Gene Ontology (GO) terms.
description: |-
Direct annotations to any of the three root terms 'molecular function; GO:0003674',
'biological process; GO:0008150' or 'cellular component; GO:0005575' indicate
that curators have found no data supporting an annotation to a more specific term,
either in the literature and/or by sequence similarity for this gene or protein
as of the date of the annotation.
authors: GO Curators
external_accession:
- ECO:0000307
- AspGD_REF:ASPL0000111607
- CGD_REF:CAL0125086
- dictyBase_REF:2
- dictyBase_REF:9851
- FB:FBrf0159398
- MGI:MGI:2156816
- RGD:1598407
- SGD_REF:S000069584
- TAIR:Communication:1345790
- ZFIN:ZDB-PUB-031118-1
- GO_REF:nd
is_obsolete: false
year: 2002
- id: GO_REF:0000018
title: OBSOLETE dictyBase 'Inferred from Electronic Annotation (BLAST method)'
description: |-
Gene Ontology (GO) annotations with the evidence code 'Inferred from Electronic
Annotation' (IEA) are assigned automatically to gene products in dictyBase. All
Dictyostelium protein sequences are analyzed by BLAST against GO gene association
sequence files, identifying proteins in other organisms that align with Dictyostelium
proteins with an E value less than or equal to e-50. GO annotations that have
been manually assigned to these proteins from other species are then imported
and attached to the corresponding gene product in dictyBase. The proteins from
which the annotations are derived are displayed in the 'Evidence' column on the
Gene Ontology evidence and references page.
authors: DictyBase curators
external_accession:
- dictyBase_REF:10158
is_obsolete: true
year: 2005
- id: GO_REF:0000019
title: OBSOLETE Automatic transfer of experimentally verified manual GO annotation
data to orthologs using Ensembl Compara
description: |-
GO terms from a source species are projected on to one or more target species
based on gene orthology obtained from the Ensembl Compara system. Only one to
one and apparent one to one orthologies are used for a restricted range of species.
Only GO annotations with a manual experimental evidence type of IDA, IEP, IGI,
IMP or IPI are projected. Projected GO annotations using this technique will receive
the evidence code, inferred from electronic annotation, 'IEA'. The Ensembl protein
identifier of the annotation source is indicated in the 'With' column of the GOA
association file.
authors: Ensembl curators, GOA curators
is_obsolete: true
year: 2006
- id: GO_REF:0000020
title: OBSOLETE Electronic Gene Ontology annotations created by transferring manual
GO annotations between orthologous microbial proteins
description: |-
GO terms are manually assigned to each HAMAP family rule. High-quality Automated
and Manual Annotation of microbial Proteins (HAMAP) family rules are a collection
of orthologous microbial protein families, from bacteria, archaea and plastids,
generated manually by expert curators. The assigned GO terms are then transferred
to all the proteins that belong to each HAMAP family. Only GO terms from the molecular
function and biological process ontologies are assigned. GO annotations using
this technique will receive the evidence code Inferred from Electronic Annotation
(IEA). These annotations are updated monthly by HAMAP and are available for download
on both GO and GOA EBI ftp sites. To report an annotation error or inconsistency,
or for further information, please contact the GO Consortium at [email protected]
or submit a comment to the SourceForge Annotation Issues tracker
(http://sourceforge.net/projects/geneontology/). HAMAP is a project based at the Swiss
Institute of Bioinformatics (Gattiker et al. 2003, Comp. Biol and Chem. 27: 49-58).
For further information, please see http://www.expasy.org/sprot/hamap/.
authors: Swiss Institute of Bioinformatics (SIB) curators, GOA curators
is_obsolete: true
year: 2006
- id: GO_REF:0000021
title: Improving the representation of central nervous system development in the
biological process ontology
description: |-
Current genetic and molecular studies in many model organisms are aimed at understanding
formation and development of the nervous system. Up until this point, the GO has
had a very shallow representation of processes pertaining to the nervous system.
In June 2006, curators from MGI and ZFIN met with researchers studying central
nervous system development to improve the representation of these processes in
GO. In particular, emphasis was placed on three areas that are being addressed
actively in current research: forebrain development, hindbrain development and
neural tube development. This collaboration resulted in the addition of over 500
terms that reflect the development of the forebrain, the hindbrain, and the neural
tube from the perspective of biological process and anatomical structure.
authors: Judith Blake (1, 2), William Bug (3), Rex Chisholm (1, 4), Jennifer Clark (1,
5), Erika Feltrin (6), Jacqueline Finger (2), David Hill (1, 2), Midori Harris
(1, 5), Terry Hayamizu (2), Doug Howe (9), Maryanne Martone (7), Kathleen Millen
(8), Francis Sele (4) (1. The Gene Ontology Consortium, 2. Mouse Genome Informatics,
Bar Harbor, ME, 3. Drexel University, Philadelphia, PA, 4. Northwestern University,
Chicago, IL, 5. EMBL-EBI, Hinxton, Cambridgeshire, UK, 6. The University of Padua,
Padua, Italy, 7. The University of California at San Diego, San Diego, CA, 8.
The University of Chicago, Chicago, IL, 9. The Zebrafish Information Network,
University of Oregon, Eugene, OR)
is_obsolete: false
year: 2006
- id: GO_REF:0000022
title: Improving the representation of immunology in the biological process Ontology
description: |-
GO terms describing processes, functions, and cellular components related to the
immune system have existed in the GO from its beginning and been used extensively
in the annotation of gene products. However, particularly in the biological process
ontology, the initial set of terms relating to immunology failed to cover the
breadth of known immunological processes, and in many cases diverged from current
usage and understanding in their names, definitions, and ontological placement.
As part of a larger effort to improve the representation of immunology in the
GO, a GO Content Meeting was held November 15-16, 2005, at The Institute for Genomic
Research, to discuss improvements to representation of immunology in the biological
process ontology of the GO. As a result of the meeting, a number of high level
terms for immunological processes were created, an overall structure for immunologically
related terms was established, and certain existing terms were renamed or redefined
as well to bring them in line with current usage.
authors: Alison Deckhut Augustine (1), Alan Collmer (2), Judith A. Blake (3, 4), Candace
W. Collmer (2, 3), Shane C. Burgess (5), Lindsay Grey Cowell (6), Jennifer I.
Clark (3, 7), Bernard de Bono (7), Russell T. Collins (8), Alexander D. Diehl
(3, 4), Michelle Gwinn Giglio (3, 9), Jamie A. Lee (10), Linda Hannick (3, 9),
Jane Lomax (3, 7), Midori A. Harris (3, 7), Christopher J. Mungall (3, 11), David
P. Hill (3, 4), Richard H. Scheuermann (10), Amelia Ireland (3, 7), Alessandro
Sette (12) (1. NIAID, 2. Cornell University, 3. The GO Consortium, 4. Mouse Genome
Informatics, 5. Mississippi State University, 6. Duke University, 7. EMBL-EBI,
8. University of Cambridge, 9. The Institute for Genomic Research, 10. U.T. Southwestern
Medical Center, 11. HHMI, 12. La Jolla Institute for Allergy and Immunology)
is_obsolete: false
year: 2005
- id: GO_REF:0000023
title: Gene Ontology annotation based on UniProtKB Subcellular Location vocabulary
mapping.
description: |-
Transitive assignment of GO terms based on the UniProtKB Subcellular Location
vocabulary. UniProtKB Subcellular Location is a controlled vocabulary used to
supply subcellular location information to UniProtKB entries in the SUBCELLULAR
LOCATION lines. Terms from this vocabulary are annotated manually to UniProtKB/Swiss-Prot
entries but are automatically assigned to UniProtKB/TrEMBL entries from the underlying
nucleic acid databases and/or by the UniProt automatic annotation program.
Further information on these two different annotation methods is available at
http://www.uniprot.org/faq/45 and
http://www.uniprot.org/program/automatic_annotation.
When a UniProtKB Subcellular Location term describes a concept that is within
the scope of the Gene Ontology, it is investigated to determine whether it is
appropriate to map the term to an equivalent term in GO. The mapping between UniProtKB
Subcellular Location terms and GO terms is carried out manually. Definitions and
hierarchies of the terms in the two resources are compared and the mapping generated
will reflect the most correct correspondence. The translation table between GO
terms and UniProtKB Subcellular Location terms is maintained by the UniProt-GOA
team and available at http://www.geneontology.org/external2go/spsl2go.
authors: GOA curators, UniProt curators
external_accession:
- SGD_REF:S000125578
is_obsolete: true
year: 2007
- id: GO_REF:0000024
title: Manual transfer of experimentally-verified manual GO annotation data to orthologs
by curator judgment of sequence similarity.
description: |-
Method for transferring manual annotations to an entry based on a curator's judgment
of its similarity to a putative ortholog that has annotations that are supported
with experimental evidence. Annotations are created when a curator judges that
the sequence of a protein shows high similarity to another protein that has annotation(s)
supported by experimental evidence (and therefore display one of the evidence
codes EXP, IDA, IGI, IMP, IPI or IEP). Annotations resulting from the transfer
of GO terms display the 'ISS' evidence code and include an accession for the protein
from which the annotation was projected in the 'with' field (column 8). This field
can contain either a UniProtKB accession or an IPI (International Protein Index)
identifier. Only annotations with an experimental evidence code and which do not
have the 'NOT' qualifier are transferred. Putative orthologs are chosen using
information combined from a variety of complementary sources. Potential orthologs
are initially identified using sequence similarity search programs such as BLAST.
Orthology relationships are then verified manually using a combination of resources
including sequence analysis tools, phylogenetic and comparative genomics databases
such as Ensembl Compara, INPARANOID and OrthoMCL, as well as other specialised
databases such as species-specific collections (e.g. HGNC's HCOP). In all cases
curators check each alignment and use their experience to assess whether similarity
is considered to be strong enough to infer that the two proteins have a common
function so that they can confidently project an annotation. While there is no
fixed cut-off point in percentage sequence similarity, generally proteins which
have greater than 30% identity that covers greater than 80% of the length of both
proteins are examined further. For mammalian proteins this cut-off tends to be
higher, with an average of 80% identity over 90% of the length of both proteins.
Strict orthologs are desirable but not essential. In general, when there is evidence
of multiple paralogs for a single species, annotations using less specific GO
terms are transferred to the paralogs; however, annotations using more specific
GO terms may be transferred to the most similar paralog in each species. This
decision is taken on a case-by-case basis and may be influenced by statements
by researchers in the field. Further detailed information on this procedure, including
how ISS annotations are made to protein isoforms, can be found at:
https://wiki.geneontology.org/Inferred_from_Sequence_or_structural_Similarity_(ISS).
authors: AgBase, BHF-UCL, Parkinson's UK-UCL, dictyBase, HGNC, Roslin Institute,
FlyBase and UniProtKB curators.
external_accession:
- dictyBase_REF:9
- J:342587
- FB:FBrf0255270
is_obsolete: false
year: 2011
- id: GO_REF:0000025
title: Operon structure as IGC evidence
description: |-
Genes in prokaryotic organisms are often arranged in operons. Genes in an operon
are all transcribed into one mRNA. Generally the genes in the operons code for
proteins that all have related functions. For example, they may be the steps in
a biochemical pathway, or they may be the subunits of a protein complex. Often
the genes in operons shared between organisms are syntenic; that is, the same
genes are in the same order in the operon in different species. When assessing
sequence-comparison-based evidence during the process of manual annotation of
a genome, it is often the case that some of the genes in the operon will have
strong sequence-based evidence while others will have weak evidence. If seen alone,
not in the presence of an operon, the weak evidence in question may not be sufficient
to make a functional annotation. However, in the presence of an operon in which
there is strong evidence for some of the genes, the very presence of the gene
in the operon is a strong indication that the gene shares in the process carried
out by the operon. If the putative function is one expected to exist for the process
in question and particularly if that function has been observed in the same operon
in another species, then the annotation should be made. This type of evidence
is inferred from the context of the gene in an operon, and therefore the evidence
code is IGC "inferred from genomic context."
authors: Michelle Gwinn, TIGR curators
is_obsolete: false
year: 2007
- id: GO_REF:0000026
title: OBSOLETE Improving the representation of muscle biology in the biological
process and cellular component ontologies.
description: |-
A meeting focused on the biology of skeletal and smooth muscle was held on
24-25 July 2007 at the University of Padua, Italy, as a collaboration between the
GO consortium and CRIBI Biotechnology Center. The aims of this effort were to
provide a comprehensive representation of muscle biology in the biological process
and cellular component ontologies and to improve the organization of muscle-specific
terms to better describe the current knowledge of biological mechanisms in muscle
tissue. Thus, the collaboration brought together experts in several areas of muscle
biology and physiology who carried out a thorough review of the existing GO muscle
terms as these terms were largely created by non-muscle experts using older definitions.
In particular, several areas are being addressed actively in current research:
the biological processes of muscle contraction, muscle plasticity, muscle development,
and muscle regeneration; and the sarcoplasmic reticulum and membrane delimited
compartments. This work resulted in the addition of 159 new terms and in the modification
of 57 terms to bring them in line with current usage. Funding for the meeting
was provided by Italian Telethon Foundation.
authors: Jennifer Deegan nee Clark (1, 5), Alexander D. Diehl (1,7), Elisabeth Ehler (2),
Georgine Faulkner (3), Erika Feltrin (4), Jennifer Fordham (2), Midori Harris
(1, 5), Ralph Knoell (6), David Hill (1, 7), Paolo Laveder (8), Alessandra Nori
(8), Carlo Reggiani (8), Vincenzo Sorrentino (9), Giorgio Valle (4), Pompeo Volpe
(8) (1. The Gene Ontology Consortium, 2. King's College, London, UK, 3. ICGEB,
Trieste, Italy, 4. CRIBI - University of Padua, Padua, Italy 5. EMBL-EBI, Hinxton,
Cambridgeshire, UK, 6. University of Goettingen, Goettingen, Germany 7. Mouse
Genome Informatics, Bar Harbor, ME, 8. University of Padua, Padua, Italy, 9. University
of Siena, Siena, Italy)
citation: PMID:19178689
is_obsolete: true
year: 2007
- id: GO_REF:0000027
title: BLAST search criteria for ISS assignment in PAMGO_GAT
description: |-
This GO reference describes the criteria used in assigning the evidence code of
ISS via BLAST searches to annotate gene products from PAMGO_GAT. Standard BLASTP
from NCBI was used (http://www.ncbi.nih.gov/blast) to query the non-redundant
(NR) database. Hits are considered significant if the E-value is less than or
equal to 10^-4. All other parameters are default according to http://www.ncbi.nih.gov/blast.
authors: PAMGO_GAT curators
year: 2007
is_obsolete: false
- id: GO_REF:0000028
title: Criteria for IDA, IEP, ISS, IGC, RCA, and IEA assignment in PAMGO_MGG
description: |-
This GO reference describes the criteria used in assigning the evidence codes
of IDA (ECO:0000314), IEP (ECO:0000270), ISS (ECO:0000250), IGC (ECO:0000317),
RCA (ECO:0000245) and IEA (ECO:0000501) to annotate gene products from PAMGO_MGG.
Standard BLASTP from NCBI was used (http://www.ncbi.nih.gov/blast) to iteratively
search reciprocal best hits and thus identify orthologs between predicted proteins
of Magnaporthe grisea and GO proteins from multiple organisms with published association
to GO terms. The alignments were manually reviewed for those hits with e-value
equal to zero and with 80% or better coverage of both query and subject sequences,
and for those hits with e<=10^-20, pid >=35 and sequence coverage >=80%. Furthermore,
experimental or reviewed data from literature and other sources were incorporated
into the GO annotation. IDA was assigned to an annotation if normal function of
its gene was determined through transfections into a cell line and overexpression.
IEP was assigned to an annotation if, according to microarray experiments, its
gene was upregulated in a biological process and the fold change was greater than
or equal to 10, or if, according to Massively Parallel Signature Sequencing
(MPSS), its gene was upregulated only in a certain biological process and the
fold change was greater than or equal to 10. ISS was assigned to an annotation
if the entry in the With column was experimentally characterized and the pairwise
alignments were manually reviewed. IGC was assigned to an annotation if it was based
on comparison and analysis of gene location and structure, clustering of genes,
and phylogenetic reconstruction of these genes. RCA was assigned to an annotation
if it was based on integrated computational analysis of whole genome microarray data,
and matches to InterPro, Pfam, COG, etc. IEA was assigned to an annotation
if its function assignment was based on computational work and no manual review was
done.
authors: PAMGO_MGG curators
is_obsolete: false
year: 2008
- id: GO_REF:0000029
title: OBSOLETE Gene Ontology annotation based on information extracted from curated
UniProtKB entries
description: |-
Active 2001-2007.
Method by which GO terms were manually assigned to UniProt KnowledgeBase accessions,
using either a NAS or TAS evidence code, by applying information extracted from
the corresponding publicly-available, manually curated UniProtKB entry. Such GO
annotations were submitted by the GOA-UniProt group from 2001, but this annotation
practice was discontinued in 2007.
authors: GOA-UniProt curators
is_obsolete: true
year: 2007
- id: GO_REF:0000030
title: OBSOLETE Portable Annotation Rules
description: |-
The JCVI is developing a collection of mixed-evidence annotation rules, under
the working name BrainGrab/RuleBase (BGRB). A rule has two parts. The first is
the set of conditions that must be met for the rule to fire. The second is the
set of actions to be taken for rules that have fired. BGRB rules are designed to
serve as proxies for the annotators that create them. They have very high fidelity
but may have low coverage. Types of evidence used in combination include HMM hits
and BLAST matches, hits to neighboring genes, pathway reconstruction reports from
the Genome Properties system, and species taxonomy. BLAST matches are described
by a number of separate parameters for raw score, percent sequence identity, and
coverage of total sequence length by the match region. These parameters are customized
for each protein family in order to achieve high fidelity in automated annotation
systems. The flexible syntax makes it possible to use existing protein family
classifiers, such as Pfam and TIGRFAMs HMMs, in new ways. It is especially useful
in assigning GO terms to proteins such as SelD (selenide, water dikinase) that
have different roles in different contexts.
authors: Daniel Haft, JCVI
year: 2008
is_obsolete: true
- id: GO_REF:0000031
title: OBSOLETE NIAID Cell Ontology Workshop
description: |-
The NIAID sponsored a Cell Ontology Workshop, May 13-14, 2008, in Bethesda, focusing
on improving representation of immune cell types in the Cell Ontology. The participants
in the workshop worked together to extend the current ontology in the area of
immune cell types and to provide the necessary information for the upcoming restructuring
of the Cell Ontology in single-inheritance form with genus-differentia definitions.
authors: Alexander D. Diehl, Alison Deckhut Augustine, Judith A. Blake, Lindsay G. Cowell,
Elizabeth S. Gold, Timothy A. Gondre-Lewis, Anna Maria Masci, Terrence F. Meehan,
Penelope A. Morel, Anastasia Nijnik, Bjoern Peters, Bali Pulendran, Richard H.
Scheuermann, Q. Alison Yao, Martin S. Zand, Christopher J. Mungall
url: http://www.bioontology.org/wiki/index.php/NIAID_Cell_Ontology_Workshop_May_2008
year: 2008
is_obsolete: true
- id: GO_REF:0000032
title: OBSOLETE Inference of Biological Process annotations from inter-ontology
links
description: |-
We use the GOBO library to propagate annotations from Molecular Function to Biological
Process. This results in both increased numbers of annotations, and increased
consistency between curators.
Duplicate of GO_REF:0000108.
authors: Christopher J. Mungall, Tanya Z. Berardini, David P. Hill
is_obsolete: true
url: http://wiki.geneontology.org/index.php/GAF_Inference
- id: GO_REF:0000033
title: Annotation inferences using phylogenetic trees
description: |-
The Phylogenetic ANnotation using Gene Ontology (PAN-GO) method annotates evolutionary
trees from the PANTHER database with GO terms describing molecular function, biological
process and cellular component. The GO terms are manually selected by a curator
and used to annotate ancestral genes in the phylogenetic tree using the evidence
code IBA (Inferred from Biological Ancestor). All supporting annotations must
be based on experimental data from the scientific literature. The PAN-GO annotations
are fully traceable from the data in the 'with/from' column of the annotation,
which provides the PANTHER node ID (PTN) from which the annotation is derived,
as well as all descendant sequences that support the annotation of the ancestral
node.
The full method is described in PMID:21873635.
authors: Marc Feuermann, Huaiyu Mi, Pascale Gaudet, Dustin Ebert, Anushya Muruganujan,
Paul Thomas
external_accession:
- SGD_REF:S000146947
- TAIR:Communication:501741973
- MGI:MGI:4459044
- J:161428
- ZFIN:ZDB-PUB-110330-1
- FB:FBrf0258542
is_obsolete: false
url: https://wiki.geneontology.org/Phylogenetic_Annotation_Project
year: 2010
- id: GO_REF:0000034
title: Phenoscape Skeletal Anatomy Jamboree
description: |-
Skeletal cell terms and relationships were added and revised at the Skeletal Anatomy
Jamboree held by Phenoscape (NSF grant BDI-0641025) and hosted by the National
Evolutionary Synthesis Center (NESCent), April 9-10, 2010.
authors: Brian K. Hall (Dalhousie University), Matthew Vickaryous (Ontario Veterinary College,
University of Guelph), David Blackburn, University of Kansas; Wasila Dahdul, University
of South Dakota and NESCent; Alexander Diehl, Mouse Genome Informatics (MGI);
Melissa Haendel, Oregon Health Sciences University; John G. Lundberg, Department
of Ichthyology, Academy of Natural Sciences, Philadelphia; Paula Mabee, Department
of Biology, University of South Dakota; Martin Ringwald, Mouse Genome Informatics
(MGI); Erik Segerdell, Oregon Health Sciences University; Ceri Van Slyke, Zebrafish
Information Network (ZFIN); Monte Westerfield, Zebrafish Information Network (ZFIN)
and Institute of Neuroscience, University of Oregon.
year: 2010
- id: GO_REF:0000035
title: OBSOLETE Automatic transfer of experimentally verified manual GO annotation
data to plant orthologs using Ensembl Compara
description: |-
GO terms from a source species are projected onto one or more target species based
on gene orthology obtained from the Ensembl Compara system. One to one, one to
many and many to many orthologies are used but annotations are only projected
between orthologs that have at least a 40% peptide identity to each other. Only
GO annotations with an evidence type of IDA, IEP, IGI, IMP or IPI are projected,
no annotations with a 'NOT' qualifier are projected and annotations to the GO:0005515
protein binding term are not projected. Projected GO annotations using this technique
will receive the evidence code Inferred from Electronic Annotation (IEA). The
model organism database identifier of the annotation source will be indicated
in the 'With' column of the GOA association file.
Duplicate of GO_REF:0000107.
authors: Ensembl, GRAMENE, GOA curators
is_obsolete: true
year: 2011
- id: GO_REF:0000036
title: Manual annotations that require more than one source of functional data to
support the assignment of the associated GO term
description: |-
The Gene Ontology Consortium uses the IC (Inferred by Curator) evidence code when
an annotation cannot be supported by any direct evidence, but can be inferred
by GO annotations that have been annotated to the same gene/gene product identifier
in conjunction with the curator's knowledge of biology (supporting GO annotations
must not be IC-evidenced). In many cases an IC-evidenced annotation simply applies
the same reference that was used in the supporting GO annotation. The use of the
IC evidence code in an annotation with reference GO_REF:0000036 signifies that a curator
inferred the GO term based on multiple sources of evidence/GO annotations.
The 'with/from' field in these annotations will therefore supply more than one
GO identifier, obtained from the set of supporting GO annotations assigned to
the same gene/gene product identifier which cite publicly-available references.
authors: GO Annotation working group
external_accession:
- SGD_REF:S000147045
- J:342596
year: 2011
- id: GO_REF:0000037
title: OBSOLETE Gene Ontology annotation based on manual assignment of UniProtKB
keywords in UniProtKB/Swiss-Prot entries.
description: |-
Transitive assignments using UniProtKB keywords. The UniProtKB keyword controlled
vocabulary has been created and used by the UniProt Knowledgebase (UniProtKB)
to supply 10 different categories of information to UniProtKB entries. Further
information on the UniProtKB keyword resource can be found at
http://www.uniprot.org/docs/keywlist. UniProtKB keywords are manually applied to
UniProtKB/Swiss-Prot entries by UniProt curators. Further information on the
UniProtKB manual annotation process is available at http://www.uniprot.org/faq/45.
When a UniProtKB keyword describes a concept that is within the scope of the Gene
Ontology, it is investigated to determine whether it is appropriate to map the
keyword to an equivalent term in GO. The mapping between UniProtKB keywords and
GO terms is carried out manually. Definitions and hierarchies of the terms in
the two resources are compared and the mapping generated will reflect the most
correct correspondence. The translation table between GO terms and UniProtKB keywords
is maintained by the UniProt-GOA team and available at
http://www.geneontology.org/external2go/uniprotkb_kw2go.
Duplicate of GO_REF:0000043.
authors: UniProt-GOA
is_obsolete: true
year: 2011
evidence_codes:
- ECO:0000501
- id: GO_REF:0000038
title: OBSOLETE Gene Ontology annotation based on automatic assignment of UniProtKB
keywords in UniProtKB/TrEMBL entries.
description: |-
Transitive assignments using UniProtKB keywords. The UniProtKB keyword controlled
vocabulary has been created and used by the UniProt Knowledgebase (UniProtKB)
to supply 10 different categories of information to UniProtKB entries. Further
information on the UniProtKB keyword resource can be found at
http://www.uniprot.org/docs/keywlist. UniProtKB keywords are automatically assigned
to UniProtKB/TrEMBL entries from the underlying nucleic acid databases and/or by
the UniProt automatic annotation program. Further information on the prediction
systems applied by UniProt is available here:
http://www.uniprot.org/program/automatic_annotation.
When a UniProtKB keyword describes a concept that is within the scope of the Gene
Ontology, it is investigated to determine whether it is appropriate to map the
keyword to an equivalent term in GO. The mapping between UniProtKB keywords and
GO terms is carried out manually. Definitions and hierarchies of the terms in
the two resources are compared and the mapping generated will reflect the most
correct correspondence. The translation table between GO terms and UniProtKB keywords
is maintained by the UniProt-GOA team and available at
http://www.geneontology.org/external2go/uniprotkb_kw2go.
Duplicate of GO_REF:0000043.
authors: UniProt-GOA
is_obsolete: true
year: 2011
- id: GO_REF:0000039
title: OBSOLETE Gene Ontology annotation based on the manual assignment of UniProtKB
Subcellular Location terms in UniProtKB/Swiss-Prot entries.
description: |-
Transitive assignment of GO terms based on the UniProtKB Subcellular Location
vocabulary. UniProtKB Subcellular Location is a controlled vocabulary used to
supply subcellular location information to UniProtKB entries in the SUBCELLULAR
LOCATION lines. Terms from this vocabulary are annotated manually to UniProtKB/Swiss-Prot
entries. Further information on the UniProtKB manual annotation method is available
at http://www.uniprot.org/faq/45.
When a UniProtKB Subcellular Location term describes a concept that is within
the scope of the Gene Ontology, it is investigated to determine whether it is
appropriate to map the term to an equivalent term in GO. The mapping between UniProtKB
Subcellular Location terms and GO terms is carried out manually. Definitions and
hierarchies of the terms in the two resources are compared and the mapping generated
will reflect the most correct correspondence. The translation table between GO
terms and UniProtKB Subcellular Location terms is maintained by the UniProt-GOA
team and available at http://www.geneontology.org/external2go/spsl2go.
authors: UniProt-GOA
is_obsolete: true
year: 2011
- id: GO_REF:0000040
title: OBSOLETE Gene Ontology annotation based on the automatic assignment of UniProtKB
Subcellular Location terms in UniProtKB/TrEMBL entries.
description: |-
Transitive assignment of GO terms based on the UniProtKB Subcellular Location
vocabulary. UniProtKB Subcellular Location is a controlled vocabulary used to
supply subcellular location information to UniProtKB entries in the SUBCELLULAR
LOCATION lines. Terms from this vocabulary are applied automatically to UniProtKB/TrEMBL
entries from the underlying nucleic acid databases and/or by the UniProt automatic
annotation program. Further information on the UniProtKB automatic annotation
program is available at http://www.uniprot.org/faq/45.
When a UniProtKB Subcellular Location term describes a concept that is within
the scope of the Gene Ontology, it is investigated to determine whether it is
appropriate to map the term to an equivalent term in GO. The mapping between UniProtKB
Subcellular Location terms and GO terms is carried out manually. Definitions and
hierarchies of the terms in the two resources are compared and the mapping generated
will reflect the most correct correspondence. The translation table between GO
terms and UniProtKB Subcellular Location terms is maintained by the UniProt-GOA
team and available at http://www.geneontology.org/external2go/spsl2go.
authors: UniProt-GOA
is_obsolete: true
year: 2011
- id: GO_REF:0000041
title: Gene Ontology annotation based on UniPathway vocabulary mapping.
description: |-
Transitive assignment of GO terms based on the UniPathway pathway vocabulary.
UniPathway is no longer maintained. It was a manually curated resource of enzyme-catalyzed and spontaneous
chemical reactions that provided a hierarchical representation of metabolic pathways. The last release of UniPathway
was in March 2015 and is archived at NCBO:
https://bioportal.bioontology.org/ontologies/UPA.
authors: UniProt-GOA
external_accession:
- ZFIN:ZDB-PUB-130131-1
- J:342601
year: 2012
- id: GO_REF:0000042
title: OBSOLETE Gene Ontology annotation through association of InterPro records
with GO terms, accompanied by conservative changes to GO terms applied by UniProt.
description: |-
Transitive assignment of GO terms based on InterPro classification. For any database
entry (representing a protein or protein-coding gene) that has been annotated
with one or more InterPro domains, the corresponding GO terms are obtained from
a translation table of InterPro entries to GO terms (interpro2go) generated manually
by the InterPro team at EBI. The mapping file is available at
http://www.geneontology.org/external2go/interpro2go.
Please note that the GO term in the annotation assigned with this GO reference
has been changed from that originally applied by the InterPro2GO mapping. This
change has been carried out by the UniProt group to ensure the GO annotation obeys
the GO Consortium’s ontology structure and taxonomic constraints. Further information
on the rules used by UniProt to transform specific incorrect IEA annotations is
available at http://www.ebi.ac.uk/QuickGO/AnnotationPostProcessing.html.
Duplicate of GO_REF:0000002.
authors: UniProt-GOA
is_obsolete: true
year: 2012
- id: GO_REF:0000043
title: Gene Ontology annotation based on UniProtKB/Swiss-Prot keyword mapping
description: |-
Transitive assignments using UniProtKB/Swiss-Prot keywords. The UniProtKB keyword
controlled vocabulary supplies 10 different categories of information to UniProtKB
entries. Further information on the UniProtKB keyword resource can be found at
https://www.uniprot.org/keywords/. UniProtKB keywords are assigned to UniProtKB/Swiss-Prot
entries by UniProt curators as part of the UniProtKB manual curation process.
UniProtKB keywords are also automatically assigned to UniProtKB/TrEMBL entries
from the underlying nucleic acid databases and/or by the UniProt automatic annotation
program. Further information on the two different UniProt annotation methods is
available at https://www.uniprot.org/help/keywords. When a UniProtKB keyword
describes a concept that is within the scope of the Gene Ontology, a mapping is
manually made to the corresponding GO term. The translation table between GO terms
and UniProtKB keywords is maintained by the EBI GOA team and available at
http://www.geneontology.org/external2go/uniprotkb_kw2go.
authors: UniProt-GOA
external_accession:
- SGD_REF:S000148669
- J:60000
- TAIR:AnalysisReference:501756968
- TAIR:AnalysisReference:501756970
year: 2012
- id: GO_REF:0000044
title: Gene Ontology annotation based on UniProtKB/Swiss-Prot Subcellular Location
vocabulary mapping, accompanied by conservative changes to GO terms applied by
UniProt.
description: |-
Transitive assignment of GO terms based on the UniProtKB/Swiss-Prot Subcellular
Location vocabulary. UniProtKB Subcellular Location is a controlled vocabulary
used to supply subcellular location information to UniProtKB entries in the SUBCELLULAR
LOCATION lines. Terms from this vocabulary are annotated manually to UniProtKB/Swiss-Prot
entries but are automatically assigned to UniProtKB/TrEMBL entries from the underlying
nucleic acid databases and/or by the UniProt automatic annotation program. Further
information on these two different annotation methods is available at
https://www.uniprot.org/help/keywords.
When a UniProtKB Subcellular Location term describes a concept that is within
the scope of the Gene Ontology, a mapping is made to the corresponding
GO term. The translation table between GO terms and UniProtKB Subcellular Location
terms is maintained by the EBI GOA team and available at
http://current.geneontology.org/ontology/external2go/uniprotkb_sl2go.
authors: UniProt-GOA
external_accession:
- TAIR:AnalysisReference:501756971
- TAIR:AnalysisReference:50175724
- J:342604
year: 2012
- id: GO_REF:0000045
title: OBSOLETE Gene Ontology annotation based on UniProtKB/TrEMBL entries keyword
mapping, accompanied by conservative changes to GO terms applied by UniProt.
description: |-
Transitive assignments using UniProtKB/TrEMBL keywords. The UniProtKB keyword
controlled vocabulary has been created and used by the UniProt Knowledgebase (UniProtKB)
to supply 10 different categories of information to UniProtKB/TrEMBL entries.
Further information on the UniProtKB keyword resource can be found at
http://www.uniprot.org/docs/keywlist.
UniProtKB keywords are assigned to UniProtKB/Swiss-Prot entries by UniProt curators
as part of the UniProtKB manual curation process. In contrast, UniProtKB
keywords are automatically assigned to UniProtKB/TrEMBL entries from the underlying
nucleic acid databases and/or by the UniProt automatic annotation program.
Further information on the two different UniProt annotation methods is available
at http://www.uniprot.org/faq/45 and http://www.uniprot.org/program/automatic_annotation.
When a UniProtKB keyword describes a concept that is within the scope of the Gene
Ontology, it is investigated to determine whether it is appropriate to map the
keyword to an equivalent term in GO. The translation table between GO terms and
UniProtKB keywords is maintained by the UniProt-GOA team and available at
http://www.geneontology.org/external2go/uniprotkb_kw2go.
Please note that the GO term in the annotation assigned with this GO reference
has been changed from that originally applied by the UniProtKB keywords 2GO mapping.
This change has been carried out by the UniProt group to ensure the GO annotation
obeys the GO Consortium’s ontology structure and taxonomic constraints. Further
information on the rules used by UniProt to transform specific incorrect IEA annotations
is available at http://www.ebi.ac.uk/QuickGO/AnnotationPostProcessing.html.
Duplicate of GO_REF:0000004.
authors: UniProt-GOA
is_obsolete: true
year: 2012
- id: GO_REF:0000046
title: OBSOLETE Gene Ontology annotation based on UniProtKB/TrEMBL Subcellular Location
vocabulary mapping, accompanied by conservative changes to GO terms applied by
UniProt.
description: |-
Transitive assignment of GO terms based on the UniProtKB/TrEMBL Subcellular Location
vocabulary. UniProtKB Subcellular Location is a controlled vocabulary used to
supply subcellular location information to UniProtKB entries in the SUBCELLULAR
LOCATION lines. Terms from this vocabulary are annotated manually to UniProtKB/Swiss-Prot
entries but are automatically assigned to UniProtKB/TrEMBL entries from the underlying
nucleic acid databases and/or by the UniProt automatic annotation program.
Further information on these two different annotation methods is available at
http://www.uniprot.org/faq/45 and http://www.uniprot.org/program/automatic_annotation.
The translation table between GO terms and UniProtKB Subcellular Location terms
is maintained by the UniProt-GOA team and available at
http://www.geneontology.org/external2go/spsl2go.
Please note that the GO term in the annotation assigned with this GO reference
has been changed from that originally applied by the UniProtKB Subcellular Location2GO
mapping. This change has been carried out by the UniProt group to ensure the GO
annotation obeys the GO Consortium’s ontology structure and taxonomic constraints.
Further information on the rules used by UniProt to transform specific incorrect
IEA annotations is available at http://www.ebi.ac.uk/QuickGO/AnnotationPostProcessing.html.
Duplicate of GO_REF:0000023.
authors: UniProt-GOA
is_obsolete: true
year: 2012
- id: GO_REF:0000047
title: Gene Ontology annotation based on absence of key sequence residues.
description: |-
This describes a method for supplying a NOT-qualified, IKR-evidenced GO annotation
to a gene product when general sequence homology considerations would suggest
a function or location, or a role in a biological process, but a curator
has determined that the absence of key sequence residues, known to be required
for an expected activity or location, indicates that the gene product is unlikely
to be able to carry out the implied activity, involvement in a process, or cellular
component location. This reference should only be used when an IKR-evidenced
annotation is made based on curator judgement from manually reviewing the sequence
of the gene product and where no publication can be found to support the curator's
conclusion. It is preferable to cite a peer-reviewed publication (such as a PubMed
identifier) for IKR-evidenced annotations whenever possible. Curators will have
carefully reviewed the sequence of the annotated protein, and established that
the key residues known to be required for an expected activity or location are
not present. Inclusion of an identifier in the 'with/from' field that highlights
the missing residues to the user (e.g. an alignment, domain or rule identifier)
is absolutely required when annotating to IKR with this GO_REF. Documentation
on the GOC website provides more details on the
<a href="http://www.geneontology.org/GO.evidence.shtml#ikr">correct use of the
IKR evidence code</a>.
authors: GO curators
external_accession:
- FB:FBrf0254415
year: 2012
- id: GO_REF:0000048
title: OBSOLETE TIGR's Eukaryotic Manual Gene Ontology Assignment Method
description: |-
This describes TIGR curators' interpretation of a combination of evidence. Our
internal software tools present us with a great deal of evidence based on domains,
sequence similarities, signal sequences, paralogous proteins, etc. The curator
interprets the body of evidence to make a decision about a GO assignment when
an external reference is not available. The curator places one or more accessions
that informed the decision in the "with" field.
authors: TIGR Arabidopsis annotation team
external_accession:
- TAIR:Communication:501714663
is_obsolete: true
year: 2005
- id: GO_REF:0000049
title: OBSOLETE Automatic transfer of experimentally verified manual GO annotation
data to fungal orthologs using Ensembl Compara
description: |-
GO terms from a source species are projected onto one or more target species based
on gene orthology obtained from the Ensembl Compara system. One to one, one to
many and many to many orthologies are used but annotations are only projected
between orthologs that have at least a 40% peptide identity to each other. Only
GO annotations with an evidence type of IDA, IEP, IGI, IMP or IPI are projected,
no annotations with a 'NOT' qualifier are projected and annotations to the GO:0005515
protein binding term are not projected. Projected GO annotations using this technique
will receive the evidence code Inferred from Electronic Annotation (IEA). The
model organism database identifier of the annotation source will be indicated
in the 'With' column of the GOA association file.
Duplicate of GO_REF:0000107.
authors: Ensembl Genomes
is_obsolete: true
year: 2012
- id: GO_REF:0000050
title: Manual transfer of GO annotation data to genes by curator judgment of sequence
model
description: |-
Transitive assignment of GO terms to a gene based on a curator's judgment of its
match to a sequence model, such as a Pfam or InterPro entry, that has manually
curated GO annotations, mappings to GO terms, or a description from which GO terms
can be inferred. A statistical model of a sequence or group of sequences is used
to make a prediction about the function of a protein or RNA. Annotations are created
when a curator evaluates the results, using criteria that include excluding false
positives and ensuring that the annotation is accurate for all matches. Statistical
scores (such as e values and cutoff scores) and the functional specificity of
the model may also be (but are not always) considered. Annotations resulting from
the transfer of GO terms use the 'ISM' evidence code and include an accession
for the model from which the annotation was projected in the 'with' field (column
8).
authors: PomBase curators
external_accession:
- FB:FBrf0231277
year: 2012
- id: GO_REF:0000051
title: S. pombe keyword mapping
description: |-
Keywords derived from manually curated primary annotation, e.g. gene product descriptions,
are mapped to GO terms. Annotations made by this method have the evidence code
Non-traceable Author Statement (NAS), and are filtered from the PomBase annotation
files wherever another annotation exists that is equally or more specific, and
supported by experimental or manually evaluated comparative evidence (such as
ISS and its subtypes). Formerly GOC:pombekw2GO.
authors: PomBase curators
is_obsolete: false
year: 2012
- id: GO_REF:0000052
title: Gene Ontology annotation based on curation of immunofluorescence data
description: |-