An entity can be a macromolecular complex (in which case it does refer to the GO CC concept) or a single macromolecule (in which case it does not). An example of this is mentions of receptors, which may be either single proteins or protein complexes; the former usually do not refer to receptor complex (GO). It is often difficult to ascertain whether the type of receptor mentioned can form a complex and, if so, whether it is doing so in a particular context; this is even more ambiguous if multiple types of receptors are being discussed or when the types of receptors are not specified.

Assuming there is a GO CC macromolecular-complex term to which a given mention might refer, a mention is straightforwardly annotated if it is clearly specified as a complex, e.g., “receptor complexes”. If there is no such clear specification, it is annotated if the mention is also the name of a protein that could be in the form of a homomeric complex in its context (e.g., tubulin complex (GO) for “tubulin”), unless there is a corresponding MF term (e.g., receptor activity (GO) for “receptor”). If there is such a corresponding MF term, the mention is not annotated with the CC term, since this ambiguity can be captured using the MF term, and the often-tricky issue of whether to regard and annotate such a mention as a macromolecular complex can be avoided.

Bada et al. BMC Bioinformatics, www.biomedcentral.com

Gene ontology molecular functions (GO MF)

As the annotation of GO molecular functions was performed simultaneously with that of the GO biological processes by the same annotator, the aforementioned version of the GO was used, which includes , MF terms; among the functions represented by these terms are types of binding, transporter activity, molecular transducer activity, and catalytic
activity. We have previously written of the difficulty of distinguishing between and annotating with GO BP and MF concepts in text, and these issues have continued to make consistent annotation of text with GO MF concepts in particular difficult. As a suboptimal solution, we have narrowly annotated the articles of the corpus with the GO MF terms. The majority of these annotations identify molecular entities possessing the specified functionalities, and the text spans of these annotations are also marked up with independent_continuant (snapIndependentContinuant); so, for example, the annotation of “cation channel” with the GO MF concept cation channel activity (GO) and also with snapIndependentContinuant has the semantics that this text span refers to an independent continuant that has cation channel functionality. The one major subgraph of the GO MF ontology whose terms are predominantly annotated as molecule-level processes rather than as molecular entities possessing functionalities is the binding (GO) hierarchy.

NCBI taxonomy (NCBITaxon)

… have identical lexicalizations (e.g., Xenopus denotes both a genus and a subgenus), the more general one is used. Finally, mentions of taxonomic ranks themselves (e.g., class, family, species) are annotated with the appropriate terms of the taxonomic_rank subtree.

Protein ontology (PRO)

As with the annotations using the unique IDs of the records in the Entrez Gene database, annotators working with the NCBI Taxonomy directly used the NCBI Taxonomy interface to search for entries denoting organisms. The difficulties in the ontological representation of biological taxa have been discussed elsewhere; for this project, we have regarded the entries in the NCBI Taxonomy datab.