Querying by Gene Symbol allows annotations to be accumulated across species,
but can be error prone. Reviewed (i.e. Swiss-Prot curated)
UniProt accessions are retrieved from UniProt using the search format
query=(gene_exact:"YourGeneSymbol"). The search is
not case-sensitive. For example, querying Dmd or DMD will retrieve the same set of
hits; subcellular localization will then be predicted based on prioritization
of their accumulated UniProt and Gene Ontology annotations.
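As a rough illustration of the search format described above, the following sketch builds the gene_exact clause and URL-encodes it. The function names and the base URL are assumptions made for this example and are not taken from CellWhere; the sketch does not add the restriction to reviewed entries and does not send any request.

```python
from urllib.parse import urlencode

# Assumed endpoint, for illustration only; check the UniProt documentation
# for the endpoint and parameters that apply to your use case.
BASE_URL = "https://rest.uniprot.org/uniprotkb/search"


def build_gene_query(symbol: str) -> str:
    """Return the exact-gene-symbol query clause for one symbol."""
    cleaned = symbol.strip()
    if not cleaned:
        raise ValueError("gene symbol must not be empty")
    return f'(gene_exact:"{cleaned}")'


def build_search_url(symbol: str, base_url: str = BASE_URL) -> str:
    """URL-encode the query clause and append it to the base URL."""
    return f"{base_url}?{urlencode({'query': build_gene_query(symbol)})}"


if __name__ == "__main__":
    print(build_gene_query("DMD"))
    print(build_search_url("Dmd"))
```

Because the search is not case-sensitive, the symbol is passed through unchanged rather than being upper-cased first.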
Querying by Gene Symbol is error prone because multiple genes can share the same Symbol.
For example, querying 'F10' will accumulate annotations for both
Coagulation factor X and a centrosome/spindle pole-associated protein FAM110A.
Uncheck the above box to accumulate annotations for genes matching each Symbol
across all species, but be aware that the risk of crossed identifiers is high (e.g.
querying 'ART3' for all species could prioritize the vesicle localization of the yeast
Arrestin-related trafficking adapter 3 rather than the membrane localization of the
human ADP-ribosyltransferase.)
For robust species-specific queries, use one of the other ID types.
CellWhere can retrieve localization terms from both the UniProt
"Subcellular location" field and the Gene Ontology
Cellular Component annotation.
Each has its advantages: in general, UniProt is more conservative,
but the Gene Ontology has greater depth. The Gene Ontology
tends to be inclusive of all published locations, even locations which
may be rare for a given protein. For example, the protein Dystrophin
is most studied at the membrane of muscle cells, and
its UniProt Subcellular location is
restricted to this. However, the Gene Ontology lists
several related and sometimes more specific Cellular
Components, including the 'dystrophin-associated glycoprotein complex'
and 'Z disc', but also
'Filopodium', which has been reported not in muscle cells but in platelets
(see the user guide for other examples).
We suggest using UniProt alone to retrieve the more classically known location(s), but we recommend retrieving
both UniProt and GO locations if you will use a prioritization flavor to
guide CellWhere towards your research interests.
In the output graph, CellWhere will place each protein at a location
in the cell. For proteins with multiple location annotations, CellWhere must choose which locations to display. The generic
option chooses based on the number of times each location is annotated to that protein. It may duplicate a protein node to display it at
more than one location if the annotation frequencies are closely matched. The flavor option chooses based on which
location is of most relevance (i.e. has the highest 'priority score') for a given area of research, and it selects only one location
for each protein.
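The difference between the two options can be sketched as follows. This is a minimal illustration, not CellWhere's actual implementation: the function names, the input shapes, and in particular the `tolerance` cutoff used to decide when frequencies are "closely matched" are all assumptions.

```python
from collections import Counter


def choose_generic(annotations, tolerance=0.8):
    """Pick locations by annotation frequency.

    `annotations` is a list of location names, one entry per annotation.
    `tolerance` is a hypothetical cutoff: every location whose count is at
    least `tolerance` times the top count is displayed as well.
    """
    counts = Counter(annotations)
    if not counts:
        return []
    top = max(counts.values())
    return sorted(loc for loc, n in counts.items() if n >= tolerance * top)


def choose_flavor(annotations, priority):
    """Pick the single annotated location with the highest priority score.

    `priority` maps location name -> score; unmapped locations are ignored.
    """
    scored = sorted(loc for loc in set(annotations) if loc in priority)
    if not scored:
        return None
    return max(scored, key=lambda loc: priority[loc])


if __name__ == "__main__":
    # Made-up annotation list for one protein.
    annotations = ["membrane"] * 3 + ["cytoskeleton"] * 3 + ["filopodium"]
    print(choose_generic(annotations))
    print(choose_flavor(annotations, {"membrane": 10, "filopodium": 50}))
```

With the made-up data above, the generic sketch may return more than one location (so the protein node would be duplicated), whereas the flavor sketch always returns at most one.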
The 'flavor' mapping file tells CellWhere how to map terms from UniProt/GO to CellWhere localizations, and sets the priority scores
(see the user guide for more details).
The flavor file also tells CellWhere's network viewer how to display different localizations relative to each other.
You can either select a pre-loaded flavor, or control each of these steps by creating your own flavor.
To create your own flavor,
download the tab-delimited template (or any of the flavor files listed under downloads in the menu) and give
higher priority numbers to the localizations
that interest you, then upload it in the field above. You can also change the localizations
themselves (the 'OurLocalization' column). Just be careful of the following:
(1) don't alter the first two columns;
(2) don't give the same priority number to more than one of your mapped localizations;
(3) don't attribute more than one spatial relation to a mapped localization;
(4) don't invent your own spatial relation terms (though feel free to re-attribute the ones that are already there).
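Rules (2) and (3) can be checked mechanically before uploading. The sketch below is one possible check, under stated assumptions: only the 'OurLocalization' column name comes from the template described above, while 'Priority' and 'SpatialRelation' are hypothetical placeholders that you should replace with the actual headers in your flavor file.

```python
import csv
import io
from collections import defaultdict


def check_flavor(text, loc_col="OurLocalization",
                 priority_col="Priority", relation_col="SpatialRelation"):
    """Return a list of problems found in tab-delimited flavor text.

    The priority and relation column names are hypothetical defaults.
    """
    locs_by_priority = defaultdict(set)
    relations_by_loc = defaultdict(set)
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        loc = row[loc_col].strip()
        locs_by_priority[row[priority_col].strip()].add(loc)
        relations_by_loc[loc].add(row[relation_col].strip())

    problems = []
    # Rule (2): a priority number may belong to only one mapped localization.
    for prio, locs in sorted(locs_by_priority.items()):
        if len(locs) > 1:
            problems.append(
                f"priority {prio} shared by: {', '.join(sorted(locs))}")
    # Rule (3): a mapped localization may have only one spatial relation.
    for loc, rels in sorted(relations_by_loc.items()):
        if len(rels) > 1:
            problems.append(
                f"{loc} has several spatial relations: {', '.join(sorted(rels))}")
    return problems
```

Several UniProt/GO terms may map to the same localization with the same priority, so the check groups rows by localization instead of flagging every repeated number. An empty list means neither rule was violated; rules (1) and (4) need the original template for comparison and are not covered here.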