User Guide and FAQs

Jump to...
What is CellWhere?
What is CellWhere for?
How can I use CellWhere for my research?
How does CellWhere localize proteins?
How does CellWhere know where to place sub-locations on the cell diagram?
What are CellWhere's default localizations?
Advanced options: what if a protein localizes to multiple locations?
How does CellWhere produce its graphical display?
What features does the graph have?
How can I find out about the evidence for an interaction?
Advanced options: what are promiscuous interactors?
How do I share the output?
Why is my favorite protein at a weird location?
What are the different download files and how can I use them?
Is there a CellWhere API? Can I connect with CellWhere programmatically?

What is CellWhere?

CellWhere is a data combining and visualization tool that enables bench researchers to quickly explore the reported subcellular locations of a list of genes/proteins, and to put these subcellular locations into the context of previously identified physical interactions that could be occurring between these proteins and others within the cell.

CellWhere retrieves localizations from UniProt and/or The Gene Ontology, and retrieves interactions from the Mentha server. It graphs the resulting network to resemble a physical map of the cell, placing proteins in a way that helps biologists to hypothesize and interpret mechanistic links between their genes/proteins of interest. It produces an interactive display of the graph using the Cytoscape.js library.

What is CellWhere for?

A researcher can begin with one or more genes/proteins of special interest to her, or may have a longer list resulting from some screening or omics analysis. In either case:

1. CellWhere can show where proteins are typically reported to localize in the cell, and what their most strongly evidenced interactions are. This may suggest mechanistic pathways.

2. CellWhere can show whether proteins and their interactors could be at locations that the researcher has defined as being of special interest to her.

and, if the researcher has used some other process to arrive at a gene association network in Cytoscape 3 (such as IntAct or GeneMANIA results):

3. CellWhere can add subcellular locations to a pre-made network and color nodes according to a selected attribute of that pre-made network (such as fold-changes from an omics study).

How can I use CellWhere for my research?

Below are a few ways that you can benefit from CellWhere (there are almost certainly other uses that we have not thought of!)


1. To share the network graph with collaborators, to help discuss and interpret new findings
2. To produce visual displays that can be added to publications to help explain/discuss new findings

Exploration and interpretation

3. You can use the visual display to imagine mechanistic hypotheses based on interactions and/or localizations, for either:
- a gene short-list derived from omics analysis
- a few genes or proteins of special interest to your project

Basic information retrieval

4. Visualize where proteins are usually reported to locate in the cell
5. Determine if your proteins of interest have been observed at localizations of special interest to you
6. For each protein, quickly obtain a complete list of reported localization terms from The Gene Ontology and UniProt
7. Visualize how proteins could be interacting with each other, and how this relates to the prioritized localizations
8. Visualize the wider network of strongly supported interactions of your query proteins

How does CellWhere localize proteins?

The table below shows the localization procedure for 3 example queries:


For the three query genes, reviewed Swiss-Prot (UniProt) protein accessions were retrieved, along with localization terms from The Gene Ontology and UniProt. (In this example gene names are queried, but other identifiers can also be used. For UniProt, CellWhere actually retrieves the localization text field and parses it into phrases.) These terms (of which ~3000 are in use) are then mapped to CellWhere localizations (of which there are 50; see "What are CellWhere's default localizations?"). The relative frequency of each CellWhere localization is then calculated for each protein.

CellWhere currently maps all UniProt and GO localization terms that have been applied to more than 25 proteins. This covers more than 99% of all protein localization annotations (1,258,337 out of a total of 1,269,645), and includes the most frequently used 1013 of the 3812 terms that comprise the Gene Ontology Cellular Component namespace, and 422 of the 1283 terms parsed from the UniProt Subcellular location field.

If you select "Generic" as the localization flavor, the frequency percentage is used to select the localization at which the protein will be shown on the final graph. In this case, RRAD, EMILIN2, and ACTC1 would be placed at the Membrane, ECM, and Cytoplasm, respectively.
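The frequency-based choice can be illustrated with a minimal Python sketch. It assumes that each of a protein's UniProt/GO annotations has already been mapped to a CellWhere localization; the function names, the example annotations, and the handling of ties are illustrative assumptions, not CellWhere's actual code.

```python
from collections import Counter


def localization_frequencies(mapped_localizations):
    """Return {CellWhere localization: relative frequency} for one protein.

    `mapped_localizations` holds one entry per annotation that was mapped
    to a CellWhere localization (hypothetical input format).
    """
    counts = Counter(mapped_localizations)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {loc: n / total for loc, n in counts.items()}


def generic_location(mapped_localizations):
    """Pick the most frequently annotated localization (ties: first seen)."""
    freqs = localization_frequencies(mapped_localizations)
    if not freqs:
        return None
    return max(freqs, key=freqs.get)


# Hypothetical annotations for a single protein, for illustration only.
example = ["Membrane", "Membrane", "Cytoplasm", "Membrane", "Nucleus"]
print(localization_frequencies(example))
print(generic_location(example))
```

How CellWhere itself breaks ties between equally frequent localizations is not stated in this guide, so the tie behavior above is only a placeholder.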

Alternatively, a ranked list of all the CellWhere localizations is consulted, and the highest-ranking localization annotated for a given protein is chosen as that protein's location on the graph. We include several ranked-list 'flavors', and by using one as a template you can create your own flavor according to your research interests (please see the front page for template and upload instructions, and also the downloadable files available from the menu). With our muscle flavor, for example, ACTC1 in the example above would be placed into the 'Focal adhesion' location, because the muscle flavor gives this location a high priority score, as it is of special interest to muscle researchers.
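As a rough sketch of priority-based selection, the Python below picks, from a protein's annotated localizations, the one with the highest priority score in a flavor. The flavor is represented as a plain dictionary, and the scores and the ACTC1 annotation list are invented for illustration; the real flavor file format is described on the front page, not here.

```python
def flavor_location(protein_localizations, flavor_scores):
    """Return the protein's localization with the highest flavor priority.

    `flavor_scores` maps CellWhere localization -> priority score
    (a hypothetical in-memory stand-in for a flavor file).
    Localizations that the flavor does not score are ignored.
    """
    candidates = [loc for loc in protein_localizations if loc in flavor_scores]
    if not candidates:
        return None
    return max(candidates, key=lambda loc: flavor_scores[loc])


# Hypothetical scores, loosely imitating a muscle-oriented flavor.
muscle_like_flavor = {"Focal adhesion": 9000, "Sarcomere": 8500, "Cytoplasm": 100}

# Hypothetical set of localizations annotated for ACTC1.
actc1_localizations = ["Cytoplasm", "Actin cytoskeleton", "Focal adhesion"]

print(flavor_location(actc1_localizations, muscle_like_flavor))
```

With these made-up scores the sketch returns 'Focal adhesion', mirroring the ACTC1 example in the text.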

You are not obliged to use both the UniProt and GO localizations, and can choose either alone. The UniProt localization field is carefully (and conservatively) curated to contain the classically known locations of a protein, whereas The Gene Ontology is aimed more towards a comprehensive listing of all locations at which a protein has been observed. Therefore, in our context, GO is useful for screening proteins against localizations of interest, whereas UniProt is useful if you are interested in knowing the 'typical' location of a protein.

How does CellWhere know where to place sub-locations on the cell diagram?

The "spatial relation" tells CellWhere how to place a localization on the output graph. CellWhere currently supports the following spatial relations. Each CellWhere localization must be associated with one of these in the mapping file.

Spatial relation vocabulary:

- Nucleus*
- IN Nucleus†
- Cytoplasm*
- IN Cytoplasm†
- Membrane*
- IN Membrane†
- UNDER Membrane‡
- ACROSS Membrane‡
- SURFACE Membrane‡
- Extracellular*
- IN Extracellular†

* "Nucleus", "Cytoplasm", "Membrane", and "Extracellular" are the primary compartments of CellWhere's visual display. A localization that is assigned to one of these will not be independently labeled. Instead, proteins carrying this localization will float freely within the primary compartment; the localization will NOT be given a separate box (these boxes are referred to in Cytoscape.js as 'compound nodes').

† A location designated as "IN Nucleus", "IN Cytoplasm", "IN Membrane", or "IN Extracellular" will be given its own box, and member proteins will be displayed within this box. The box will be located somewhere within the appropriate primary compartment (Nucleus, Cytoplasm, Membrane, or Extracellular).

‡ A location designated as "UNDER Membrane", "SURFACE Membrane", or "ACROSS Membrane" will be given its own box. The box will be placed, respectively: touching the interior of the cell membrane, touching the exterior of the cell membrane, or traversing the membrane entirely.
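If you are preparing your own mapping file, it can help to check each spatial relation against the vocabulary before uploading. The following Python sketch does this; the function name and the returned tuple are hypothetical conveniences, and the check is case-sensitive by assumption.

```python
PRIMARY_COMPARTMENTS = {"Nucleus", "Cytoplasm", "Membrane", "Extracellular"}
PLACEMENTS = {"IN", "UNDER", "ACROSS", "SURFACE"}


def parse_spatial_relation(relation):
    """Split a spatial relation into (placement, compartment, gets_own_box).

    Primary compartments have no placement keyword and get no box of their
    own; every other valid relation gets its own box (a compound node).
    """
    parts = relation.split()
    if len(parts) == 1 and parts[0] in PRIMARY_COMPARTMENTS:
        return (None, parts[0], False)
    if len(parts) == 2 and parts[0] in PLACEMENTS and parts[1] in PRIMARY_COMPARTMENTS:
        placement, compartment = parts
        # UNDER, ACROSS and SURFACE are only defined relative to the membrane.
        if placement == "IN" or compartment == "Membrane":
            return (placement, compartment, True)
    raise ValueError(f"Unsupported spatial relation: {relation!r}")


print(parse_spatial_relation("IN Cytoplasm"))
print(parse_spatial_relation("Membrane"))
```

A relation such as "UNDER Nucleus" raises a ValueError, since it is not part of the vocabulary listed above.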

What are CellWhere's default localizations?

When you localize proteins by annotation frequency, CellWhere uses its generic mapping file. This file maps UniProt and GO localization terms to one or more of the 50 CellWhere localization terms shown in the following table. As described above, you can download an example mapping file by following the link on the front page, and you can define your own mappings by preparing and uploading your own flavor.

CellWhere localization        Spatial relation    CellWhere localization          Spatial relation
1. Acrosome                   UNDER Membrane      26. Lysosome                    IN Cytoplasm
2. Actin cytoskeleton         IN Cytoplasm        27. Melanosome                  IN Cytoplasm
3. Amyloplast                 IN Cytoplasm        28. Membrane                    Membrane
4. Apoplast                   SURFACE Membrane    29. Microtubule cytoskeleton    IN Cytoplasm
5. Autophagosome              IN Cytoplasm        30. Mitochondrion               IN Cytoplasm
6. Caveolae                   UNDER Membrane      31. Motile parts                ACROSS Membrane
7. Cell cortex                UNDER Membrane      32. Nucleoid                    IN Cytoplasm
8. Cell junction              ACROSS Membrane     33. Nucleolus                   IN Nucleus
9. Cell surface               SURFACE Membrane    34. Nucleus                     Nucleus
10. Cell wall                 SURFACE Membrane    35. Outer membrane              IN Membrane
11. Chloroplast               IN Cytoplasm        36. Periplasm                   IN Membrane
12. Cyanelle                  IN Cytoplasm        37. Peroxisome                  IN Cytoplasm
13. Cytoplasm                 Cytoplasm           38. Plasmodesma                 SURFACE Membrane
14. Cytoskeleton              IN Cytoplasm        39. Plastid                     IN Cytoplasm
15. Endoplasmic reticulum     IN Cytoplasm        40. Podosome                    SURFACE Membrane
16. Endosome                  IN Cytoplasm        41. Proteasome                  IN Cytoplasm
17. ERMES complex             IN Cytoplasm        42. Ribosome                    IN Cytoplasm
18. Extracellular             Extracellular       43. Sarcomere                   IN Cytoplasm
19. Extracellular matrix      IN Extracellular    44. Sarcoplasmic reticulum      IN Cytoplasm
20. Focal adhesion            UNDER Membrane      45. Spectrin cytoskeleton       UNDER Membrane
21. Gap Junction              ACROSS Membrane     46. Synapse                     ACROSS Membrane
22. Glycosome                 IN Cytoplasm        47. Vacuole                     IN Cytoplasm
23. Golgi                     IN Cytoplasm        48. Vesicle                     IN Cytoplasm
24. Inner membrane            IN Membrane         49. Vesicular exosome           IN Extracellular
25. Intermediate filaments    IN Cytoplasm        50. Virion                      IN Extracellular

Advanced options: what if a protein localizes to multiple locations?

Many proteins have been experimentally observed at more than one location and carry multiple localization annotations in UniProt and/or GO. Examples include proteins that shuttle between organelles, that perform different functions at different points in development or cell differentiation, or that behave differently depending on cell type.

CellWhere now includes an advanced feature to display duplicate copies of a protein node at alternative locations. The feature can be observed by running the default query but selecting “Annotation frequency” under localization options. Duplicate nodes are connected to their parent node by a green edge labelled with a question mark.

This feature is switched on by default when using "annotation frequency" but off by default when using priority flavors. It can be modified under advanced options at the bottom of the front page. By default, when the feature is selected, an alternative location will be shown if it has a frequency score >0.33 (i.e. if >33% of the protein's UniProt/GO annotations map to this CellWhere localization). The user can set this cut-off value as desired, but low values will be refused if they result in a graph that is too large and over-populated. If using the feature with a priority flavor rather than annotation frequency, the cut-off should be adjusted accordingly (e.g. to 7000 or some other high value).
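The cut-off logic can be sketched in a few lines of Python. The function name and the example frequencies are hypothetical; only the default cut-off of 0.33 and the strict "greater than" comparison come from the description above.

```python
def alternative_locations(frequencies, primary, cutoff=0.33):
    """List localizations, other than the primary one, scoring above cutoff.

    `frequencies` maps CellWhere localization -> score for one protein.
    With annotation frequency the scores are fractions between 0 and 1;
    with a priority flavor a much higher cutoff (e.g. 7000) would be needed.
    """
    return sorted(
        loc for loc, score in frequencies.items()
        if loc != primary and score > cutoff
    )


# Hypothetical frequencies for one protein, for illustration only.
freqs = {"Nucleus": 0.5, "Cytoplasm": 0.4, "Membrane": 0.1}
print(alternative_locations(freqs, primary="Nucleus"))
```

Here the protein would be drawn in the Nucleus with one duplicate node in the Cytoplasm; the Membrane score falls below the cut-off and is not shown.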

How does CellWhere produce its graphical display?

The localization step described above is used to visually organize the graph, as indicated in the two schemas to the right below.

CellWhere first attributes subcellular locations, as described above, for either the uploaded gene list or an uploaded pre-made network.

CellWhere will also query the Mentha server to retrieve known interactions between query proteins, together with the score allocated by Mentha as a measure of the strength of the evidence supporting each interaction.

If the option to grow the network using Mentha is selected, then CellWhere will request additional interactors of the query proteins. It will rank these by their score and retain all of the interacting proteins up to the maximum number set by the user.
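The ranking step can be pictured with the following Python sketch. It assumes candidate interactions arrive as (query protein, partner, score) tuples; that structure, the function name, and the placeholder identifiers and scores are all illustrative assumptions rather than the format actually returned by Mentha.

```python
def grow_network(candidate_interactions, query_proteins, max_added):
    """Return up to `max_added` new interactors, strongest evidence first.

    candidate_interactions: iterable of (query, partner, score) tuples.
    Partners that are already query proteins are not counted as additions.
    """
    best_score = {}
    for _query, partner, score in candidate_interactions:
        if partner in query_proteins:
            continue
        # Keep the strongest score seen for each candidate partner.
        if score > best_score.get(partner, 0.0):
            best_score[partner] = score
    # Sort by descending score, then by name so the result is reproducible.
    ranked = sorted(best_score.items(), key=lambda item: (-item[1], item[0]))
    return [partner for partner, _score in ranked[:max_added]]


# Placeholder identifiers and scores, for illustration only.
queries = {"QUERY1", "QUERY2"}
candidates = [
    ("QUERY1", "PARTNER_A", 0.7),
    ("QUERY2", "PARTNER_B", 0.8),
    ("QUERY1", "PARTNER_B", 0.6),
    ("QUERY1", "QUERY2", 0.9),
    ("QUERY2", "PARTNER_C", 0.4),
]
print(grow_network(candidates, queries, max_added=2))
```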

The resulting network, with subcellular locations added, is then organized for visualization.

Proteins are grouped according to their location, and locations are organized graphically.

A limited vocabulary (as described above) specified in the mapping file tells CellWhere where to place each location relative to the cell and to the cell membrane (for example, the Gap Junction may be marked 'ACROSS Membrane', or the Peroxisome marked 'IN Cytoplasm').

Co-ordinates are recalculated and the network is displayed using Cytoscape.js, on top of a membrane background.

What features does the graph have?

(This is also summarised in the 'GRAPH GUIDE' panel on the right side of the output page.)

  • Left-clicking on a protein (a 'node') opens a link to its UniProt page
  • Left-clicking on an interaction (an 'edge') pops up a box giving information from Mentha about its interaction evidence
  • Edge thickness is proportional to the strength of the supporting evidence
  • Hold left-click and drag to move nodes and localizations around
  • Hold right-click and swipe with the mouse to delete nodes and localizations
  • Query nodes are highlighted in red. The exception is when a pre-made network is uploaded and one of its attributes (e.g. fold-change) is selected to superimpose on the nodes, in which case nodes are shaded red or blue to represent up- or down-regulation, respectively

How can I find out about the evidence for an interaction?

As explained above, left-clicking on an interaction (an 'edge') pops up a box giving the evidence score and listing interaction evidence from Mentha. Edges with strong evidence scores have thicker widths. Evidence scores range from 0.03 (weakest) to 1 (strongest), and average around 0.25. Only ~10% of scores are stronger than 0.5. Details of the Mentha scoring function are given here.

The example table below shows the evidence for the interaction between Dystrophin (DMD gene) and Dystroglycan (DAG1 gene).

Mentha has provided four pieces of evidence for this interaction, two by crystallography, one by affinity chromatography and the other using pull down. A link is provided to the PubMed entry of the work in which the experiment was performed, and the source database is shown from which Mentha obtained the interaction evidence, together with the DOI of the supporting publication.

Interaction type                 Experimental method                            PubMed      Source database              DOI
MI:0915(physical association)    MI:0004(affinity chromatography technology)    7592992     psi-mi:"MI:0463"(biogrid)
MI:0915(physical association)    MI:0096(pull down)                             19109891    psi-mi:"MI:0469"(IntAct)
MI:0914(association)             MI:0114(x-ray crystallography)                 10932245    psi-mi:"MI:0463"(biogrid)
MI:0407(direct interaction)      MI:0114(x-ray crystallography)                 10932245    psi-mi:"MI:0469"(IntAct)
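The entries in this table follow the PSI-MI controlled vocabulary notation, in which an identifier such as MI:0915 is followed by its name in parentheses. If you want to process such fields yourself, a small Python sketch like the one below can split them; the function name is hypothetical and the regular expression only covers the two patterns that appear in the table above.

```python
import re

# Matches e.g. MI:0915(physical association) and psi-mi:"MI:0463"(biogrid).
_PSI_MI_FIELD = re.compile(r'^(?:psi-mi:)?"?(MI:\d{4})"?\((.*)\)$')


def parse_psi_mi_field(field):
    """Return (identifier, name) for a PSI-MI style field."""
    match = _PSI_MI_FIELD.match(field.strip())
    if match is None:
        raise ValueError(f"Not a recognised PSI-MI field: {field!r}")
    return match.group(1), match.group(2)


print(parse_psi_mi_field("MI:0915(physical association)"))
print(parse_psi_mi_field('psi-mi:"MI:0463"(biogrid)'))
```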

Advanced options: what are promiscuous interactors?

Certain proteins (for example, Ubiquitins and heat shock proteins) form a great many interactions due to general functions that are unlikely to be pertinent to a specific mechanistic pathway. CellWhere provides a feature to ignore 'promiscuous interactors' during the addition of binding partners from Mentha. CellWhere pre-processes the Mentha data, making interaction counts for every protein, and storing these in the CellWhere database. The default behavior is to ignore proteins that bind more than 100 partners. Mentha currently reports interactions for ~82,000 proteins, and ~1,300 (~1.5%) of these have more than 100 reported binding partners. The user may adjust the cut-off as desired – it is available under “advanced options” at the bottom of the front page.
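The idea behind this filter can be sketched in Python as follows. The sketch counts distinct binding partners from a list of interaction pairs and then drops candidates above the cut-off; the function names and the toy data are hypothetical, and CellWhere's own pre-processing and database storage are not shown.

```python
from collections import defaultdict


def partner_counts(interactions):
    """Count distinct binding partners per protein from (a, b) pairs."""
    partners = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # self-interactions do not add a partner
        partners[a].add(b)
        partners[b].add(a)
    return {protein: len(found) for protein, found in partners.items()}


def drop_promiscuous(candidates, counts, cutoff=100):
    """Keep candidates that bind no more than `cutoff` partners."""
    return [p for p in candidates if counts.get(p, 0) <= cutoff]


# Toy data: a hypothetical hub protein with 101 partners, plus one ordinary pair.
toy_interactions = [("HUB", f"P{i}") for i in range(101)] + [("A", "B")]
counts = partner_counts(toy_interactions)
print(drop_promiscuous(["HUB", "A"], counts))
```

With the default cut-off of 100, the hub protein is ignored and only "A" is retained.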

How do I share the output?

Various sharing options are given in the 'SAVE AS...' panel on the right side of the output page.

The output can be saved simply as html. If you want to email it, we recommend using the 'zipped' option, as this avoids problems with some email applications that will otherwise try to read the attached html.

The network can also be downloaded in Cytoscape 3's xgmml format, for example if you wish to manipulate the network using the Cytoscape 3 desktop application. Please note however that the localization groupings (known as 'compound nodes' in Cytoscape.js) cannot currently be displayed by Cytoscape 3.

Why is my favorite protein at a weird location?

What are the different download files and how can I use them?

Localization frequencies

You can now download all of CellWhere's default annotation frequencies. These are available from the menu under downloads. They include 3 tables listing the most frequently annotated localization for each protein, based on CellWhere mappings of either UniProt, GO, or both:

1. Localization frequencies: UniProt and GO
2. Localization frequencies: UniProt only
3. Localization frequencies: GO only

These files contain, for each protein, all of the CellWhere localizations that result from mapping its GO and UniProt annotations. They also list a frequency score for each localization. For example, UniProt/GO localizations for the protein below, Q9ZT82, were mapped to 4 CellWhere localizations (Membrane, Unknown, Golgi, and Plasmodesma). 57% of Q9ZT82's UniProt/GO localizations mapped to 'Membrane', and just 14% to each of the other three CellWhere localizations:

Q9ZT82 Membrane 0.57
Q9ZT82 Unknown 0.14
Q9ZT82 Golgi 0.14
Q9ZT82 Plasmodesma 0.14
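A minimal Python sketch for reading rows like these is shown below. It assumes each row holds an accession, a localization, and a score separated by whitespace or tabs, with no header line; the exact delimiter and layout of the downloadable files should be checked against the files themselves, and the function names are hypothetical.

```python
def parse_frequency_line(line):
    """Split one row into (accession, localization, score).

    The accession is taken as the first field and the score as the last,
    so localization names containing spaces are kept intact.
    """
    accession, rest = line.strip().split(None, 1)
    localization, score = rest.rsplit(None, 1)
    return accession, localization, float(score)


def top_localizations(lines):
    """Return {accession: (localization, score)} with the highest score each."""
    best = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        accession, localization, score = parse_frequency_line(line)
        if accession not in best or score > best[accession][1]:
            best[accession] = (localization, score)
    return best


rows = [
    "Q9ZT82 Membrane 0.57",
    "Q9ZT82 Unknown 0.14",
    "Q9ZT82 Golgi 0.14",
    "Q9ZT82 Plasmodesma 0.14",
]
print(top_localizations(rows))
```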


You may also download existing CellWhere flavors (for an explanation of flavors, see "How does CellWhere localize proteins?"). These are already selectable in box 4 under "Screening by flavor", but a downloaded flavor is a useful starting point if you want to create your own.

Is there a CellWhere API? Can I connect with CellWhere programmatically?

If you have a question not covered here, or indeed if you are just curious about how CellWhere works and whether we can tweak it to work better for you, please contact us at: