Cancer Genomics Browser at UCSC
  UCSC Cancer Genomics Browser User's Guide


  Introduction
 

Comprehensive characterization of individual tumors requires the integration of clinical features with genomic changes. Clinical details must be correlated with genomic, transcriptomic, and epigenetic data, and the result displayed, analyzed, and compared within the field. The Cancer Genomics Browser is an extension of the UCSC Genome Browser that provides a mechanism to link these two types of information and a platform to visualize the complex dataset produced. The data can then be distilled and presented in a variety of ways, including by value, chromosome location, clinical feature, and biological pathway or geneset of interest. It is also possible to quickly perform and easily view statistical analyses of either all of the data or a specific subset. The goal of the Cancer Genomics Browser is to serve as a tool to help researchers understand and predict the pathological course of the disease, allowing the development and implementation of more effective, individualized treatment plans.



  Homepage
 

The Cancer Genomics Browser homepage contains a brief description of the browser as well as a 10-minute tutorial to quickly familiarize users with its basic features. News and conditions of use are also found here.

There are two menu panels on the page. The top panel links to the primary tools, while the sidebar provides additional information. The browser is entered via the Cancer Genomics link at the top or the Cancer Genomics Browser link along the side. Information on the human genome assembly currently in use is available by clicking on the Genomes option. This user guide is accessed by selecting Help. There are also links to the main UCSC Genome Browser, credits for the development of the Cancer Genomics Browser, instructions on how to download a stand-alone version, project personnel, and email contact.



  Cancer Genomics Browser
 

Upon entry into the browser visualization portal, the default dataset of published breast cancer copy number variation is displayed in two heatmaps. The left panel shows the genomic data, which is correlated by sample to the clinical features in the right panel. Display options are located beneath the heatmaps, followed by track controls for the publicly available datasets, organized by tumor type.



  Datasets
 

Published datasets available for manipulation in the Cancer Genomics Browser are listed in the Datasets panel. They are organized by tissue type. Multiple types of data (e.g. gene expression or copy number) from a single study are separated into discrete datasets. The dataset heading links to the abstract for the study publication in PubMed. Below each heading is a drop-down menu of visualization options. Single or multiple datasets can be viewed. In addition to the heatmap, a summary view is also available. This displays a box plot of the data distribution at each probe. The line represents the median, the box the inner quartiles, and the colored bars the outer quartiles. The color intensity is proportional to the deviation of the median from zero. The Summary view facilitates rapid recognition of genomic regions with large deviations within the data.
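The Summary view statistics described above can be sketched in a few lines of Python. This is only an illustration, not the browser's code: the function name summarize_probe, the max_abs scaling parameter, the quartile method, and the linear intensity cap are all assumptions.

```python
from statistics import median, quantiles


def summarize_probe(values, max_abs=1.0):
    """Summarize one probe's values across samples (hypothetical helper).

    `values` holds one number per sample, with None where no data exists.
    `max_abs` is an assumed scale at which the color intensity saturates.
    """
    data = sorted(v for v in values if v is not None)  # ignore missing samples
    if len(data) < 2:
        raise ValueError("need at least two values to compute quartiles")
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    med = median(data)
    # Assumption: intensity grows linearly with |median| and is capped at 1.0.
    intensity = min(abs(med) / max_abs, 1.0)
    return {
        "min": data[0],      # lower end of the outer-quartile bar
        "q1": q1,            # lower edge of the box
        "median": med,       # the line inside the box
        "q3": q3,            # upper edge of the box
        "max": data[-1],     # upper end of the outer-quartile bar
        "intensity": intensity,
    }
```

A probe whose median sits far from zero gets an intensity near 1.0, which is what makes strongly deviating regions stand out in this view.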



  Genomic Heatmap
 

The Genomic Heatmap displays genome-wide data from the selected dataset(s) of published genomic, transcriptomic, or epigenetic studies. A heading directly above the heatmap identifies which dataset is displayed. Samples are arranged in rows, while columns designate individual probes and are mapped onto the human chromosome track shown below the heatmap. Centromeres are represented in red and cytobands in grayscale. Access the sample name and value at a specific genomic location by mousing-over the heatmap at the region of interest. Clicking the gear icon to the right of the header opens the Dataset Settings panel, where the color scheme can be changed. In the default color settings, red represents an amplification or increased relative expression and blue a deletion or decreased expression. Genomic areas where no data is available are shown in white.



  Zooming
 

In the default display, zoom in on the Genomic Heatmap by clicking on the desired location. Once the single-chromosome level is reached, the button under Display Options that allows viewing of the region in the UCSC Genome Browser becomes active. Click it to open the browser in a new window and view all available tracks. This also gives access to the probe identifiers used in the cancer dataset when available. Shift-click to zoom out in the Cancer Genomics Browser.



  Clinical Feature Heatmap
 

To the right of the Genomic Heatmap is the Clinical Feature Heatmap, which represents clinical features by column. The clinical feature data available depend on the published study currently being viewed. Sample rows are consistent between the Genomic and Clinical Feature Heatmaps, facilitating comparison based on either type of data. Mousing-over a sample row in a particular clinical feature column generates a pop-up box with the sample identifier, a description of the clinical feature, and the value of the feature for that sample. The values of clinical features are represented by a yellow and black color scheme in the heatmap. Features with both binary (+/-) and graded (e.g. age, tissue source) values are displayed. Samples with no data for a particular clinical feature are colored gray.



  Sorting
 

Clicking on a clinical feature column once sorts samples by the values of that feature. A second click reverses the order of the sort (for example, if the initial sort displayed estrogen receptor positive samples in the top rows of the heatmaps and receptor negative samples below, a second click displays them in the opposite configuration). After samples are sorted on their values for one feature, perform a secondary sort based on an additional feature by holding down the shift key while clicking on the second feature column (only applicable when the primary sort was on a binary feature, e.g. +/-, but not age). Increasingly nested sorts are possible by shift-clicking on more features. Samples can also be sorted by their genomic value for a specific location by changing the function of a heatmap click under Display Options to "Sorts" and then clicking on the desired point in the heatmap.
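The click and shift-click behavior described above amounts to a multi-key sort. The Python sketch below shows one way to model it; the function name sort_samples, the dictionary layout of a sample, and the feature names in the usage note are hypothetical, not taken from the browser.

```python
def sort_samples(samples, sort_keys):
    """Return samples ordered by a list of (feature, descending) pairs.

    The first pair models the primary sort (a plain click on a column).
    Each later pair models a nested sort added by shift-clicking another
    column. Flipping `descending` models a repeated click on that column.
    """
    ordered = list(samples)
    # Python's sort is stable, so sorting by the last key first and the
    # primary key last produces the nested ordering.
    for feature, descending in reversed(sort_keys):
        ordered.sort(key=lambda s, f=feature: s[f], reverse=descending)
    return ordered
```

For example, sort_samples(samples, [("ER", True), ("PR", True)]) places ER-positive samples first and, within each ER group, PR-positive samples first, assuming both features are coded as 1 and 0.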



  Feature Configuration Panel
 

Clicking on the blue bar to the right of the Clinical Feature Heatmap opens the Feature Settings immediately below the heatmaps. The features displayed and their order in the Clinical Heatmap can be set here. Additional features are added by selecting them from the drop-down menu under the Select Features heading. Clinical features are removed by clicking the delete icon next to the feature name and can be rearranged by dragging and dropping in the list. The Update Features link must be clicked before changes to the clinical heatmap take effect.

Subgroups based on feature values can be specified through Feature Settings. Subgroups allow identification of both broad and very limited sets of samples within the data. The subgrouping user interface to the right of the drop-down menu is activated when a feature is selected from the list. It displays the full name of the selected feature followed by a red and a green group panel each containing a representation of the values encoded by the feature. These take two forms. For discrete values, a menu of options is shown. These can be selected individually or in multiples using shift- or control-click. When a feature represents a range of values, such as age, a slider is used. Moving the slider arrows changes the bounds. Subgroups of the dataset are created by selecting the desired feature and assigning some or all of its values to either the red or green subgroup. After changes are made, the current subgrouping is created or saved by clicking on the Add Subgrouping button. It then appears at right under Current Subgroups. This can be repeated with multiple features (see the Examples section to view additional applications of subgrouping). A feature value is deleted from a subgroup by clicking the delete icon to the right of the value under Current Subgroups. Red and green bars between the genomic and clinical heatmap correspond to the samples in each subgroup. Note that specification of subgroups that do not exist in the dataset will fail to generate these bars. Several common statistical tests as well as a multiple hypothesis adjustment are available to facilitate comparison between subgroups.
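As a rough illustration of how subgroup membership could be decided, the Python sketch below treats each subgroup as a set of per-feature criteria. Everything here is an assumption made for the example: the function name assign_subgroup, the use of sets for discrete selections and (low, high) tuples for slider bounds, inclusive bounds, combining multiple features with AND, and checking the red group first.

```python
def assign_subgroup(sample, red_criteria, green_criteria):
    """Return "red", "green", or None for one sample (hypothetical helper).

    Each criteria dict maps a feature name to either a set of selected
    discrete values or a (low, high) tuple of slider bounds.
    """
    def matches(criteria):
        for feature, allowed in criteria.items():
            value = sample.get(feature)
            if value is None:
                return False  # samples lacking the feature join no subgroup
            if isinstance(allowed, tuple):
                low, high = allowed  # slider bounds, assumed inclusive
                if not low <= value <= high:
                    return False
            elif value not in allowed:  # discrete menu selection
                return False
        return True

    # Assumption: a sample matching both groups is placed in the red group.
    if red_criteria and matches(red_criteria):
        return "red"
    if green_criteria and matches(green_criteria):
        return "green"
    return None
```

If no sample satisfies a group's criteria, that group is simply empty, which corresponds to the case noted above where no subgroup bar is drawn.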

The parameters used when performing the tests in the browser are listed here, and additional details on the application of specific tests can be found in the NIST Engineering Statistics Handbook. Clicking on the Generate Statistics button after selecting the appropriate test displays the Statistical Track below the Genomic Heatmap, showing a logarithmic plot of the p-values for each probe. Because the plot is logarithmic, smaller p-values produce taller bars. Bars for p-values of less than 0.05 are colored to indicate significance; all other bars are shaded gray. If the test is inapplicable, the Statistical Track will not appear. Clicking on the blue bar again closes the Feature Configuration Panel; neither the changes to the feature display nor the subgroups are lost.
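The relationship between p-values, bar height, and significance coloring can be sketched as follows. This Python example assumes the bar height is -log10(p) and uses a Bonferroni adjustment as one example of a multiple hypothesis correction; the function name statistical_track and the exact scaling are illustrative assumptions, since the browser's own parameters are documented separately.

```python
import math


def statistical_track(p_values, bonferroni=False, alpha=0.05):
    """Turn per-probe p-values into bar heights and significance flags.

    Assumption: bar height is -log10(p). With `bonferroni=True`, each
    p-value is multiplied by the number of probes and capped at 1.0.
    """
    n = len(p_values)
    track = []
    for p in p_values:
        adjusted = min(p * n, 1.0) if bonferroni else p
        height = -math.log10(adjusted) if adjusted > 0 else float("inf")
        track.append({
            "p": adjusted,
            "height": height,
            "significant": adjusted < alpha,  # colored bar versus gray bar
        })
    return track
```

With three probes and p-values of 0.001, 0.02, and 0.5, the first two are significant before adjustment, but only the first remains significant after the Bonferroni adjustment.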



  Display Options
 

Display Options are located just beneath the heatmap box. When data is displayed in the default Chromosomes view and the Heatmap Click option is set to Zooms, the genomic range visible in the heatmap is indicated in the Position text box and its calculated size in base pairs immediately to the right. A range can be manually entered into the text box and visualized by clicking the Update Display Settings button. This method is used when moving from a region in the Genome Browser to the same region in the Cancer Genomics Browser. To view the RefSeq Gene Track, consisting of genes and their intron/exon structure, select the Show option from the corresponding drop-down menu. Clicking on a value of a probe in the genomic heatmap will either zoom in on that genomic region or sort based on the value. Toggle between these two effects using the Heatmap Click drop-down menu. Zooming is only active when data is displayed in the Chromosomes view. If the heatmap is zoomed to display data at the single-chromosome level, the button linking to that region in the Genome Browser becomes active.
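The size shown next to the Position text box can be reproduced from the range itself. The Python sketch below assumes the position is written in the chrN:start-end style used by the UCSC Genome Browser, with 1-based inclusive coordinates and optional thousands separators; the function name parse_position is hypothetical, and the browser may accept other forms.

```python
import re

# Assumed format: chromosome name, colon, start, hyphen, end.
_POSITION = re.compile(r"^(chr\w+):([\d,]+)-([\d,]+)$")


def parse_position(text):
    """Parse a position string into (chromosome, start, end, size_in_bp)."""
    match = _POSITION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized position: {text!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    # With 1-based inclusive coordinates, both endpoints count toward the size.
    return chrom, start, end, end - start + 1
```

A string copied from the Genome Browser, such as one of the form chr9:1,000-2,000, would parse to a size of 1,001 bp under these assumptions.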

Data for defined sets of genes such as components of a particular biological pathway can also be displayed by selecting Genesets from the Display As drop-down menu. This brings up the Geneset option menus and loads the default geneset from the Gene Ontology database of G1/S transition of mitotic cell cycle (GO:0000082). In the geneset view, each gene occupies equal space in the heatmap. If multiple probes from the study map to the same gene, they are scaled into columns within the space allocated for that gene. This is designed to facilitate cross-dataset comparison. Mousing-over the data for a sample in the heatmap produces a pop-up box with information about the sample including sample ID, probe ID at that locus, corresponding gene name from the geneset, and value.

Genesets can also be sorted and subgrouped as with the whole-genome Chromosome view. Note that when datasets with multiple probes for a gene are sorted by that gene, the sort is based on the average value of all probes.

Genesets are added to the display by selecting from the database of pre-existing genesets or by searching for and adding genes individually or in a list to create a new set. Genesets to be displayed are shown under the Display heading on the right of the Geneset menus and removed by clicking the delete icon next to the geneset name. After adding or removing a geneset, the Update button is used to update the data display. Grey bars in the display panel separate individual genesets.

The browser includes a database of biological pathway- and process-specific genesets from well-established sources, including GSEA Molecular Signatures Database, KEGG, Gene Ontology, BioCarta, and NCI-Nature. Select from these genesets using the interface in the Existing Genesets tab. They can be searched either by the name of the geneset or a gene within the set (using HUGO nomenclature). Once a search string is entered, genesets matching that search are shown immediately below. Select a geneset to view a list of genes in the set and add it to the Display.

User-defined genesets can be created in two ways. In either case HUGO gene names are used to identify genes. The Gene Search tab allows the user to search for individual genes and compile them into a list. Entering a string into the search box will bring up all genes with that string in their name. Select the genes to add them to the list at the right of the search box. Remove them by clicking on the delete icon next to the gene name. Alternatively, enter a list of HUGO gene names (delimited by comma or line return) into the text box under the Gene List tab and click the button to Validate & Add. The validated list will appear to the right. After completing the gene list by either method, name the geneset in the dark blue box above the list and click the save & add geneset button. User-defined genesets cannot be modified once they are saved and do not persist beyond the current session. The geneset will appear under the display heading according to the selected name prefaced by "user_" to avoid ambiguity with pre-loaded genesets.
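The geneset view layout described in this section, where every gene receives equal space and multiple probes share their gene's space, can be sketched in Python together with the average-based sort value. The function names geneset_columns and gene_sort_value, the total_width parameter, and the equal split among probes are assumptions made for illustration.

```python
def geneset_columns(genes, probes_by_gene, total_width=1.0):
    """Give each probe a (start, end) span inside its gene's equal-width slot."""
    if not genes:
        return {}
    slot = total_width / len(genes)  # every gene gets the same width
    spans = {}
    for i, gene in enumerate(genes):
        probes = probes_by_gene.get(gene, [])
        for j, probe in enumerate(probes):
            width = slot / len(probes)  # probes split the gene's slot evenly
            spans[probe] = (i * slot + j * width, i * slot + (j + 1) * width)
    return spans


def gene_sort_value(sample_values, probes):
    """Average one sample's values over all probes that map to a gene."""
    present = [sample_values[p] for p in probes if sample_values.get(p) is not None]
    return sum(present) / len(present) if present else None
```

Because a gene's width does not depend on how many probes a study has for it, the same geneset lines up column for column across datasets measured on different platforms.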



  Saving and Loading Layouts
 

After configuring a preferred session layout, it may be saved by clicking the Save button at the bottom of the page. A saved session layout must be loaded using the same computer and browser on which it was saved. Upon re-entering the browser, simply click the Load button located adjacent to Save. Currently, only a single session configuration per user can be stored.



  Examples
 

Example 1: Comparing the overall genomic profile of samples with specific clinical characteristics within a dataset
Displaying subgroups in Summary View allows the user to visualize genomic regions that may be differentially regulated in tumors with distinct clinical feature values. For example, to compare the profile of ER+/Her2- tumors from a particular dataset (here the default breast cancer data) with those that are ER-/Her2+, create the appropriate subgroups (Group 1: ER+ and Her2-; Group 2: ER- and Her2+) and select Summary View for the dataset. The Summary Views for the subgroups are displayed in vertical panels to facilitate comparison. Here differences in genomic amplifications and deletions are shown.

It is useful to initially determine the proportion of all samples that fall into each subgroup in Heatmap view, as in Summary View the Clinical Feature Panels of both subgroups have the same height regardless of the number of samples in each.

Example 2: Comparing the clinical feature profile of samples with specific clinical characteristics within a dataset
The clinical profile of a subset of samples can also be viewed using subgroups. For example, to compare the clinical features of tumors in patients of different age groups, create a subgroup based on age (here, Group 1: 70-82 years; Group 2: 23-30 years). Select Summary View for the desired dataset. The Clinical Feature Panel displays feature values as a proportion of samples within each subgroup. In this case, none of the tumors from the older patient group are ER+ or PR+, while none of the tumors from patients in the younger group are Her2+.
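The per-subgroup proportions shown in the Clinical Feature Panel can be approximated with a short Python sketch. The function name feature_proportions and the dictionary layout of a sample are hypothetical, and excluding samples with missing data from the denominator is an assumption about how the panel treats them.

```python
from collections import Counter


def feature_proportions(samples, feature):
    """Fraction of samples carrying each value of `feature` in one subgroup.

    Assumption: samples with no data for the feature are left out of both
    the counts and the denominator.
    """
    counts = Counter(s[feature] for s in samples if s.get(feature) is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: n / total for value, n in counts.items()}
```

Calling this once per subgroup and per feature gives the values to compare side by side, for instance the fraction of ER+ tumors in the older and younger age groups.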



Example 3: Identifying a gene with levels that deviate significantly in a dataset and sorting on that gene to look for correlation to clinical feature values and overall genomic profile

In Summary View, the user can look for probes with significantly deviant values. To identify a particular probe, zoom in on its location. Here, a probe on chromosome 9 is shown.

At this point, selecting Heatmap View and changing Heatmap Click to Sort allows the user to click on the isolated probe and sort the samples based on its value.

This also sorts the clinical features based on the probe value. In this example, the majority of ER+ tumors express lower relative levels of the selected probe than ER- samples.

The gene the probe maps to can be identified by clicking the button to view the region in the UCSC Genome Browser and then clicking on the desired gene track.

This brings up the Description Page for the gene in the Genome Browser, in this case Annexin 1.

Alternatively, after sorting on the sample value of the desired probe, switch Heatmap Click back to Zoom and shift-click to zoom out and see the entire genomic heatmap sorted by Annexin 1 expression.

Example 4: Comparing clinical features and genomic data of specific genesets by sorting and subgrouping
To determine how the genomic data (in this example gene expression from three breast cancer studies) of specific genesets (here three sets of genes predictive of tumor outcome) correlates with particular clinical features, first display the heatmap of each dataset. Select genesets from the existing list or create them. In this case the van't Veer breast cancer outcome up and down and the TFAC30 genesets are searched for by name and the display updated.



Open the Clinical Features Panel for each dataset and choose the features to display (here ER status and metastasis/relapse/response), order them as desired, and Update Features.

Sort the features first by ER status (click) and then by clinical outcome (shift+click). Note that for the third dataset, complete response is the reverse outcome of the first two dataset features (metastasis/relapse); repeating the shift+click for this feature reverses the sort so that all three datasets are consistent.

Next, create subgroups of ER status (Group 1: ER+, Group 2: ER-) and generate Wilcoxon statistics with a Bonferroni correction for each dataset. The gene expression data can now be compared, revealing a correlation between ER status and prognosis.



  Additional Resources
 

Video Tutorial -- A 10-minute video tutorial is available to familiarize users with the basic features of the Cancer Genomics Browser.

Genome-cancer mailing list -- User questions are posted to and answered on the Cancer Genomics mailing list.

Contact -- Feedback and questions can be sent directly to the Cancer Genomics Browser staff.