Heatmaps Highlighting Markers

Description

The function markermap draws a heatmap of reference expression data (e.g., the expression matrix from pure samples or estimated cell-specific signatures), where markers are annotated by coloured bands on the left-hand side of the heatmap.

basismarkermap calls markermap with arguments tuned so that: no column reordering is performed, the rows are scaled to sum up to one, the heatmap only shows marker expression values, and the markers are placed according to their most expressing estimated signature. This view is meant to help in assessing the validity of deconvolution results, when known markers are available: ideally marker row annotations should be composed of monochrome monoblocks.

Usage

markermap(object, data, ...)

## S4 method for signature 'MarkerList,matrix'
markermap(object, data, annCol = NA, annColors = NA, annRow = NA, view = "split",
      subsetRow = NULL, scale = "row", color = "YlOrRd:100", Rowv = TRUE, Colv = NA,
      distfun = "correlation", hclustfun = "average", ...)

basismarkermap(object, data, scale = "r1", view = "predict", subsetRow = TRUE, Rowv = NA, 
      labRow = NA, ...)

Arguments

object
a MarkerList object
data
reference data object, whose values are used in the heatmap.
subsetRow
this argument acts as in aheatmap, but if subsetRow=TRUE, then the heatmap is limited to the markers only.
...
graphical parameters for the text used in plot. Parameters passed to grid.text, see gpar.
annCol
specifications of column annotation tracks displayed as coloured rows on top of the heatmaps. The annotation tracks are drawn from bottom to top. A single annotation track can be specified as a single vector; multiple tracks are specified as a list, a data frame, or an ExpressionSet object, in which case the phenotypic data is used (pData(eset)). Character or integer vectors are converted and displayed as factors. Unnamed tracks are internally renamed into Xi, with i being incremented for each unnamed track, across both column and row annotation tracks. For each track, if no corresponding colour is specified in argument annColors, a palette or a ramp is automatically computed and named after the track's name.
annColors
list for specifying annotation track colors manually. It is possible to define the colors for only some of the annotations. Check examples for details.
annRow
specifications of row annotation tracks displayed as coloured columns on the left of the heatmaps. The annotation tracks are drawn from left to right. The same conversion, renaming and colouring rules as for argument annCol apply.
scale
character indicating how the values should be scaled in either the row direction or the column direction. Note that the scaling is performed after row/column clustering, so that it has no effect on the row/column ordering. Possible values are:
  • "row": center and standardize each row separately to row Z-scores
  • "column": center and standardize each column separately to column Z-scores
  • "r1": scale each row to sum up to one
  • "c1": scale each column to sum up to one
  • "none": no scaling
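The scaling modes listed above can be illustrated in base R. This is only a sketch on a made-up toy matrix, and it assumes that "row" standardizes with the sample standard deviation; it is not the code used internally by the heatmap function.

```r
# Toy expression matrix (made up): 3 features (rows) x 4 samples (columns).
x <- matrix(c(1, 2, 3, 4,
              2, 2, 2, 6,
              0, 5, 5, 10),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("g", 1:3), paste0("s", 1:4)))

# "row": centre and standardize each row separately (row Z-scores).
# scale() works on columns, hence the double transpose.
row_z <- t(scale(t(x)))

# "r1": divide each row by its sum, so that every row sums to one.
r1 <- sweep(x, 1, rowSums(x), "/")

# "c1": divide each column by its sum, so that every column sums to one.
c1 <- sweep(x, 2, colSums(x), "/")

rowSums(r1)
colSums(c1)
```

With scale = "r1", as used by basismarkermap, each cell shows the share of a marker's total expression that falls in a given column.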
color
colour specification for the heatmap. In aheatmap the default is palette '-RdYlBu2:100', i.e. reversed palette 'RdYlBu2' (a slight modification of RColorBrewer's palette 'RdYlBu') with 100 colours; markermap itself defaults to 'YlOrRd:100' (see Usage). Possible values are:
  • a character/integer vector of length greater than 1 that is directly used and assumed to contain valid R color specifications.
  • a single color/integer (between 0 and 8)/other numeric value that gives the dominant colors. Numeric values are converted into a palette by rev(sequential_hcl(2, h = x, l = c(50, 95))). Other values are concatenated with the grey colour '#F1F1F1'.
  • one of RColorBrewer's palette names, or one of 'RdYlBu2', 'rainbow', 'heat', 'topo', 'terrain', 'cm'.
When the colour palette is specified with a single value that is negative or preceded by a minus ('-'), the reversed palette is used. The number of breaks can also be specified after a colon (':'). For example, '-RdYlBu2:100' specifies the reversed palette 'RdYlBu2' with 100 colours.
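To make the single-string syntax concrete, the sketch below splits a specification into its palette name, reversal flag and number of breaks. The helper parse_palette_spec is hypothetical and written only for illustration; it is not a function of the package, and the package's own parsing may differ.

```r
# Hypothetical helper (not part of the package): split a palette
# specification such as "-RdYlBu2:100" into its three components.
parse_palette_spec <- function(spec) {
  stopifnot(is.character(spec), length(spec) == 1L)
  # A leading minus requests the reversed palette.
  reversed <- startsWith(spec, "-")
  body <- sub("^-", "", spec)
  # An optional ":n" suffix gives the number of breaks.
  parts <- strsplit(body, ":", fixed = TRUE)[[1]]
  n <- if (length(parts) > 1L) as.integer(parts[2]) else NA_integer_
  list(name = parts[1], reversed = reversed, n = n)
}

str(parse_palette_spec("-RdYlBu2:100"))
str(parse_palette_spec("YlOrRd:100"))
```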
Rowv
clustering specification(s) for the rows. It specifies the distance/clustering/ordering/display parameters to be used for the rows only. Possible values are:
  • TRUE or NULL (to be consistent with heatmap): compute a dendrogram from hierarchical clustering using the distance and clustering methods distfun and hclustfun.
  • NA: disable any ordering. In this case, and if not otherwise specified with argument revC=FALSE, the heatmap shows the input matrix with the rows in their original order, from the first row on top to the last row at the bottom. Note that this differs from the behaviour of heatmap, but seemed to be a more sensible choice when visualizing a matrix without reordering.
  • an integer vector of length the number of rows of the input matrix (nrow(x)), that specifies the row order. As in the case Rowv=NA, the ordered matrix is shown first row on top, last row at the bottom.
  • a character vector or a list specifying values to use instead of arguments distfun, hclustfun and reorderfun when clustering the rows (see the respective argument descriptions for a list of accepted values). If Rowv has no names, then the first element is used for distfun, the second (if present) is used for hclustfun, and the third (if present) is used for reorderfun.
  • a numeric vector of weights, of length the number of rows of the input matrix, used to reorder the internally computed dendrogram d by reorderfun(d, Rowv).
  • FALSE: the dendrogram is computed using methods distfun, hclustfun, and reorderfun but is not shown.
  • a single integer that specifies how many subtrees (i.e. clusters) from the computed dendrogram should have their root faded out. This can be used to better highlight the different clusters.
  • a single double that specifies how much space is used by the computed dendrogram. That is, this value is used in place of treeheight.
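The weight-based reordering reorderfun(d, Rowv) mentioned above can be sketched with base R, assuming reorderfun behaves like stats::reorder applied to a dendrogram. The data and weights here are random and purely illustrative.

```r
set.seed(1)
# Toy matrix (random values): 6 rows to cluster.
x <- matrix(rnorm(24), nrow = 6,
            dimnames = list(paste0("g", 1:6), NULL))

# Dendrogram from Euclidean distances and average linkage.
d <- as.dendrogram(hclust(dist(x), method = "average"))

# One weight per row; here the row means, as an arbitrary choice.
w <- rowMeans(x)

# Reorder the branches according to the weights. The tree topology
# is unchanged; only the left/right order of the branches can change.
d2 <- reorder(d, w)

order.dendrogram(d)
order.dendrogram(d2)
```

Both orderings are permutations of the same six rows; the weights only decide how sibling branches are flipped.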
Colv
clustering specification(s) for the columns. It accepts the same values as argument Rowv (modulo the expected length for vector specifications), and allows specifying the distance/clustering/ordering/display parameters to be used for the columns only. Colv may also be set to "Rowv", in which case the dendrogram or ordering specifications applied to the rows are also applied to the columns. Note that this is allowed only for square input matrices, and that the row ordering is in this case by default reversed (revC=TRUE) to obtain the diagonal in the standard way (from top-left to bottom-right). See argument Rowv for other possible values.
distfun
default distance measure used in clustering rows and columns. Possible values are:
  • "correlation" and all the distances supported by dist (e.g. "euclidean").
  • an object of class dist such as returned by dist or as.dist.
hclustfun
default clustering method used to cluster rows and columns. Possible values are:
  • a method name (a character string) supported by hclust (e.g. 'average').
  • an object of class hclust such as returned by hclust
  • a dendrogram
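The defaults distfun = "correlation" and hclustfun = "average" can be approximated in base R as follows. This sketch assumes that "correlation" means one minus the Pearson correlation between rows, which is an assumption about the implementation rather than something stated on this page.

```r
set.seed(42)
# Toy matrix (random values): 8 features x 5 samples.
x <- matrix(rnorm(40), nrow = 8,
            dimnames = list(paste0("g", 1:8), paste0("s", 1:5)))

# Correlation-based distance between rows (assumed: 1 - Pearson r).
# cor() correlates columns, hence the transpose.
d <- as.dist(1 - cor(t(x)))

# Average-linkage hierarchical clustering on that distance.
hc <- hclust(d, method = "average")

# Row order implied by the clustering.
hc$order
```

Objects such as d or hc are the kind of values that can be supplied directly as distfun (class dist) or hclustfun (class hclust).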
labRow
labels for the rows.
view
a character string that specifies how markers should be annotated:
  1. "single": markers are shown in a single track.
  2. "split": each cell type is shown in a separate track.
  3. "predict": one track for each column in data is added, and each marker is shown on the track associated with the column in which it takes its maximum value.

Details

Argument view controls the way markers are annotated. In all views, markers are coloured according to their type, defined by the element of object in which they appear. Each coloured tick/cell corresponds to a different marker position.

The following views are available:

  1. single: a single row annotation is added, showing the position of each marker.
  2. split: one track per marker type is added, showing the position of each marker in its respective cell type.
  3. predict: one track per column in data is added, showing the position of each marker in the most expressing column. When data is a basis matrix obtained from deconvolution, this view is useful to check how known cell type markers (the colours) map onto estimated signatures (the annotation columns).
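The assignment behind the predict view can be sketched without any plotting: each marker is attributed to the column where it is most expressed, and the result is cross-tabulated against the known marker types. The basis matrix and the marker types below are made up for illustration.

```r
# Toy basis matrix (made up): 6 markers x 3 estimated signatures.
basis_mat <- matrix(c(9, 1, 0,
                      8, 2, 1,
                      1, 7, 2,
                      0, 6, 1,
                      2, 1, 9,
                      5, 6, 1),
                    ncol = 3, byrow = TRUE,
                    dimnames = list(paste0("m", 1:6), paste0("sig", 1:3)))

# Known cell type of each marker (hypothetical labels).
known_type <- c(m1 = "A", m2 = "A", m3 = "B", m4 = "B", m5 = "C", m6 = "C")

# Column in which each marker takes its maximum value.
predicted <- colnames(basis_mat)[max.col(basis_mat, ties.method = "first")]

# Cross-tabulate known types against the most expressing signature.
table(known = known_type, predicted = predicted)
```

In this toy case types A and B each fall entirely on one signature, while type C is split across two, which is the kind of mixed track the predict view is meant to reveal.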

Methods

  1. markermap signature(object = "MarkerList", data = "matrix"): Workhorse method for markermap.

  2. markermap signature(object = "MarkerList", data = "ExpressionSet"): The method markermap for ExpressionSet objects calls the main markermap method on the expression matrix exprs(object).

  3. markermap signature(object = "MarkerList", data = "NMF"): The method markermap for NMF objects calls the main markermap method on the basis matrix basis(object).

  4. markermap signature(object = "MarkerList", data = "NMFfitX"): The method markermap for NMFfitX objects calls the main markermap method on the best fit fit(object).

  5. markermap signature(object = "MatrixData", data = "ANY"): This method extracts and plots a list of markers from a matrix of basis signatures (in object), where each feature is associated with its most-expressing signature.

Examples


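# assumes the CellMix package, which provides markermap and basismarkermap
library(CellMix)

# random data with embedded markers
x <- rmix(3, 100, 20)
# extract the markers as a MarkerList
m <- getMarkers(x)
# default view ("split") on the underlying basis matrix
markermap(m, basis(x))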

markermap(m, x, view='single')

basismarkermap(m, rnmf(3, x))


# after real deconvolution
res <- ged(x, coef(x), 'csSAM')
basismarkermap(m, res)

markermap(m, res, view='split')
