Marker Scoring Method: Tukey Honest Significant Difference

Description

The scoring method markerScoreHSD performs pairwise comparisons between groups of pure samples and scores each comparison using Tukey's Honest Significant Difference p-values (see TukeyHSD).

The selectMarkers method for markerScore_HSD objects selects, within each cell type separately, the markers with the lowest aggregated Tukey HSD p-value. By default, the p-values are aggregated by taking their maximum (statistic = max).
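
As a rough illustration of the idea, the sketch below uses only base R (aov and TukeyHSD from the stats package) on one simulated feature. It is not the package's implementation: the cell type labels, the simulated values and the variable names are assumptions made for the example.

```r
# Illustrative sketch with base R only; not the markerScoreHSD implementation.
set.seed(1)

# hypothetical grouping: 3 cell types with 5 pure samples each
cell_type <- factor(rep(c("A", "B", "C"), each = 5), levels = c("A", "B", "C"))

# one simulated feature (log scale), over-expressed in cell type A
expr <- c(rnorm(5, mean = 10), rnorm(5, mean = 6), rnorm(5, mean = 6.5))

# one-way ANOVA followed by Tukey's HSD on all pairs of cell types
hsd <- TukeyHSD(aov(expr ~ cell_type))$cell_type
pval <- hsd[, "p adj"]

# keep the comparisons that involve cell type A ("B-A" and "C-A")
involves_A <- grepl("A", rownames(hsd), fixed = TRUE)

# aggregate with max: the score is the least significant of these comparisons
score_A <- max(pval[involves_A])
```

With max as the aggregation statistic, a feature gets a low score only if it differs significantly from every other cell type, which makes it a conservative choice.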

Usage

markerScoreHSD(object, data, log = !is_logscale(object), lbase = 2, verbose = FALSE)

## S3 method for class 'markerScore_HSD'
selectMarkers(x, data, statistic = max, ...)

Arguments

verbose
verbosity level, usually TRUE or FALSE.
statistic
method used to aggregate p-values of the pairwise comparisons with each of the other cell types into a single numeric score.
object
A numeric matrix, an object of class ExpressionSet, or a MarkerList-class object.
data
main extra data used by the scoring and selection methods. If object is a matrix-like object, data is generally a factor or grouping variable, given as a vector, that defines the cell type of each (pure) sample. If data is a vector, it is converted into a factor whose levels are in their order of appearance in data, using factor(data, levels=unique(data)). This gives levels in an order that is consistent with the order of the samples. If object is a MarkerList object, then data is generally a matrix-like object that contains expression data.
log
a logical specifying whether the data should be log-transformed before computing the p-values. The default is to log-transform the data if it is not already in log scale, as determined by the function is_logscale.
lbase
log base to use if the data is log-transformed.
x
data object from which the markers are selected, as returned by extractMarkers(..., format='raw'). The type of x depends on the scoring method used to compute it.
...
other parameters passed to the scoring method in extractMarkers and selectMarkers, or that define default arguments when defining a scoring method with markerScoreMethod.
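
The conversion described for the data argument can be checked with base R alone. The following sketch uses hypothetical cell type labels to show how factor(data, levels=unique(data)) differs from the default level ordering:

```r
# hypothetical cell type labels, one per pure sample
data <- c("Tcell", "Bcell", "Tcell", "Monocyte", "Bcell")

# default conversion: levels are sorted alphabetically
f_default <- factor(data)
levels(f_default)
## [1] "Bcell"    "Monocyte" "Tcell"

# conversion described above: levels follow their order of appearance
f_ordered <- factor(data, levels = unique(data))
levels(f_ordered)
## [1] "Tcell"    "Bcell"    "Monocyte"
```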

Details

The scores are returned in a matrix, with features in rows and cell types in columns, which contains the HSD p-values of the comparisons between the most expressing cell type and each of the other cell types. Each row contains one NA value, which identifies the column of the feature's most expressing cell type.

Features whose expression is not consistently higher in one cell type than in any other cell type are discarded.
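
The layout of the score matrix can be mimicked with base R. The sketch below is only an approximation under stated assumptions: it takes the most expressing cell type to be the one with the highest mean, it runs on simulated data, the helper score_row is hypothetical, and it does not reproduce the filtering of inconsistent features.

```r
# Illustrative sketch with base R only; not the package's implementation.
set.seed(42)
types <- c("A", "B", "C")
cell_type <- factor(rep(types, each = 4), levels = types)

# simulated log-scale expression: 3 features x 12 pure samples
expr <- matrix(rnorm(36, mean = 6), nrow = 3,
               dimnames = list(paste0("gene", 1:3), NULL))
# make each feature a marker of one cell type
expr[1, cell_type == "A"] <- expr[1, cell_type == "A"] + 5
expr[2, cell_type == "B"] <- expr[2, cell_type == "B"] + 5
expr[3, cell_type == "C"] <- expr[3, cell_type == "C"] + 5

# hypothetical helper: one row of the score matrix for a single feature
score_row <- function(y, g) {
  # assumption: the most expressing cell type is the one with the highest mean
  top <- names(which.max(tapply(y, g, mean)))
  hsd <- TukeyHSD(aov(y ~ g))$g
  out <- setNames(rep(NA_real_, nlevels(g)), levels(g))
  for (other in setdiff(levels(g), top)) {
    # TukeyHSD labels a comparison "<later level>-<earlier level>"
    labels <- c(paste(other, top, sep = "-"), paste(top, other, sep = "-"))
    key <- intersect(labels, rownames(hsd))
    out[other] <- hsd[key, "p adj"]
  }
  out  # the entry for the top cell type stays NA
}

# features in rows, cell types in columns
scores <- t(apply(expr, 1, score_row, g = cell_type))
```

Each row of scores holds one NA, in the column of the cell type with the highest mean expression, and the remaining entries are the HSD p-values against that cell type.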

Examples


# generate data from pure cell type samples
x <- rpure(3)
x
## ExpressionMix (storageMode: lockedEnvironment)
## assayData: 100 features, 60 samples 
##   element names: exprs 
## protocolData: none
## phenoData
##   sampleNames: 1 2 ... 60 (60 total)
##   varLabels: CellType
##   varMetadata: labelDescription
## featureData: none
## experimentData: use 'experimentData(object)'
## Annotation:  
## Composition:  (3 total)
aheatmap(x, annCol=TRUE)

# extract markers
ml <- extractMarkers(x, x$CellType, method='HSD')
# check score/p-value distribution
hist(ml)

# plot most significant ones
profplot(ml[ml < 0.0001], x, split=TRUE)
## Warning: 'x' is NULL so the result will be NULL
## Warning: 'x' is NULL so the result will be NULL
## Warning: 'x' is NULL so the result will be NULL