Marker Scoring Method: Abbas et al. (2009)

Description

markerScoreAbbas implements the scoring/selection method proposed by Abbas et al. (2009), to select marker genes from pure cell type samples.

Usage

markerScoreAbbas(object, data, statistic = "p.value", ntop = 2, log = !is_logscale(object), 
  lbase = 2, vsall = FALSE)

Arguments

statistic
statistic to use as a score. For each comparison between the most expressing cell type (highest within-group mean) and the i-th most expressing cell type, the method computes the following quantities:
  1. p.value: p-value (t-test).
  2. dm: difference in means.
  3. statistic: statistic value (t statistic).
  4. dmM: minimum difference = min(top) - max(top).
  5. fold: fold change in means.
  6. mMfold: minimum fold change = min(top)/max(top).
  7. top: group index of the i-th most expressing cell type. It is relative to the levels of data.
ntop
the number of groups for which statistics should be computed (>= 2). If ntop=2, only statistics between the highest and second-highest expressing cell-type are computed. If ntop=3, statistics between the highest and third-highest are also computed, and so on. Use ntop=Inf to compare the highest-expressing group to all other groups.
log
a logical specifying whether the data should be log-transformed before computing the p-values. The default is to log-transform the data if it is not already in log-scale, as determined by the function is_logscale.
lbase
log base to use if the data is log-transformed.
vsall
logical used when ntop=2 that indicates whether the comparison should be carried out between the highest-expressing cell-type and all other cell-types taken together.
object
A numeric matrix, an object of class ExpressionSet, or a MarkerList-class object.
data
main extra data used by the scoring and selection methods. If object is a matrix-like object, data is generally a factor or grouping variable, as a vector, that defines the cell-type for each -- pure -- sample. If a vector, this argument is converted into a factor with the levels in their order of appearance in data, by factor(data, levels=unique(data)). This is to obtain levels in an order that is consistent with the samples' order. If object is a MarkerList object, then data is generally a matrix-like object that contains expression data.
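The conversion of a grouping vector into a factor can be illustrated with base R alone. The sample labels below are hypothetical; only the call factor(data, levels=unique(data)) comes from the description above.

```r
# Hypothetical cell-type labels for six pure samples (illustration only)
data <- c("Tcell", "Tcell", "Bcell", "Bcell", "NK", "NK")

# Default behaviour of factor(): levels are sorted alphabetically
sorted_levels <- levels(factor(data))
print(sorted_levels)   # "Bcell" "NK" "Tcell"

# Conversion described above: levels follow the order of appearance,
# so level 1 is the cell type of the first sample
groups <- factor(data, levels = unique(data))
print(levels(groups))  # "Tcell" "Bcell" "NK"
```

Group indices such as top are relative to these levels, so index 1 would refer to "Tcell" here rather than "Bcell".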

Value

extractMarkers: a numeric matrix with the following named columns:

  • top: the index of the highest-expressing cell-type, as defined by the levels of the factor derived from argument data.
  • p.value2: p-value from the comparison between the highest-expressing (top) and second highest-expressing (second top) cell-type.
  • dm2: difference in mean expression between the top and second top cell-type.
  • dmM2: minimum difference of expression between the top and second top cell-type.
  • fold2: fold change in mean expression between the top and second top cell-type.
  • mMfold2: minimum fold change in expression between the top and second top cell-type.
  • dm3, dmM3, fold3, mMfold3: the same quantities computed between the top and third-top cell-type.

The result matrix has an attribute 'types' that contains the levels of the original (or converted) grouping factor.
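As a rough sketch of how such a result might be read, the snippet below builds a mock matrix by hand (it does not call markerScoreAbbas) with two of the documented columns and a 'types' attribute, then maps the top index back to a cell-type label. All gene names, values and the 0.05 cutoff are hypothetical.

```r
# Mock result mimicking the documented layout (hypothetical values)
res <- cbind(top = c(2, 1, 3), p.value2 = c(0.001, 0.020, 0.300))
rownames(res) <- c("geneA", "geneB", "geneC")
attr(res, "types") <- c("Tcell", "Bcell", "NK")

# Map the numeric index in column 'top' back to a cell-type label
top_type <- attr(res, "types")[res[, "top"]]
names(top_type) <- rownames(res)

# Keep genes whose top vs. second-top p-value is below an arbitrary cutoff
selected <- top_type[res[, "p.value2"] < 0.05]
print(selected)
```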

Details

The method ‘Abbas’ uses a t-test approach, and therefore assumes that the data contains at least 2 pure samples per cell-type. It implements the method from Abbas et al. (2009):

"[...] top differentially expressed (based on 95% fold change confidence intervals from Student's T-test) probesets were determined by comparing each probe's highest-expressed group with the next highest-expressed group in order to find probesets that are good markers for each cell population. This step was repeated with comparison between the top group and the third-highest group in order to also include probesets that were strong markers for two cell populations."

For each gene, the highest-expressing cell type is determined by ordering the cell types by mean expression. Comparisons and p-value computations are performed using the fast t-test implementation from rowttests in the genefilter package.
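The per-gene logic can be sketched in base R for a single gene. This is not the package's implementation: it uses stats::t.test with equal variances as a stand-in for rowttests, the expression values are invented, and dmM2 is computed under the assumption that the minimum difference is the smallest value in the top group minus the largest value in the second-top group.

```r
# Invented log2 expression for one gene: 3 cell types, 3 pure samples each
x <- c(8.1, 8.3, 7.9,   # cell type A
       5.0, 5.2, 4.9,   # cell type B
       6.1, 6.0, 6.3)   # cell type C
groups <- factor(rep(c("A", "B", "C"), each = 3), levels = c("A", "B", "C"))

# Order cell types by within-group mean expression, highest first
means <- tapply(x, groups, mean)
ord <- order(means, decreasing = TRUE)
top <- ord[1]      # index of the highest-expressing cell type
second <- ord[2]   # index of the second highest-expressing cell type

x_top <- x[as.integer(groups) == top]
x_second <- x[as.integer(groups) == second]

# Equal-variance two-sample t-test between top and second-top groups
# (stand-in for the row-wise test used by the package)
tt <- t.test(x_top, x_second, var.equal = TRUE)

score <- c(top = top,
           p.value2 = tt$p.value,
           dm2 = mean(x_top) - mean(x_second),
           dmM2 = min(x_top) - max(x_second))  # assumed interpretation
print(score)
```

With ntop=3, the same comparison would be repeated between the top group and ord[3].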

References

Abbas AR, Wolslegel K, Seshasayee D, Modrusan Z and Clark HF (2009). "Deconvolution of blood microarray data identifies cellular activation patterns in systemic lupus erythematosus." _PloS one_, *4*(7), pp. e6098. ISSN 1932-6203.