DSection Gene Expression Deconvolution Method

Description

The DSection algorithm performs gene expression deconvolution when priors on proportions are available, using a Markov Chain Monte Carlo approach (Erkkila et al., 2010).

Usage

DSection(Y, p0, nSamples, groups = NULL, nBurnIn = 4 * nSamples, W0 = max(nrow(p0), 
  10), W_proposal = 100, samplep = TRUE, summarize = TRUE, verbose = FALSE)

Arguments

Y
<I-by-J> matrix of measurements from heterogeneous tissues, where I is the number of probes/genes/etc. and J is the number of tissues.
p0
<T-by-J> matrix of prior predictions on cell type proportions, where T is the number of cell types. Columns in p0 must be positive and add up to one.
groups
<1-by-J> vector of treatment indices, so that unique(groups) = [1,2,...,C], where C is the number of treatments, including the control (i.e., "no treatment") if available.
W0
Prior prediction weight, i.e., degree of confidence, on p0. Defines the peakedness of Dirichlet density around p0. NOTE: keep W0 >= T.
W_proposal
Transition kernel weight, defines the peakedness of Dirichlet density around p*, the old value. The higher W_proposal is, the smaller the proposal steps around p* are.
nBurnIn
Number of burn-in iterations. NOTE: keep nBurnIn > 0.
nSamples
Number of sampling iterations. NOTE: keep nSamples > 0.
samplep
logical value indicating whether to sample from the posterior for cell type proportions (TRUE) or not (FALSE). SUGGESTED USE: sample from the posterior (samplep = TRUE).
summarize
logical indicating whether only average values should be computed and returned.
verbose
logical that indicates if verbose messages should be shown.
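
The effect of the two weight arguments (W0 and W_proposal) can be illustrated with a small base R sketch. It assumes the Dirichlet density is parameterized as alpha = W * p, which is an assumption about the underlying Matlab code and not something this page states; `rdirichlet_sketch` and `p0_col` are hypothetical names introduced for illustration only. Under that assumption, the variance of each component is p(1 - p)/(W + 1), so larger weights concentrate the draws around p.

```r
# Draw n samples from Dirichlet(alpha) by normalizing gamma variates (base R only).
# Hypothetical helper, not part of CellMix.
rdirichlet_sketch <- function(n, alpha) {
  # shape = alpha is recycled, so each row receives alpha[1], ..., alpha[k]
  g <- matrix(rgamma(n * length(alpha), shape = alpha), nrow = n, byrow = TRUE)
  g / rowSums(g)
}

set.seed(1)
p0_col <- c(0.6, 0.3, 0.1)  # hypothetical prior proportions for one sample

# Assumption: weight W enters as alpha = W * p0_col
for (W in c(10, 100, 1000)) {
  draws <- rdirichlet_sketch(5000, W * p0_col)
  cat(sprintf("W = %4d: sd of first component = %.3f\n", W, sd(draws[, 1])))
}
```

The printed standard deviations shrink as W grows, which is the "peakedness" referred to in the descriptions of W0 and W_proposal.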

Value

A list with the following elements:

MCData
results from the MCMC estimation.
x_LS
Standard least-squares estimate.
groups
factor defining the groups of samples, if any was provided.
call
the call to DSection.
parameters
a list of some of the parameters used in the estimation.
p0
initial prior on proportions.

Details

In CellMix, this method is registered with the key 'DSection', and can be applied to gene expression data via the function ged.

This function uses the RcppOctave package to run the original Matlab code in Octave. The documentation was extracted from the Matlab source file, which can be found in the "scripts/DSection" subdirectory of the CellMix package.

The Matlab code requires the Octave packages statistics and optim to run properly. These packages can be downloaded from Octave-forge:

http://sourceforge.net/projects/octave/files/Octave Forge Packages/Individual Package Releases/

and installed in Octave with:

pkg install ''

or in R

install.extras('CellMix', 'octave:DSection')

References

http://informatics.systemsbiology.net/DSection/

http://www.cs.tut.fi/~erkkila2/software/dsection/index.html

Erkkila T, Lehmusvaara S, Ruusuvuori P, Visakorpi T, Shmulevich I and Lahdesmaki H (2010). "Probabilistic analysis of gene expression measurements from heterogeneous tissues." _Bioinformatics (Oxford, England)_, *26*(20), pp. 2571-7. ISSN 1367-4811.

Examples


# random global expression
x <- rmix(3, 50, 10)
dim(x)
##   Features    Samples Components 
##         50         10          3

# extract true proportions
p <- coef(x)
# add noise to proportions
p0 <- scoef(abs(p + rmatrix(p, dist=rnorm, sd=0.15)))
# check how noisy this makes the proportion estimates
profplot(p, p0)



## Not run: 
##D # requires octave package statistics to be installed (for gamfit)
##D res <- DSection(exprs(x), p0, nSamples=100, nBurnIn=1L)
##D profplot(p, res$MCData$p)
## End(Not run)
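
Beyond the visual check with profplot, agreement between two proportion matrices can also be summarized numerically. The following is a minimal base R sketch that does not depend on CellMix or Octave; `compare_proportions` is a hypothetical helper, and the synthetic matrices only mimic the shape of `p` and `p0` above (cell types in rows, samples in columns, columns summing to one).

```r
# Compare two T-by-J proportion matrices (rows: cell types, columns: samples).
# Hypothetical helper, not part of CellMix.
compare_proportions <- function(p_true, p_est) {
  stopifnot(identical(dim(p_true), dim(p_est)))
  list(
    # overall root mean squared error across all entries
    rmse = sqrt(mean((p_true - p_est)^2)),
    # Pearson correlation across samples, one value per cell type
    cor_by_type = sapply(seq_len(nrow(p_true)),
                         function(k) cor(p_true[k, ], p_est[k, ]))
  )
}

set.seed(123)
# synthetic proportions: 3 cell types, 10 samples, columns sum to one
p_true <- matrix(rexp(30), nrow = 3)
p_true <- sweep(p_true, 2, colSums(p_true), "/")
# noisy version, made positive and renormalized column-wise
p_noisy <- abs(p_true + matrix(rnorm(30, sd = 0.15), nrow = 3))
p_noisy <- sweep(p_noisy, 2, colSums(p_noisy), "/")

compare_proportions(p_true, p_noisy)
```

The same helper could be applied to `p` and the estimated proportions once DSection has been run, provided both matrices have the same dimensions.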

Author

Original Matlab code: Timo Erkkila

Wrapper function: Renaud Gaujoux