Partial Gene Expression Deconvolution by Standard Least-Squares

Description

Estimates cell/tissue proportions given a known set of cell/tissue-specific expression signatures, using standard least-squares as proposed by Abbas et al. (2009).

Usage

gedAlgorithm.lsfit(..., rescale = TRUE, fit = c("ls", "nnls"))

Arguments

...
extra arguments passed to the fitting methods .nn_lsfit or .fcnnls.
rescale
logical, used when estimating proportions from signatures, that indicates whether the estimated coefficients should be scaled to sum to one (TRUE) or left as estimated by the linear regression (FALSE). This scaling is performed after the coefficients have been forced to be nonnegative.
fit
least-squares fitting method: "ls" uses lm, "nnls" uses fcnnls.
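
As a rough illustration of what rescale = TRUE means, the sketch below scales each column of a nonnegative coefficient matrix so that it sums to one. The objects coefs and props are hypothetical and the code is not taken from the package; it only assumes that samples are stored in columns, as in the output of coef() shown in the Examples.

```r
# Hypothetical nonnegative coefficients: 3 cell types (rows) x 2 samples (columns)
coefs <- matrix(c(0.8, 1.2, 0.0,
                  0.5, 0.5, 1.0), nrow = 3)

# Divide each column by its sum so that every sample's proportions sum to one
# (a column made only of zeros would give NaN and would need special handling)
props <- sweep(coefs, 2, colSums(coefs), "/")

colSums(props)
```

With rescale = FALSE, the equivalent of coefs would be returned unchanged.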

Details

The default algorithm enforces the nonnegativity of the estimated proportions with a heuristic: it fits successive regressions, each time excluding the most negative coefficient from the model, until all coefficients are nonnegative. In this case, all regressions are fitted using the function lm.
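
The heuristic can be sketched for a single sample as follows. This is a minimal illustration, not the package's .nn_lsfit: the function name nn_lm_sketch is hypothetical, and fitting without an intercept and assigning a zero coefficient to excluded signatures are assumptions made for the example.

```r
# Hypothetical helper illustrating the exclusion heuristic for one sample.
# y: numeric vector of expression values; S: signature matrix (genes x cell types)
nn_lm_sketch <- function(y, S) {
  active <- seq_len(ncol(S))   # signatures still included in the model
  beta <- numeric(ncol(S))     # excluded signatures keep a zero coefficient (assumption)
  while (length(active) > 0) {
    X <- S[, active, drop = FALSE]
    b <- unname(coef(lm(y ~ 0 + X)))   # regression without intercept (assumption)
    b[is.na(b)] <- 0                   # aliased (collinear) terms are treated as zero
    if (all(b >= 0)) {
      beta[active] <- b
      break
    }
    # exclude the signature with the most negative coefficient and refit
    active <- active[-which.min(b)]
  }
  beta
}

# Toy data: the third signature does not contribute to the mixture
set.seed(1)
S <- matrix(runif(300), nrow = 100, ncol = 3)
y <- as.numeric(S %*% c(0.6, 0.4, 0) + rnorm(100, sd = 0.05))
beta <- nn_lm_sketch(y, S)
beta
```

Each pass removes at most one signature, so at most as many regressions are fitted as there are signatures.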

An alternative least-squares fitting method is included for test/experimental purposes. It uses the fast combinatorial nonnegative least-squares method of Van Benthem et al. (2004), which Kim et al. (2007) adapted to perform nonnegative matrix factorization of gene expression data, but which was not originally intended for deconvolution. This general method is implemented in the NMF package. In this case, a single regression is fitted using the function fcnnls.

References

Abbas AR, Wolslegel K, Seshasayee D, Modrusan Z and Clark HF (2009). "Deconvolution of blood microarray data identifies cellular activation patterns in systemic lupus erythematosus." _PloS one_, *4*(7), pp. e6098. ISSN 1932-6203.

Van Benthem M and Keenan MR (2004). "Fast algorithm for the solution of large-scale non-negativity-constrained least squares problems." _Journal of Chemometrics_, *18*(10), pp. 441-450. ISSN 0886-9383.

Kim H and Park H (2007). "Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis." _Bioinformatics (Oxford, England)_, *23*(12), pp. 1495-502. ISSN 1460-2059.

Examples


# random target matrix
x <- rmatrix(100, 20)
# random cell signatures
s <- rmatrix(100, 3)

# deconvolve using standard least-squares
res <- ged(x, s, 'lsfit')
coef(res)
##        [,1]   [,2]   [,3]   [,4]    [,5]   [,6]   [,7]   [,8]   [,9]
## [1,] 0.1648 0.2875 0.2933 0.2999 0.06797 0.2663 0.2900 0.3466 0.2836
## [2,] 0.4602 0.3411 0.3873 0.2745 0.51872 0.3639 0.3119 0.2357 0.3638
## [3,] 0.3750 0.3714 0.3193 0.4256 0.41332 0.3698 0.3981 0.4177 0.3527
##       [,10]  [,11]  [,12]  [,13]  [,14]  [,15]  [,16]  [,17]  [,18]  [,19]
## [1,] 0.3396 0.2933 0.1989 0.3242 0.1042 0.2596 0.2781 0.2361 0.4365 0.3504
## [2,] 0.3838 0.2725 0.2232 0.3997 0.4502 0.3825 0.4132 0.4085 0.2498 0.2835
## [3,] 0.2766 0.4342 0.5779 0.2761 0.4456 0.3579 0.3087 0.3553 0.3136 0.3661
##       [,20]
## [1,] 0.2723
## [2,] 0.2745
## [3,] 0.4531
# signatures are not updated
identical(basis(res), s)
## [1] TRUE
## Don't show: 
    stopifnot(identical(basis(res), s))
    stopifnot( nmf.equal(res, ged(x, s, 'lsfit')) )
## End Don't show

# Fitting with fcnnls
res <- ged(x, s, 'lsfit', fit = 'nnls')
coef(res)
##        [,1]   [,2]   [,3]   [,4]    [,5]   [,6]   [,7]   [,8]   [,9]
## [1,] 0.1648 0.2875 0.2933 0.2999 0.06797 0.2663 0.2900 0.3466 0.2836
## [2,] 0.4602 0.3411 0.3873 0.2745 0.51872 0.3639 0.3119 0.2357 0.3638
## [3,] 0.3750 0.3714 0.3193 0.4256 0.41332 0.3698 0.3981 0.4177 0.3527
##       [,10]  [,11]  [,12]  [,13]  [,14]  [,15]  [,16]  [,17]  [,18]  [,19]
## [1,] 0.3396 0.2933 0.1989 0.3242 0.1042 0.2596 0.2781 0.2361 0.4365 0.3504
## [2,] 0.3838 0.2725 0.2232 0.3997 0.4502 0.3825 0.4132 0.4085 0.2498 0.2835
## [3,] 0.2766 0.4342 0.5779 0.2761 0.4456 0.3579 0.3087 0.3553 0.3136 0.3661
##       [,20]
## [1,] 0.2723
## [2,] 0.2745
## [3,] 0.4531
# signatures are not updated
identical(basis(res), s)
## [1] TRUE
## Don't show: 
    stopifnot(identical(basis(res), s))
    stopifnot( nmf.equal(res, ged(x, s, 'lsfit', fit = 'nnls')) )
## End Don't show