Generating Random Global Mixed Gene Expression Data

Description

The function rmix generates an ExpressionMix-class object, composed of a given number of underlying cell types. The amount of noise added to both the cell-specific signatures and the global expression values is customisable.

Usage

rmix(x, n = 100, p = 20, markers = ceiling(nrow(x)/20), min = 0, max = 20,
  mfold = 2, alpha = 1, snoise = list(mean = 0, sd = 0.05),
  gnoise = list(shape = 5, scale = 1), ...)

Arguments

x
number of true underlying cell types, or a matrix containing the signatures themselves, i.e. cell-specific expression values for each feature. For convenience, it may instead specify the markers to enforce on the signatures, as a vector or list of length > 1, in which case the argument markers must be missing.
n
number of features, i.e. genes. The argument is required if x specifies the number of signatures. If x is provided as a matrix, then n is used to subset it (x[n, ]) before simulating the global expression data.
p
number of samples
markers
specification of the number of markers to enforce on each cell type signature. This should be a value supported by enforceMarkers. Marker enforcement may be disabled with markers=NA.
min
minimum cell-specific expression value before adding noise and marker differential.
max
maximum cell-specific expression value before adding noise and marker differential.
mfold
fold change expected in cell-specific expression for marker genes.
alpha
parameter of the Dirichlet distribution from which the mixture proportions are drawn, using rdirichlet.
snoise
parameters for the normal noise added to each true underlying signature, as x + N(\mu, \sigma).
gnoise
parameters for the normal noise with inverse-gamma variance added to each feature's global expression profile, as e_{ij} + N(0, 1/\gamma_i).
...
extra arguments currently not used.
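
The arguments above describe a generative model that can be sketched in base R. This is only an illustration under stated assumptions, not the code that rmix runs: the helper name sim_mix_sketch is hypothetical, the signatures are assumed to be drawn uniformly between min and max, \gamma_i is assumed to follow a gamma distribution with the shape and scale given in gnoise, the Dirichlet draw is obtained by normalising gamma variates rather than by calling rdirichlet, and marker enforcement is left out.

```r
# Hypothetical helper, NOT the package's implementation: a base-R sketch of
# the simulation model described by the arguments above.
sim_mix_sketch <- function(r = 3, n = 100, p = 20, min = 0, max = 20,
                           alpha = 1,
                           snoise = list(mean = 0, sd = 0.05),
                           gnoise = list(shape = 5, scale = 1)) {
  # True signatures: n features x r cell types, uniform on [min, max]
  # (the uniform draw is an assumption).
  W <- matrix(runif(n * r, min, max), n, r)

  # Mixture proportions: r cell types x p samples. Each column is a
  # Dirichlet(alpha, ..., alpha) draw, obtained by normalising gamma variates.
  G <- matrix(rgamma(r * p, shape = alpha), r, p)
  H <- sweep(G, 2, colSums(G), "/")

  # Signature noise: x + N(mean, sd), added entry-wise to the signatures.
  Wn <- W + matrix(rnorm(n * r, snoise$mean, snoise$sd), n, r)

  # Global noise: one gamma_i per feature (assumed gamma-distributed), then
  # N(0, 1/gamma_i) per entry, with 1/gamma_i used as the variance.
  gam <- rgamma(n, shape = gnoise$shape, scale = gnoise$scale)
  E <- Wn %*% H + matrix(rnorm(n * p, 0, sqrt(1 / gam)), n, p)

  list(signatures = W, proportions = H, exprs = E)
}

set.seed(1)
sim <- sim_mix_sketch(r = 3, n = 100, p = 20)
dim(sim$exprs)                    # features x samples
range(colSums(sim$proportions))   # each sample's proportions sum to 1
```

Smaller values of alpha concentrate each sample on fewer cell types, while larger values give more even mixtures.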

Value

an ExpressionMix-class object that contains the true underlying signatures and proportions, stored as an NMF model.
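
To make the "NMF model" wording concrete, here is a small worked example in base R with made-up numbers. It assumes the usual NMF convention, in which the signatures form a features x cell types matrix W, the proportions form a cell types x samples matrix H, and the noise-free mixed expression is their product. The names W, H and X are illustrative only and are not slots of the returned object.

```r
# Two cell-type signatures over three features (illustrative values).
W <- matrix(c(10, 0, 5,    # signature of CL_1
               0, 8, 5),   # signature of CL_2
            nrow = 3, dimnames = list(paste0("g", 1:3), c("CL_1", "CL_2")))

# Proportions of each cell type in two samples; each column sums to 1.
H <- matrix(c(0.25, 0.75,  # sample s1
              0.50, 0.50), # sample s2
            nrow = 2, dimnames = list(c("CL_1", "CL_2"), c("s1", "s2")))

# Noise-free mixed expression: features x samples.
X <- W %*% H
X
##     s1 s2
## g1 2.5  5
## g2 6.0  4
## g3 5.0  5
```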

Examples


# 3 cell types, 100 features, 20 samples
rmix(3, 100, 20)
## ExpressionMix (storageMode: lockedEnvironment)
## assayData: 100 features, 20 samples 
##   element names: exprs 
## protocolData: none
## phenoData: none
## featureData: none
## experimentData: use 'experimentData(object)'
## Annotation:  
## Composition: 'CL_1', 'CL_2', 'CL_3' (3 total)

# from known signature matrix
s <- rmatrix(100, 5)
x <- rmix(s, p=20)
dim(x)
##   Features    Samples Components 
##        100         20          5
if( !isCRAN_timing() ){
  aheatmap(x)
}


# markers are enforced on each true signature
x <- rmix(4, 50, 20, markers=6)
if( !isCRAN_timing() ){
  basismap(x, Rowv=NA)
}

# markers may also be specified through x
x <- rmix(1:4, 50, 20)
if( !isCRAN_timing() ){
  basismap(x, Rowv=NA)
}
