Generating Random NMF Models

Description

Generates NMF models with random values drawn from a uniform distribution. It returns an NMF model with basis and mixture coefficient matrices filled with random values. The main purpose of the function rnmf is to provide a common interface to generate random seeds used by the nmf function.

Usage

rnmf(x, target, ...)

S4 (NMF,numeric)
`rnmf`(x, target, ncol = NULL, keep.names = TRUE, dist = runif)

S4 (ANY,matrix)
`rnmf`(x, target, ..., dist = list(max = max(max(target, na.rm = TRUE), 
  1)), use.dimnames = TRUE)

S4 (numeric,missing)
`rnmf`(x, target, ..., W, H, dist = runif)

S4 (missing,missing)
`rnmf`(x, target, ..., W, H)

S4 (numeric,numeric)
`rnmf`(x, target, ncol = NULL, ..., dist = runif)

S4 (formula,ANY)
`rnmf`(x, target, ..., dist = runif)

Arguments

x
an object that determines the rank, dimension and/or class of the generated NMF model, e.g. a numeric value or an object that inherits from class NMF-class. See the description of the specific methods for more details on the supported types.
target
optional specification of target dimensions. See section Methods for how this parameter is used by the different methods.
...
extra arguments that allow extensions. They are passed to the next method and eventually down to nmfModel, where they are used to initialise slots that are specific to the NMF model being instantiated.
ncol
single numeric value that specifies the number of columns of the coefficient matrix. Only used when target is a single numeric value.
keep.names
a logical that indicates if the dimension names of the original NMF object x should be conserved (TRUE) or discarded (FALSE).
dist
specification of the random distribution to use to draw the entries of the basis and coefficient matrices. It may be specified as:
  • a function which must be a distribution function such as e.g. runif that is used to draw the entries of both the basis and coefficient matrices. It is passed in the dist argument of rmatrix.
  • a list of arguments that are passed internally to rmatrix, via do.call('rmatrix', dist).
  • a character string that is partially matched to ‘basis’ or ‘coef’, and specifies which matrix should be drawn randomly, the other one remaining unchanged, as in x.
  • a list with elements ‘basis’ and/or ‘coef’, which specify the dist argument separately for the basis and coefficient matrix respectively. These elements may be either a distribution function, or a list of arguments that are passed internally to rmatrix, via do.call('rmatrix', dist$basis) or do.call('rmatrix', dist$coef).
use.dimnames
a logical that indicates whether the dimnames of the target matrix should be set on the returned NMF model.
W
value for the basis matrix. data.frame objects are converted into matrices with as.matrix.
H
value for the mixture coefficient matrix. data.frame objects are converted into matrices with as.matrix.
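
The forms accepted by dist can be sketched as follows. This is an illustrative sketch only: it assumes the NMF package is installed and attached, the object names (x, a, b, d, e) are arbitrary, and the calls follow the argument descriptions above rather than any verified output.

```r
library(NMF)

x <- nmfModel(3, 10, 5)

# 1. a distribution function, used for both the basis and coefficient matrices
a <- rnmf(x, c(10, 5), dist = rnorm)

# 2. a list of arguments passed to rmatrix: here, uniform entries in [0, 5]
b <- rnmf(x, c(10, 5), dist = list(max = 5))

# 3. a character string: only the coefficient matrix is redrawn,
#    the basis matrix is expected to stay as in b
d <- rnmf(b, c(10, 5), dist = 'coef')

# 4. separate specifications for each factor
e <- rnmf(x, c(10, 5), dist = list(basis = rnorm, coef = list(max = 5)))
```

The list forms are convenient when only the range of the uniform draw needs to change, since no custom distribution function has to be written.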

Value

An NMF model, i.e. an object that inherits from class NMF-class.

Details

If necessary, extensions of the standard NMF model or custom models must define a method "rnmf,<NMF.MODEL.CLASS>,numeric" for initialising their specific slots other than the basis and mixture coefficient matrices. To benefit from the complete built-in interface, the overloading methods should call the generic version using function callNextMethod, before setting the values of the specific slots. See for example the method rnmf defined for NMFOffset-class models: showMethods(rnmf, class='NMFOffset', include=TRUE).
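
The pattern described above might look like the following minimal sketch. The class NMFscaled and its slot scale are hypothetical names introduced only for illustration; they are not part of the NMF package. The sketch also assumes that a class extending NMFstd can be passed by name in argument model.

```r
library(NMF)

# Hypothetical model: a standard NMF model with one extra numeric slot.
setClass('NMFscaled', representation(scale = 'numeric'), contains = 'NMFstd')

# rnmf method for the hypothetical model
setMethod('rnmf', signature(x = 'NMFscaled', target = 'numeric'),
  function(x, target, ...) {
    # let the built-in method draw the basis and coefficient matrices
    res <- callNextMethod()
    # then initialise the model-specific slot
    res@scale <- runif(1)
    res
  }
)

# the other rnmf methods are expected to reach the method defined above
y <- rnmf(3, 10, 5, model = 'NMFscaled')
```

Calling callNextMethod first means the custom method only has to handle its own slots, which is the same division of labour the NMFOffset method relies on.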

For convenience, shortcut methods that work directly on data.frame objects are implemented. Note, however, that converting a data.frame into a matrix may take a non-negligible amount of time for large datasets. If you use this method or other NMF-related methods several times, consider converting your data.frame object into a matrix once, when the data is first loaded.

Methods

  1. rnmf signature(x = "NMFOffset", target = "numeric"): Generates a random NMF model with offset, from class NMFOffset.

    The offset values are drawn from a uniform distribution between 0 and the maximum entry of the basis and coefficient matrices. These matrices are themselves drawn by the next suitable rnmf method, which is the workhorse method rnmf,NMF,numeric.

  2. rnmf signature(x = "NMF", target = "numeric"): Generates a random NMF model of the same class and rank as another NMF model.

    This is the workhorse method that is eventually called by all other methods. It generates an NMF model of the same class and rank as x, compatible with the dimensions specified in target. The target can be a single number or a numeric vector of length 2, to specify a square or rectangular target matrix respectively.

    The second dimension can also be passed via argument ncol, so that calling rnmf(x, 20, 10, ...) is equivalent to rnmf(x, c(20, 10), ...), but easier to write.

    The entries are uniformly drawn between 0 and max (optionally specified in ...) that defaults to 1.

    By default the dimnames of x are set on the returned NMF model. This behaviour is disabled with argument keep.names=FALSE. See nmfModel.

  3. rnmf signature(x = "ANY", target = "matrix"): Generates a random NMF model compatible and consistent with a target matrix.

    The entries are uniformly drawn between 0 and max(target). It is more or less a shortcut for: rnmf(x, dim(target), max=max(target), ...)

    It returns an NMF model of the same class as x.

  4. rnmf signature(x = "ANY", target = "data.frame"): Shortcut for rnmf(x, as.matrix(target)).

  5. rnmf signature(x = "NMF", target = "missing"): Generates a random NMF model of the same dimension as another NMF model.

    It is a shortcut for rnmf(x, nrow(x), ncol(x), ...), which returns a random NMF model of the same class and dimensions as x.

  6. rnmf signature(x = "numeric", target = "missing"): Generates a random NMF model of a given rank, with known basis and/or coefficient matrices.

    This method makes it easy to generate partially random NMF models, where one or both factors are known. Although the latter case might seem strange, it makes sense for NMF models that fit extra data other than the basis and coefficient matrices. Such data are drawn by an rnmf method defined for the model's own class, which should internally call rnmf,NMF,numeric and let it draw the basis and coefficient matrices (e.g. see NMFOffset-class and rnmf,NMFOffset,numeric-method).

    Depending on whether arguments W and/or H are missing, this method interprets x differently:

    • W provided, H missing: x is taken as the number of columns that must be drawn to build a random coefficient matrix (i.e. the number of columns in the target matrix).

    • W is missing, H is provided: x is taken as the number of rows that must be drawn to build a random basis matrix (i.e. the number of rows in the target matrix).

    • both W and H are provided: x is taken as the target rank of the model to generate.
    • Having both W and H missing produces an error, as the dimension of the model cannot be determined in this case.

    When both W and H are provided, they are reduced, if necessary and possible, to be consistent with the rank given in x, by the internal call to nmfModel.

    All arguments in ... are passed to the function nmfModel which is used to build an initial NMF model, that is in turn passed to rnmf,NMF,numeric with dist=list(coef=dist) or dist=list(basis=dist) when suitable. The type of NMF model to generate can therefore be specified in argument model (see nmfModel for other possible arguments).

    The returned NMF model has a basis matrix equal to W (if not missing) and a coefficient matrix equal to H (if not missing). Any missing factor is drawn according to the specification provided in argument dist (see method rnmf,NMF,numeric for details on the supported values for dist).

  7. rnmf signature(x = "missing", target = "missing"): Generates a random NMF model with known basis and coefficient matrices.

    This method is a shortcut for calling rnmf,numeric,missing with a suitable value for x (the rank), when both factors are known: rnmf(min(ncol(W), nrow(H)), ..., W=W, H=H).

    Arguments W and H are required. Note that calling this method only makes sense for NMF models that contain data to fit other than the basis and coefficient matrices, e.g. NMFOffset-class.

  8. rnmf signature(x = "numeric", target = "numeric"): Generates a random standard NMF model of given dimensions.

    This is a shortcut for rnmf(nmfModel(x, target, ncol, ...), dist=dist). It generates a standard NMF model compatible with the dimensions passed in target, which can be a single number or a numeric vector of length 2, to specify a square or rectangular target matrix respectively. See nmfModel.

  9. rnmf signature(x = "formula", target = "ANY"): Generates a random formula-based NMF model, using the method nmfModel,formula,ANY-method.
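
As a sketch of method 3 above (target given as a matrix), the following compares the matrix shortcut with an explicit call. It assumes the NMF package is attached; V is a hypothetical target matrix, and passing the upper bound through dist = list(max = ...) is one reading of the method signatures listed under Usage, not a confirmed equivalence.

```r
library(NMF)

V <- rmatrix(20, 13, max = 10)   # hypothetical target matrix

# rank-3 random model consistent with V:
# dimensions and upper bound of the entries are taken from V
y <- rnmf(3, V)

# roughly equivalent explicit call
z <- rnmf(3, dim(V), dist = list(max = max(V)))

dim(y)   # features, samples, rank
```

Passing the matrix directly avoids having to repeat its dimensions and range by hand.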

Examples


# random NMF model with offset
x <- rnmf(2, 3, model='NMFOffset')
x
## <Object of class:NMFOffset>
## features: 3 
## basis/rank: 2 
## samples: 3 
## offset: [ 0.8593 0.4552 0.1415 ]
offset(x)
## [1] 0.8593 0.4552 0.1415
# from a matrix
x <- rnmf(2, rmatrix(5,3, max=10), model='NMFOffset')
offset(x)
## [1] 0.1676 1.4397 0.0907 1.6305 8.4577
## random NMF of same class and rank as another model

x <- nmfModel(3, 10, 5)
x
## <Object of class:NMFstd>
## features: 10 
## basis/rank: 3 
## samples: 5
rnmf(x, 20) # square
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 3 
## samples: 20
rnmf(x, 20, 13)
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 3 
## samples: 13
rnmf(x, c(20, 13))
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 3 
## samples: 13

# using another distribution
rnmf(x, 20, dist=rnorm)
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 3 
## samples: 20

# other than standard model
y <- rnmf(3, 50, 10, model='NMFns')
y
## <Object of class:NMFns>
## features: 50 
## basis/rank: 3 
## samples: 10 
## theta: 0.5
# sanity checks
stopifnot( identical(dim(y), c(50L,10L,3L)) )
stopifnot( is(y, 'NMFns') )
# random NMF compatible with a target matrix
x <- nmfModel(3, 10, 5)
y <- rmatrix(20, 13)
rnmf(x, y) # rank of x
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 3 
## samples: 13
rnmf(2, y) # rank 2
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 2 
## samples: 13
## random NMF from another model

a <- nmfModel(3, 100, 20)
b <- rnmf(a)
# sanity check: the random model differs from the original
stopifnot( !nmf.equal(a,b) )
# random NMF model with known basis matrix
x <- rnmf(5, W=matrix(1:18, 6)) # 6 x 5 model with rank=3
basis(x) # fixed
##      [,1] [,2] [,3]
## [1,]    1    7   13
## [2,]    2    8   14
## [3,]    3    9   15
## [4,]    4   10   16
## [5,]    5   11   17
## [6,]    6   12   18
coef(x) # random
##         [,1]   [,2]   [,3]   [,4]   [,5]
## [1,] 0.66625 0.5059 0.2495 0.2319 0.1443
## [2,] 0.56969 0.2239 0.1809 0.4358 0.8978
## [3,] 0.06515 0.6946 0.9902 0.7095 0.7371

# random NMF model with known coefficient matrix
x <- rnmf(5, H=matrix(1:18, 3)) # 5 x 6 model with rank=3
basis(x) # random
##         [,1]    [,2]   [,3]
## [1,] 0.92839 0.73168 0.1459
## [2,] 0.30243 0.74252 0.4800
## [3,] 0.58870 0.46922 0.4528
## [4,] 0.90568 0.32728 0.4540
## [5,] 0.01955 0.05723 0.7322
coef(x) # fixed
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    4    7   10   13   16
## [2,]    2    5    8   11   14   17
## [3,]    3    6    9   12   15   18

# random model other than standard NMF
x <- rnmf(5, H=matrix(1:18, 3), model='NMFOffset')
basis(x) # random
##         [,1]    [,2]    [,3]
## [1,] 0.33448 0.30402 0.17421
## [2,] 0.95253 0.08696 0.21987
## [3,] 0.29234 0.86826 0.33064
## [4,] 0.24446 0.26103 0.83145
## [5,] 0.08534 0.14251 0.07539
coef(x) # fixed
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    4    7   10   13   16
## [2,]    2    5    8   11   14   17
## [3,]    3    6    9   12   15   18
offset(x) # random
## [1]  6.757  5.272  1.035 15.726 15.844
# model other than standard NMF with both factors known: only the offset is random
x <- rnmf(W=matrix(1:18, 6), H=matrix(21:38, 3), model='NMFOffset')
basis(x) # fixed
##      [,1] [,2] [,3]
## [1,]    1    7   13
## [2,]    2    8   14
## [3,]    3    9   15
## [4,]    4   10   16
## [5,]    5   11   17
## [6,]    6   12   18
coef(x) # fixed
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]   21   24   27   30   33   36
## [2,]   22   25   28   31   34   37
## [3,]   23   26   29   32   35   38
offset(x) # random
## [1]  1.139  9.845  7.353 31.626  1.325 32.649
## random standard NMF of given dimensions

# generate a random NMF model with rank 3 that fits a 100x20 matrix
rnmf(3, 100, 20)
## <Object of class:NMFstd>
## features: 100 
## basis/rank: 3 
## samples: 20
# sanity check
stopifnot( identical(dim(rnmf(3, 100, 20)), c(100L,20L,3L)) )
# generate a random NMF model with rank 3 that fits a 100x100 matrix
rnmf(3, 100)
## <Object of class:NMFstd>
## features: 100 
## basis/rank: 3 
## samples: 100
# sanity check
stopifnot( identical(dim(rnmf(3, 100)), c(100L,100L,3L)) )