calculatePathwayStatistics {sigPathway}    R Documentation

Calculate the NTk and NEk statistics

Description

Calculates the NTk and NEk statistics and the corresponding p-values and q-values for each selected pathway.

Usage

calculate.NTk(tab, phenotype, gsList, nsim = 1000,
              ngroups = 2, verbose = FALSE)
calculate.NEk(tab, phenotype, gsList, nsim = 1000,
              weightType = c("constant", "variable"),
              ngroups = 2, verbose = FALSE)

Arguments

tab a numeric matrix of expression values, with the rows and columns representing probe sets and sample arrays, respectively
phenotype a numeric vector indicating the phenotype
gsList a list containing three vectors from the output of the selectGeneSets function
nsim an integer indicating the number of permutations to use
weightType a character string specifying the type of weight to use when calculating NEk statistics, either "constant" or "variable"
ngroups an integer indicating the number of groups in the matrix
verbose a logical value indicating whether to print debugging messages to the R console
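
As a rough illustration of the argument shapes described above, the sketch below builds toy inputs and checks them before any call is made. All names ending in .toy are hypothetical, the data are random, and the requirement that phenotype has one entry per column of tab is an assumption inferred from the Examples section rather than a documented rule.

```r
## Toy inputs shaped like the arguments described above (hypothetical data)
set.seed(1)
n.probes <- 6
n.samples <- 4
tab.toy <- matrix(rnorm(n.probes * n.samples), nrow = n.probes,
                  dimnames = list(paste0("probe", 1:n.probes),
                                  paste0("array", 1:n.samples)))
phenotype.toy <- c(0, 0, 1, 1)   # assumed: one label per column of tab.toy

## Sanity checks that seem sensible before calling calculate.NTk or
## calculate.NEk (assumptions, not checks documented by the package)
stopifnot(is.matrix(tab.toy), is.numeric(tab.toy))
stopifnot(is.numeric(phenotype.toy))
stopifnot(length(phenotype.toy) == ncol(tab.toy))

## Number of distinct phenotype labels, a plausible value for ngroups
ngroups.toy <- length(unique(phenotype.toy))
```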

Details

These functions calculate the NTk and NEk statistics and the corresponding p-values and q-values for each selected pathway. The output of both functions should be used together to rank the top pathways with the rankPathways function.

Value

A list containing

ngs number of gene sets
nsim number of permutations performed
t.set a numeric vector of Tk/Ek statistics
t.set.new a numeric vector of NTk/NEk statistics
p.null the proportion of nulls
p.value a numeric vector of p-values
q.value a numeric vector of q-values
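
The components listed above can be inspected like any other R list. The sketch below uses a mock list (res.mock, with made-up values) instead of real output, and assumes that each vector holds one entry per gene set in the same order; it is not a substitute for rankPathways.

```r
## Mock result list with the documented components (all values are made up)
res.mock <- list(ngs = 4L,
                 nsim = 100L,
                 t.set = c(12.1, 3.4, 8.9, 5.0),
                 t.set.new = c(2.8, -0.3, 1.9, 0.4),
                 p.null = 0.9,
                 p.value = c(0.001, 0.62, 0.03, 0.35),
                 q.value = c(0.004, 0.62, 0.06, 0.47))

## One row per gene set (assumed ordering), sorted by increasing q-value
summary.mock <- data.frame(set = seq_len(res.mock$ngs),
                           stat = res.mock$t.set.new,
                           p.value = res.mock$p.value,
                           q.value = res.mock$q.value)
summary.mock <- summary.mock[order(summary.mock$q.value), ]
print(summary.mock)
```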

Author(s)

Lu Tian and Peter Park, with contributions from Weil Lai

References

Tian L., Greenberg S.A., Kong S.W., Altschuler J., Kohane I.S., Park P.J. (2005) Discovering statistically significant pathways in expression profiling studies. Proceedings of the National Academy of Sciences of the USA, 102, 13544-9.

http://www.pnas.org/cgi/doi/10.1073/pnas.0506577102

Examples

## Load the expression data and select the probe sets that have expression
## values greater than the trimmed mean in at least 1 out of 49 arrays
data(MuscleData)
sf <- apply(MuscleData, 2, mean, trim = 0.025)
temp <- sweep(MuscleData, 2, sf, FUN = '/')
ind.pskeep <- which(rowSums(temp > 1) > 0)
tabMD <- MuscleData[ind.pskeep, ]
probeID <- names(ind.pskeep)

rm(temp)

## Select the data to study: IBM vs. NORM _or_ DM vs. NORM
compIBM <- TRUE

if (compIBM) {
  tab <- tabMD[, c(index.NORM, index.IBM)]
  phenotype <- c(rep.int(0, length(index.NORM)), rep.int(1, length(index.IBM)))
} else {
  tab <- tabMD[, c(index.NORM, index.DM)]
  phenotype <- c(rep.int(0, length(index.NORM)), rep.int(1, length(index.DM)))
}

## Prepare the pathways to analyze
data(GenesetsU133a)
gsList <- selectGeneSets(G, probeID, 20, 500)

## Calculate NTk and weighted NEk for each gene set
## * Use a higher nsim value (e.g., 2500) for more reproducible results
nsim <- 100
ngroups <- 2
verbose <- TRUE
weightType <- "constant"
methodNames <- c("NTk", "NEk")
npath <- 25
res.NTk <- calculate.NTk(tab, phenotype, gsList, nsim, ngroups, verbose)
res.NEk <- calculate.NEk(tab, phenotype, gsList, nsim, weightType,
                         ngroups, verbose)

## Summarize results
res.pathways <- rankPathways(res.NTk, res.NEk, G, gsList, methodNames,
                             npath)
print(res.pathways)

## Get more information about the probe sets' means and other statistics
## for the top pathway in res.pathways
topIndex <- res.pathways$IndexG[1]
res.topPathway <- getPathwayStatistics(tab, phenotype, G, topIndex,
                                       ngroups, NULL, FALSE)
print(res.topPathway[[1]])
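
The probe set filter at the start of the example can be hard to follow on the full data set, so here is the same sequence of calls on a tiny made-up matrix. The objects x, sf.toy, scaled and keep are hypothetical and exist only for this illustration.

```r
## Toy matrix: 3 probe sets (rows) x 4 arrays (columns); values are made up
x <- matrix(c(1, 1, 1, 1,
              5, 5, 5, 5,
              3, 9, 3, 3),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("p1", "p2", "p3"), paste0("a", 1:4)))

sf.toy <- apply(x, 2, mean, trim = 0.025)   # per-array trimmed mean
scaled <- sweep(x, 2, sf.toy, FUN = "/")    # divide each column by its mean
keep <- which(rowSums(scaled > 1) > 0)      # above the mean in >= 1 array
names(keep)                                 # probe sets that pass the filter
```

In this toy case p1 lies below every array's trimmed mean and is dropped, while p2 and p3 each exceed it in at least one array and are kept.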

[Package sigPathway version 1.1-0 Index]