calculatePathwayStatistics {sigPathway} | R Documentation |
Calculates the NTk and NEk statistics and the corresponding p-values and q-values for each selected pathway.
calculate.NTk(tab, phenotype, gsList, nsim = 1000, ngroups = 2, verbose = FALSE)
calculate.NEk(tab, phenotype, gsList, nsim = 1000, weightType = c("constant", "variable"), ngroups = 2, verbose = FALSE)
tab |
a numeric matrix of expression values, with the rows and columns representing probe sets and sample arrays, respectively |
phenotype |
a numeric vector indicating the phenotype |
gsList |
a list containing three vectors from the output of
the selectGeneSets function |
nsim |
an integer indicating the number of permutations to use |
weightType |
a character string specifying the type of weight to use when calculating NEk statistics |
ngroups |
an integer indicating the number of groups in the matrix |
verbose |
a boolean to indicate whether to print debugging messages to the R console |
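As a rough illustration of the shapes described above, the following base R sketch builds a toy `tab` matrix and a matching `phenotype` vector. The dimensions, the probe and array names, and the random values are all made up for illustration; the sketch does not call sigPathway itself and assumes a two-group design coded as 0 and 1, as in the example for this page.

```r
# Toy inputs only: the sizes and values are hypothetical, not real data.
set.seed(1)
n.probes <- 6   # rows = probe sets
n.norm   <- 3   # arrays in the first group
n.case   <- 4   # arrays in the second group

tab <- matrix(rnorm(n.probes * (n.norm + n.case)),
              nrow = n.probes,
              dimnames = list(paste0("probe", seq_len(n.probes)),
                              paste0("array", seq_len(n.norm + n.case))))

# One phenotype entry per column of tab, in the same order as the columns
phenotype <- c(rep.int(0, n.norm), rep.int(1, n.case))

# Basic sanity checks before passing the objects on
stopifnot(is.matrix(tab), is.numeric(tab),
          length(phenotype) == ncol(tab))

# Number of distinct groups present in the phenotype vector
ngroups <- length(unique(phenotype))
```

Checking that `length(phenotype)` equals `ncol(tab)` before running the permutations can catch a mismatched column selection early.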
These functions calculate the NTk and NEk statistics and the
corresponding p-values and q-values for each selected pathway. The output
of both functions should be used together to rank the top pathways with
the rankPathways
function.
A list containing
ngs |
number of gene sets |
nsim |
number of permutations performed |
t.set |
a numeric vector of Tk/Ek statistics |
t.set.new |
a numeric vector of NTk/NEk statistics |
p.null |
the proportion of nulls |
p.value |
a numeric vector of p-values |
q.value |
a numeric vector of q-values |
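The components listed above can be inspected with ordinary list and vector operations. The sketch below uses a mock list that only mimics the documented component names; every value in it is invented, and it assumes the statistic, p-value, and q-value vectors hold one entry per gene set. It is not output from calculate.NTk or calculate.NEk.

```r
# Mock result: component names follow the list above, values are made up.
res <- list(ngs       = 3L,
            nsim      = 100L,
            t.set     = c(1.2, -0.4, 2.9),
            t.set.new = c(0.8, -0.3, 2.1),
            p.null    = 0.9,
            p.value   = c(0.12, 0.70, 0.01),
            q.value   = c(0.18, 0.70, 0.03))

# Order gene sets from smallest to largest q-value
ord <- order(res$q.value)

# Collect the ordered statistics in a small table for inspection
top <- data.frame(set       = ord,
                  statistic = res$t.set.new[ord],
                  p.value   = res$p.value[ord],
                  q.value   = res$q.value[ord])
print(top)
```

For ranking that combines both statistics, the rankPathways function mentioned above is the intended route; this manual ordering is only a quick way to look at a single result.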
Lu Tian and Peter Park, with contributions from Weil Lai
Tian L., Greenberg S.A., Kong S.W., Altschuler J., Kohane I.S., Park P.J. (2005) Discovering statistically significant pathways in expression profiling studies. Proceedings of the National Academy of Sciences of the USA, 102, 13544-9.
http://www.pnas.org/cgi/doi/10.1073/pnas.0506577102
## Load in expression data and select the probe sets that have expression
## values greater than the trimmed mean in at least 1 out of 49 arrays
data(MuscleData)
sf <- apply(MuscleData, 2, mean, tr = 0.025)
temp <- sweep(MuscleData, 2, sf, FUN = '/')
ind.pskeep <- which(rowSums(temp > 1) > 0)
tabMD <- MuscleData[ind.pskeep, ]
probeID <- names(ind.pskeep)
rm(temp)

## Select the data to study: IBM vs. NORM _or_ DM vs. NORM
compIBM <- TRUE
if (compIBM == TRUE) {
  tab <- tabMD[, c(index.NORM, index.IBM)]
  phenotype <- c(rep.int(0, length(index.NORM)), rep.int(1, length(index.IBM)))
} else {
  tab <- tabMD[, c(index.NORM, index.DM)]
  phenotype <- c(rep.int(0, length(index.NORM)), rep.int(1, length(index.DM)))
}

## Prepare the pathways to analyze
data(GenesetsU133a)
gsList <- selectGeneSets(G, probeID, 20, 500)

## Calculate NTk and weighted NEk for each gene set
## * Use a higher nsim (e.g., 2500) value for more reproducible results
nsim <- 100
ngroups <- 2
verbose <- TRUE
weightType <- "constant"
methodNames <- c("NTk", "NEk")
npath <- 25
res.NTk <- calculate.NTk(tab, phenotype, gsList, nsim, ngroups, verbose)
res.NEk <- calculate.NEk(tab, phenotype, gsList, nsim, weightType, ngroups, verbose)

## Summarize results
res.pathways <- rankPathways(res.NTk, res.NEk, G, gsList, methodNames, npath)
print(res.pathways)

## Get more information about the probe sets' means and other statistics
## for the top pathway in res.pathways
topIndex <- res.pathways$IndexG[1]
res.topPathway <- getPathwayStatistics(tab, phenotype, G, topIndex, ngroups, NULL, FALSE)
print(res.topPathway[[1]])
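The probe set filter at the start of the example can be easier to follow on a small matrix. The sketch below repeats the same three calls (apply, sweep, rowSums) on a hypothetical 4 x 3 matrix; the matrix `toy` and its row and column names are invented for illustration and stand in for MuscleData.

```r
# Hypothetical 4 probe sets x 3 arrays; matrix() fills column by column.
toy <- matrix(c(10, 2, 2, 2,    # array a1
                 1, 1, 1, 1,    # array a2
                 3, 3, 9, 1),   # array a3
              nrow = 4,
              dimnames = list(paste0("p", 1:4), paste0("a", 1:3)))

# Per-array trimmed mean (with only 4 values, trim = 0.025 removes nothing)
sf <- apply(toy, 2, mean, trim = 0.025)

# Divide each column by its own trimmed mean
scaled <- sweep(toy, 2, sf, FUN = "/")

# Keep probe sets that exceed the trimmed mean in at least one array
keep <- which(rowSums(scaled > 1) > 0)
probeID <- names(keep)   # "p1" and "p3" pass the filter here

filtered <- toy[keep, , drop = FALSE]
```

Rows `p2` and `p4` are dropped because none of their values exceed the corresponding array's trimmed mean.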