calculate.NGSk {sigPathway} | R Documentation |
Calculates the NGSk (NTk-like) statistics with gene label permutation and the corresponding p-values and q-values for each selected pathway.
calculate.NGSk(statV, gsList, nsim = 1000, verbose = FALSE)
statV |
a numeric vector of test statistics (not p-values), one for each individual probe/gene |
gsList |
a list containing three vectors from the output of the selectGeneSets function |
nsim |
an integer indicating the number of permutations to use |
verbose |
a boolean to indicate whether to print debugging messages to the R console |
This function is a generalized version of the NTk calculations; calculate.NTk calls this function internally. To use this function, the user must specify a vector of test statistics (e.g., t-statistic, Wilcoxon). Pathways from this function can be ranked with rankPathways.NGSk, or with rankPathways when combined with results from another pathway analysis algorithm (e.g., calculate.NEk).
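The idea of gene label permutation can be illustrated with a small base R sketch. This is a simplified toy, not the sigPathway implementation: the simulated statistics, the gene set, and all variable names (stat_v, set_idx, t_perm, and so on) are hypothetical, and the normalization and p-value shown here are only one plausible way to summarize a set against its permutation distribution.

```r
## Toy illustration only; not the code used inside calculate.NGSk.
set.seed(1)
n_genes <- 200
stat_v  <- rnorm(n_genes)                # hypothetical per-gene test statistics
set_idx <- 1:20                          # hypothetical gene set: the first 20 genes
stat_v[set_idx] <- stat_v[set_idx] + 1   # shift the set so that it stands out

n_sim <- 500
t_obs <- sum(stat_v[set_idx])            # unnormalized set statistic

## Permute gene labels: draw random gene sets of the same size
## and recompute the set statistic for each draw.
t_perm <- replicate(n_sim,
                    sum(stat_v[sample.int(n_genes, length(set_idx))]))

## Normalize the observed statistic by the permutation mean and sd.
t_norm <- (t_obs - mean(t_perm)) / sd(t_perm)

## Two-sided empirical p-value with the usual +1 correction.
dev_obs  <- abs(t_obs - mean(t_perm))
dev_perm <- abs(t_perm - mean(t_perm))
p_emp    <- (1 + sum(dev_perm >= dev_obs)) / (n_sim + 1)
```

Because the permutation shuffles gene labels rather than sample labels, only the vector of per-gene statistics is needed, which is why the function accepts any test statistic supplied in statV.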
A list containing
ngs |
number of gene sets |
nsim |
number of permutations performed |
t.set |
a numeric vector of Tk/Ek statistics |
t.set.new |
a numeric vector of NTk/NEk statistics |
p.null |
the proportion of nulls |
p.value |
a numeric vector of p-values |
q.value |
a numeric vector of q-values |
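As a rough sketch of how the returned list might be inspected by hand, the snippet below builds a hypothetical stand-in with the same component names and made-up values, then orders the gene sets by q-value. It assumes that the vectors are aligned by gene set position; rankPathways.NGSk remains the intended way to summarize real results.

```r
## Hypothetical stand-in for the list returned by calculate.NGSk;
## every value here is invented for illustration.
res <- list(ngs       = 4L,
            nsim      = 1000L,
            t.set     = c(12.1, -3.4, 7.8, 0.5),
            t.set.new = c(3.2, -0.9, 2.1, 0.1),
            p.null    = 0.8,
            p.value   = c(0.001, 0.37, 0.03, 0.92),
            q.value   = c(0.004, 0.40, 0.05, 0.74))

## Order gene sets from smallest to largest q-value.
ord <- order(res$q.value)

## Collect the normalized statistic, p-value and q-value in one table.
top <- data.frame(set  = ord,
                  NGSk = res$t.set.new[ord],
                  p    = res$p.value[ord],
                  q    = res$q.value[ord])

## Positions of gene sets passing an (arbitrary) q-value cutoff of 0.05.
sig_sets <- top$set[top$q <= 0.05]
```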
Lu Tian and Peter Park, with contributions from Weil Lai
Tian L., Greenberg S.A., Kong S.W., Altschuler J., Kohane I.S., Park P.J. (2005) Discovering statistically significant pathways in expression profiling studies. Proceedings of the National Academy of Sciences of the USA, 102, 13544-9.
http://www.pnas.org/cgi/doi/10.1073/pnas.0506577102
## Load in expression data and select the probe sets that have expression
## values greater than the trimmed mean in at least 1 out of 49 arrays
data(MuscleData)
sf <- apply(MuscleData, 2, mean, trim = 0.025)
temp <- sweep(MuscleData, 2, sf, FUN = '/')
ind.pskeep <- which(rowSums(temp > 1) > 0)
tabMD <- MuscleData[ind.pskeep, ]
probeID <- rownames(tabMD)
rm(temp)

## Select the data to study: IBM vs. NORM _or_ DM vs. NORM
compIBM <- TRUE
if (compIBM == TRUE) {
  tab <- tabMD[, c(index.NORM, index.IBM)]
  phenotype <- c(rep.int(0, length(index.NORM)), rep.int(1, length(index.IBM)))
} else {
  tab <- tabMD[, c(index.NORM, index.DM)]
  phenotype <- c(rep.int(0, length(index.NORM)), rep.int(1, length(index.DM)))
}

## Prepare the pathways to analyze
data(GenesetsU133a)
gsList <- selectGeneSets(G, probeID, 20, 500)

nsim <- 1000
ngroups <- 2
verbose <- TRUE
weightType <- "constant"
methodNames <- c("NTk", "NEk")
npath <- 25
allpathways <- FALSE

statV <- calcTStatFast(tab, phenotype, ngroups)$tstat
res.NGSk <- calculate.NGSk(statV, gsList, nsim, verbose)
res.NEk <- calculate.NEk(tab, phenotype, gsList, nsim, weightType,
                         ngroups, verbose)

## Summarize top pathways from NGSk
res.pathways.NGSk <- rankPathways.NGSk(res.NGSk, G, gsList,
                                       methodName = "NGSk", npath)
print(res.pathways.NGSk)

res.pathways <- rankPathways(res.NGSk, res.NEk, G, tab, phenotype,
                             gsList, ngroups, methodNames, npath, allpathways)
print(res.pathways)

## Get more information about the probe sets' means and other statistics
## for the top pathway in res.pathways
topIndex <- res.pathways$IndexG[1]
res.topPathway.NGSk <- getPathwayStatistics.NGSk(statV, probeID, G, topIndex,
                                                 FALSE, NULL)
print(res.topPathway.NGSk[[1]])

res.topPathway <- getPathwayStatistics(tab, phenotype, G, topIndex, ngroups,
                                       NULL, FALSE, NULL)
print(res.topPathway[[1]])