runSigPathway {sigPathway} | R Documentation |
Performs pathway analysis
runSigPathway(G, minNPS = 20, maxNPS = 500, tab, phenotype, nsim = 1000, weightType = c("constant", "variable"), ngroups = 2, npath = 25, verbose = FALSE)
G |
a list containing the source, title, and probe sets associated with each curated pathway |
minNPS |
an integer specifying the minimum number of probe sets in tab that should be in a gene set |
maxNPS |
an integer specifying the maximum number of probe sets in tab that should be in a gene set |
tab |
a numeric matrix of expression values, with the rows and columns representing probe sets and sample arrays, respectively |
phenotype |
a numeric vector indicating the phenotype |
nsim |
an integer indicating the number of permutations to use |
weightType |
a character string specifying the type of weight to use when calculating NEk statistics |
ngroups |
an integer indicating the number of groups in the matrix |
npath |
an integer indicating the number of top gene sets to consider from each statistic when ranking the top pathways |
verbose |
a boolean to indicate whether to print debugging messages to the R console |
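The following is a minimal base R sketch of how the inputs fit together and how minNPS and maxNPS bound gene set size. It is an illustration only, not the package's code: the toy data, the field names src, title, and probes assumed for each element of G, and the inclusive treatment of both bounds are all assumptions.

```r
# Toy inputs; every name and value here is illustrative, not package data.
set.seed(1)
tab <- matrix(rnorm(6 * 4), nrow = 6,
              dimnames = list(paste0("ps", 1:6), paste0("array", 1:4)))
phenotype <- c(0, 0, 1, 1)   # one entry per column (array) of tab

# Assumed layout of G: one list per pathway with src, title, and probes.
G <- list(
  list(src = "toy", title = "small set",  probes = c("ps1", "ps9")),
  list(src = "toy", title = "medium set", probes = c("ps1", "ps2", "ps3")),
  list(src = "toy", title = "large set",  probes = paste0("ps", 1:6))
)

minNPS <- 2
maxNPS <- 4

# For each pathway, count the probe sets that are actually present in tab.
nps <- vapply(G, function(g) sum(unique(g$probes) %in% rownames(tab)),
              integer(1))

# Keep pathways whose size falls within the bounds (inclusive by assumption).
keep <- which(nps >= minNPS & nps <= maxNPS)
```

Only probe sets that appear as row names of tab count toward a gene set's size, which is why a pathway can drop below minNPS after the expression matrix has been filtered.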
runSigPathway is a wrapper function that
(1) selects the gene sets to analyze using selectGeneSets,
(2) calculates NTk and NEk statistics using calculate.NTk and calculate.NEk,
(3) ranks the top npath pathways from each statistic using rankPathways, and
(4) summarizes the means, standard deviations, and individual statistics of each probe set in each of the above pathways using getPathwayStatistics.
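To give a feel for what nsim controls, here is a generic phenotype-permutation sketch in base R. It is not the package's NTk or NEk computation: the set-level statistic (a sum of Welch t statistics), the helper set_stat, and the toy data are assumptions chosen only to show how a permutation null distribution is built.

```r
set.seed(42)
n_probes <- 50
n_arrays <- 10
tab <- matrix(rnorm(n_probes * n_arrays), nrow = n_probes)
phenotype <- rep(c(0, 1), each = n_arrays / 2)
set_rows <- 1:5    # hypothetical gene set, given as row indices of tab
nsim <- 200        # number of permutations

# Hypothetical set-level statistic: sum of per-probe Welch t statistics.
set_stat <- function(x, y, rows) {
  sum(apply(x[rows, , drop = FALSE], 1,
            function(v) t.test(v[y == 1], v[y == 0])$statistic))
}

observed <- set_stat(tab, phenotype, set_rows)

# Null distribution: recompute the statistic after shuffling the labels.
null_stats <- replicate(nsim, set_stat(tab, sample(phenotype), set_rows))

# Two-sided permutation p-value with the usual +1 correction.
p_perm <- (1 + sum(abs(null_stats) >= abs(observed))) / (nsim + 1)
```

A larger nsim gives a finer-grained null distribution at the cost of run time, which is why the example in this page lowers it to 100.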
A list containing
gsList |
a list containing three vectors from the output of the selectGeneSets function |
list.NTk |
a list from the output of calculate.NTk |
list.NEk |
a list from the output of calculate.NEk |
df.pathways |
a data frame from rankPathways which contains the top pathways' indices in G, gene set category, pathway title, set size, NTk statistics, NEk statistics, the corresponding q-values, and the ranks |
list.gPS |
a list from getPathwayStatistics containing nrow(df.pathways) data frames corresponding to the pathways listed in df.pathways. Each data frame contains the name, mean, standard deviation, the test statistic (e.g., t-test), and the corresponding unadjusted p-value. If ngroups = 1, the Pearson correlation coefficient is also returned. |
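As a rough picture of the per-probe-set summary described for list.gPS, the base R sketch below builds a comparable table for a two-group comparison. The column names, the use of a Welch t-test, and the toy data are assumptions; the data frames returned by getPathwayStatistics may differ in layout and in the exact test used.

```r
set.seed(7)
tab <- matrix(rnorm(3 * 8, mean = 8), nrow = 3,
              dimnames = list(c("psA", "psB", "psC"), NULL))
phenotype <- rep(c(0, 1), each = 4)

# One row per probe set: name, mean, standard deviation, t statistic,
# and unadjusted p-value (column names are hypothetical).
probe_summary <- do.call(rbind, lapply(rownames(tab), function(ps) {
  x <- tab[ps, ]
  tt <- t.test(x[phenotype == 1], x[phenotype == 0])
  data.frame(Probes = ps,
             Mean = mean(x),
             StDev = sd(x),
             T.Statistic = unname(tt$statistic),
             PValue = tt$p.value)
}))

print(probe_summary)
```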
Lu Tian and Peter Park, with contributions from Weil Lai
Tian L., Greenberg S.A., Kong S.W., Altschuler J., Kohane I.S., Park P.J. (2005) Discovering statistically significant pathways in expression profiling studies. Proceedings of the National Academy of Sciences of the USA, 102, 13544-9.
http://www.pnas.org/cgi/doi/10.1073/pnas.0506577102
## Load in expression data and select the probe sets that have expression
## values greater than the trimmed mean in at least 1 out of 49 arrays
data(MuscleData)
sf <- apply(MuscleData, 2, mean, trim = 0.025)
temp <- sweep(MuscleData, 2, sf, FUN = "/")
ind.pskeep <- which(rowSums(temp > 1) > 0)
tabMD <- MuscleData[ind.pskeep, ]
probeID <- names(ind.pskeep)
rm(temp)

## Select the data to study: IBM vs. NORM _or_ DM vs. NORM
compIBM <- TRUE
if (compIBM) {
  tab <- tabMD[, c(index.NORM, index.IBM)]
  phenotype <- c(rep.int(0, length(index.NORM)),
                 rep.int(1, length(index.IBM)))
} else {
  tab <- tabMD[, c(index.NORM, index.DM)]
  phenotype <- c(rep.int(0, length(index.NORM)),
                 rep.int(1, length(index.DM)))
}

## Prepare the pathways to analyze
data(GenesetsU133a)
nsim <- 100
ngroups <- 2
verbose <- TRUE
weightType <- "constant"
npath <- 25
res.muscle <- runSigPathway(G, 20, 500, tab, phenotype, nsim,
                            weightType, ngroups, npath, verbose)

## Summarize results
print(res.muscle$df.pathways)

## Get more information about the probe sets' means and other statistics
## for the top pathway in res.muscle
print(res.muscle$list.gPS[[1]])