runCGHAnalysis {CGHweb} | R Documentation |
This is a wrapper function to call various algorithms.
runCGHAnalysis(tab, BioHMM = TRUE, UseCloneDists = TRUE, Lowess = TRUE, Lwidth = 15, Wavelet = TRUE, Wlevels = 3, Runavg = TRUE, Rwidth = 5, CBS = TRUE, alpha = 0.05, Picard = TRUE, Km = 20, S = -0.5, FusedLasso = TRUE, fluv = FALSE, FDR = 0.5, rsm = FALSE, GLAD = TRUE, qlambda = 0.999, FASeg = TRUE, sig = 0.025, delta = 0.1, srange = 50, fineTune = FALSE, Quantreg = TRUE, lambda = 1, minLR = -2, maxLR = 2, Threshold = 0.2, genomeType = "HG18", tempDir = getwd(), resultDir = "CGHResults")
tab |
data frame containing the array data. The column names of the data frame must be ProbeID, Chromosome, Position, and LogRatio. Column names are case sensitive. |
BioHMM |
boolean to run or not to run BioHMM. BioHMM uses a heterogeneous hidden Markov model. By default, it considers the distance between probes when estimating its parameters and assigns a higher probability of a state change between probes that are farther apart. This algorithm is called from the snapCGH package, which is available in Bioconductor. |
UseCloneDists |
boolean to use clone distances. Tells the algorithm either to consider probe distances in its calculations or assume a homogeneous hidden Markov model instead. Enabled by default. |
Lowess |
boolean to run or not to run LOWESS. LOWESS smooths the data with robust weighted local polynomial fitting. Probes inside the smoothing window are weighted according to their distance from the center, with more distant probes receiving less weight. This algorithm is called from the stats package of R. |
Lwidth |
the number of probes to use when calculating the weights around each probe in the lowess smoother. Default: 15 Range: min 5 max 50 |
Wavelet |
boolean to run or not to run wavelet smoothing. Wavelet smoothing transforms the data into frequency components with the maximal overlap discrete wavelet transform. The transformed data are filtered through soft SURE thresholding and then transformed back to the time domain to give the smoothed data. This approach is similar to the procedure described in Hsu et al. (2005). This algorithm uses functions from the waveslim package, which is available in the Comprehensive R Archive Network. |
Wlevels |
the depth of decomposition for maximal overlap discrete wavelet transform and the depth of thresholding for SURE. Default: 3 Range: min 1 max 6 |
Runavg |
boolean to run or not to run runavg. This method takes the average of probe values inside a smoothing window. |
Rwidth |
The number of probes in the window around each probe over which the mean is calculated. Default: 5 Range: min 5 max 50 |
CBS |
boolean to run or not to run Circular Binary Segmentation. CBS estimates the location of change-points by calculating a likelihood-ratio statistic for each probe and assessing its significance by permutation. This algorithm is called from the DNAcopy package, which is available in Bioconductor. |
alpha |
the significance level for accepting a change-point, that is, the probability of seeing such a difference by chance when the segment means on either side of the change-point are equal. Default: 0.05 Range: min > 0.0 max < 1.0 |
Picard |
boolean to run or not to run CGHseg. CGHseg estimates breakpoints by building a cost matrix, finding all possible breakpoints from this matrix, and selecting the most likely number of breakpoints with an adaptive penalty. Because of CGHseg's memory requirements, a chromosome with more than 10000 probes is divided into smaller pieces. |
Km |
The maximum number of segments to consider per chromosome. Default: 20 Range: min 5 max 50 |
S |
The adaptive penalty threshold. Default: -0.5 Range: min -1.0 max < 0.0 |
FusedLasso |
boolean to run or not to run cghFLasso. cghFLasso smooths the data with the fused lasso, a spatial smoothing technique. Because of cghFLasso's memory requirements, a chromosome with more than 10000 probes is divided into smaller pieces. |
FDR |
False discovery rate (the expected proportion of true null hypotheses among those called significant). Default: 0.5 Range: min > 0.0 max < 1.0 |
fluv |
boolean that tells the algorithm to use FDR to determine significant segments. Disabled by default. |
rsm |
Recalculate Segment Means: A post-processing step (not part of cghFLasso) to recalculate the segment means after finding the breakpoints with cghFLasso. Disabled by default. |
GLAD |
boolean to run or not to run GLAD. GLAD smooths the data with likelihood-based adaptive weights smoothing, removes extraneous breakpoints with a penalized likelihood, and groups the segments with unsupervised clustering. This algorithm is called from the GLAD package, which is available in Bioconductor. |
qlambda |
a scaling parameter used by adaptive weights smoothing for its stochastic penalty. Default: 0.9990 Range: min 0.9000 max 0.9999 |
FASeg |
boolean to run or not to run FASeg. FASeg uses lowess to find the location of possible breakpoints and conducts local ANOVA to identify significant breakpoints. |
sig |
Significance cutoff value. Default: 0.025 Range: min > 0.0 max < 1.0 |
delta |
The minimum height of the 'bumps' in the lowess-smoothed data to consider their boundaries as potential breakpoints. Default: 0.1 Range: min > 0.0 max 0.5 |
srange |
the number of probes to use when calculating the weights around each probe in the lowess smoother. Default: 50 Range: min 10 max 100 |
fineTune |
boolean that tells the algorithm to recalculate breakpoint locations (in smaller neighborhoods) after edge selection. Disabled by default. |
Quantreg |
boolean to run or not to run quantile smoothing. Quantile smoothing uses penalized quantile regression to find trends in the data. The code closely follows the R code outlined in Eilers and de Menezes (2005), except that it uses the sparse implementation of the Frisch-Newton interior-point algorithm. The results represent the 50th quantile (the median). This algorithm uses functions from the quantreg package, which is available in the Comprehensive R Archive Network. |
lambda |
The penalty parameter for the regression. Default: 1.0 Range: min > 0.0 max 10.0 |
minLR |
the minimum log-ratio to show in the graphical results. Log-ratios below this value are drawn at the minimum value. Default: -2.0 Range: min -10.0 max -1.0 |
maxLR |
the maximum log-ratio to show in the graphical results. Log-ratios above this value are drawn at the maximum value. Default: 2.0 Range: min 1.0 max 10.0 |
Threshold |
A simple cutoff for identifying gains and losses in the processed data. Gains are called for regions above the positive threshold. Losses are called for regions below the negative threshold. Default: 0.2 |
genomeType |
the genome type (build) to use. Default: "HG18" |
tempDir |
Directory in which the results directory is created. Default: the current working directory, getwd() |
resultDir |
A directory of this name is created in tempDir. It holds results.html and all other temporary files. Default: "CGHResults" |
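The display clipping controlled by minLR and maxLR and the gain/loss rule controlled by Threshold can be sketched in base R. This is only an illustration of the rules as described in the arguments, not the package's own code. The log-ratio values and the variable names (log_ratio, clamped, status) are invented for the example, and the rule is applied to raw values here, whereas the package applies it to processed data.

```r
# Illustrative log-ratios; the values are made up for this sketch.
log_ratio <- c(-3.1, -0.4, 0.05, 0.25, 2.7)

min_lr    <- -2   # plays the role of minLR
max_lr    <- 2    # plays the role of maxLR
threshold <- 0.2  # plays the role of Threshold

# Clip values to [min_lr, max_lr]; this only affects what would be drawn.
clamped <- pmax(pmin(log_ratio, max_lr), min_lr)

# Call a gain above +threshold, a loss below -threshold, neutral otherwise.
status <- ifelse(log_ratio > threshold, "gain",
                 ifelse(log_ratio < -threshold, "loss", "neutral"))

print(data.frame(log_ratio, clamped, status))
```

Clipping changes only the plotted value, so a probe at -3.1 is still called a loss even though it is drawn at -2.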
This is a wrapper function to call various algorithms.
Returns 0 on failure and 1 on success.
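To make the window-width arguments concrete, the sketch below applies the two simplest smoothers, a running average and LOWESS, to simulated log-ratios using only the stats package. It is not the package's implementation. The simulated data, the variable names, and in particular the conversion of a probe count into the lowess span (f = width / number of probes) are assumptions made for illustration.

```r
set.seed(1)
n <- 200

# Simulated log-ratios: a 40-probe gain of 0.6 in the middle, plus noise.
log_ratio <- c(rep(0, 80), rep(0.6, 40), rep(0, 80)) + rnorm(n, sd = 0.2)

# Running average over a centered window of rwidth probes (like Rwidth).
# The first and last floor(rwidth / 2) positions have no full window
# and come back as NA.
rwidth  <- 5
run_avg <- stats::filter(log_ratio, rep(1 / rwidth, rwidth), sides = 2)

# LOWESS with a window of roughly lwidth probes (like Lwidth).
# stats::lowess expects the span as a fraction of all points, so the
# probe count is converted to a fraction here (an assumption).
lwidth   <- 15
low      <- stats::lowess(seq_len(n), log_ratio, f = lwidth / n)
smoothed <- low$y
```

A wider window gives a smoother curve but blurs the edges of short aberrations, which is the trade-off the width arguments control.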
none
Vidhu Choudhary
"http://compbio.med.harvard.edu/CGHweb"
data(BacArray)
## To run with all the default settings.
x <- runCGHAnalysis(tab, tempDir = getwd(), resultDir = "CGHResults")
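The example above relies on the bundled BacArray data. For your own data, a sketch of preparing and checking an input data frame might look like the following. The probe values are invented. The chromosome column is spelled Chromosome here and coded as numbers, both of which are assumptions to check against the installed package. Sorting by chromosome and position is a convenience added for this example, not a documented requirement.

```r
# Hypothetical probe-level data; all values are invented for illustration.
tab <- data.frame(
  ProbeID    = paste0("probe", 1:6),
  Chromosome = c(1, 1, 1, 2, 2, 2),
  Position   = c(9000, 1000, 5000, 2000, 8000, 6000),
  LogRatio   = c(0.48, 0.02, 0.51, -0.03, 0.04, -0.61),
  stringsAsFactors = FALSE
)

# Check for the four case-sensitive column names before running the analysis.
required     <- c("ProbeID", "Chromosome", "Position", "LogRatio")
missing_cols <- setdiff(required, colnames(tab))
if (length(missing_cols) > 0) {
  stop("Missing required columns: ", paste(missing_cols, collapse = ", "))
}

# Order probes by chromosome and then by genomic position.
tab <- tab[order(tab$Chromosome, tab$Position), ]
```

The resulting tab could then be passed as the first argument of runCGHAnalysis.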