runCGHAnalysis {CGHweb}    R Documentation

runCGHAnalysis

Description

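A wrapper function that runs the selected smoothing and segmentation algorithms on an array CGH data set and writes the results to results.html in the result directory.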

Usage

runCGHAnalysis(tab, BioHMM = TRUE, UseCloneDists = TRUE, Lowess = TRUE,
                Lwidth = 15, Wavelet = TRUE, Wlevels = 3, Runavg = TRUE,
                Rwidth = 5, CBS = TRUE, alpha = 0.05, Picard = TRUE, Km = 20, 
                S = -0.5, FusedLasso = TRUE, fluv = FALSE, FDR = 0.5,
                rsm = FALSE, GLAD = TRUE, qlambda = 0.999, 
                FASeg = TRUE, sig = 0.025, delta = 0.1, 
                srange = 50, fineTune = FALSE, Quantreg = TRUE, lambda = 1,
                minLR = -2, maxLR = 2, Threshold = 0.2, 
                genomeType = "HG18", tempDir = getwd(), resultDir = "CGHResults")

Arguments

tab data frame containing the array data. Its column names must be ProbeID, Chromsome, Position, and LogRatio. Column names are case sensitive.
BioHMM boolean to run or not to run BioHMM. BioHMM uses a heterogeneous hidden Markov model. By default, it considers the distance between probes when estimating its parameters and gives higher probabilities to probes that are further apart than others. This algorithm is called from the snapCGH package, which is available in BioConductor.
UseCloneDists boolean to use clone distances. Tells the algorithm either to consider probe distances in its calculations or assume a homogeneous hidden Markov model instead. Enabled by default.
Lowess boolean to run or not to run LOWESS. LOWESS smoothes the data with robust weighted local polynomial fitting. Probes inside the smoothing window are weighted according to their distance from the center, with the more distant probes having less weight. This algorithm is called from the stats package of R.
Lwidth the number of probes to use when calculating the weights around each probe in the lowess smoother. Default: 15 Range: min 5 max 50
Wavelet boolean to run or not to run wavelet. Wavelet smoothing smoothes the data by transforming the data into frequency components with maximal overlap discrete wavelet transform. The transformed data are filtered through soft SURE thresholding and then transformed back to the time domain to get the smoothed data. This approach is similar to the procedure described in Hsu et al. (2005). This algorithm uses functions from the waveslim package, which is available in the Comprehensive R Archive Network.
Wlevels the depth of decomposition for maximal overlap discrete wavelet transform and the depth of thresholding for SURE. Default: 3 Range: min 1 max 6
Runavg boolean to run or not to run runavg. This method takes the average of probe values inside a smoothing window.
Rwidth the number of probes in the smoothing window around each probe when calculating the running average. Default: 5 Range: min 5 max 50
CBS boolean to run or not to run Circular Binary Segmentation. CBS estimates the location of change-points by calculating a likelihood-ratio statistic for each probe and assessing its significance by permutation. This algorithm is called from the DNAcopy package, which is available in BioConductor.
alpha the likelihood by chance that the segment means surrounding the change-point are equal. Default: 0.05 Range: min > 0.0 max < 1.0
Picard boolean to run or not to run CGHseg. CGHseg estimates breakpoints by making a cost matrix, finding all possible breakpoints from this matrix, and selecting the most likely number of breakpoints with an adaptive penalty. Because of CGHseg's memory requirements, this website will divide the chromosome into smaller pieces if it has more than 10000 probes.
Km The maximum number of segments to consider per chromosome. Default: 20 Range: min 5 max 50
S The adaptive penalty threshold. Default: -0.5 Range: min -1.0 max < 0.0
FusedLasso boolean to run or not to run cghFLasso. cghFLasso smoothes the data with the fused lasso, a spatial smoothing technique. Because of cghFLasso's memory requirements, this website will divide the chromosome into smaller pieces if it has more than 10000 probes.
FDR false discovery rate (the proportion of true null hypotheses among those called significant). Default: 0.5 Range: min > 0.0 max < 1.0
fluv boolean; tells the algorithm to use FDR to determine significant segments. Disabled by default.
rsm Recalculate Segment Means: A post-processing step (not part of cghFLasso) to recalculate the segment means after finding the breakpoints with cghFLasso. Disabled by default.
GLAD boolean to run or not to run GLAD. GLAD smoothes the data with likelihood-based adaptive weights smoothing, removes extraneous breakpoints with a penalized likelihood, and groups the segments with unsupervised clustering. This algorithm is called from the GLAD package, which is available in BioConductor.
qlambda a scaling parameter used by adaptive weights smoothing for its stochastic penalty. Default: 0.9990 Range: min 0.9000 max 0.9999
FASeg boolean to run or not to run FASeg. FASeg uses lowess to find the location of possible breakpoints and conducts local ANOVA to identify significant breakpoints.
sig Significance cutoff value. Default: 0.025 Range: min > 0.0 max < 1.0
delta The minimum height of the 'bumps' in the lowess-smoothed data to consider their boundaries as potential breakpoints. Default: 0.1 Range: min > 0.0 max 0.5
srange the number of probes to use when calculating the weights around each probe in the lowess smoother. Default: 50 Range: min 10 max 100
fineTune tells the algorithm to recalculate breakpoint locations (in smaller neighborhoods) after edge selection. Disabled by default.
Quantreg boolean to run or not to run quantile smoothing. Quantile smoothing uses penalized quantile regression to find trends in the data. The code closely follows the R code outlined in Eilers and de Menezes (2005), except that it uses the sparse implementation of the Frisch-Newton interior-point algorithm. The results represent the 50th quantile (the median). This algorithm uses functions from the quantreg package, which is available in the Comprehensive R Archive Network.
lambda the penalty parameter for the regression. Default: 1 Range: min > 0.0 max 10.0
minLR the minimum log-ratio to show in the graphical results. Log-ratios below the minimum are drawn at the minimum value. Default: -2.0 Range: min -10.0 max -1.0
maxLR the maximum log-ratio to show in the graphical results. Log-ratios above the maximum are drawn at the maximum value. Default: 2.0 Range: min 1.0 max 10.0
Threshold a simple cutoff for identifying gains and losses in the processed data. Gains are called for regions above the positive threshold. Losses are called for regions below the negative threshold. Default: 0.2
genomeType the genome type to use. Default: "HG18"
tempDir the directory in which the temporary folder will be created. Default: getwd()
resultDir the name of the directory to create inside tempDir. It will contain results.html and all other temporary files. Default: "CGHResults"
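
The effect of Threshold, minLR and maxLR can be sketched in base R. This is only an illustration of the rules described in the argument list, not the package's own code; the vector `smoothed` and its values are made up, and whether the package treats values exactly equal to the threshold as gains or losses is not stated here.

```r
# Hypothetical smoothed log-ratios for seven probes (made-up values).
smoothed <- c(-2.6, -0.35, -0.1, 0.05, 0.15, 0.45, 3.1)

Threshold <- 0.2
minLR <- -2
maxLR <- 2

# Gains lie above +Threshold, losses below -Threshold, the rest are "none".
calls <- ifelse(smoothed > Threshold, "gain",
                ifelse(smoothed < -Threshold, "loss", "none"))

# For plotting, values outside [minLR, maxLR] are drawn at the nearest limit.
plotted <- pmax(minLR, pmin(maxLR, smoothed))

print(data.frame(smoothed, calls, plotted))
```

Clamping changes only what is drawn. The gain and loss calls in this sketch are made on the unclamped values.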

Details

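This is a wrapper function to call various algorithms. Each algorithm is switched on or off by its own boolean argument, and the arguments listed after that switch tune the algorithm.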
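
Two of the simpler methods, the running average and LOWESS, can be approximated with functions from the stats package. The sketch below runs on simulated data and is not the package's implementation: the variable names are hypothetical, and converting Lwidth to the `f` span of lowess() as Lwidth / n is an assumption about how a window given in probes might be mapped.

```r
set.seed(1)

# Hypothetical log-ratios along one chromosome: a gain of 0.5 in probes 41-60.
n <- 100
logratio <- c(rep(0, 40), rep(0.5, 20), rep(0, 40)) + rnorm(n, sd = 0.1)

# Running average over a centred window of Rwidth probes.
# The first and last (Rwidth - 1) / 2 positions have no full window and are NA.
Rwidth <- 5
runavg <- stats::filter(logratio, rep(1 / Rwidth, Rwidth), sides = 2)

# LOWESS smoothing; f is the fraction of probes used in each local fit.
# Mapping Lwidth to f this way is an assumption made for this sketch.
Lwidth <- 15
low <- stats::lowess(seq_len(n), logratio, f = Lwidth / n)

# Compare the two smoothers around the simulated gain.
print(round(cbind(runavg = as.numeric(runavg), lowess = low$y)[38:44, ], 3))
```

A wider window gives a smoother curve but blurs the edges of short aberrations, which is the trade-off Rwidth and Lwidth control.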

Value

Returns 1 on success and 0 on failure.
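
Because the return value is only a status code, a caller may want to check it before opening the report. The helper below is hypothetical (`results_page` is not part of CGHweb); it assumes the report is at tempDir/resultDir/results.html, as the description of resultDir suggests.

```r
# Hypothetical helper: turn the 0/1 status into the report path or an error.
results_page <- function(status, tempDir = getwd(), resultDir = "CGHResults") {
  if (!identical(as.integer(status), 1L)) {
    stop("runCGHAnalysis reported failure (return value ", status, ")")
  }
  file.path(tempDir, resultDir, "results.html")
}

# Example with a literal status code instead of a real run.
print(results_page(1, tempDir = "work", resultDir = "CGHResults"))
```

With a real run, the value returned by runCGHAnalysis would be passed as `status`, along with the same tempDir and resultDir used in the call.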

Note

none

Author(s)

Vidhu Choudhary

References

"http://compbio.med.harvard.edu/CGHweb"

Examples

data(BacArray)
## Run with all the default settings.
x <- runCGHAnalysis(tab, tempDir = getwd(), resultDir = "CGHResults")
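
To run only some of the algorithms, the other switches can be set to FALSE. The sketch below builds the argument list from the names in the Usage section and keeps only CBS and GLAD. The values chosen for alpha and qlambda are illustrative, not recommendations, and the call is skipped unless CGHweb is installed and a data frame named `tab` exists, as in the example above.

```r
# Algorithm switches listed in the Usage section.
algo_flags <- c("BioHMM", "Lowess", "Wavelet", "Runavg", "CBS", "Picard",
                "FusedLasso", "GLAD", "FASeg", "Quantreg")

# Keep only CBS and GLAD; every other switch becomes FALSE.
keep <- c("CBS", "GLAD")
args <- as.list(setNames(algo_flags %in% keep, algo_flags))

# Illustrative settings inside the documented ranges.
args$alpha   <- 0.01   # CBS significance level
args$qlambda <- 0.99   # GLAD scaling parameter

# Run only if CGHweb is installed and a data frame `tab` is available.
if (requireNamespace("CGHweb", quietly = TRUE) &&
    exists("tab") && is.data.frame(tab)) {
  x <- do.call(CGHweb::runCGHAnalysis, c(list(tab = tab), args))
}
```

Building the list programmatically keeps the switches in one place, so changing `keep` is enough to select a different set of algorithms.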

[Package CGHweb version 1.0 Index]