To cluster samples based on CNV data, we need a data frame in which every sample is described over the same set of regions. However, CNV segments differ from sample to sample: they may overlap or be entirely distinct. I think the CNTools package might solve this problem. An example is shown below. The result is a reduced segment data frame.
BiocManager::install("CNTools")
library(CNTools)
data("sampleData")
seg <- CNSeg(sampleData)
rdseg <- getRS(seg, by = "region", imput = FALSE, XY = FALSE, what = "mean")
View(rdseg@rs)
The input data frame has six columns ("ID", "chrom", "loc.start", "loc.end", "num.mark", "seg.mean") and contains 54825 segments from 277 samples.
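To make the idea of a "reduced segment" table more concrete, here is a minimal base R sketch on made-up data in the same six-column layout. It is not CNTools' actual algorithm, only an illustration of the general idea: cut the chromosome at every breakpoint seen in any sample, then give each sample one value per shared region. The helper `region_value` and all values are hypothetical.

```r
# Toy input in the same six-column layout as the input data (values are made up).
segs <- data.frame(
  ID        = c("S1", "S1", "S2", "S2"),
  chrom     = c("1", "1", "1", "1"),
  loc.start = c(1, 101, 1, 61),
  loc.end   = c(100, 200, 60, 200),
  num.mark  = c(10, 10, 6, 14),
  seg.mean  = c(0.1, -0.5, 0.3, -0.2),
  stringsAsFactors = FALSE
)

# Collect every breakpoint from every sample: each segment start, and the
# position right after each segment end.
bp     <- sort(unique(c(segs$loc.start, segs$loc.end + 1)))
starts <- bp[-length(bp)]
ends   <- bp[-1] - 1

# Hypothetical helper: seg.mean of the segment in sample `id` that fully
# covers the region [start, end], or NA if no segment covers it.
region_value <- function(id, start, end) {
  hit <- segs$ID == id & segs$loc.start <= start & segs$loc.end >= end
  if (any(hit)) segs$seg.mean[hit][1] else NA_real_
}

# One row per shared region, one column per sample.
reduced <- data.frame(chrom = "1", start = starts, end = ends,
                      stringsAsFactors = FALSE)
for (id in unique(segs$ID)) {
  reduced[[id]] <- vapply(
    seq_along(starts),
    function(i) region_value(id, starts[i], ends[i]),
    numeric(1)
  )
}
print(reduced)
```

The two samples start with different segment boundaries but end up described over the same three regions, which is the shape a clustering function needs.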
The result is stored in rdseg@rs.
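Once every sample is described over the same regions, clustering is straightforward. The sketch below uses a small made-up table as a stand-in for the reduced segment data frame. The assumed layout (three annotation columns named `chrom`, `start` and `end`, followed by one numeric column per sample) and the sample names are assumptions, so check the column names and types of your own result before reusing it.

```r
# Stand-in for the reduced segment table (assumed layout, made-up values):
# annotation columns first, then one column per sample.
rs <- data.frame(
  chrom   = c("1", "1", "2", "2"),
  start   = c(1, 5001, 1, 8001),
  end     = c(5000, 9000, 8000, 12000),
  sampleA = c(0.8, 0.7, -0.1, 0.0),
  sampleB = c(0.9, 0.6, 0.0, -0.1),
  sampleC = c(-0.5, -0.6, 0.4, 0.5),
  sampleD = c(-0.4, -0.7, 0.5, 0.4),
  stringsAsFactors = FALSE
)

# Keep only the sample columns and transpose so that rows are samples.
ann_cols <- c("chrom", "start", "end")
mat <- t(as.matrix(rs[, setdiff(names(rs), ann_cols)]))

# Drop regions that are missing in any sample before computing distances.
mat <- mat[, colSums(is.na(mat)) == 0, drop = FALSE]

# Hierarchical clustering on Euclidean distances between samples.
d      <- dist(mat, method = "euclidean")
hc     <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)
print(groups)
```

If the sample columns of a real result are stored as character rather than numeric, they would need to be converted first, for example with `as.numeric`. The distance measure, linkage method and number of clusters here are arbitrary choices for illustration.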