MATLAB error when running GISTIC

If you install the MCR (MATLAB Compiler Runtime) provided with the GISTIC package, you may see the following error, which can disrupt GISTIC:

libGL error: failed to load driver: swrast

If this happens, rename the file found at "$MATLAB_ROOT/sys/os/glnxa64/libstdc++.so.6" to "libstdc++.so.6.old". This forces MATLAB to use the OS library instead of its bundled copy. It worked for me.

Ref:
https://ww2.mathworks.cn/matlabcentral/answers/296999-libgl-error-unable-to-load-driver-in-ubuntu-16-04-while-running-matlab-r2013b
GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers
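
The rename can also be done from an R session. The following is only a minimal sketch: the helper name rename_bundled_libstdcxx is made up for illustration, and it assumes the MATLAB_ROOT environment variable points at the MCR installation directory.

```r
# Hypothetical helper: rename the libstdc++ bundled with MATLAB/MCR so that
# the OS copy is picked up instead.
# Assumption: MATLAB_ROOT points at the MCR installation directory.
rename_bundled_libstdcxx <- function(matlab_root = Sys.getenv("MATLAB_ROOT")) {
  lib <- file.path(matlab_root, "sys", "os", "glnxa64", "libstdc++.so.6")
  if (!file.exists(lib)) {
    message("Nothing to rename: ", lib, " not found")
    return(invisible(FALSE))
  }
  # file.rename() returns TRUE on success, FALSE otherwise
  invisible(file.rename(lib, paste0(lib, ".old")))
}

# Usage (uncomment to run against the real installation):
# rename_bundled_libstdcxx()
```

Depending on where the MCR is installed, the rename may need write permission on that directory. Renaming "libstdc++.so.6.old" back to its original name undoes the change.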

Prepare a data frame for sample CNV data

If we want to cluster samples based on CNV data, a data frame is needed. However, the CNV segments are not the same in each sample: they may overlap or be entirely distinct. I think the CNTools package might solve this problem. An example is shown below. The result is a reduced segment data frame.

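# Install CNTools from Bioconductor (requires the BiocManager package)
BiocManager::install("CNTools")
library(CNTools)
# Example segment data shipped with CNTools
data("sampleData")
seg <- CNSeg(sampleData)
# Reduce the segments to shared regions, with one column per sample
rdseg <- getRS(seg, by = "region", imput = FALSE, XY = FALSE, what = "mean")
View(rdseg@rs)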
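
Once the segments are reduced to a common set of regions, the samples can be clustered. The sketch below uses a small made-up table instead of real CNTools output so that it runs on its own. It assumes the reduced table has the columns chrom, start and end followed by one numeric column per sample; the sample names S1 to S4 and all values are hypothetical.

```r
# Hypothetical reduced-segment table: one row per region, one column per sample
rs_toy <- data.frame(
  chrom = c("1", "1", "2", "2"),
  start = c(1, 5001, 1, 8001),
  end   = c(5000, 9000, 8000, 12000),
  S1 = c(0.9, 0.1, -0.8, 0.0),
  S2 = c(1.0, 0.0, -0.7, 0.1),
  S3 = c(-0.1, 0.8, 0.1, -0.9),
  S4 = c(0.0, 0.9, 0.0, -1.0)
)

# Keep only the sample columns and transpose so that samples become rows
sample_cols <- setdiff(names(rs_toy), c("chrom", "start", "end"))
mat <- t(as.matrix(rs_toy[, sample_cols]))

# Hierarchical clustering on Euclidean distances between samples
hc <- hclust(dist(mat, method = "euclidean"), method = "average")

# Cut the tree into two groups of samples
clusters <- cutree(hc, k = 2)
print(clusters)
```

With real data, the sample columns of the reduced table may first need converting to numeric. The distance measure and linkage method used here are arbitrary choices.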

Understanding Autoencoders

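Data is often too high-dimensional, which makes it hard to visualize and also gets in the way of model learning. Sometimes, on getting a dataset, we run PCA or t-SNE to reduce it to 2 dimensions (3 also works) so that the relationships among the data points are easier to see. In machine learning, an autoencoder is another way to reduce dimensionality. The number of neurons in an autoencoder's input layer must equal the number in its output layer, and the output should match the input as closely as possible.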
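
To make the idea concrete, here is a minimal sketch of a linear autoencoder in base R, trained by plain gradient descent on simulated data. It is a toy illustration only: real autoencoders usually add nonlinear activations and are built with a deep learning library, and every size, name and learning rate below is an arbitrary assumption.

```r
set.seed(1)
n <- 200  # number of samples
d <- 5    # input dimension (the output dimension is the same)
k <- 2    # size of the bottleneck, i.e. the reduced dimension

# Simulated data lying close to a 2-D subspace of the 5-D space
Z <- matrix(rnorm(n * k), n, k)
A <- matrix(rnorm(k * d), k, d)
X <- Z %*% A + matrix(rnorm(n * d, sd = 0.05), n, d)
X <- scale(X, center = TRUE, scale = FALSE)

# Encoder (d -> k) and decoder (k -> d) weights, small random start
W_enc <- matrix(rnorm(d * k, sd = 0.1), d, k)
W_dec <- matrix(rnorm(k * d, sd = 0.1), k, d)

lr <- 0.02
loss_history <- numeric(3000)
for (i in seq_along(loss_history)) {
  H <- X %*% W_enc        # n x k low-dimensional code
  X_hat <- H %*% W_dec    # n x d reconstruction of the input
  E <- X_hat - X          # reconstruction error
  loss_history[i] <- mean(E^2)
  # Gradients of the mean squared reconstruction error
  grad_dec <- (2 / (n * d)) * t(H) %*% E
  grad_enc <- (2 / (n * d)) * t(X) %*% (E %*% t(W_dec))
  W_dec <- W_dec - lr * grad_dec
  W_enc <- W_enc - lr * grad_enc
}

code <- X %*% W_enc  # 2-D representation of each sample
cat("first loss:", loss_history[1], " last loss:", tail(loss_history, 1), "\n")
```

The input and the reconstruction both have d columns, while the code in the middle has only k. Plotting the two columns of code plays the same role as a PCA or t-SNE plot.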

Detecting Alternative Polyadenylation (APA)

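In alternative polyadenylation (APA), as shown in the figure below (image taken from the reference), the transcript is cleaved at different APA signal sites and a poly(A) tail is then added. This is a post-transcriptional regulatory mechanism. It may affect the protein sequence (when it occurs in the coding region), and it may also affect protein stability (for example through miRNA regulatory regions in the non-coding region). It is in fact also a case of alternative splicing.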

A commonly used tool is DaPars, which now has an upgraded version, DaPars2. The analysis workflows are very similar; DaPars2 adds normalization of library sizes. References: https://github.com/ZhengXia/dapars https://github.com/3UTR/DaPars2
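
To illustrate what normalizing library sizes means in general, here is a small sketch in R. It is a generic example and is not taken from the DaPars2 source code: the coverage values, sample names and sequencing depths are all made up, and scaling by depth relative to the mean depth is just one common choice.

```r
# Hypothetical read coverage at three positions of a 3'UTR (rows = positions)
coverage <- cbind(sampleA = c(100, 80, 20),
                  sampleB = c(200, 160, 40))

# Hypothetical total number of mapped reads per sample
depth <- c(sampleA = 1e7, sampleB = 2e7)

# Weight of each sample: its depth relative to the mean depth
weights <- depth / mean(depth)

# Divide each sample's coverage column by that sample's weight
normalized <- sweep(coverage, 2, weights[colnames(coverage)], "/")
print(normalized)
```

In this toy case sampleB was sequenced twice as deeply as sampleA, so its raw coverage is twice as high. After normalization the two profiles are directly comparable.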