Sort VCF by Chr and Pos
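This post covers how to sort the variants in a VCF file by chromosome and position. Below is my own summary of the available methods; the bash approach does not depend on any other tools or packages. htslib was also covered in an earlier post.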
1, Use bash
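With only standard shell tools, one way to do this is to copy the header lines and sort the records separately. The following is a minimal sketch under stated assumptions, not necessarily the exact command the author used: the function name sort_vcf and the file names are hypothetical, the input is assumed to be an uncompressed, tab-delimited VCF with all header lines at the top, and the -V (version sort) key relies on GNU sort.

```shell
# Sketch: sort a VCF by chromosome and position using only grep and sort.
# Assumes an uncompressed, tab-delimited VCF whose header lines (#...) come first.
sort_vcf() {
    # $1 = input VCF, $2 = output VCF (hypothetical file names)
    tab=$(printf '\t')
    # Copy the header block unchanged.
    grep '^#' "$1" > "$2"
    # Sort records by CHROM (natural order, GNU sort -V), then by POS (numeric).
    grep -v '^#' "$1" | LC_ALL=C sort -t "$tab" -k1,1V -k2,2n >> "$2"
}
```

It could be called as `sort_vcf raw.vcf sorted.vcf`. On systems without GNU sort, replacing -k1,1V with -k1,1 still works but orders contigs byte-wise, so chr10 comes before chr2.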
2, Use awk and sed
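This assumes contig names carry a chr prefix: numeric chromosomes are sorted numerically first, then the remaining ones (X, Y, M, ...) in dictionary order.
(awk '/^#/{print} !/^#/{exit}' raw.vcf; sed '/^#/d' raw.vcf | awk -F"\t" '{sub("^chr","",$0)} ($1~/^[0-9]+$/){print $0}' | sort -k1,1n -k2,2n | awk '{print "chr"$0}'; sed '/^#/d' raw.vcf | awk -F"\t" '{sub("^chr","",$0)} ($1!~/^[0-9]+$/){print $0}' | sort -k1,1d -k2,2n | awk '{print "chr"$0}') > sort.vcf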
3, Use Picard
Sorts one or more VCF files. This tool sorts the records in VCF files according to the order of the contigs in the header/sequence dictionary and then by coordinate. It can accept an external sequence dictionary.
java -jar picard.jar SortVcf I=unsort.vcf O=sorted.vcf
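To make the ordering rule described above concrete, here is a small shell sketch that sorts records by the order of the ##contig lines in the header and then by position. It only illustrates the rule and is not how Picard is implemented: the function name sort_vcf_by_header and the file names are hypothetical, the input is assumed to be an uncompressed VCF that declares its contigs in the header, and contigs missing from the header are placed last.

```shell
# Sketch: order records by the ##contig lines in the header, then by position.
# Illustrates the ordering rule only; this is not Picard's implementation.
sort_vcf_by_header() {
    # $1 = input VCF, $2 = output VCF (hypothetical file names)
    tab=$(printf '\t')
    # Copy the header block unchanged.
    grep '^#' "$1" > "$2"
    awk -F '\t' -v OFS='\t' '
        # Header: remember the rank of each contig ID in order of appearance.
        /^##contig=<ID=/ {
            id = $0
            sub(/^##contig=<ID=/, "", id)
            sub(/[,>].*$/, "", id)
            rank[id] = ++n
            next
        }
        /^#/ { next }
        # Records: prefix each line with its contig rank (unknown contigs last).
        { r = ($1 in rank) ? rank[$1] : n + 1; print r, $0 }
    ' "$1" | sort -t "$tab" -k1,1n -k3,3n | cut -f2- >> "$2"
}
```

After decoration the rank is column 1 and POS becomes column 3, which is why the sort keys are -k1,1n and -k3,3n; cut then removes the temporary rank column.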
4, Use vcf-sort (in vcftools)
cat file.vcf | vcf-sort > sorted.vcf
5, Use htslib
How to install htslib
How to install vcftools
Ref:
https://www.biostars.org/p/84747/
http://broadinstitute.github.io/picard/command-line-overview.html#SortVcf
https://github.com/samtools/htslib
http://vcftools.sourceforge.net/perl_module.html#vcf-sort
####################################################################
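#All rights reserved. Please notify the author before reposting. Copyright belongs to the author; any infringement, once discovered, will be pursued under the law.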
#Author: Jason
###################################################################
Author: zzx
Last updated: 2016-03-01