Tag archive: 1000 Genome

Get the allele frequencies of 1000 Genomes subpopulations

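The 1000 Genomes FTP server provides an "all" VCF file that lists the allele frequencies of the AFR, AMR, EAS, EUR, and SAS populations. The frequencies of the subpopulations within each of these groups are not in that file; they can be looked up at http://grch37.ensembl.org, for example http://grch37.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=1:230845294-230846294;v=rs699;vdb=variation;vf=102788013 . Alternatively, download the VCF files that contain the variant information of every sample and compute the frequencies with vcftools.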

The allele frequencies of the super populations (AFR, AMR, EAS, EUR, SAS; see http://www.1000genomes.org/category/population/) can be obtained from all.vcf.

However, the allele frequencies of the subpopulations are not as easy to obtain.

One way is to search http://grch37.ensembl.org by rs identifier, e.g. http://grch37.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=1:230845294-230846294;v=rs699;vdb=variation;vf=102788013 .
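The same lookup can probably be scripted instead of clicked through. A minimal sketch, assuming the Ensembl GRCh37 REST server at https://grch37.rest.ensembl.org and its variation endpoint with the pops=1 option; the helper name ensembl_variation_url is made up for illustration and is not part of any Ensembl tool:

```shell
# Hypothetical helper: build the REST URL for one rs identifier.
# Assumes the Ensembl GRCh37 REST server and its "variation" endpoint;
# pops=1 asks for population allele frequencies in the response.
ensembl_variation_url() {
    rsid="$1"   # e.g. rs699
    printf 'https://grch37.rest.ensembl.org/variation/human/%s?pops=1\n' "$rsid"
}

# Usage (needs network access and curl):
#   curl -s -H 'Content-Type: application/json' "$(ensembl_variation_url rs699)"
```

If the endpoint behaves as assumed, the JSON response should include one frequency entry per population, so the subpopulation of interest still has to be filtered out of it.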

Another way is to calculate them on your local machine. The following shows how to get the allele frequencies of the CHB (Han Chinese in Beijing) population on chromosome 1.

Prepare files and software dependencies

The panel file contains the sample information.

panel file: integrated_call_samples_v3.20130502.ALL.panel

The VCF file contains the allele information of all samples.

chr1.vcf: ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf

Also, you should install vcftools.

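# Download the panel file (sample-to-population mapping)
wget ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/integrated_call_samples_v3.20130502.ALL.panel
# Download the chr1 genotypes of all samples (large file)
wget ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz
# Build and install vcftools from source
git clone https://github.com/vcftools/vcftools.git
cd vcftools/
./autogen.sh
./configure
make
make install   # may need sudo, depending on the install prefix
cd ..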
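
With the files downloaded and vcftools installed, the CHB calculation could look roughly like the sketch below. It assumes the panel file has a header line followed by whitespace-separated columns (sample, pop, super_pop, gender) and uses the vcftools options --gzvcf, --keep, --freq and --out. The function names and the output names CHB.samples and chr1_CHB are illustrative choices, not taken from the original post.

```shell
# Sketch only: function names and output file names are illustrative.

# Print the sample IDs of one population from a panel file.
# Assumes a header line and columns: sample, pop, super_pop, gender.
extract_population_samples() {
    panel_file="$1"   # e.g. integrated_call_samples_v3.20130502.ALL.panel
    population="$2"   # e.g. CHB
    # Skip the header (NR > 1); print column 1 where column 2 matches.
    awk -v pop="$population" 'NR > 1 && $2 == pop { print $1 }' "$panel_file"
}

# Compute per-site allele frequencies using only the listed samples.
# vcftools writes the result to <out_prefix>.frq.
compute_population_freq() {
    vcf_gz="$1"         # gzipped VCF with the genotypes of all samples
    samples_file="$2"   # one sample ID per line
    out_prefix="$3"     # prefix for the output files
    vcftools --gzvcf "$vcf_gz" --keep "$samples_file" --freq --out "$out_prefix"
}

# Usage:
#   extract_population_samples integrated_call_samples_v3.20130502.ALL.panel CHB > CHB.samples
#   compute_population_freq ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz CHB.samples chr1_CHB
```

Keeping the sample list in its own file means the same two steps can be repeated for any other population code in the panel file by changing only the second argument.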
