一直用miRDeep2来定量miRNA。miRDeep2包含了bowtie比对和定量两步,比自己搭流程更方便一点。最近换了太服务器准备跑miRDeep2,发现要准备的文件还挺多,比如全基因组的bowtie索引,miRBase的序列等,都忘记当初准备的过程了,所以重新整理一下。

安装

我用conda 安装的

1
mamba install -c bioconda mirdeep2

文件准备

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
mkdir -p ~/ref/miRBase22
cd ~/ref/miRBase22

# 下载miRBase22
wget -c https://www.mirbase.org/ftp/CURRENT/mature.fa.gz
wget -c https://www.mirbase.org/ftp/CURRENT/hairpin.fa.gz

# 提取人has的序列
gunzip mature.fa.gz hairpin.fa.gz
extract_miRNAs.pl mature.fa hsa > mature_ref.fa
extract_miRNAs.pl hairpin.fa hsa > hairpin_ref.fa

# 如果需要预测novel miRNA,还需要下载人基因组序列
wget -c https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_43/GRCh38.p13.genome.fa.gz
gunzip GRCh38.p13.genome.fa.gz
bowtie-build GRCh38.p13.genome.fa GRCh38.p13.genome.fa

运行miRDeep2

这里没有预测新的novel miRNA,另外fastq应该是经过质控后的,不含adapter序列。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
        gunzip -dc clean.R1.fastq.gz > clean.R1.fastq
        # -s file         print processed reads to this file
        # -t file         print read mappings to this file
        # -v              outputs progress report
        mapper.pl clean.R1.fastq -e -h -i -j -k $Adapter -l 18 -m -n -q -p /path/to/GRCh38.p13.genome.fa -o 10 -s mapper.reads_collapsed.fa -t mapper.reads_vs_refdb.arf -v 2> Mapper_log

        # -W           read counts are weighed by their number of mappings. e.g. A read maps twice so each position gets 0.5 added to its read profile
        # -p precursor.fa  miRNA precursor sequences from miRBase
        # -d    if parameter given pdfs will not be generated, otherwise pdfs will be generated
        quantifier.pl -W -T 10 -p /path/to/miRBase22/hairpin_ref.fa -m /path/to/miRBase22/mature_ref.fa -r mapper.reads_collapsed.fa -t hsa -y 01_05_2023 -d 2> Quantifier_log
         
        # final file: miRNAs_expressed_all_samples_01_05_2023.csv

结果

1
2
3
4
5
# norm是CPM,seq是count,同一个miRNA可能来源于不同的precursor
#miRNA  read_count      precursor       total   seq     seq(norm)
hsa-let-7a-3p   7.50    hsa-let-7a-1    7.50    7.50    45.29
hsa-let-7a-5p   23.62   hsa-let-7a-1    23.62   23.62   142.63
hsa-let-7a-5p   23.62   hsa-let-7a-2    23.62   23.62   142.63

参考

https://drmirdeep.github.io/mirdeep2_tutorial.html

####################################################################

#版权所有 转载请告知 版权归作者所有 如有侵权 一经发现 必将追究其法律责任

#Author: Jason

#####################################################################