I often work with GTF files and need to retrieve gene information from them. I could not find a tool flexible enough for my needs, so I modified the code from https://github.com/Jverma/GFF-Parser (thanks, Jverma). This tool should be easier to use.
# Usage
Basically, `getRecordsByID` takes three parameters.
id: either a transcript id or a gene id.
attType: the attribute to filter on, as defined in the GTF file, e.g. feature (column 3), or gene_name, exon_number, transcript_id (column 9).
attValue: the value of that attribute you want to search for.
    >>> import sys
    >>> from gtfParser import gtfParser
    >>> gtf = gtfParser("example.gtf")
    >>> # Get all exons in gene "CDK7"
    >>> gtf.getRecordsByID("CDK7", "feature", "exon")
    >>> # Get all features of transcript "NM_001324069" in gene "CDK7"
    >>> gtf.getRecordsByID("CDK7", "transcript_id", "NM_001324069")
    >>> # Get the start codon (feature "start_codon") of transcript "NM_001324069"
    >>> gtf.getRecordsByID("NM_001324069", "feature", "start_codon")
    >>> # Get the exon with exon_id "NM_001324078.1" in transcript "NM_001324078"
    >>> gtf.getRecordsByID("NM_001324078", "exon_id", "NM_001324078.1")
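To make the three parameters more concrete, here is a minimal, self-contained sketch of the kind of lookup described above. It is not the actual gtfParser code: the helper names `parse_line` and `get_records_by_id` are hypothetical, the sample records are made up for illustration, and it assumes `id` is matched against the `gene_id` and `transcript_id` attributes in column 9.

```python
import re

# Made-up GTF records for illustration only (not real annotations).
SAMPLE_GTF = (
    'chr1\tdemo\texon\t100\t200\t.\t+\t.\t'
    'gene_id "GENE1"; transcript_id "TX1"; exon_number "1";\n'
    'chr1\tdemo\tstart_codon\t120\t122\t.\t+\t0\t'
    'gene_id "GENE1"; transcript_id "TX1";\n'
    'chr1\tdemo\texon\t300\t400\t.\t+\t.\t'
    'gene_id "GENE1"; transcript_id "TX2"; exon_number "1";\n'
)

# Column 9 holds attributes written as: key "value"; key "value"; ...
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_line(line):
    """Turn one tab-separated GTF line into a flat dict (hypothetical helper)."""
    cols = line.rstrip("\n").split("\t")
    record = {
        "seqname": cols[0],
        "feature": cols[2],  # column 3, searchable as attType "feature"
        "start": int(cols[3]),
        "end": int(cols[4]),
        "strand": cols[6],
    }
    # Add every key/value pair from column 9 to the same dict.
    record.update(ATTR_RE.findall(cols[8]))
    return record


def get_records_by_id(records, id_, att_type, att_value):
    """Keep records whose gene_id or transcript_id equals id_ (an assumption)
    and whose att_type attribute equals att_value."""
    return [
        r for r in records
        if id_ in (r.get("gene_id"), r.get("transcript_id"))
        and r.get(att_type) == att_value
    ]


records = [parse_line(line) for line in SAMPLE_GTF.splitlines() if line]
exons = get_records_by_id(records, "GENE1", "feature", "exon")
start_codons = get_records_by_id(records, "TX1", "feature", "start_codon")
```

Treating the feature column as just another attribute is what lets a single call signature cover both "all exons of a gene" and "all records of one transcript".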
# Example gtf
Here is a simple example GTF file that you can use for testing. It is a subset of refSeq.hg38.gtf.