@paulklemm
Created November 14, 2016 13:17
DEXSeq prepare annotation throws error for RNACentral GTF file
# Download the RNAcentral GFF3 annotation (release 5.0) and decompress it
wget ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/releases/5.0/genome_coordinates/Mus_musculus.GRCm38.gff3.gz
gunzip --verbose Mus_musculus.GRCm38.gff3.gz
# Download gffread source and compile it
mkdir gffread_build
cd gffread_build
git clone https://github.com/gpertea/gclib
git clone https://github.com/gpertea/gffread
cd gffread
make
# Convert GFF3 to GTF using the freshly built binary in the current directory
./gffread ../../Mus_musculus.GRCm38.gff3 -T -o ../../Mus_musculus.GRCm38.gtf
# Remove gffread_build again
cd ../../
rm -r -f gffread_build/
# Download dexseq_prepare_annotation.py from the Bioconductor Git mirror
wget https://raw.githubusercontent.com/Bioconductor-mirror/DEXSeq/master/inst/python_scripts/dexseq_prepare_annotation.py
python dexseq_prepare_annotation.py Mus_musculus.GRCm38.gtf dexseq_prepare_annotation.out
# Yields the following error:
# Traceback (most recent call last):
#   File "dexseq_prepare_annotation.py", line 54, in <module>
#     f.attr['gene_id'] = f.attr['gene_id'].replace( ":", "_" )
# KeyError: 'gene_id'
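# The KeyError means at least one feature handed to the script has no gene_id
# attribute. A minimal diagnostic sketch, not part of the original report:
# count the feature lines in the converted GTF whose attribute column (column 9)
# lacks gene_id. The helper name count_missing_gene_id is hypothetical, and the
# match is a plain substring test, so treat the count as an approximation.

```shell
# Hypothetical helper: print how many feature lines of a GTF file
# have no gene_id attribute in column 9.
count_missing_gene_id() {
  awk -F '\t' '
    /^#/   { next }            # skip comment/header lines
    NF < 9 { next }            # skip blank or malformed lines
    $9 !~ /gene_id "/ { n++ }  # attribute column carries no gene_id
    END    { print n + 0 }     # print 0 rather than an empty string
  ' "$1"
}

# Run against the converted file only if it exists.
if [ -f Mus_musculus.GRCm38.gtf ]; then
  count_missing_gene_id Mus_musculus.GRCm38.gtf
fi
```

# A non-zero count would be consistent with the traceback above; it does not
# by itself say how the missing gene_id values should be filled in.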