outrigger.index.adjacencies module

Find exons adjacent to junctions

class outrigger.index.adjacencies.ExonJunctionAdjacencies(metadata, db, junction_id='junction_id', exon_start='exon_start', exon_stop='exon_stop', chrom='chrom', strand='strand', max_de_novo_exon_length=100, n_jobs=-1)

Bases: object

Annotate junctions with neighboring exons (upstream or downstream)

Methods

detect_exons_from_junctions() Find exons based on gaps in junctions
junctions_adjacent_to_this_exon(exon) Get junctions adjacent to this exon
upstream_downstream_exons() Get upstream and downstream exons of each junction
write_de_novo_exons([filename]) Write all de novo exons to a gtf

Initialize class to get upstream/downstream exons of junctions

Parameters:

metadata : pandas.DataFrame

A table of splice junctions with the columns indicated by the variables junction_id, exon_start, exon_stop, chrom, strand

db : gffutils.FeatureDB

A gffutils database of gene, transcript, and exon features.

junction_id, exon_start, exon_stop, chrom, strand : str

Names of the corresponding columns in metadata
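
As an illustration of the metadata layout, the sketch below builds rows carrying the five default column names, using plain Python dictionaries in place of a pandas.DataFrame. The helper name, the junction ID format, and the rule that exon_stop is one base before the junction start while exon_start is one base after the junction end are all assumptions made for this example; this page does not specify them.

```python
# Hypothetical helper (not part of outrigger): derive the five metadata
# columns from junction coordinates.
# Assumption: the upstream exon ends one base before the junction starts,
# and the downstream exon begins one base after the junction ends.
def junction_row(chrom, start, stop, strand):
    return {
        "junction_id": f"junction:{chrom}:{start}-{stop}:{strand}",  # made-up ID format
        "chrom": chrom,
        "strand": strand,
        "exon_stop": start - 1,
        "exon_start": stop + 1,
    }


rows = [
    junction_row("chr1", 201, 299, "+"),
    junction_row("chr1", 401, 499, "+"),
]

# Check that every row carries the columns named in the class signature.
required = {"junction_id", "exon_start", "exon_stop", "chrom", "strand"}
missing = [required - set(row) for row in rows]
```

In real use, rows like these would be the rows of the DataFrame passed as metadata, with the column names matching the junction_id, exon_start, exon_stop, chrom, and strand arguments.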

detect_exons_from_junctions()

Find exons based on gaps in junctions

exon_types = ('exon', 'novel_exon')
junctions_adjacent_to_this_exon(exon)

Get junctions adjacent to this exon

Parameters:

exon : gffutils.Feature

An item in a gffutils database

upstream_downstream_exons()

Get upstream and downstream exons of each junction

“Upstream” and “downstream” are relative to the junction, e.g.

exonA upstream junctionX
exonB downstream junctionX

should be read as “exonA is upstream of junctionX” and “exonB is downstream of junctionX”.

Use junctions defined in sj_metadata and exons in db to create triples of (exon, direction, junction), which are read like (subject, verb, object), e.g. (‘exon1’, ‘upstream’, ‘junction12’), for creation of a graph database.

Parameters:

sj_metadata : pandas.DataFrame

A splice junction metadata dataframe with the junction id as the index, with columns defined by variables exon_start and exon_stop.

db : gffutils.FeatureDB

A database of gene annotations created by gffutils. Must have features of type “exon”

exon_start : str, optional

Name of the column in sj_metadata corresponding to the start of the exon

exon_stop : str, optional

Name of the column in sj_metadata corresponding to the end of the exon

Returns:

junction_exon_triples : pandas.DataFrame

A three-column dataframe describing the relationship of where an exon is relative to junctions
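
The (exon, direction, junction) triples described above can be sketched in plain Python. This is a minimal illustration, not outrigger's implementation: the Exon tuple, the function name, and the matching rule (an exon ending at the junction's exon_stop is upstream on the + strand, an exon beginning at its exon_start is downstream, and the roles flip on the - strand) are assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical stand-in for an exon feature from the annotation database.
Exon = namedtuple("Exon", "id chrom start stop strand")


def exon_junction_triples(junctions, exons):
    """Return (exon, direction, junction) triples for matching coordinates.

    junctions: list of dicts with keys junction_id, chrom, strand,
    exon_start, exon_stop (the default column names of the metadata table).
    """
    triples = []
    for junction in junctions:
        plus = junction["strand"] == "+"
        for exon in exons:
            if (exon.chrom, exon.strand) != (junction["chrom"], junction["strand"]):
                continue
            # Assumed rule: exon whose last base is the junction's exon_stop.
            if exon.stop == junction["exon_stop"]:
                direction = "upstream" if plus else "downstream"
                triples.append((exon.id, direction, junction["junction_id"]))
            # Assumed rule: exon whose first base is the junction's exon_start.
            if exon.start == junction["exon_start"]:
                direction = "downstream" if plus else "upstream"
                triples.append((exon.id, direction, junction["junction_id"]))
    return triples


junctions = [{"junction_id": "junction12", "chrom": "chr1", "strand": "+",
              "exon_stop": 200, "exon_start": 300}]
exons = [Exon("exon1", "chr1", 100, 200, "+"),
         Exon("exon2", "chr1", 300, 400, "+")]
triples = exon_junction_triples(junctions, exons)
```

Here triples holds one row per exon-junction relationship, which corresponds to the three-column layout of the returned dataframe.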

write_de_novo_exons(filename='novel_exons.gtf')

Write all de novo exons to a GTF file

outrigger.index.adjacencies.is_there_an_exon_here(self, junction1, junction2)

Check if there could be an exon between these two junctions

Parameters:

junction{1,2} : outrigger.Region

The two junctions to check, as outrigger.Region objects

Returns:

start, stop : (int, int) or (False, False)

Start and stop of the new exon if it exists, else False, False
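
One way to picture the check is as a test on the gap between two non-overlapping junctions. The sketch below is an assumption-laden illustration rather than outrigger's code: Region here is a hypothetical namedtuple standing in for outrigger.Region, exon_between is a made-up name, and the use of a maximum length of 100 simply mirrors the max_de_novo_exon_length default in the class signature.

```python
from collections import namedtuple

# Hypothetical stand-in for outrigger.Region (attribute names are assumed).
Region = namedtuple("Region", "chrom start stop strand")


def exon_between(junction1, junction2, max_length=100):
    """Return (start, stop) of a candidate exon in the gap, else (False, False)."""
    if (junction1.chrom, junction1.strand) != (junction2.chrom, junction2.strand):
        return False, False
    # Order the junctions by genomic position so the gap lies between them.
    first, second = sorted([junction1, junction2], key=lambda r: r.start)
    start = first.stop + 1    # first base after the earlier junction
    stop = second.start - 1   # last base before the later junction
    if stop < start:
        # Junctions overlap or abut: there is no room for an exon.
        return False, False
    if stop - start + 1 > max_length:
        # Gap is longer than the allowed de novo exon length.
        return False, False
    return start, stop


found = exon_between(Region("chr1", 201, 299, "+"), Region("chr1", 381, 499, "+"))
overlapping = exon_between(Region("chr1", 201, 299, "+"), Region("chr1", 250, 350, "+"))
too_long = exon_between(Region("chr1", 201, 299, "+"), Region("chr1", 1001, 1100, "+"))
```

Returning (False, False) instead of None keeps the two-value unpacking pattern, start, stop = ..., working in both the found and not-found cases, which matches the return type documented above.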