outrigger.io.star module
Read splice junction output files from STAR aligner (SJ.out.tab)

outrigger.io.star.make_metadata(spliced_reads, columns=('junction_id', 'chrom', 'junction_start', 'junction_stop', 'strand', 'annotated', 'exon_start', 'exon_stop'))
Get barebones junction chrom, start, stop, and strand information
Parameters: spliced_reads : pandas.DataFrame
Concatenated SJ.out.tab files created by read_sj_out_tab
columns : iterable
Which columns to use to make the metadata
Returns: junctions : pandas.DataFrame
- A (n_junctions, 9) dataframe containing the columns:
- junction_id
- chrom
- intron_start
- intron_stop
- exon_start
- exon_stop
- strand
- intron_motif
- annotated
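
To illustrate the idea of reducing per-sample junction reads to one row per junction, here is a minimal sketch. It is not the implementation of make_metadata: the helper name metadata_sketch, the toy data, and the select-then-deduplicate approach are all assumptions made for illustration.

```python
import pandas as pd

# Made-up per-sample junction reads: junction 'j1' is observed in two samples
spliced_reads = pd.DataFrame({
    'junction_id': ['j1', 'j1', 'j2'],
    'chrom': ['chr1', 'chr1', 'chr1'],
    'strand': ['-', '-', '-'],
    'sample_id': ['a', 'b', 'a'],
    'reads': [10, 4, 7],
})


def metadata_sketch(spliced_reads, columns):
    """Hypothetical reduction: keep only `columns`, one row per junction."""
    subset = spliced_reads[list(columns)]
    return subset.drop_duplicates().reset_index(drop=True)


junctions = metadata_sketch(spliced_reads, ('junction_id', 'chrom', 'strand'))
```

Sample-specific columns such as sample_id and reads are dropped, so each junction appears once regardless of how many samples contain it.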

outrigger.io.star.read_multiple_sj_out_tab(filenames, ignore_multimapping=False, sample_id_func=<function basename>, n_jobs=-1)
Read the splice junction files and return a tall, tidy dataframe
Adds a column called “sample_id” based on the basename of the file, minus “SJ.out.tab”
Parameters: filenames : iterator
A list or other iterator of filenames to read
ignore_multimapping : bool
If True, exclude multimapped reads from the total read count; if False (the default), include them
sample_id_func : function
A function to extract the sample id from the filenames
Returns: metadata : pandas.DataFrame
A tidy dataframe, where each row has the observed reads for a sample
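
The default sample_id_func is basename, which keeps the full file name. As a sketch of what a custom function might look like, the following strips the directory and the "SJ.out.tab" suffix. The name sample_id_from_filename and the trimming of a trailing "." or "_" are assumptions for illustration, not part of outrigger.

```python
import os


def sample_id_from_filename(filename, suffix='SJ.out.tab'):
    """Hypothetical sample_id_func: basename without the STAR suffix."""
    base = os.path.basename(filename)
    if base.endswith(suffix):
        base = base[:-len(suffix)]
    # Trim a trailing '.' or '_' left over from the output file prefix
    return base.rstrip('._')


sample_ids = [sample_id_from_filename(f) for f in
              ['/data/star/sample1.SJ.out.tab', 'sample2_SJ.out.tab']]
```

Based on the signature above, a function like this would presumably be supplied as sample_id_func=sample_id_from_filename.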

outrigger.io.star.read_sj_out_tab(filename)
Read an SJ.out.tab file as produced by the RNA-STAR aligner into a pandas DataFrame
Parameters: filename : str of filename or file handle
Filename of the SJ.out.tab file you want to read in
Returns: sj : pandas.DataFrame
DataFrame of splice junctions with the columns ('chrom', 'junction_start', 'junction_stop', 'strand', 'junction_motif', 'exon_start', 'exon_stop', 'annotated', 'unique_junction_reads', 'multimap_junction_reads', 'max_overhang')
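
For orientation, STAR's SJ.out.tab is a headerless, tab-separated file with nine columns. The sketch below reads those nine raw columns with plain pandas, using the column names from this page. It is a stand-in, not the implementation of read_sj_out_tab: the helper name read_sj_sketch and the example rows are made up, the exon_start and exon_stop columns listed above are not derived, and strand is left as STAR's numeric code (0 undefined, 1 for +, 2 for -).

```python
import io

import pandas as pd

# Names for the nine raw SJ.out.tab columns, following this page's naming
RAW_COLUMNS = ['chrom', 'junction_start', 'junction_stop', 'strand',
               'junction_motif', 'annotated', 'unique_junction_reads',
               'multimap_junction_reads', 'max_overhang']


def read_sj_sketch(filename_or_handle):
    """Hypothetical stand-in for read_sj_out_tab: raw columns only."""
    return pd.read_csv(filename_or_handle, sep='\t', header=None,
                       names=RAW_COLUMNS)


# Two made-up junctions in SJ.out.tab layout (tab-separated, no header)
example = io.StringIO(
    'chr1\t14830\t14969\t2\t2\t1\t10\t3\t38\n'
    'chr1\t15039\t15795\t2\t2\t1\t7\t0\t25\n'
)
sj = read_sj_sketch(example)
```

Passing a path such as 'sample1.SJ.out.tab' instead of the in-memory handle works the same way, since pandas.read_csv accepts either.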