Data models for “studies”. Studies include attributes about the data and are heavier in terms of data load.
Bases: object
A biological study, with associated metadata, expression, and splicing data.
Construct a biological study
This class only accepts data, no filenames. All data must already have been read in and exist as Python objects.
Parameters: | sample_metadata : pandas.DataFrame
version : str
expression_data : pandas.DataFrame
expression_feature_data : pandas.DataFrame
expression_feature_rename_col : str
expression_log_base : float
thresh : float
expression_plus_one : bool
splicing_data : pandas.DataFrame
splicing_feature_data : pandas.DataFrame
splicing_feature_rename_col : str
splicing_feature_expression_id_col : str
mapping_stats_data : pandas.DataFrame
mapping_stats_number_mapped_col : str
spikein_data : pandas.DataFrame
spikein_feature_data : pandas.DataFrame
drop_outliers : bool
species : str
gene_ontology_data : pandas.DataFrame
metadata_pooled_col : str
|
---|
Splicing events whose change in NMF space is large
By “large”, we mean that the difference is more than 2 standard deviations away from the mean.
Parameters: | phenotype_transitions : list of length-2 tuples of str
data_type : ‘splicing’ | ‘expression’
n : int
|
---|---|
Returns: | big_transitions : pandas.DataFrame
|
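The two-standard-deviation criterion can be illustrated without flotilla. The following is a minimal pure-Python sketch, not flotilla's implementation: the function name `big_changes`, the event names, and the distances are hypothetical, and the use of the population standard deviation is an assumption.

```python
import statistics


def big_changes(distances, n_std=2.0):
    """Return the events whose distance lies more than ``n_std``
    standard deviations away from the mean of all distances."""
    values = list(distances.values())
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population std; an assumption
    return {event: d for event, d in distances.items()
            if abs(d - mean) > n_std * std}


# Hypothetical distances travelled in NMF space by each splicing event
distances = {"event_%d" % i: 1.0 for i in range(9)}
distances["event_big"] = 10.0

selected = big_changes(distances)
print(selected)
```

Only `event_big` passes the filter here, because the nine small distances sit well within two standard deviations of the mean.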
Percentage of events inconsistent with pooled samples at expression thresholds
Parameters: | bins : list-like
|
---|---|
Returns: | expression_vs_inconsistent : pd.DataFrame
|
Given a name of a feature subset, get the associated feature ids
Parameters: | data_type : str
feature_subset : str
|
---|---|
Returns: | feature_ids : list of strings
|
Filter splicing events on expression values
Parameters: | expression_thresh : float
|
---|---|
Returns: | psi : pandas.DataFrame
|
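As a rough illustration of filtering splicing events on expression, the sketch below keeps a psi value only when the gene containing the event is expressed above the threshold in the same sample. It uses plain nested dicts instead of DataFrames, and everything in it is an assumption made for illustration: the function name, the `event_to_gene` mapping, the toy data, and the strict `>` comparison.

```python
def filter_psi_on_expression(psi, expression, event_to_gene, expression_thresh):
    """Keep a psi value only when the event's gene is expressed above
    ``expression_thresh`` in the same sample."""
    filtered = {}
    for sample, events in psi.items():
        kept = {}
        for event, value in events.items():
            gene = event_to_gene[event]
            # Missing expression values are treated as "not expressed"
            level = expression.get(sample, {}).get(gene, float("-inf"))
            if level > expression_thresh:
                kept[event] = value
        filtered[sample] = kept
    return filtered


# Hypothetical toy data: outer keys are samples, inner keys are events or genes
psi = {"s1": {"exonA": 0.9, "exonB": 0.1},
       "s2": {"exonA": 0.5, "exonB": 0.4}}
expression = {"s1": {"gene1": 5.0, "gene2": 0.2},
              "s2": {"gene1": 0.0, "gene2": 3.0}}
event_to_gene = {"exonA": "gene1", "exonB": "gene2"}

filtered = filter_psi_on_expression(psi, expression, event_to_gene,
                                    expression_thresh=1.0)
print(filtered)
```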
Create a study object from a datapackage dictionary
Parameters: | datapackage : dict |
---|---|
Returns: | study : flotilla.Study
|
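For orientation, a datapackage dictionary is what you get after parsing a `datapackage.json` descriptor. The sketch below shows only the generic shape (a name plus a list of resources); the package name, resource names, and paths are hypothetical, and which resources and extra keys flotilla actually requires is not stated here.

```python
import json

# Hypothetical descriptor; names and paths are illustrative only
datapackage_json = """
{
  "name": "example_study",
  "resources": [
    {"name": "metadata", "path": "metadata.csv"},
    {"name": "expression", "path": "expression.csv"},
    {"name": "splicing", "path": "splicing.csv"}
  ]
}
"""

datapackage = json.loads(datapackage_json)  # a plain Python dict

# Index the resources by name for convenient lookup
resources = {r["name"]: r for r in datapackage["resources"]}
print(sorted(resources))
```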
Create a study from a url of a datapackage.json file
Parameters: | datapackage_url : str
species_data_package_base_url : str
|
---|---|
Returns: | study : Study
|
Raises: | AttributeError
|
Calculate gene ontology enrichment of provided features
Parameters: | feature_ids : list-like
background : list-like, optional
domain : str or list, optional
p_value_cutoff : float, optional
min_feature_size : int, optional
min_background_size : int, optional
|
---|---|
Returns: | enrichment : pandas.DataFrame
|
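One common way to score enrichment of an annotation term among selected features is a one-sided hypergeometric test. Whether this method uses exactly that test is an assumption; the sketch below only illustrates the idea, and the function name and counts are hypothetical. It requires Python 3.8+ for `math.comb`.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """P(X >= n_overlap) when drawing ``n_selected`` features without
    replacement from ``n_background`` features, of which ``n_annotated``
    carry the term."""
    total = comb(n_background, n_selected)
    upper = min(n_annotated, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background features, 5 annotated with the term,
# 5 features selected, 3 of which carry the term
p_value = hypergeom_enrichment_p(20, 5, 5, 3)
p_value_cutoff = 0.1
print(p_value, p_value <= p_value_cutoff)
```

In practice a test like this is run once per term, so the p-values usually need a multiple-testing correction before a cutoff is applied.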
Merge multiple outlier classifications: the user selects from the columns that start with ‘outlier_’.
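A minimal sketch of merging several outlier classifications, using a dict of rows in place of a metadata DataFrame. The merge rule (a sample is an outlier if any selected column flags it), the function name, and the column names are all assumptions made for illustration.

```python
def merge_outlier_columns(metadata, selected=None):
    """Flag a sample as an outlier if any selected 'outlier_' column is True.

    ``selected`` optionally restricts the merge to a subset of columns.
    """
    merged = {}
    for sample, row in metadata.items():
        columns = [c for c in row if c.startswith("outlier_")]
        if selected is not None:
            columns = [c for c in columns if c in selected]
        merged[sample] = any(row[c] for c in columns)
    return merged


# Hypothetical sample metadata with two outlier classifications
metadata = {
    "s1": {"phenotype": "A", "outlier_pca": False, "outlier_depth": False},
    "s2": {"phenotype": "A", "outlier_pca": True, "outlier_depth": False},
    "s3": {"phenotype": "B", "outlier_pca": False, "outlier_depth": True},
}

merged = merge_outlier_columns(metadata)
only_pca = merge_outlier_columns(metadata, selected={"outlier_pca"})
print(merged, only_pca)
```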
Performs Jensen-Shannon Divergence on both splicing and expression study_data
Jensen-Shannon divergence quantifies how much the distribution of one measurement (e.g. a splicing event or a gene’s expression) changes from one cell type to another.
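For reference, the Jensen-Shannon divergence between two discrete distributions P and Q is the average of their Kullback-Leibler divergences to the mixture M = (P + Q) / 2. The sketch below assumes the measurements have already been binned into probability distributions over the same bins; how flotilla bins the data is not covered here, and the function names are illustrative.

```python
from math import log2


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence in bits, bounded between 0 and 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical binned distributions of one splicing event in two cell types
same = jensen_shannon_divergence([0.2, 0.8], [0.2, 0.8])
disjoint = jensen_shannon_divergence([1.0, 0.0], [0.0, 1.0])
partial = jensen_shannon_divergence([0.9, 0.1], [0.1, 0.9])
print(same, disjoint, partial)
```

With base-2 logarithms, identical distributions give 0 and distributions with no overlap give 1, so intermediate values are easy to compare across features.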
Get modality assignments of splicing data
Parameters: | sample_subset : str or None, optional
feature_subset : str or None, optional
expression_thresh : float, optional
|
---|---|
Returns: | modalities : pandas.DataFrame
|
Get number of splicing events in modality categories
Parameters: | sample_subset : str or None, optional
feature_subset : str or None, optional
expression_thresh : float, optional
|
---|---|
Returns: | modalities : pandas.DataFrame
|
The change in NMF space of splicing events across phenotypes
Parameters: | phenotype_transitions : list of length-2 tuples of str
data_type : ‘splicing’ | ‘expression’
n : int
|
---|---|
Returns: | big_transitions : pandas.DataFrame
|
Plot a predictor for the specified data type and trait(s)
Parameters: | data_type : str
trait : str
|
---|
Visualize hierarchical relationships within samples and features
Visualize clustered correlations of samples across features
Plot the violinplot and NMF transitions of a splicing event
Plot the graph (network) of these data
Parameters: | data_type : str
sample_subset : str or None
feature_subset : str or None
|
---|
Make grouped barplots of the number of modalities per phenotype
Parameters: | sample_subset : str or None
feature_subset : str or None
expression_thresh : float
percentages : bool
|
---|
Plot each modality in each cell type on separate axes
Parameters: | sample_subset : str or None
feature_subset : str or None
expression_thresh : float
|
---|
Plot splicing events with modality assignments in NMF space
This will plot a separate NMF space for each cell type in the data, as well as one for all samples.
Parameters: | sample_subset : str or None
feature_subset : str or None
expression_thresh : float
|
---|
Performs DataFramePCA on both expression and splicing study_data
Parameters: | data_type : str
x_pc : int, optional
y_pc : int, optional
sample_subset : str or None
feature_subset : str or None
title : str, optional
featurewise : bool, optional
plot_violins : bool
show_point_labels : bool, optional
reduce_kwargs : dict, optional
color_samples_by : str, optional
bokeh : bool, optional
most_variant_features : bool, optional
std_multiplier : float, optional
scale_by_variance : bool, optional
kwargs : other keyword arguments
|
---|
Make a scatterplot of two features’ data
Parameters: | feature1 : str
feature2 : str
|
---|
Plot a scatterplot of two samples’ data
Parameters: | sample1 : str
sample2 : str
data_type : “expression” | “splicing”
Any other keyword arguments valid for seaborn.jointplot
|
---|---|
Returns: | jointgrid : seaborn.axisgrid.JointGrid
|
See also: seaborn.jointplot
Convert a string naming a subset of phenotypes in the data into sample ids
Parameters: | phenotype_subset : str
|
---|---|
Returns: | sample_ids : list of strings
|