Data models for “studies”. Studies include attributes about the data and are heavier in terms of data load.
Bases: object
A biological study, with associated metadata, expression, and splicing data.
Construct a biological study
This class only accepts data, no filenames. All data must already have been read in and exist as Python objects.
| Parameters: | sample_metadata : pandas.DataFrame
version : str
expression_data : pandas.DataFrame
expression_feature_data : pandas.DataFrame
expression_feature_rename_col : str
expression_log_base : float
thresh : float
expression_plus_one : bool
splicing_data : pandas.DataFrame
splicing_feature_data : pandas.DataFrame
splicing_feature_rename_col : str
mapping_stats_data : pandas.DataFrame
mapping_stats_number_mapped_col : str
spikein_data : pandas.DataFrame
spikein_feature_data : pandas.DataFrame
drop_outliers : bool
species : str
gene_ontology_data : pandas.DataFrame
metadata_pooled_col : str
|
|---|
Remove samples labeled “outlier” in self.metadata, and replace the data in self.expression and self.splicing with the smaller versions.
Given a name of a feature subset, get the associated feature ids
| Parameters: | data_type : str
feature_subset : str
|
|---|---|
| Returns: | feature_ids : list of strings
|
Create a study object from a datapackage dictionary
| Parameters: | datapackage : dict |
|---|---|
| Returns: | study : flotilla.Study
|
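As a rough illustration of the kind of dictionary involved, the sketch below builds a minimal datapackage-style dict following the general Data Package convention of a `name` plus a list of `resources`. The resource names, the paths, and the `resources_by_name` helper are hypothetical; the exact keys that this method requires are not listed here, so treat this as an assumption rather than the required layout.

```python
import json

# Hypothetical datapackage dictionary: a package name plus a list of
# resources, each with a name and a path (names and paths are made up).
datapackage = {
    "name": "example_study",
    "resources": [
        {"name": "metadata", "path": "metadata.csv"},
        {"name": "expression", "path": "expression.csv"},
        {"name": "splicing", "path": "splicing.csv"},
    ],
}


def resources_by_name(package):
    """Index a package's resources by their 'name' key (illustrative helper)."""
    return {resource["name"]: resource for resource in package["resources"]}


lookup = resources_by_name(datapackage)
print(lookup["expression"]["path"])

# A datapackage.json file is this same structure serialized as JSON.
round_tripped = json.loads(json.dumps(datapackage))
```

Indexing resources by name is one convenient way to find the table for each data type before reading it into a pandas.DataFrame.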
Create a study from a url of a datapackage.json file
| Parameters: | datapackage_url : str
species_data_pacakge_base_url : str
|
|---|---|
| Returns: | study : Study
|
| Raises: | AttributeError
|
The user selects from columns that start with ‘outlier_’ to merge multiple outlier classifications.
Computes the Jensen-Shannon divergence on both splicing and expression study_data
Jensen-Shannon divergence is a method of quantifying the amount of change in the distribution of one measurement (e.g. a splicing event or a gene's expression) from one cell type to another.
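To make the quantity concrete, here is a minimal sketch of the standard Jensen-Shannon divergence between two discrete distributions, using only the standard library. It is not necessarily how this method is implemented: the function names and the example histograms are hypothetical, and how the study data are binned into distributions is not described here.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms where p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical binned distributions of one splicing event in two cell types.
celltype_a = [0.7, 0.2, 0.1]
celltype_b = [0.1, 0.2, 0.7]

jsd_ab = jensen_shannon_divergence(celltype_a, celltype_b)
jsd_ba = jensen_shannon_divergence(celltype_b, celltype_a)
jsd_same = jensen_shannon_divergence(celltype_a, celltype_a)
jsd_disjoint = jensen_shannon_divergence([1.0, 0.0], [0.0, 1.0])
```

With base-2 logarithms the value is symmetric and bounded between 0 (identical distributions) and 1 (distributions with no overlap), which makes it convenient for comparing a feature across cell types.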
Plot a predictor for the specified data type and trait(s)
| Parameters: | data_type : str
trait : str
|
|---|
Plot the violinplot and DataFrameNMF transitions of a splicing event
Plot the graph (network) of these data
| Parameters: | data_type : str
sample_subset : str or None
feature_subset : str or None
|
|---|
Performs DataFramePCA on both expression and splicing study_data
| Parameters: | data_type : str
x_pc : int
y_pc : int
sample_subset : str or None
feature_subset : str or None
title : str
plot_violins : bool
show_point_labels : bool
|
|---|
Make a scatterplot of two features’ data
| Parameters: | feature1 : str
feature2 : str
|
|---|
Plot a scatterplot of two samples’ data
| Parameters: | sample1 : str
sample2 : str
data_type : “expression” or “splicing”
Any other keyword arguments valid for seaborn.jointplot |
|---|---|
| Returns: | jointgrid : seaborn.axisgrid.JointGrid
See also: seaborn.jointplot |
Convert a string naming a subset of phenotypes in the data into sample ids
| Parameters: | phenotype_subset : str
|
|---|---|
| Returns: | sample_ids : list of strings
|