Common operations performed on all kinds of data types
Bases: object
Base class for biological data measurements. All data types in flotilla inherit from this class.
Attributes
| feature_subsets | Dict of feature subset names to their list of feature ids |
| variant | Genes whose variance among all cells is 2 standard deviations away |
| data | (pandas.DataFrame) A (n_samples, m_features) sized DataFrame of filtered input data, with features with too few samples (minimum_samples) detected at thresh removed. Compared to data_original, ``m_features <= n_features`` |
| data_type | (str) String indicating what kind of data this is, e.g. “splicing” or “expression” |
| data_original | (pandas.DataFrame) A (n_samples, n_features) sized DataFrame of all input data, before removing features for having too few samples |
| feature_data | (pandas.DataFrame) A (k_features, n_features_about_features) sized DataFrame of features about the feature data. Notice that this DataFrame does not need to be the same size as the data, but must at least include all the features from data. Compared to data, k_features >= m_features |
| predictor_config_manager | (PredictorConfigManager) Manages different combinations of predictors on different data subtypes |
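The relationship between `data_original` and `data` described above can be sketched with plain pandas. This is a minimal illustration, not flotilla's own code: it assumes that "detected" means a value strictly greater than `thresh`, and the toy table and its values are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy input: 4 samples (rows) x 3 features (columns)
data_original = pd.DataFrame(
    {
        "gene_a": [5.0, 0.0, 3.0, 2.0],
        "gene_b": [0.0, 0.0, 0.0, 1.5],
        "gene_c": [np.nan, 4.0, 2.0, 0.0],
    },
    index=["cell1", "cell2", "cell3", "cell4"],
)

thresh = 1.0         # assumption: a value is "detected" when it is > thresh
minimum_samples = 2  # a feature must be detected in at least this many samples

# Count, per feature, the samples whose value exceeds the threshold
# (NaN compares as False, so missing values never count as detected)
n_detected = (data_original > thresh).sum(axis=0)

# Keep only the features detected in enough samples
data = data_original.loc[:, n_detected >= minimum_samples]

print(list(data.columns))
```

With the documented defaults (`thresh=-np.inf`, `minimum_samples=0`), this filter keeps every feature, so `data` and `data_original` would have the same columns.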
Methods
| maybe_renamed_to_feature_id(feature_id) | To be able to give a simple gene name, e.g. “RBFOX2” and get the official ENSG ids or MISO ids |
| feature_renamer | If feature_rename_col is specified in BaseData.__init__(), this will rename the feature ID to a new name. If feature_rename_col is not specified, then this will return the original id |
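The renaming behaviour of `feature_renamer` can be illustrated with a small stand-alone function. This is a sketch under assumptions: `make_feature_renamer` is a hypothetical helper, the feature IDs are placeholders, and falling back to the original ID for unknown features is a choice made for this example rather than documented flotilla behaviour.

```python
import pandas as pd

# Hypothetical feature metadata, indexed by placeholder feature IDs
feature_data = pd.DataFrame(
    {"gene_name": ["RBFOX2", "SNAP25"]},
    index=["ENSG_TOY_1", "ENSG_TOY_2"],
)


def make_feature_renamer(feature_data, feature_rename_col=None):
    """Return a function that maps a feature ID to a display name."""
    if feature_rename_col is None:
        # No rename column given: hand back the ID unchanged
        return lambda feature_id: feature_id

    id_to_name = feature_data[feature_rename_col].dropna().to_dict()
    # Assumption: IDs missing from the metadata keep their original ID
    return lambda feature_id: id_to_name.get(feature_id, feature_id)


renamer = make_feature_renamer(feature_data, "gene_name")
print(renamer("ENSG_TOY_1"))
```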
Abstract base class for biological measurements
| Parameters: | data : pandas.DataFrame
thresh : float, optional (default=-np.inf)
minimum_samples : int, optional (default=0)
feature_data : pandas.DataFrame, optional (default=None)
feature_rename_col : str, optional (default=None)
feature_ignore_subset_cols : list-like (default=None)
technical_outliers : list-like, optional (default=None)
outliers : list-like, optional (default=None)
pooled : list-like, optional (default=None)
predictor_config_manager : PredictorConfigManager, optional
data_type : str, optional (default=None)
|
|---|
Notes
Any cells not marked as “technical_outliers”, “outliers” or “pooled” are considered as single-cell samples.
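The rule in the note above amounts to a set difference over sample IDs. A minimal sketch, assuming the three arguments are lists of sample IDs (or `None`) and using made-up sample names:

```python
import pandas as pd

# Hypothetical sample IDs (the row index of the data)
all_samples = pd.Index(["cell1", "cell2", "cell3", "pool1", "cell4"])

technical_outliers = ["cell2"]
outliers = ["cell4"]
pooled = ["pool1"]

# Treat a missing (None) list as empty, then merge all excluded IDs
excluded = (
    set(technical_outliers or [])
    | set(outliers or [])
    | set(pooled or [])
)

# Whatever is left over counts as a single-cell sample
singles = all_samples[~all_samples.isin(excluded)]

print(list(singles))
```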
Make and memoize a predictor on a categorical trait (associated with samples), using a subset of genes
| Parameters: | trait : pandas.Series
sample_ids : None or list of strings
feature_ids : None or list of strings
standardize : bool
predictor : flotilla.visualize.predict classifier
predictor_kwargs : dict or None
predictor_scoring_fun : function
score_cutoff_fun : function
|
|---|---|
| Returns: | predictor : flotilla.compute.predict.PredictorBaseViz
|
Convert a feature subset name to a list of feature ids
To be able to give a simple gene name, e.g. “RBFOX2” and get the official ENSG ids or MISO ids
| Parameters: | feature_id : str
|
|---|---|
| Returns: | feature_id : str or list-like
|
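The lookup from a simple gene name to one or more official IDs can be sketched as a reverse search over the feature metadata. Everything here is illustrative: `name_to_feature_ids` is a hypothetical helper, the event IDs are placeholders, and raising `KeyError` for an unknown name is an assumption, not something this page documents.

```python
import pandas as pd

# Hypothetical feature metadata: several feature IDs can share one gene name
feature_data = pd.DataFrame(
    {"gene_name": ["RBFOX2", "RBFOX2", "SNAP25"]},
    index=["event_toy_1", "event_toy_2", "event_toy_3"],
)


def name_to_feature_ids(name, feature_data, rename_col="gene_name"):
    """Return `name` itself if it is already an ID, else all matching IDs."""
    if name in feature_data.index:
        return name  # already an official ID: a single str

    matches = feature_data.index[feature_data[rename_col] == name]
    if len(matches) == 0:
        raise KeyError(f"No feature ID found for {name!r}")
    return list(matches)  # one gene name may map to several IDs


print(name_to_feature_ids("RBFOX2", feature_data))
```

The return type mirrors the documented `str or list-like`: an ID passed in comes back unchanged, while a gene name yields a list.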
Classify samples on boolean or categorical traits
| Parameters: | trait : pandas.Series
sample_ids : list-like, optional (default=None)
feature_ids : list-like, optional (default=None)
predictor_name : str
standardize : bool, optional (default=True)
data_name : str, optional (default=None)
groupby : mappable, optional (default=None)
label_to_color : dict, optional (default=None)
label_to_marker : dict, optional (default=None)
order : list, optional (default=None)
color : list, optional (default=None)
plotting_kwargs : other keyword arguments
|
|---|---|
| Returns: | self : BaseData |
Principal component-like analysis of measurements
| Parameters: | x_pc : int, optional (default=1)
y_pc : int, optional (default=2)
sample_ids : list, optional (default=None)
feature_ids : list, optional (default=None)
featurewise : bool, optional (default=False)
reducer : DataFrameReducerBase, optional
plot_violins : bool, optional (default=True)
groupby : mappable, optional (default=None)
label_to_color : dict, optional (default=None)
label_to_marker : dict, optional (default=None)
order : list, optional (default=None)
color : list, optional (default=None)
plotting_kwargs : other keyword arguments
|
|---|---|
| Returns: | viz : DecompositionViz
|
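To make the `x_pc`, `y_pc`, and `featurewise` parameters concrete, here is a plain PCA computed with NumPy's SVD. It is a stand-in for illustration only: it does not use flotilla's reducer classes or `DecompositionViz`, the helper `pca_coordinates` is hypothetical, and it assumes the input has no missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 6 samples x 4 features of random values
rng = np.random.default_rng(0)
data = pd.DataFrame(
    rng.normal(size=(6, 4)),
    index=[f"cell{i}" for i in range(1, 7)],
    columns=[f"gene{j}" for j in range(1, 5)],
)


def pca_coordinates(data, x_pc=1, y_pc=2, featurewise=False):
    """Project rows onto two principal components (1-based indices)."""
    # featurewise=True reduces the features instead of the samples
    matrix = data.T if featurewise else data
    centered = matrix - matrix.mean(axis=0)

    # Thin SVD: principal component scores are U scaled by singular values
    u, s, _ = np.linalg.svd(centered.to_numpy(), full_matrices=False)
    scores = u * s
    explained = s ** 2 / np.sum(s ** 2)

    cols = [x_pc - 1, y_pc - 1]  # convert 1-based PC numbers to 0-based
    coords = pd.DataFrame(
        scores[:, cols],
        index=matrix.index,
        columns=[f"pc_{x_pc}", f"pc_{y_pc}"],
    )
    return coords, explained[cols]


coords, explained = pca_coordinates(data)
feature_coords, _ = pca_coordinates(data, featurewise=True)
print(coords.shape, feature_coords.shape)
```

The two returned columns are the coordinates one would put on the x and y axes of a scatter plot, with `explained` giving the fraction of variance carried by each chosen component.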
Plot the violinplot of a splicing event (should also show NMF movement)
Plot the values of two features
| Parameters: | sample1 : str
sample2 : str
Any other keyword arguments valid for seaborn.jointplot |
|---|---|
| Returns: | jointgrid : seaborn.axisgrid.JointGrid
See Also seaborn.jointplot |
Make and memoize a reduced dimensionality representation of data
| Parameters: | data : pandas.DataFrame
sample_ids : None or list of strings
feature_ids : None or list of strings
featurewise : bool
standardize : bool
title : str
reducer_kwargs : dict
|
|---|---|
| Returns: | reducer_object : flotilla.compute.reduce.ReducerViz
|
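"Make and memoize" means the reduction is computed once per distinct combination of arguments and then reused. The sketch below shows that idea with a dictionary keyed on the arguments; `ReductionCache` and `toy_reduce` are hypothetical names, and the choice of which arguments form the cache key is an assumption rather than flotilla's actual scheme.

```python
class ReductionCache:
    """Compute a reduction once per argument combination, then reuse it."""

    def __init__(self, compute):
        self._compute = compute  # function that does the expensive work
        self._cache = {}
        self.n_computed = 0      # how many times the work was really done

    @staticmethod
    def _freeze(ids):
        # Lists are not hashable, so turn ID lists into tuples for the key
        return None if ids is None else tuple(ids)

    def get(self, sample_ids=None, feature_ids=None,
            featurewise=False, standardize=True):
        key = (self._freeze(sample_ids), self._freeze(feature_ids),
               featurewise, standardize)
        if key not in self._cache:
            self._cache[key] = self._compute(
                sample_ids, feature_ids, featurewise, standardize)
            self.n_computed += 1
        return self._cache[key]


def toy_reduce(sample_ids, feature_ids, featurewise, standardize):
    # Placeholder for a real dimensionality reduction
    return {"n_samples": None if sample_ids is None else len(sample_ids)}


cache = ReductionCache(toy_reduce)
first = cache.get(sample_ids=["cell1", "cell2"])
second = cache.get(sample_ids=["cell1", "cell2"])  # served from the cache
third = cache.get(sample_ids=["cell1", "cell2"], featurewise=True)
print(cache.n_computed)
```

Changing any argument that is part of the key, such as `featurewise`, triggers a fresh computation, while repeating a call returns the stored object.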
| Parameters: | metadata : pandas.DataFrame
minimum : int
subset_type : str
ignore : list-like
|
|---|---|
| Returns: | subsets : dict
|
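This page lists the parameters of this method without describing its behaviour, so the following is only one plausible reading: build a dictionary of named subsets from metadata columns, skip the columns in `ignore`, and drop subsets with fewer than `minimum` members. The helper `subsets_sketch`, the subset naming scheme, and the toy metadata are all assumptions, and the `subset_type` parameter is left out because its role is not stated here.

```python
import pandas as pd

# Hypothetical metadata: rows are IDs, columns are boolean or categorical
metadata = pd.DataFrame(
    {
        "is_neuron": [True, True, False, True],
        "batch": ["a", "a", "b", "a"],
        "notes": ["w", "x", "y", "z"],
    },
    index=["cell1", "cell2", "cell3", "cell4"],
)


def subsets_sketch(metadata, minimum=2, ignore=None):
    """Map subset names to lists of IDs, keeping subsets of size >= minimum."""
    ignore = set(ignore or [])
    subsets = {}
    for col in metadata.columns:
        if col in ignore:
            continue
        column = metadata[col]
        if pd.api.types.is_bool_dtype(column):
            # Boolean column: one subset made of the rows marked True
            candidates = {col: metadata.index[column]}
        else:
            # Categorical column: one subset per distinct value
            candidates = {
                f"{col}: {value}": metadata.index[column == value]
                for value in column.dropna().unique()
            }
        for name, ids in candidates.items():
            if len(ids) >= minimum:
                subsets[name] = list(ids)
    return subsets


subsets = subsets_sketch(metadata, minimum=2, ignore=["notes"])
print(sorted(subsets))
```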