What’s new in the package

A catalog of new features, improvements, and bug-fixes in each release.

v0.2.1 (November 6th, 2014)

This is a patch release, with non-breaking changes from v0.2.0.

This is a minor release, with some breaking changes from v0.1.1.

Plot the expression or splicing of two samples with Study.plot_two_samples()
Plot the expression or splicing of two features with Study.plot_two_features()
Detect outliers with Study.interactive_choose_outliers(), which fits a OneClassSVM to the first three components of the PCA-reduced data (either expression or splicing)
Study does not filter pooled or outlier samples out of the data; it removes only technical outliers, that is, samples with fewer reads than specified in the argument mapping_stats_min_reads.
To filter expression or splicing data by the number of samples in which each feature must be detected, specify expression_thresh and metadata_min_samples in the Study constructor.
- For example, if expression_thresh=1 and metadata_min_samples=3, only genes with expression values greater than 1 in at least 3 samples are kept. Splicing events are likewise kept only if they were detected in at least 3 samples, since metadata_min_samples applies to all data types.
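The filtering rule in the example above can be sketched with pandas. This is a minimal illustration, not the package's implementation: the helper filter_features, the toy tables, and the layout (samples as rows, features as columns) are assumptions made for the example.

```python
import numpy as np
import pandas as pd


def filter_features(data, min_samples, thresh=None):
    """Keep features (columns) observed in at least `min_samples` samples (rows).

    Hypothetical helper: with `thresh`, a value counts as observed only when it
    is strictly greater than `thresh`; without it, any non-missing value counts.
    """
    if thresh is None:
        observed = data.notnull()
    else:
        observed = data > thresh  # NaN compares as False, so it is not counted
    keep = observed.sum(axis=0) >= min_samples
    return data.loc[:, keep]


# Toy data: four cells, three genes, two splicing events
expression = pd.DataFrame(
    {
        "geneA": [5.0, 2.0, 3.0, 0.0],
        "geneB": [0.0, 4.0, 0.5, 1.0],
        "geneC": [2.0, 2.0, 2.0, 2.0],
    },
    index=["cell1", "cell2", "cell3", "cell4"],
)
splicing = pd.DataFrame(
    {
        "event1": [0.1, np.nan, 0.9, 0.5],
        "event2": [np.nan, np.nan, 0.2, 0.7],
    },
    index=expression.index,
)

# Mirrors expression_thresh=1 and metadata_min_samples=3 from the text
filtered_expression = filter_features(expression, min_samples=3, thresh=1)
filtered_splicing = filter_features(splicing, min_samples=3)

print(list(filtered_expression.columns))  # ['geneA', 'geneC']
print(list(filtered_splicing.columns))    # ['event1']
```

Here geneB is dropped because only one cell exceeds the threshold, and event2 is dropped because it was detected in only two cells.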

The attribute data in BaseData (i.e. BaseData.data) now contains all the data, including pooled, singles, and outliers
The attribute data_original in BaseData (i.e. BaseData.data_original) contains the original, unfiltered data, for example the data as it was before removing features detected in fewer than 3 samples with expression > 1.
BaseData now has the attributes BaseData.singles, BaseData.pooled, and BaseData.outliers, which are computed on the fly as subsets of BaseData.data. This maintains data provenance: if the set of samples designated as “outliers” changes, these subsets change with it.
In Study, you must now specify expression_feature_rename_col, splicing_feature_rename_col, and mapping_stats_number_mapped_col explicitly. They no longer default to {splicing,expression}_feature_rename_col="gene_name" and mapping_stats_number_mapped_col="Uniquely mapped reads number".
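One way to get subsets that always follow the current outlier designation is to compute them in properties instead of storing copies. The sketch below is only an illustration of that idea: the SampleData class, its pooled_ids and outlier_ids fields, and the choice to exclude outliers from singles are all assumptions, and a plain dict stands in for the data table.

```python
class SampleData:
    """Hypothetical stand-in for BaseData; not the package's implementation."""

    def __init__(self, data, pooled=(), outliers=()):
        self.data = dict(data)            # every sample, nothing filtered out
        self.pooled_ids = set(pooled)     # assumed bookkeeping of pooled samples
        self.outlier_ids = set(outliers)  # assumed bookkeeping of outliers

    def _subset(self, keep):
        # Rebuilt on every access, so it always reflects the current id sets
        return {sample: values for sample, values in self.data.items()
                if keep(sample)}

    @property
    def pooled(self):
        return self._subset(lambda s: s in self.pooled_ids)

    @property
    def outliers(self):
        return self._subset(lambda s: s in self.outlier_ids)

    @property
    def singles(self):
        # Assumption: singles are samples that are neither pooled nor outliers
        return self._subset(
            lambda s: s not in self.pooled_ids and s not in self.outlier_ids
        )


samples = SampleData(
    {"cell1": [1.0, 2.0], "cell2": [0.5, 3.0], "pool1": [9.0, 8.0]},
    pooled=["pool1"],
)
print(sorted(samples.singles))   # ['cell1', 'cell2']

# Re-designating a sample as an outlier changes the subsets, not the data
samples.outlier_ids.add("cell2")
print(sorted(samples.singles))   # ['cell1']
print(sorted(samples.outliers))  # ['cell2']
print(len(samples.data))         # 3
```

Because nothing is copied out of data, there is no second table that could drift out of sync when the outlier set changes.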

Status messages in embark() now go to stdout instead of stderr, so that they are not mistaken for errors
In embark(), the user is now notified which samples are removed for having too few reads (the default minimum number of reads is \(5\times 10^5\), or half a million reads).
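The read-count filter and its stdout notification could look roughly like this. It is a sketch under stated assumptions: drop_low_read_samples, its arguments, and the message wording are hypothetical, and only the default of half a million reads comes from the note above.

```python
import sys

MIN_READS = 5e5  # default minimum number of reads stated in the release note


def drop_low_read_samples(mapped_reads, min_reads=MIN_READS, stream=None):
    """Split sample ids into (kept, removed) by mapped read count.

    Hypothetical helper: `mapped_reads` maps sample id -> number of mapped
    reads. Removals are reported on `stream`, which defaults to stdout.
    """
    stream = sys.stdout if stream is None else stream
    removed = sorted(s for s, n in mapped_reads.items() if n < min_reads)
    kept = sorted(s for s in mapped_reads if s not in removed)
    if removed:
        stream.write(
            f"Removing {len(removed)} sample(s) with fewer than "
            f"{min_reads:,.0f} reads: {', '.join(removed)}\n"
        )
    return kept, removed


kept, removed = drop_low_read_samples(
    {"cell1": 1200000, "cell2": 300000, "cell3": 500000}
)
print(kept)     # ['cell1', 'cell3']
print(removed)  # ['cell2']
```

A sample with exactly the minimum number of reads is kept in this sketch, since only samples with fewer reads than the threshold are removed.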