What’s new in the package

A catalog of new features, improvements, and bug fixes in each release.

v0.2.4 (November 23rd, 2014)

This is a patch release, with non-breaking changes from v0.2.3.

Plotting functions

  • New clustered heatmap plots via data_model.Study.plot_clustermap() and data_model.Study.plot_correlations()

API changes

  • data_model.Study.save() now saves relative instead of absolute paths, which makes for more portable datapackages
  • Underlying code for visualize.DecompositionViz and visualize.ClassifierViz now plots via plot()
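
Why relative paths make a datapackage more portable can be shown with a small sketch. This is not the package's actual save() code: the helper name to_relative and the directory layout are hypothetical, and only the Python standard library is used.

```python
import posixpath


def to_relative(resource_path, datapackage_dir):
    """Express resource_path relative to the datapackage directory.

    Hypothetical helper for illustration only.
    """
    return posixpath.relpath(resource_path, start=datapackage_dir)


# Hypothetical layout: the datapackage and its data live in one folder.
datapackage_dir = "/home/alice/projects/study"
absolute = "/home/alice/projects/study/data/expression.csv"

relative = to_relative(absolute, datapackage_dir)
print(relative)  # data/expression.csv

# If the whole folder is moved, the relative path still resolves.
moved = posixpath.join("/mnt/shared/study", relative)
print(moved)  # /mnt/shared/study/data/expression.csv
```

An absolute path stored in the datapackage would still point at /home/alice/... after the move, which is why the relative form travels better.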

v0.2.3 (November 17th, 2014)

This is a patch release, with non-breaking changes from v0.2.2.

Compute functions

  • Restore Study.detect_outliers(), Study.interactive_choose_outliers() and Study.interactive_reset_outliers()

Plotting functions

  • Add Study-level NMF space transitions/positions

Bug Fixes

  • embark() failed if the metadata didn’t have a pooled column; it now works without one
  • BaseData.drop_outliers() previously dropped samples from the data itself. Data should never be removed, only marked for removal, so that all the original data is preserved
  • For all compute submodules, add a check to make sure the input data is truly a probability distribution (non-negative, sums to 1)
  • BaseData.plot_feature() now plots all features with the same name (e.g. all splicing events within that gene) onto a single fig object
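
The probability distribution check mentioned above (non-negative, sums to 1) might look roughly like the following. This is a minimal sketch, not the check used in the compute submodules: the function name is_probability_distribution and the tolerance tol are assumptions made for illustration.

```python
import math


def is_probability_distribution(values, tol=1e-9):
    """Return True if all values are non-negative and they sum to 1.

    Hypothetical helper; tol is an assumed absolute tolerance that
    absorbs floating-point rounding in the sum.
    """
    values = list(values)
    if any(v < 0 for v in values):
        return False
    return math.isclose(math.fsum(values), 1.0, abs_tol=tol)


print(is_probability_distribution([0.2, 0.3, 0.5]))  # True
print(is_probability_distribution([0.6, 0.6]))       # False: sums to 1.2
print(is_probability_distribution([1.2, -0.2]))      # False: negative entry
```

Comparing the sum with a tolerance rather than with == avoids rejecting valid inputs whose sum differs from 1 only by rounding error.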

Documentation

Other

  • Rename modalities that couldn’t be assigned when bootstrapped=True in compute.splicing.Modalities, from “unassigned” to “ambiguous”

v0.2.2 (November 7th, 2014)

This is a patch release, with non-breaking changes from v0.2.0.

Documentation updates

v0.2.1 (November 6th, 2014)

This is a patch release, with non-breaking changes from v0.2.0.

Documentation updates

v0.2.0 (November 5th, 2014)

This is a minor release, with some breaking changes from v0.1.1.

New features

  • Plot the expression or splicing of two samples with Study.plot_two_samples()
  • Plot the expression or splicing of two features with Study.plot_two_features()
  • Detect outliers with Study.interactive_choose_outliers(), which fits a OneClassSVM on the PCA-reduced space of the data (either expression or splicing), using the first three components
  • Study doesn’t filter out the pooled or outlier samples from the data, only technical outliers with fewer reads than specified in the argument mapping_stats_min_reads.
  • To filter expression or splicing data on the number of samples that must detect each feature, specify expression_thresh and metadata_min_samples in the Study constructor.
    • For example, if expression_thresh=1 and metadata_min_samples=3, then only genes with expression values greater than 1 in at least 3 samples are kept. Likewise, only splicing events detected in at least 3 cells are kept, since metadata_min_samples applies to all data types.
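
The filtering rule in the example above can be sketched in plain Python. This is an illustration under stated assumptions, not the Study implementation: the function filter_features, the dictionary layout, the use of None for "not detected", and all values are made up.

```python
def filter_features(data, min_samples, thresh=None):
    """Keep features detected in at least min_samples samples.

    data maps a feature name to a list of per-sample values, with None
    meaning "not detected" (an assumption of this sketch). If thresh is
    given, a value only counts when it is strictly greater than thresh.
    """
    kept = {}
    for feature, values in data.items():
        detected = [v for v in values if v is not None]
        if thresh is not None:
            detected = [v for v in detected if v > thresh]
        if len(detected) >= min_samples:
            kept[feature] = values
    return kept


# Made-up expression values for four samples.
expression = {
    "gene_a": [5.0, 2.5, 1.5, 0.0],  # > 1 in 3 samples: kept
    "gene_b": [3.0, 0.5, 1.0, 0.0],  # > 1 in 1 sample: dropped
    "gene_c": [1.1, 1.2, 1.3, 1.4],  # > 1 in 4 samples: kept
}
# Made-up splicing values; None marks an event not detected in a sample.
splicing = {
    "event_x": [0.1, None, 0.9, 0.5],   # detected in 3 samples: kept
    "event_y": [None, None, 0.4, 0.7],  # detected in 2 samples: dropped
}

kept_genes = filter_features(expression, min_samples=3, thresh=1)
kept_events = filter_features(splicing, min_samples=3)
print(sorted(kept_genes))   # ['gene_a', 'gene_c']
print(sorted(kept_events))  # ['event_x']
```

The same min_samples value is applied to both data types, mirroring the statement that metadata_min_samples applies to all data types.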

API changes

  • The attribute data in BaseData (i.e. BaseData.data) now contains all the data, including pooled, singles, and outliers
  • The attribute data_original in BaseData (i.e. BaseData.data_original) contains the original, unfiltered data, for example the data before removing features detected in fewer than 3 samples with expression > 1.
  • BaseData now has the attributes BaseData.singles, BaseData.pooled, and BaseData.outliers, which are on-the-fly subsets of BaseData.data. This maintains data provenance: if the set of samples marked as “outliers” changes, these subsets change with it.
  • In Study, you must now specify expression_feature_rename_col, splicing_feature_rename_col, and mapping_stats_number_mapped_col explicitly. They no longer default to {splicing,expression}_feature_rename_col="gene_name" and mapping_stats_number_mapped_col="Uniquely mapped reads number".
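
The idea of on-the-fly subsets can be illustrated with Python properties. The class below is a toy stand-in, not the real BaseData: the name BaseDataSketch, the pooled_ids and outlier_ids attributes, and the dictionary storage are all hypothetical. It also assumes that singles excludes both pooled and outlier samples, which these notes do not state.

```python
class BaseDataSketch:
    """Toy stand-in for BaseData; not the package's real class."""

    def __init__(self, data, pooled_ids=(), outlier_ids=()):
        self.data = dict(data)  # all samples; never filtered in place
        self.pooled_ids = set(pooled_ids)
        self.outlier_ids = set(outlier_ids)

    @property
    def pooled(self):
        # Recomputed on every access from the full data.
        return {s: v for s, v in self.data.items() if s in self.pooled_ids}

    @property
    def outliers(self):
        return {s: v for s, v in self.data.items() if s in self.outlier_ids}

    @property
    def singles(self):
        # Assumption: singles are samples that are neither pooled nor outliers.
        excluded = self.pooled_ids | self.outlier_ids
        return {s: v for s, v in self.data.items() if s not in excluded}


bd = BaseDataSketch({"s1": 1.0, "s2": 2.0, "p1": 9.0}, pooled_ids={"p1"})
print(sorted(bd.singles))   # ['s1', 's2']

bd.outlier_ids.add("s2")    # mark s2 as an outlier; nothing is deleted
print(sorted(bd.singles))   # ['s1']
print(sorted(bd.outliers))  # ['s2']
print(sorted(bd.data))      # ['p1', 's1', 's2']
```

Because the subsets are computed at access time rather than stored, changing which samples are marked as outliers is reflected immediately, and the full data stays intact.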

Other Changes

  • Status messages in embark() have been moved to stdout instead of stderr to avoid confusion that something is going wrong
  • In embark(), the user is now notified which samples are removed for having too few reads (the default minimum number of reads is \(5\times 10^5\), or half a million reads).
Olga B. Botvinnik is funded by the NDSEG fellowship and is a NumFOCUS John Hunter Technology Fellow.
Michael T. Lovci was partially funded by a fellowship from Genentech.
Partially funded by NIH grants NS075449 and HG004659 and CIRM grants RB4-06045 and TR3-05676 to Gene Yeo.