lib5c.algorithms.pca module
-
lib5c.algorithms.pca.compute_pca(matrix, scaled=True, logged=False, kernel=None, kernel_kwargs=None, variant='pca', pf=1)
Performs PCA on a matrix.
- Parameters
matrix (np.ndarray) – The design matrix, whose rows are observations (replicates) and whose columns are features (interaction values at each position).
scaled (bool) – Pass True to scale the features to unit variance.
logged (bool) – Pass True to log the features before PCA.
kernel (Optional[str]) – Pass a kernel accepted by sklearn.decomposition.KernelPCA() to perform kernel PCA (KPCA).
kernel_kwargs (Optional[Dict[str, Any]]) – Kwargs to use for the kernel.
variant ({'pca', 'ica', 'fa', 'mds'}) – Select which variant of PCA to use.
pf (int) – Specify an integer number of pure polynomial features to use in the PCA.
- Returns
The first element is the matrix of PCA-projected replicates. The second element is the PVE for each component, or None if the PCA method selected doesn’t provide a PVE estimate. The third element is a matrix of the principal component vectors, or None if the PCA method selected doesn’t provide a set of principal component vectors.
- Return type
Tuple[np.ndarray]
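The following is a minimal sketch of the kind of computation described above, written with plain NumPy rather than lib5c itself. The function name `pca_sketch`, the pseudocount used before logging, and the use of SVD are assumptions made for illustration. They are not taken from the lib5c source, and the kernel, variant, and pf options are not covered.

```python
import numpy as np


def pca_sketch(matrix, scaled=True, logged=False):
    """Illustrative PCA via SVD; hypothetical, not lib5c's implementation."""
    x = np.asarray(matrix, dtype=float)
    if logged:
        # Assumption: a pseudocount of 1 keeps log() finite for zero counts.
        x = np.log(x + 1.0)
    # Center each feature (column) on zero.
    x = x - x.mean(axis=0)
    if scaled:
        std = x.std(axis=0, ddof=1)
        std[std == 0] = 1.0  # leave constant features unscaled
        x = x / std
    # Rows of vt are the principal component vectors.
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    projected = u * s  # rows are observations, columns are components
    pve = s ** 2 / np.sum(s ** 2)  # share of variance per component
    return projected, pve, vt


# Toy design matrix: 4 replicates (rows) by 10 features (columns).
rng = np.random.default_rng(0)
design = rng.random((4, 10))
projected, pve, components = pca_sketch(design)
```

The three returned values mirror the order of the documented return tuple: projected replicates, per-component PVE, and component vectors.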
-
lib5c.algorithms.pca.compute_pca_from_counts_superdict(counts_superdict, rep_order=None, **kwargs)
Convenience function for performing PCA on a counts superdict data structure.
- Parameters
counts_superdict (Dict[str, Dict[str, np.ndarray]]) – The counts superdict structure to compute PCA on.
rep_order (Optional[List[str]]) – The order in which the replicates in counts_superdict should be considered when filling in the rows of the design matrix.
kwargs (Dict[str, Any]) – Additional kwargs to be passed to compute_pca().
- Returns
The first element is the matrix of PCA-projected replicates. The second element is the PVE for each component, or None if the PCA method selected doesn’t provide a PVE estimate. The third element is a matrix of the principal component vectors, or None if the PCA method selected doesn’t provide a set of principal component vectors.
- Return type
Tuple[np.ndarray]
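As a rough illustration of how a counts superdict could be turned into a design matrix with one row per replicate, consider the sketch below. The helper `design_matrix_sketch`, the sorted default ordering, and the choice to keep only the upper triangle of each region's matrix are assumptions made for this example. The actual flattening performed by lib5c may differ.

```python
import numpy as np


def design_matrix_sketch(counts_superdict, rep_order=None):
    """Illustrative only: one row per replicate, regions concatenated."""
    if rep_order is None:
        # Assumption: fall back to sorted replicate names.
        rep_order = sorted(counts_superdict)
    # Use the same region order for every replicate so columns line up.
    regions = sorted(counts_superdict[rep_order[0]])
    rows = []
    for rep in rep_order:
        parts = []
        for region in regions:
            counts = np.asarray(counts_superdict[rep][region], dtype=float)
            # Assumption: each matrix is symmetric, so keep the upper
            # triangle (including the diagonal) to avoid duplicates.
            parts.append(counts[np.triu_indices_from(counts)])
        rows.append(np.concatenate(parts))
    return np.vstack(rows)


# Toy superdict: replicate name -> region name -> square counts matrix.
toy = {
    "rep1": {"regionA": np.array([[1.0, 2.0], [2.0, 3.0]])},
    "rep2": {"regionA": np.array([[4.0, 5.0], [5.0, 6.0]])},
}
design = design_matrix_sketch(toy, rep_order=["rep2", "rep1"])
```

Here rep_order controls which replicate fills which row, so the first row of `design` comes from "rep2" and the second from "rep1".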