Preprocessing
The skmatter.preprocessing module includes scaling, centering and
normalization methods.
- class skmatter.preprocessing.KernelNormalizer(with_center=True, with_trace=True)
Bases: KernelCenterer
Kernel centering method, similar to KernelCenterer, but with additional scaling and the ability to pass a set of sample weights.
Let K(x, z) be a kernel defined by phi(x)^T phi(z), where phi is a function mapping x to a Hilbert space. KernelNormalizer centers the data (i.e., normalizes it to have zero mean) without explicitly computing phi(x). The centering step is equivalent to centering phi(x) with sklearn.preprocessing.StandardScaler(with_std=False); the trace scaling controlled by with_trace is applied in addition.
- Parameters:
with_center (bool, default=True) – If True, center the kernel matrix before scaling. If False, do not center the kernel.
with_trace (bool, default=True) – If True, scale the kernel so that the trace is equal to the number of samples. If False, do not scale the kernel.
- K_fit_rows_
Average of each column of the kernel matrix.
- Type:
ndarray of shape (n_samples,)
- K_fit_all_
Average of the kernel matrix.
- Type:
float
- sample_weight_
Sample weights (if provided during the fit).
- Type:
ndarray of shape (n_samples,)
- scale_
Scaling parameter used when with_trace=True. Calculated as np.trace(K) / K.shape[0].
- Type:
float
Examples
>>> from skmatter.preprocessing import KernelNormalizer
>>> from sklearn.metrics.pairwise import pairwise_kernels
>>> X = [[ 1., -2.,  2.],
...      [-2.,  1.,  3.],
...      [ 4.,  1., -2.]]
>>> K = pairwise_kernels(X, metric='linear')
>>> K
array([[  9.,   2.,  -2.],
       [  2.,  14., -13.],
       [ -2., -13.,  21.]])
>>> transformer = KernelNormalizer().fit(K)
>>> transformer
KernelNormalizer()
>>> transformer.transform(K)
array([[ 0.39473684,  0.        , -0.39473684],
       [ 0.        ,  1.10526316, -1.10526316],
       [-0.39473684, -1.10526316,  1.5       ]])
>>> transformer.scale_ * transformer.transform(K)
array([[  5.,   0.,  -5.],
       [  0.,  14., -14.],
       [ -5., -14.,  19.]])
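The numbers in the example above can be reproduced with plain NumPy, which may help to see what centering and trace scaling do. This is a minimal sketch, not the library's implementation: it assumes unweighted samples, a linear kernel, and that the scale is the trace of the centered kernel divided by the number of samples (which is consistent with the output shown above). All variable names are illustrative.

```python
import numpy as np

# Same toy data as in the documented example.
X = np.array([[1.0, -2.0, 2.0],
              [-2.0, 1.0, 3.0],
              [4.0, 1.0, -2.0]])
K = X @ X.T  # linear kernel, K[i, j] = x_i . x_j

n = K.shape[0]
col_mean = K.mean(axis=0)  # average of each column, shape (n,)
row_mean = K.mean(axis=1)  # average of each row, shape (n,)
all_mean = K.mean()        # average of the whole kernel matrix

# Double centering: subtract column and row means, add back the grand mean.
K_centered = K - col_mean[np.newaxis, :] - row_mean[:, np.newaxis] + all_mean

# Explicit route: center the features first, then rebuild the linear kernel.
X_centered = X - X.mean(axis=0)
K_explicit = X_centered @ X_centered.T

# Trace scaling, so that the trace equals the number of samples.
scale = np.trace(K_centered) / n
K_normalized = K_centered / scale
```

Here K_centered and K_explicit agree, which is the sense in which kernel centering avoids computing phi(x) explicitly.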
- fit(K=None, y=None, sample_weight=None)
Fit KernelNormalizer.
- Parameters:
K (ndarray of shape (n_samples, n_samples)) – Kernel matrix.
y (None) – Ignored.
sample_weight (ndarray of shape (n_samples,), default=None) – Weights for each sample. Sample weighting can be used to center (and scale) data using a weighted mean. Weights are internally normalized before preprocessing.
- Returns:
self (object) – Fitted transformer.
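As an illustration of what centering with a weighted mean means for a kernel, the following NumPy sketch centers a linear kernel using normalized sample weights and checks the result against explicitly centering the features with the same weighted mean. It is a sketch under stated assumptions: the variable names are illustrative, and how the library applies the weights internally (in particular for the trace scaling) is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
K = X @ X.T  # linear kernel

# Arbitrary positive weights, normalized to sum to one.
w = rng.uniform(0.5, 2.0, size=6)
w = w / w.sum()

k_cols = w @ K    # weighted average of each column, shape (n_samples,)
k_rows = K @ w    # weighted average of each row, shape (n_samples,)
k_all = w @ K @ w  # weighted average of the whole kernel (scalar)

# Weighted double centering.
K_centered = K - k_cols[np.newaxis, :] - k_rows[:, np.newaxis] + k_all

# Explicit route: subtract the weighted feature mean, then rebuild the kernel.
X_centered = X - np.average(X, axis=0, weights=w)
K_explicit = X_centered @ X_centered.T
```

With uniform weights this reduces to ordinary (unweighted) kernel centering.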
- fit_transform(K, y=None, sample_weight=None, copy=True, **fit_params)
Fit to data, then transform it.
- Parameters:
K (ndarray of shape (n_samples, n_samples)) – Kernel matrix.
y (None) – Ignored.
sample_weight (ndarray of shape (n_samples,), default=None) – Weights for each sample. Sample weighting can be used to center (and scale) data using a weighted mean. Weights are internally normalized before preprocessing.
**fit_params – Necessary for compatibility with the functions of the TransformerMixin class.
- Returns:
K_new (ndarray of shape (n_samples1, n_samples2)) – Transformed array.
- transform(K, copy=True)
Center (and, if with_trace=True, scale) the kernel matrix.
- Parameters:
K (ndarray of shape (n_samples1, n_samples2)) – Kernel matrix.
copy (bool, default=True) – Set to False to perform inplace computation.
- Returns:
K_new (ndarray of shape (n_samples1, n_samples2)) – Transformed array
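The rectangular shape (n_samples1, n_samples2) arises when a kernel between new samples and the fitted samples is transformed. The sketch below shows, in plain NumPy and for the unweighted case, how such a kernel can be centered using statistics of the training kernel only. It mirrors the quantities described for K_fit_rows_ and K_fit_all_, but it is an illustration with hypothetical variable names rather than the library code, and it omits the trace scaling.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(size=(5, 3))
X_test = rng.normal(size=(2, 3))

K_train = X_train @ X_train.T  # shape (5, 5), used for fitting
K_test = X_test @ X_train.T    # shape (2, 5), kernel to transform

# Statistics computed once from the training kernel.
k_fit_rows = K_train.mean(axis=0)  # average of each column, shape (5,)
k_fit_all = K_train.mean()         # average of the whole kernel

# Average of each test row over the training samples, shape (2, 1).
k_test_rows = K_test.mean(axis=1)[:, np.newaxis]

K_test_centered = K_test - k_fit_rows[np.newaxis, :] - k_test_rows + k_fit_all

# Explicit route: center both feature sets with the training mean.
mean = X_train.mean(axis=0)
K_explicit = (X_test - mean) @ (X_train - mean).T
```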
- class skmatter.preprocessing.SparseKernelCenterer(with_center=True, with_trace=True, rcond=1e-12)
Bases: TransformerMixin, BaseEstimator
Kernel centering method for sparse kernels, similar to KernelNormalizer.
The main disadvantage of kernel methods, which are widely used in machine learning, is that their time and space complexity grows quickly with the number of samples. With a large dataset, you not only need to store a huge amount of information, but you also need to use it constantly in calculations. To avoid this, so-called sparse kernel methods are used, which are formulated from the low-dimensional (Nyström) approximation:
\[\mathbf{K} \approx \hat{\mathbf{K}}_{N N}=\mathbf{K}_{N M} \mathbf{K}_{M M}^{-1} \mathbf{K}_{N M}^{T}\]
where the subscripts for $\mathbf{K}$ denote the size of the sets of samples compared in each kernel, with $N$ being the size of the full data set and $M$ referring to a small, active set containing $M$ samples. With this method it is only necessary to store and use the matrix $\mathbf{K}_{NM}$, i.e. it is possible to get an $N/M$-fold improvement in asymptotic memory usage.
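A small NumPy sketch of the Nyström approximation may make the formula concrete. It assumes a linear kernel on low-dimensional data (so that the approximation is exact once the active set spans the feature space) and uses a pseudo-inverse with a cutoff analogous to rcond; the random choice of active set and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 200, 10
X = rng.normal(size=(N, 3))

# Pick M active samples at random (an assumption made for this sketch).
active = rng.choice(N, size=M, replace=False)

Knm = X @ X[active].T          # shape (N, M): full set vs. active set
Kmm = X[active] @ X[active].T  # shape (M, M): active set vs. itself

# Pseudo-inverse of Kmm; small singular values are cut off via rcond.
Kmm_inv = np.linalg.pinv(Kmm, rcond=1e-12)

# Nystrom approximation of the full N x N kernel.
K_hat = Knm @ Kmm_inv @ Knm.T

# Reference: the full kernel, built here only to check the approximation.
K_full = X @ X.T
```

In practice K_hat would not be formed explicitly; storing Knm takes N * M numbers instead of N * N, which is the N/M saving mentioned above.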
- Parameters:
with_center (bool, default=True) – If True, center the kernel matrix before scaling. If False, do not center the kernel.
with_trace (bool, default=True) – If True, scale the kernel so that the trace is equal to the number of samples. If False, do not scale the kernel.
rcond (float, default=1e-12) – Conditioning parameter to use when computing the Nyström-approximated kernel for scaling.
- K_fit_rows_
Average of each column of the kernel matrix.
- Type:
ndarray of shape (n_samples,)
- K_fit_all_
Average of the kernel matrix.
- Type:
float
- sample_weight_
Sample weights (if provided during the fit).
- Type:
ndarray of shape (n_samples,)
- scale_
Scaling parameter used when with_trace=True. Calculated as np.trace(K) / K.shape[0].
- Type:
float
- n_active_
Size of the active set.
- Type:
int
- fit(Knm, Kmm, y=None, sample_weight=None)
Fit SparseKernelCenterer.
- Parameters:
Knm (ndarray of shape (n_samples, n_active)) – Kernel matrix between the reference data set and the active set
Kmm (ndarray of shape (n_active, n_active)) – Kernel matrix between the active set and itself
y (None) – Ignored.
sample_weight (ndarray of shape (n_samples,), default=None) – Weights for each sample. Sample weighting can be used to center (and scale) data using a weighted mean. Weights are internally normalized before preprocessing.
- Returns:
self (object) – Fitted transformer.
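Since fit receives only Knm and Kmm, a trace-based scale has to be obtained from the Nyström-approximated kernel. The sketch below shows one way to compute that trace without ever building the N x N matrix. It is a hedged illustration: whether the library computes its scale exactly this way is an assumption, centering is omitted, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 100, 8
X = rng.normal(size=(N, 4))
active = np.arange(M)  # first M samples as the active set (arbitrary choice)

Knm = X @ X[active].T
Kmm = X[active] @ X[active].T

# Pseudo-inverse with a cutoff playing the role of rcond.
Kmm_inv = np.linalg.pinv(Kmm, rcond=1e-12)

# trace(Knm Kmm^+ Knm^T) equals the sum of the elementwise product of
# (Knm Kmm^+) and Knm, so only (N, M)-sized arrays are needed.
trace_hat = np.sum((Knm @ Kmm_inv) * Knm)

# Scale analogous to np.trace(K) / K.shape[0] for the approximated kernel.
scale = trace_hat / N
```

The elementwise form costs O(N M^2) time and O(N M) memory, compared with O(N^2) memory for forming the approximated kernel first.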
- fit_transform(Knm, Kmm, y=None, sample_weight=None, **fit_params)
Fit to data, then transform it.
- Parameters:
Knm (ndarray of shape (n_samples, n_active)) – Kernel matrix between the reference data set and the active set
Kmm (ndarray of shape (n_active, n_active)) – Kernel matrix between the active set and itself
y (None) – Ignored.
sample_weight (ndarray of shape (n_samples,), default=None) – Weights for each sample. Sample weighting can be used to center (and scale) data using a weighted mean. Weights are internally normalized before preprocessing.
**fit_params – Necessary for compatibility with the functions of the TransformerMixin class.
- Returns:
K_new (ndarray of shape (n_samples, n_active)) – Transformed array.
- transform(Knm, y=None)
Center the kernel. The transformer must be fitted beforehand.
- Parameters:
Knm (ndarray of shape (n_samples, n_active)) – Kernel matrix between the reference data set and the active set
y (None) – Ignored.
- Returns:
K_new (ndarray of shape (n_samples, n_active)) – Transformed array
- class skmatter.preprocessing.StandardFlexibleScaler(with_mean=True, with_std=True, column_wise=False, rtol=0, atol=1e-12, copy=False)
Bases: TransformerMixin, BaseEstimator
Standardize features by removing the mean and scaling to unit variance. The mean of each column is reduced to zero and, in the case of column_wise=True, the variance of each column is set equal to one / number of columns. The standard score of a sample x is calculated as:
z = (x - u) / s
where u is the mean of the samples if with_mean, otherwise zero, and s is the standard deviation of the samples if with_std, otherwise one.
Centering and scaling can occur independently for each feature (column_wise=True), by calculating the appropriate statistics for each column of the input, or for the whole matrix (column_wise=False). The mean and standard deviation are then stored for use on later data using transform().
Standardization of a dataset is a common requirement for many machine learning estimators: an improperly scaled or centered dataset may result in anomalous behavior.
At the same time, depending on the task, it may be necessary to preserve the relative scale of the features (for example, when the feature matrix is something like a covariance matrix). In that case the standardization should be carried out for the whole matrix, as opposed to the individual columns, as is done in sklearn.preprocessing.StandardScaler.
- Parameters:
with_mean (bool, default=True) – If True, center the data before scaling. If False, keep the mean intact.
with_std (bool, default=True) – If True, scale the data to unit variance. If False, keep the variance intact.
column_wise (bool, default=False) – If True, normalize each column separately. If False, normalize the whole matrix with respect to its total variance.
rtol (float, default=0) – The relative tolerance for the zero-variance check: variance is considered zero when it is less than abs(mean) * rtol + atol.
atol (float, default=1.0E-12) – The absolute tolerance for the zero-variance check: variance is considered zero when it is less than abs(mean) * rtol + atol.
copy (bool, default=False) – Copy the input X or not.
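The zero-variance criterion described for rtol and atol can be written out as a short check. The helper below is hypothetical (it is not part of the library's public API) and only restates the inequality variance < abs(mean) * rtol + atol for a small example.

```python
import numpy as np


def is_effectively_zero(variance, mean, rtol=0.0, atol=1e-12):
    """Hypothetical helper: True if the variance counts as zero."""
    return variance < abs(mean) * rtol + atol


# First column is constant, second column varies.
X = np.array([[1.0, 5.0],
              [1.0, 7.0],
              [1.0, 9.0]])
means = X.mean(axis=0)
variances = X.var(axis=0)

flags = [bool(is_effectively_zero(v, m)) for v, m in zip(variances, means)]
```

A constant column is flagged, since it cannot be scaled to unit variance; a nonzero rtol loosens the check in proportion to the magnitude of the mean.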
- n_samples_seen_
Number of samples in the reference ndarray
- Type:
int
- n_features_
Number of features in the reference ndarray
- Type:
int
- mean_
The mean value for each feature in the training set. Equal to an ndarray of zeros of shape (n_features,) when with_mean=False.
- Type:
ndarray of shape (n_features,)
- scale_
The scaling factor: an ndarray of shape (n_features,) when column_wise=True, or a float when column_wise=False.
- Type:
ndarray of shape (n_features,), float or None
- copy
Copy the input X or not.
- Type:
bool, default=False
Examples
>>> import numpy as np
>>> from skmatter.preprocessing import StandardFlexibleScaler
>>> X = np.array([[ 1., -2.,  2.],
...               [-2.,  1.,  3.],
...               [ 4.,  1., -2.]])
>>> transformer = StandardFlexibleScaler().fit(X)
>>> transformer
StandardFlexibleScaler()
>>> transformer.transform(X)
array([[ 0.        , -0.56195149,  0.28097574],
       [-0.84292723,  0.28097574,  0.56195149],
       [ 0.84292723,  0.28097574, -0.84292723]])
>>> transformer.scale_ * transformer.transform(X)
array([[ 0., -2.,  1.],
       [-3.,  1.,  2.],
       [ 3.,  1., -3.]])
>>> transformer.scale_ * transformer.transform(X) + transformer.mean_
array([[ 1., -2.,  2.],
       [-2.,  1.,  3.],
       [ 4.,  1., -2.]])
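The output of the example above (default column_wise=False) can be reproduced in NumPy by dividing the centered data by a single scalar, the square root of the summed column variances. The sketch below does that and contrasts it with a per-column z-score of the kind sklearn.preprocessing.StandardScaler computes. It is an illustration inferred from the numbers shown, with illustrative variable names, and is not taken from the library source.

```python
import numpy as np

X = np.array([[1.0, -2.0, 2.0],
              [-2.0, 1.0, 3.0],
              [4.0, 1.0, -2.0]])

mean = X.mean(axis=0)
X_centered = X - mean
col_var = X_centered.var(axis=0)  # population variance of each column

# Whole-matrix scaling: one scalar, so ratios between columns are kept.
scale_whole = np.sqrt(col_var.sum())
X_whole = X_centered / scale_whole

# Per-column z-score: every column ends up with unit variance.
X_per_column = X_centered / np.sqrt(col_var)
```

After whole-matrix scaling the column variances sum to one while their ratios are unchanged; after per-column scaling each column has variance one and the original ratios are lost.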
- fit(X, y=None, sample_weight=None)
Compute mean and scaling to be applied for subsequent normalization.
- Parameters:
X (ndarray of shape (n_samples, n_features)) – The data used to compute the mean and standard deviation used for later scaling along the features axis.
y (None) – Ignored.
sample_weight (ndarray of shape (n_samples,)) – Weights for each sample. Sample weighting can be used to center (and scale) data using a weighted mean. Weights are internally normalized before preprocessing.
- Returns:
self (object) – Fitted scaler.
- inverse_transform(X_tr)
Scale back the data to the original representation.
- Parameters:
X_tr (ndarray of shape (n_samples, n_features)) – Transformed matrix
- Returns:
X (ndarray of shape (n_samples, n_features)) – Original matrix.
- transform(X, y=None, copy=None)
Normalize a vector based on previously computed mean and scaling.
- Parameters:
X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The data used to scale along the features axis.
y (None) – Ignored.
copy (bool, default=None) – Copy the input X or not.
- Returns:
X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Transformed array.