Reconstruction Measures

Global Reconstruction Error

skmatter.metrics.pointwise_global_reconstruction_error(X, Y, train_idx=None, test_idx=None, scaler=None, estimator=None)

Computes the pointwise global reconstruction error using the source X to reconstruct the features or samples of target Y based on a minimization by linear regression:

\[GRE^{(i)}(X,Y) = \min_W ||y_i - x_iW||\]

If used with X and Y of shape (n_samples, n_features) it computes the pointwise global reconstruction error of the features as defined in Ref. [Goscinski2021]. In this case the number of samples of X and Y should agree with each other, but the number of features can be different. The error is expressed per sample.

If used with X and Y of shape (n_features, n_samples) it computes the reconstruction error of the samples. In this case the number of features of X and Y should agree with each other, but the number of samples can be different. The error is expressed per feature.

The default parameters mimic those of Ref. [Goscinski2021].

Parameters:
  • X (ndarray of shape (n_samples, X_n_features)) – Source data which reconstructs target Y. For feature reconstruction of Y using X use input shape (samples, features). For sample reconstruction of Y using X use input shape (features, samples).

  • Y (ndarray of shape (n_samples, Y_n_targets)) – Target data which is reconstructed with X. For feature reconstruction of Y using X use input shape (samples, features). For sample reconstruction of Y using X use input shape (features, samples).

  • train_idx (ndarray, dtype=int, default=None) – Array of indices used for training. If None, the complement of test_idx is used. If test_idx is also None, a 2-fold split is taken.

  • test_idx (ndarray, dtype=int, default=None) – Array of indices used for testing. If None, the complement of train_idx is used. If train_idx is also None, a 2-fold split is taken.

  • scaler (object implementing fit/transform) – Scales X and Y before computing the reconstruction measure. The default scaler scales the features such that the reconstruction measure on the training set is bounded above by 1.

  • estimator (object implementing fit/predict, default=None) – Sklearn estimator used to reconstruct features/samples.

Returns:

pointwise_global_reconstruction_error (ndarray) – The global reconstruction error for each sample/point
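The formula above can be illustrated with a minimal NumPy sketch. It is not skmatter's implementation: the helper name `pointwise_gre_sketch`, the plain least-squares fit, the omission of the scaler, and the even/odd train/test split are all assumptions made for illustration.

```python
import numpy as np


def pointwise_gre_sketch(X, Y, train_idx, test_idx):
    """Per-sample residual norm of a least-squares map X -> Y (illustrative only)."""
    # Fit W on the training rows: W = argmin ||Y_train - X_train W||
    W, *_ = np.linalg.lstsq(X[train_idx], Y[train_idx], rcond=None)
    # Evaluate the residual of every test row separately
    residual = Y[test_idx] - X[test_idx] @ W
    return np.linalg.norm(residual, axis=1)


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
Y = X @ rng.normal(size=(4, 3))  # Y is an exact linear function of X

train_idx = np.arange(0, 100, 2)  # hypothetical split: even rows for training
test_idx = np.arange(1, 100, 2)   # odd rows for testing

errors = pointwise_gre_sketch(X, Y, train_idx, test_idx)
```

Because Y is constructed as a linear function of X, every entry of `errors` is zero up to floating-point precision. One value is returned per test sample.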

skmatter.metrics.global_reconstruction_error(X, Y, test_idx=None, train_idx=None, scaler=None, estimator=None)

Computes the global reconstruction error using the source X to reconstruct the features or samples of target Y based on a minimization by linear regression:

\[GRE(X,Y) = \min_W ||Y - XW||\]

If used with X and Y of shape (n_samples, n_features) it computes the global reconstruction error of the features as defined in Ref. [Goscinski2021]. In this case the number of samples of X and Y should agree with each other, but the number of features can be different. The error is expressed per sample.

If used with X and Y of shape (n_features, n_samples) it computes the reconstruction error of the samples. In this case the number of features of X and Y should agree with each other, but the number of samples can be different. The error is expressed per feature.

The default parameters mimic those of Ref. [Goscinski2021].

Parameters:
  • X (ndarray of shape (n_samples, X_n_features)) – Source data which reconstructs target Y. For feature reconstruction of Y using X use input shape (samples, features). For sample reconstruction of Y using X use input shape (features, samples).

  • Y (ndarray of shape (n_samples, Y_n_targets)) – Target data which is reconstructed with X. For feature reconstruction of Y using X use input shape (samples, features). For sample reconstruction of Y using X use input shape (features, samples).

  • train_idx (ndarray, dtype=int, default=None) – Array of indices used for training. If None, the complement of test_idx is used. If test_idx is also None, a 2-fold split is taken.

  • test_idx (ndarray, dtype=int, default=None) – Array of indices used for testing. If None, the complement of train_idx is used. If train_idx is also None, a 2-fold split is taken.

  • scaler (object implementing fit/transform) – Scales X and Y before computing the reconstruction measure. The default scaler scales the features such that the reconstruction measure on the training set is bounded above by 1.

  • estimator (object implementing fit/predict, default=None) – Sklearn estimator used to reconstruct features/samples.

Returns:

global_reconstruction_error (ndarray) – The global reconstruction error
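The measure is not symmetric in X and Y, which the following NumPy sketch tries to make concrete. It is an illustration under stated assumptions, not the library code: the helpers `scale_total_variance` and `gre_sketch` are hypothetical, the scaling to unit total training variance stands in for the default scaler, and the division by the square root of the number of test samples is an assumed normalization.

```python
import numpy as np


def scale_total_variance(A, train_idx):
    """Centre with the training mean and scale to unit total training variance."""
    mean = A[train_idx].mean(axis=0)
    scale = np.sqrt(A[train_idx].var(axis=0).sum())
    return (A - mean) / scale


def gre_sketch(X, Y, train_idx, test_idx):
    """Root-mean-square residual of a least-squares map X -> Y on the test rows."""
    Xs = scale_total_variance(X, train_idx)
    Ys = scale_total_variance(Y, train_idx)
    W, *_ = np.linalg.lstsq(Xs[train_idx], Ys[train_idx], rcond=None)
    residual = Ys[test_idx] - Xs[test_idx] @ W
    return np.linalg.norm(residual) / np.sqrt(len(test_idx))


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
Y = X[:, :2]  # Y keeps only two of the four features of X

train_idx = np.arange(0, 200, 2)
test_idx = np.arange(1, 200, 2)

gre_xy = gre_sketch(X, Y, train_idx, test_idx)  # X contains everything in Y
gre_yx = gre_sketch(Y, X, train_idx, test_idx)  # Y lacks two features of X
```

Here `gre_xy` is numerically zero, while `gre_yx` is clearly positive because two features of X cannot be recovered from Y. With this scaling the trivial choice W = 0 gives a training error of exactly 1, which is why the training error of the fitted map cannot exceed 1.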

Global Reconstruction Distortion

skmatter.metrics.pointwise_global_reconstruction_distortion(X, Y, test_idx=None, train_idx=None, scaler=None, estimator=None)

Computes the pointwise global reconstruction distortion using the source X to reconstruct the features or samples of target Y based on a minimization by orthogonal regression:

\[GRD^{(i)}(X,Y) = \min_Q \|y_i - x_iQ\| \quad\mathrm{subject\ to}\quad Q^TQ=I\]

If used with X and Y of shape (n_samples, n_features) it computes the pointwise global reconstruction distortion of the features as defined in Ref. [Goscinski2021]. In this case the number of samples of X and Y should agree with each other, but the number of features can be different. The distortion is expressed per sample.

If used with X and Y of shape (n_features, n_samples) it computes the reconstruction distortion of the samples. In this case the number of features of X and Y should agree with each other, but the number of samples can be different. The distortion is expressed per feature.

The default parameters mimic those of Ref. [Goscinski2021].

Parameters:
  • X (ndarray of shape (n_samples, X_n_features)) – Source data which reconstructs target Y. For feature reconstruction of Y using X use input shape (samples, features). For sample reconstruction of Y using X use input shape (features, samples).

  • Y (ndarray of shape (n_samples, Y_n_targets)) – Target data which is reconstructed with X. For feature reconstruction of Y using X use input shape (samples, features). For sample reconstruction of Y using X use input shape (features, samples).

  • train_idx (ndarray, dtype=int, default=None) – Array of indices used for training. If None, the complement of test_idx is used. If test_idx is also None, a 2-fold split is taken.

  • test_idx (ndarray, dtype=int, default=None) – Array of indices used for testing. If None, the complement of train_idx is used. If train_idx is also None, a 2-fold split is taken.

  • scaler (object implementing fit/transform) – Scales X and Y before computing the reconstruction measure. The default scaler scales the features such that the reconstruction measure on the training set is bounded above by 1.

  • estimator (object implementing fit/predict, default=None) – Sklearn estimator used to reconstruct features/samples.

Returns:

pointwise_global_reconstruction_distortion (ndarray) – The global reconstruction distortion for each sample/point
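The constrained minimization over orthogonal Q is the orthogonal Procrustes problem, which has a closed-form solution through a singular value decomposition. The sketch below shows that solution in plain NumPy. It assumes X and Y have the same number of features and skips the scaler; the helper names `orthogonal_map` and `pointwise_grd_sketch` are hypothetical and this is not the library's implementation.

```python
import numpy as np


def orthogonal_map(X_train, Y_train):
    """Solve min_Q ||Y - XQ|| subject to Q^T Q = I (orthogonal Procrustes)."""
    U, _, Vt = np.linalg.svd(X_train.T @ Y_train)
    return U @ Vt


def pointwise_grd_sketch(X, Y, train_idx, test_idx):
    """Per-sample residual norm after the best orthogonal map X -> Y."""
    Q = orthogonal_map(X[train_idx], Y[train_idx])
    return np.linalg.norm(Y[test_idx] - X[test_idx] @ Q, axis=1)


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # a random orthogonal matrix
Y = X @ R                                     # Y is an orthogonal transform of X

train_idx = np.arange(0, 100, 2)
test_idx = np.arange(1, 100, 2)

distortion = pointwise_grd_sketch(X, Y, train_idx, test_idx)
```

Since Y differs from X only by an orthogonal transformation, the fitted Q recovers R and every per-sample distortion vanishes up to round-off.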

skmatter.metrics.global_reconstruction_distortion(X, Y, test_idx=None, train_idx=None, scaler=None, estimator=None)

Computes the global reconstruction distortion using the source X to reconstruct the features or samples of target Y based on a minimization by orthogonal regression:

\[GRD(X,Y) = \min_Q \|Y - XQ\| \quad\mathrm{subject\ to}\quad Q^TQ=I\]

If used with X and Y of shape (n_samples, n_features) it computes the global reconstruction distortion of the features as defined in Ref. [Goscinski2021]. In this case the number of samples of X and Y should agree with each other, but the number of features can be different. The distortion is expressed per sample.

If used with X and Y of shape (n_features, n_samples) it computes the reconstruction distortion of the samples. In this case the number of features of X and Y should agree with each other, but the number of samples can be different. The distortion is expressed per feature.

The default parameters mimic those of Ref. [Goscinski2021].

Parameters:
  • X (ndarray of shape (n_samples, X_n_features)) – Source data which reconstructs target Y. For feature reconstruction of Y using X use input shape (samples, features). For sample reconstruction of Y using X use input shape (features, samples).

  • Y (ndarray of shape (n_samples, Y_n_targets)) – Target data which is reconstructed with X. For feature reconstruction of Y using X use input shape (samples, features). For sample reconstruction of Y using X use input shape (features, samples).

  • train_idx (ndarray, dtype=int, default=None) – Array of indices used for training. If None, the complement of test_idx is used. If test_idx is also None, a 2-fold split is taken.

  • test_idx (ndarray, dtype=int, default=None) – Array of indices used for testing. If None, the complement of train_idx is used. If train_idx is also None, a 2-fold split is taken.

  • scaler (object implementing fit/transform) – Scales X and Y before computing the reconstruction measure. The default scaler scales the features such that the reconstruction measure on the training set is bounded above by 1.

  • estimator (object implementing fit/predict, default=None) – Sklearn estimator used to reconstruct features/samples.

Returns:

global_reconstruction_distortion (ndarray) – The global reconstruction distortion
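To see what the orthogonality constraint adds compared with an unconstrained linear fit, the following NumPy sketch compares a rotated copy of X with an anisotropically stretched copy. Both are exact linear functions of X, so an unconstrained regression would reconstruct either one without error, but only the rotation is free of distortion. The helper `grd_sketch`, the equal feature counts, the manual scaling to unit total variance and the normalization by the square root of the number of test samples are assumptions for illustration only.

```python
import numpy as np


def grd_sketch(X, Y, train_idx, test_idx):
    """Root-mean-square residual of the best orthogonal map X -> Y on the test rows."""
    U, _, Vt = np.linalg.svd(X[train_idx].T @ Y[train_idx])
    Q = U @ Vt  # closed-form orthogonal Procrustes solution
    residual = Y[test_idx] - X[test_idx] @ Q
    return np.linalg.norm(residual) / np.sqrt(len(test_idx))


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X /= np.sqrt(X.var(axis=0).sum())  # unit total variance

R, _ = np.linalg.qr(rng.normal(size=(3, 3)))
Y_rot = X @ R                              # pure rotation/reflection of X
Y_stretch = X * np.array([2.0, 1.0, 0.5])  # axis-dependent stretch of X

train_idx = np.arange(0, 200, 2)
test_idx = np.arange(1, 200, 2)

grd_rot = grd_sketch(X, Y_rot, train_idx, test_idx)
grd_stretch = grd_sketch(X, Y_stretch, train_idx, test_idx)
```

`grd_rot` is numerically zero, whereas `grd_stretch` stays well above zero because no orthogonal matrix can change the relative scale of the axes.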

Local Reconstruction Error

skmatter.metrics.pointwise_local_reconstruction_error(X, Y, n_local_points, test_idx=None, train_idx=None, scaler=None, estimator=None, n_jobs=None)

Computes the pointwise local reconstruction error using the source X to reconstruct the features or samples of target Y based on a minimization by linear regression:

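\[\tilde{\mathbf{x}}'_i = \bar{\mathbf{x}}' + (\mathbf{x}_i - \bar{\mathbf{x}})\mathbf{P}^{(i)}\]
\[LRE^{(i)}(X,Y) = \|\mathbf{x}'_i - \tilde{\mathbf{x}}'_i\|^2\]

Here \(\mathbf{x}_i\) and \(\mathbf{x}'_i\) denote the i-th rows of X and Y, \(\bar{\mathbf{x}}\) and \(\bar{\mathbf{x}}'\) are the means of the n_local_points nearest neighbours of \(\mathbf{x}_i\) in the source and target space, and \(\mathbf{P}^{(i)}\) is the regression weight fitted on that neighbourhood.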

If used with X and Y of shape (n_samples, n_features) it computes the pointwise local reconstruction error of the features as defined in Ref. [Goscinski2021]. In this case the number of samples of X and Y should agree with each other, but the number of features can be different. The error is expressed per sample.

If used with X and Y of shape (n_features, n_samples) it computes the reconstruction error of the samples. In this case the number of features of X and Y should agree with each other, but the number of samples can be different. The error is expressed per feature.

The default parameters mimic those of Ref. [Goscinski2021].

Parameters:
  • X (ndarray of shape (n_samples, X_n_features)) – Source data which reconstructs target Y. For feature reconstruction of Y using X use input shape (samples, features). For sample reconstruction of Y using X use input shape (features, samples).

  • Y (ndarray of shape (n_samples, Y_n_targets)) – Target data which is reconstructed with X. For feature reconstruction of Y using X use input shape (samples, features). For sample reconstruction of Y using X use input shape (features, samples).

  • n_local_points (int) – Number of neighbour points used to compute the local reconstruction weight for each sample/point.

  • train_idx (ndarray, dtype=int, default=None) – Array of indices used for training. If None, the complement of test_idx is used. If test_idx is also None, a 2-fold split is taken.

  • test_idx (ndarray, dtype=int, default=None) – Array of indices used for testing. If None, the complement of train_idx is used. If train_idx is also None, a 2-fold split is taken.

  • scaler (object implementing fit/transform) – Scales X and Y before computing the reconstruction measure. The default scaler scales the features such that the reconstruction measure on the training set is bounded above by 1.

  • estimator (object implementing fit/predict, default=None) – Sklearn estimator used to reconstruct features/samples.

Returns:

pointwise_local_reconstruction_error (ndarray) – The local reconstruction error for each sample/point
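The local construction can be sketched in NumPy as follows. This is a hedged illustration rather than the library routine: `pointwise_lre_sketch` is a hypothetical helper, neighbours are found by brute-force Euclidean distance among the training rows, the local map is a plain least-squares fit, the offset is taken as the neighbourhood mean in the target space, and no scaler or parallelism is applied.

```python
import numpy as np


def pointwise_lre_sketch(X, Y, n_local_points, train_idx, test_idx):
    """Squared error of a locally fitted linear map X -> Y for each test row."""
    X_train, Y_train = X[train_idx], Y[train_idx]
    errors = np.empty(len(test_idx))
    for k, i in enumerate(test_idx):
        # n_local_points nearest training neighbours of x_i in the source space
        dist = np.linalg.norm(X_train - X[i], axis=1)
        nn = np.argsort(dist)[:n_local_points]
        x_mean = X_train[nn].mean(axis=0)
        y_mean = Y_train[nn].mean(axis=0)
        # Local linear map P^(i) fitted on the centred neighbourhood
        P, *_ = np.linalg.lstsq(X_train[nn] - x_mean, Y_train[nn] - y_mean, rcond=None)
        y_tilde = y_mean + (X[i] - x_mean) @ P
        errors[k] = np.sum((Y[i] - y_tilde) ** 2)
    return errors


rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(400, 1))
Y = X**2  # smooth but nonlinear relation between source and target

train_idx = np.arange(0, 400, 2)
test_idx = np.arange(1, 400, 2)

lre_local = pointwise_lre_sketch(X, Y, 10, train_idx, test_idx)
# Using every training point as a "neighbour" reduces to one global linear fit
lre_all = pointwise_lre_sketch(X, Y, len(train_idx), train_idx, test_idx)
```

On this toy data the errors obtained with ten neighbours are far smaller on average than those of the single global fit, which is the situation the local measure is meant to detect: a relation that is locally linear but globally nonlinear.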

skmatter.metrics.local_reconstruction_error(X, Y, n_local_points, test_idx=None, train_idx=None, scaler=None, estimator=None, n_jobs=None)

Computes the local reconstruction error using the source X to reconstruct the features or samples of target Y based on a minimization by linear regression:

\[LRE(X,Y) = \sqrt{\sum_i LRE^{(i)}(X,Y)}/\sqrt{n_\text{test}}\]

If used with X and Y of shape (n_samples, n_features) it computes the local reconstruction error of the features as defined in Ref. [Goscinski2021]. In this case the number of samples of X and Y should agree with each other, but the number of features can be different. The error is expressed per sample.

If used with X and Y of shape (n_features, n_samples) it computes the reconstruction error of the samples. In this case the number of features of X and Y should agree with each other, but the number of samples can be different. The error is expressed per feature.

The default parameters mimic those of Ref. [Goscinski2021].

Parameters:
  • X (ndarray of shape (n_samples, X_n_features)) – Source data which reconstructs target Y. For feature reconstruction of Y using X use input shape (samples, features). For sample reconstruction of Y using X use input shape (features, samples).

  • Y (ndarray of shape (n_samples, Y_n_targets)) – Target data which is reconstructed with X. For feature reconstruction of Y using X use input shape (samples, features). For sample reconstruction of Y using X use input shape (features, samples).

  • n_local_points (int) – Number of neighbour points used to compute the local reconstruction weight for each sample/point.

  • train_idx (ndarray, dtype=int, default=None) – Array of indices used for training. If None, the complement of test_idx is used. If test_idx is also None, a 2-fold split is taken.

  • test_idx (ndarray, dtype=int, default=None) – Array of indices used for testing. If None, the complement of train_idx is used. If train_idx is also None, a 2-fold split is taken.

  • scaler (object implementing fit/transform) – Scales X and Y before computing the reconstruction measure. The default scaler scales the features such that the reconstruction measure on the training set is bounded above by 1.

  • estimator (object implementing fit/predict, default=None) – Sklearn estimator used to reconstruct features/samples.

Returns:

local_reconstruction_error (ndarray) – The local reconstruction error
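The aggregation formula for LRE(X,Y) amounts to a root mean square over the pointwise values. A small worked example, using made-up pointwise errors and a hypothetical helper `aggregate_lre`, might look like this:

```python
import numpy as np


def aggregate_lre(pointwise_lre):
    """Combine pointwise LRE values: sqrt(sum_i LRE_i) / sqrt(n_test)."""
    pointwise_lre = np.asarray(pointwise_lre, dtype=float)
    return np.sqrt(pointwise_lre.sum()) / np.sqrt(pointwise_lre.size)


# Made-up pointwise squared errors for four test samples
pointwise = [0.0, 0.04, 0.16, 0.0]
lre = aggregate_lre(pointwise)  # sqrt(0.2) / sqrt(4) = sqrt(0.05)
```

Because the pointwise values are already squared norms, the aggregate equals the square root of their mean.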