Metrics

class softsensor.metrics.WassersteinDistance(data, weights=None)[source]

Class for calculating the Wasserstein distance and the Wasserstein Fourier distance.

data

The input data as a pandas DataFrame.

Type: pd.DataFrame

weights_u

The weights associated with the data as a pandas DataFrame.

Type: pd.DataFrame

sort_data()[source]: Sort the data and weights in ascending order.

weighted_hist()[source]: Calculate the weighted histogram of the data.

pdf()[source]: Calculate the probability density function of the data.

cdf()[source]: Calculate the cumulative density function of the data.

inverse_cdf()[source]: Calculate the inverse cumulative density function of the data.

wasserstein_distance_p()[source]: Calculate the p-th Wasserstein distance.

cdf(n_points=1000)[source]

inverse_cdf(n_points=1000)[source]

pdf(n_points=1000)[source]

sort_data()[source]

Sorts the data and weights based on the values in the data.

This method sorts the data and weights along the first axis (rows) based on the values in the data. It returns the sorted indices, sorted data, and sorted weights.

Returns:

A tuple containing:

sorted_indices (numpy.ndarray): The indices that would sort the data.
sorted_data (pandas.DataFrame): The data sorted according to the sorted indices.
sorted_weights (pandas.DataFrame): The weights sorted according to the sorted indices.

Return type:

tuple

wasserstein_distance_p(p=1)[source]

Compute the Wasserstein distance (p-th order) based on the inverse cumulative distribution function (CDF).

This method calculates the Wasserstein distance, a measure of the distance between two probability distributions, using the inverse CDF of the distribution. The distance is computed for a specified order p.

Parameters: p (int, optional) – The order of the Wasserstein distance. Defaults to 1.
Returns: The computed Wasserstein distance of order p.
Return type: float
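
The inverse-CDF formulation described above can be illustrated with a minimal standalone sketch. It is not the library's implementation: the function name `wasserstein_p_sketch` and its arguments are hypothetical, and it compares two plain 1-D samples rather than DataFrame columns. It approximates W_p = (∫₀¹ |F_u⁻¹(q) − F_v⁻¹(q)|^p dq)^(1/p) on a midpoint grid.

```python
def wasserstein_p_sketch(u, v, p=1, n_points=1000):
    """Approximate the p-th order Wasserstein distance between two 1-D samples.

    Hypothetical helper for illustration only; it integrates the difference
    of the empirical inverse CDFs over n_points midpoints of (0, 1).
    """
    u_sorted, v_sorted = sorted(u), sorted(v)

    def quantile(sorted_vals, q):
        # Empirical inverse CDF as a step function over the sorted sample.
        idx = min(int(q * len(sorted_vals)), len(sorted_vals) - 1)
        return sorted_vals[idx]

    total = 0.0
    for i in range(n_points):
        q = (i + 0.5) / n_points  # midpoint of the i-th probability cell
        total += abs(quantile(u_sorted, q) - quantile(v_sorted, q)) ** p
    return (total / n_points) ** (1.0 / p)


# A sample shifted by a constant lies exactly that constant away.
shifted = wasserstein_p_sketch([0, 1, 2, 3], [1, 2, 3, 4], p=1)
```

For identical samples the sketch returns 0, and for a constant shift it returns the size of the shift for any order p.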

wasserstein_fourier_distance(p=2)[source]

Compute the Wasserstein Fourier distance for the given data.

This method calculates the Wasserstein Fourier distance by first obtaining the inverse cumulative distribution function (CDF) using the normalized power spectral density and then computing the distance using the specified metric.

Parameters: p (int, optional) – The order of the norm used in the distance calculation. Defaults to 2.
Returns: The computed Wasserstein Fourier distance.
Return type: float
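
As a rough illustration of the idea, the sketch below treats two power spectral densities on a shared frequency grid as probability distributions (after normalization) and compares their inverse CDFs. The function name `wasserstein_fourier_sketch` and the assumption that the PSDs are already available as plain lists are hypothetical; how the class itself estimates the PSD is not shown here.

```python
from itertools import accumulate


def wasserstein_fourier_sketch(freqs, psd_a, psd_b, p=2, n_points=1000):
    """Distance of order p between two normalized PSDs on a common grid.

    Hypothetical helper for illustration only.
    """

    def make_inverse_cdf(psd):
        total = sum(psd)
        # Cumulative sum of the normalized PSD acts as a CDF over frequency.
        cdf = list(accumulate(w / total for w in psd))

        def inverse_cdf(q):
            # First frequency whose cumulative mass reaches q.
            for f, c in zip(freqs, cdf):
                if c >= q:
                    return f
            return freqs[-1]

        return inverse_cdf

    inv_a, inv_b = make_inverse_cdf(psd_a), make_inverse_cdf(psd_b)
    acc = 0.0
    for i in range(n_points):
        q = (i + 0.5) / n_points
        acc += abs(inv_a(q) - inv_b(q)) ** p
    return (acc / n_points) ** (1.0 / p)


# All spectral mass at 1 Hz versus all spectral mass at 3 Hz.
d = wasserstein_fourier_sketch([0.0, 1.0, 2.0, 3.0], [0, 1, 0, 0], [0, 0, 0, 1])
```

Because the PSDs are normalized first, the result depends on where the spectral energy sits, not on the overall signal power.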

weighted_hist(bins=100, nfft=2048, nperseg=2048)[source]

Compute weighted histograms for each column in the data.

This method calculates the weighted histogram for each column in the dataset using the specified number of bins. The weights for each column are used to compute the histogram. The resulting histograms are stored in the self.hist attribute, and the corresponding probability distributions are stored in the self.distribution attribute.

Parameters: bins (int): The number of bins to use for the histogram. Default is 100.

Attributes:

self.hist (dict): A dictionary where keys are column names and values are tuples containing the histogram values and bin edges.
self.distribution (dict): A dictionary where keys are column names and values are rv_histogram objects representing the probability distributions of the histograms.
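
To make the histogram step concrete, here is a small dependency-free sketch of a weighted histogram for a single column. The name `weighted_histogram_sketch` is hypothetical, and it returns plain lists; the library itself stores rv_histogram objects, which this sketch does not reproduce.

```python
def weighted_histogram_sketch(values, weights, bins=100):
    """Return (counts, edges) of a weighted histogram with equal-width bins.

    Hypothetical helper for illustration only; assumes max(values) > min(values).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins + 1)]
    counts = [0.0] * bins
    for x, w in zip(values, weights):
        # The maximum value is assigned to the last bin.
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += w  # each sample contributes its weight, not 1
    return counts, edges


counts, edges = weighted_histogram_sketch([0, 1, 2, 3, 4], [1, 1, 2, 2, 4], bins=2)
```

Dividing each count by the total weight times the bin width turns the counts into a density that integrates to one.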

wsf_cdf(n_points=1000)[source]

wsf_inverse_cdf(n_points=1000)[source]

wsf_pdf(n_points=1000)[source]

softsensor.metrics.compute_quantile_metrics(model, test_loader, output_names=['x'], expected_quantiles=array([0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95]))[source]

Compute metrics that are compatible with quantile models.

Parameters:

model (QuantileNARX) – Quantile model to evaluate
test_loader (list[Dataloader]) – Test dataset
output_names (list[str], optional) – Output sensors to consider. The default is [“x”]
expected_quantiles (list[float]) – Quantiles to evaluate. Expected to be of the form [lb0, lb1, …, median, …, ub1, ub0]

Returns:

mean_scores

Return type:

dict[str, float]

softsensor.metrics.crps(mu, targets, var) → float[source]

The negatively oriented continuous ranked probability score for Gaussians.

Computes the CRPS for held-out data (targets) given a Gaussian predictive distribution with mean (mu) and variance (var). Each test point is given equal weight in the overall score over the test set.

Negatively oriented means a smaller value is more desirable.

Adapted from https://github.com/uncertainty-toolbox/uncertainty-toolbox

Parameters:

mu (torch.Tensor) – Predicted means
targets (torch.Tensor) – Target values
var (torch.Tensor) – Predicted variances

Returns:

crps

Return type:

float
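
For reference, the closed-form CRPS of a Gaussian can be written in a few lines. The sketch below uses plain Python lists instead of tensors, and the name `crps_gaussian_sketch` is hypothetical. With z = (y − μ)/σ, the score per point is σ·[z(2Φ(z) − 1) + 2φ(z) − 1/√π].

```python
import math


def crps_gaussian_sketch(mu, targets, var):
    """Mean closed-form CRPS of Gaussian predictions (smaller is better).

    Hypothetical helper for illustration only; inputs are plain sequences.
    """
    total = 0.0
    for m, y, v in zip(mu, targets, var):
        sigma = math.sqrt(v)
        z = (y - m) / sigma
        pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal pdf
        cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal cdf
        total += sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))
    return total / len(mu)  # every test point has equal weight


score = crps_gaussian_sketch([0.0], [0.0], [1.0])
```

Even a perfectly centered prediction has a positive CRPS, because the score also penalizes the spread of the predictive distribution.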

softsensor.metrics.distance_calc(inverse_cdf, p)[source]

softsensor.metrics.ece(mu, targets, var, quantiles=array([0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95]))[source]

Expected Calibration Error (ECE) measures the mean absolute calibration error across multiple prediction interval coverage probabilities (PICPs).

See “Accurate Uncertainties for Deep Learning Using Calibrated Regression” [Kuleshov et al. 2018 https://arxiv.org/abs/1807.00263]

See https://stackoverflow.com/questions/20864847/probability-to-z-score-and-vice-versa

Parameters:

mu (torch.Tensor) – Predicted means
targets (torch.Tensor) – Target values
var (torch.Tensor) – Predicted variances
quantiles (list[float], each value in (0, 1)) – Quantiles to evaluate

Returns:

ece

Return type:

torch.Tensor
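
One possible reading of this metric is sketched below. It assumes, without confirmation from the library source, that each quantile q is treated as the nominal coverage of a central Gaussian interval, that the matching z-score is Φ⁻¹((1 + q)/2), and that the ECE is the mean absolute gap between empirical and nominal coverage. The name `ece_gaussian_sketch` is hypothetical.

```python
import math
from statistics import NormalDist


def ece_gaussian_sketch(mu, targets, var, quantiles):
    """Mean absolute gap between empirical and nominal interval coverage.

    Hypothetical helper for illustration only; the mapping from quantile
    to z-score is an assumption, not the library's documented behavior.
    """
    std_normal = NormalDist()
    errors = []
    for q in quantiles:
        z = std_normal.inv_cdf(0.5 + q / 2.0)  # z-score of the central q-interval
        covered = sum(
            abs(y - m) <= z * math.sqrt(v) for m, y, v in zip(mu, targets, var)
        )
        errors.append(abs(covered / len(mu) - q))
    return sum(errors) / len(errors)


value = ece_gaussian_sketch([0.0] * 4, [0.0, 0.5, 1.0, 3.0], [1.0] * 4, [0.5, 0.9])
```

A well-calibrated model yields a value near zero; over-confident and under-confident intervals both increase it.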

softsensor.metrics.heteroscedasticity(mu, targets, var)[source]

Heteroscedasticity of the uncertainty estimate, measured as the standard deviation of the predicted standard deviations.

Parameters:

mu (torch.Tensor) – Predicted means
targets (torch.Tensor) – Target values
var (torch.Tensor) – Predicted variances

Returns:

heteroscedasticity

Return type:

torch.Tensor
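
A minimal sketch of the "std of std" idea, using plain lists. The name `heteroscedasticity_sketch` is hypothetical, and the use of the sample standard deviation (with Bessel's correction) is an assumption.

```python
import math
import statistics


def heteroscedasticity_sketch(var):
    """Standard deviation of the predicted standard deviations.

    Hypothetical helper for illustration only; uses the sample standard
    deviation, which is an assumption about the estimator.
    """
    stds = [math.sqrt(v) for v in var]  # predicted variances -> standard deviations
    return statistics.stdev(stds)


constant = heteroscedasticity_sketch([1.0, 1.0, 1.0])
varying = heteroscedasticity_sketch([1.0, 9.0])
```

A value of zero means the model predicts the same uncertainty everywhere (homoscedastic); larger values mean the predicted uncertainty varies across inputs.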

softsensor.metrics.log_area_error(psd_original, psd_targets, f)[source]

softsensor.metrics.mae(mu, targets, var)[source]

Mean Absolute Error (MAE)

Parameters:

mu (torch.Tensor) – Predicted means
targets (torch.Tensor) – Target values
var (torch.Tensor) – Predicted variances

Returns:

mae

Return type:

torch.Tensor
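
For orientation, the MAE formula on plain lists looks as follows. The name `mae_sketch` is hypothetical; the predicted variances do not enter the formula, so the sketch omits them.

```python
def mae_sketch(mu, targets):
    """Mean of the absolute differences between predictions and targets.

    Hypothetical helper for illustration only.
    """
    return sum(abs(m - y) for m, y in zip(mu, targets)) / len(mu)


error = mae_sketch([1.0, 2.0, 3.0], [2.0, 2.0, 5.0])
```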

softsensor.metrics.mpiw(mu, targets, var, z=1.96)[source]

Mean Prediction Interval Width (MPIW) measures the average width of a specific prediction interval (PI).

Parameters:

mu (torch.Tensor) – Predicted means
targets (torch.Tensor) – Target values
var (torch.Tensor) – Predicted variances
z (float, optional) – Z-score of the prediction interval. The default is 1.96 (95% interval)

Returns:

pi_width

Return type:

torch.Tensor
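
Assuming a symmetric Gaussian interval μ ± z·σ, the width at each point is 2·z·σ, and the MPIW is the mean of these widths. The sketch below uses plain lists, and the name `mpiw_sketch` is hypothetical.

```python
import math


def mpiw_sketch(var, z=1.96):
    """Mean width of the symmetric interval mu +/- z * sigma.

    Hypothetical helper for illustration only; assumes Gaussian intervals.
    """
    widths = [2.0 * z * math.sqrt(v) for v in var]
    return sum(widths) / len(widths)


width = mpiw_sketch([1.0, 4.0])
```

The width depends only on the predicted variances, which is why it is usually reported together with a coverage metric.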

softsensor.metrics.nll(mu, targets, var)[source]

Gaussian Negative Log Likelihood Loss (NLL)

Parameters:

mu (torch.Tensor) – Predicted means
targets (torch.Tensor) – Target values
var (torch.Tensor) – Predicted variances

Returns:

nll

Return type:

torch.Tensor

softsensor.metrics.nll_statistic(mu, targets, var)[source]

Gaussian negative log-likelihood, reported as a statistic rather than as the score of the optimization objective. The NLL loss is the main reported metric, since minimizing it is equivalent to minimizing the NLL; this metric is added for comparability.

Parameters:

mu (torch.Tensor) – Predicted means
targets (torch.Tensor) – Target values
var (torch.Tensor) – Predicted variances

Returns:

nll

Return type:

torch.Tensor
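
The relationship between a loss-style NLL and the full statistic can be sketched as follows. The name `gaussian_nll_sketch` and the `full` flag are hypothetical, and it is an assumption that the two library functions differ only by the constant term ½·log(2π), as is the case for torch.nn.GaussianNLLLoss with `full=False` versus `full=True`.

```python
import math


def gaussian_nll_sketch(mu, targets, var, full=True):
    """Mean Gaussian negative log-likelihood over all points.

    Hypothetical helper for illustration only. With full=False the constant
    0.5 * log(2 * pi) is dropped, which does not change the minimizer.
    """
    const = 0.5 * math.log(2.0 * math.pi) if full else 0.0
    terms = [
        0.5 * (math.log(v) + (y - m) ** 2 / v) + const
        for m, y, v in zip(mu, targets, var)
    ]
    return sum(terms) / len(terms)


with_const = gaussian_nll_sketch([0.0], [0.0], [1.0], full=True)
without_const = gaussian_nll_sketch([0.0], [0.0], [1.0], full=False)
```

The two values differ by a fixed offset, so rankings between models are identical; only the absolute numbers change.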

softsensor.metrics.pearson(mu, targets, var)[source]

Pearson correlation coefficient

Parameters:

mu (torch.Tensor) – Predicted means
targets (torch.Tensor) – Target values
var (torch.Tensor) – Predicted variances

Returns:

pearson

Return type:

float
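
The Pearson coefficient between predicted means and targets can be computed directly from its definition. The sketch below uses plain lists, and the name `pearson_sketch` is hypothetical.

```python
import math


def pearson_sketch(mu, targets):
    """Pearson correlation between predictions and targets.

    Hypothetical helper for illustration only; assumes neither input is constant.
    """
    n = len(mu)
    mean_mu = sum(mu) / n
    mean_y = sum(targets) / n
    cov = sum((m - mean_mu) * (y - mean_y) for m, y in zip(mu, targets))
    ss_mu = sum((m - mean_mu) ** 2 for m in mu)
    ss_y = sum((y - mean_y) ** 2 for y in targets)
    return cov / math.sqrt(ss_mu * ss_y)


r_pos = pearson_sketch([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_neg = pearson_sketch([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

Note that the coefficient is insensitive to scale and offset, so a high correlation does not imply a low error.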

softsensor.metrics.picp(mu, targets, var, z=1.96)[source]

Prediction Interval Coverage Probability (PICP) measures the fraction of targets covered by a specific prediction interval (PI).

Parameters:

mu (torch.Tensor) – Predicted means
targets (torch.Tensor) – Target values
var (torch.Tensor) – Predicted variances
z (float, optional) – Z-score of the prediction interval. The default is 1.96 (95% interval)

Returns:

pi_coverage

Return type:

torch.Tensor
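
Assuming a symmetric Gaussian interval μ ± z·σ, coverage is the fraction of targets that fall inside it. The sketch below uses plain lists, and the name `picp_sketch` is hypothetical.

```python
import math


def picp_sketch(mu, targets, var, z=1.96):
    """Fraction of targets inside the interval mu +/- z * sigma.

    Hypothetical helper for illustration only; assumes Gaussian intervals.
    """
    covered = sum(
        abs(y - m) <= z * math.sqrt(v) for m, y, v in zip(mu, targets, var)
    )
    return covered / len(mu)


coverage = picp_sketch([0.0] * 4, [0.0, 1.0, 2.0, 3.0], [1.0] * 4)
```

For a calibrated model and z = 1.96, the coverage should be close to 0.95 on a sufficiently large test set.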

softsensor.metrics.quantile_ece(predicted_quantiles, targets, expected_quantiles=array([0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95]))[source]

Expected Calibration Error (ECE) measures the mean absolute calibration error across multiple prediction interval coverage probabilities (PICPs).

See “Accurate Uncertainties for Deep Learning Using Calibrated Regression” [Kuleshov et al. 2018 https://arxiv.org/abs/1807.00263]

Parameters:

predicted_quantiles (list[torch.Tensor]) – Expected to be of the form [median, lb0, ub0, lb1, ub1, …]
targets (torch.Tensor) – Ground truth for median
expected_quantiles (list[float]) – Quantiles to evaluate. Expected to be of the form [lb0, lb1, …, median, …, ub1, ub0]

Returns:

ece

Return type:

torch.Tensor
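
The two orderings described above can be paired up as in the following sketch. It assumes, without confirmation from the library source, that the i-th interval (lb_i, ub_i) has nominal coverage equal to the difference between its upper and lower expected quantile, and that the ECE is the mean absolute gap between empirical and nominal coverage. The name `quantile_ece_sketch` is hypothetical and plain lists stand in for tensors.

```python
def quantile_ece_sketch(predicted_quantiles, targets, expected_quantiles):
    """Mean absolute coverage error over all predicted quantile intervals.

    Hypothetical helper for illustration only.
    predicted_quantiles: [median, lb0, ub0, lb1, ub1, ...], one list per quantile.
    expected_quantiles:  [lb0, lb1, ..., median, ..., ub1, ub0].
    """
    n_intervals = (len(predicted_quantiles) - 1) // 2
    errors = []
    for i in range(n_intervals):
        lower = predicted_quantiles[1 + 2 * i]
        upper = predicted_quantiles[2 + 2 * i]
        # Nominal coverage of interval i, e.g. 0.95 - 0.05 = 0.90.
        nominal = expected_quantiles[-(i + 1)] - expected_quantiles[i]
        inside = sum(lo <= y <= hi for lo, y, hi in zip(lower, targets, upper))
        errors.append(abs(inside / len(targets) - nominal))
    return sum(errors) / len(errors)


preds = [[1.5] * 4, [-1.0] * 4, [2.5] * 4]  # median, lb0, ub0
result = quantile_ece_sketch(preds, [0.0, 1.0, 2.0, 3.0], [0.05, 0.5, 0.95])
```

In this toy case three of four targets fall inside a nominal 90% interval, giving a coverage gap of about 0.15.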

softsensor.metrics.r2(mu, targets, var)[source]

Coefficient of determination (R² score)

Parameters:

mu (torch.Tensor) – Predicted means
targets (torch.Tensor) – Target values
var (torch.Tensor) – Predicted variances

Returns:

r2

Return type:

float
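
The standard definition R² = 1 − SS_res / SS_tot can be sketched on plain lists as follows. The name `r2_sketch` is hypothetical.

```python
def r2_sketch(mu, targets):
    """Coefficient of determination of predictions against targets.

    Hypothetical helper for illustration only; assumes targets are not constant.
    """
    mean_y = sum(targets) / len(targets)
    ss_res = sum((y - m) ** 2 for m, y in zip(mu, targets))  # residual sum of squares
    ss_tot = sum((y - mean_y) ** 2 for y in targets)         # total sum of squares
    return 1.0 - ss_res / ss_tot


perfect = r2_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
baseline = r2_sketch([2.0, 2.0, 2.0], [1.0, 2.0, 3.0])
```

A perfect prediction scores 1, always predicting the target mean scores 0, and worse predictions can be negative.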

softsensor.metrics.rmse(mu, targets, var)[source]

Root Mean Square Error (RMSE)

Parameters:

mu (torch.Tensor) – Predicted means
targets (torch.Tensor) – Target values
var (torch.Tensor) – Predicted variances

Returns:

rmse

Return type:

torch.Tensor
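
For orientation, the RMSE formula on plain lists looks as follows. The name `rmse_sketch` is hypothetical; the predicted variances do not enter the formula, so the sketch omits them.

```python
import math


def rmse_sketch(mu, targets):
    """Square root of the mean squared difference between predictions and targets.

    Hypothetical helper for illustration only.
    """
    mse = sum((m - y) ** 2 for m, y in zip(mu, targets)) / len(mu)
    return math.sqrt(mse)


error = rmse_sketch([1.0, 2.0, 3.0], [2.0, 2.0, 5.0])
```

Because errors are squared before averaging, a single large deviation raises the RMSE more than it raises the mean absolute error.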

softsensor.metrics.rmv(mu, targets, var)[source]

Root Mean Variance (RMV) measures the sharpness of the uncertainty distributions

Parameters:

mu (torch.Tensor) – Predicted means
targets (torch.Tensor) – Target values
var (torch.Tensor) – Predicted variances

Returns:

sharpness

Return type:

torch.Tensor
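
Read literally, the root mean variance is the square root of the average predicted variance. The sketch below uses plain lists, and the name `rmv_sketch` is hypothetical.

```python
import math


def rmv_sketch(var):
    """Square root of the mean predicted variance (sharpness).

    Hypothetical helper for illustration only.
    """
    return math.sqrt(sum(var) / len(var))


sharpness = rmv_sketch([1.0, 3.0])
```

Smaller values indicate sharper (more confident) predictive distributions; sharpness says nothing about whether that confidence is justified.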

class softsensor.metrics.wf_distribution(frequency, norm_psd_cumsum, *args, **kwargs)[source]