Evaluation Tools

Created on Tue May 3 17:41:37 2022

@author: WET2RNG

softsensor.eval_tools.comp_batch(models, data_handle, tracks, names, device='cpu', batch_size=256, n_samples=5, reduce=False, sens_analysis=None)[source]

Compute the prediction for a list of models and tracks

Parameters:
  • models (list of models) – list of models for computation. Models need to be defined in _model_pred.

  • data_handle (Meas_handling class) – Meas handling class that has internal functions to get Dataloader and Dataset.

  • tracks (list of str) – Names of the tracks to predict.

  • names (list of str) – Names that are added to the column name.

  • device (str, optional) – device for computation. The default is ‘cpu’.

  • batch_size (int, optional) – Batch size for models where batching is possible. The default is 256.

  • reduce (bool, optional) – If True, reduce the epistemic and aleatoric uncertainty to a total one. Only relevant for ensembles and Evidence Estimation ARNN. The default is False.

  • sens_analysis (dict, optional) –

    Dictionary that defines if and how a sensitivity analysis is computed for the prediction. If key ‘method’ is valid, the sensitivity analysis is computed either ‘gradient’ or ‘perturbation’-based. If key ‘comp’ is given and True, gradients of the prediction w.r.t. the inputs are computed. If key ‘plot’ is given and True, postprocessing results of the gradients are visualized. If key ‘sens_length’ is given, the prediction is only computed for n = ‘sens_length’ samples in the time series.

    The default is None, i.e. no sensitivity analysis is computed.
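
The reduce option described above merges an epistemic and an aleatoric uncertainty into a single total one. As a rough illustration only, assume an ensemble whose members each predict a Gaussian mean and variance: a common decomposition takes the mean of the member variances as the aleatoric part, the variance of the member means as the epistemic part, and sums the two. The helper name total_variance and the array layout are hypothetical, and the computation inside softsensor may differ.

```python
import numpy as np


def total_variance(means, variances):
    """Combine ensemble outputs into aleatoric, epistemic and total variance.

    means, variances: array-likes of shape (n_members, n_timesteps).
    Hypothetical helper for illustration, not part of softsensor.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    aleatoric = variances.mean(axis=0)   # average predicted noise variance
    epistemic = means.var(axis=0)        # disagreement between the members
    return aleatoric, epistemic, aleatoric + epistemic


# Three ensemble members, two time steps
mu = [[1.0, 2.0], [1.0, 4.0], [1.0, 6.0]]
var = [[0.5, 0.1], [0.5, 0.2], [0.5, 0.3]]
aleatoric, epistemic, total = total_variance(mu, var)
```

At the first time step all members agree, so the total variance equals the aleatoric part; at the second, the spread of the member means dominates.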

Returns:

pred_df – DataFrames with additional columns for the predictions, one per track.

Return type:

list of pd.DataFrame

Examples

Define Data

>>> import softsensor.meas_handling as ms
>>> import numpy as np
>>> import pandas as pd
>>> t = np.linspace(0, 1.0, 101)
>>> d = {'inp1': np.random.randn(101),
         'inp2': np.random.randn(101),
         'out': np.random.randn(101)}
>>> handler = ms.Meas_handling([pd.DataFrame(d, index=t)], ['train'],
                               ['inp1', 'inp2'], ['out'], fs=100)

Compute Prediction

>>> from softsensor.eval_tools import comp_batch
>>> import softsensor.autoreg_models
>>> params = {'input_channels': 2,
              'pred_size': 1,
              'window_size': 10,
              'rnn_window': 10}
>>> m = softsensor.autoreg_models.ARNN(**params, hidden_size=[16, 8])
>>> dataframes = comp_batch([m], handler, handler.train_names,
                            ['ARNN'], device='cpu')
>>> list(dataframes[0].columns)
['inp1', 'inp2', 'out', 'out_ARNN']

Compute Prediction with uncertainty

>>> import softsensor.homoscedastic_model as hm
>>> vars = hm.fit_homoscedastic_var(dataframes, ['out'], ['out_ARNN'])
>>> homosc_m = hm.HomoscedasticModel(m, vars)
>>> sepmve = softsensor.autoreg_models.SeparateMVEARNN(**params,mean_model=m,
                                                       var_hidden_size=[16, 8])
>>> dataframes = comp_batch([m, homosc_m, sepmve], handler, handler.train_names,
                            ['ARNN', 'Homosc_ARNN', 'SepMVE'], device='cpu')
>>> list(dataframes[0].columns)
['inp1','inp2','out','out_ARNN','out_Homosc_ARNN','out_Homosc_ARNN_var','out_SepMVE','out_SepMVE_var']

softsensor.eval_tools.comp_error(test_df, out_sens, fs=None, names=['pred'], metrics=['MSE'], freq_metrics=None, freq_range=None, bins=20)[source]

Computes the errors for a DataFrame whose prediction columns follow a specific naming scheme

Parameters:
  • test_df (pandas DataFrame) – DataFrame that must include the original output and the prediction. The column name of the prediction must look like: ‘out_sens_name’.

  • out_sens (list of str) – column names to observe

  • fs (float) – sampling rate of the df for psd error computation.

  • names (list of str, optional) – list of names that are appended to the original column. The default is [‘pred’].

  • metrics (list of str, optional) – Metrics to Evaluate in Time domain [‘MSE’, ‘MAE’, ‘MAPE’]. The default is [‘MSE’].

  • freq_range (tuple of float, optional) – range in which the psd error is computed. The default is None.

Returns:

result_df – DataFrame with errors as index and names as columns.

Return type:

pandas DataFrame
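
To make the column naming and the result layout concrete, here is a minimal sketch in plain pandas and NumPy. It does not call softsensor; point_errors is a hypothetical helper that only mirrors the conventions described above (prediction columns named ‘{out_sens}_{name}’, errors as index, names as columns) for the MSE and MAE metrics.

```python
import numpy as np
import pandas as pd


def point_errors(df, out_sens, names, metrics=("MSE", "MAE")):
    """Hypothetical illustration of the result layout, not softsensor code."""
    funcs = {
        "MSE": lambda e: float(np.mean(e ** 2)),
        "MAE": lambda e: float(np.mean(np.abs(e))),
    }
    rows = {}
    for sens in out_sens:
        for metric in metrics:
            # one row per (sensor, metric), one column per model name
            rows[f"{sens}_{metric}"] = {
                name: funcs[metric](df[sens] - df[f"{sens}_{name}"])
                for name in names
            }
    return pd.DataFrame.from_dict(rows, orient="index")[list(names)]


df = pd.DataFrame({
    "out": [1.0, 2.0, 3.0, 4.0],
    "out_A": [1.0, 2.0, 3.0, 2.0],   # a single error of 2
    "out_B": [2.0, 3.0, 4.0, 5.0],   # a constant error of 1
})
result = point_errors(df, out_sens=["out"], names=["A", "B"])
```

Both hypothetical models reach the same MSE here, while the MAE separates the single large error of A from the constant offset of B.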

Examples

Based on the examples in comp_batch. Prediction of point Metrics

>>> from softsensor.eval_tools import comp_error
>>> comp_error(dataframes[0], out_sens=['out'], names=['ARNN', 'Homosc_ARNN', 'SepMVE'],
               metrics=['MSE', 'MAE'], freq_range=None)
         ARNN  Homosc_ARNN    SepMVE
out_MSE  1.297152     1.297152  1.297152
out_MAE  0.924926     0.924926  0.924926

Prediction of Distributional Metrics

>>> comp_error(dataframes[0], out_sens=['out'], names=['Homosc_ARNN', 'SepMVE'],
               metrics=['NLL', 'ECE'], freq_range=None)
         Homosc_ARNN    SepMVE
out_NLL     0.630110  0.678553
out_ECE     0.023137  0.083637
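
The distributional metrics need a predicted variance next to the mean, which is why only the models with a ‘_var’ column appear in the call above. As a hedged sketch of what a Gaussian negative log-likelihood measures (the exact definition used by softsensor, for example whether constant terms are kept, is an assumption, and gaussian_nll is a hypothetical helper):

```python
import numpy as np


def gaussian_nll(y, mean, var):
    """Mean Gaussian negative log-likelihood of y under N(mean, var)."""
    y, mean, var = (np.asarray(a, dtype=float) for a in (y, mean, var))
    return float(np.mean(0.5 * (np.log(2 * np.pi * var) + (y - mean) ** 2 / var)))


y = np.array([0.0, 1.0, 2.0])
pred = np.array([0.0, 1.0, 2.0])   # perfect mean prediction

# Same mean prediction, different predicted variances
nll_sharp = gaussian_nll(y, pred, var=np.full(3, 0.1))
nll_wide = gaussian_nll(y, pred, var=np.full(3, 10.0))
```

With identical mean predictions the point metrics would be equal, but the NLL penalizes the needlessly wide variance, so nll_wide is larger than nll_sharp.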

Prediction of Statistical Metrics

>>> comp_error(dataframes[0], out_sens=['out'], names=['ARNN', 'Homosc_ARNN', 'SepMVE'],
               metrics=['JSD', 'Wasserstein'], freq_range=None)
                     ARNN  Homosc_ARNN    SepMVE
out_JSD          0.680494     0.680494  0.680494
out_Wasserstein  0.073267     0.073267  0.073267

Prediction of Metrics in frequency domain (PSD)

>>> comp_error(dataframes[0], out_sens=['out'], names=['ARNN', 'Homosc_ARNN', 'SepMVE'],
               metrics=None, freq_metrics=['MSLE'], fs=100, freq_range=(5, 25))
                  ARNN  Homosc_ARNN    SepMVE
out_PSD_MSLE  0.000864     0.000864  0.000864

softsensor.eval_tools.comp_mean_metrics(models, data_handle, fs, model_names, metrics, freq_range=None)[source]

Compute the mean metric scores of models on the test tracks

Parameters:
  • models (list[nn.Module]) – Torch models to evaluate

  • data_handle (Datahandle) – Datahandle that contains track

  • fs (float) – Sampling rate of the df for psd error computation.

  • model_names (list[string]) – Names of the models

  • metrics (list[string]) – Names of the metrics to evaluate

  • freq_range (tuple of float, optional) – range in which the psd error is computed. The default is None.

  • reduce (bool, optional) – If True we compute the mean scores over all output variables. The default is False.

Returns:

scores_df

Return type:

dict[str, float]

softsensor.eval_tools.comp_metrics(test_df, out_sens, names=['pred'], metric='MSE', bins=20)[source]

Compute the Errors in Time domain

Parameters:
  • test_df (pandas DataFrame) –

    DataFrame that must include the original output and the prediction. The column name of the prediction must look like: ‘out_sens_name’.

    If the metric is an uncertainty metric, the column name of the uncertainty must look like: ‘uncertainty_{out_sens}_{name}’.

  • out_sens (list of str) – column names to observe

  • names (list of str, optional) – list of names that are appended to the original column. The default is [‘pred’].

  • metric (str, optional) – Metric to evaluate in the time domain [‘MSE’, ‘MAE’, ‘MAPE’]. The default is ‘MSE’.

Returns:

error

Return type:

matrix of shape [len(out_sens), len(names)]

softsensor.eval_tools.comp_pred(models, data_handle, track, names=None, batch_size=256, reduce=False, sens_analysis=None)[source]

Computes the predictions of the specified models for a given track

Parameters:
  • models (list of Models) – list of models for computation.

  • data_handle (data handle of the Meas Handling class) – data handle that should include the track.

  • track (str) – name of the track to observe.

  • names (list of str) – each str should include ‘NARX’ if it is a NARX model, ‘AR’ if it is an autoregressive model and ‘RNN’ if it is a recurrent network.

  • batch_size (int, optional) – Batch size for models where batching is possible. The default is 256.

  • reduce (bool, optional) – If True, reduce the epistemic and aleatoric uncertainty to a total one. Only relevant for ensembles and Evidence Estimation ARNN. The default is False.

  • sens_analysis (dict, optional) –

    Dictionary that defines if and how a sensitivity analysis is computed for the prediction. If key ‘method’ is valid, the sensitivity analysis is computed either ‘gradient’ or ‘perturbation’-based. If key ‘comp’ is given and True, gradients of the prediction w.r.t. the inputs are computed. If key ‘plot’ is given and True, postprocessing results of the gradients are visualized. If key ‘sens_length’ is given, the prediction is only computed for n = ‘sens_length’ samples in the time series.

    The default is None, i.e. no sensitivity analysis is computed.

Returns:

  • pred_df (pd.DataFrame) – New columns are Added for each Model prediction.

  • sens_dict (dict) – Returned only if comp_sens. Additional list of dictionaries with the sensitivity analysis results for each model.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from softsensor.meas_handling import Meas_handling
>>> from softsensor.autoreg_models import ARNN
>>> from softsensor.eval_tools import comp_pred
>>> t = np.linspace(0, 1.0, 101)
>>> xlow = np.sin(2 * np.pi * 100 * t)       # 100Hz Signal
>>> xhigh = np.sin(2 * np.pi * 3000 * t)     # 3000Hz Signal
>>> d = {'sine_inp': xlow + xhigh,
         'cos_inp': np.cos(2 * np.pi * 50 * t),
         'out': np.linspace(0, 1.0, 101)}
>>> list_of_df = [pd.DataFrame(d), pd.DataFrame(d)]
>>> test_df = {'sine_inp': 10*xlow + xhigh,
               'cos_inp': np.cos(2 * np.pi * 50 * t),
               'out': np.linspace(0, 1.0, 101)}
>>> test_df = [pd.DataFrame(test_df)]
>>> data_handle = Meas_handling(list_of_df, train_names=['sine1', 'sine2'],
                                input_sensors=['sine_inp', 'cos_inp'],
                                output_sensors=['out'], fs=100,
                                test_dfs=test_df, test_names=['test'])
>>> model = ARNN(input_channels=2, pred_size=1, window_size=10, rnn_window=10)
>>> track = 'test'  # assumed: the track to predict, here the test track defined above
>>> # without sensitivity analysis
>>> pred_df = comp_pred([model], data_handle, track, names=['ARNN'])
>>> # with sensitivity analysis
>>> pred_df, sens_dict = comp_pred([model], data_handle, track, names=['ARNN'],
                                   sens_analysis={'method': 'gradient',
                                                  'params': {'comp': True, 'plot': True}})

softsensor.eval_tools.comp_psd(test_df, out_sens, fs, names=['pred'], freq_range=None)[source]

Compute the MLPE of the PSDs

Parameters:
  • test_df (pandas DataFrame) – DataFrame that must include the original output and the prediction. The column name of the prediction must look like: ‘out_sens_name’.

  • out_sens (list of str) – column names to observe

  • fs (float) – sampling rate of the df for psd error computation.

  • names (list of str, optional) – list of names that are appended to the original column. The default is [‘pred’].

  • freq_range (tuple of float, optional) – range in which the psd error is computed. The default is None.

Returns:

psd_error

Return type:

matrix of shape [len(out_sens), len(names)]
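
To illustrate what a PSD error restricted to freq_range means, the following sketch compares two signals using a plain periodogram and a mean squared log error. It relies on NumPy only. The PSD estimator that softsensor actually uses (for example a Welch estimate) and its exact error definition are not stated here, so both are assumptions, and psd_msle is a hypothetical helper.

```python
import numpy as np


def periodogram(x, fs):
    """Simple periodogram estimate of the power spectral density."""
    x = np.asarray(x, dtype=float)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / (fs * len(x))
    return freqs, psd


def psd_msle(y_true, y_pred, fs, freq_range=None):
    """Mean squared log error between two PSDs, optionally band-limited."""
    freqs, p_true = periodogram(y_true, fs)
    _, p_pred = periodogram(y_pred, fs)
    if freq_range is not None:
        low, high = freq_range
        mask = (freqs >= low) & (freqs <= high)
        p_true, p_pred = p_true[mask], p_pred[mask]
    # log1p keeps the error finite when a bin holds (almost) no power
    return float(np.mean((np.log1p(p_true) - np.log1p(p_pred)) ** 2))


fs = 100
t = np.arange(200) / fs
target = np.sin(2 * np.pi * 10 * t)
pred = target + 0.5 * np.sin(2 * np.pi * 40 * t)   # spurious 40 Hz component

err_low = psd_msle(target, pred, fs, freq_range=(5, 25))    # band without the mismatch
err_high = psd_msle(target, pred, fs, freq_range=(30, 50))  # band containing the mismatch
```

The error stays near zero in the 5 to 25 Hz band, where both spectra agree, and becomes clearly nonzero once the band includes the spurious 40 Hz component.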

softsensor.eval_tools.load_model(Type, path)[source]

Load function of models

Parameters:
  • Type (str) – Type of Network [‘CNN_DNN’, ‘NARX’, ‘CNN_NARX’, ‘AR_CNN’, ‘RNN’].

  • path (str) – path of the Model

Returns:

  • model (nn.Model with nn.Modules)

  • result_df (pandas.DataFrame) – Results of the hyperparameter optimization.