Measurement Handling

Created on Mon May 31 13:08:01 2021

@author: WET2RNG

class softsensor.meas_handling.Meas_handling(train_dfs, train_names, input_sensors, output_sensors, fs, test_dfs=None, test_names=None, pre_comp_cols=None)[source]

Measurement handling class that can be used for the whole data preprocessing

Parameters:
  • train_dfs (List of pd.DataFrames) – List of pd.DataFrame for every Measurement used as Training data for subsequent models.

  • train_names (List of str) – List of track names corresponding to train_dfs. Must have the same length as train_dfs.

  • input_sensors (List of str) – Input sensors for the subsequent models. The order of the strings defines the order in which the data is present inside the loader.

  • output_sensors (List of str) – Output sensors for the subsequent models. The order of the strings defines the order in which the data is present inside the loader.

  • fs (int) – sample frequency.

  • test_dfs (List of pd.DataFrames, optional) – List of pd.DataFrame for every Measurement used as Testing data for subsequent models. The default is None.

  • test_names (List of str, optional) – List of track names corresponding to test_dfs. Must have the same length as test_dfs. The default is None.

  • pre_comp_cols (List of str, optional) – Defines whether the dataset contains a precomputed solution and, if so, which columns hold it. Precomputed solutions might be helpful in certain training tasks. The default is None.

Return type:

None.

Examples

Define a Measurement Handling class

>>> import softsensor.meas_handling as ms
>>> import pandas as pd
>>> import numpy as np
>>> t = np.linspace(0, 1.0, 10001)
>>> xlow = np.sin(2 * np.pi * 100 * t)
>>> xhigh = 0.2 * np.sin(2 * np.pi * 3000 * t)
>>> d = {'sine_inp': xlow + .1 * xhigh,
...      'cos_inp': np.cos(2 * np.pi * 50 * t),
...      'out': np.linspace(0, 1.0, 10001)}
>>> test_df = {'sine_inp': 10*xlow + .1 * xhigh,
...            'cos_inp': np.cos(2 * np.pi * 50 * t),
...            'out': np.linspace(0, 1.0, 10001)}
>>> handler = ms.Meas_handling([pd.DataFrame(d, index=t), pd.DataFrame(d, index=t)],
...                            ['sine1', 'sine2'],
...                            ['sine_inp', 'cos_inp'], ['out'], 10000,
...                            [pd.DataFrame(test_df, index=t)], ['test'])
Scale(scaler=StandardScaler(), predef_scaler=False)[source]

Scale the data. The scaler is fitted only on the training data. Afterwards, both the training and the test data are transformed.

Parameters:
  • scaler (sklearn.preprocessing scaler, optional) – The default is StandardScaler().

  • predef_scaler (bool, optional) – If True, the scaler must already be fitted and is used as-is to scale the data. This might come in handy if the data used for fitting is no longer available or special scaling procedures are required. The default is False.

Return type:

None.

Examples

Based on the example from class initialisation

>>> df = handler.give_dataframe('sine1')
>>> print(np.var(df['sine_inp'].values))
>>> handler.Scale()
>>> print(np.var(df['sine_inp'].values))
1.0
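
The fit-on-training-data-only behaviour described above can be illustrated without the library. The following is a minimal sketch, assuming that the training measurements are stacked before the statistics are computed and that plain mean/standard-deviation scaling stands in for StandardScaler; the array names are hypothetical and this is not the library's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical measurements: two training tracks and one test track,
# each with shape (n_samples, n_sensors).
train = [rng.normal(5.0, 2.0, size=(1000, 2)),
         rng.normal(5.0, 2.0, size=(1000, 2))]
test = [rng.normal(6.0, 3.0, size=(500, 2))]

# Fit: statistics come from the training data only (assumption: all
# training tracks are stacked before fitting).
stacked = np.concatenate(train, axis=0)
mean = stacked.mean(axis=0)
std = stacked.std(axis=0)

# Transform: the same statistics are applied to training and test data.
train_scaled = [(m - mean) / std for m in train]
test_scaled = [(m - mean) / std for m in test]

train_var = np.concatenate(train_scaled, axis=0).var(axis=0)
test_var = test_scaled[0].var(axis=0)
print(train_var)  # close to 1 for every sensor
print(test_var)   # generally not 1, since the test data was not used for fitting
```

Because the test data never enters the fit, its scaled variance reflects how far its distribution lies from the training distribution.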
Resample(fs, kind='linear')[source]

Resample self.train_df and self.test_df to fs using Fourier method along the given axis.

Parameters:
  • fs (float or int) – Target sample frequency to resample the data to.

  • kind (str) – Specifies the kind of interpolation as a string or as an integer specifying the order of the spline interpolator to use. The string has to be one of ‘linear’, ‘nearest’, ‘nearest-up’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘previous’, or ‘next’. ‘zero’, ‘slinear’, ‘quadratic’ and ‘cubic’ refer to a spline interpolation of zeroth, first, second or third order; ‘previous’ and ‘next’ simply return the previous or next value of the point; ‘nearest-up’ and ‘nearest’ differ when interpolating half-integers (e.g. 0.5, 1.5) in that ‘nearest-up’ rounds up and ‘nearest’ down. The default is ‘linear’. See the interpolate.interp1d documentation.

Return type:

None.
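
To make the effect of kind='linear' concrete, here is a minimal sketch of linear-interpolation resampling for a single signal. It uses np.interp as a stand-in and the helper name resample_linear is hypothetical; how Resample builds its time axis internally is an assumption here.

```python
import numpy as np


def resample_linear(values, fs_old, fs_new):
    """Resample a uniformly sampled 1-D signal by linear interpolation."""
    n_old = len(values)
    t_old = np.arange(n_old) / fs_old            # original time stamps
    duration = (n_old - 1) / fs_old              # time of the last sample
    n_new = int(np.floor(duration * fs_new)) + 1
    t_new = np.arange(n_new) / fs_new            # target time stamps
    return t_new, np.interp(t_new, t_old, values)


fs_old = 10000
t = np.arange(10001) / fs_old
signal = np.sin(2 * np.pi * 100 * t)

# Halving the sample frequency keeps every second sample of the original.
t_new, resampled = resample_linear(signal, fs_old, 5000)
print(len(signal), len(resampled))
```

Note that interpolation-based downsampling applies no anti-aliasing filter, so content above half the new sample frequency should be removed beforehand if it matters.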

Filter(freq_lim)[source]

Band-pass filter the sensor data.

Parameters:

freq_lim ((low_cut, high_cut)) – Lower and upper cut-off frequencies for band-pass filtering.

Return type:

None.

Examples

Based on the example from class initialisation. The result should be roughly zero.

>>> handler.Filter(freq_lim=(10, 700))
>>> filtered_sine = handler.train_df[0]['sine_inp'].values
>>> dev = xlow - filtered_sine
>>> print(np.mean(dev))
0.0
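
The idea behind the example above, keeping the 100 Hz component and rejecting the 3000 Hz component, can be reproduced with a simple FFT mask. This is a sketch under stated assumptions: the helper fft_bandpass is hypothetical, and an ideal frequency-domain mask is used in place of whatever filter design Filter applies internally.

```python
import numpy as np


def fft_bandpass(values, fs, freq_lim):
    """Zero all frequency bins outside [low_cut, high_cut] (ideal band-pass)."""
    low_cut, high_cut = freq_lim
    spectrum = np.fft.rfft(values)
    freqs = np.fft.rfftfreq(len(values), d=1.0 / fs)
    keep = (freqs >= low_cut) & (freqs <= high_cut)
    return np.fft.irfft(spectrum * keep, n=len(values))


fs = 10000
t = np.arange(fs) / fs                      # exactly one second, endpoint excluded
xlow = np.sin(2 * np.pi * 100 * t)          # inside the pass band
xhigh = 0.2 * np.sin(2 * np.pi * 3000 * t)  # outside the pass band

filtered = fft_bandpass(xlow + 0.1 * xhigh, fs, freq_lim=(10, 700))
max_dev = np.max(np.abs(filtered - xlow))
print(max_dev)  # tiny: only the 100 Hz component survives
```

The time axis excludes the endpoint so that both sines complete a whole number of periods, which avoids spectral leakage in this toy case.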
fade_in(window_length, window_type='hanning', columns=None)[source]

Apply the defined window to the train and test data to fade in.

Parameters:
  • window_type (str) – Defining the window function type. Supported window types are: hanning, hamming, blackman, bartlette

  • window_length (int) – Defining the window length of the half window.

Examples

Based on the example from class initialisation. After fading in, the first sample should be zero.

>>> print(handler.test_df[0]['cos_inp'][0])
1.0
>>> handler.fade_in(10)
>>> print(handler.test_df[0]['cos_inp'][0])
0.0
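
A fade-in with a half window can be sketched as follows. The assumption here is that the rising half of a Hann window of total length 2 * window_length is multiplied onto the first window_length samples; the standalone function below is hypothetical and only mirrors the method name for readability.

```python
import numpy as np


def fade_in(values, window_length):
    """Multiply the first window_length samples by the rising half of a Hann window."""
    rising_half = np.hanning(2 * window_length)[:window_length]
    faded = np.array(values, dtype=float)   # copy, the input stays untouched
    faded[:window_length] *= rising_half
    return faded


t = np.linspace(0, 1.0, 10001)
cos_inp = np.cos(2 * np.pi * 50 * t)

faded = fade_in(cos_inp, window_length=10)
print(cos_inp[0], faded[0])
```

Only the first window_length samples change; the rest of the signal is passed through unmodified.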
give_torch_loader(window_size, keyword='training', train_ratio=0.8, batch_size=32, rnn_window=None, shuffle=False, Add_zeros=True, forecast=1, full_ds=False, pre_comp=False, n_samples=[5000, 1000])[source]

Returns a torch dataloader for training, validation or testing purposes.

Parameters:
  • window_size (int) – Window size for input and output series.

  • keyword (str, optional) – Possible values are ‘training’, ‘testing’, ‘short’ or [‘Name’]. ‘training’ gives a training and validation dataloader using all training data. ‘testing’ gives a single dataloader using all test data. ‘short’ gives a training loader with 5000 samples and a validation loader with 1000 samples. [‘Name’] is a list of names; each name must be present in either train_names or test_names. The default is ‘training’.

  • train_ratio (float, optional) – Only needed for keyword ‘training’. Defines the ratio of training data compared to validation data. The default is .8.

  • batch_size (int, optional) – Batch size for the dataloader. The default is 32.

  • rnn_window (int, optional) – Window size of the recurrent window. The default is None.

  • shuffle (bool, optional) – Loader is shuffled or not. The default is False.

  • Add_zeros (bool, optional) – Only needed if rnn_window is False. Adds zeros to the beginning of the time series. The default is True.

  • forecast (int, optional) – Forecasting horizon. The default is 1.

  • pre_comp (bool, optional) – using precomputed solution or not. The default is False.

Returns:

one dataloader if train_ratio = 1, otherwise two dataloaders for training and validation

Return type:

torch.dataloader
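
The roles of window_size and train_ratio can be sketched with plain NumPy. This is an illustration only: the helpers sliding_windows and split_train_val are hypothetical, the stride of one sample and the chronological split are assumptions, and no torch objects are created.

```python
import numpy as np


def sliding_windows(inputs, outputs, window_size):
    """Cut aligned input/output windows of length window_size with stride 1."""
    n_windows = inputs.shape[0] - window_size + 1
    x = np.stack([inputs[i:i + window_size] for i in range(n_windows)])
    y = np.stack([outputs[i:i + window_size] for i in range(n_windows)])
    return x, y


def split_train_val(x, y, train_ratio=0.8):
    """Split windows chronologically into a training and a validation part."""
    n_train = int(round(train_ratio * len(x)))
    return (x[:n_train], y[:n_train]), (x[n_train:], y[n_train:])


t = np.linspace(0, 1.0, 101)
inputs = np.column_stack([np.sin(2 * np.pi * 5 * t),
                          np.cos(2 * np.pi * 5 * t)])  # shape (101, 2)
outputs = t.reshape(-1, 1)                              # shape (101, 1)

x, y = sliding_windows(inputs, outputs, window_size=10)
(train_x, train_y), (val_x, val_y) = split_train_val(x, y, train_ratio=0.8)
print(x.shape, y.shape, len(train_x), len(val_x))
```

With train_ratio = 1 the validation part is empty, which corresponds to the case where only one dataloader is returned.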

give_list(window_size, keyword='testing', batch_size=32, Add_zeros=False, rnn_window=None, forecast=1, full_ds=False)[source]

Returns a list of dataloaders, one per measurement.

Parameters:
  • window_size (int) – Window size for input and output series.

  • keyword (str, optional) – Possible values are ‘training’, ‘testing’, ‘short’ or [Name]. ‘training’ gives a training and validation dataloader using all training data. ‘testing’ gives a single dataloader using all test data. ‘short’ gives a training loader with 5000 samples and a validation loader with 1000 samples. [Name] gives back an unshuffled loader corresponding to the track name defined in init. The default is ‘testing’.

  • batch_size (int, optional) – Batchsize for the dataloader. The default is 32.

  • Add_zeros (bool, optional) – Appends zeros at the beginning for Autoregressive models. The default is False.

  • rnn_window (int, optional) – Window size of the recurrent window. The default is None.

  • forecast (int, optional) – forecasting horizon

Returns:

list_loader – list of dataloaders with individual Measurements.

Return type:

list of torch.dataloader

give_Datasets(window_size, keyword='training', rnn_window=None, Add_zeros=True, forecast=1, full_ds=False, pre_comp=False)[source]

Gives List of DataSets

Parameters:
  • window_size (int) – Window size for input and output series.

  • keyword (str, optional) – Possible values are ‘training’, ‘testing’, ‘short’ or [Name]. ‘training’ gives a training and validation dataloader using all training data. ‘testing’ gives a single dataloader using all test data. ‘short’ gives a training loader with 5000 samples and a validation loader with 1000 samples. [Name] gives back an unshuffled loader corresponding to the track name defined in init. The default is ‘training’.

  • Add_zeros (bool, optional) – Appends zeros at the beginning for autoregressive models. The default is True.

  • rnn_window (int, optional) – Window size of the recurrent window. The default is None.

  • forecast (int, optional) – Forecasting horizon. The default is 1.

Returns:

set_list – list of Datasets with individual Measurements.

Return type:

list of SlidingWindow Datasets

give_dataframes(keywords)[source]

Returns a list of dataframes.

Parameters:

keywords (list of str or 'training' or 'testing') – Defines which dataframes will be returned. A list of str returns a list of the same length with the matching dataframes. ‘training’ or ‘testing’ returns the list of all training or all testing dataframes.

Returns:

dfs – List of Dataframes.

Return type:

list of dfs
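
The keyword dispatch described above can be sketched in plain Python. The function select_measurements and the name-to-measurement dictionaries are hypothetical, and strings stand in for dataframes; the error raised for unknown names is also an assumption, not documented behaviour.

```python
def select_measurements(keywords, train, test):
    """Return measurements for 'training', 'testing' or a list of names.

    train and test are dicts mapping a track name to its measurement.
    """
    if keywords == 'training':
        return list(train.values())
    if keywords == 'testing':
        return list(test.values())
    lookup = {**train, **test}
    missing = [name for name in keywords if name not in lookup]
    if missing:
        # Assumption: unknown names are treated as an error.
        raise KeyError(f"unknown measurement names: {missing}")
    return [lookup[name] for name in keywords]


# Strings stand in for the dataframes of the class example.
train = {'sine1': 'df_sine1', 'sine2': 'df_sine2'}
test = {'test': 'df_test'}

print(select_measurements('training', train, test))
print(select_measurements(['test', 'sine1'], train, test))
```

A list of names returns the measurements in the order requested, so the result always has the same length as the list passed in.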

give_dataframe(Name)[source]

Gives back Dataframe corresponding to the specific name

Parameters:

Name (str) – String that matches train or test name defined in init.

Returns:

df – DataFrame that corresponds to the given Name.

Return type:

pd.DataFrame