Measurement Handling
Created on Mon May 31 13:08:01 2021
@author: WET2RNG
- class softsensor.meas_handling.Meas_handling(train_dfs, train_names, input_sensors, output_sensors, fs, test_dfs=None, test_names=None, pre_comp_cols=None)[source]
Measurement handling class that can be used for the whole data preprocessing
- Parameters:
train_dfs (List of pd.DataFrames) – List of pd.DataFrame for every Measurement used as Training data for subsequent models.
train_names (List of str) – List of track names corresponding to train_dfs. Must have the same length as train_dfs.
input_sensors (List of str) – Input sensors for the subsequent models. The order of the strings defines the order in which the data is present inside the loader.
output_sensors (List of str) – Output sensors for the subsequent models. The order of the strings defines the order in which the data is present inside the loader.
fs (int) – Sampling frequency.
test_dfs (List of pd.DataFrames, optional) – List of pd.DataFrame for every measurement used as testing data for subsequent models. The default is None.
test_names (List of str, optional) – List of track names corresponding to test_dfs. Must have the same length as test_dfs. The default is None.
pre_comp_cols (List of str, optional) – Defines whether a precomputed solution is in the dataset and which columns hold it. Precomputed solutions might be helpful in certain training tasks. The default is None.
- Return type:
None.
Examples
Define a Measurement Handling class
>>> import softsensor.meas_handling as ms
>>> import pandas as pd
>>> import numpy as np
>>> t = np.linspace(0, 1.0, 10001)
>>> xlow = np.sin(2 * np.pi * 100 * t)
>>> xhigh = 0.2 * np.sin(2 * np.pi * 3000 * t)
>>> d = {'sine_inp': xlow + .1 * xhigh, 'cos_inp': np.cos(2 * np.pi * 50 * t), 'out': np.linspace(0, 1.0, 10001)}
>>> t = np.linspace(0, 1.0, 10001)
>>> test_df = {'sine_inp': 10*xlow + .1 * xhigh, 'cos_inp': np.cos(2 * np.pi * 50 * t), 'out': np.linspace(0, 1.0, 10001)}
>>> handler = ms.Meas_handling([pd.DataFrame(d, index=t), pd.DataFrame(d, index=t)], ['sine1', 'sine2'], ['sine_inp', 'cos_inp'], ['out'], 10000, [pd.DataFrame(test_df, index=t)], ['test'])
- Scale(scaler=StandardScaler(), predef_scaler=False)[source]
Scale the data. The scaler is fitted only on the training data. Afterwards, both training and test data are transformed.
- Parameters:
scaler (sklearn.preprocessing scaler, optional) – The default is StandardScaler().
predef_scaler (bool, optional) – If True, the scaler must already be fitted and is used as-is to scale the data. This is useful if the data originally used for fitting is no longer available or if special scaling procedures are required. The default is False.
- Return type:
None.
Examples
Based on the example from class initialisation
>>> df = handler.give_dataframe('sine1')
>>> print(np.var(df['sine_inp'].values))
>>> handler.Scale()
>>> print(np.var(df['sine_inp'].values))
1.0
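The fit-on-train, transform-both behaviour described for Scale can be illustrated without the library. The following is a minimal NumPy sketch of standard scaling, not the softsensor implementation; `fit_standard` and `transform` are hypothetical helper names and the random data is purely illustrative.

```python
import numpy as np

def fit_standard(train_arrays):
    """Compute per-column mean and std over all training measurements only."""
    stacked = np.concatenate(train_arrays, axis=0)
    mean = stacked.mean(axis=0)
    std = stacked.std(axis=0)
    std[std == 0.0] = 1.0  # avoid division by zero for constant columns
    return mean, std

def transform(arr, mean, std):
    """Apply statistics that were fitted elsewhere (here: on the training data)."""
    return (arr - mean) / std

rng = np.random.default_rng(0)
train = [rng.normal(5.0, 2.0, size=(1000, 2)), rng.normal(5.0, 2.0, size=(800, 2))]
test = [rng.normal(6.0, 3.0, size=(500, 2))]

mean, std = fit_standard(train)                      # fit on training data only
train_scaled = [transform(a, mean, std) for a in train]
test_scaled = [transform(a, mean, std) for a in test]  # reuse the training statistics
```

Because the statistics come from the training data alone, the pooled training data ends up with zero mean and unit variance, while the test data generally does not.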
- Resample(fs, kind='linear')[source]
Resample self.train_df and self.test_df to fs using Fourier method along the given axis.
- Parameters:
kind (str) – Specifies the kind of interpolation as a string or as an integer specifying the order of the spline interpolator to use. The string has to be one of ‘linear’, ‘nearest’, ‘nearest-up’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘previous’, or ‘next’. ‘zero’, ‘slinear’, ‘quadratic’ and ‘cubic’ refer to a spline interpolation of zeroth, first, second or third order; ‘previous’ and ‘next’ simply return the previous or next value of the point; ‘nearest-up’ and ‘nearest’ differ when interpolating half-integers (e.g. 0.5, 1.5) in that ‘nearest-up’ rounds up and ‘nearest’ rounds down. The default is ‘linear’. See the scipy.interpolate.interp1d documentation.
- Return type:
None.
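For kind='linear', resampling to a new sampling frequency amounts to linear interpolation onto a new, evenly spaced time grid. The sketch below shows that idea with `np.interp` as a stand-in for the linear case of interp1d; `resample_linear` is a hypothetical helper, and how the library builds its target grid is an assumption here.

```python
import numpy as np

def resample_linear(t_old, values, fs_new):
    """Linearly interpolate `values` sampled at times `t_old` onto a grid with rate fs_new."""
    duration = t_old[-1] - t_old[0]
    # Small epsilon guards against floating-point round-off in the product.
    n_new = int(np.floor(duration * fs_new + 1e-9)) + 1
    t_new = t_old[0] + np.arange(n_new) / fs_new
    return t_new, np.interp(t_new, t_old, values)

fs_old = 10000
t_old = np.arange(10001) / fs_old       # 0 s to 1 s at 10 kHz
ramp = np.linspace(0.0, 1.0, 10001)     # same shape as the 'out' column in the class example
t_new, ramp_new = resample_linear(t_old, ramp, fs_new=1000)
```

A linear ramp is reproduced exactly by linear interpolation, which makes it a convenient sanity check for the resampled length and values.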
- Filter(freq_lim)[source]
Apply a bandpass filter to the sensor data.
- Parameters:
freq_lim ((low_cut, high_cut)) – Lower and upper cut-off frequencies for the bandpass filter.
- Return type:
None.
Examples
Based on the example from class initialisation; the result should be roughly zero
>>> handler.Filter(freq_lim=(10, 700))
>>> filtered_sine = handler.train_df[0]['sine_inp'].values
>>> dev = xlow - filtered_sine
>>> print(np.mean(dev))
0.0
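The filter design used by Filter is not stated here, so the following is only a sketch of the bandpass idea using an FFT mask (an assumption, not the library's method). `bandpass_fft` is a hypothetical helper. It keeps the 100 Hz tone from the class example and removes the 3000 Hz tone.

```python
import numpy as np

def bandpass_fft(x, fs, freq_lim):
    """Zero all frequency bins outside [low_cut, high_cut] and transform back."""
    low_cut, high_cut = freq_lim
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    keep = (freqs >= low_cut) & (freqs <= high_cut)
    return np.fft.irfft(spectrum * keep, n=len(x))

fs = 10000
t = np.arange(fs) / fs  # exactly one second, so both tones fall on FFT bins
xlow = np.sin(2 * np.pi * 100 * t)
xhigh = 0.2 * np.sin(2 * np.pi * 3000 * t)

filtered = bandpass_fft(xlow + 0.1 * xhigh, fs, freq_lim=(10, 700))
dev = xlow - filtered  # residual after removing the high-frequency component
```

A brick-wall mask like this is easy to reason about but can ring on non-periodic data; a practical filter would more likely use a designed IIR or FIR filter.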
- fade_in(window_length, window_type='hanning', columns=None)[source]
Apply the defined window to the train and test data to fade in.
- Parameters:
Examples
Based on the example from class initialisation; the result should be roughly zero
>>> print(handler.test_df[0]['cos_inp'][0])
1.0
>>> handler.fade_in(10)
>>> print(handler.test_df[0]['cos_inp'][0])
0.0
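To show why the first sample drops from 1.0 to 0.0, here is a minimal sketch of a fade-in that multiplies the first `window_length` samples by the rising half of a Hann window. Using the rising half of `np.hanning` is an assumption about what window_type='hanning' means here, and this standalone `fade_in` function is hypothetical, not the method itself.

```python
import numpy as np

def fade_in(x, window_length):
    """Scale the first `window_length` samples by the rising half of a Hann window."""
    ramp = np.hanning(2 * window_length)[:window_length]  # starts at 0 and rises towards 1
    out = np.asarray(x, dtype=float).copy()
    out[:window_length] *= ramp
    return out

t = np.linspace(0, 1.0, 10001)
cos_inp = np.cos(2 * np.pi * 50 * t)   # first sample is 1.0
faded = fade_in(cos_inp, window_length=10)
```

Only the first `window_length` samples change; the rest of the signal is left untouched.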
- give_torch_loader(window_size, keyword='training', train_ratio=0.8, batch_size=32, rnn_window=None, shuffle=False, Add_zeros=True, forecast=1, full_ds=False, pre_comp=False, n_samples=[5000, 1000])[source]
Gives back a torch dataloader for training, evaluation or testing purpose
- Parameters:
window_size (int) – Window size for input and output series.
keyword (str, optional) – Possible values are ‘training’, ‘testing’, ‘short’ or [‘Name’]. ‘training’ gives a training and a validation dataloader using all training data. ‘testing’ gives a single dataloader using all test data. ‘short’ gives a training loader with 5000 samples and a validation loader with 1000 samples. [‘Name’] is a list of names, each of which must be present in either train_names or test_names. The default is ‘training’.
train_ratio (float, optional) – Only needed for keyword ‘training’. Defines the ratio of training data to validation data. The default is 0.8.
batch_size (int, optional) – Batchsize for the dataloader. The default is 32.
rnn_window (int, optional) – Window size of the recurrent window. The default is None.
shuffle (bool, optional) – Loader is shuffled or not. The default is False.
Add_zeros (bool, optional) – Only needed if rnn_window is False. Adds zeros to the beginning of the time series. The default is True.
forecast (int, optional) – Forecasting horizon. The default is 1.
pre_comp (bool, optional) – using precomputed solution or not. The default is False.
- Returns:
one dataloader if train_ratio = 1, otherwise two dataloaders for training and validation
- Return type:
torch.dataloader
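The interaction of window_size and train_ratio can be sketched with plain NumPy. This is an illustration under assumptions, not the library's code: `sliding_windows` and `split_train_val` are hypothetical helpers, and whether the library splits before or after windowing is not stated here.

```python
import numpy as np

def sliding_windows(x, window_size):
    """Return overlapping windows with shape (n_windows, window_size, n_channels)."""
    n_windows = x.shape[0] - window_size + 1
    return np.stack([x[i:i + window_size] for i in range(n_windows)])

def split_train_val(windows, train_ratio=0.8):
    """Take the first train_ratio share for training and the rest for validation."""
    n_train = int(len(windows) * train_ratio)
    return windows[:n_train], windows[n_train:]

signal = np.arange(200, dtype=float).reshape(100, 2)  # 100 samples, 2 sensors
windows = sliding_windows(signal, window_size=10)
train_w, val_w = split_train_val(windows, train_ratio=0.8)
full_w, empty_w = split_train_val(windows, train_ratio=1.0)
```

With train_ratio = 1 the validation part is empty, which matches the documented case where only one dataloader is returned.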
- give_list(window_size, keyword='testing', batch_size=32, Add_zeros=False, rnn_window=None, forecast=1, full_ds=False)[source]
Gives List of DataLoader
- Parameters:
window_size (int) – Window size for input and output series.
keyword (str, optional) – Possible values are ‘training’, ‘testing’, ‘short’ or [Name]. ‘training’ gives a training and a validation dataloader using all training data. ‘testing’ gives a single dataloader using all test data. ‘short’ gives a training loader with 5000 samples and a validation loader with 1000 samples. [Name] gives back an unshuffled loader corresponding to the track name defined in init. The default is ‘testing’.
batch_size (int, optional) – Batchsize for the dataloader. The default is 32.
Add_zeros (bool, optional) – Appends zeros at the beginning for Autoregressive models. The default is False.
rnn_window (int, optional) – Window size of the recurrent window. The default is None.
forecast (int, optional) – forecasting horizon
- Returns:
list_loader – list of dataloaders with individual Measurements.
- Return type:
list of torch.dataloader
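The idea of one unshuffled loader per measurement can be pictured as follows. This is a library-free sketch: `iter_batches` is a hypothetical stand-in for a dataloader, and the two arrays merely reuse the track names from the class example.

```python
import numpy as np

def iter_batches(x, batch_size=32):
    """Yield consecutive, unshuffled batches; the last batch may be shorter."""
    for start in range(0, len(x), batch_size):
        yield x[start:start + batch_size]

# Two measurements of different length, keyed by track name.
measurements = {'sine1': np.arange(100.0), 'sine2': np.arange(70.0)}

# One entry per measurement, so results can be evaluated track by track.
per_measurement = [list(iter_batches(m, batch_size=32)) for m in measurements.values()]
```

Keeping measurements separate avoids windows or batches that span the boundary between two unrelated tracks.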
- give_Datasets(window_size, keyword='training', rnn_window=None, Add_zeros=True, forecast=1, full_ds=False, pre_comp=False)[source]
Gives List of DataSets
- Parameters:
window_size (int) – Window size for input and output series.
keyword (str, optional) – Possible values are ‘training’, ‘testing’, ‘short’ or [Name]. ‘training’ gives a training and a validation dataloader using all training data. ‘testing’ gives a single dataloader using all test data. ‘short’ gives a training loader with 5000 samples and a validation loader with 1000 samples. [Name] gives back an unshuffled loader corresponding to the track name defined in init. The default is ‘training’.
Add_zeros (bool, optional) – Appends zeros at the beginning for autoregressive models. The default is True.
rnn_window (int, optional) – Window size of the recurrent window. The default is None.
forecast (int, optional) – Forecasting horizon. The default is 1.
- Returns:
set_list – list of Datasets with individual Measurements.
- Return type:
list of SlidingWindow Datasets
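The effect of Add_zeros on a sliding-window dataset can be sketched as left-padding the series so that every real sample gets its own window. Padding with window_size - 1 zeros is an assumption made for illustration, and `pad_with_zeros` is a hypothetical helper, not part of the library.

```python
import numpy as np

def pad_with_zeros(x, window_size):
    """Prepend window_size - 1 zero rows so the first window ends at the first real sample."""
    pad = np.zeros((window_size - 1,) + x.shape[1:], dtype=x.dtype)
    return np.concatenate([pad, x], axis=0)

window_size = 3
x = np.arange(1.0, 6.0).reshape(5, 1)        # 5 samples, 1 channel: 1, 2, 3, 4, 5
padded = pad_with_zeros(x, window_size)

n_windows_plain = x.shape[0] - window_size + 1        # windows without padding
n_windows_padded = padded.shape[0] - window_size + 1  # windows with padding
first_window = padded[:window_size, 0]                # zeros followed by the first sample
```

Without padding, the first window_size - 1 samples never appear as the last element of a window; with padding, the number of windows equals the number of samples.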