Hyperparameter Optimization

Grid search (grid_search) to efficiently optimize hyperparameters. Returns the parameters and the best model for evaluation.

Parameters:
  • data_handle (Meas Handling class) – used for getting the training and evaluation data

  • criterion (nn.Loss function) – Loss function e.g. nn.MSELoss()

  • model_type (str) – string describing the model type; currently implemented: [‘ARNN’, ‘MCDO’, ‘MVE’, ‘Sep_MVE’, ‘MVE_MCDO’, ‘MVE_Student_Forced’, ‘QR’, ‘QR_Student_Forced’, ‘BNN’, ‘MVE_BNN’, ‘RNN’].

  • parameters (dict) – dictionary of static parameters in the grid search.

  • grid_params (dict) – dict of grid parameters with grid options as list.

  • pretrained_model (str, optional) – path to pretrained model to load as base model. The default is None

  • reconfigure_criterion (bool, optional) – if True, the criterion is reconfigured with params from the grid. The default is False

  • val_criterion (nn.Loss function, optional) – val_criterion to be used for validation instead of criterion. The default is None

  • val_prediction (bool, optional) – if True, prediction on testing tracks in data_handle is used for hyperparameter evaluation

  • device (str, optional) – device to run training on. The default is ‘cpu’.

  • key (str, optional) – ‘training’ or ‘short’. ‘training’ uses the whole dataloader, ‘short’ only a subset for training. The default is ‘training’.

  • print_results (bool, optional) – if True, prints results for every epoch. The default is False

Returns:

  • result_df (pd.DataFrame) – parameters and corresponding results for each grid search step.

  • best_model (torch Model) – best performing model.

Examples

Data Preprocessing

>>> import softsensor.meas_handling as ms
>>> import numpy as np
>>> import pandas as pd
>>> t = np.linspace(0, 1.0, 101)
>>> d = {'sine_inp': np.sin(2 * np.pi * 100 * t),
         'cos_inp': np.cos(2 * np.pi * 50 * t),
         'out': np.linspace(0, 1.0, 101)}
>>> list_of_df = [pd.DataFrame(d), pd.DataFrame(d)]
>>> test_df = {'sine_inp': np.sin(2 * np.pi * 100 * t),
               'cos_inp': np.cos(2 * np.pi * 50 * t),
               'out': np.linspace(0, 1.0, 101)}
>>> test_df = [pd.DataFrame(test_df)]
>>> handler = ms.Meas_handling(list_of_df, train_names=['sine1', 'sine2'],
                               input_sensors=['sine_inp', 'cos_inp'],
                               output_sensors=['out'], fs=100,
                               test_dfs=test_df, test_names=['test'])

Optimize an ARNN

>>> from softsensor.hyperparameter_optimization import grid_search
>>> import torch.nn as nn
>>> grid_params = {'lr': [0.0001, 0.001],
                   'optimizer': ['Adam', 'SGD']}
>>> model_type = 'ARNN'
>>> model_params = {'input_channels': 2,
                    'pred_size': 1,
                    'window_size': 50,
                    'rnn_window': 10,
                    'max_epochs': 3,
                    'patience': 3,
                    'hidden_size': [8],
                    }
>>> criterion = nn.MSELoss()
>>> df, model = grid_search(handler, criterion,
                            model_type, model_params, grid_params,
                            val_prediction=True)
run 1/4 finishes with loss 0.06058402732014656 and parameters {'lr': 0.0001, 'optimizer': 'Adam'}, time=0s
run 2/4 finishes with loss 0.155076265335083 and parameters {'lr': 0.0001, 'optimizer': 'SGD'}, time=0s
run 3/4 finishes with loss 0.14059486985206604 and parameters {'lr': 0.001, 'optimizer': 'Adam'}, time=0s
run 4/4 finishes with loss 0.542301595211029 and parameters {'lr': 0.001, 'optimizer': 'SGD'}, time=0s
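
Since result_df is a plain pd.DataFrame with one row per grid point, the best configuration can be read off directly. A minimal sketch, assuming the evaluation loss is stored in a column named 'loss' (the exact column name is an assumption):

>>> best_row = df.sort_values('loss').iloc[0]  # 'loss' column name is an assumption
>>> best_row['lr'], best_row['optimizer']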

Optimize an ARNN with Mean Variance Estimation

>>> from softsensor.hyperparameter_optimization import grid_search
>>> import torch.nn as nn
>>> from softsensor.losses import DistributionalMSELoss, HeteroscedasticNLL
>>> model_type = 'MVE'
>>> model_params = {'input_channels': 2,
                    'pred_size': 1,
                    'window_size': 10,
                    'rnn_window': 10,
                    'max_epochs': 3,
                    'patience': 3,
                    'hidden_size': [8],
                    'var_hidden_size': [8],
                    }
>>> df, model = grid_search(handler, DistributionalMSELoss(),
                            model_type, model_params, grid_params,
                            val_prediction=True, val_criterion=HeteroscedasticNLL())
run 1/4 finishes with loss 0.07786498963832855 and parameters {'lr': 0.0001, 'optimizer': 'Adam'}, time=0s
run 2/4 finishes with loss -0.11223804950714111 and parameters {'lr': 0.0001, 'optimizer': 'SGD'}, time=0s
run 3/4 finishes with loss 0.06112978607416153 and parameters {'lr': 0.001, 'optimizer': 'Adam'}, time=0s
run 4/4 finishes with loss -0.05090484023094177 and parameters {'lr': 0.001, 'optimizer': 'SGD'}, time=0s
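
Any hyperparameter can in principle be moved from the static model_params dict into grid_params, in which case the search runs over the Cartesian product of all option lists. An illustrative sketch (not run here), assuming window_size may be gridded like any other parameter:

>>> grid_params = {'lr': [0.0001, 0.001],
                   'optimizer': ['Adam', 'SGD'],
                   'window_size': [10, 50]}  # 2 * 2 * 2 = 8 runs in total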

Hyperparameter optimization (hyperopt_search) using Bayesian optimization. Returns the parameters and the best model for evaluation. Algorithm: https://www.researchgate.net/publication/216816964_Algorithms_for_Hyper-Parameter_Optimization

Parameters:
  • data_handle (Meas Handling class) – used for getting the training and evaluation data

  • optimizer (str) – algorithm used for training the models; possibilities are [‘Adam’, ‘SGD’].

  • criterion (nn.Loss function) – Loss function e.g. nn.MSELoss()

  • model_type (str) – string describing the model type; currently implemented: [‘ARNN’, ‘MCDO’, ‘MVE’, ‘Sep_MVE’, ‘MVE_MCDO’, ‘MVE_Student_Forced’, ‘QR’, ‘QR_Student_Forced’, ‘BNN’, ‘MVE_BNN’, ‘RNN’].

  • parameters (dict) – dictionary of static parameters in the search.

  • grid_params (dict) – dict of search parameters, with a hyperopt distribution for each parameter.

  • max_iterations (int, optional) – number of iterations. The default is 3

  • pretrained_model (str, optional) – path to pretrained model to load as base model. The default is None

  • reconfigure_criterion (bool, optional) – if True, the criterion is reconfigured with params from the grid. The default is False

  • val_criterion (nn.Loss function, optional) – val_criterion to be used for validation instead of criterion. The default is None

  • val_prediction (bool, optional) – if True, prediction on testing tracks in data_handle is used for hyperparameter evaluation

  • device (str, optional) – device to run training on. The default is ‘cpu’.

  • key (str, optional) – ‘training’ or ‘short’. ‘training’ uses the whole dataloader, ‘short’ only a subset for training. The default is ‘training’.

  • print_results (bool, optional) – if True, prints results for every epoch. The default is False

Returns:

  • result_df (pd.DataFrame) – parameters and corresponding results for each search step.

  • best_model (torch Model) – best performing model.

Examples

Data Preprocessing

>>> import softsensor.meas_handling as ms
>>> import numpy as np
>>> import pandas as pd
>>> t = np.linspace(0, 1.0, 101)
>>> d = {'sine_inp': np.sin(2 * np.pi * 100 * t),
         'cos_inp': np.cos(2 * np.pi * 50 * t),
         'out': np.linspace(0, 1.0, 101)}
>>> list_of_df = [pd.DataFrame(d), pd.DataFrame(d)]
>>> test_df = {'sine_inp': np.sin(2 * np.pi * 100 * t),
               'cos_inp': np.cos(2 * np.pi * 50 * t),
               'out': np.linspace(0, 1.0, 101)}
>>> test_df = [pd.DataFrame(test_df)]
>>> handler = ms.Meas_handling(list_of_df, train_names=['sine1', 'sine2'],
                               input_sensors=['sine_inp', 'cos_inp'],
                               output_sensors=['out'], fs=100,
                               test_dfs=test_df, test_names=['test'])

Optimize an ARNN

>>> from softsensor.hyperparameter_optimization import hyperopt_search
>>> import torch.nn as nn
>>> from hyperopt import hp
>>> grid_params = {'lr': hp.uniform('lr', 1e-5, 1e-3),
                   'activation': hp.choice('activation', ['relu', 'sine'])}
>>> model_type = 'ARNN'
>>> model_params = {'input_channels': 2,
                    'pred_size': 1,
                    'window_size': 50,
                    'rnn_window': 10,
                    'max_epochs': 3,
                    'patience': 3,
                    'hidden_size': [8],
                    'optimizer': 'SGD'
                    }
>>> criterion = nn.MSELoss()
>>> df, model = hyperopt_search(handler, criterion, model_type, model_params,
                                grid_params, max_iterations=3,
                                val_prediction=True)
100%|██████████| 3/3 [00:00<00:00, 10.76trial/s, best loss: 0.25191113352775574]
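
Beyond hp.uniform and hp.choice, other standard hyperopt distributions can be mixed into the search space. An illustrative sketch (not run here); note that hp.loguniform expects bounds in log space, and hp.quniform yields floats, so whether an integer parameter such as window_size accepts them directly is an assumption:

>>> import numpy as np
>>> from hyperopt import hp
>>> grid_params = {'lr': hp.loguniform('lr', np.log(1e-5), np.log(1e-2)),
                   'activation': hp.choice('activation', ['relu', 'sine']),
                   'window_size': hp.quniform('window_size', 10, 100, 10)}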

Optimize an ARNN with Mean Variance Estimation

>>> from softsensor.hyperparameter_optimization import hyperopt_search
>>> import torch.nn as nn
>>> from softsensor.losses import DistributionalMSELoss
>>> model_type = 'MVE'
>>> grid_params = {'lr': hp.uniform('lr', 1e-5, 1e-3),
                   'activation': hp.choice('activation', ['relu', 'sine'])}
>>> model_params = {'input_channels': 2,
                    'pred_size': 1,
                    'window_size': 10,
                    'rnn_window': 10,
                    'max_epochs': 3,
                    'patience': 3,
                    'hidden_size': [8],
                    'var_hidden_size': [8],
                    'optimizer': 'SGD'
                    }
>>> criterion = DistributionalMSELoss()
>>> df, model = hyperopt_search(handler, criterion, model_type, model_params,
                                grid_params, max_iterations=3,
                                val_prediction=True)
100%|██████████| 3/3 [00:00<00:00,  7.42trial/s, best loss: -0.4127456843852997]

Random search (random_search) to efficiently optimize hyperparameters. Returns the parameters and the best model for evaluation.

Parameters:
  • data_handle (Meas Handling class) – used for getting the training and evaluation data

  • criterion (nn.Loss function) – Loss function e.g. nn.MSELoss()

  • model_type (str) – string describing the model type; currently implemented: [‘ARNN’, ‘MCDO’, ‘MVE’, ‘Sep_MVE’, ‘MVE_MCDO’, ‘MVE_Student_Forced’, ‘QR’, ‘QR_Student_Forced’, ‘BNN’, ‘MVE_BNN’, ‘RNN’].

  • parameters (dict) – dictionary of static parameters in the search.

  • grid_params (dict) – dict of search parameters, with options given as a list or a scipy.stats distribution for each parameter. See the examples below.

  • max_iterations (int, optional) – number of iterations. The default is 3

  • pretrained_model (str, optional) – path to pretrained model to load as base model. The default is None

  • reconfigure_criterion (bool, optional) – if True, the criterion is reconfigured with params from the grid. The default is False

  • val_criterion (nn.Loss function, optional) – val_criterion to be used for validation instead of criterion. The default is None

  • val_prediction (bool, optional) – if True, prediction on testing tracks in data_handle is used for hyperparameter evaluation

  • device (str, optional) – device to run training on. The default is ‘cpu’.

  • key (str, optional) – ‘training’ or ‘short’. ‘training’ uses the whole dataloader, ‘short’ only a subset for training. The default is ‘training’.

  • print_results (bool, optional) – if True, prints results for every epoch. The default is False

Returns:

  • result_df (pd.DataFrame) – parameters and corresponding results for each search step.

  • best_model (torch Model) – best performing model.

Examples

Data Preprocessing

>>> import softsensor.meas_handling as ms
>>> import numpy as np
>>> import pandas as pd
>>> t = np.linspace(0, 1.0, 101)
>>> d = {'sine_inp': np.sin(2 * np.pi * 100 * t),
         'cos_inp': np.cos(2 * np.pi * 50 * t),
         'out': np.linspace(0, 1.0, 101)}
>>> list_of_df = [pd.DataFrame(d), pd.DataFrame(d)]
>>> test_df = {'sine_inp': np.sin(2 * np.pi * 100 * t),
               'cos_inp': np.cos(2 * np.pi * 50 * t),
               'out': np.linspace(0, 1.0, 101)}
>>> test_df = [pd.DataFrame(test_df)]
>>> handler = ms.Meas_handling(list_of_df, train_names=['sine1', 'sine2'],
                               input_sensors=['sine_inp', 'cos_inp'],
                               output_sensors=['out'], fs=100,
                               test_dfs=test_df, test_names=['test'])

Optimize an ARNN

>>> from softsensor.hyperparameter_optimization import random_search
>>> import scipy.stats as stats
>>> import torch.nn as nn
>>> grid_params = {'lr': stats.loguniform(1e-4, 1e-1),
                   'optimizer': ['Adam', 'SGD']}
>>> model_type = 'ARNN'
>>> model_params = {'input_channels': 2,
                    'pred_size': 1,
                    'window_size': 50,
                    'rnn_window': 10,
                    'max_epochs': 3,
                    'patience': 3,
                    'hidden_size': [8],
                    }
>>> criterion = nn.MSELoss()
>>> df, model = random_search(handler, criterion,
                              model_type, model_params, grid_params,
                              max_iterations=4, val_prediction=True)
run 1/4 finishes with loss 0.1794743835926056 and parameters {'lr': 0.0009228490219458666, 'optimizer': 'Adam'}, time=0s
run 2/4 finishes with loss 0.16934655606746674 and parameters {'lr': 0.00040497789386739904, 'optimizer': 'SGD'}, time=0s
run 3/4 finishes with loss 0.09789121896028519 and parameters {'lr': 0.0033839123896820455, 'optimizer': 'Adam'}, time=0s
run 4/4 finishes with loss 0.0789249911904335 and parameters {'lr': 0.00022746356317548106, 'optimizer': 'Adam'}, time=0s
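
For random_search, grid_params may mix plain lists, which are sampled from uniformly, with scipy.stats distributions, which are sampled via their rvs() method. An illustrative sketch; that an integer-valued parameter such as rnn_window can likewise be drawn from stats.randint is an assumption, not shown in the original examples:

>>> grid_params = {'lr': stats.loguniform(1e-4, 1e-1),  # continuous, log-uniform
                   'optimizer': ['Adam', 'SGD'],         # sampled uniformly from the list
                   'rnn_window': stats.randint(5, 20)}   # assumption: integer draws are supported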

Optimize an ARNN with Mean Variance Estimation

>>> from softsensor.hyperparameter_optimization import random_search
>>> import torch.nn as nn
>>> from softsensor.losses import DistributionalMSELoss
>>> model_type = 'MVE'
>>> model_params = {'input_channels': 2,
                    'pred_size': 1,
                    'window_size': 10,
                    'rnn_window': 10,
                    'max_epochs': 3,
                    'patience': 3,
                    'hidden_size': [8],
                    'var_hidden_size': [8],
                    }
>>> df, model = random_search(handler, DistributionalMSELoss(),
                              model_type, model_params, grid_params, max_iterations=4,
                              val_prediction=True, val_criterion=DistributionalMSELoss())
run 1/4 finishes with loss 0.11428132653236389 and parameters {'lr': 0.00036427297324473465, 'optimizer': 'SGD'}, time=0s
run 2/4 finishes with loss 0.6519314646720886 and parameters {'lr': 0.0023511068144571627, 'optimizer': 'SGD'}, time=0s
run 3/4 finishes with loss 0.14868338406085968 and parameters {'lr': 0.0003292040157588834, 'optimizer': 'Adam'}, time=0s
run 4/4 finishes with loss 0.0994466096162796 and parameters {'lr': 0.01498455784730249, 'optimizer': 'Adam'}, time=0s
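
The best model returned by each search is an ordinary torch model, so it can be serialized and, for instance, fed back in via the pretrained_model argument for a warm-started search. A minimal sketch; the exact serialization format expected by pretrained_model is not specified here, so saving the full model with torch.save is an assumption:

>>> import torch
>>> torch.save(model, 'best_model.pt')  # serialize the full model (format is an assumption)
>>> df, model = random_search(handler, DistributionalMSELoss(),
                              model_type, model_params, grid_params,
                              max_iterations=4, val_prediction=True,
                              pretrained_model='best_model.pt')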