Hyperparameter tuning
Machine learning is on a roll and has become an integral part of many niches such as robotics, e-commerce, and spam filtering. Experts use machine learning to develop models on training data and later deploy them on unseen data to check their performance. Every model has parameters associated with it that can be estimated from the data. However, there are some that impact the overall performance of the model but cannot be estimated from the data. Such parameters are referred to as hyperparameters.
Experts and data scientists agree that deploying machine learning solutions becomes extremely challenging when the models are not optimized. To minimize errors, the models have to be adjusted regularly by tuning these hyperparameters, which heavily influence their overall behaviour.
What are hyperparameters?
The concept of a hyperparameter comes from statistics and is best understood with an example. Suppose an expert is modelling data that follows a Gaussian distribution and wants to estimate its mean and variance. If the mean itself is treated as a random quantity drawn from another distribution, say a Poisson distribution with parameter lambda, then lambda is a hyperparameter. In machine learning, hyperparameters are more like settings that are tuned to control the algorithm's overall behaviour. They sit outside the model: they have a direct influence on it without being learned inside it.
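The distinction can be made concrete with a small sketch. In the snippet below, the sample mean is a parameter estimated from the data, while the shrinkage weight `lam` is a hyperparameter: it shapes the estimate but is chosen by the practitioner rather than learned from the data. The shrinkage estimator and the prior mean of 0 are illustrative assumptions, not part of the original example.

```python
import random
import statistics

def shrunk_mean(data, prior_mean, lam):
    """Estimate the mean of `data`, shrunk toward `prior_mean`.

    The sample mean is a *parameter* estimated from the data.
    `lam` (between 0 and 1) is a *hyperparameter*: it controls
    how strongly the estimate is pulled toward the prior, and
    it is set before fitting, not learned from the data.
    """
    sample_mean = statistics.mean(data)
    return (1 - lam) * sample_mean + lam * prior_mean

random.seed(0)
data = [random.gauss(5.0, 2.0) for _ in range(100)]

# Same data, different hyperparameter values -> different estimates.
est = shrunk_mean(data, prior_mean=0.0, lam=0.1)
```

Changing `lam` changes the estimate even though the data and the estimation procedure stay the same, which is exactly what makes it a hyperparameter.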
How are hyperparameters tuned?
Defining hyperparameters can be flexible. Some are well established (for instance, the learning rate of a model), while others are very specific to one model; if an expert is optimizing a particular model, there will be hyperparameters specific to it. There are also cases where a 'setting' in the model (say, one that controls the model's capacity) is a hyperparameter. Hyperparameters cannot be learned from the training data; if they were, the model would overfit and perform poorly on unseen data.
In such cases, the input data set is split in an 80/20 ratio, where 80% is training data and 20% is validation data (as is common in classic machine learning workflows). The validation set is used to tune the hyperparameters: candidate values are compared by their performance on it before the final model is trained.
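A minimal sketch of this workflow, assuming the shrunk-mean model from earlier as the thing being tuned: shuffle the data, hold out 20% for validation, and pick the hyperparameter value with the lowest validation error. The candidate grid and the squared-error criterion are illustrative choices.

```python
import random
import statistics

def split_80_20(data, seed=0):
    """Shuffle and split into 80% training / 20% validation."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def validation_error(train, valid, lam):
    """Mean squared error on the validation set when every point
    is predicted by the training mean shrunk toward 0 by `lam`."""
    pred = (1 - lam) * statistics.mean(train)
    return sum((v - pred) ** 2 for v in valid) / len(valid)

random.seed(1)
data = [random.gauss(3.0, 1.0) for _ in range(200)]
train, valid = split_80_20(data)

# Try each candidate hyperparameter value and keep the one with
# the lowest error on the held-out validation set.
candidates = [0.0, 0.1, 0.3, 0.5]
best_lam = min(candidates, key=lambda lam: validation_error(train, valid, lam))
```

The key point is that the hyperparameter is selected on data the model never trained on, so the choice reflects generalization rather than memorization.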
Though hyperparameters are specific to each model, there are certain classics that everyone involved in machine learning should watch out for, such as the learning rate, the number of hidden units, and random states. Hyperparameters are thus crucial to any model and its functioning, and they have to be tuned and optimized to get the desired results.
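One common way to tune several hyperparameters at once is an exhaustive grid search: every combination is evaluated on the validation set and the best one is kept. A minimal sketch, reusing the toy gradient-descent regression from above; the particular grid values and the tiny data sets are illustrative assumptions.

```python
from itertools import product

def fit_slope(xs, ys, learning_rate, epochs):
    """Fit y ~ w * x by gradient descent on squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w

def mse(xs, ys, w):
    """Mean squared error of the slope-only model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

train_x, train_y = [1.0, 2.0, 3.0], [2.1, 3.9, 6.2]
valid_x, valid_y = [4.0, 5.0], [8.0, 10.1]

# Grid search: train once per (learning_rate, epochs) combination
# and score each trained model on the validation set.
grid = list(product([0.01, 0.05], [50, 200]))
best = min(
    grid,
    key=lambda hp: mse(valid_x, valid_y, fit_slope(train_x, train_y, *hp)),
)
```

Grid search is simple and exhaustive, but its cost grows multiplicatively with each added hyperparameter, which is why random search and more adaptive strategies are often preferred for larger search spaces.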