Machine learning **algorithms/models** can have many **hyperparameters**, and finding the best combination is itself a search problem. **Hyperparameter optimization**, or **tuning**, is the problem of **searching** for a set of optimal hyperparameters for a learning algorithm.

**Grid search** is a tuning technique that simply performs an exhaustive search through a manually specified subset of the **hyperparameter space** of a learning algorithm.
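The idea can be sketched as nested iteration over every combination in the grid. The snippet below is a minimal illustration, not scikit-learn's implementation; the `score` function is a hypothetical stand-in for cross-validated model evaluation.

```python
# Minimal sketch of exhaustive grid search over a hypothetical
# two-hyperparameter grid; score() stands in for model evaluation.
from itertools import product

def score(alpha, fit_intercept):
    # Hypothetical scoring function used only for this illustration:
    # it prefers a small alpha and fitting an intercept.
    return (1.0 if fit_intercept else 0.5) - alpha

param_grid = {'alpha': [1, 0.1, 0.01], 'fit_intercept': [True, False]}

best_score, best_params = float('-inf'), None
# product(...) enumerates every combination: 3 alphas x 2 flags = 6 candidates
for values in product(*param_grid.values()):
    params = dict(zip(param_grid.keys(), values))
    s = score(**params)
    if s > best_score:
        best_score, best_params = s, params

print(best_params)
```

`GridSearchCV` does exactly this enumeration, but scores each candidate with cross-validation and refits the best estimator at the end.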

In this example, we use a **Ridge Regression** model, where **alpha** is a **hyperparameter** that denotes regularization strength (it must be a positive float). **Regularization** improves the conditioning of the problem and reduces the variance of the estimates.
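To see what alpha does, here is a quick standalone illustration on synthetic data (not the Pima dataset from this recipe): as alpha grows, the penalty shrinks the coefficient vector toward zero.

```python
# Illustration: larger alpha -> stronger regularization -> smaller weights.
# The data here is synthetic, generated only for this demonstration.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.randn(50, 3)
# True coefficients are [3, -2, 1] plus a little noise
y = X @ np.array([3.0, -2.0, 1.0]) + rng.randn(50) * 0.1

for alpha in [0.01, 1.0, 100.0]:
    model = Ridge(alpha=alpha).fit(X, y)
    # The L2 norm of the fitted coefficients shrinks as alpha increases
    print(alpha, np.linalg.norm(model.coef_))
```

This shrinkage is the variance reduction mentioned above: heavily penalized coefficients move less in response to noise in the training data.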

Link: scikit-learn: Ridge documentation

This **recipe** includes the following topics:

- Load the **classification problem** dataset (Pima Indians) from github
- Split columns into the usual feature columns (X) and target column (Y)
- Create a **param_grid** dictionary with parameter names
- Instantiate the algorithm: **Ridge**
- Instantiate the **GridSearchCV** class with estimator and param_grid
- Find the mean cross-validated **score**
- Find the (set of) **parameters** that achieved the best score

```
# import modules
import pandas as pd
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
# read data file from github
# dataframe: pimaDf
gitFileURL = 'https://raw.githubusercontent.com/andrewgurung/data-repository/master/pima-indians-diabetes.data.csv'
cols = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
pimaDf = pd.read_csv(gitFileURL, names = cols)
# convert into numpy array for scikit-learn
pimaArr = pimaDf.values
# Let's split columns into the usual feature columns(X) and target column(Y)
# Y represents the target 'class' column whose value is either '0' or '1'
X = pimaArr[:, 0:8]
Y = pimaArr[:, 8]
# create a param_grid dictionary with parameter names
alphas = np.array([1, 0.1, 0.01, 0.001, 0.0001, 0])
param_grid = {'alpha': alphas}
# instantiate the regression algorithm: Ridge()
model = Ridge()
# perform a Grid Search to find the best (combination) hyperparameters
grid = GridSearchCV(estimator=model, param_grid=param_grid)
# call fit() to train the grid search using X and Y data
grid.fit(X, Y)
# Find the mean cross-validated score of the best_estimator
bestScore = grid.best_score_
# Find the (set of) parameter that achieved the best score
bestAlpha = grid.best_estimator_.alpha
print("Best Score: %.5f, Best Alpha(Hyperparameter): %f" % (bestScore, bestAlpha))
```

```
Best Score: 0.27962, Best Alpha(Hyperparameter): 1.000000
```