Matplotlib: Histogram

Histograms are a quick way to check the distribution of the data in each column.
The shape of a histogram is usually one of:
– Gaussian (normal distribution) or
– Skewed (left or right)
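
To make the difference between these shapes concrete, here is a minimal sketch that puts a number on skewness with pandas' `skew()` method. It uses synthetic data rather than the dataset from this recipe; the column names (`gaussian`, `right_skewed`, `left_skewed`), the sample size and the random seed are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic data (an assumption for illustration, not the Pima dataset)
rng = np.random.default_rng(0)

samples = pd.DataFrame({
    # Symmetric, bell-shaped values
    "gaussian": rng.normal(loc=0.0, scale=1.0, size=5000),
    # Exponential values have a long tail to the right
    "right_skewed": rng.exponential(scale=1.0, size=5000),
})

# Mirroring the right-skewed column gives a long tail to the left
samples["left_skewed"] = -samples["right_skewed"]

# skew() returns one value per column:
# near 0 = roughly symmetric, positive = right skew, negative = left skew
skewness = samples.skew()
print(skewness)
```

A value close to zero suggests a roughly symmetric distribution, while a clearly positive or negative value points to a right or left skew. Calling `samples.hist()` on the same DataFrame should show the matching shapes.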

Note: Many machine learning algorithms assume that the input data follows a Gaussian distribution.
This recipe includes the following topics:

  • Draw a Histogram for a particular column
  • Draw Histograms for all columns
  • Increase the histogram’s size


# import modules
import pandas as pd
import matplotlib.pyplot as plt

fileGitURL = 'https://raw.githubusercontent.com/andrewgurung/data-repository/master/pima-indians-diabetes.data.csv'

# define column names
cols = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']

# load file as a Pandas DataFrame
pimaDf = pd.read_csv(fileGitURL, names=cols)

# Histogram of a single column 'mass'
pimaDf['mass'].hist()

# Histograms of all columns
pimaDf.hist(figsize=(10,10))

plt.show()
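
As a possible variation on the recipe above, the sketch below shows how `figsize` and `bins` can be passed to both the single-column and the all-columns call. It runs on a small synthetic DataFrame so that it works without a network connection; `demoDf`, its column names and the chosen sizes are assumptions for illustration, not part of the original dataset.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in for the real data (an assumption for illustration)
rng = np.random.default_rng(42)
demoDf = pd.DataFrame({
    "a": rng.normal(loc=30.0, scale=5.0, size=500),
    "b": rng.exponential(scale=2.0, size=500),
})

# Single column: figsize sets the figure size in inches,
# bins sets the number of bars
ax = demoDf["a"].hist(bins=20, figsize=(8, 6))
ax.set_title("Histogram of column 'a'")
fig_single = ax.get_figure()

# All columns: pandas opens a new figure with one subplot per column
axes = demoDf.hist(bins=20, figsize=(12, 6))
fig_all = axes.flatten()[0].get_figure()
```

Add `plt.show()` at the end to display both figures. A larger `figsize` gives each subplot more room, and a higher `bins` value shows finer detail at the cost of noisier bars.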

Figure: Matplotlib histogram for the ‘mass’ column
Figure: Matplotlib histograms for all columns
