Calculate skew of columns

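Skewness measures how asymmetric a distribution is compared with the symmetric bell curve of a normal distribution. A positive value means the longer tail is on the right; a negative value means it is on the left.
A value near 0 indicates a roughly symmetric distribution.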
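To make the sign convention concrete, here is a minimal pure-Python sketch of a bias-corrected sample skewness. The function name sample_skewness and the toy lists are hypothetical, chosen only for illustration. The formula is the adjusted Fisher-Pearson coefficient, which should correspond to the estimator pandas reports, though that is stated here as an assumption rather than taken from the recipe.

```python
import math


def sample_skewness(values):
    """Bias-corrected sample skewness (adjusted Fisher-Pearson coefficient)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = sum(values) / n
    # second and third central moments
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        return 0.0  # all values identical: no asymmetry
    g1 = m3 / m2 ** 1.5  # uncorrected skewness
    return math.sqrt(n * (n - 1)) / (n - 2) * g1  # small-sample correction


# hypothetical toy data, not taken from the Pima dataset
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]   # one large value stretches the right tail
left_tailed = [-x for x in right_tailed]   # mirror image: tail on the left
symmetric = [1, 2, 3, 4, 5]

print(sample_skewness(right_tailed))  # positive
print(sample_skewness(left_tailed))   # negative
print(sample_skewness(symmetric))     # zero
```

Mirroring the data flips the sign but keeps the magnitude, which is why the sign alone tells you which side the longer tail is on.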

This recipe includes the following topics:

  • Calculate skew


# import module
import pandas as pd

# show 3 decimal places, matching the output below
pd.set_option('display.precision', 3)

fileGitURL = 'https://raw.githubusercontent.com/andrewgurung/data-repository/master/pima-indians-diabetes.data.csv'

# define column names
cols = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']

# load file as a Pandas DataFrame
pimaDf = pd.read_csv(fileGitURL, names=cols)

# calculate skewness of each column (axis=0)
# ignore missing values (skipna=True)
skew = pimaDf.skew(axis=0, skipna=True)
print(skew)
preg     0.902
plas     0.174
pres    -1.844
skin     0.109
test     2.272
mass    -0.429
pedi     1.920
age      1.130
class    0.635
dtype: float64
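
One way to read the output above is to rank the columns by the magnitude of their skew and flag the strongly skewed ones. The sketch below rebuilds the result as a Series from the printed values so that it runs on its own. The cutoff of 1.0 is an assumed rule of thumb, not something the recipe defines, and the names ranked and highly_skewed are illustrative.

```python
import pandas as pd

# skew values copied from the printed output above
skew = pd.Series({
    'preg': 0.902, 'plas': 0.174, 'pres': -1.844,
    'skin': 0.109, 'test': 2.272, 'mass': -0.429,
    'pedi': 1.920, 'age': 1.130, 'class': 0.635,
})

# order columns from most to least skewed, keeping the original sign
ranked = skew.reindex(skew.abs().sort_values(ascending=False).index)

# assumed rule of thumb: treat |skew| > 1 as strongly skewed
threshold = 1.0
highly_skewed = ranked[ranked.abs() > threshold]
print(highly_skewed)
```

With this cutoff, test, pedi, pres and age are flagged. Of these, pres is the only one with its longer tail on the left.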
