Pandas DataFrame Basics

A DataFrame is a two-dimensional, tabular data structure in which both the rows and the columns can be labeled.
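As a quick illustration of that structure, the minimal sketch below builds a tiny table and inspects its dimensions and labels. The name `demo` and its row and column labels are hypothetical, chosen only for this example.

```python
import pandas as pd

# Hypothetical two-row table, used only to illustrate the structure
demo = pd.DataFrame(
    [[1, 2], [3, 4]],
    index=['row_a', 'row_b'],     # row labels
    columns=['col_x', 'col_y'],   # column labels
)

print(demo.ndim)            # number of dimensions
print(demo.shape)           # (number of rows, number of columns)
print(list(demo.index))     # row labels
print(list(demo.columns))   # column labels
```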

This recipe includes the following topics:

  • Create a Pandas DataFrame
  • Access entire DataFrame
  • Access single column data
  • Access top 3 rows
  • Find unique values in a given column


# import modules
import pandas as pd
import numpy as np

# create row data
salary_ds = [70000, 85000, 150000]
salary_web = [65000, 90000, 120000]
salary_x = [75000, 81000, 110000]
salary_y = [85000, 93000, 100000]
salary_z = [65000, 75000, 90000]
salary_a = [55000, 68000, 990000]
salary_b = [70000, 91000, 110000]
salaries = np.array([salary_ds, salary_web, salary_x, salary_y, salary_z, salary_a, salary_b])

# define row names (used as the DataFrame index)
rownames = ['Data Science', 'Web Development', 'Career X', 'Career Y', 'Career Z', 'Career A', 'Career B']

# define column names
colnames = ['1 year', '3 years', '5 years']

# create DataFrame
df = pd.DataFrame(salaries, index=rownames, columns=colnames)

# display DataFrame
print('Display entire DataFrame')
print(df)

# Select single column
print('Display single column')
print(df['1 year'])

# Select first 3 rows
print('Display first 3 rows')
print(df.head(3))

# Find unique values in a given column; value_counts() also reports how often each one occurs
print('Display unique values in a given column')
print(df['5 years'].value_counts())

Output:

Display entire DataFrame
                 1 year  3 years  5 years
Data Science      70000    85000   150000
Web Development   65000    90000   120000
Career X          75000    81000   110000
Career Y          85000    93000   100000
Career Z          65000    75000    90000
Career A          55000    68000   990000
Career B          70000    91000   110000

Display single column
Data Science       70000
Web Development    65000
Career X           75000
Career Y           85000
Career Z           65000
Career A           55000
Career B           70000
Name: 1 year, dtype: int64

Display first 3 rows
                 1 year  3 years  5 years
Data Science      70000    85000   150000
Web Development   65000    90000   120000
Career X          75000    81000   110000

Display unique values in a given column
110000    2
990000    1
100000    1
90000     1
120000    1
150000    1
Name: 5 years, dtype: int64
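
`value_counts()` reports how often each value occurs. If only the distinct values themselves are needed, `Series.unique()` and `Series.nunique()` are a possible alternative. The sketch below is a minimal illustration and not part of the original recipe. It rebuilds the '5 years' column as a standalone Series so that it runs on its own; the name `five_years` is an assumption introduced here.

```python
import pandas as pd

# Same '5 years' figures as in the recipe, rebuilt so this sketch is self-contained
five_years = pd.Series(
    [150000, 120000, 110000, 100000, 90000, 990000, 110000],
    index=['Data Science', 'Web Development', 'Career X', 'Career Y',
           'Career Z', 'Career A', 'Career B'],
    name='5 years',
)

# unique() returns the distinct values in order of first appearance
unique_values = five_years.unique()

# nunique() returns the number of distinct values
unique_count = five_years.nunique()

print(unique_values)
print(unique_count)
```

With the recipe's own DataFrame, the equivalent calls would be `df['5 years'].unique()` and `df['5 years'].nunique()`.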
