Data Visualization with Pandas: A Complete Guide to Plotting in Python

Last updated 3 weeks, 6 days ago

Tags: Python, Pandas

Visualizing data is an essential part of data analysis. Python’s Pandas library has built-in plotting capabilities, so you can create useful charts from your data in a few lines of code.

In this guide, you’ll learn how to use Pandas for plotting, understand different types of plots, and customize your visualizations. We’ll also walk through a full working example.


What is Pandas Plotting?

Pandas leverages Matplotlib behind the scenes to provide simple and effective plotting through the .plot() method, available on Series and DataFrame objects.
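To make the Matplotlib connection concrete, the short sketch below shows that .plot() hands back a Matplotlib Axes object for both a Series and a DataFrame. The variable names and sample values are hypothetical and only for illustration.

```python
import matplotlib.axes
import pandas as pd

# Hypothetical sample values, for illustration only.
temps = pd.Series([12.5, 14.0, 15.5], index=['Mon', 'Tue', 'Wed'], name='Temp')

# Series.plot() draws on a Matplotlib Axes and returns it.
ax_series = temps.plot(title='Daily temperature')
ax_series.set_ylabel('°C')
print(isinstance(ax_series, matplotlib.axes.Axes))

# DataFrame.plot() works the same way; each numeric column becomes one line.
frame = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 2, 1]})
ax_frame = frame.plot(title='Two columns, two lines')
print(len(ax_frame.get_lines()))
```

Because the return value is a regular Axes, anything Matplotlib can do to an Axes can be applied to a Pandas plot afterwards.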


Prerequisites

Before you begin, install Pandas and Matplotlib if you haven’t already:

pip install pandas matplotlib

Getting Started

Let’s begin by importing necessary libraries and creating a sample dataset:

import pandas as pd
import matplotlib.pyplot as plt

data = {
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'Sales': [250, 300, 400, 350, 500],
    'Expenses': [200, 220, 250, 270, 300]
}

df = pd.DataFrame(data)
print(df)

1. Line Plot

A line plot is the default when you call .plot() without a kind argument.

df.plot(x='Month', y=['Sales', 'Expenses'], kind='line', marker='o', title='Sales and Expenses Over Time')
plt.ylabel('Amount in USD')
plt.grid(True)
plt.show()

✅ Use for showing trends over time.
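Since line is the default kind, the kind argument can be left out, and the same chart is also available through the df.plot.line() accessor. A minimal sketch with the sample data from this guide (the names ax_default and ax_accessor are just illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'Sales': [250, 300, 400, 350, 500],
    'Expenses': [200, 220, 250, 270, 300],
})

# No kind given: pandas falls back to a line plot.
ax_default = df.plot(x='Month', y=['Sales', 'Expenses'])

# Accessor form of the same chart.
ax_accessor = df.plot.line(x='Month', y=['Sales', 'Expenses'])

# Both Axes hold one line per plotted column.
print(len(ax_default.get_lines()), len(ax_accessor.get_lines()))
```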


2. Bar Plot

df.plot(x='Month', y=['Sales', 'Expenses'], kind='bar', title='Monthly Sales vs Expenses')
plt.ylabel('USD')
plt.show()

✅ Great for comparing categories.
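When the combined total per category matters more than the side-by-side comparison, the bars can be stacked with stacked=True. The sketch below reuses the sample data and checks that the top of each stacked bar equals Sales plus Expenses; the totals variable is only there for that check.

```python
import pandas as pd

df = pd.DataFrame({
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'Sales': [250, 300, 400, 350, 500],
    'Expenses': [200, 220, 250, 270, 300],
})

# stacked=True places the Expenses bars on top of the Sales bars.
ax = df.plot(x='Month', y=['Sales', 'Expenses'], kind='bar', stacked=True,
             title='Monthly Sales and Expenses (stacked)')
ax.set_ylabel('USD')

# The top edge of each upper bar is the combined amount for that month.
totals = (df['Sales'] + df['Expenses']).tolist()
tops = [bar.get_y() + bar.get_height() for bar in ax.containers[1]]
print(totals, tops)
```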


3. Horizontal Bar Plot

df.plot(x='Month', y=['Sales', 'Expenses'], kind='barh', title='Horizontal Bar Chart')
plt.show()

✅ Useful for long category names.
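As an illustration of the long-label case, the sketch below uses made-up department names and values (they are not part of the sample dataset). Sorting before plotting puts the bars in order of size, which usually makes a horizontal chart easier to scan.

```python
import pandas as pd

# Hypothetical categories with long names, for illustration only.
spend = pd.Series(
    [120, 340, 210],
    index=['Research and Development',
           'Sales and Marketing',
           'General and Administrative'],
)

# Sort ascending so the largest bar ends up at the top of the chart.
ordered = spend.sort_values()
ax = ordered.plot(kind='barh', title='Spend by Department')
ax.set_xlabel('USD (thousands)')

# Bar lengths follow the sorted order.
widths = [bar.get_width() for bar in ax.patches]
print(widths)
```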


4. Histogram

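import numpy as np

# 200 synthetic income values from a normal distribution (mean 5000, std 1000).
# A separate name is used so the earlier `data` dictionary is not overwritten.
income = pd.DataFrame({'Income': np.random.normal(5000, 1000, 200)})
income.plot(kind='hist', bins=20, title='Income Distribution')
plt.xlabel("Income")
plt.show()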

✅ Use to visualize frequency distributions.
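Random data changes on every run, so a histogram built from it will look slightly different each time. One way to make it repeatable, sketched below, is a seeded NumPy generator; the seed value 42 is arbitrary. The example also uses np.histogram to get the bin counts as numbers, which is handy for checking what the chart shows.

```python
import numpy as np
import pandas as pd

# A seeded generator gives the same synthetic sample on every run.
rng = np.random.default_rng(seed=42)
income = pd.DataFrame({'Income': rng.normal(5000, 1000, 200)})

ax = income.plot(kind='hist', bins=20, title='Income Distribution')
ax.set_xlabel('Income')

# The same binning as plain numbers: 20 counts and 21 bin edges.
counts, edges = np.histogram(income['Income'], bins=20)
print(counts.sum(), len(edges))
```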


5. Pie Chart

sales_by_product = pd.Series([350, 200, 150, 300], index=['A', 'B', 'C', 'D'])
sales_by_product.plot(kind='pie', autopct='%1.1f%%', title='Product Sales Share')
plt.ylabel('')
plt.show()

✅ Good for showing proportions.
⚠️ Use sparingly; not ideal for comparisons.
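One way to keep a pie chart readable when there are many small categories is to fold them into a single "Other" slice before plotting. The sketch below is an illustration only: the product values and the 5% cutoff are hypothetical choices, not a rule.

```python
import pandas as pd

# Hypothetical sales figures with three very small categories.
shares = pd.Series([350, 200, 150, 30, 20, 10],
                   index=['A', 'B', 'C', 'D', 'E', 'F'])

threshold = 0.05  # assumed cutoff: slices below 5% of the total are merged
fraction = shares / shares.sum()
small = fraction < threshold

# Keep the large categories and add one combined slice for the rest.
grouped = shares[~small].copy()
grouped['Other'] = shares[small].sum()

ax = grouped.plot(kind='pie', autopct='%1.1f%%',
                  title='Product Sales Share')
ax.set_ylabel('')
print(grouped.to_dict())
```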


6. Box Plot

df[['Sales', 'Expenses']].plot(kind='box', title='Statistical Summary')
plt.show()

✅ Useful for spotting outliers and summarizing a distribution (median, quartiles, range).
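The numbers behind a box plot can also be computed directly. The sketch below assumes Matplotlib's default whisker rule (points more than 1.5 times the interquartile range beyond the quartiles are drawn as outliers) and adds one hypothetical sixth row with an unusually high Sales value so that there is an outlier to find.

```python
import pandas as pd

# Sample data plus one hypothetical extra row (Sales 900) for illustration.
values = pd.DataFrame({
    'Sales': [250, 300, 400, 350, 500, 900],
    'Expenses': [200, 220, 250, 270, 300, 310],
})

ax = values.plot(kind='box', title='Statistical Summary')

# Quartiles and the 1.5 * IQR fences, computed per column.
q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Values outside the fences are the points a box plot marks as outliers.
outliers = {
    col: values.loc[(values[col] < lower[col]) | (values[col] > upper[col]),
                    col].tolist()
    for col in values.columns
}
print(outliers)
```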


7. Area Plot

df.plot(x='Month', y=['Sales', 'Expenses'], kind='area', alpha=0.5, title='Area Plot')
plt.show()

✅ Shows cumulative values or part-to-whole relationships. Areas are stacked by default; pass stacked=False to overlay them instead.
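The difference between stacked and overlaid areas is easiest to see side by side. In the sketch below, the stacked chart reaches the combined total (800 in May), while the overlaid chart only reaches the largest single value (500). The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'Sales': [250, 300, 400, 350, 500],
    'Expenses': [200, 220, 250, 270, 300],
})

# Default behaviour: Expenses is drawn on top of Sales.
ax_stacked = df.plot(x='Month', y=['Sales', 'Expenses'], kind='area',
                     alpha=0.5, title='Stacked (default)')

# stacked=False draws each series from zero, so the areas overlap.
ax_overlaid = df.plot(x='Month', y=['Sales', 'Expenses'], kind='area',
                      stacked=False, title='Overlaid')

# The y-axis range reflects the difference.
print(ax_stacked.get_ylim(), ax_overlaid.get_ylim())
```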


Customizing Plots

Since Pandas uses Matplotlib under the hood, you can adjust the current chart with pyplot calls placed after .plot() and before plt.show():

plt.title("Custom Title")
plt.xlabel("Month")
plt.ylabel("Values")
plt.legend(loc='upper left')
plt.grid(True)
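The same customizations can be applied through the Axes object that .plot() returns, which avoids relying on whichever chart happens to be "current". A minimal sketch with the sample data:

```python
import pandas as pd

df = pd.DataFrame({
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'Sales': [250, 300, 400, 350, 500],
    'Expenses': [200, 220, 250, 270, 300],
})

# Keep the returned Axes and customize it directly.
ax = df.plot(x='Month', y=['Sales', 'Expenses'], marker='o')
ax.set_title("Custom Title")
ax.set_xlabel("Month")
ax.set_ylabel("Values")
ax.legend(loc='upper left')
ax.grid(True)
```

This style is especially helpful once a script creates more than one figure.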

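You can also save the plot. Call this before plt.show(), because the figure is typically closed once its window is dismissed and the saved file may then be blank: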

plt.savefig('plot.png')
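A slightly fuller sketch of saving is shown below. It saves through the figure that owns the Axes, sets a resolution, and trims extra whitespace. The file name sales_plot.png and the temporary-directory location are hypothetical choices for the example.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'Sales': [250, 300, 400, 350, 500],
    'Expenses': [200, 220, 250, 270, 300],
})

ax = df.plot(x='Month', y=['Sales', 'Expenses'], marker='o',
             title='Sales and Expenses Over Time')

# Hypothetical output location, chosen so the example leaves no stray files.
out_path = os.path.join(tempfile.gettempdir(), 'sales_plot.png')

# Save via the figure that owns the Axes; dpi controls the resolution and
# bbox_inches='tight' trims surrounding whitespace.
ax.figure.savefig(out_path, dpi=150, bbox_inches='tight')
print(os.path.exists(out_path))
```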

Full Working Example

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Sample data
data = {
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'Sales': [250, 300, 400, 350, 500],
    'Expenses': [200, 220, 250, 270, 300]
}
df = pd.DataFrame(data)

# Line plot
df.plot(x='Month', y=['Sales', 'Expenses'], kind='line', marker='o', title='Sales and Expenses Over Time')
plt.ylabel('Amount')
plt.grid(True)
plt.show()

# Bar plot
df.plot(x='Month', y=['Sales', 'Expenses'], kind='bar', title='Monthly Comparison')
plt.show()

# Histogram
hist_data = pd.DataFrame({'Revenue': np.random.normal(10000, 1500, 100)})
hist_data.plot(kind='hist', bins=15, title='Revenue Distribution')
plt.show()

# Pie chart
pie_data = pd.Series([30, 45, 25], index=['Product A', 'Product B', 'Product C'])
pie_data.plot(kind='pie', autopct='%1.1f%%', title='Product Share')
plt.ylabel('')
plt.show()

Tips for Better Plotting

Tip | Why It Matters
--- | ---
Always label axes and titles | Improves readability
Use appropriate chart types | Enhances clarity
Don't overcrowd plots | Focus on key insights
Use colors and markers consistently | Better visual appeal
Use plt.tight_layout() | Fix layout spacing issues
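Several of these tips can be combined in one figure. The sketch below draws two charts side by side with the same colors for each series, labels both axes, and calls tight_layout() so titles and labels do not collide. The figure size and color choices are illustrative assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'Sales': [250, 300, 400, 350, 500],
    'Expenses': [200, 220, 250, 270, 300],
})

# One figure with two Axes; pandas draws into them via the ax argument.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Same color per series in both charts (Sales first, then Expenses).
colors = ['tab:blue', 'tab:orange']

df.plot(x='Month', y=['Sales', 'Expenses'], kind='line', marker='o',
        color=colors, ax=axes[0], title='Trend')
df.plot(x='Month', y=['Sales', 'Expenses'], kind='bar',
        color=colors, ax=axes[1], title='Comparison')

for ax in axes:
    ax.set_xlabel('Month')
    ax.set_ylabel('USD')

# Adjust spacing so labels and titles do not overlap.
fig.tight_layout()
```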

⚠️ Common Pitfalls

Problem | Fix
--- | ---
Plot doesn't show | Add plt.show() at the end
Data not showing correctly | Check that the x and y column names are correct
Pie chart looks bad | Use it for a few categories only
Overlapping labels | Rotate the x-axis tick labels with plt.xticks(rotation=45)
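For the overlapping-labels case, the rotation can also be set in the plotting call itself through the rot parameter of .plot(), as in this minimal sketch:

```python
import pandas as pd

df = pd.DataFrame({
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'Sales': [250, 300, 400, 350, 500],
    'Expenses': [200, 220, 250, 270, 300],
})

# rot sets the tick label rotation (in degrees) for this chart.
ax = df.plot(x='Month', y=['Sales', 'Expenses'], kind='bar', rot=45,
             title='Monthly Sales vs Expenses')

rotations = [label.get_rotation() for label in ax.get_xticklabels()]
print(rotations)
```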

Summary

Pandas makes data visualization quick and easy with .plot(), which generates a variety of charts directly from a Series or DataFrame. For more advanced visuals, you can build on this with Seaborn or Plotly.

Most Common Plot Types:

Chart Type | Use Case
--- | ---
Line | Trend over time
Bar | Category comparison
Histogram | Distribution
Pie | Proportions
Box Plot | Statistical summary
Area | Cumulative data