Python Matplotlib Histograms – A Complete Guide


A histogram is a powerful plot for visualizing the distribution of numerical data. It displays data by grouping values into bins and showing the frequency of values within each bin.

With Matplotlib, creating and customizing histograms is straightforward and flexible. This article covers everything from basic usage to advanced customization.


What Is a Histogram?

A histogram plots the frequency of values falling within certain ranges (called bins). It's ideal for:

  • Understanding data distribution

  • Detecting skewness, modality, and outliers

  • Comparing datasets

Unlike bar charts (which show categorical data), histograms show the distribution of continuous numerical data.
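To make the idea of bins concrete, here is a minimal sketch in plain Python that counts values per bin by hand. The helper name count_into_bins and the sample values are made up for illustration; they are not part of Matplotlib.

```python
def count_into_bins(values, low, high, n_bins):
    """Count how many values fall into each of n_bins equal-width bins.

    Bins are half-open [left, right), except the last one, which also
    includes the upper edge `high`.
    """
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if v < low or v > high:
            continue  # ignore values outside the covered range
        # A value equal to `high` would land one past the end, so clamp it.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


sample = [1, 2, 2, 3, 5, 8, 9, 10]
print(count_into_bins(sample, low=0, high=10, n_bins=5))  # [1, 3, 1, 0, 3]
```

A histogram is simply a bar drawn for each of these counts, placed over the range its bin covers.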


Creating a Basic Histogram in Matplotlib

import matplotlib.pyplot as plt

data = [22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27]

plt.hist(data)
plt.title("Basic Histogram")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()

By default, Matplotlib uses 10 equal-width bins that span the range of the data.
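One way to confirm the default is to inspect what the call returns. The sketch below uses the Axes interface (plt.subplots and ax.hist) instead of plt.hist so that nothing needs to be displayed; the default of 10 comes from the rcParams setting hist.bins, so it will differ if that setting has been changed.

```python
import matplotlib.pyplot as plt

data = [22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27]

fig, ax = plt.subplots()
counts, bin_edges, patches = ax.hist(data)  # no bins argument given

print(len(counts))                    # number of bins that were used
print(bin_edges[0], bin_edges[-1])    # first and last edge: min and max of data
```

There is always one more edge than there are bins, because each bin needs a left and a right boundary.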


Customizing Bins

Set Number of Bins

plt.hist(data, bins=5)

Custom Bin Edges

bins = [0, 20, 40, 60, 80, 100]
plt.hist(data, bins=bins)
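To see exactly which values land in which custom bin, the counts can be computed without drawing anything. This sketch assumes NumPy is available and uses numpy.histogram, which applies the same binning rule: each bin includes its left edge and excludes its right edge, except the last bin, which includes both.

```python
import numpy as np

data = [22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27]
edges = [0, 20, 40, 60, 80, 100]

counts, returned_edges = np.histogram(data, bins=edges)

# Pair each bin's left and right edge with its count.
for left, right, count in zip(returned_edges[:-1], returned_edges[1:], counts):
    print(left, "to", right, ":", count)

print(counts.tolist())  # [3, 4, 5, 2, 1]
```

Note that the value 20 is counted in the 20 to 40 bin, not in the 0 to 20 bin.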

Customizing Appearance

Bar Color

plt.hist(data, color='skyblue')

Edge Color

plt.hist(data, edgecolor='black')

Transparency

plt.hist(data, color='orange', alpha=0.7)

Full Example with Labels and Grid

data = [22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27]

plt.hist(data, bins=8, color='teal', edgecolor='black')
plt.title("Data Distribution")
plt.xlabel("Data Range")
plt.ylabel("Frequency")
plt.grid(axis='y', linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()

Histogram with Density (Probability) Plot

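With density=True, bar heights are scaled so that the total area of all bars equals 1. The y-axis then shows probability density rather than raw counts: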

plt.hist(data, bins=8, density=True, color='coral', edgecolor='black')
plt.title("Probability Distribution")
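A common source of confusion is that density heights are not probabilities; only height times bin width is. The following sketch, which assumes NumPy and uses numpy.histogram for the arithmetic, checks that the bar areas sum to 1 and recovers the share of data points in each bin.

```python
import numpy as np

data = [22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27]

density, edges = np.histogram(data, bins=8, density=True)
widths = np.diff(edges)               # width of each bin

proportions = density * widths        # share of data points per bin
area = float(proportions.sum())       # total area under the bars

print(round(area, 6))
print(proportions.round(3))
```

If the bins are narrower than 1 unit, individual density values can exceed 1 while the total area still equals 1.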

Comparing Multiple Histograms

Use label and alpha for overlaid histograms:

import numpy as np

data1 = np.random.normal(60, 10, 1000)
data2 = np.random.normal(70, 15, 1000)

plt.hist(data1, bins=20, alpha=0.5, label='Dataset 1')
plt.hist(data2, bins=20, alpha=0.5, label='Dataset 2')
plt.title("Histogram Comparison")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.legend()
plt.show()
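When each call receives bins=20, Matplotlib computes the edges separately from each dataset, so the two sets of bars do not line up. One possible refinement, sketched here under the assumption that a common binning is wanted, is to compute the edges once from the combined data with numpy.histogram_bin_edges and pass them to both calls. The seeded generator is only there to make the figure reproducible.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=0)
data1 = rng.normal(60, 10, 1000)
data2 = rng.normal(70, 15, 1000)

# One set of edges covering both datasets, so the bars align.
combined = np.concatenate([data1, data2])
shared_edges = np.histogram_bin_edges(combined, bins=20)

fig, ax = plt.subplots()
counts1, _, _ = ax.hist(data1, bins=shared_edges, alpha=0.5, label="Dataset 1")
counts2, _, _ = ax.hist(data2, bins=shared_edges, alpha=0.5, label="Dataset 2")
ax.set_xlabel("Value")
ax.set_ylabel("Frequency")
ax.legend()
```

Call plt.show() afterwards to display the figure.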

Histogram Orientation: Horizontal

plt.hist(data, orientation='horizontal')

Cumulative Histogram

plt.hist(data, bins=8, cumulative=True, color='slateblue')
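With cumulative=True, each bar shows the running total of all bins up to and including its own. The sketch below, assuming NumPy, compares the values returned by the cumulative histogram with a running sum computed by hand.

```python
import numpy as np
import matplotlib.pyplot as plt

data = [22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27]

fig, ax = plt.subplots()
cumulative_counts, edges, _ = ax.hist(data, bins=8, cumulative=True)

# The same numbers, computed by hand from the per-bin counts.
plain_counts, _ = np.histogram(data, bins=8)
running_total = np.cumsum(plain_counts)

print(cumulative_counts[-1] == len(data))  # the last bar reaches the total
```

Combining cumulative=True with density=True makes the last bar end at 1, which gives an empirical cumulative distribution.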

Accessing Histogram Data

plt.hist returns three values: the bin counts, the bin edges, and the drawn bar patches. You can unpack them directly:

counts, bin_edges, _ = plt.hist(data, bins=5)
print("Counts:", counts)
print("Bin Edges:", bin_edges)
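The returned values are useful beyond printing. As an illustrative sketch (the variable names and the highlight color are arbitrary choices), the code below derives bin centers from the edges, finds the most populated bin, and recolors its bar through the third return value.

```python
import numpy as np
import matplotlib.pyplot as plt

data = [22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27]

fig, ax = plt.subplots()
counts, bin_edges, patches = ax.hist(data, bins=5, edgecolor="black")

# Midpoint of each bin, handy for annotations or line overlays.
bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2

# Index of the bin with the highest count (first one if there is a tie).
busiest = int(np.argmax(counts))

# The third return value holds the bar artists, so one can be restyled.
patches[busiest].set_facecolor("crimson")

print(f"Busiest bin: {bin_edges[busiest]:.1f} to {bin_edges[busiest + 1]:.1f}")
```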

Histogram with KDE (Using Seaborn)

For a smoothed density curve (kernel density estimate) drawn over the bars, use Seaborn's histplot with kde=True:

import seaborn as sns

sns.histplot(data, kde=True)
plt.title("Histogram with KDE")
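To show what a KDE curve represents, here is a rough sketch that builds one with NumPy alone and draws it over a density histogram. It is not how Seaborn computes its curve: the Gaussian kernel and the rule-of-thumb bandwidth used here are assumptions chosen for simplicity, so the result will not match sns.histplot exactly.

```python
import numpy as np
import matplotlib.pyplot as plt

data = np.array([22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27],
                dtype=float)

# Rule-of-thumb bandwidth (an assumption; other choices are equally valid).
bandwidth = data.std(ddof=1) * len(data) ** (-1 / 5)

# Place one Gaussian bump on each data point and average them on a grid.
grid = np.linspace(data.min() - 3 * bandwidth, data.max() + 3 * bandwidth, 400)
z = (grid[:, None] - data[None, :]) / bandwidth
kde = np.exp(-0.5 * z**2).sum(axis=1) / (len(data) * bandwidth * np.sqrt(2 * np.pi))

fig, ax = plt.subplots()
# density=True puts the bars on the same scale as the curve.
ax.hist(data, bins=8, density=True, alpha=0.5, edgecolor="black")
ax.plot(grid, kde)
ax.set_ylabel("Density")
```

A smaller bandwidth follows the data more closely, while a larger one gives a smoother curve.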

Tips for Using Histograms

| Tip | Benefit |
| --- | --- |
| Use custom bins | Reveals more or less granularity |
| Use density=True | For probability distributions |
| Add edgecolor | Improves bar clarity |
| Use alpha when overlaying | Helps distinguish histograms |
| Use KDE | Smooths out the distribution |

⚠️ Common Pitfalls

| Pitfall | Solution |
| --- | --- |
| Too few/many bins | Try different bins values |
| Misinterpreted histogram | Add grid, labels, and legends |
| Overlaid histograms hard to read | Use alpha and different colors |
| Histogram instead of bar chart | Use plt.bar() for categorical data |
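For the first pitfall, it can help to compare how many bins different settings produce before plotting. The sketch below assumes NumPy and uses numpy.histogram_bin_edges; the strings "sturges", "fd", and "auto" are bin-selection rules that NumPy provides, and plt.hist accepts the same strings for its bins argument. The simulated values are only an example.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
values = rng.normal(loc=50, scale=10, size=500)

bin_counts = {}
for rule in (5, 50, "sturges", "fd", "auto"):
    edges = np.histogram_bin_edges(values, bins=rule)
    bin_counts[rule] = len(edges) - 1   # one fewer bin than edges
    print(f"bins={rule!r}: {bin_counts[rule]} bins")
```

Very few bins hide the shape of the distribution, while very many produce noisy, mostly empty bars; the automatic rules are a reasonable starting point rather than a final answer.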

Summary Table

| Parameter | Description | Example |
| --- | --- | --- |
| x | Input data (first positional argument) | plt.hist(data) |
| bins | Bin count or edges | bins=10 or bins=[0,20...] |
| color | Fill color | color='skyblue' |
| edgecolor | Outline color | edgecolor='black' |
| alpha | Transparency (0 to 1) | alpha=0.6 |
| density | Normalize so the bar areas sum to 1 | density=True |
| orientation | 'vertical' or 'horizontal' | orientation='horizontal' |
| cumulative | Accumulate counts across bins | cumulative=True |

✅ Complete Example

import matplotlib.pyplot as plt
import numpy as np

# Simulated exam scores
scores = np.random.normal(75, 10, 1000)

plt.figure(figsize=(10, 6))
plt.hist(scores, bins=20, color='steelblue', edgecolor='black', alpha=0.7)
plt.title("Distribution of Exam Scores")
plt.xlabel("Score")
plt.ylabel("Number of Students")
plt.grid(axis='y', linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()
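A possible variation on the example above marks the mean with a vertical line and fixes the random seed so the figure is the same on every run. It uses the Axes interface; the seed value, line style, and label text are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=42)   # fixed seed for reproducibility
scores = rng.normal(75, 10, 1000)      # simulated exam scores
mean_score = scores.mean()

fig, ax = plt.subplots(figsize=(10, 6))
ax.hist(scores, bins=20, color="steelblue", edgecolor="black", alpha=0.7)

# Dashed vertical line at the mean, labeled for the legend.
ax.axvline(mean_score, color="black", linestyle="--",
           label=f"Mean = {mean_score:.1f}")

ax.set_title("Distribution of Exam Scores")
ax.set_xlabel("Score")
ax.set_ylabel("Number of Students")
ax.grid(axis="y", linestyle="--", alpha=0.6)
ax.legend()
fig.tight_layout()
```

Call plt.show() to display the figure, or fig.savefig with a file name of your choice to write it to disk.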

Conclusion

Histograms in Matplotlib provide a clear view of how your data is distributed. With simple customizations and flexible parameters, you can craft professional and insightful visualizations for both basic and advanced use cases.


What's Next?

  • Try stacked histograms with stacked=True (or histtype='barstacked')

  • Combine histograms with box plots for richer analysis

  • Use interactive widgets with ipywidgets for bin control
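As a starting point for the first suggestion, here is a small sketch of a stacked histogram. It passes two datasets as a list and sets stacked=True so the second group's bars sit on top of the first; the group names and simulated values are placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=3)
group_a = rng.normal(60, 10, 500)
group_b = rng.normal(70, 15, 500)

fig, ax = plt.subplots()
# A list of arrays is binned with one shared set of edges.
counts, edges, _ = ax.hist(
    [group_a, group_b],
    bins=15,
    stacked=True,
    label=["Group A", "Group B"],
    edgecolor="black",
)
ax.set_xlabel("Value")
ax.set_ylabel("Frequency")
ax.legend()
```

With stacked=True the returned counts are running totals across the groups, so the last row gives the combined height of each stacked bar.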