Python Matplotlib Scatter – A Complete Guide


A scatter plot displays values for two variables as a collection of points, which makes patterns, correlations, and distributions in a dataset easy to spot.

This article walks you through creating and customizing scatter plots in Matplotlib with code examples, advanced options, tips, and common pitfalls.


What Is a Scatter Plot?

A scatter plot (also known as an XY plot) shows the relationship between two continuous variables. Each point in the plot represents a single data observation with an X (horizontal) and Y (vertical) coordinate.


Basic Scatter Plot in Matplotlib

To create a scatter plot, use:

matplotlib.pyplot.scatter(x, y)

Example

import matplotlib.pyplot as plt

x = [5, 7, 8, 7, 2, 17, 2, 9]
y = [99, 86, 87, 88, 100, 86, 103, 87]

plt.scatter(x, y)
plt.title("Basic Scatter Plot")
plt.xlabel("X Axis")
plt.ylabel("Y Axis")
plt.grid(True)
plt.show()
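The same plot can also be built through Matplotlib's object-oriented interface, which tends to be easier to extend once a figure holds more than one set of axes. The following is a minimal sketch reusing the data above; the names fig, ax, and points are only conventional choices, not anything the library requires.

```python
import matplotlib.pyplot as plt

x = [5, 7, 8, 7, 2, 17, 2, 9]
y = [99, 86, 87, 88, 100, 86, 103, 87]

# Create a figure and a single set of axes explicitly
fig, ax = plt.subplots()

# Axes.scatter accepts the same arguments as plt.scatter
points = ax.scatter(x, y)
ax.set_title("Basic Scatter Plot")
ax.set_xlabel("X Axis")
ax.set_ylabel("Y Axis")
ax.grid(True)
```

Call plt.show() afterwards to display the figure.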

Customizing Scatter Plot Appearance

Change Marker Color

plt.scatter(x, y, color='red')

Change Marker Size

Use the s parameter:

plt.scatter(x, y, s=100)  # s is the marker area in points^2

You can also pass one size per point, as a sequence with the same length as x and y:

sizes = [20, 50, 100, 80, 60, 150, 40, 70]
plt.scatter(x, y, s=sizes)
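In practice the sizes usually come from a third data column rather than a hand-written list. One possible approach, sketched below, is to rescale that column linearly into a readable range of marker areas. The population values and the scale_sizes helper are hypothetical and exist only for illustration; they are not part of Matplotlib.

```python
import numpy as np
import matplotlib.pyplot as plt

x = [5, 7, 8, 7, 2, 17, 2, 9]
y = [99, 86, 87, 88, 100, 86, 103, 87]
# Hypothetical third variable to encode as marker size
population = np.array([1200, 5400, 300, 9800, 2500, 7600, 4100, 650])


def scale_sizes(values, min_size=20, max_size=300):
    """Linearly map values onto marker areas (points^2) in [min_size, max_size]."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # All values are equal: give every marker the midpoint size
        return np.full(values.shape, (min_size + max_size) / 2)
    return min_size + (values - values.min()) / span * (max_size - min_size)


sizes = scale_sizes(population)

fig, ax = plt.subplots()
ax.scatter(x, y, s=sizes)
```

Because s is an area, this mapping makes marker area (not width) proportional to the data, which is generally the less misleading choice.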

Change Marker Style

Use the marker parameter:

Marker | Description
-------|-----------------
'o'    | Circle (default)
'^'    | Triangle Up
's'    | Square
'*'    | Star
'D'    | Diamond

plt.scatter(x, y, marker='^')
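A common use of marker styles is to tell groups apart when color alone is not enough. The sketch below plots two made-up groups with different markers and adds a legend; the group names and values are assumptions for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical measurements for two groups, as (x values, y values)
group_a = ([1, 2, 3, 4], [2.0, 2.9, 4.1, 5.2])
group_b = ([1, 2, 3, 4], [4.8, 4.1, 2.7, 2.2])

fig, ax = plt.subplots()

# One scatter call per group, each with its own marker and legend label
ax.scatter(*group_a, marker='o', label="Group A")
ax.scatter(*group_b, marker='^', label="Group B")
ax.legend()
```

Each scatter call draws its whole group with a single marker shape, so separate calls are needed when shapes differ.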

Set Marker Transparency

Use the alpha parameter, from 0 (fully transparent) to 1 (fully opaque):

plt.scatter(x, y, alpha=0.6)
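Transparency pays off most when many points overlap. As a rough illustration, the sketch below draws 2,000 synthetic points with small, mostly transparent markers so that dense regions show up darker; the data and the seed are arbitrary assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)  # seeded so the example is repeatable

# Synthetic, heavily overlapping data: 2,000 points around the origin
x = rng.normal(size=2000)
y = rng.normal(size=2000)

fig, ax = plt.subplots()

# Small, mostly transparent markers: overlapping points add up to darker areas
points = ax.scatter(x, y, s=10, alpha=0.2)
```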

Color Mapping Based on Data

Pass a numeric array as c and a colormap name as cmap to color points by a third variable.

import matplotlib.pyplot as plt
import numpy as np

x = np.random.rand(50)
y = np.random.rand(50)
colors = np.random.rand(50)  # Third variable, values between 0 and 1

plt.scatter(x, y, c=colors, cmap='viridis')
plt.colorbar()  # Show the color scale
plt.title("Color-Mapped Scatter Plot")
plt.show()
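By default the colormap stretches between the smallest and largest value in c. When several plots should share one color scale, the vmin and vmax arguments can pin the range instead. The sketch below assumes a hypothetical temperature variable and an arbitrary 0 to 40 range.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # seeded so the example is repeatable
x = rng.random(50)
y = rng.random(50)
temperature = rng.uniform(15, 30, 50)  # hypothetical third variable

fig, ax = plt.subplots()

# vmin/vmax fix the ends of the color scale regardless of the data range
points = ax.scatter(x, y, c=temperature, cmap='viridis', vmin=0, vmax=40)
fig.colorbar(points, ax=ax, label="Temperature")
```

Passing the object returned by scatter to colorbar ties the color scale to that specific set of points.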

Adding Labels and Titles

plt.title("Scatter Example")
plt.xlabel("Age")
plt.ylabel("Income")

Full Example: Multi-Dimensional Scatter Plot

import matplotlib.pyplot as plt
import numpy as np

# Random data
x = np.random.randint(10, 100, 50)
y = np.random.randint(20, 200, 50)
colors = np.random.rand(50)
sizes = np.random.randint(50, 300, 50)

plt.figure(figsize=(10, 6))
scatter = plt.scatter(x, y, c=colors, s=sizes, alpha=0.7, cmap='plasma', edgecolors='black')

plt.colorbar(scatter, label="Color Intensity")
plt.title("Multi-Dimensional Scatter Plot")
plt.xlabel("X Value")
plt.ylabel("Y Value")
plt.grid(True)
plt.show()

Scatter Plot with Annotations

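You can label individual points with plt.text, called after plt.scatter and before plt.show. The + 1 offsets are in data units and keep each label clear of its marker:

for xi, yi in zip(x, y):
    plt.text(xi + 1, yi + 1, f"({xi}, {yi})")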
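An alternative worth knowing is annotate, which can offset a label in points rather than data units, so the gap between marker and label does not depend on the axis scale. The following is a sketch using the data from the basic example; the offset of 5 points and the font size are arbitrary choices.

```python
import matplotlib.pyplot as plt

x = [5, 7, 8, 7, 2, 17, 2, 9]
y = [99, 86, 87, 88, 100, 86, 103, 87]

fig, ax = plt.subplots()
ax.scatter(x, y)

labels = []
for xi, yi in zip(x, y):
    # xy is the data point; xytext is an offset from it, measured in points
    label = ax.annotate(
        f"({xi}, {yi})",
        xy=(xi, yi),
        xytext=(5, 5),
        textcoords="offset points",
        fontsize=8,
    )
    labels.append(label)
```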

Comparison: plt.plot() vs plt.scatter()

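Feature          | plt.plot()                          | plt.scatter()
-----------------|-------------------------------------|---------------------------
Joins points?    | Yes, by default (lines between)     | No (points only)
Vary size/color? | No, one style for the whole series  | Yes, per point via s and c
Suited for       | Line graphs                         | Correlation/distribution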
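The difference also shows in what each function returns. The sketch below draws the same made-up data both ways, side by side: plot gives back Line2D objects with one style for the whole series, while scatter gives back a PathCollection whose sizes and colors can differ per point.

```python
import matplotlib.pyplot as plt
from matplotlib.collections import PathCollection
from matplotlib.lines import Line2D

x = [1, 2, 3, 4]
y = [3, 1, 4, 2]

fig, (ax_line, ax_points) = plt.subplots(1, 2, figsize=(8, 3))

# plot() returns a list of Line2D objects; the marker style applies to every point
(line,) = ax_line.plot(x, y, marker='o')
ax_line.set_title("plt.plot()")

# scatter() returns a PathCollection; s and c take one value per point
points = ax_points.scatter(x, y, s=[20, 60, 120, 200], c=[0.1, 0.4, 0.7, 1.0])
ax_points.set_title("plt.scatter()")
```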

Tips for Using Scatter Plots

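Tip                       | Why It Helps
--------------------------|-------------------------------------------------------------
Use transparency (alpha)  | Keeps overlapping points from turning into clutter
Use color maps (cmap)     | Shows a third variable through color
Add edgecolors            | Separates overlapping or light-colored markers
Use plt.grid(True)        | Makes values easier to read off the axes
Use plt.tight_layout()    | Keeps titles and axis labels from being clipped or overlapping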

Common Pitfalls

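Pitfall                    | Solution
---------------------------|-----------------------------------------------------------------
Overlapping points         | Use a smaller s, or set alpha below 1 for transparency
Marker size too big/small  | Adjust s; it is an area in points^2, so doubling the marker width takes 4x the s value
Points not colored         | Pass a numeric array as c; cmap only applies when c holds numbers, not color names
Marker edges not showing   | Use edgecolors='black'
Missing colorbar           | Add with plt.colorbar() after the scatter call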

Summary Table

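Parameter  | Description                                    | Example
-----------|------------------------------------------------|--------------------
x, y       | Coordinates of the points                      | plt.scatter(x, y)
s          | Marker size (area in points^2)                 | s=50
c          | Marker color(s), or values to map through cmap | c=values
alpha      | Transparency (0 to 1)                          | alpha=0.6
marker     | Marker shape                                   | marker='^'
cmap       | Colormap applied when c is numeric             | cmap='viridis'
edgecolors | Marker border color                            | edgecolors='black'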

Conclusion

The scatter plot is an essential visualization tool in Python for revealing correlations, distributions, and clusters. With Matplotlib, you can create scatter plots in a single call, customize their appearance, and encode additional variables through marker color and size.


What’s Next?

  • Try scatter plots with Pandas DataFrames

  • Overlay regression lines using libraries like Seaborn or NumPy

  • Create interactive scatter plots using Plotly
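As a starting point for the regression-line idea above, here is a minimal sketch that fits a straight line with NumPy's polyfit and draws it over the points. The data is synthetic (a linear trend plus noise, with an arbitrary seed), so the numbers carry no meaning beyond illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)  # seeded so the example is repeatable

# Synthetic data: a linear trend (slope 2.5, intercept 4) plus Gaussian noise
x = rng.uniform(0, 10, 40)
y = 2.5 * x + 4 + rng.normal(0, 2, 40)

# Fit a degree-1 polynomial (a straight line) by least squares
slope, intercept = np.polyfit(x, y, 1)

fig, ax = plt.subplots()
ax.scatter(x, y, alpha=0.7, label="Observations")

# Evaluate the fitted line across the observed x range and overlay it
x_line = np.linspace(x.min(), x.max(), 100)
ax.plot(x_line, slope * x_line + intercept, color='red', label="Least-squares fit")
ax.legend()
```

Seaborn offers a higher-level route to the same kind of plot; this version only needs NumPy and Matplotlib.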