NumPy is a powerful library for numerical computing in Python. One of its most useful submodules is numpy.random, which provides tools for generating random numbers and performing probabilistic simulations.
In this article, we’ll explore the numpy.random module step by step, understand its key functions, and see how to use it in real-world scenarios.
Why Use numpy.random?
Python’s built-in random module is good for simple tasks, but NumPy's version offers:
- Faster performance
- Multi-dimensional array support
- More complex distributions
- Reproducibility using seeds
Getting Started
First, you need to import NumPy:
import numpy as np
To use the random functionalities:
rng = np.random.default_rng()
This creates a Generator instance, which has been the recommended way to produce random numbers since NumPy 1.17.
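To see what default_rng() actually hands back, a minimal sketch like the one below can help. It only inspects the returned object; the bit generator name in the comment reflects current NumPy releases and could change in future versions.

```python
import numpy as np

rng = np.random.default_rng()

# default_rng() returns a Generator that wraps an underlying bit generator
print(type(rng).__name__)                # Generator
print(type(rng.bit_generator).__name__)  # PCG64 in current NumPy releases
```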
Basic Random Number Generation
1. Random Float in [0.0, 1.0)
rng = np.random.default_rng()
print(rng.random()) # e.g., 0.845395197
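The same method can also fill a whole array. The sketch below assumes only the size parameter of Generator.random and checks that every value stays inside the half-open interval.

```python
import numpy as np

rng = np.random.default_rng()

# Passing a size returns an array instead of a single float
samples = rng.random(size=(2, 3))
print(samples.shape)  # (2, 3)

# Every value lies in the half-open interval [0.0, 1.0)
print(bool(((samples >= 0.0) & (samples < 1.0)).all()))  # True
```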
2. Random Integers
# Generate one random integer from 0 to 9
print(rng.integers(0, 10))
# Generate a 1D array of 5 random integers
print(rng.integers(0, 100, size=5))
# Generate a 2D array
print(rng.integers(0, 5, size=(2, 3)))
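By default the upper bound is excluded. If an inclusive upper bound reads more naturally, integers() accepts endpoint=True. The sketch below compares the two; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng()

# Default: high is exclusive, so this yields values 1..5
exclusive = rng.integers(1, 6, size=1000)

# endpoint=True makes high inclusive, so this yields values 1..6
inclusive = rng.integers(1, 6, size=1000, endpoint=True)

print(exclusive.min(), exclusive.max())
print(inclusive.min(), inclusive.max())
```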
3. Random from a Normal Distribution
# Mean = 0, Std dev = 1
print(rng.normal(loc=0, scale=1, size=5))
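A quick way to check what loc and scale mean is to draw a large sample and compare the estimates. This is a rough sketch; the "heights" name and the values 170 and 10 are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# loc is the mean, scale is the standard deviation (hypothetical values)
heights = rng.normal(loc=170.0, scale=10.0, size=100_000)

# With a large sample, the estimates should land close to loc and scale
print(round(heights.mean(), 1), round(heights.std(), 1))
```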
4. Random from a Uniform Distribution
print(rng.uniform(low=0.0, high=1.0, size=5))
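The bounds do not have to be 0 and 1. The sketch below draws from a different interval and also passes array-like bounds, which NumPy broadcasts so each element gets its own range. The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng()

# Draw 5 values from [-1.0, 1.0)
offsets = rng.uniform(low=-1.0, high=1.0, size=5)
print(offsets)

# low and high can also be array-like, giving a different range per element
per_axis = rng.uniform(low=[0.0, 10.0], high=[1.0, 20.0])
print(per_axis)
```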
Common Random Functions
choice(): Random Selection
# Randomly pick one element
print(rng.choice([10, 20, 30, 40]))
# Pick 3 elements without replacement
print(rng.choice([1, 2, 3, 4, 5], size=3, replace=False))
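choice() can also sample with unequal probabilities through its p parameter. The sketch below uses hypothetical labels and weights to show the idea; the observed proportions should come out close to the weights.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

outcomes = np.array(["win", "draw", "loss"])  # hypothetical labels
weights = [0.5, 0.3, 0.2]                     # probabilities, must sum to 1

# p assigns a probability to each element of the first argument
draws = rng.choice(outcomes, size=10_000, p=weights)

# Observed proportion of each label
values, counts = np.unique(draws, return_counts=True)
print(dict(zip(values.tolist(), (counts / draws.size).round(2).tolist())))
```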
shuffle(): Shuffle an array in place
arr = np.array([1, 2, 3, 4, 5])
rng.shuffle(arr)
print(arr) # e.g., [3, 1, 5, 4, 2]
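When the original order needs to be kept, permutation() is a possible alternative: it returns a shuffled copy rather than modifying the input. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng()

arr = np.array([1, 2, 3, 4, 5])

# permutation() returns a shuffled copy and leaves arr untouched
shuffled = rng.permutation(arr)

print(arr)       # [1 2 3 4 5]
print(shuffled)  # same elements, random order
```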
Reproducibility with Seeds
To make your random numbers reproducible:
rng = np.random.default_rng(seed=42)
print(rng.integers(0, 100, size=5))
Using the same seed gives you the same sequence of numbers on every run (with the same NumPy version), which matters in simulations, experiments, and debugging.
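One way to convince yourself of this is to build two generators from the same seed and compare their output. The names rng_a and rng_b are illustrative.

```python
import numpy as np

rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

first = rng_a.integers(0, 100, size=5)
second = rng_b.integers(0, 100, size=5)

# Same seed, same call sequence: the arrays match element for element
print(np.array_equal(first, second))  # True

# An unseeded generator pulls fresh entropy from the OS each time it is created
unseeded = np.random.default_rng().integers(0, 100, size=5)
print(unseeded)
```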
Step-by-Step: Practical Example
Let’s say we want to simulate 1000 dice rolls and analyze how many times each face appears.
import numpy as np
# Step 1: Initialize RNG with seed
rng = np.random.default_rng(seed=123)
# Step 2: Simulate 1000 rolls (values 1 to 6)
rolls = rng.integers(1, 7, size=1000)
# Step 3: Count occurrences using np.bincount (minlength=7 keeps slots 0 to 6 even if a face never appears)
counts = np.bincount(rolls, minlength=7)[1:]  # Skip index 0, since no roll is 0
# Step 4: Display the results
for face, count in enumerate(counts, start=1):
    print(f"Face {face}: {count} times")
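As a possible next step in the analysis, the raw counts can be turned into relative frequencies and compared with the 1/6 expected from a fair die. This is a sketch that repeats the simulation so it runs on its own.

```python
import numpy as np

rng = np.random.default_rng(seed=123)
rolls = rng.integers(1, 7, size=1000)

# minlength=7 guarantees slots 0..6 even if some face never appears
counts = np.bincount(rolls, minlength=7)[1:]

# Relative frequency of each face; a fair die has expectation 1/6
freqs = counts / rolls.size
for face, freq in enumerate(freqs, start=1):
    print(f"Face {face}: {freq:.3f} (deviation from 1/6: {freq - 1/6:+.3f})")
```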
Tips
- Use default_rng() instead of legacy functions like np.random.rand() or np.random.randint(); it's the modern, recommended practice.
- Set seeds for reproducibility when debugging or sharing code.
- Use np.bincount() for fast frequency analysis of non-negative integer arrays.
- Prefer vectorized operations over loops for performance and clarity.
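To illustrate the last tip, the sketch below computes the sum of two dice 1000 times, first with a Python loop and then with a single vectorized draw. The scenario is a made-up example; both versions produce an array of the same shape.

```python
import numpy as np

rng = np.random.default_rng(seed=7)

# Loop version: two Python-level calls per pair of dice
loop_sums = np.array([rng.integers(1, 7) + rng.integers(1, 7) for _ in range(1000)])

# Vectorized version: draw all 2000 dice at once, then sum each pair
dice = rng.integers(1, 7, size=(1000, 2))
vector_sums = dice.sum(axis=1)

print(loop_sums.shape, vector_sums.shape)
```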
⚠️ Common Pitfalls
Pitfall | Explanation |
---|---|
❌ Using np.random without default_rng() | The legacy functions share a hidden global state and are frozen: they no longer receive new features or algorithm improvements. |
❌ Forgetting to set replace=False in choice() | You may get repeated elements when you don’t want them. |
❌ Off-by-one error in integers() | Remember: integers(low, high) includes low and excludes high. |
❌ Using np.random.seed() with default_rng() | It only seeds the legacy global state and has no effect on a Generator. Pass the seed to default_rng() instead. |
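The seeding pitfall is easy to miss, so here is a small sketch that contrasts the legacy global seed with a seeded Generator. The variable names are illustrative.

```python
import numpy as np

# np.random.seed() only seeds the legacy global state
np.random.seed(0)
legacy_a = np.random.rand()
np.random.seed(0)
legacy_b = np.random.rand()
print(legacy_a == legacy_b)  # True: the legacy functions repeat

# Generators created without a seed ignore that global seed
np.random.seed(0)
gen_a = np.random.default_rng().random()
np.random.seed(0)
gen_b = np.random.default_rng().random()
print(gen_a == gen_b)  # almost certainly False

# Passing the seed to default_rng() is what makes a Generator repeatable
seeded_a = np.random.default_rng(0).random()
seeded_b = np.random.default_rng(0).random()
print(seeded_a == seeded_b)  # True
```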
✅ Full Code Example
import numpy as np
# Create a random number generator with a fixed seed
rng = np.random.default_rng(seed=2025)
# Generate random floats
floats = rng.random(5)
print("Random floats:", floats)
# Generate random integers (1D and 2D)
ints_1d = rng.integers(10, 100, size=5)
ints_2d = rng.integers(1, 7, size=(3, 3))
print("Random integers (1D):", ints_1d)
print("Random integers (2D):\n", ints_2d)
# Random choice from a list
choices = rng.choice(['red', 'green', 'blue'], size=2)
print("Random choices:", choices)
# Shuffle array
arr = np.array([1, 2, 3, 4, 5])
rng.shuffle(arr)
print("Shuffled array:", arr)
# Simulate 1000 dice rolls and count results
rolls = rng.integers(1, 7, size=1000)
counts = np.bincount(rolls, minlength=7)[1:]  # Skip index 0; minlength keeps all six faces
for face, count in enumerate(counts, start=1):
    print(f"Face {face}: {count} times")
Conclusion
NumPy's random module is an essential tool for anyone working in data science, simulations, or statistical modeling. Whether you're generating simple random values or simulating complex distributions, understanding numpy.random will help you write faster, clearer, and more reproducible code.
If you're just getting started, practice with small experiments and gradually incorporate distributions and random sampling into your projects.