Splitting arrays is a common operation when working with large datasets, image processing, or preparing data for machine learning. With NumPy, you can divide arrays into multiple sub-arrays using built-in functions like split(), array_split(), hsplit(), vsplit(), and dsplit().
This article will walk you through:
- ✅ What array splitting is
- ✂️ The different ways to split arrays in NumPy
- Syntax and parameters
- Full working code examples
- Tips and common pitfalls
What Is Array Splitting?
Splitting an array means breaking it into two or more sub-arrays. This is useful in:
- Data preprocessing
- Feature extraction
- Batch training in ML models
- Dividing large arrays for parallel processing
NumPy Split Methods
Function | Description |
---|---|
np.split() | Split an array into equal-sized chunks |
np.array_split() | Split an array into nearly equal chunks |
np.hsplit() | Split horizontally (column-wise) |
np.vsplit() | Split vertically (row-wise) |
np.dsplit() | Split along the third dimension (depth) |
1️⃣ numpy.split()
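Syntax:
numpy.split(ary, indices_or_sections, axis=0)
- ary: The array to split
- indices_or_sections: An integer N splits the array into N equal parts (the axis length must divide evenly); a sorted 1-D sequence of indices gives the positions at which to cut
- axis: Axis along which to split (default = 0)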
✅ Example:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
result = np.split(arr, 3)
print(result)
Output:
[array([1, 2]), array([3, 4]), array([5, 6])]
❗ An integer section count must divide the array evenly; otherwise, split() raises a ValueError.
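Besides a section count, split() also accepts a list of indices that mark where each cut falls. A minimal sketch, reusing the six-element array from the example above with two cut positions chosen purely for illustration:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# Cut before index 2 and before index 5; the pieces need not be equal in size.
parts = np.split(arr, [2, 5])
print(parts)
# [array([1, 2]), array([3, 4, 5]), array([6])]
```

Because the cut positions are explicit, this form does not require the array length to be divisible by anything.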
2️⃣ numpy.array_split()
This is more flexible than split(). It allows uneven splits: when the array does not divide evenly, the earlier sub-arrays each get one extra element.
✅ Example:
arr = np.array([1, 2, 3, 4, 5])
result = np.array_split(arr, 3)
print(result)
Output:
[array([1, 2]), array([3, 4]), array([5])]
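To see how array_split() distributes the remainder, it can help to look only at the chunk sizes. The sketch below uses an illustrative 10-element array and 4 sections (both values are arbitrary choices, not from the example above):

```python
import numpy as np

arr = np.arange(10)      # 10 elements: 0..9
n_sections = 4

chunks = np.array_split(arr, n_sections)
sizes = [len(c) for c in chunks]
print(sizes)
# [3, 3, 2, 2]

# 10 % 4 = 2, so the first two chunks hold 10 // 4 + 1 = 3 elements;
# the remaining chunks hold 10 // 4 = 2 elements.
```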
3️⃣ numpy.hsplit() – Horizontal Split
Splits along columns (axis=1). A 1D array has no second axis, so hsplit() splits it along axis 0 instead.
✅ Example:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
result = np.hsplit(arr, 2)
print(result)
Output:
[array([[1, 2],
[5, 6]]),
array([[3, 4],
[7, 8]])]
4️⃣ numpy.vsplit() – Vertical Split
Splits along rows (axis=0). The array must have at least two dimensions.
✅ Example:
arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
result = np.vsplit(arr, 2)
print(result)
Output:
[array([[1, 2],
[3, 4]]),
array([[5, 6],
[7, 8]])]
5️⃣ numpy.dsplit() – Depth Split
Splits along the third axis (axis=2). The array must have at least three dimensions.
✅ Example:
arr = np.array([
[[1, 2], [3, 4]],
[[5, 6], [7, 8]]
])
result = np.dsplit(arr, 2)
print(result)
Output:
[array([[[1],
[3]],
[[5],
[7]]]),
array([[[2],
[4]],
[[6],
[8]]])]
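The three directional functions can be read as shorthand for split() with a fixed axis. The following sketch checks that on a small 3D array; the helper same() is a hypothetical convenience written for this comparison, not part of NumPy:

```python
import numpy as np

arr = np.arange(16).reshape(2, 4, 2)   # shape (2, 4, 2)

def same(a, b):
    """Hypothetical helper: True if two lists of arrays match pairwise."""
    return len(a) == len(b) and all(np.array_equal(x, y) for x, y in zip(a, b))

print(same(np.vsplit(arr, 2), np.split(arr, 2, axis=0)))  # True
print(same(np.hsplit(arr, 2), np.split(arr, 2, axis=1)))  # True
print(same(np.dsplit(arr, 2), np.split(arr, 2, axis=2)))  # True
```

Choosing between the named function and an explicit axis argument is mostly a matter of readability.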
Full Working Code Example
import numpy as np
arr1d = np.array([10, 20, 30, 40, 50, 60])
arr2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
arr3d = np.array([
[[1, 2], [3, 4]],
[[5, 6], [7, 8]]
])
# Split 1D array
print("Split 1D Array:", np.split(arr1d, 3))
# Uneven split
print("Array Split (uneven):", np.array_split(arr1d, 4))
# Horizontal split (columns)
print("Horizontal Split:", np.hsplit(arr2d, 2))
# Vertical split (rows)
print("Vertical Split:", np.vsplit(arr2d, 2))
# Depth split (3D)
print("Depth Split:", np.dsplit(arr3d, 2))
✅ Tips & Best Practices
Tip | Why It Helps |
---|---|
Use array_split() for uneven splits | Avoids ValueError if the array isn't evenly divisible |
Check array shape before using hsplit, vsplit, or dsplit | vsplit needs at least 2 dimensions and dsplit at least 3 |
Use list comprehension to further manipulate each chunk | Great for batch processing or ML training |
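As a rough illustration of the list-comprehension tip, the sketch below splits a stand-in dataset into batches and computes one summary value per batch. The data and the choice of the mean are assumptions made for the example only:

```python
import numpy as np

data = np.arange(10, dtype=float)    # stand-in for 10 samples
batches = np.array_split(data, 3)    # chunk sizes 4, 3, 3

# One summary value per chunk, computed with a list comprehension.
batch_means = [float(batch.mean()) for batch in batches]
print(batch_means)
# [1.5, 5.0, 8.0]
```

The same pattern applies to any per-chunk operation, such as normalizing each batch or passing it to a training step.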
⚠️ Common Pitfalls
Pitfall | Solution |
---|---|
split() throws an error on uneven splits | Use array_split() instead |
Using vsplit() on a 1D array | vsplit() needs at least 2 dimensions (dsplit() needs 3); hsplit() does accept 1D arrays and splits them along axis 0 |
Assuming split() returns a NumPy array | It returns a list of arrays |
Forgetting to specify axis | split() and array_split() default to axis=0 (rows); set axis explicitly for multidimensional arrays |
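Several of these pitfalls can be reproduced in a few lines. This is a small sketch on illustrative arrays; the exact wording of the error message may vary between NumPy versions, so it is printed rather than assumed:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Pitfall: an integer section count that doesn't divide the length.
try:
    np.split(arr, 2)
except ValueError as exc:
    print("split failed:", exc)

# Pitfall: the result is a plain Python list, not an ndarray.
parts = np.array_split(arr, 2)
print(type(parts).__name__, [p.tolist() for p in parts])
# list [[1, 2, 3], [4, 5]]

# Pitfall: axis defaults to 0, so a 2D array is split by rows unless told otherwise.
grid = np.arange(8).reshape(2, 4)
print([p.shape for p in np.split(grid, 2)])          # [(1, 4), (1, 4)]
print([p.shape for p in np.split(grid, 2, axis=1)])  # [(2, 2), (2, 2)]
```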
Conclusion
NumPy makes it easy to split arrays into smaller chunks, which is useful in many real-world applications such as:
- Data preprocessing
- Mini-batch training
- Parallel processing
Choose the right split method based on your data structure and goal:
- Use split() or array_split() for general use
- Use hsplit(), vsplit(), or dsplit() for direction-specific splits