Python Multiprocessing: A Complete Guide for Developers

Introduction
Python is popular for data science, AI, and backend development, but CPU-heavy code often runs into performance bottlenecks caused by the Global Interpreter Lock (GIL). In the standard CPython interpreter, the GIL lets only one thread execute Python bytecode at a time.
When dealing with CPU-bound tasks like data processing, mathematical simulations, or image rendering, Python threads can’t fully leverage multiple CPU cores.
This is where Python’s multiprocessing module comes in:
- It allows you to bypass the GIL by running separate processes.
- Each process has its own Python interpreter and memory space.
- Perfect for tasks that require true parallel execution.
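To make the separate-memory point concrete, here is a minimal sketch. The names counter, increment, and run_demo are made up for this illustration. The child process changes its own copy of a module-level variable and reports it through a queue, while the parent's copy stays untouched.

```python
from multiprocessing import Process, Queue

counter = 0  # module-level variable; each process works on its own copy


def increment(q):
    """Runs in the child: change the child's copy and report its value."""
    global counter
    counter += 100
    q.put(counter)


def run_demo():
    """Start one child and return (value seen by child, value seen by parent)."""
    q = Queue()
    p = Process(target=increment, args=(q,))
    p.start()
    child_value = q.get()  # read the child's report before joining
    p.join()
    return child_value, counter


if __name__ == "__main__":
    child_value, parent_value = run_demo()
    print("child saw:", child_value)           # 100
    print("parent still sees:", parent_value)  # 0
```

Because the parent's value does not change, ordinary globals cannot be used to pass results back from a process.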
Tutorial Section (Step-by-step Guide)
Step 1: Import the Module
import multiprocessing
Step 2: Creating a Simple Process
from multiprocessing import Process

def worker(name):
    print(f"Hello from process {name}")

if __name__ == "__main__":
    p1 = Process(target=worker, args=("A",))
    p1.start()  # Start the process
    p1.join()   # Wait until the process completes
✅ Output:
Hello from process A
Step 3: Using Multiple Processes
from multiprocessing import Process

def square(n):
    print(f"{n} squared is {n*n}")

if __name__ == "__main__":
    numbers = [1, 2, 3, 4]
    processes = []
    # Start one process per number; the output order may vary between runs
    for num in numbers:
        p = Process(target=square, args=(num,))
        processes.append(p)
        p.start()
    # Wait for every process to finish
    for p in processes:
        p.join()
Step 4: Multiprocessing with Pool
from multiprocessing import Pool

def cube(x):
    return x**3

if __name__ == "__main__":
    # A pool of 4 worker processes; map() returns results in input order
    with Pool(4) as pool:
        result = pool.map(cube, [1, 2, 3, 4, 5])
    print(result)
✅ Output:
[1, 8, 27, 64, 125]
⚡ Comparison: Multithreading vs Multiprocessing
Feature | Multithreading | Multiprocessing |
---|---|---|
Execution | Concurrent (not true parallel) | True parallel execution |
Best for | I/O-bound tasks | CPU-bound tasks |
Memory | Shared memory space | Separate memory for each process |
Overhead | Low | Higher (process creation cost) |
GIL Effect | Affected by GIL | Not affected by GIL |
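The Execution and GIL Effect rows can be checked with a rough timing sketch like the one below. It assumes a machine with several CPU cores. count_down and timed_map are hypothetical helper names, and ThreadPool is the thread-backed pool from multiprocessing.pool, which offers the same map interface. Timings vary by machine, so no expected output is shown.

```python
import time
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool


def count_down(n):
    """Pure-Python CPU-bound loop; a thread holds the GIL while running it."""
    while n > 0:
        n -= 1
    return n


def timed_map(pool_cls, tasks, workers=4):
    """Run count_down over tasks with the given pool class; return (results, seconds)."""
    start = time.perf_counter()
    with pool_cls(workers) as pool:
        results = pool.map(count_down, tasks)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    tasks = [2_000_000] * 4
    _, thread_seconds = timed_map(ThreadPool, tasks)
    _, process_seconds = timed_map(Pool, tasks)
    print(f"threads:   {thread_seconds:.2f} s")
    print(f"processes: {process_seconds:.2f} s")
```

On a multi-core machine the process pool usually finishes this loop noticeably faster than the thread pool, but it is worth measuring on your own hardware.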
✅ Complete Functional Example
from multiprocessing import Pool
import time

def compute_square(n):
    time.sleep(1)  # Simulate one second of work
    return n * n

if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5]
    start = time.time()
    # Five workers, so all five tasks run at the same time
    with Pool(processes=5) as pool:
        results = pool.map(compute_square, numbers)
    end = time.time()
    print("Squares:", results)
    print("Time Taken:", round(end - start, 2), "seconds")
✅ Output:
Squares: [1, 4, 9, 16, 25]
Time Taken: ~1 second
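Without multiprocessing, the five one-second calls would run back to back and take about 5 seconds. With five worker processes they run at the same time, which shows the benefit of parallel execution. Because time.sleep only waits rather than computes, this example illustrates the timing effect and is not a CPU-bound benchmark.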
Tips & Common Pitfalls
Best Practices ✅
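- Use multiprocessing for CPU-bound tasks (e.g., computation-heavy work).
- Use multithreading for I/O-bound tasks (network, file I/O).
- Use Pool.map() for simple parallelism.
- Always protect the entry point with if __name__ == "__main__":. This is required on Windows and macOS, where child processes are started with the spawn method and re-import the main module, and it is harmless elsewhere.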
Common Mistakes ❌
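- Forgetting if __name__ == "__main__" → on platforms that use the spawn start method, every child re-imports the main module and tries to start processes again (recent Python versions abort this with a RuntimeError).
- Using multiprocessing for small tasks → overhead outweighs performance gain.
- Sharing state incorrectly → ordinary variables are not shared between processes; use multiprocessing.Queue or Manager.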
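Besides Queue and Manager, the multiprocessing module offers Value for a single shared number. The sketch below is one way to keep a shared counter consistent; add_many and run_counters are illustrative names. Each increment happens while holding the Value's built-in lock, because += on a shared value is a read followed by a write.

```python
from multiprocessing import Process, Value


def add_many(total, times):
    """Increment the shared counter `times` times, holding its lock each time."""
    for _ in range(times):
        with total.get_lock():  # without the lock, concurrent updates can be lost
            total.value += 1


def run_counters(workers=4, times=1000):
    """Start `workers` processes that all increment one shared integer."""
    total = Value("i", 0)  # shared C int, initial value 0
    procs = [Process(target=add_many, args=(total, times)) for _ in range(workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return total.value


if __name__ == "__main__":
    print(run_counters())  # 4000
```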
FAQ Section
Q1. Does multiprocessing bypass the GIL?
✅ Yes, because each process runs in its own interpreter.
Q2. When should I use multiprocessing instead of threading?
- Multiprocessing → CPU-bound tasks
- Threading → I/O-bound tasks
Q3. How do processes share data?
Use Queue, Pipe, or Manager.
from multiprocessing import Queue
q = Queue()
q.put("Hello")
print(q.get())
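The snippet above puts and gets within a single process. As a sketch of the cross-process case, the following passes results from a child back to the parent. producer and collect_doubled are hypothetical names, and using None as an end-of-data marker is only a convention chosen for this example.

```python
from multiprocessing import Process, Queue


def producer(q, items):
    """Child process: put each doubled item on the queue, then a None sentinel."""
    for item in items:
        q.put(item * 2)
    q.put(None)  # sentinel: nothing more to read


def collect_doubled(items):
    """Start a producer process and read its results in the parent."""
    q = Queue()
    p = Process(target=producer, args=(q, items))
    p.start()
    results = []
    while True:
        value = q.get()  # blocks until the child sends something
        if value is None:
            break
        results.append(value)
    p.join()  # join only after the queue has been drained
    return results


if __name__ == "__main__":
    print(collect_doubled([1, 2, 3]))  # [2, 4, 6]
```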
Q4. Is multiprocessing always faster?
❌ No. For small tasks, the process creation overhead may make it slower.
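A quick way to see this is to time a trivial function both ways. In this sketch tiny_task and compare are illustrative names. The numbers depend on the machine and on how processes are started, so none are shown.

```python
import time
from multiprocessing import Pool


def tiny_task(x):
    """A deliberately cheap function: far less work than sending it to a worker costs."""
    return x + 1


def compare(values, workers=2):
    """Return (results_match, sequential_seconds, pool_seconds)."""
    start = time.perf_counter()
    sequential = [tiny_task(x) for x in values]
    sequential_seconds = time.perf_counter() - start

    start = time.perf_counter()
    with Pool(workers) as pool:  # the timing includes process startup
        parallel = pool.map(tiny_task, values)
    pool_seconds = time.perf_counter() - start

    return sequential == parallel, sequential_seconds, pool_seconds


if __name__ == "__main__":
    same, seq, par = compare(list(range(10_000)))
    print("same results:", same)
    print(f"sequential: {seq:.4f} s, pool: {par:.4f} s")
```

For work this small, the pool version is typically the slower of the two, since starting processes and moving data between them costs more than the computation itself.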
Q5. Can I use multiprocessing in Jupyter Notebooks?
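⚠️ It’s tricky. Under the spawn start method (the default on Windows and macOS), child processes cannot import worker functions that were defined in a notebook cell. Best to put the worker functions in a .py file and import them, or run the whole script from a .py file.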
Cheat Sheet Section
Feature | Syntax / Usage |
---|---|
Create process | Process(target=func, args=(arg,)) |
Start process | p.start() |
Wait for process | p.join() |
Pool creation | with Pool(n) as p: |
Map tasks | p.map(func, data_list) |
Queue | q = multiprocessing.Queue() |
Shared Value | Value('i', 0) |
Shared Array | Array('i', [1,2,3]) |
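The last two rows are easier to read with a usage sketch. The example below, with the made-up names double_in_place and run_array_demo, lets a child process modify a shared Array that the parent then reads back. The typecode 'i' means a C signed int.

```python
from multiprocessing import Array, Process


def double_in_place(shared):
    """Child process: double every element of the shared array."""
    with shared.get_lock():  # hold the array's lock for the whole update
        for i in range(len(shared)):
            shared[i] *= 2


def run_array_demo():
    """Create a shared array, let a child modify it, and return its contents."""
    shared = Array("i", [1, 2, 3])  # shared array of C ints
    p = Process(target=double_in_place, args=(shared,))
    p.start()
    p.join()
    return shared[:]  # the parent sees the child's changes


if __name__ == "__main__":
    print(run_array_demo())  # [2, 4, 6]
```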
Interview Questions Section
Q1. What is the difference between threading and multiprocessing?
- Threading → I/O-bound
- Multiprocessing → CPU-bound, bypasses GIL
Q2. How does multiprocessing overcome the GIL?
Each process runs its own Python interpreter with its own memory space, and therefore its own GIL, so processes do not block each other.
Q3. Example: Use Pool to calculate factorials in parallel.
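from multiprocessing import Pool
import math

if __name__ == "__main__":
    nums = [5, 6, 7, 8]
    # One factorial per worker; results come back in input order
    with Pool(4) as pool:
        print(pool.map(math.factorial, nums))  # [120, 720, 5040, 40320]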
Q4. What are some inter-process communication methods?
- Queue
- Pipe
- Manager
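Of these three, Pipe is the one not shown elsewhere in this guide, so here is a minimal sketch of a two-way exchange. echo_upper and ask_child are hypothetical names chosen for the example.

```python
from multiprocessing import Pipe, Process


def echo_upper(conn):
    """Child process: receive one message, reply in upper case, close the connection."""
    message = conn.recv()
    conn.send(message.upper())
    conn.close()


def ask_child(message):
    """Send a message to a child over a pipe and return its reply."""
    parent_conn, child_conn = Pipe()  # duplex by default: both ends send and receive
    p = Process(target=echo_upper, args=(child_conn,))
    p.start()
    parent_conn.send(message)
    reply = parent_conn.recv()  # blocks until the child answers
    p.join()
    return reply


if __name__ == "__main__":
    print(ask_child("hello"))  # HELLO
```

A Pipe connects exactly two endpoints, while a Queue can be shared by many producers and consumers.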
Q5. What are common pitfalls with multiprocessing?
- Forgetting if __name__ == "__main__"
- Using it for small tasks (slower).
Q6. How to handle shared state safely?
Use multiprocessing.Manager() for shared dictionaries/lists.
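As a sketch of that answer, the following fills a managed dictionary from several processes. record_square and squares_via_manager are illustrative names, not part of the library.

```python
from multiprocessing import Manager, Process


def record_square(shared, n):
    """Child process: store n squared under key n in the managed dict."""
    shared[n] = n * n


def squares_via_manager(numbers):
    """Start one process per number and collect their results in a shared dict."""
    with Manager() as manager:
        shared = manager.dict()  # proxy; the real dict lives in the manager process
        procs = [Process(target=record_square, args=(shared, n)) for n in numbers]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return dict(shared)  # copy to a plain dict before the manager shuts down


if __name__ == "__main__":
    print(squares_via_manager([1, 2, 3]))
```

Every access to a managed object goes through the manager process, so this is convenient but slower than Value or Array for simple numeric data.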
Q7. What’s the difference between Pool and Process?
- Process → Fine-grained control.
- Pool → Easier, automatic worker management.
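Pool also covers some middle ground between the two. As a hedged sketch, apply_async submits individual calls, each with its own arguments, and still leaves worker management to the pool. power and run_async_jobs are made-up names for this example.

```python
from multiprocessing import Pool


def power(base, exponent):
    """The task run in a worker process."""
    return base ** exponent


def run_async_jobs(jobs, workers=2):
    """Submit each (base, exponent) job separately; return results in submission order."""
    with Pool(workers) as pool:
        pending = [pool.apply_async(power, args=job) for job in jobs]
        # get() blocks until that job is done; the timeout guards against a hang
        return [result.get(timeout=30) for result in pending]


if __name__ == "__main__":
    print(run_async_jobs([(2, 3), (3, 2), (10, 0)]))  # [8, 9, 1]
```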
Conclusion / Summary
- Python’s multiprocessing is the go-to solution for CPU-bound tasks.
- It bypasses the GIL and enables true parallel execution.
- Use Pool for simplicity, Process for fine control.
- Remember: Threads for I/O-bound, Processes for CPU-bound.
✅ Best Practice Takeaway: Always benchmark. Multiprocessing shines with heavy CPU tasks, but can be overkill for lightweight jobs.