Python PostgreSQL Tutorial – Bulk Inserts with psycopg2 and SQLAlchemy

Inserting data one row at a time into a PostgreSQL database is inefficient, especially when working with large datasets. Bulk inserts allow you to insert thousands (or millions) of rows in a single command, dramatically improving performance.

This tutorial explains how to perform bulk inserts using:

  • psycopg2

  • SQLAlchemy


Table of Contents

  1. Why Bulk Inserts Matter

  2. Prerequisites

  3. Bulk Inserts Using psycopg2

  4. Bulk Inserts Using SQLAlchemy

  5. Performance Tips

  6. Full Working Example with psycopg2

  7. Common Pitfalls


1. Why Bulk Inserts Matter

Instead of executing a single INSERT per row:

cur.execute("INSERT INTO table (col1, col2) VALUES (%s, %s)", (val1, val2))

Bulk inserts send many rows at once, minimizing database roundtrips.
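For example, three rows can travel in one statement instead of three. A minimal sketch of what such a multi-row INSERT looks like when written by hand (it targets the same students table used in the examples below):

cur.execute(
    "INSERT INTO students (name, age) VALUES (%s, %s), (%s, %s), (%s, %s)",
    ('Alice', 24, 'Bob', 30, 'Carol', 22)  # six values for six placeholders
)

This is exactly the statement shape that the helper functions shown below generate for you.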


✅ 2. Prerequisites

  • PostgreSQL and Python installed

  • Install dependencies:

pip install psycopg2-binary sqlalchemy
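The examples all target a students table. If you need to create one, a minimal schema like this works (the column types are an assumption chosen to fit the sample data; run it in psql or any SQL client):

CREATE TABLE students (
    id SERIAL PRIMARY KEY,  -- auto-incrementing id
    name TEXT,
    age INTEGER
);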

3. Bulk Inserts Using psycopg2

Step 1: Connect to PostgreSQL

import psycopg2

conn = psycopg2.connect(
    dbname="your_db",
    user="your_user",
    password="your_pass",
    host="localhost",
    port="5432"
)
cur = conn.cursor()

Step 2: Prepare Your Data

data = [
    ('Alice', 24),
    ('Bob', 30),
    ('Carol', 22),
]

Step 3: Use executemany()

cur.executemany("INSERT INTO students (name, age) VALUES (%s, %s)", data)
conn.commit()

✅ Or, use execute_values() for better performance. In psycopg2, executemany() internally issues one INSERT statement per row, while execute_values() collapses the rows into a small number of multi-row INSERT statements, so it is typically much faster.

from psycopg2.extras import execute_values

execute_values(cur,
    "INSERT INTO students (name, age) VALUES %s",
    data
)
conn.commit()
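One more knob worth knowing: execute_values() groups rows into pages of 100 per generated statement by default, controlled by its page_size parameter. For large loads, a bigger page often helps:

# Pack 1000 rows into each generated INSERT instead of the default 100.
execute_values(cur,
    "INSERT INTO students (name, age) VALUES %s",
    data,
    page_size=1000
)
conn.commit()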

4. Bulk Inserts Using SQLAlchemy

Step 1: Setup SQLAlchemy Engine and Table

from sqlalchemy import create_engine, Table, Column, Integer, String, MetaData

engine = create_engine('postgresql://your_user:your_pass@localhost/your_db')
metadata = MetaData()

students = Table('students', metadata,
    Column('id', Integer, primary_key=True),
    Column('name', String),
    Column('age', Integer),
)
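If the students table does not exist yet, SQLAlchemy can create it from this definition:

# Creates any missing tables defined on metadata; a no-op if they already exist.
metadata.create_all(engine)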

Step 2: Insert Multiple Rows

with engine.begin() as conn:
    conn.execute(students.insert(), [
        {'name': 'Alice', 'age': 24},
        {'name': 'Bob', 'age': 30},
        {'name': 'Carol', 'age': 22},
    ])

engine.begin() opens a transaction and commits it automatically when the block exits without an error. With plain engine.connect() you would have to call conn.commit() yourself under SQLAlchemy 2.0, so engine.begin() is the safer pattern for inserts.

⚡ 5. Performance Tips

  • Use execute_values() in psycopg2: much faster than executemany()

  • Wrap all inserts in a single transaction: committing once reduces I/O overhead

  • Create indexes only after the bulk insert: PostgreSQL must update every index on each inserted row, so loading first speeds up the initial load (see the sketch after this list)

  • Avoid triggers you don't need: they fire per row and slow down inserts
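To illustrate the index tip: load the data first, then build the index once, instead of paying the index-maintenance cost on every row. A sketch, assuming a hypothetical index named idx_students_name:

# Drop the (hypothetical) index before the bulk load...
cur.execute("DROP INDEX IF EXISTS idx_students_name")
execute_values(cur, "INSERT INTO students (name, age) VALUES %s", data)
# ...and rebuild it once after all rows are in place.
cur.execute("CREATE INDEX idx_students_name ON students (name)")
conn.commit()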

6. Full Working Example with psycopg2

import psycopg2
from psycopg2.extras import execute_values

data = [
    ('John', 21),
    ('Jane', 23),
    ('Mike', 25)
]

conn = psycopg2.connect(
    dbname="school",
    user="postgres",
    password="your_pass",
    host="localhost",
    port="5432"
)

cur = conn.cursor()
execute_values(cur,
    "INSERT INTO students (name, age) VALUES %s",
    data
)
conn.commit()
cur.close()
conn.close()
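The example commits and closes everything by hand. As an alternative, psycopg2 connections and cursors are context managers, which makes commit/rollback automatic; note that the connection's with block does not close the connection:

# "with conn" commits on success and rolls back on an exception;
# "with conn.cursor()" closes the cursor when the block exits.
with conn:
    with conn.cursor() as cur:
        execute_values(cur,
            "INSERT INTO students (name, age) VALUES %s",
            data
        )
conn.close()  # the connection itself still has to be closed explicitly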

❗ 7. Common Pitfalls

  • ❌ Using executemany() on large datasets → ✅ Use execute_values()

  • ❌ Forgetting conn.commit() → ✅ Always commit your transaction

  • ❌ Inserting too much in one go → ✅ Chunk your data if needed

  • ❌ Ignoring exceptions → ✅ Wrap inserts in try/except and roll back on failure (a sketch combining the last two fixes follows this list)
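A sketch of chunked inserts with rollback on failure (the chunk size of 10,000 is an arbitrary choice; tune it for your row width and memory budget):

import psycopg2
from psycopg2.extras import execute_values

CHUNK_SIZE = 10_000  # arbitrary; adjust to your data

try:
    # Insert in fixed-size chunks to keep each batch a manageable size.
    for start in range(0, len(data), CHUNK_SIZE):
        execute_values(cur,
            "INSERT INTO students (name, age) VALUES %s",
            data[start:start + CHUNK_SIZE]
        )
    conn.commit()  # commit once, only after every chunk succeeded
except psycopg2.Error:
    conn.rollback()  # undo the partial load on any database error
    raise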

Conclusion

Bulk inserts in PostgreSQL can significantly speed up your data import workflows. For best performance:

  • Use execute_values() with psycopg2

  • Pass a list of dictionaries to conn.execute() with SQLAlchemy, inside an engine.begin() block

  • Always commit your transaction