Python PostgreSQL – Bulk Insert from CSV File


Bulk inserting data from a CSV file is a common and efficient way to load large datasets into PostgreSQL. This guide walks through three methods using two popular libraries:

  • psycopg2 (COPY and execute_values)

  • SQLAlchemy with Pandas


Table of Contents

  1. Prerequisites

  2. Sample CSV Data

  3. Using psycopg2 with COPY

  4. Using psycopg2.extras.execute_values()

  5. Using SQLAlchemy + pandas.to_sql()

  6. Full Working Example

  7. Common Pitfalls


✅ 1. Prerequisites

Install required packages:

pip install psycopg2-binary sqlalchemy pandas

Prepare a PostgreSQL database and a table to insert into.


2. Sample CSV Data: students.csv

name,age
Alice,22
Bob,24
Carol,23

3. Method 1 – psycopg2 + COPY (Fastest for Pure Bulk Loads)

import psycopg2

conn = psycopg2.connect(
    dbname="school",
    user="postgres",
    password="your_pass",
    host="localhost",
    port="5432"
)
cur = conn.cursor()

with open('students.csv', 'r') as f:
    # Skip header line with next(f)
    next(f)
    cur.copy_from(f, 'students', sep=',', columns=('name', 'age'))

conn.commit()
cur.close()
conn.close()

⚠️ Ensure the table students(name, age) already exists and matches the CSV structure.
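Note that copy_from splits each line on the separator only; it does not understand CSV quoting, so fields containing embedded commas or quotes will break it. If your file uses quoting, psycopg2's copy_expert with PostgreSQL's CSV mode parses quoting and skips the header natively (a sketch, reusing the connection and cursor from above):

with open('students.csv', 'r') as f:
    # COPY ... FROM STDIN in CSV mode handles quoted fields and the header row
    cur.copy_expert(
        "COPY students (name, age) FROM STDIN WITH (FORMAT csv, HEADER true)",
        f
    )

conn.commit()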


⚙️ 4. Method 2 – psycopg2.extras.execute_values() (Flexible)

This approach reads the entire CSV into memory and inserts all rows in one batched statement:

import csv
import psycopg2
from psycopg2.extras import execute_values

conn = psycopg2.connect(
    dbname="school",
    user="postgres",
    password="your_pass",
    host="localhost",
    port="5432"
)
cur = conn.cursor()

with open('students.csv', 'r') as f:
    reader = csv.DictReader(f)
    data = [(row['name'], int(row['age'])) for row in reader]

execute_values(cur,
    "INSERT INTO students (name, age) VALUES %s",
    data
)

conn.commit()
cur.close()
conn.close()
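
Building one giant list can exhaust memory on very large files. execute_values() also accepts a page_size argument (default 100) that controls how many rows each generated INSERT statement carries, so the load can be chunked (a sketch, reusing the conn and cur from above; BATCH_SIZE is an arbitrary choice):

from itertools import islice

BATCH_SIZE = 1000

with open('students.csv', 'r') as f:
    reader = csv.DictReader(f)
    while True:
        # Read at most BATCH_SIZE rows at a time instead of the whole file
        batch = [(row['name'], int(row['age']))
                 for row in islice(reader, BATCH_SIZE)]
        if not batch:
            break
        execute_values(cur,
            "INSERT INTO students (name, age) VALUES %s",
            batch,
            page_size=BATCH_SIZE
        )

conn.commit()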

5. Method 3 – Using SQLAlchemy and Pandas

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('postgresql://postgres:your_pass@localhost/school')

df = pd.read_csv('students.csv')
df.to_sql('students', engine, if_exists='append', index=False)

✅ This method is very readable and integrates well with pandas-based data processing, though it is slower than COPY for large files.
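
For larger DataFrames, to_sql's chunksize and method parameters batch the inserts instead of sending one statement per row (a sketch; tune chunksize to your data):

df.to_sql(
    'students',
    engine,
    if_exists='append',
    index=False,
    chunksize=1000,   # write rows in batches of 1,000
    method='multi'    # one multi-row INSERT per batch instead of one per row
)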


6. Full Working Example with COPY

import psycopg2

conn = psycopg2.connect(
    dbname="school",
    user="postgres",
    password="your_pass",
    host="localhost",
    port="5432"
)

cur = conn.cursor()

# Ensure table exists:
cur.execute("""
CREATE TABLE IF NOT EXISTS students (
    id SERIAL PRIMARY KEY,
    name TEXT,
    age INTEGER
)
""")

with open('students.csv', 'r') as f:
    next(f)  # Skip the header
    cur.copy_from(f, 'students', sep=',', columns=('name', 'age'))

conn.commit()
cur.close()
conn.close()
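
In production it is worth guarding the load with a transaction so a bad file rolls back cleanly. A psycopg2 connection used as a context manager commits on success and rolls back on error (a minimal sketch of the same COPY load):

conn = psycopg2.connect(
    dbname="school",
    user="postgres",
    password="your_pass",
    host="localhost",
    port="5432"
)

try:
    with conn, conn.cursor() as cur:
        with open('students.csv', 'r') as f:
            next(f)  # Skip the header
            cur.copy_from(f, 'students', sep=',', columns=('name', 'age'))
except psycopg2.Error as e:
    # The with block has already rolled the transaction back at this point
    print(f"Bulk insert failed: {e}")
finally:
    conn.close()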

⚠️ 7. Common Pitfalls

Problem                            Solution
CSV has a header row               Skip it with next(f) before calling copy_from
Incorrect column types             Validate and convert CSV values before inserting
Encoding issues (non-UTF-8 data)   Open the file with an explicit encoding argument
Mismatched column names            Make sure CSV headers match the DB column names
to_sql doesn't auto-create a PK    Define the table (or a SQLAlchemy model) yourself for full control
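
For the encoding pitfall in particular, pass the file's encoding explicitly when opening it (here assuming UTF-8):

with open('students.csv', 'r', encoding='utf-8') as f:
    next(f)  # Skip the header
    cur.copy_from(f, 'students', sep=',', columns=('name', 'age'))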

Conclusion

For large data loads:

  • Use COPY for the best raw performance

  • Use execute_values() when you need per-row transformation or more control

  • Use pandas.to_sql() for quick integration with data-analysis workflows