Python BigQuery: How to Insert Data into a Table


Tags: Python BigQuery

Once your BigQuery table is created, the next step is to insert data. Using Python and the BigQuery client library, you can insert rows programmatically—perfect for ETL pipelines, data automation, or ingestion workflows.

In this article, you’ll learn:

  • Different methods to insert data

  • How to insert single or multiple rows

  • Best practices for batch inserts

  • Full code example

  • Tips and common pitfalls


✅ Prerequisites

Before starting:

  • Google Cloud Project with billing enabled

  • BigQuery API enabled

  • Python 3 installed

  • Service account with BigQuery Data Editor or Admin role

Install Required Library

pip install google-cloud-bigquery
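
To confirm the library is available in your environment, you can print its version from Python (an optional sanity check):

from google.cloud import bigquery

# Print the installed client library version
print(bigquery.__version__)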

Step 1: Authenticate and Initialize Client

import os
from google.cloud import bigquery

# Authenticate using your service account key
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/your-key.json"

# Initialize the BigQuery client
client = bigquery.Client()
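
If you prefer not to rely on the environment variable, the client also accepts credentials passed in explicitly. A minimal sketch, assuming the same key file path as above:

from google.oauth2 import service_account
from google.cloud import bigquery

# Build credentials directly from the service account key file
credentials = service_account.Credentials.from_service_account_file("path/to/your-key.json")

# Pass the credentials (and project) to the client explicitly
client = bigquery.Client(credentials=credentials, project=credentials.project_id)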

Step 2: Define Table ID

Use the format: project_id.dataset_id.table_id

table_id = "your-project-id.my_dataset.customers"
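
Before inserting, it can help to confirm that the table exists and has the schema you expect; get_table() raises NotFound if the table ID is wrong. A quick, optional check:

from google.api_core.exceptions import NotFound

try:
    table = client.get_table(table_id)  # Fetch the table's metadata
    print(f"Table found with {len(table.schema)} columns.")
except NotFound:
    print(f"Table {table_id} does not exist; create it first.")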

Step 3: Insert Data Rows

Method: insert_rows_json()

This method inserts JSON-like Python dictionaries (rows) into the table.

Example: Insert Multiple Rows

rows_to_insert = [
    {"id": 1, "name": "Alice", "email": "[email protected]", "signup_date": "2023-01-01"},
    {"id": 2, "name": "Bob", "email": "[email protected]", "signup_date": "2023-02-15"},
]

Insert the Rows

errors = client.insert_rows_json(table_id, rows_to_insert)

if errors == []:
    print("✅ Data inserted successfully.")
else:
    print("❌ Errors occurred while inserting rows:")
    print(errors)
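
The same call works for a single row, and you can optionally pass row_ids so BigQuery can deduplicate retried requests on a best-effort basis. A small sketch (the row ID value is an arbitrary example):

single_row = [
    {"id": 3, "name": "Charlie", "email": "[email protected]", "signup_date": "2023-03-10"},
]

# row_ids provides a per-row deduplication key for retries
errors = client.insert_rows_json(table_id, single_row, row_ids=["customer-3"])

if not errors:
    print("✅ Single row inserted.")
else:
    print("❌ Insert failed:", errors)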

Full Example: Insert Data into BigQuery Table

import os
from google.cloud import bigquery

# Set credentials
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/your-key.json"

# Initialize client
client = bigquery.Client()

# Define table ID
table_id = "your-project-id.my_dataset.customers"

# Define rows to insert
rows_to_insert = [
    {"id": 1, "name": "Alice", "email": "[email protected]", "signup_date": "2023-01-01"},
    {"id": 2, "name": "Bob", "email": "[email protected]", "signup_date": "2023-02-15"},
    {"id": 3, "name": "Charlie", "email": "[email protected]", "signup_date": "2023-03-10"},
]

# Insert data
errors = client.insert_rows_json(table_id, rows_to_insert)

# Check for errors
if errors == []:
    print("✅ Data inserted successfully.")
else:
    print("❌ Errors occurred:")
    for error in errors:
        print(error)

⚙️ Alternative: Load Data from a CSV/JSON File

You can load data from a local file by running a load job with load_table_from_file().

from google.cloud import bigquery

# client and table_id are the same objects defined in the earlier steps

# Configure the load job for a CSV file with a header row
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,  # Skip the header row
    autodetect=True,      # Let BigQuery infer the schema
)

with open("customers.csv", "rb") as source_file:
    load_job = client.load_table_from_file(source_file, table_id, job_config=job_config)

load_job.result()  # Wait for the job to complete

print("✅ Loaded data from CSV file.")

Tips for Inserting Data into BigQuery

  • Use insert_rows_json() for small batches: best for real-time inserts.

  • Use load_table_from_file() for bulk inserts: more efficient for large datasets.

  • Validate the data schema: data must match the table schema exactly.

  • Use autodetect=True only when loading files: BigQuery will infer the column types.

  • Handle and log errors: always check the returned errors list.

⚠️ Common Pitfalls

  • Invalid field type: ensure each field matches the table schema.

  • 403 Permission Denied: grant the BigQuery Data Editor or Admin role.

  • Row insert fails silently: always check the errors list that is returned.

  • Too many rows in one insert: keep batches small (under 10,000 rows); see the chunking sketch below.

  • Timezone mismatch in date/time fields: use ISO 8601 format (e.g., 2023-01-01T12:00:00Z).
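
One simple way to keep batches small is to split the rows into fixed-size chunks before calling insert_rows_json(). A minimal sketch (the 500-row chunk size is just an illustrative choice):

def insert_in_chunks(client, table_id, rows, chunk_size=500):
    """Insert rows in small batches and collect any per-row errors."""
    all_errors = []
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        all_errors.extend(client.insert_rows_json(table_id, chunk))
    return all_errors

errors = insert_in_chunks(client, table_id, rows_to_insert)
if not errors:
    print("✅ All chunks inserted.")
else:
    print("❌ Some rows failed:", errors)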

Optional: Use Pandas DataFrame to Insert Data

import pandas as pd

# Requires pandas (and, in practice, pyarrow) to be installed
df = pd.DataFrame([
    {"id": 4, "name": "Diana", "email": "[email protected]", "signup_date": "2023-04-01"},
])

# load_table_from_dataframe runs a load job under the hood
job = client.load_table_from_dataframe(df, table_id)
job.result()  # Wait for the job to complete
print("✅ Inserted data from DataFrame.")

Conclusion

Inserting data into BigQuery with Python is straightforward and scales well. Whether you're streaming small batches of rows, bulk loading files, or loading DataFrames, Python and the BigQuery client library provide robust tools to get the job done.

This foundational knowledge will help you build pipelines that insert clean, structured data into your cloud warehouse automatically.