In Google BigQuery, you can delete specific rows from a table using the DELETE
statement. Although BigQuery is traditionally optimized for append-only data operations (like logs or analytics data), it does support row-level deletes.
This article covers:
-
Prerequisites and setup
-
Syntax of
DELETE
in BigQuery -
How to use
DELETE
in Python -
Parameterized queries
-
Tips and common pitfalls
-
A full working example
✅ Prerequisites
Before running DELETE
queries:
-
You must have write permissions to the dataset/table.
-
Your table must not be a view or external table.
-
The BigQuery API should be enabled.
Install the BigQuery Python Client
pip install google-cloud-bigquery
Step 1: Authenticate and Create a Client
import os
from google.cloud import bigquery
# Set the path to your service account key
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/your-key.json"
# Create a BigQuery client
client = bigquery.Client()
BigQuery DELETE
Statement Syntax
DELETE FROM `project_id.dataset_id.table_id`
WHERE condition
Example:
DELETE FROM `my-project.sales.customers`
WHERE status = 'inactive'
▶️ Step 2: Execute a DELETE Statement in Python
query = """
DELETE FROM `my-project.sales.customers`
WHERE status = 'inactive'
"""
query_job = client.query(query)
query_job.result() # Wait for the job to complete
print("Rows matching condition deleted successfully.")
Step 3: Use Parameterized DELETE Queries
To avoid SQL injection and improve maintainability, use query parameters.
query = """
DELETE FROM `my-project.sales.customers`
WHERE signup_date < @cutoff_date
"""
job_config = bigquery.QueryJobConfig(
query_parameters=[
bigquery.ScalarQueryParameter("cutoff_date", "DATE", "2022-01-01")
]
)
query_job = client.query(query, job_config=job_config)
query_job.result()
print("Old rows deleted successfully.")
Full Example: DELETE Rows Where Age Is Over a Threshold
import os
from google.cloud import bigquery
# Authenticate
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/your-key.json"
client = bigquery.Client()
# DELETE query with a parameter
query = """
DELETE FROM `my-project.user_data.customers`
WHERE age > @max_age
"""
# Configure parameters
job_config = bigquery.QueryJobConfig(
query_parameters=[
bigquery.ScalarQueryParameter("max_age", "INT64", 100)
]
)
# Run the query
query_job = client.query(query, job_config=job_config)
query_job.result()
print("Deleted rows where age > 100.")
Tips for Using DELETE in BigQuery
Tip | Why it matters |
---|---|
Always test with a SELECT first |
Avoid unintentional data loss |
Use LIMIT with caution |
BigQuery does not support LIMIT in DELETE |
Use partitioned tables |
Makes deleting ranges (e.g., by date) faster |
Keep an audit log | Back up or log deleted rows for traceability |
Use query parameters | Prevents SQL injection and is easier to maintain |
⚠️ Common Pitfalls
Issue | Solution |
---|---|
Cannot delete from view | Ensure you are using a table, not a view |
Missing permissions | Grant BigQuery Data Editor or Owner role |
DELETE takes too long | Use partitioning and targeted WHERE clauses |
No rows deleted | Check your WHERE condition logic carefully |
Modifying streaming buffer data | Wait for streaming data to become available in storage |
How to Verify Deleted Rows
You can check if the rows are gone using a SELECT
query:
query = """
SELECT * FROM `my-project.sales.customers`
WHERE status = 'inactive'
"""
results = client.query(query).result()
if list(results):
print("Some inactive users still exist.")
else:
print("All inactive users have been deleted.")
Conclusion
The DELETE
operation in BigQuery using Python is straightforward when used properly:
✅ Use safe filtering
✅ Leverage parameters
✅ Avoid deleting large volumes of unpartitioned data
BigQuery is optimized for append-only workloads, but when deletions are needed (e.g., for GDPR compliance, cleanup, or corrections), the tools are powerful and flexible.