Amazon DynamoDB is a fully managed NoSQL database that offers fast and predictable performance with seamless scalability. In this tutorial, you'll learn how to query data from a DynamoDB table using Python and Boto3, the AWS SDK for Python.
Table of Contents
- What is a DynamoDB Query?
- Prerequisites
- Set Up: Install Boto3 & Configure AWS
- DynamoDB Query Syntax
- Query Using Partition Key
- Query Using Partition and Sort Key
- Query with FilterExpression
- Query with Pagination
- Complete Example
- Tips and Best Practices
- Common Pitfalls
- Conclusion
1. What is a DynamoDB Query?
A Query in DynamoDB retrieves the items that share a partition key value. You can optionally narrow the results with a sort key condition and filter on other attributes.
- More efficient than scan() because it targets specific partitions.
- Returns zero or more items that match the key condition.
- You can apply additional filters using FilterExpression.
⚙️ 2. Prerequisites
Ensure you have:
- An AWS account
- A DynamoDB table (e.g., Users) with a partition key (e.g., user_id) and an optional sort key
- Python 3.7+
- AWS CLI configured with access keys
3. Set Up: Install Boto3 & Configure AWS
Install Boto3
pip install boto3
Configure AWS Credentials
aws configure
4. DynamoDB Query Syntax (with Boto3)
Basic usage:
response = table.query(
    KeyConditionExpression=Key('partition_key').eq('value')
)
Advanced:
response = table.query(
    KeyConditionExpression=Key('partition_key').eq('value') & Key('sort_key').begins_with('prefix'),
    FilterExpression=Attr('age').gt(30),
    Limit=10
)
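The same query can also be written with plain expression strings, which is the form the low-level client uses. The sketch below only builds the keyword arguments and does not contact AWS. The attribute names user_id, timestamp, and age are assumptions carried over from this tutorial's examples, and build_query_kwargs is a hypothetical helper name. Because timestamp is a DynamoDB reserved word, it is aliased as #ts through ExpressionAttributeNames.

```python
def build_query_kwargs(user_id, ts_prefix, min_age, limit=10):
    """Build string-expression arguments for Table.query (hypothetical helper)."""
    return {
        # Partition key equality plus a sort key prefix match
        "KeyConditionExpression": "user_id = :uid AND begins_with(#ts, :prefix)",
        # Applied after the key condition, to non-key attributes only
        "FilterExpression": "age > :min_age",
        # Alias for the reserved word "timestamp"
        "ExpressionAttributeNames": {"#ts": "timestamp"},
        "ExpressionAttributeValues": {
            ":uid": user_id,
            ":prefix": ts_prefix,
            ":min_age": min_age,
        },
        # Maximum number of items evaluated per request, before filtering
        "Limit": limit,
    }


kwargs = build_query_kwargs("001", "2024-12", 30)
# With a boto3 Table resource: response = table.query(**kwargs)
```

The Key and Attr builders generate these placeholders for you, so the string form is mostly useful when you need to read or share expressions written for the low-level client.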
5. Query Using Partition Key Only
Assume your table is called Users and has:
- Partition Key: user_id
- Sort Key: timestamp (optional)
Example
import boto3
from boto3.dynamodb.conditions import Key

# Connect to the table
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('Users')

# Fetch every item whose partition key equals '001'
response = table.query(
    KeyConditionExpression=Key('user_id').eq('001')
)

for item in response['Items']:
    print(item)
6. Query Using Partition and Sort Key
Suppose you want to get all login records of a user where timestamp begins with '2024-12'.
response = table.query(
    KeyConditionExpression=Key('user_id').eq('001') & Key('timestamp').begins_with('2024-12')
)
Other sort key operators include:
- eq(): Equals
- lt(), lte(), gt(), gte(): Comparison
- between(val1, val2)
- begins_with('prefix')
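When the range you need does not line up with a string prefix, between() is the usual alternative. The sketch below computes inclusive bounds for a span of days. It assumes timestamps are stored as full ISO 8601 strings with a time component (for example 2024-12-05T08:30:00), which this tutorial's begins_with('2024-12') example suggests but does not state. sort_key_bounds is a hypothetical helper name.

```python
from datetime import date, timedelta


def sort_key_bounds(first_day, last_day):
    """Return (start, end) sort key bounds covering first_day..last_day inclusive."""
    # A date-only string sorts before any full timestamp on the same day
    start = first_day.isoformat()
    # between() is inclusive, so use the start of the following day as the
    # upper bound: every full timestamp on last_day sorts before it
    end = (last_day + timedelta(days=1)).isoformat()
    return start, end


start, end = sort_key_bounds(date(2024, 12, 1), date(2024, 12, 31))
# With boto3: Key('user_id').eq('001') & Key('timestamp').between(start, end)
```

One caveat under these assumptions: an item whose sort key is exactly the date-only string 2025-01-01 would also match, so this only fits tables where every timestamp carries a time part.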
7. Query with FilterExpression
Use filters to narrow down results after the key condition is applied.
from boto3.dynamodb.conditions import Attr

response = table.query(
    KeyConditionExpression=Key('user_id').eq('001'),
    # Keep only items whose age attribute is greater than 25
    FilterExpression=Attr('age').gt(25)
)

for item in response['Items']:
    print(item)
⚠️ FilterExpression is applied after items are read from the table, so it does not reduce read capacity usage.
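One way to see this effect is to compare the Count and ScannedCount fields that every Query response carries: ScannedCount is the number of items evaluated before the filter, and Count is the number left afterwards. The sketch below works on a hand-written sample response, so the numbers are made up for illustration. filter_stats is a hypothetical helper name.

```python
def filter_stats(response):
    """Summarize how much of one Query response page the filter discarded."""
    scanned = response["ScannedCount"]  # items evaluated before filtering
    returned = response["Count"]        # items left after FilterExpression
    return {
        "scanned": scanned,
        "returned": returned,
        "discarded": scanned - returned,
    }


# Sample response shape for illustration only; the numbers are made up
sample_response = {"Items": [{"user_id": "001"}] * 3, "Count": 3, "ScannedCount": 12}
stats = filter_stats(sample_response)
print(stats)
```

A large discarded value on real traffic is a hint that the attribute being filtered might belong in the sort key or in an index instead.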
8. Query with Pagination
A single query() call returns at most 1 MB of data, or at most Limit items if you set one. When more matching items remain, the response includes a LastEvaluatedKey, which you pass as ExclusiveStartKey to fetch the next page.
items = []
query_kwargs = {'KeyConditionExpression': Key('user_id').eq('001')}

while True:
    response = table.query(**query_kwargs)
    items.extend(response['Items'])

    # LastEvaluatedKey is present only when more results remain
    last_evaluated_key = response.get('LastEvaluatedKey')
    if not last_evaluated_key:
        break

    # Resume the next page where this one stopped
    query_kwargs['ExclusiveStartKey'] = last_evaluated_key

print("Total items retrieved:", len(items))
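If you paginate in several places, the loop above can be wrapped in a generator so callers simply iterate over items. This is a minimal sketch: iter_query_items is a hypothetical helper, and FakeTable is a stand-in that serves canned pages so the example runs without AWS access. With a real boto3 Table you would pass the table object instead.

```python
def iter_query_items(table, **query_kwargs):
    """Yield every item matching a query, following LastEvaluatedKey across pages."""
    kwargs = dict(query_kwargs)  # copy so the caller's arguments are not mutated
    while True:
        response = table.query(**kwargs)
        yield from response.get("Items", [])
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            return
        # Ask the next request to start right after the last item seen
        kwargs["ExclusiveStartKey"] = last_key


class FakeTable:
    """Stand-in for a boto3 Table that serves canned pages (illustration only)."""

    def __init__(self, pages):
        self.pages = pages
        self.calls = []

    def query(self, **kwargs):
        self.calls.append(kwargs)
        return self.pages[len(self.calls) - 1]


fake = FakeTable([
    {"Items": [{"id": 1}, {"id": 2}], "LastEvaluatedKey": {"id": 2}},
    {"Items": [{"id": 3}]},
])
all_items = list(iter_query_items(fake, KeyConditionExpression="user_id = :uid"))
print(len(all_items))  # 3
```

Because the generator is lazy, a caller that stops iterating early also stops issuing requests, which keeps read usage down when only the first few matches are needed.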
✅ 9. Complete Example
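import boto3
from boto3.dynamodb.conditions import Key, Attr

def query_users(user_id, min_age=None):
    """Return all items for user_id, optionally keeping only those with age >= min_age."""
    dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
    table = dynamodb.Table('Users')

    query_kwargs = {'KeyConditionExpression': Key('user_id').eq(user_id)}
    # Compare against None so that min_age=0 still applies the filter
    if min_age is not None:
        query_kwargs['FilterExpression'] = Attr('age').gte(min_age)

    items = []
    while True:
        response = table.query(**query_kwargs)
        items.extend(response['Items'])
        # Keep reading until DynamoDB stops returning a LastEvaluatedKey
        if 'LastEvaluatedKey' not in response:
            return items
        query_kwargs['ExclusiveStartKey'] = response['LastEvaluatedKey']

users = query_users('001', min_age=30)
for user in users:
    print(user)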
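When you print or serialize the returned items, note that the boto3 resource interface returns DynamoDB numbers as decimal.Decimal values, which json.dumps rejects by default. The sketch below shows one possible fallback converter. The sample item is made up, and decimal_default is a hypothetical helper name.

```python
import json
from decimal import Decimal


def decimal_default(value):
    """json.dumps fallback: convert Decimal to int when whole, else float."""
    if isinstance(value, Decimal):
        return int(value) if value == value.to_integral_value() else float(value)
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


# Sample item shaped like a boto3 resource result; the values are made up
item = {"user_id": "001", "age": Decimal("30"), "score": Decimal("4.5")}
as_json = json.dumps(item, default=decimal_default)
print(as_json)
```

Converting to float can lose precision for values with many digits, so this fits display and logging better than round-tripping data back into the table.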
10. Tips and Best Practices
Tip | Explanation |
---|---|
✅ Use query() over scan() | Faster and cheaper |
✅ Design efficient partition/sort keys | Enables precise queries |
✅ Use begins_with() for time-series data | Good for log/event use cases |
✅ Use pagination for large results | Prevents timeouts and resource usage spikes |
Secure access with IAM policies | Protects from overuse and data leaks |
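As one illustration of why sort key design matters: when the sort key is a timestamp, the newest record for a partition key can be read with a single-item query by reversing the sort order. The sketch below uses the ScanIndexForward and Limit parameters of query(). latest_item is a hypothetical helper, and StubTable is a stand-in that imitates the ordering behavior so the example runs without AWS access.

```python
def latest_item(table, user_id):
    """Return the item with the highest sort key for user_id, or None."""
    response = table.query(
        KeyConditionExpression="user_id = :uid",
        ExpressionAttributeValues={":uid": user_id},
        ScanIndexForward=False,  # descending sort key order, newest first
        Limit=1,                 # stop after the first item
    )
    items = response.get("Items", [])
    return items[0] if items else None


class StubTable:
    """Stand-in for a boto3 Table that orders items by timestamp (illustration only)."""

    def __init__(self, items):
        self.items = items
        self.last_kwargs = None

    def query(self, **kwargs):
        self.last_kwargs = kwargs
        descending = not kwargs.get("ScanIndexForward", True)
        ordered = sorted(self.items, key=lambda i: i["timestamp"], reverse=descending)
        return {"Items": ordered[: kwargs.get("Limit", len(ordered))]}


stub = StubTable([
    {"user_id": "001", "timestamp": "2024-12-01T10:00:00"},
    {"user_id": "001", "timestamp": "2024-12-05T09:00:00"},
    {"user_id": "001", "timestamp": "2024-12-03T18:30:00"},
])
newest = latest_item(stub, "001")
print(newest["timestamp"])
```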
⚠️ 11. Common Pitfalls
Pitfall | Solution |
---|---|
❌ Using query() without partition key | Required for all queries |
❌ Expecting FilterExpression to save costs | It doesn't reduce reads |
❌ Not handling pagination | Use LastEvaluatedKey to paginate results |
❌ Assuming query returns a single item | Use get_item() for that case |
❌ Misunderstanding sort key use | Sort key enables range queries, not filters |
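For the single-item case in the table above, a get_item() call with the full primary key might look like the sketch below. It assumes the Users table from this tutorial with user_id as partition key and timestamp as sort key. get_login is a hypothetical helper, and StubTable is a stand-in so the example runs without AWS access.

```python
def get_login(table, user_id, timestamp):
    """Fetch exactly one item by its full primary key, or None if it is absent."""
    response = table.get_item(Key={"user_id": user_id, "timestamp": timestamp})
    # get_item omits the "Item" key entirely when nothing matches
    return response.get("Item")


class StubTable:
    """Stand-in for a boto3 Table backed by a dict (illustration only)."""

    def __init__(self, items):
        # Index items by their (partition key, sort key) pair
        self.by_key = {(i["user_id"], i["timestamp"]): i for i in items}

    def get_item(self, Key):
        found = self.by_key.get((Key["user_id"], Key["timestamp"]))
        return {"Item": found} if found is not None else {}


stub = StubTable([{"user_id": "001", "timestamp": "2024-12-05T09:00:00", "age": 31}])
hit = get_login(stub, "001", "2024-12-05T09:00:00")
miss = get_login(stub, "001", "1999-01-01T00:00:00")
print(hit, miss)
```

Unlike query(), get_item() needs every key attribute of the primary key, so on a table with a sort key both values must be supplied.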
12. Conclusion
DynamoDB’s query() operation is the most efficient way to fetch a group of items when you know the partition key. By combining it with sort key conditions and filters, you can retrieve precisely the data you need with minimal overhead.
Next Steps
- Learn how to use Global Secondary Indexes (GSI) to query by non-key attributes
- Explore Update and Delete operations
- Set up DynamoDB Streams for real-time change tracking