Python DynamoDB: Querying Data with Boto3



Amazon DynamoDB is a fully managed NoSQL database that offers fast and predictable performance with seamless scalability. In this tutorial, you'll learn how to query data from a DynamoDB table using Python and Boto3, the AWS SDK for Python.


Table of Contents

  1. What is a DynamoDB Query?

  2. Prerequisites

  3. Set Up: Install Boto3 & Configure AWS

  4. DynamoDB Query Syntax (with Boto3)

  5. Query Using Partition Key Only

  6. Query Using Partition and Sort Key

  7. Query with FilterExpression

  8. Query with Pagination

  9. Complete Example

  10. Tips and Best Practices

  11. Common Pitfalls

  12. Conclusion


1. What is a DynamoDB Query?

A Query in DynamoDB retrieves the items that share a given partition key value. You can optionally narrow the results with a condition on the sort key and filter them further on other attributes.

  • More efficient than scan() because it reads only the items under one partition key value instead of the whole table.

  • Returns zero or more items that match the key condition.

  • You can apply additional filters using FilterExpression.
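
To make the contrast with scan() concrete, here is a toy in-memory model. It is only a mental picture, not how DynamoDB is implemented and not a Boto3 API: items are grouped by partition key, so a query examines one group while a scan examines every item. The names toy_table, toy_query, and toy_scan are made up for this illustration.

```python
# Toy model only: items grouped by partition key, to picture why a query
# examines fewer items than a scan. None of these names exist in Boto3.
toy_table = {
    "001": [
        {"user_id": "001", "timestamp": "2024-11-30"},
        {"user_id": "001", "timestamp": "2024-12-01"},
    ],
    "002": [
        {"user_id": "002", "timestamp": "2024-12-05"},
    ],
}


def toy_query(table, partition_value):
    """Return (items, examined): only one partition's items are examined."""
    items = table.get(partition_value, [])
    return items, len(items)


def toy_scan(table, predicate):
    """Return (matches, examined): every item in every partition is examined."""
    examined = 0
    matches = []
    for partition_items in table.values():
        for item in partition_items:
            examined += 1
            if predicate(item):
                matches.append(item)
    return matches, examined
```

With this data, looking up user 002 through toy_query examines one item, while finding the same item through toy_scan examines all three.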


2. Prerequisites

Ensure you have:

  • An AWS account

  • A DynamoDB table (e.g., Users) with a partition key (e.g., user_id) and optional sort key

  • Python 3.7+

  • AWS CLI configured with access keys


3. Set Up: Install Boto3 & Configure AWS

Install Boto3

pip install boto3

Configure AWS Credentials

aws configure

4. DynamoDB Query Syntax (with Boto3)

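Basic usage (table is a Boto3 Table resource, created as shown in section 5):

from boto3.dynamodb.conditions import Key

response = table.query(
    KeyConditionExpression=Key('partition_key').eq('value')
)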

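Advanced:

from boto3.dynamodb.conditions import Key, Attr

response = table.query(
    KeyConditionExpression=Key('partition_key').eq('value') & Key('sort_key').begins_with('prefix'),
    FilterExpression=Attr('age').gt(30),  # applied after the key condition
    Limit=10  # maximum items evaluated, counted before the filter runs
)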
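
The Key() and Attr() helpers build expression strings for you. table.query() also accepts those expressions written by hand, together with ExpressionAttributeNames and ExpressionAttributeValues. The sketch below shows that form for the advanced call above. The helper name build_query_kwargs and the attribute names user_id, timestamp, and age are assumptions chosen for illustration.

```python
def build_query_kwargs(user_id, ts_prefix=None, min_age=None, limit=None):
    """Build keyword arguments for table.query() using string expressions.

    Hypothetical helper: the attribute names (user_id, timestamp, age)
    are assumed for illustration and must match your own table.
    """
    names = {"#uid": "user_id"}
    values = {":uid": user_id}
    key_condition = "#uid = :uid"

    if ts_prefix is not None:
        # "timestamp" is a DynamoDB reserved word, so it needs a # placeholder.
        names["#ts"] = "timestamp"
        values[":prefix"] = ts_prefix
        key_condition += " AND begins_with(#ts, :prefix)"

    kwargs = {
        "KeyConditionExpression": key_condition,
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }
    if min_age is not None:
        names["#age"] = "age"
        values[":min_age"] = min_age
        kwargs["FilterExpression"] = "#age > :min_age"
    if limit is not None:
        kwargs["Limit"] = limit
    return kwargs
```

A call such as table.query(**build_query_kwargs('001', ts_prefix='2024-12', min_age=30, limit=10)) should then behave like the builder-based version. The builder form is usually easier to read; the string form is what you will see in the AWS CLI and in the low-level client.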

5. Query Using Partition Key Only

Assume your table is called Users and has:

  • Partition Key: user_id

  • Sort Key: timestamp (optional)

Example

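import boto3
from boto3.dynamodb.conditions import Key

# Connect to DynamoDB and get a handle on the Users table.
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('Users')

# Fetch every item whose partition key user_id equals '001'.
response = table.query(
    KeyConditionExpression=Key('user_id').eq('001')
)

# Matching items are returned as a list of dicts under 'Items'.
for item in response['Items']:
    print(item)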

6. Query Using Partition and Sort Key

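Suppose you want to get all login records of a user where timestamp begins with '2024-12'. This works only if timestamp is stored as a string (for example, an ISO 8601 value such as '2024-12-01T09:30:00'), because begins_with() cannot be used on a numeric sort key.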

response = table.query(
    KeyConditionExpression=Key('user_id').eq('001') & Key('timestamp').begins_with('2024-12')
)

Other sort key operators include:

  • eq() — Equals

  • lt(), lte(), gt(), gte() — Comparison

  • between(val1, val2)

  • begins_with('prefix')
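
As a sketch of two of these operators in practice, the helpers below query an inclusive range with BETWEEN and fetch the most recent item by reading the sort key in descending order. They use string expressions so the block needs no imports. The function names and the user_id and timestamp attributes are assumptions for illustration, and table is expected to be a Boto3 Table resource.

```python
def logins_between(table, user_id, start, end):
    """Return one user's items whose sort key lies in [start, end], inclusive.

    Only the first page of results is returned.
    """
    response = table.query(
        KeyConditionExpression="#uid = :uid AND #ts BETWEEN :start AND :end",
        # "timestamp" is a reserved word, hence the #ts placeholder.
        ExpressionAttributeNames={"#uid": "user_id", "#ts": "timestamp"},
        ExpressionAttributeValues={":uid": user_id, ":start": start, ":end": end},
    )
    return response["Items"]


def latest_login(table, user_id):
    """Return the item with the highest sort key for this user, or None."""
    response = table.query(
        KeyConditionExpression="#uid = :uid",
        ExpressionAttributeNames={"#uid": "user_id"},
        ExpressionAttributeValues={":uid": user_id},
        ScanIndexForward=False,  # read in descending sort key order
        Limit=1,                 # evaluate a single item
    )
    items = response["Items"]
    return items[0] if items else None
```

For string timestamps in ISO 8601 form, lexicographic order matches chronological order, which is why the "highest" sort key is also the latest login.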


7. Query with FilterExpression

Use filters to narrow down results after the key condition is applied.

from boto3.dynamodb.conditions import Key, Attr

response = table.query(
    KeyConditionExpression=Key('user_id').eq('001'),
    FilterExpression=Attr('age').gt(25)  # keep only items with age > 25
)

for item in response['Items']:
    print(item)

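⚠️ FilterExpression is applied after the matching items have been read, so it does not reduce read capacity usage. You consume capacity for every item the key condition matches, including the ones the filter discards.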
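
One way to see this effect is to compare the Count and ScannedCount fields that every query response carries. The small helper below is a sketch; the name filter_stats is made up, and the ConsumedCapacity field is present only if you pass ReturnConsumedCapacity='TOTAL' to table.query().

```python
def filter_stats(response):
    """Summarize how many items a query read versus how many it returned.

    Hypothetical helper that takes the dict returned by table.query().
    """
    scanned = response["ScannedCount"]  # items that matched the key condition
    returned = response["Count"]        # items left after FilterExpression
    # ConsumedCapacity is only included when it was requested.
    capacity = response.get("ConsumedCapacity", {}).get("CapacityUnits")
    return {
        "scanned": scanned,
        "returned": returned,
        "discarded": scanned - returned,
        "capacity_units": capacity,
    }
```

A large "discarded" value suggests that the filter is doing work a better sort key or an index could do more cheaply.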


8. Query with Pagination

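A single Query response returns at most 1 MB of data, or at most Limit items if you set one. When more results remain, the response includes a LastEvaluatedKey; pass it as ExclusiveStartKey in the next call to continue where the previous one stopped.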

items = []
query_kwargs = {'KeyConditionExpression': Key('user_id').eq('001')}

while True:
    response = table.query(**query_kwargs)
    items.extend(response['Items'])

    # LastEvaluatedKey is present only when more results remain.
    last_evaluated_key = response.get('LastEvaluatedKey')
    if not last_evaluated_key:
        break
    # Resume the next request where this one stopped.
    query_kwargs['ExclusiveStartKey'] = last_evaluated_key

print("Total items retrieved:", len(items))
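
Because Limit caps the number of items evaluated per request, before any FilterExpression runs, a single call with Limit=N can return fewer than N matches even when more exist. The sketch below collects up to a fixed number of items across pages. The name query_up_to is hypothetical, and table can be any object with a query() method that behaves like a Boto3 Table resource.

```python
def query_up_to(table, max_items, **query_kwargs):
    """Collect at most max_items items, following LastEvaluatedKey as needed.

    Hypothetical helper: query_kwargs are passed through to table.query().
    """
    items = []
    kwargs = dict(query_kwargs)  # copy so the caller's dict is not modified
    while len(items) < max_items:
        response = table.query(**kwargs)
        items.extend(response["Items"])
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            break  # no more pages
        kwargs["ExclusiveStartKey"] = last_key
    # The last page may have overshot, so trim to the requested size.
    return items[:max_items]
```

Stopping early like this avoids reading pages you do not need, which also saves read capacity.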

9. Complete Example

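import boto3
from boto3.dynamodb.conditions import Key, Attr

def query_users(user_id, min_age=None):
    dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
    table = dynamodb.Table('Users')

    # Build the query arguments once; add the filter only when requested.
    kwargs = {'KeyConditionExpression': Key('user_id').eq(user_id)}
    if min_age is not None:  # "is not None" so that min_age=0 still filters
        kwargs['FilterExpression'] = Attr('age').gte(min_age)

    # Follow LastEvaluatedKey so results beyond the first page are included.
    items = []
    while True:
        response = table.query(**kwargs)
        items.extend(response['Items'])
        if 'LastEvaluatedKey' not in response:
            return items
        kwargs['ExclusiveStartKey'] = response['LastEvaluatedKey']

users = query_users('001', min_age=30)
for user in users:
    print(user)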

10. Tips and Best Practices

Tip | Explanation
✅ Use query() over scan() | Faster and cheaper
✅ Design efficient partition/sort keys | Enables precise queries
✅ Use begins_with() for time-series data | Good for log/event use cases
✅ Use pagination for large results | Prevents timeouts and resource usage spikes
Secure access with IAM policies | Protects from overuse and data leaks

11. Common Pitfalls

Pitfall | Solution
❌ Using query() without partition key | A partition key equality condition is required for every query
❌ Expecting FilterExpression to save costs | It doesn't reduce reads; narrow results with key conditions instead
❌ Not handling pagination | Use LastEvaluatedKey to paginate results
❌ Assuming query returns a single item | Use get_item() when you know the full primary key
❌ Misunderstanding sort key use | Sort key conditions belong in KeyConditionExpression and enable range queries; they are not filters
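
For the single-item case, a minimal sketch with get_item() looks like the following. It assumes the Users table uses user_id as partition key and timestamp as sort key, as in the earlier sections; the function name get_login is made up, and table is expected to be a Boto3 Table resource.

```python
def get_login(table, user_id, timestamp):
    """Fetch exactly one item by its full primary key, or None if absent.

    Assumes a composite key of user_id (partition) and timestamp (sort).
    """
    response = table.get_item(
        Key={"user_id": user_id, "timestamp": timestamp}
    )
    # get_item omits the "Item" field entirely when nothing matches.
    return response.get("Item")
```

Unlike query(), get_item() needs every part of the primary key, so on a table with a sort key you must supply both values.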

12. Conclusion

DynamoDB’s query() operation is an efficient way to fetch multiple items when you know the partition key. By combining it with sort key conditions and filters, and by following LastEvaluatedKey across pages, you can retrieve exactly the items you need without scanning the whole table.


Next Steps

  • Learn how to use Global Secondary Indexes (GSI) to query by non-key attributes

  • Explore Update and Delete operations

  • Set up DynamoDB Streams for real-time change tracking