DynamoDB Data Modeling Overview
Jul 29, 2024
Notes on DynamoDB Data Modeling
Introduction
DynamoDB is AWS's NoSQL database service.
Data modeling in DynamoDB differs from traditional relational database modeling.
Newcomers often try to simulate relational data modeling, leading to higher costs.
Proper design can reduce AWS bills and keep latency in the millisecond range at scale (from 1 GB to 10 TB of data).
What is Data Modeling?
Data modeling refers to how an application stores data related to real-world entities.
Two types of databases:
Relational Databases (SQL): e.g., MySQL, Oracle, Microsoft SQL Server
NoSQL Databases: Optimized for different use cases.
Relational Databases
Use data normalization to split data across multiple tables to reduce redundancy.
Performance can degrade as the database scales, because queries grow more complex and join across multiple tables.
Typically have a strict schema.
NoSQL Databases
Optimized for compute rather than storage, since storage is now cheap relative to compute.
Allow duplicated (denormalized) data and minimize table joins, reducing the compute needed for data retrieval.
Provide flexible schemas that can accommodate diverse data structures.
Scale horizontally very well.
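To make the contrast concrete, here is a minimal sketch in plain Python (no database involved). The organization and project names are made up for illustration. The normalized version has to combine two collections at read time, while the denormalized version duplicates the organization name onto each project so one lookup is enough.

```python
# Normalized (relational style): two "tables" that must be joined at read time.
organizations = {"org1": {"name": "Acme", "tier": "Premium"}}
projects = [
    {"id": "p1", "org_id": "org1", "name": "Website"},
    {"id": "p2", "org_id": "org1", "name": "Mobile App"},
]


def projects_with_org_name_joined(org_id):
    """Simulate a join: look up the organization, then scan for its projects."""
    org = organizations[org_id]
    return [
        {"project": p["name"], "org_name": org["name"]}
        for p in projects
        if p["org_id"] == org_id
    ]


# Denormalized (NoSQL style): the organization name is duplicated onto every
# project item, so a single lookup returns everything the read needs.
project_items = {
    "org1": [
        {"project": "Website", "org_name": "Acme"},
        {"project": "Mobile App", "org_name": "Acme"},
    ]
}


def projects_with_org_name_denormalized(org_id):
    """One lookup, no join; the cost is redundant storage."""
    return project_items[org_id]
```

Both functions return the same data; the trade-off is extra storage and update work in exchange for cheaper reads.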
Five-Step Process for DynamoDB Data Modeling
Draw an Entity Diagram: Identifies main entities for your application.
Identify Relationships: Understand how entities are related (one-to-many, many-to-many).
List Access Patterns: Identify CRUD operations and data retrieval needs.
Decide on Primary Keys and Indexes: Choose effective primary keys that satisfy access patterns.
Identify Secondary Indexes if Needed: Use Global Secondary Indexes (GSIs) or Local Secondary Indexes (LSIs) for additional access patterns.
Practical Example: Multi-Tenant Project Management Tool
Entities: Organization, Projects, Employees
Attributes:
Organization: ID, Name, Tier
Projects: ID, Name, Type (Agile/Fixed Bid), Status
Employees: ID, Name, Date of Birth, Email
Relationships:
One-to-many: Organization to Projects
One-to-many: Organization to Employees
Many-to-many: Employees to Projects (requires an additional entity: Project-Employees).
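The entities and relationships above can be sketched as plain Python dataclasses. This is only an illustration of the model, not DynamoDB code: the `org_id` fields (used to express the one-to-many links) and all sample values are assumptions, not part of the original notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Organization:
    id: str
    name: str
    tier: str


@dataclass(frozen=True)
class Project:
    id: str
    org_id: str          # assumed field: one organization has many projects
    name: str
    project_type: str    # "Agile" or "Fixed Bid"
    status: str


@dataclass(frozen=True)
class Employee:
    id: str
    org_id: str          # assumed field: one organization has many employees
    name: str
    date_of_birth: str
    email: str


@dataclass(frozen=True)
class ProjectEmployee:
    """Link entity that resolves the many-to-many relationship."""
    project_id: str
    employee_id: str


# Hypothetical sample links: e1 works on p1 and p2, e2 works on p1.
links = [
    ProjectEmployee("p1", "e1"),
    ProjectEmployee("p2", "e1"),
    ProjectEmployee("p1", "e2"),
]


def projects_for_employee(employee_id):
    return [l.project_id for l in links if l.employee_id == employee_id]


def employees_on_project(project_id):
    return [l.employee_id for l in links if l.project_id == project_id]
```

The link entity is what lets the relationship be read in both directions, which matters later when choosing keys and indexes.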
Access Patterns Example
Organization: CRUD operations, find projects/staff.
Projects: CRUD, filter by type/name/status.
Employees: CRUD, find projects associated.
Project-Employees: Allow querying for both employee and project relationships.
Choosing Primary Keys
Use composite keys (Partition Key + Sort Key).
Pattern:
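Organization ID as Partition Key and "#METADATA" (a literal hash sign followed by METADATA) as Sort Key for the organization's own item.
Ensure each Partition Key + Sort Key combination uniquely identifies one item, and shape the keys so the main access patterns become direct key lookups or queries.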
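A minimal sketch of this key pattern, using an in-memory list instead of a real table. The `ORG#`, `PRO#`, and `EMP#` prefixes, the attribute names `PK` and `SK`, and the sample data are assumptions chosen for illustration. The `query` helper only mimics the shape of a DynamoDB Query (exact match on the partition key, optional prefix match on the sort key, results ordered by sort key).

```python
def org_key(org_id):
    # The organization's own item uses the "#METADATA" sort key.
    return {"PK": f"ORG#{org_id}", "SK": "#METADATA"}


def project_key(org_id, project_id):
    return {"PK": f"ORG#{org_id}", "SK": f"PRO#{project_id}"}


def employee_key(org_id, employee_id):
    return {"PK": f"ORG#{org_id}", "SK": f"EMP#{employee_id}"}


# Hypothetical items, all stored under the same organization partition.
table = [
    {**project_key("org1", "p2"), "name": "Mobile App", "status": "on-hold"},
    {**org_key("org1"), "name": "Acme", "tier": "Premium"},
    {**employee_key("org1", "e1"), "name": "Dana"},
    {**project_key("org1", "p1"), "name": "Website", "status": "active"},
]


def query(items, pk, sk_prefix=""):
    """Mimic a Query: partition key equality plus begins_with on the sort key."""
    matches = [i for i in items if i["PK"] == pk and i["SK"].startswith(sk_prefix)]
    return sorted(matches, key=lambda i: i["SK"])


everything_in_org = query(table, "ORG#org1")          # org item, employees, projects
projects_in_org = query(table, "ORG#org1", "PRO#")    # projects only
```

Because every item for an organization shares one partition key, "find projects" and "find staff" become prefix queries on the sort key rather than scans.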
Utilizing Indexes
GSI: Supports additional access patterns by re-keying items on attributes other than the table's primary key.
Items may need specific attributes, such as employee IDs or project types, to serve as index keys for these queries.
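One common way to serve both directions of the Project-Employees relationship is an inverted index: a GSI whose partition key is the table's sort key and vice versa. The sketch below simulates that idea in memory; the key layout (`PRO#...` as `PK`, `EMP#...` as `SK`) and the `build_index` helper are assumptions for illustration, not the layout used in the original notes.

```python
from collections import defaultdict

# Hypothetical Project-Employees items: project as PK, employee as SK.
membership_items = [
    {"PK": "PRO#p1", "SK": "EMP#e1"},
    {"PK": "PRO#p1", "SK": "EMP#e2"},
    {"PK": "PRO#p2", "SK": "EMP#e1"},
]


def build_index(items, partition_attr, sort_attr):
    """Group items by partition_attr and order each group by sort_attr.

    Items missing either key attribute are left out, which is also how a
    secondary index treats items that lack its key attributes.
    """
    index = defaultdict(list)
    for item in items:
        if partition_attr in item and sort_attr in item:
            index[item[partition_attr]].append(item)
    for group in index.values():
        group.sort(key=lambda i: i[sort_attr])
    return dict(index)


base_table = build_index(membership_items, "PK", "SK")      # project -> employees
inverted_gsi = build_index(membership_items, "SK", "PK")    # employee -> projects

employees_on_p1 = [i["SK"] for i in base_table["PRO#p1"]]
projects_for_e1 = [i["PK"] for i in inverted_gsi["EMP#e1"]]
```

The same items answer "who works on this project" from the base table and "which projects does this employee work on" from the index, without duplicating the link items by hand.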
Sparse Indexing and Filtering
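A sparse index is a GSI keyed on an attribute that only some items carry (e.g., a marker set only on "on-hold" projects); only those items appear in the index, which keeps it small and cheap to query.
Avoid relying on filter conditions for efficiency: filters are applied after items are read, so a query still consumes read capacity for every item it reads, including the ones the filter discards.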
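The difference can be sketched with a simple in-memory comparison. The `on_hold_pk` marker attribute and the sample items are assumptions, and counting items read is a simplification: real read capacity is charged by the size of the data read, not by item count.

```python
# Hypothetical project items; only on-hold projects carry the marker attribute.
projects = [
    {"PK": "ORG#org1", "SK": "PRO#p1", "status": "active"},
    {"PK": "ORG#org1", "SK": "PRO#p2", "status": "on-hold", "on_hold_pk": "ORG#org1"},
    {"PK": "ORG#org1", "SK": "PRO#p3", "status": "active"},
    {"PK": "ORG#org1", "SK": "PRO#p4", "status": "on-hold", "on_hold_pk": "ORG#org1"},
]


def query_with_filter(items, pk, status):
    """Read the whole partition, then filter. Returns (results, items_read)."""
    read = [i for i in items if i["PK"] == pk]
    results = [i for i in read if i["status"] == status]
    return results, len(read)


def query_sparse_index(items, index_attr, value):
    """Query an index that only contains items having index_attr."""
    index = [i for i in items if index_attr in i]
    read = [i for i in index if i[index_attr] == value]
    return read, len(read)


filtered, read_with_filter = query_with_filter(projects, "ORG#org1", "on-hold")
sparse, read_with_index = query_sparse_index(projects, "on_hold_pk", "ORG#org1")
```

Both approaches return the same two on-hold projects, but the filtered query reads all four items in the partition while the sparse index reads only the two it returns.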
Implementing in AWS
Use the AWS CLI and SDKs to work with DynamoDB.
Sample CRUD operations and queries are shown with Node.js SDK examples.
The focus is on how to model and query within the limits and capabilities of DynamoDB.
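The original examples use the Node.js SDK; the sketch below instead builds the request parameters in Python, since the low-level Query parameters (`TableName`, `KeyConditionExpression`, `ExpressionAttributeValues`) have the same shape across SDKs. The table name `ProjectManagement`, the key attribute names `PK` and `SK`, and the helper itself are hypothetical. No AWS call is made here.

```python
def build_query_params(table_name, pk, sk_prefix=None):
    """Build low-level DynamoDB Query parameters for a PK / SK-prefix lookup."""
    params = {
        "TableName": table_name,
        "KeyConditionExpression": "PK = :pk",
        # Low-level API values are typed: {"S": ...} marks a string.
        "ExpressionAttributeValues": {":pk": {"S": pk}},
    }
    if sk_prefix is not None:
        params["KeyConditionExpression"] += " AND begins_with(SK, :sk)"
        params["ExpressionAttributeValues"][":sk"] = {"S": sk_prefix}
    return params


# All projects of a hypothetical organization "org1".
project_query = build_query_params("ProjectManagement", "ORG#org1", "PRO#")
```

With boto3 installed and credentials configured, such a dictionary could be passed as `boto3.client("dynamodb").query(**project_query)`.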
Conclusion
Effective data modeling in DynamoDB requires understanding application access patterns and entity relationships.
Utilize indexing smartly to minimize costs while optimizing performance.
DynamoDB’s flexible schema allows for efficient data handling if modeled correctly.