A FULLY-MANAGED DATABASE WITH HIGH PERFORMANCE AT ANY SCALE
Why should you care about DynamoDB? Because it's fully managed, scales on demand and offers low latencies.
To get you hooked: on Prime Day 2021, DynamoDB served 89.2 million requests per second at its peak!
You can choose between two capacity modes, On-Demand and Provisioned, and you can switch between them later.
The best fit highly depends on your traffic patterns.
Go with On-Demand if you have unpredictable traffic, as it scales immediately and you only pay for what you actually use.
With steady load or known patterns, use Provisioned, as it can be almost 7 times cheaper.
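A bit of arithmetic shows where a factor like "almost 7 times" can come from. The sketch below uses illustrative list prices (1.25 USD per million On-Demand writes and 0.00065 USD per provisioned write capacity unit per hour). Both numbers are assumptions and not taken from this article; real prices differ by region and change over time, so check the current pricing page before relying on them.

```python
# Illustrative prices. ASSUMPTIONS, not from the article: check current AWS pricing.
ON_DEMAND_USD_PER_MILLION_WRITES = 1.25
PROVISIONED_USD_PER_WCU_HOUR = 0.00065

SECONDS_PER_HOUR = 3600


def on_demand_cost_per_hour(writes_per_second: float) -> float:
    """Cost of serving a steady write rate for one hour in On-Demand mode."""
    writes = writes_per_second * SECONDS_PER_HOUR
    return writes / 1_000_000 * ON_DEMAND_USD_PER_MILLION_WRITES


def provisioned_cost_per_hour(provisioned_wcu: int) -> float:
    """Cost of keeping a number of write capacity units provisioned for one hour."""
    return provisioned_wcu * PROVISIONED_USD_PER_WCU_HOUR


# One write capacity unit allows one write per second (for items up to 1 KB).
# Compare both modes for the same traffic at 100% utilisation:
ratio_at_full_utilisation = on_demand_cost_per_hour(1) / provisioned_cost_per_hour(1)

# Below this utilisation of the provisioned capacity, On-Demand is the cheaper mode.
break_even_utilisation = 1 / ratio_at_full_utilisation

print(f"On-Demand vs. Provisioned at full utilisation: {ratio_at_full_utilisation:.2f}x")
print(f"Break-even utilisation: {break_even_utilisation:.1%}")
```

Under these assumed prices, Provisioned only pays off once you actually use more than roughly a seventh of what you provision, which is why spiky or unknown traffic tends to favour On-Demand.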
Rule of thumb:
The AWS Free Tier includes 25 Read & 25 Write Capacity Units for provisioned capacity mode every month.
Unlike a row in a SQL database, an item in DynamoDB doesn't have a fixed schema. The table only defines the primary key, which uniquely identifies a single item.
The primary key is an item's unique identifier & must be provided when inserting a new item.
There are two different types of primary keys: a simple primary key (partition key only) and a composite primary key (partition key plus sort key).
Internally, DynamoDB consists of different partitions, and your partition key determines the partition in which an item is stored.
Your provisioned read & write capacity units will be distributed among all partitions.
If your partition keys are not well distributed, your requests are more likely to get throttled, because a subset of your partitions (or worse: a single one) can receive the majority of read and/or write requests.
With Adaptive Capacity, DynamoDB will increase capacity for hot partitions as long as the table-level throughput is not exceeded.
It goes even further: if there's disproportionately high traffic to one or several items in the same partition, DynamoDB will rebalance partitions such that those items don't reside on the same partition.
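The effect of key distribution can be illustrated with a small simulation. This is only a sketch: the number of partitions is made up, and MD5 is a stand-in for DynamoDB's internal hash function, which is not exposed.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 4  # hypothetical; DynamoDB manages the real number itself


def partition_for(partition_key: str) -> int:
    """Map a partition key to a partition (MD5 stands in for the internal hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def requests_per_partition(keys: list[str]) -> Counter:
    """Count how many requests each partition receives."""
    return Counter(partition_for(key) for key in keys)


# Well-distributed keys: 1,000 requests for 1,000 different users.
spread = requests_per_partition([f"user-{i}" for i in range(1000)])

# Poorly distributed keys: 1,000 requests that all use the same key.
hot = requests_per_partition(["tenant-1"] * 1000)

print("spread:", dict(spread))
print("hot:   ", dict(hot))
```

In the first case every partition gets a share of the traffic; in the second, a single partition has to absorb all of it, which is the situation where throttling becomes likely.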
Using sort keys comes with benefits:
Keys should be well distributed to avoid hot partitions.
You can create two types of secondary indexes, global (GSI) and local (LSI), which specify alternative keys that can be used for queries.
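To make keys and indexes more concrete, here is a sketch of a table definition with a composite primary key, one local and one global secondary index. The parameter names follow the CreateTable API as exposed by boto3's `create_table`; the table, attribute and index names (`Orders`, `customerId`, `byProduct`, ...) are hypothetical.

```python
def orders_table_definition() -> dict:
    """Build CreateTable parameters for a hypothetical 'Orders' table."""
    return {
        "TableName": "Orders",
        "BillingMode": "PAY_PER_REQUEST",  # On-Demand capacity mode
        # Only attributes used in a key are declared; everything else is schemaless.
        "AttributeDefinitions": [
            {"AttributeName": "customerId", "AttributeType": "S"},
            {"AttributeName": "orderDate", "AttributeType": "S"},
            {"AttributeName": "total", "AttributeType": "N"},
            {"AttributeName": "productId", "AttributeType": "S"},
        ],
        # Composite primary key: partition key (HASH) plus sort key (RANGE).
        "KeySchema": [
            {"AttributeName": "customerId", "KeyType": "HASH"},
            {"AttributeName": "orderDate", "KeyType": "RANGE"},
        ],
        # Local secondary index: same partition key, alternative sort key.
        "LocalSecondaryIndexes": [
            {
                "IndexName": "byTotal",
                "KeySchema": [
                    {"AttributeName": "customerId", "KeyType": "HASH"},
                    {"AttributeName": "total", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        # Global secondary index: a different partition key altogether.
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "byProduct",
                "KeySchema": [
                    {"AttributeName": "productId", "KeyType": "HASH"},
                    {"AttributeName": "orderDate", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
    }


definition = orders_table_definition()
# With boto3 installed and AWS credentials configured, this could be passed on as:
#   boto3.client("dynamodb").create_table(**definition)
```

With such a table you could query all orders of a customer by date (primary key), by amount (local index), or all orders of a product (global index) without falling back to a scan.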
Besides your primary key, your item can contain other attributes of different data types: Scalar, Document and Set.
SCALAR: String, Number, Binary, Boolean, Null
DOCUMENT: List, Map
SET: String Set, Number Set, Binary Set
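On the wire, every attribute value is tagged with a type descriptor (`S`, `N`, `B`, `BOOL`, `NULL`, `L`, `M`, `SS`, `NS`, `BS`). The following is a simplified sketch of that encoding written from scratch for illustration; it is not the serializer that ships with boto3 (`boto3.dynamodb.types.TypeSerializer`), and the item at the bottom is made up.

```python
from decimal import Decimal

NUMBER_TYPES = (int, float, Decimal)


def to_attribute_value(value):
    """Encode a Python value using DynamoDB's type descriptors (simplified)."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # check before numbers: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, NUMBER_TYPES):
        return {"N": str(value)}  # numbers are transported as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": value}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    if isinstance(value, (set, frozenset)):
        return _encode_set(value)
    raise TypeError(f"unsupported type: {type(value).__name__}")


def _encode_set(values):
    """Encode a set; all members must share one type and the set must not be empty."""
    if not values:
        raise ValueError("sets must not be empty")
    if all(isinstance(v, str) for v in values):
        return {"SS": sorted(values)}
    if all(isinstance(v, NUMBER_TYPES) and not isinstance(v, bool) for v in values):
        return {"NS": sorted(str(v) for v in values)}
    if all(isinstance(v, bytes) for v in values):
        return {"BS": sorted(values)}
    raise TypeError("all set members must share one type")


item = {
    "pk": "user-1",                                  # Scalar: String
    "age": 42,                                       # Scalar: Number
    "active": True,                                  # Scalar: Boolean
    "nickname": None,                                # Scalar: Null
    "scores": [1, 2],                                # Document: List
    "address": {"city": "Berlin", "zip": "10115"},   # Document: Map
    "tags": {"a", "b"},                              # Set: String Set
}
encoded = to_attribute_value(item)["M"]
print(encoded)
```

Lists and maps can be nested and mix types freely, while a set only holds unique values of a single scalar type.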
This is where it gets interesting and where DynamoDB differs from SQL or other NoSQL solutions.
You can only query via a condition on the partition key (and on the sort key, if there is one) of your table or of one of its secondary indexes.
Everything else needs a scan.
With a query, you're only paying for the retrieved items, as it only looks at the items in a specific partition. So generally speaking: a query is way faster and cheaper.
A scan just runs through your whole table, looking for items that match your expression.
You'll be charged for the items that are scanned, not the items that are retrieved.
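The difference in work between the two operations can be sketched with a toy in-memory model. This is not the real service or its API, just an illustration that counts how many items each operation has to read; the `pk`/`sk` attribute names and the data are made up.

```python
from collections import defaultdict


class InMemoryTable:
    """Toy model of a table: items grouped by partition key (not the real service)."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def put(self, item: dict) -> None:
        self._partitions[item["pk"]].append(item)

    def query(self, pk: str):
        """Read only the items stored under one partition key."""
        items = list(self._partitions.get(pk, []))
        return items, len(items)  # (result, number of items read)

    def scan(self, predicate):
        """Read every item in the table and filter afterwards."""
        read = 0
        result = []
        for items in self._partitions.values():
            for item in items:
                read += 1
                if predicate(item):
                    result.append(item)
        return result, read


table = InMemoryTable()
for customer in range(100):
    for order in range(10):
        table.put({"pk": f"customer-{customer}", "sk": f"order-{order}"})

queried, query_read = table.query("customer-7")
scanned, scan_read = table.scan(lambda item: item["pk"] == "customer-7")

print(f"query: returned {len(queried)} items, read {query_read}")
print(f"scan:  returned {len(scanned)} items, read {scan_read}")
```

Both calls return the same ten items, but the scan reads all 1,000 items in the table to find them, and the number of items read is what drives the bill.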
When multiple clients write to the same item concurrently, race conditions can occur where writes are lost.
Example:
DynamoDB's Optimistic Locking allows us to verify that we're really updating the version of the item we expect by using a dedicated version field.
In the previous example, both read operations would read version 1 and would expect to write version 2. The first write would succeed, but the second would fail, as the stored version would no longer match the expected one.
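The mechanism can be sketched with an in-memory stand-in for a table. The class and exception names below are hypothetical; against the real service, the version check would be a `ConditionExpression` on the write, and a failed check surfaces as a `ConditionalCheckFailedException`.

```python
class ConditionalCheckFailed(Exception):
    """Raised when the stored version differs from the expected one."""


class VersionedStore:
    """In-memory stand-in for a table that supports a conditional write."""

    def __init__(self):
        self._items = {}

    def put_new(self, key: str, data: dict) -> None:
        self._items[key] = {**data, "version": 1}

    def get(self, key: str) -> dict:
        return dict(self._items[key])  # return a copy, like a client-side read

    def update(self, key: str, changes: dict, expected_version: int) -> dict:
        current = self._items[key]
        if current["version"] != expected_version:
            raise ConditionalCheckFailed(
                f"expected version {expected_version}, found {current['version']}"
            )
        updated = {**current, **changes, "version": expected_version + 1}
        self._items[key] = updated
        return dict(updated)


store = VersionedStore()
store.put_new("account-1", {"balance": 100})

# Two clients read the same item; both see version 1.
read_a = store.get("account-1")
read_b = store.get("account-1")

# Client A writes first and succeeds (version 1 -> 2).
store.update("account-1", {"balance": read_a["balance"] + 50}, read_a["version"])

# Client B still expects version 1, so its write is rejected
# instead of silently overwriting A's update.
try:
    store.update("account-1", {"balance": read_b["balance"] - 30}, read_b["version"])
    second_write_succeeded = True
except ConditionalCheckFailed:
    second_write_succeeded = False

final = store.get("account-1")
print(final, second_write_succeeded)
```

The rejected client would typically re-read the item, reapply its change to the fresh data and try again with the new version.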
With DynamoDB Streams, you can asynchronously trigger invocations of other services like Lambda when an item is created, updated or deleted.
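A Lambda function attached to a table's stream receives batches of change records. Here is a minimal sketch of such a handler, assuming the record layout of DynamoDB stream events (`eventName`, `dynamodb.Keys`); the key attribute `pk` and the sample event are made up for illustration.

```python
def handler(event, context=None):
    """Summarise the changes contained in a batch of stream records."""
    summary = {"INSERT": [], "MODIFY": [], "REMOVE": []}
    for record in event.get("Records", []):
        event_name = record["eventName"]   # INSERT, MODIFY or REMOVE
        keys = record["dynamodb"]["Keys"]  # key attributes with type descriptors
        summary[event_name].append(keys["pk"]["S"])  # "pk" is a hypothetical key name
    return summary


# Hand-written sample event for a local test run (not captured from a real stream).
sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"Keys": {"pk": {"S": "user-1"}}}},
        {"eventName": "MODIFY", "dynamodb": {"Keys": {"pk": {"S": "user-1"}}}},
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"pk": {"S": "user-2"}}}},
    ]
}

result = handler(sample_event)
print(result)
```

A real handler would replace the summary with the actual side effect, such as sending a notification or updating a search index.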
AWS offers you a lot of options for backing up your tables.
Safely modify one or multiple fields of a complex document:
Distribute your data globally across tables in multiple regions for redundancy and lower latencies.
A bi-directional synchronization will replicate your data between different regions.
CloudWatch comes with default metrics like used read & write capacity units or the number of throttles. This helps you analyse DynamoDB traffic patterns so you can optimise for low costs.
Third-party tools like Dashbird.io help you with Slack notifications for critical events like throttles & give you general guidance with well-architected tips.