William Liu

DynamoDB

AWS Meetup for Amazon DynamoDB.

New in AWS

Summary

Amazon DynamoDB is a fast, fully managed NoSQL database service that supports both document and key-value store models.

Differences between NoSQL and Traditional RDBMS Entities

NoSQL advantages

NoSQL disadvantages

RDBMS advantages

RDBMS disadvantages

Key Components

Tables and Items

In a table, you need the following:

Dungeons and Dragons Example

Table: DungeonsDragonsPlayers

Example:

Partition Key | Sort Key | Attributes
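To make the Partition Key | Sort Key | Attributes layout concrete, here is a minimal in-memory sketch. It is not the DynamoDB API; the `Table` class, the player and character names, and the `level` attribute are all hypothetical and only illustrate that the (partition key, sort key) pair identifies one item and that items sharing a partition key come back ordered by sort key.

```python
class Table:
    """Toy in-memory stand-in for a table with a composite primary key."""

    def __init__(self):
        # (partition key, sort key) -> attributes; the pair must be unique.
        self._items = {}

    def put_item(self, partition_key, sort_key, **attributes):
        # Writing the same key pair again replaces the earlier item.
        self._items[(partition_key, sort_key)] = attributes

    def query(self, partition_key):
        # Return every item sharing the partition key, ordered by sort key.
        matches = [
            (sk, attrs)
            for (pk, sk), attrs in self._items.items()
            if pk == partition_key
        ]
        return sorted(matches, key=lambda pair: pair[0])


# Hypothetical rows: partition key = player, sort key = character.
players = Table()
players.put_item("alice", "wizard", level=3)
players.put_item("alice", "cleric", level=5)
players.put_item("bob", "rogue", level=1)
print(players.query("alice"))
```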

Data Type Storage

You can store the following:

Partitions

Storage for a table is allocated based on the partition key (and, when needed, the sort key, also known as the range key).

The maximum partition size is 10 GB. When a partition reaches its maximum capacity, it splits.

Indexes

We have two kinds of indexes:

Local Index

Global Index

Pre-Filter with Indexes

If an item does not have the attribute that the GSI uses as its partition key, the item is not written to the GSI. The index therefore acts as a pre-filter: it only contains items that carry that attribute.
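The pre-filter effect can be sketched in plain Python. This is only a model of the behavior described above, not DynamoDB code; `build_gsi`, the `guild` attribute, and the sample items are hypothetical.

```python
def build_gsi(items, index_partition_key):
    """Group items by the index's partition key attribute.

    Items that lack the attribute are skipped, which mirrors how an
    item without the GSI key attribute never appears in the index.
    """
    index = {}
    for item in items:
        if index_partition_key not in item:
            continue  # no key attribute, so no index entry
        index.setdefault(item[index_partition_key], []).append(item)
    return index


# Hypothetical items: only some players belong to a guild.
items = [
    {"player": "alice", "character": "wizard", "guild": "Harpers"},
    {"player": "bob", "character": "rogue"},
    {"player": "carol", "character": "cleric", "guild": "Harpers"},
]
guild_index = build_gsi(items, "guild")
print(sorted(i["player"] for i in guild_index["Harpers"]))
```

Querying the index only ever touches the two items that have a `guild`, so the item without one costs nothing to skip.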

Capacity Planning

How can I know how many partitions I am going to need?
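A rough way to estimate this, assuming the rule of thumb from older AWS documentation (about 3,000 RCU or 1,000 WCU per partition) together with the 10 GB partition size limit noted above. The per-partition throughput constants are assumptions, not guarantees, and current DynamoDB behavior may differ; treat the result as a planning estimate only.

```python
import math

MAX_PARTITION_SIZE_GB = 10   # partition size limit from the notes
RCU_PER_PARTITION = 3000     # assumption: rule of thumb from older AWS docs
WCU_PER_PARTITION = 1000     # assumption: rule of thumb from older AWS docs


def estimate_partitions(rcu, wcu, table_size_gb):
    """Estimate partition count as the larger of the throughput-driven
    and the size-driven requirement."""
    by_throughput = math.ceil(rcu / RCU_PER_PARTITION + wcu / WCU_PER_PARTITION)
    by_size = math.ceil(table_size_gb / MAX_PARTITION_SIZE_GB)
    return max(by_throughput, by_size, 1)


# Throughput-bound: 5000/3000 + 2000/1000 = 3.67, rounded up.
print(estimate_partitions(rcu=5000, wcu=2000, table_size_gb=8))
# Size-bound: 45 GB needs five 10 GB partitions.
print(estimate_partitions(rcu=1000, wcu=500, table_size_gb=45))
```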

FYI: If you are querying very small items, you can use batch operations to keep cost down.

Careful increasing RCU/WCU

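Suppose a table provisioned with 5,000 RCU and 2,000 WCU has:

Four partitions with 1,250 read capacity units and 500 write capacity units each.

If read capacity is raised to 8,000 RCU, the table will then split to:

Eight partitions with 1,000 read capacity units and 250 write capacity units each.

Write capacity per partition is halved, even though only read capacity was increased.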
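The per-partition figures above follow from dividing table-level capacity evenly across partitions. A minimal sketch of that arithmetic, assuming an even split and table totals of 5,000 RCU / 2,000 WCU before and 8,000 RCU / 2,000 WCU after (the totals implied by the per-partition numbers); `per_partition_capacity` is a hypothetical helper.

```python
def per_partition_capacity(rcu, wcu, partitions):
    """Split table-level throughput evenly across partitions."""
    return rcu / partitions, wcu / partitions


before = per_partition_capacity(rcu=5000, wcu=2000, partitions=4)
after = per_partition_capacity(rcu=8000, wcu=2000, partitions=8)

print("before split:", before)
print("after split: ", after)
# Write capacity available to any single partition after the split,
# relative to before, even though table-level WCU never changed.
print("write capacity ratio:", after[1] / before[1])
```

A key that could absorb 500 writes per second before the split can only absorb 250 afterwards, which is why raising one dimension of throughput deserves care.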

Querying Data

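Note: Filters do not help! A filter is applied after the items have already been read from the table, so you still pay the read capacity for every item the query or scan touched, not just the items returned.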
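A small cost model shows why. It assumes the usual accounting for a query: the sizes of all items read are summed and rounded up to 4 KB units, one RCU per unit for a strongly consistent read and half that for an eventually consistent one. The item sizes, the `level` attribute, and `query_rcus` are hypothetical.

```python
import math


def query_rcus(item_sizes_bytes, strongly_consistent=True):
    """RCUs for one query: total bytes read, rounded up to 4 KB units."""
    units = max(1, math.ceil(sum(item_sizes_bytes) / 4096))
    return units if strongly_consistent else units / 2


# Twenty hypothetical 1 KB items that match the key condition.
items_read = [{"size": 1024, "level": lvl} for lvl in range(1, 21)]

# The filter runs on the items already read and keeps only a few.
items_returned = [item for item in items_read if item["level"] >= 18]

charged = query_rcus([item["size"] for item in items_read])
if_filter_counted = query_rcus([item["size"] for item in items_returned])

print(len(items_returned), "items returned,", charged, "RCUs charged")
print("RCUs if only returned items were charged:", if_filter_counted)
```

Narrowing the key condition, or querying an index that holds fewer items, reduces what is read; a filter only reduces what is sent back.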

Careful choosing Partition Key

High cardinality means that an attribute has a large number of distinct values relative to the number of items. You want partition key values that are highly distinct and evenly distributed.

Say you picked City, State as the partition key. New York would likely receive far more traffic than a smaller city, so its partition becomes a hot spot.
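The skew can be simulated by hashing keys onto a fixed number of partitions and counting requests. This is a sketch only: DynamoDB's internal hash function is not public, so MD5 stands in for it here, and the request mix, city names, and `user-N` keys are made up for illustration.

```python
import hashlib
from collections import Counter


def partition_for(key, num_partitions):
    """Map a partition key value to a partition (MD5 is a stand-in hash)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def load_per_partition(request_keys, num_partitions):
    """Count how many requests land on each partition."""
    return Counter(partition_for(k, num_partitions) for k in request_keys)


# Low cardinality and skewed: most of 1,000 requests hit one city.
city_requests = ["New York, NY"] * 900 + ["Ames, IA"] * 50 + ["Bend, OR"] * 50
city_load = load_per_partition(city_requests, num_partitions=4)

# High cardinality: 1,000 requests spread over 1,000 distinct keys.
user_requests = [f"user-{i}" for i in range(1000)]
user_load = load_per_partition(user_requests, num_partitions=4)

print("busiest partition, city key:", max(city_load.values()))
print("busiest partition, user key:", max(user_load.values()))
```

With the city key, one partition absorbs at least 900 of the 1,000 requests no matter how the hash falls; with distinct user keys the load lands close to an even share per partition.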

Advice on data

Backups and Restores

You can use AWS Data Pipeline to read a DynamoDB table and back it up to S3, then restore by replaying the data into DynamoDB as Lambda events (create, delete). Be careful: this is expensive, since a full backup or restore consumes a LOT of RCUs and WCUs.

DynamoDB Streams

You can use DynamoDB Streams to populate an S3 bucket. Combined with S3 versioning, this lets you see exactly what changed. You can also use DynamoDB Streams to replicate data to another region.
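One way to wire this up, as an assumption rather than something covered in the talk, is a Lambda function subscribed to the stream. The sketch below only covers the transformation step: it turns a stream event into S3 object keys and bodies, leaving the upload to the caller. The field names (`Records`, `eventName`, `dynamodb.Keys`, `NewImage`, `OldImage`) follow the DynamoDB Streams record layout; the `changes` prefix, the key scheme, and the sample attribute names are hypothetical.

```python
import json


def stream_records_to_objects(event, prefix="changes"):
    """Convert a DynamoDB Streams event into (s3_key, body) pairs."""
    objects = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        # Derive the object key from the item's primary key values
        # (ordered by attribute name), so every change to one item
        # targets the same object. With bucket versioning enabled,
        # each write then becomes a new version of that object.
        key_part = "/".join(
            str(next(iter(typed_value.values())))
            for _, typed_value in sorted(change["Keys"].items())
        )
        body = json.dumps(
            {
                "eventName": record["eventName"],
                "NewImage": change.get("NewImage"),
                "OldImage": change.get("OldImage"),
            },
            sort_keys=True,
        )
        objects.append((f"{prefix}/{key_part}.json", body))
    return objects


# Hypothetical event with a single MODIFY record.
sample_event = {
    "Records": [
        {
            "eventName": "MODIFY",
            "dynamodb": {
                "Keys": {"PlayerId": {"S": "alice"}, "Character": {"S": "wizard"}},
                "OldImage": {"Level": {"N": "3"}},
                "NewImage": {"Level": {"N": "4"}},
            },
        }
    ]
}
for s3_key, body in stream_records_to_objects(sample_event):
    print(s3_key, body)
```

In a real handler, each pair would be passed to an S3 upload call (for example boto3's `put_object`); that part is omitted here so the sketch runs without AWS credentials.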

Monitoring

Monitor the following with CloudWatch: