Monday, August 20, 2018

DynamoDB best practices for scalability

DynamoDB can be thought of as a glorified hash map. It is essentially a key-value store. This makes designing a DynamoDB schema quite different from designing an RDBMS schema. A DynamoDB schema has to be approached with a query mindset: how you are going to query the system should drive the structure of your schema.

Think about the queries first rather than the data model, and design around those queries.


The first important concept to understand in DynamoDB is the partition key. It is like the key in a hash map, and the same good practices apply. Make sure the partition key has variety built into it, that is, many distinct values. DynamoDB stores data across multiple partitions and uses the partition key to decide where each item goes, so a key with plenty of variety spreads reads and writes more evenly across the partitions.
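To see why variety matters, here is a rough sketch that spreads keys over a handful of partitions. It is only an illustration: DynamoDB's real hash function and partition count are internal, so `md5`, `NUM_PARTITIONS`, `partition_for` and the sample key values below are all assumptions made for the demo.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 4  # hypothetical partition count, chosen only for this demo


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a partition key to a partition number (md5 is a stand-in hash)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def spread(keys):
    """Count how many items land on each partition."""
    return Counter(partition_for(k) for k in keys)


# Low variety: every item uses one of only two key values.
low_variety = ["active" if i % 2 else "inactive" for i in range(1000)]
# High variety: every item has its own key value.
high_variety = [f"user-{i}" for i in range(1000)]

print("low variety :", dict(spread(low_variety)))
print("high variety:", dict(spread(high_variety)))
```

With only two distinct key values, at most two partitions can ever receive traffic, no matter how many partitions exist. With one key value per item, the load ends up on all of them.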

The primary key of a table is either the partition key alone or a combination of the partition key and a sort key. A sort key keeps the items that share a partition key stored together, ordered by the sort key value. When a sort key is used, an item is identified by the combination of partition key and sort key.

For example, meteorological data for a city on a given date can be arranged with the city name as the partition key and the date as the sort key. This suits queries that are mostly about cities. However, if the queries are mostly about dates, you might want to make the date the partition key and the city the sort key.
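The two layouts can be compared with a small in-memory stand-in. This is a sketch, not DynamoDB itself: `ToyTable`, its `put` and `query` methods, and the sample readings are made up for illustration.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a table with a partition key and a sort key."""

    def __init__(self):
        # partition key -> {sort key -> item}
        self._partitions = defaultdict(dict)

    def put(self, partition_key, sort_key, item):
        self._partitions[partition_key][sort_key] = item

    def query(self, partition_key, sort_from=None, sort_to=None):
        """Return items of one partition, ordered by sort key, within a range."""
        rows = self._partitions.get(partition_key, {})
        keys = sorted(rows)
        if sort_from is not None:
            keys = [k for k in keys if k >= sort_from]
        if sort_to is not None:
            keys = [k for k in keys if k <= sort_to]
        return [rows[k] for k in keys]


# Made-up sample readings: (city, date, temperature in Celsius).
readings = [
    ("Pune", "2018-08-19", 27.5),
    ("Pune", "2018-08-20", 26.0),
    ("Oslo", "2018-08-20", 17.0),
]

by_city = ToyTable()  # city groups the items, date sorts them
by_date = ToyTable()  # date groups the items, city sorts them
for city, date, temp_c in readings:
    item = {"city": city, "date": date, "temp_c": temp_c}
    by_city.put(city, date, item)
    by_date.put(date, city, item)

# "All readings for one city in a date range" fits the first layout.
pune_in_august = by_city.query("Pune", "2018-08-01", "2018-08-31")
# "All cities on one date" fits the second layout.
all_cities_on_20th = by_date.query("2018-08-20")

print(pune_in_august)
print(all_cities_on_20th)
```

Each question is cheap in one layout and awkward in the other, which is why the expected queries should decide which attribute goes where.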

Indexes are another important thing to take care of. They are a double-edged sword. Indexes improve reads but can significantly add to the cost of writes, because every write to the table also has to be applied to each index that contains the item. Indexes also consume storage. Project into the index the attributes that your queries actually read. If an attribute is not projected into the index, a query on the index cannot return it directly and the full item has to be read from the table.
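As a sketch of what choosing index attributes can look like, the function below builds the parameters for a table with one global secondary index that projects a single extra attribute. The table name, index name and attribute names (`WeatherReadings`, `date-city-index`, `city`, `reading_date`, `temp_c`) are hypothetical. The dictionary follows the shape that boto3's `create_table` accepts, but nothing is sent to AWS here.

```python
def weather_table_params():
    """Build create_table parameters; all names here are hypothetical."""
    return {
        "TableName": "WeatherReadings",
        # Only attributes used in a key schema are declared up front.
        "AttributeDefinitions": [
            {"AttributeName": "city", "AttributeType": "S"},
            {"AttributeName": "reading_date", "AttributeType": "S"},
        ],
        # Table layout: city is the partition key, date is the sort key.
        "KeySchema": [
            {"AttributeName": "city", "KeyType": "HASH"},
            {"AttributeName": "reading_date", "KeyType": "RANGE"},
        ],
        "BillingMode": "PAY_PER_REQUEST",
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "date-city-index",
                # Index layout: the keys are swapped for date-first queries.
                "KeySchema": [
                    {"AttributeName": "reading_date", "KeyType": "HASH"},
                    {"AttributeName": "city", "KeyType": "RANGE"},
                ],
                # Copy only the attribute that index queries need to return.
                "Projection": {
                    "ProjectionType": "INCLUDE",
                    "NonKeyAttributes": ["temp_c"],
                },
            }
        ],
    }


params = weather_table_params()
print(params["GlobalSecondaryIndexes"][0]["Projection"])
```

With real credentials this could be passed along as `boto3.client("dynamodb").create_table(**params)`. Projecting just `temp_c` keeps the index small, while `ProjectionType` set to `ALL` would copy every attribute at the price of more storage and write cost.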

When reading data from DynamoDB, prefer Query over Scan. A Scan reads every item in the table and only then applies any filter, which makes it a costly operation on large tables.
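The difference can be shown with a toy comparison that counts how many items each approach has to look at. This is plain Python rather than DynamoDB calls, and the `scan` and `query` functions and the generated items are assumptions made for the demo.

```python
from collections import defaultdict

# 1000 made-up items: 50 cities with 20 dates each.
items = [
    {"city": f"city-{i % 50}", "date": f"2018-08-{(i // 50) + 1:02d}"}
    for i in range(1000)
]


def scan(all_items, city):
    """Look at every item and keep the matching ones, like a filtered scan."""
    examined = 0
    matches = []
    for item in all_items:
        examined += 1
        if item["city"] == city:
            matches.append(item)
    return matches, examined


# Group items by partition key once, the way a keyed table stores them.
by_partition = defaultdict(list)
for item in items:
    by_partition[item["city"]].append(item)


def query(partitions, city):
    """Go straight to one partition key and read only its items."""
    matches = list(partitions.get(city, []))
    return matches, len(matches)


scan_matches, scan_examined = scan(items, "city-7")
query_matches, query_examined = query(by_partition, "city-7")

print("scan examined :", scan_examined)
print("query examined:", query_examined)
```

Both calls return the same 20 items, but the scan has to look at all 1000 items to find them, while the query only touches the 20 that share the key.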
