Sunday, February 17, 2019

Migrating from DynamoDB to Postgres

The mass popularity of NoSQL databases has led to them being used for all kinds of use cases, often without checking whether they suit the use case. One fundamental rule that is usually forgotten is that NoSQL databases are designed around queries. Initially, the schema is shaped by the first business use cases. However, those use cases evolve and change in multiple ways, and soon the new ways of interacting with the database become unwieldy. This is the fundamental problem that people usually hit with NoSQL databases.

Where DynamoDB gets into trouble:
  • As the business use cases evolve and change, the need to query the database in multiple ways arises. DynamoDB is not easy to query unless the query is based on the partition key. One can build indexes, but there is a cost associated with them. Filters can be used to query the data, but that involves a scan, and as the data grows it becomes costly in terms of both money and time.
  • DynamoDB is schemaless, so over time the data evolves in multiple ways. In older records, fields may well be missing or may have a different interpretation. Developers keep handling these cases deep in the code to keep the world moving, but soon this results in too many if-else statements. Such cases make migration painful; one has to be ready for missing fields and handle them with suitable defaults.
  • There is no referential integrity, so it's easy to put wrong data in relationships and very difficult to find out that it has happened. In SQL it is also possible to store a wrong key that still satisfies a foreign key constraint, but SQL provides much better primitives for integrity.
  • This repeats the points above from a different perspective. As any kind of data can be added to a table, people start putting all kinds of data in it. Imagine that everyone in the world is given every kind of freedom. Sounds romantic. However, soon it will be chaos, as everyone lives in a different way.
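
The missing-field problem described above can be handled in one place during migration instead of in scattered if-else statements. The following is a minimal sketch, not a definitive implementation: the column names and default values in DEFAULTS are hypothetical, and only a few of DynamoDB's attribute types are handled.

```python
# Hypothetical defaults for fields that older records may lack.
DEFAULTS = {"Email": None, "Name": "", "LoginCount": 0}


def unwrap(attr):
    """Convert one DynamoDB typed attribute, e.g. {'S': 'a'}, to a plain value."""
    (type_tag, value), = attr.items()
    if type_tag == "S":
        return value
    if type_tag == "N":
        # Numbers arrive as strings; keep integers as int, everything else as float.
        return int(value) if value.lstrip("-").isdigit() else float(value)
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    raise ValueError(f"unhandled DynamoDB type: {type_tag}")


def normalize(item, defaults=DEFAULTS):
    """Return a row with every expected column, filling gaps from defaults."""
    row = dict(defaults)
    for name, attr in item.items():
        if name in defaults:
            row[name] = unwrap(attr)
    return row
```

An old record such as {"Email": {"S": "a@example.com"}} then comes out with Name and LoginCount filled from the defaults, and attributes that have no matching column are dropped.
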

Sample code for migration

import boto3
import psycopg2

# Create a connection to DynamoDB. Fill in the required keys. A better approach is
# to read them from a config file. In an AWS environment, prefer IAM roles.
dynamo = boto3.client('dynamodb',
                      aws_access_key_id='<access-key>',
                      aws_secret_access_key='<access-secret>',
                      region_name='<region>')

# Create the database connection
db = psycopg2.connect(host="<db_host>", database="<db>", user="<user>", password="<pass>")
dbCurr = db.cursor()

# Assume we have a User table in DynamoDB, keyed by Email, and copy one record
# into the Postgres users table
email = '<user-email>'
user = dynamo.query(TableName="User", KeyConditionExpression="Email = :email",
                    ExpressionAttributeValues={":email": {'S': email}})

# The low-level client returns a dict, and each attribute is wrapped in a type descriptor
email = user['Items'][0]['Email']['S']

# Note the RETURNING id clause, which gives us the id of the newly persisted record.
# It can be used to create foreign key relationships in further tables.
userSql = 'INSERT INTO users(email) VALUES(%s) RETURNING id'
userValues = (email,)  # the trailing comma makes this a one-element tuple

dbCurr.execute(userSql, userValues)
dbUserId = int(dbCurr.fetchone()[0])

db.commit()
dbCurr.close()
db.close()
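
The sample above copies a single record. To move a whole table, the items have to be read page by page, because a DynamoDB scan returns a LastEvaluatedKey whenever more data remains. The following is a rough sketch under the same assumptions as the sample (a User table with an Email attribute and a users table with an id column). The function names iter_items and migrate_users are made up for illustration, and the client and cursor are passed in so the same code can be tried against stubs.

```python
def iter_items(client, table_name):
    """Yield every item in a table, following LastEvaluatedKey across pages."""
    kwargs = {"TableName": table_name}
    while True:
        page = client.scan(**kwargs)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break
        # Resume the scan where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key


def migrate_users(client, cursor, table_name="User"):
    """Insert each item's Email into users and return a map of email -> new id."""
    ids = {}
    for item in iter_items(client, table_name):
        email = item["Email"]["S"]
        cursor.execute("INSERT INTO users(email) VALUES(%s) RETURNING id", (email,))
        ids[email] = cursor.fetchone()[0]
    return ids
```

With the objects from the sample, this would be called as migrate_users(dynamo, dbCurr) before db.commit(). The returned map can be used to fill foreign keys in further tables.
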


Thursday, December 27, 2018

Learning a programming language

There are tons of programming languages; new ones keep popping up and old ones keep fading out. Some withstand the test of time and continue to flourish. In the information technology industry, it's imperative to keep learning new languages and sharpening one's skills. Learning a new language is not always easy, as there are often paradigm shifts that take time to grasp.

In this blog, I will delve into the art of learning a language and what has helped me while learning the nuances of Java, Python, JavaScript and a couple of others. I would still not call myself an expert in any of these, but I have done a fair amount of coding in them. Whenever I have to pick up a new language, I try to understand it along the following dimensions to get a good sense of what it offers. And be careful: languages are often not just languages, they are ecosystems in themselves.

Sunday, October 7, 2018

AWS pricing model

Many organizations have adopted cloud services as an inherent part of their IT infrastructure. With cloud infrastructure, no upfront capital cost is needed. One can rent services as needed. However, one still has to be careful about the cost incurred by hosting IT infrastructure on the cloud. Every second a service is up and running, the dollar meter keeps ticking.

Tuesday, September 11, 2018

SQL vs NoSQL

SQL and NoSQL are two important choices for application data storage needs. There is a lot of confusion about which one to choose and which is a good fit. Buzzwords on both sides make the choice more confusing. How does one choose one over the other?

Saturday, September 1, 2018

Meetup - Pune Developers Community - IoT and AI

Spoke about IoT and AI with a focus on the technical architecture and its various elements. IoT and AI are bound to change the world in multiple ways. IoT is a perfect setup for generating a lot of data about various things, and AI is hungry for data to build better insights.

Slides at https://www.slideshare.net/LalitMohanChandraBha/iot-and-ai-112716904
Meetup Link: https://www.meetup.com/Pune-Developers-Community/events/252807697/

Thanks to a wonderful and engaging audience.

Wednesday, August 22, 2018

AWS IoT - Registering CA certificate

Use OpenSSL to generate the root key

      Generate the key:
          2048 is the key size in bits. AWS needs a minimum of 2048.

               openssl genrsa -out rootCA.key 2048

      Generate the pem file:
          Set -days to the number of days the certificate should be valid.

               openssl req -x509 -new -nodes -key rootCA.key -sha256 -days 10000 -out rootCA.pem

          This will ask a set of questions. Answer them appropriately. An example is:
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields, there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:IN
State or Province Name (full name) [Some-State]:MH
Locality Name (eg, city) []:Pune
Organization Name (eg, company) [Internet Widgits Pty Ltd]: My company
Organizational Unit Name (eg, section) []: mycompany
Common Name (e.g. server FQDN or YOUR name) []:admin.mycompany
Email Address []:admin.mycompany@whatever
             
         
Now go to AWS IoT Service
Navigate to Secure -> CA
Click on Register on the right-hand side. This opens a page. Click on Register CA and follow the instructions. In Step 3, when asked for the FQDN (Common Name), enter the key shown in Step 2.
At Step 5 and 6 upload the required files.
Check "Activate CA certificate"
Check "Enable auto-registration of device certificates"
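
The console steps above can also be scripted. The following is a minimal sketch using the boto3 IoT client's register_ca_certificate call, with the two flags mirroring the two checkboxes. The helper name register_ca and the file name verificationCert.pem are assumptions for illustration; the verification certificate is the one produced while following the console instructions.

```python
def register_ca(iot_client, ca_pem, verification_pem):
    """Register a CA certificate, activate it and enable auto-registration."""
    response = iot_client.register_ca_certificate(
        caCertificate=ca_pem,
        verificationCertificate=verification_pem,
        setAsActive=True,            # same as "Activate CA certificate"
        allowAutoRegistration=True,  # same as "Enable auto-registration of device certificates"
    )
    return response["certificateId"]


def main():
    """Example wiring; needs boto3, AWS credentials and the two PEM files."""
    import boto3

    iot = boto3.client("iot", region_name="<region>")
    with open("rootCA.pem") as ca_file, open("verificationCert.pem") as ver_file:
        print(register_ca(iot, ca_file.read(), ver_file.read()))
```

Calling main() performs the registration and prints the id of the new CA certificate.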

Monday, August 20, 2018

DynamoDB best practices for scalability

DynamoDB can be thought of as a glorified hash map. It's essentially a key-value store. This makes designing a DynamoDB schema quite different from designing an RDBMS schema. A DynamoDB schema has to be approached with a query mindset: how you are going to query the system should drive the structure of your schema.

Think about queries and not about the data model and design around queries.
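
The hash map comparison can be made concrete with a toy model. This is only an illustration in plain Python, not DynamoDB itself, and the Email and City attributes are made up. A lookup by partition key goes straight to its bucket, while a filter on any other attribute has to visit every item.

```python
from collections import defaultdict

# A toy stand-in for a table: partition key value -> list of items.
table = defaultdict(list)


def put(item, partition_key="Email"):
    """Store an item in the bucket of its partition key value."""
    table[item[partition_key]].append(item)


def query(partition_value):
    """Key lookup: goes straight to one bucket, whatever the table size."""
    return table.get(partition_value, [])


def scan(predicate):
    """Filter on a non-key attribute: touches every item in the table."""
    return [item for items in table.values() for item in items if predicate(item)]


put({"Email": "a@example.com", "City": "Pune"})
put({"Email": "b@example.com", "City": "Mumbai"})
put({"Email": "c@example.com", "City": "Pune"})

by_key = query("a@example.com")                       # one bucket read
by_city = scan(lambda item: item["City"] == "Pune")   # reads all three items
```

If "find users by city" is a query the application needs, the toy model suggests the remedy: make City a key, either of the table or of an index, rather than filtering after the fact.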