In the previous post we looked at the architecture of a Big Data framework. In this post, we will look at the Lambda architecture. Lambda architecture is an architectural framework proposed by Nathan Marz to support long-term analytics over big data while also processing that data for real-time needs. More details can be found at http://lambda-architecture.net/. The basic building blocks of the Lambda architecture are:
- Data Stream - The data stream layer is the pipeline through which the data flows. The data is sent to both the batch layer and the speed layer. In traditional terms, think of it as a message bus carrying messages, with both the batch layer and the speed layer listening on it.
- Batch Layer - This is the storage layer where the messages are kept for long-term analytics. In terms of the introductory post, this is the HDFS storage system.
- Serving Layer - The serving layer processes the data stored in the batch layer and creates analytics views. The processed data is stored in a data warehouse tool so that downstream applications can use it to derive business value.
- Speed Layer - The speed layer is one of the subscribers of the data stream layer. It processes the data as it arrives and generates actionable output. The speed layer may also remember some of its own history to make the output more meaningful. For example, to find trends on Twitter, the speed layer has to keep history going back a certain period of time.
- Querying Layer - The querying layer is responsible for merging the output of the speed layer with the information from the serving layer. This produces a much more meaningful and actionable output.
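To make the building blocks above more concrete, here is a minimal in-memory sketch in Python. It is not a real implementation: the class names, the message shape (`ts`, `tag`), and the 60-second window are all assumptions made for illustration, and a plain list stands in for HDFS. It shows the data stream fanning out to the batch and speed layers, the serving layer building a view from the stored history, and a query that merges the two.

```python
from collections import Counter, deque


class DataStream:
    """Message bus: every published message goes to all subscribers."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, handler):
        self.subscribers.append(handler)

    def publish(self, message):
        for handler in self.subscribers:
            handler(message)


class BatchLayer:
    """Append-only store (a list standing in for HDFS) that keeps everything."""

    def __init__(self):
        self.master_dataset = []

    def append(self, message):
        self.master_dataset.append(message)


class ServingLayer:
    """Holds an analytics view computed from the batch layer's history."""

    def __init__(self):
        self.view = Counter()
        self.covered_up_to = 0  # timestamp through which the view is complete

    def rebuild(self, batch, now):
        self.view = Counter(
            m["tag"] for m in batch.master_dataset if m["ts"] <= now
        )
        self.covered_up_to = now


class SpeedLayer:
    """Keeps only a recent window of messages (hypothetical window size)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.recent = deque()

    def on_message(self, message):
        self.recent.append(message)
        # Forget messages that have fallen out of the window.
        cutoff = message["ts"] - self.window_seconds
        while self.recent and self.recent[0]["ts"] <= cutoff:
            self.recent.popleft()

    def counts_since(self, ts):
        return Counter(m["tag"] for m in self.recent if m["ts"] > ts)


def query(serving, speed, tag):
    """Querying layer: batch-derived view plus whatever arrived after it."""
    return serving.view[tag] + speed.counts_since(serving.covered_up_to)[tag]


stream, batch, serving = DataStream(), BatchLayer(), ServingLayer()
speed = SpeedLayer(window_seconds=60)
stream.subscribe(batch.append)       # batch layer listens to the stream
stream.subscribe(speed.on_message)   # so does the speed layer

stream.publish({"ts": 1, "tag": "#bigdata"})
stream.publish({"ts": 2, "tag": "#lambda"})
serving.rebuild(batch, now=2)        # periodic batch run
stream.publish({"ts": 3, "tag": "#lambda"})
stream.publish({"ts": 4, "tag": "#lambda"})

total = query(serving, speed, "#lambda")
print(total)  # 3: one from the serving view, two from the speed layer
```

Note that the query only asks the speed layer for messages newer than the last batch run, so nothing is counted twice. In a real deployment each of these classes would be a separate system.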
In the case of our blood pressure example, the serving layer can tell us that the current overshoot of BP is the fifth such event in less than a week and therefore requires serious attention. We can find many analogies to the Lambda architecture in our day-to-day life. Our life experiences are like the serving layer, while how we react in each moment is determined by our speed layer.
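The blood pressure example can be sketched in the same spirit. The snippet below is only an illustration: the systolic limit of 140, the rule that five overshoots in a week is serious, the field names, and the sample readings are all assumptions, not values from a real monitoring system. The serving layer's view is the list of past overshoots, the live reading plays the role of the speed layer's output, and `assess` acts as the querying layer.

```python
from datetime import datetime, timedelta

SYSTOLIC_LIMIT = 140  # hypothetical threshold for an "overshoot"
REPEAT_LIMIT = 5      # hypothetical count at which we escalate


def is_overshoot(reading):
    return reading["systolic"] > SYSTOLIC_LIMIT


def build_overshoot_view(history):
    """Serving layer: timestamps of past overshoots, from stored readings."""
    return sorted(r["at"] for r in history if is_overshoot(r))


def assess(reading, overshoot_view):
    """Querying layer: combine the live reading with the historical view."""
    if not is_overshoot(reading):
        return "normal"
    week_ago = reading["at"] - timedelta(days=7)
    earlier = sum(1 for t in overshoot_view if week_ago < t < reading["at"])
    occurrence = earlier + 1  # include the live reading itself
    if occurrence >= REPEAT_LIMIT:
        return f"serious: overshoot number {occurrence} in under a week"
    return "elevated"


# Made-up readings stored in the batch layer: (day of month, systolic).
history = [
    {"at": datetime(2024, 1, day, 8, 0), "systolic": value}
    for day, value in [(1, 150), (2, 145), (3, 120), (4, 160), (5, 155)]
]
view = build_overshoot_view(history)

live = {"at": datetime(2024, 1, 6, 8, 0), "systolic": 152}
result = assess(live, view)
print(result)  # serious: overshoot number 5 in under a week
```

Without the view, the same live reading would only be reported as "elevated". It is the history from the serving layer that turns it into something needing serious attention.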