Thursday, March 10, 2016

The Refined and Optimized Data Architecture for Enterprise Needs

We are all familiar by now with the hype around terms like "Big Data", "Hadoop", and "NoSQL databases". From the enterprise perspective, though, one principle sits behind all of that technology spending: how much value does it add to the enterprise?

Believe it or not, at the end of the day IT services are a supporting platform that helps business users and stakeholders achieve business goals and forecast future results. The main points can be shortlisted as follows:

Cost: Typical EDW storage (usually an MPP architecture such as Teradata, Netezza, or Oracle Exadata) is expensive, and it is constrained when it comes to keeping historical data or unstructured/sensor data. The alternative is to move "cold" data onto cheap commodity hardware, e.g. Hadoop.

Return on investment: Replacing all EDW infrastructure and legacy systems with a new architecture and new technologies is an expensive idea. Looking at long-term prospects and return on investment, the better approach is to gradually resolve the limitations of the existing EDW infrastructure and build a hybrid architecture that gets the maximum value out of it.

Linear scalability and long-term solution design: A flexible architecture that can handle growing data volumes and new types of data (social media, sensor, clickstream) is a challenge, so the data architecture has to be refined with long-term prospects in mind.
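
The cost point above can be sketched as a simple tiering rule. This is only a minimal illustration in Python: the 395-day retention window, the tier names "edw" and "hadoop", and the function name are my own assumptions, not something taken from any vendor or white paper.

```python
from datetime import date, timedelta

# Hypothetical retention window: how long structured data stays "hot" in the EDW.
HOT_RETENTION_DAYS = 395  # roughly 13 months; an assumed value


def storage_tier(record_date: date, today: date, structured: bool = True) -> str:
    """Return 'edw' for recent structured data and 'hadoop' for everything else."""
    if not structured:
        # Unstructured and sensor data go straight to the cheap tier.
        return "hadoop"
    age = today - record_date
    return "edw" if age <= timedelta(days=HOT_RETENTION_DAYS) else "hadoop"


today = date(2016, 3, 10)
print(storage_tier(date(2016, 1, 15), today))                   # recent, structured
print(storage_tier(date(2012, 6, 1), today))                    # old, structured
print(storage_tier(date(2016, 3, 1), today, structured=False))  # recent, unstructured
```

In practice the retention window would come from how far back business users actually query, not from a fixed constant.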

Below is a reasonably close hybrid solution design for a new enterprise data architecture.



The design above comes from a white paper published by Hortonworks. My one disagreement, which is a personal opinion, is with shifting all ETL to Hadoop. Instead, I believe we should keep it hybrid (at least for some time, running a parallel architecture): keep only the ETL work for unstructured data feeds and detailed sensor/clickstream data on the Hadoop platform, and let the traditional sources keep running on the existing infrastructure.
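
The split suggested above can be written down as a small routing rule. This is a sketch under my own assumptions: the feed type labels, the example feed names, and the `etl_platform` function are hypothetical and only illustrate the idea.

```python
# Hypothetical feed categories whose ETL would run on Hadoop; labels are illustrative.
HADOOP_FEED_TYPES = {"unstructured", "sensor", "clickstream"}


def etl_platform(feed_type: str) -> str:
    """Pick the platform that runs ETL for a given feed type."""
    return "hadoop" if feed_type.lower() in HADOOP_FEED_TYPES else "edw"


# Example feeds (made-up names) mapped to their feed type.
feeds = {
    "web_clicks": "clickstream",
    "plant_sensors": "sensor",
    "orders": "relational",
    "crm_accounts": "relational",
}

plan = {name: etl_platform(kind) for name, kind in feeds.items()}
for name, platform in plan.items():
    print(f"{name}: {platform}")
```

Keeping the rule in one place like this makes it easy to move a feed type from one platform to the other later, once the parallel architecture has proven itself.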

Reference  : http://info.hortonworks.com/rs/549-QAL-086/images/hortonworks-data-architecture-optimization.pdf?mkt_tok=3RkMMJWWfF9wsRonvKTKc%2B%2FhmjTEU5z16uQsWaeygYkz2EFye%2BLIHETpodcMTcVnMLDYDBceEJhqyQJxPr3AKNkNy9RxRhHqDg%3D%3D




