I have sat through many customer meetings that turned into long, stressful discussions about storage space and its cost, all because users find creative reasons to keep their data forever in case they need it someday. I sometimes think this stems from the nightmares of legacy infrastructure, when users had to wait weeks or months to retrieve old data from tapes stored offsite.
Whatever the rationale, storage costs skyrocket when users don’t think about or discuss their true access requirements. That overstretches a typically limited infrastructure budget and places an increased burden on IT.
Infrastructure has come a long way since the legacy days. Users can now access old data when they want it without everything being stored on a common platform. Often a conversation about business priorities leads to more realistic infrastructure requirements, which in turn can deliver significant cost savings without affecting the users’ need for data.
More than 95% of analytics are built to meet the requirements of the roughly 97% of users who use data for operational and tactical purposes. These users do not need data that has no relevance to the current timeframe of operations. Yet companies often keep data on expensive platforms for special use cases, such as legal requirements, to satisfy the needs of fewer than 3% of users in fewer than 5% of situations. This excess data volume not only increases infrastructure costs but also creates performance challenges.
Big data platforms such as Hadoop run on inexpensive infrastructure and provide an excellent, low-cost place to store data while still supporting analytics for the majority of users. Tools are available to move data between these platforms, so users can access data when they need it rather than waiting for an extended period of time.
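To make the classification step concrete, here is a minimal sketch in Python of how datasets might be split between an expensive primary platform and a lower-cost archive tier based on when they were last accessed. Everything in it is an assumption made for illustration: the `Dataset` fields, the tier names, and the 365-day window are hypothetical, and a real policy would come out of the conversation about business requirements rather than a single cutoff.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Dataset:
    """Hypothetical description of one stored dataset."""
    name: str
    size_gb: float
    last_accessed: date


def assign_tier(dataset, today, hot_window_days=365):
    """Return 'primary' if accessed within the window, else 'archive'.

    The 365-day default is an illustrative assumption, not a recommendation.
    """
    age = today - dataset.last_accessed
    return "primary" if age <= timedelta(days=hot_window_days) else "archive"


def plan_offload(datasets, today, hot_window_days=365):
    """Group datasets by the tier they would be assigned to."""
    plan = {"primary": [], "archive": []}
    for ds in datasets:
        plan[assign_tier(ds, today, hot_window_days)].append(ds)
    return plan


def tier_sizes(plan):
    """Total size in GB per tier, useful for a rough cost discussion."""
    return {tier: sum(ds.size_gb for ds in items) for tier, items in plan.items()}
```

A plan like this only identifies candidates for offloading. Actually moving the data, and keeping it reachable for the rare legal or audit request, is the job of the data movement tools mentioned above.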
It may be worthwhile to engage a knowledgeable business partner who understands modern infrastructure options and has experience helping companies classify and migrate their data to this more efficient storage model. OnX has proven reference architectures for Hadoop clusters that can be used to offload rarely needed data to lower-cost, efficient storage. We have been helping customers across all industries use Hadoop clusters as a cost-saving, performance-enhancing addition to their storage infrastructure.
By: Kajal Mukhopadhyaya, Principal Solutions Architect