Cloud storage sounds deceptively simple. You pay only for what you use, and it's easy to determine how much storage you're using at any point in time. However, as seasoned IT professionals know, managing cloud resources is rarely this straightforward.
As you deploy cloud storage resources, enact policies and procedures to optimize storage use. For example, tag storage objects with metadata to enforce fine-grained management and version-control policies, as well as limit storage costs. Additionally, consider how access controls and other security measures influence data storage in the cloud. Establish cloud storage management practices that make the most of vendor tools, as well as those from third-party resource managers, such as CloudCheckr and Cloudyn.
Limiting waste with metadata
Cloud storage enables an organization to save increasingly large volumes of data without incurring substantial costs. However, this can lead to a focus on the marginal costs of using more storage. When adding storage that costs only pennies per gigabyte, it's easy to overlook storage optimization. While this may be a viable strategy for smaller data sets, it's not for large-scale storage requirements.
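A rough back-of-the-envelope sketch shows why the marginal view breaks down at scale. The per-gigabyte price, the data set sizes and the number of copies below are all hypothetical figures chosen for illustration, not vendor quotes.

```python
# Illustrative only: the price, sizes and copy counts are assumed values.
PRICE_PER_GB_MONTH = 0.023  # hypothetical rate in US dollars


def monthly_cost(size_gb: float, copies: int = 1,
                 price: float = PRICE_PER_GB_MONTH) -> float:
    """Return the monthly storage cost of a data set kept in `copies` copies."""
    return size_gb * copies * price


# A 50 GB data set copied by four teams: the waste is barely noticeable.
small = monthly_cost(50, copies=4)

# A 200 TB data set copied by four teams: the extra copies dominate the bill.
large = monthly_cost(200_000, copies=4)
redundant = large - monthly_cost(200_000)  # cost of the three extra copies

print(f"small data set: ${small:,.2f} per month")
print(f"large data set: ${large:,.2f} per month, ${redundant:,.2f} of it redundant")
```

The same multiplication that looks harmless for a small data set becomes the largest line item once the data set grows, which is why duplication deserves attention before storage is added.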
It's more effective to determine what new data to store relative to what is already stored. Ask yourself: Does an analytics unit that requires customer data really need to maintain its own copies of customer data sets? It depends on the type of analytics unit, and on whom you ask. From a storage manager's perspective, this is an inefficient and costly approach. Analysts, on the other hand, need to understand the properties of their data sets. They require facts, such as the data creation date, the initial data source and the transformations applied, along with attribute descriptions -- including formulas to create derived values. Rather than hoping other analytics groups' data is exactly what they need, analysts may create and save their own data sets.
Metadata -- or tags associated with stored data blocks -- can reduce redundant data storage. Simple attributes, such as creation date, owner and applications using the data, are among the potential metadata tags. Users can detail descriptions such as attribute formulas and transformation descriptions in separate documents; use tags to link to more specific documentation.
To promote reuse, include metadata management in your storage policies. This reduces overall storage costs and, perhaps more importantly, facilitates shared data use. It also mitigates the risk of using multiple versions of formulas and source data for commonly used measures.
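As a minimal sketch of how tags can surface redundant copies, the example below keeps a small in-memory catalog and groups objects whose source and transformation tags match. The `StoredObject` class, the tag names (`owner`, `created`, `source`, `transform`, `docs`) and every path and value are hypothetical; a real deployment would read tags through the storage vendor's own tagging interface.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    key: str                                  # hypothetical object path
    size_gb: float
    tags: dict = field(default_factory=dict)  # metadata tags, all strings


def likely_duplicates(catalog):
    """Group objects that share both a 'source' and a 'transform' tag."""
    groups = defaultdict(list)
    for obj in catalog:
        signature = (obj.tags.get("source"), obj.tags.get("transform"))
        if None not in signature:  # objects missing a tag cannot be compared
            groups[signature].append(obj)
    return {sig: objs for sig, objs in groups.items() if len(objs) > 1}


catalog = [
    StoredObject("marketing/customers.parquet", 120.0,
                 {"owner": "marketing", "created": "2016-03-01",
                  "source": "crm", "transform": "dedupe-v2",
                  "docs": "wiki/customer-attributes"}),  # link to fuller docs
    StoredObject("finance/customers_copy.parquet", 120.0,
                 {"owner": "finance", "created": "2016-03-04",
                  "source": "crm", "transform": "dedupe-v2"}),
    StoredObject("ops/weblogs.parquet", 900.0,
                 {"owner": "ops", "created": "2016-02-11", "source": "web"}),
]

duplicates = likely_duplicates(catalog)
for (source, transform), objs in duplicates.items():
    print(source, transform, [o.key for o in objs])
```

Matching tags only flag candidates for review. Whether two data sets are truly interchangeable still depends on the documentation the tags point to.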
Access controls and security issues
Well-designed metadata promotes reuse, but reuse is not always appropriate. For example, confidential and private data requires a limited-access policy. Relational databases provide a range of tools for controlling access to data, including fine-grained, row-level access controls. How a user stores data in the cloud partially determines which access control methods apply.
Users who store data in a relational database in the cloud have the same access control options as they do on premises. Switching to a different storage model, such as Amazon Web Services' DynamoDB or SimpleDB, means relying on the access-control mechanisms available in those systems.
Block storage users may have to adapt to coarse-grained access controls, such as granting or denying access at the file level. These controls may make it necessary to duplicate data or organize it so confidential and private data is available only to those who legitimately need access.
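One way to organize data under file-level controls is to split each record into a public part and a confidential part, store them as separate files, and grant access per file. The sketch below assumes hypothetical field names, file names and group names, and it models the access list as a plain dictionary rather than any vendor's permission system.

```python
def partition_by_sensitivity(records, confidential_fields, key_field="id"):
    """Split each record into a public part and a confidential part.

    The key field is kept in both parts so the files can be rejoined.
    """
    public, confidential = [], []
    for rec in records:
        public.append({k: v for k, v in rec.items()
                       if k not in confidential_fields})
        confidential.append({k: v for k, v in rec.items()
                             if k in confidential_fields or k == key_field})
    return public, confidential


# File-level access list: each file names the groups allowed to read it.
FILE_ACL = {
    "customers_public.json": {"analytics", "billing"},
    "customers_confidential.json": {"billing"},
}


def can_read(group, filename, acl=FILE_ACL):
    """Grant or deny access to a whole file; unknown files are denied."""
    return group in acl.get(filename, set())


records = [
    {"id": 1, "region": "EU", "card_number": "0000-0000"},
    {"id": 2, "region": "US", "card_number": "1111-1111"},
]
public, confidential = partition_by_sensitivity(records, {"card_number"})

print(public[0])
print(can_read("analytics", "customers_confidential.json"))
```

The trade-off is visible in the output: analysts get a usable file without the sensitive column, but the key field is stored twice and the two files must be kept in sync.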
Tools to aid storage management in the cloud
As you formulate your cloud storage strategy, evaluate tools to help with overall management. Cloud vendors typically offer these tools, some of which include alerts to avoid exceeding storage thresholds. To collect and analyze storage data and predict storage trends and requirements, third-party tools, such as those from Cloudyn or CloudCheckr, may be necessary.
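To make the idea of threshold alerts and trend prediction concrete, here is a minimal sketch that fits a straight line to monthly usage totals and estimates how long until a threshold is crossed. The usage figures and the threshold are hypothetical, and the linear fit is an assumption made for illustration; it does not describe how any vendor or third-party tool forecasts usage.

```python
def fit_line(ys):
    """Least-squares slope and intercept for evenly spaced observations.

    Requires at least two observations.
    """
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def months_until(threshold_gb, usage_gb):
    """Estimate months from the latest observation until the threshold is hit.

    Returns 0.0 if usage is already at or above the threshold, and None if
    the fitted trend is flat or falling.
    """
    if usage_gb[-1] >= threshold_gb:
        return 0.0  # alert condition: the threshold is already exceeded
    slope, intercept = fit_line(usage_gb)
    if slope <= 0:
        return None
    crossing = (threshold_gb - intercept) / slope  # month index of crossing
    return max(0.0, crossing - (len(usage_gb) - 1))


usage = [400, 450, 500, 550, 600]  # hypothetical monthly totals in GB
remaining = months_until(1000, usage)
print(f"Estimated months until 1,000 GB: {remaining}")
```

A straight line is the simplest possible model. Seasonal or bursty workloads would need something richer, which is one reason dedicated analysis tools may be worth evaluating.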
Moving storage to the cloud will not automatically reduce storage costs. To maximize cloud storage ROI, active management and well-defined policies and procedures need to be in place.
About the author:
Dan Sullivan holds a master of science degree and is an author, systems architect and consultant with more than 20 years of IT experience. He has had engagements in advanced analytics, systems architecture, database design, enterprise security and business intelligence. He has worked in a broad range of industries, including financial services, manufacturing, pharmaceuticals, software development, government, retail and education. Dan has written extensively about topics that range from data warehousing, cloud computing and advanced analytics to security management, collaboration and text mining.