Business Information

Technology insights for the data-driven enterprise

Integrating cloud data? Don’t let data governance be an afterthought

Companies taking on cloud data integration projects must give ample thought to compliance, data sources and international regulations.

Splunk Inc. doesn't sell shoes or books. It's a cloud services company, involved in application delivery, big data and analytics. It's natural, then, to assume the company has an inside track on cloud integration, getting different applications and data sets in different locations to work together. But the fact is, Splunk wrestles with the very same issues as any other business that's looking to operate in the cloud.

Splunk's business needs require it to move multiple data sets among multiple applications in multiple clouds, tasks that require a clear understanding of the ultimate source and system of record for each data set, according to Christopher Nelson, the San Francisco company's senior director of business applications.

"People believe it's easy to leverage point-to-point integrations and tying systems together," Nelson said. "When companies start moving data between systems to improve their business processes, that's when they suddenly realize they need to have clear rules, or governance, as to how data is going to be moved, whether it should be moved at all, whether data can be fully leveraged where it already is, and who has access to each data component."

Data integration -- in which several applications share a common data set, or several sets of data are used by one application simultaneously -- presents a wide range of challenges for companies. Corporate governance rules likely dictate how data can and can't be used. Privacy regulations specify who can view information and who can't. Security protocols may call for encryption. Other mandates identify where data can be stored and how it must be handled when crossing international borders.

"Cloud application integration is about managing data in transit, so that you can apply different compliance rules, depending on where data is coming from, where it's going to or where the actual application resides," said Ross Mason, founder of cloud platform provider MuleSoft.

Border Patrol

That seemingly simple act of passing data is subject to regulations that vary widely by geography. In the U.S., there are restrictions regarding healthcare, financial services, human resources and payroll, thanks to legislation typified by the Health Insurance Portability and Accountability Act of 1996, or HIPAA, the federal patient privacy law. Businesses are adopting governance rules to address mandates for HIPAA and the Gramm-Leach-Bliley Act—a bank deregulation law passed in 1999—on a case-by-case basis, according to John Wheeler, an analyst at market research outfit Gartner.

What the U.S. doesn't have is territorial restrictions. In Europe, where countries are smaller and borders more plentiful, the issue is different. Businesses operating in Europe have to worry about data sovereignty, where data resides and restrictions against moving it across international boundaries, Mason said. "In the U.S., it's pretty straightforward, but once you get into Europe, you need some sort of hybrid strategy to make sure data remains within the physical boundaries of the country. It's a critical governance issue."
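The kind of residency rule Mason describes can be sketched as a check that an integration flow runs before it moves a record. This is only an illustration under stated assumptions: the country codes "XA" and "XB" are placeholders rather than a statement of any real law, and the names RESIDENCY_REQUIRED, may_transfer and route are hypothetical, not part of any vendor's product.

```python
# Hypothetical placeholder codes for countries whose data must stay in-country.
# These are illustrative only and do not describe any actual regulation.
RESIDENCY_REQUIRED = {"XA", "XB"}


def may_transfer(origin: str, destination: str) -> bool:
    """Return True if a record stored in `origin` may be copied to `destination`."""
    if origin in RESIDENCY_REQUIRED:
        # Data under a residency rule may only move within its own country.
        return destination == origin
    return True


def route(record: dict, destination: str) -> str:
    """Pick the location an integration flow may deliver the record to."""
    if may_transfer(record["country"], destination):
        return destination
    # Fall back to keeping the data where it originated,
    # for example in a local or on-premises store.
    return record["country"]


if __name__ == "__main__":
    print(route({"id": 1, "country": "XA"}, "US"))  # stays in XA
    print(route({"id": 2, "country": "US"}, "XA"))  # allowed to move
```

A real deployment would draw these rules from legal and compliance teams rather than a hard-coded set; the point is only that the decision depends on where the data originates and where it is headed.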

But having a data governance plan isn't just a European concern. Although data sovereignty applies to Europe, we live in a global marketplace where customers, transactions, inventories and the flow of information have little regard for frontiers. That makes governance an issue for every cloud provider and application developer. Enza Iannopollo, a Forrester Research analyst, said businesses that offer services or products to European citizens must comply with the European Union's data-protection requirements, whether they have a brick-and-mortar presence in the region or not.

Consider a bank officer who works in Switzerland, lives in Italy and maintains offices in London and New York. When she crosses an international border while using an internal company app on her smartphone, has she broken the law? Are locality rules being broken? Is she even allowed to have that data resident on her phone? There are no simple answers.

When it comes to conforming to EU regulations, Caspar Bowden, Microsoft's chief privacy adviser from 2002 to 2011, offers no-nonsense advice for cloud providers and users. In a 2012 presentation, "EU Data Sovereignty and Privacy In the Cloud," he recommended that providers use open source stacks; establish an audit process to document the static code base, updates and patches; create a forensic trail from source to compiled code and then to the machine-loaded binaries; ensure that subcontractors meet the same standards; and perhaps most important, declare data-retention policies and periods. Last, he urged cloud providers to take credit for and publicize their transparency. It's a business differentiator, after all.

"Data locality, where it physically resides and what rules apply, is a new area for people to think about. It's not just about security anymore," said Ash Kulkarni, senior vice president and general manager of cloud data integration at Informatica. The tide is turning as people grow more comfortable storing data in the cloud. Unless businesses feel they can do better at hiring security pros than Amazon Web Services or Microsoft Azure, the cloud is the safest option, he said.

The factors cloud integration providers must ponder include what rules to apply as data moves among clouds, which classes or groups of users can gain access, whether certain data fields (such as salary information) need to be masked or obfuscated and what kind of proliferation the data is being exposed to. "For the last several years, we've been making it possible to apply data masking as it flows through the pipeline," Kulkarni said. "As data travels between different silos, you can apply policies for governing who can access and in what form. This has become a much bigger part of the conversation than when data was solely on-premises."
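Masking a field such as salary as data flows between systems might look something like the following sketch. It assumes a simple role-based policy; the role names, field names and the mask_record function are hypothetical and are not drawn from Informatica's or any other vendor's tooling.

```python
# Hypothetical policy: the fields each role may see unmasked.
UNMASKED_FIELDS = {
    "hr_admin": {"name", "department", "salary"},
    "analyst": {"name", "department"},
}

MASK = "****"


def mask_record(record: dict, role: str) -> dict:
    """Return a copy of the record with every field the role may not see masked."""
    allowed = UNMASKED_FIELDS.get(role, set())  # unknown roles see nothing
    return {
        field: (value if field in allowed else MASK)
        for field, value in record.items()
    }


if __name__ == "__main__":
    employee = {"name": "A. Example", "department": "Finance", "salary": 90000}
    print(mask_record(employee, "analyst"))
    print(mask_record(employee, "hr_admin"))
```

Applying the policy in the pipeline, rather than in each destination system, means the same rule follows the data wherever it lands.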

Governing the ungovernable

Although sovereignty strives to keep data within international silos, no cloud is an island. Applications and data -- regardless of whether they reside on-premises or off -- must communicate with one another to get the job done. Integration, the process of configuring a diversity of home-grown, purchased or subscription applications to share data across private and public clouds and combinations of both, holds great benefit for business users.

For IT, it's not so rosy. As Splunk's Nelson notes, data governance -- managing the availability, usability, integrity and security of data used by an enterprise -- remains a key issue. And it can get very complicated very fast.

Consider one hypothetical example of what a poor data governance plan can create, posed by Brent Carlson, senior vice president of technology at Akana, a provider of cloud, mobile and Internet of Things products: A financial services business plans to publish a credit card processing API to its external business partners. The API must conform to payment-card industry standards. The company has also likely established its own internal corporate governance policies over the API—or if not, it should.

Carlson said without a good data governance plan in place, that API could be routed through the regular development and operational processes, resulting in the omission of a security review audit trail.

Even worse, publication of the API into a noncompliant production runtime environment could easily expose the company to a litany of woes. The Payment Card Industry Security Standards Council warns that noncompliance can lead to a loss of sales, reputation and share price -- and exposure to lawsuits, insurance claims and hefty government fines. For good measure, toss in the cost of providing credit monitoring services to every person affected by a data security breach.

Companies with strong governance for on-premises applications can't depend on that being enough in the cloud. "As elements of their infrastructure move to the cloud, they've got to figure out how to fit within those governance bounds or where they need to rewrite rules that are no longer adequate," MuleSoft's Mason said.

What level of governance and data security is deemed good enough when it comes to choosing a cloud service? The International Organization for Standardization's ISO/IEC 27018:2014 is here to help. This new standard aims to ensure that cloud service providers protect customers' privacy by securing any personally identifiable information they handle.

The next level

Although governance, risk and compliance is most frequently thought of at the business level, it's important not to lose sight of the integrations required at the API level, according to one of Carlson's colleagues. "When companies had fully internal systems, they might be talking about 100 integrations, but as they go outside to the cloud, companies are finding they have 300 to 800 integrations they need to manage," said Laura Heritage, director of API strategy at Akana. "That's a governance factor that's crucial to every enterprise, exposing your own cloud service and integrating with your partners."

Regardless of how many discrete integrations there are, it adds up to lots of applications sharing lots of data sets. Managing the processes and ensuring that policies are established -- and adhered to -- becomes increasingly important.

For Splunk, the challenge was to solve integration-related architectural issues not just for its own operations, but for its cloud services customers.

Nelson, the applications director, offers three pieces of advice. First, develop a schema for where the master data sits—look at inactive data and determine how to present the data where necessary while eliminating duplication. Second, understand the value of APIs and services instead of just moving data back and forth. Finally, know what you're trying to build before getting into the technology. Focus on architectural principles, and then find supporting technology.

"Leverage tools out of the box as much as possible," Nelson said. "Once you go down the path to customization, you have to be careful to remain compatible with future updates."
