As cloud computing evolves, one particular use for it is beginning to stand out: big data. What exactly big data means and how it fits into the cloud conversation, however, are questions not easily answered.
Medio Systems Inc., founded in 2004 as an application service provider (ASP), has recently reinvented itself as a cloud-hosted provider of big data analytics, with a focus on mobile platforms. The company is also marketing its inGenius Software as a Service to small and medium-sized businesses (SMBs); the product has traditionally been aimed at enterprise customers such as T-Mobile, Verizon, Disney and CBS.
Brian Lent, co-founder and chief technology officer of Medio, and Ivan Sucharski, Medio's data strategist, spoke about big data, the evolution of cloud computing and how the two trends affect each other.
How would you define big data for people who aren't all that familiar with it?
Brian Lent: I would define it as data on [such] a scale that you can't have a single department effectively manage it. Once you turn data into an operational process where you say, 'The data is going to drive our commerce engine and our recommendations and our churn modeling and our financial forecast,' now it becomes a different kind of asset, where you need to have the analytics package sitting next to the data, colocated because of the volume and the transfer. Once data is seen as a profit center, and hence there's a utility, then I think it moves more into the big data realm.
What actually distinguishes big data and cloud computing from what came before? Or are they just the same things by different names?
Lent: As an ASP, we would still license software with a very traditional approach, where you might do a license agreement for three years. Now what we're doing with the cloud-based approach is a license fee per month. It's much more measured on volume, based on monthly active users. On the technical front, I think there are a lot of differences, but one would be us making available a lot of our services through REST-based APIs [application programming interfaces].
On the big data side, the fundamental difference is volume-based. So, when some of our customers get into petabytes, that certainly puts you in that big data camp. The other aspect of big data is the velocity at which data is changing. One of our newer customers is Rovio, the makers of Angry Birds, which has just reached a billion downloads. We're seeing more than 1.4 billion logging events in our cloud per day [in total]. They could launch a new app that could quadruple the traffic instantly.
How does big data fit into a cloud computing discussion, and vice versa?
Lent: The combination of cloud computing and big data is going to become more practical, simply because of the efficiencies of scale. The ability for any one company to keep up from an IT perspective, I think, is difficult.
With big data typically come folks that know how to work on that data, and that may be the biggest gap -- you can't just hire this talent. A friend that used to be at Google talked about data scientists as 'pink unicorns,' and said that there's only about 120 pink unicorns in the world -- true data scientists that know how to manipulate and work with big data, delivering business value. So if you think about that as a commodity and a limited resource, the question becomes, 'How do you centralize that into a cloud-based environment so everyone can get the value, but you don't have to have that person on-premises?'
Ivan Sucharski: There's elasticity, too. When you're running analytical models, you want them to run as fast as possible, but you don't need to run them every 10 minutes. So you need as many machines as you can get for the next half hour. And then they're idle for 23 and a half hours, until the next computing cycle. In a cloud situation, you've got the flexibility of that elasticity without the cost of being offline 95% of the time.
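The arithmetic behind Sucharski's elasticity point can be sketched in a few lines of Python. This is only an illustration: the half-hour daily run comes from his example, while the fleet size of 100 machines and the function names are hypothetical, and no real pricing is assumed.

```python
def idle_fraction(busy_hours_per_day: float) -> float:
    """Share of each 24-hour day that a dedicated machine sits unused."""
    return 1.0 - busy_hours_per_day / 24.0


def machine_hours(machines: int, busy_hours_per_day: float) -> tuple[float, float]:
    """Return (always_on, on_demand) machine-hours consumed per day.

    always_on: machines owned and powered around the clock.
    on_demand: machines rented only while the models are running.
    """
    always_on = machines * 24.0
    on_demand = machines * busy_hours_per_day
    return always_on, on_demand


if __name__ == "__main__":
    machines = 100      # hypothetical fleet size, not a figure from the interview
    busy_hours = 0.5    # "the next half hour" from the example above

    always_on, on_demand = machine_hours(machines, busy_hours)
    print(f"Idle share of the day: {idle_fraction(busy_hours):.1%}")
    print(f"Always-on machine-hours per day: {always_on:.0f}")
    print(f"On-demand machine-hours per day: {on_demand:.0f}")
```

Under these assumptions a half-hour daily job leaves dedicated hardware idle for roughly 98% of the day, so the on-demand fleet consumes about one forty-eighth of the machine-hours of an always-on one.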
What is the most misunderstood thing about big data right now?
Lent: The notion that it's just about the size of data. The reality is, big data as a term is morphing to describe the complexities of data, and also how data is used in the enterprise. I think there's a connotation with big data that you're going to find operational uses for that data versus just storing it.
How do you see big data and the cloud evolving in the future?
Lent: You'll see the chief financial officer get engaged more into these types of decisions where they haven't to date, and get more engaged with the chief marketing officer (CMO) and the chief information officer (CIO). I think you'll see the cloud decision making moved to the CMO from the CIO. So rather than it being an IT artifact, big data and the cloud will be part of the core decision making when they go to roll out a new product.
Sucharski: Many organizations are just at the baby steps of collecting the appropriate data. Turning data into a commodity will create a new generation of individuals who are interested in what I care about, which is quality, which means that the overall quality of information will jump. Today you're definitely mining. You're digging through a lot of garbage to glean small insights.