Cloud providers are racing to expand their infrastructure options as massive amounts of data move to their platforms, and Google is no different, as it looks to these newer technologies as a way to become a major player in the market.
Bart Sano, vice president of platforms at Google, heads up a team that's been around almost as long as the company itself. They design the warehouse-scale data centers and everything inside for a company renowned for its ability to operate at scale.
Sano spoke with SearchCloudComputing about Google hardware and infrastructure -- from how the team adjusts to enterprise demands, to the next wave of cloud computing and the challenge of migrating customers' diverse workloads.
Google and Intel recently announced a partnership around enterprise cloud adoption. Part of that includes bringing the next-generation Intel chip to Google Cloud Platform in 2017. What's the impetus for this partnership?
Bart Sano: We normally don't do that type of announcement, but we thought it was important so people could understand that the technology is coming early next year. It's useful, obviously, within Google itself for our search, ads and other systems. It's also very useful for the cloud because it offers higher performance, more configurations that help with different workloads needing a larger memory footprint, more threads and such, and also an instruction architecture that helps with more computation and vector processing.
Google built its infrastructure to meet its own specific demands. Do you change your underlying hardware to meet the demands of cloud customers with a range of different needs?
Sano: Think of Google as five to eight different product areas, and those product areas have their own form and functions. And as we go to the cloud, we have a greater diversity of clients. Many of them still fit within the type of SKUs we do internally because they're general-purpose in nature, but there are customers that want the largest memory configuration, or the fastest floating-point ratios. It's getting to be a lot more heterogeneous, not only in compute but in the numerical computations -- GPUs and ultimately with our machine learning with TPUs [Tensor Processing Units].
Between TPUs and GPUs and even field-programmable gate arrays (FPGAs), there seems to be a big rush by the major cloud providers to incorporate these technologies on their platforms. What's behind that?
Sano: We're trying to support what we think is the next wave, which is machine learning and big data processing, and leveraging machine learning and analytics on that big data. You need more numerical computation to figure out what's in that big data processing.
For example, not everybody needs a GPU to do a small machine learning model that does one particular task -- maybe the CPU function is enough. That is the way we used to do it until our problems became so big that we had to use GPUs. Then, the problems became way too big and we had to do our own custom hardware. Then, the decision [becomes], do you do some custom ASICs [application-specific integrated circuits] versus an FPGA? That's a difference in architectural approach: Do you want something that's more programmable versus something that's more fixed-function but has more efficiency? Both have their own trappings and advantages.
The reason for all of these different acronyms is [because] we are seeing a shift in computation from general-purpose computing into the machine learning and analytics space, and we're seeing cloud providers trying to bring in that analytics capability that the general-purpose stuff didn't need before.
Could you explain a bit more about how that process played out at Google?
Sano: I'll start with the FPGAs. You typically go with an FPGA because it is programmable and you can't 'predict the future,' so you have the flexibility. You deploy this FPGA in the fleet and then you personalize it later. I understand that desire because it's hard to predict the future, but they're very costly and very power-hungry because they're general-purpose.
The other direction is if you can develop your customized ASICs fast enough and deploy them quickly, then the advantages that the FPGAs have are mitigated. That is our stance. We are able to develop ASICs in a timely manner, and we always work on having an infrastructure that you can, in a sense, remanufacture and repersonalize.
Google was an early adopter of containers. What's your perspective on how the technology has now come into vogue?
Sano: Shortly after I got here, we had a decision to make: Should we go with VMs or containers? I remember the long dread, but what we finally [decided] is containers have the lower overhead, and although it might complicate some of the management and such, it made for a much more efficient solution. And what we proved out is that it was a good decision. VMs are very flexible, but you pay a higher premium for that flexibility. Efficiency is the really big, important thing for us. Because of our scale, 1% or 2% of memory efficiency or processor migration time or overhead -- it matters.
What are the challenges Google faces to facilitate customer migration at scale?
Sano: It's not easy to move the data over, and that is a big challenge. Software is the biggest challenge, quite frankly: getting all of the software, or the data sets, to the point where they can migrate over. I could easily see where it might turn into a heterogeneous platform environment.
To get out of their on-prem environment into our cloud, it's not only the software but physical constraints. You're tied to hardware, as it were, and that's the thing we're working with them on ... we're trying to be as flexible as we can to accommodate them, but it is a transition for this industry that we have to go through.
What are the other big challenges in this transition to cloud?
Sano: For the legacy enterprise running on legacy systems and such, there is a migration strategy that will have to be developed and navigated. That, to me, is the biggest question, and what we're trying to do is build as many bridges along the way to peer with the hybrid environments.
Trevor Jones is a news writer with TechTarget's Data Center and Virtualization Media Group. Contact him at firstname.lastname@example.org.