Google Cloud Dataproc is a managed service within the Google Cloud Platform for processing large datasets, such as those used in big data initiatives. Dataproc is built on open source platforms including Apache Hadoop, Spark and Pig. The service is primarily used by data scientists, business decision-makers and researchers.
MapReduce is a core component of the Apache Hadoop software framework.
Microsoft Azure Data Lake is a highly scalable public cloud service that allows developers, scientists, business professionals and other Microsoft customers to gain insight from large, complex data sets.
Microsoft Azure Machine Learning is a collection of services and tools intended to help developers train and deploy machine learning models.