Words to go: Microsoft Azure big data services

As Microsoft grows its Azure big data portfolio, it can be tough for users to keep up. Here's a breakdown of must-know Azure terms for organizations taking the big data leap.

Big data is growing, not only in size, but also in popularity. Every day provides businesses with more data to sift through, whether from internal transactions or social media. But many enterprises need a service that can crunch all that information in a short amount of time -- and that is where cloud enters the picture.

Microsoft Azure big data services are gaining traction, as the company refocuses its mission on building an intelligent cloud platform. The Azure platform offers capabilities including information management, storage, machine learning, analytics and cognitive services. Additionally, enterprises can access applications from big data and advanced analytics partners in the Azure Marketplace.

But before getting started, use this list of terms to get acquainted with Microsoft Azure big data services:

Azure Data Lake Analytics: Data Lake Analytics is a query service for big data in Microsoft's public cloud. The service lets users analyze data to gain insights and scales resources automatically. According to Microsoft, organizations can use Data Lake Analytics with their existing identity, management, security and data warehousing tools. Azure Active Directory is integrated with the service to manage user permissions. Azure Data Lake Analytics integrates with Azure SQL Data Warehouse, Power BI and Data Factory, and is part of the Cortana Analytics Suite. The service uses U-SQL, a Microsoft query language derived from SQL and C#.

Azure Data Lake Store: Data Lake Store provides hyperscale storage, based on Apache Hadoop, for big data sets. The system can store structured and unstructured data in its native format. According to Microsoft, the service is designed for low latency and places no fixed limits on account or file size. It is integrated with other Microsoft Azure big data services, including Azure Data Lake Analytics and Azure HDInsight.

Azure HDInsight: HDInsight is a Hadoop-as-a-service offering for managing Apache Hadoop, Spark and R clusters. It can scale on demand and store large amounts of data, which users can analyze and visualize with Excel. The service is integrated with Hortonworks Data Platform, which enables data to migrate from an on-premises data center to Azure. HDInsight also includes Apache HBase, Apache Storm, Apache Spark and R Server for Hadoop.

Azure Stream Analytics: Stream Analytics is a service that allows users to perform real-time analytics. Primarily used for the internet of things, it processes streams of data to extract insights, and it scales while keeping latency low. It is integrated with Azure Event Hubs and can compare multiple streams. The service can send customized alerts and display real-time data in a dashboard.
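
To make the idea of real-time analytics with alerts more concrete, here is a minimal Python sketch of one common stream-processing pattern: averaging sensor readings over fixed, non-overlapping time windows and flagging any window that exceeds a threshold. It is an illustration only and does not use Stream Analytics or any Azure API. The `Reading` type, the function names, the sample data and the 30-degree threshold are all hypothetical, and the sketch processes a small in-memory list rather than a live stream.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Reading(NamedTuple):
    """A hypothetical IoT sensor event (not an Azure type)."""
    device_id: str
    timestamp: int      # seconds since an arbitrary start
    temperature: float


def tumbling_window_averages(
    readings: Iterable[Reading], window_seconds: int
) -> Dict[Tuple[str, int], float]:
    """Average readings per device over fixed, non-overlapping windows."""
    buckets: Dict[Tuple[str, int], List[float]] = defaultdict(list)
    for r in readings:
        # Round the timestamp down to the start of its window.
        window_start = (r.timestamp // window_seconds) * window_seconds
        buckets[(r.device_id, window_start)].append(r.temperature)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


def find_alerts(
    averages: Dict[Tuple[str, int], float], threshold: float
) -> List[Tuple[str, int]]:
    """Return the (device, window start) pairs whose average exceeds the threshold."""
    return sorted(key for key, avg in averages.items() if avg > threshold)


# Hypothetical sample data standing in for a stream of events.
sample = [
    Reading("sensor-a", 0, 20.0),
    Reading("sensor-a", 5, 22.0),
    Reading("sensor-a", 12, 31.0),
    Reading("sensor-b", 3, 19.0),
    Reading("sensor-a", 17, 33.0),
]

averages = tumbling_window_averages(sample, window_seconds=10)
hot_windows = find_alerts(averages, threshold=30.0)
print(hot_windows)
```

A managed streaming service would apply the same kind of windowed logic continuously as events arrive, rather than over a finished list.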

Azure Data Factory: Data Factory is an orchestration service that coordinates the movement of data between on-premises systems and the cloud to prepare it for consumption. The service creates, schedules, manages and orchestrates the flow of data, and users can monitor and automate the resulting data pipelines. It is often used in conjunction with other Microsoft Azure big data services, such as HDInsight, Stream Analytics and machine learning.

Azure Data Catalog: Data Catalog is a managed service that streamlines data discovery. The tool allows users to register and discover data sources, as well as share insights. Users can organize metadata into a catalog, and control who can access which data sets.
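
The register, discover and access-control ideas described above can be sketched in a few lines of Python. This is a toy in-memory model meant only to show the concept. It is not the Data Catalog API, and the `DataSource` and `Catalog` classes, their fields and the sample entries are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DataSource:
    """Hypothetical metadata record for a registered data source."""
    name: str
    description: str
    tags: Set[str] = field(default_factory=set)
    allowed_users: Set[str] = field(default_factory=set)


class Catalog:
    """Toy in-memory catalog: register sources, then search them by term."""

    def __init__(self) -> None:
        self._sources: Dict[str, DataSource] = {}

    def register(self, source: DataSource) -> None:
        # Registering the same name again overwrites the earlier entry.
        self._sources[source.name] = source

    def discover(self, term: str, user: str) -> List[str]:
        """Return names of sources matching the term that the user may access."""
        term = term.lower()
        return sorted(
            s.name
            for s in self._sources.values()
            if user in s.allowed_users
            and (
                term in s.description.lower()
                or term in {t.lower() for t in s.tags}
            )
        )


catalog = Catalog()
catalog.register(DataSource(
    name="sales_2016",
    description="Quarterly sales transactions",
    tags={"sales", "finance"},
    allowed_users={"alice", "bob"},
))
catalog.register(DataSource(
    name="web_clicks",
    description="Clickstream from the public website",
    tags={"marketing"},
    allowed_users={"alice"},
))

alice_results = catalog.discover("sales", user="alice")
bob_results = catalog.discover("marketing", user="bob")
print(alice_results, bob_results)
```

In this sketch, access control is a simple allow list per source; a real catalog service would tie permissions to the organization's identity system.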

Azure Power BI Embedded: Power BI Embedded is a service that lets users create interactive reports to visualize data. Organizations can embed those visuals within applications through REST APIs and software development kits, without having to change the application's design. The service can visualize data from multiple sources, including Azure SQL Database and Azure SQL Data Warehouse. It offers out-of-the-box data visualizations, and users can also create custom visuals.
