Definition

Microsoft Azure Data Lake

Microsoft Azure Data Lake is a highly scalable data storage and analytics service. The service is hosted in Azure, Microsoft's public cloud, and is largely intended for big data storage and analysis. Like other data lakes, Azure Data Lake allows developers, scientists, business professionals and other users to gain insight from large, complex data sets. To do this, users write queries that process data and generate results. Because Azure Data Lake is a cloud computing service, it gives customers a faster and more efficient alternative to deploying and managing big data infrastructure within their own data centers.

As with most data lake offerings, the Azure Data Lake service is composed of two parts: data storage and data analytics. Users can store enormous volumes of structured, semi-structured or unstructured data produced from any application, ranging from large archival stores to small, time-sensitive transactional data. According to Microsoft, users can provision Azure Data Lake to store terabytes or even exabytes of data. The storage service also provides high throughput for fast data processing.

On the analytics side, Azure Data Lake users can write their own code for specific data transformation and analysis tasks, or use existing tools, such as Microsoft's Analytics Platform System or Azure Data Lake Analytics, to query data sets.

The analytics side of Azure Data Lake is built on the Apache Hadoop YARN (Yet Another Resource Negotiator) cluster management platform and is designed to scale dynamically within the Azure public cloud. This elasticity helps the service meet the demands of big data projects, which tend to be compute-intensive.

Users can write their own processing code for Azure Data Lake in a language such as U-SQL, which combines declarative SQL syntax with user-written C# code. U-SQL also lets users run analytics across SQL Server instances running in Azure, as well as across Azure SQL Database and Azure SQL Data Warehouse. As a result, a single query language can reach several of the main data sources in Azure.

Pricing for Azure Data Lake has several components, including storage capacity, the number of analytics units (AUs) consumed per minute, the number of completed jobs and the cost of managed Hadoop and Spark clusters. The Azure Pricing Calculator can help users estimate data lake costs for a given workload.
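
To show how those pricing components combine, here is a minimal Python sketch of a monthly estimate. The `Rates` class, the `estimate_monthly_cost` function and every unit price in it are hypothetical placeholders for illustration, not Microsoft's actual rates or billing formula; real figures should come from the Azure Pricing Calculator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices (placeholders, not real Azure rates)."""
    storage_per_gb_month: float   # price per GB stored for one month
    per_au_minute: float          # price per analytics unit (AU) per minute
    per_completed_job: float      # price per completed analytics job


def estimate_monthly_cost(rates, stored_gb, au_minutes, completed_jobs,
                          cluster_cost=0.0):
    """Sum the pricing components named in the article into one estimate.

    cluster_cost is a flat amount for any managed Hadoop or Spark clusters.
    """
    storage = stored_gb * rates.storage_per_gb_month
    analytics = au_minutes * rates.per_au_minute
    jobs = completed_jobs * rates.per_completed_job
    return round(storage + analytics + jobs + cluster_cost, 2)


# Placeholder rates chosen only to make the arithmetic easy to follow.
example_rates = Rates(storage_per_gb_month=0.5,
                      per_au_minute=0.25,
                      per_completed_job=0.125)

# 1,000 GB stored, 600 AU-minutes, 100 jobs, plus a flat 50 for clusters.
example_total = estimate_monthly_cost(example_rates,
                                      stored_gb=1000,
                                      au_minutes=600,
                                      completed_jobs=100,
                                      cluster_cost=50.0)
print(example_total)  # 500 + 150 + 12.5 + 50 = 712.5
```

Keeping each component as a separate term makes it easy to see which part of a workload drives the bill, for example whether storage or analytics time dominates.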

This was last updated in April 2016
