Would Azure Data Factory benefit my cloud data?

Azure Data Factory helps move large volumes of data between the cloud and on-premises environments. But how else could it impact my big data strategy?

To distinguish itself from competitors such as Google and Amazon Web Services, Microsoft rolled out a number of cloud services focused on machine learning, big data and analytics. For example, Microsoft recently unveiled Azure Data Factory, a workflow system for coordinating data flows between storage and processing systems.

AWS offers Data Pipeline, a service comparable to Data Factory, and Google offers Cloud Dataflow. All three services are designed to streamline repeated data movement operations, but Azure Data Factory has its own lineup of features for enterprises to consider.

Azure Data Factory serves some of the functions of an extraction, transformation and load (ETL) tool, but it is designed specifically to move large volumes of data between cloud and on-premises resources. Developers can create data pipelines using an Azure Data Factory console or PowerShell scripts.

Data Factory performs a number of "activities," which are processes that take a data set as input and produce an output data set. The basic activity within Azure Data Factory is the Copy Activity, which supports a range of sources, including Azure Blob Storage, Azure SQL Database, Azure Table Storage, SQL Server databases running on-premises or on infrastructure as a service (IaaS), and Oracle databases. The Copy Activity supports some transformations, as well.

Azure Data Factory is especially well-suited for big data applications and analysis. For example, HDInsight Activity allows developers to work with Pig, a high-level data flow language in the Hadoop ecosystem, and Hive, a data warehouse layer that provides SQL-like queries over data in Hadoop. Users can store data in a data hub for further processing.

Users configure Azure Data Factory jobs with JSON specifications, including inputs, outputs, transformations and policies. Transformations can take advantage of Data Factory date, time and text functions.
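
As a rough illustration of the kind of JSON specification described above, the Python sketch below builds a pipeline definition with a single copy activity and serializes it to JSON. The key names (`activities`, `inputs`, `outputs`, `transformation`, `policy`), the data set names and all values are assumptions chosen to mirror the four elements named in the text. They are not a verified copy of the Data Factory schema, so check Microsoft's documentation for the exact format.

```python
import json

# Hypothetical pipeline definition. Every key and value here is illustrative
# and mirrors the elements the article names (inputs, outputs,
# transformations, policies); it is NOT the verified Data Factory schema.
pipeline_spec = {
    "name": "ExampleCopyPipeline",
    "properties": {
        "activities": [
            {
                "name": "CopyBlobToSql",
                "type": "CopyActivity",
                # Data sets the activity reads from and writes to.
                "inputs": [{"name": "RawLogsBlob"}],
                "outputs": [{"name": "LogsSqlTable"}],
                # How data is read from the source and written to the sink.
                "transformation": {
                    "source": {"type": "BlobSource"},
                    "sink": {"type": "SqlSink"},
                },
                # Execution policy, such as retries and a timeout.
                "policy": {"retry": 2, "timeout": "01:00:00"},
            }
        ]
    },
}


def to_json(spec):
    """Serialize a pipeline specification to an indented JSON string."""
    return json.dumps(spec, indent=2)


if __name__ == "__main__":
    print(to_json(pipeline_spec))
```

Keeping the specification as a plain dictionary makes it easy to generate variants in a script and to review the resulting JSON before submitting it through the console or PowerShell.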

The Azure Management Portal provides access to key information about Data Factory processes and workloads. Administrators can view information on data sets and linked services, along with activity run details.

Azure Data Factory pricing is based on activity frequency. Low-frequency activities in the cloud start at $0.30 per activity, while on-premises activities cost $0.75 per activity. Microsoft does not charge for the first five low-frequency activities performed each month.

High-frequency activities start at $0.50 for cloud and $1.25 for on-premises environments. Microsoft offers volume discounts based on the number of activities performed each month.
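
To make the pricing arithmetic concrete, here is a minimal Python sketch that estimates a monthly bill from the starting prices quoted above. It rests on two assumptions that the article does not confirm: the five free low-frequency activities are consumed by cloud activities first and then by on-premises ones, and volume discounts are ignored because their tiers are not given. The function name and the shape of the input dictionary are hypothetical.

```python
from decimal import Decimal

# Starting per-activity prices quoted in the article (USD).
RATES = {
    ("low", "cloud"): Decimal("0.30"),
    ("low", "on_premises"): Decimal("0.75"),
    ("high", "cloud"): Decimal("0.50"),
    ("high", "on_premises"): Decimal("1.25"),
}
FREE_LOW_FREQUENCY_ACTIVITIES = 5

# Order in which activity types are billed. Assumption: the free allowance
# is used up by low-frequency cloud activities before on-premises ones.
BILLING_ORDER = [
    ("low", "cloud"),
    ("low", "on_premises"),
    ("high", "cloud"),
    ("high", "on_premises"),
]


def estimate_monthly_cost(counts):
    """Estimate the monthly cost for a mapping of
    (frequency, location) -> number of activities.
    Volume discounts are not modeled."""
    remaining_free = FREE_LOW_FREQUENCY_ACTIVITIES
    total = Decimal("0")
    for key in BILLING_ORDER:
        billable = counts.get(key, 0)
        if key[0] == "low":
            # Only low-frequency activities draw on the free allowance.
            free = min(billable, remaining_free)
            remaining_free -= free
            billable -= free
        total += billable * RATES[key]
    return total


if __name__ == "__main__":
    usage = {("low", "cloud"): 20, ("high", "on_premises"): 10}
    print(estimate_monthly_cost(usage))  # prints 17.00
```

In this example, 15 of the 20 low-frequency cloud activities are billable at $0.30 ($4.50), and the 10 high-frequency on-premises activities cost $1.25 each ($12.50), for a total of $17.00 before any volume discount.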

About the author:
Dan Sullivan holds a master of science degree and is an author, systems architect and consultant with more than 20 years of IT experience. He has had engagements in advanced analytics, systems architecture, database design, enterprise security and business intelligence. He has worked in a broad range of industries, including financial services, manufacturing, pharmaceuticals, software development, government, retail and education. Dan has written extensively about topics that range from data warehousing, cloud computing and advanced analytics to security management, collaboration and text mining.


This was last published in June 2015
