Azure Data Factory (ADF) is a fully managed service for composing data storage, processing, and movement services into streamlined, scalable, and reliable data production pipelines.
It is designed to let developers integrate disparate data sources. Think of it as a platform somewhat like SSIS in the cloud, used to manage the data you have both on-premises and in the cloud.
It provides access to on-premises data in SQL Server and cloud data in Azure Storage (Blob and Tables) and Azure SQL Database. Access to on-premises data is provided through a data management gateway that connects to on-premises SQL Server databases.
It is not a drag-and-drop interface like SSIS. Instead, data processing is enabled initially through Hive, Pig and custom C# activities. Such activities can be used to clean data, mask data fields, and transform data in a wide variety of complex ways.
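To make "clean data" and "mask data fields" concrete, here is a minimal sketch of the kind of logic such an activity might carry. It is written in Python purely for illustration (Data Factory's custom activities are C#), and the names mask_email and clean_and_mask, along with the "id" and "email" fields, are hypothetical and not part of any Data Factory API.

```python
import hashlib


def mask_email(email: str) -> str:
    """Keep the domain and replace the local part with a short hash."""
    local, sep, domain = email.partition("@")
    if not sep:
        # Not an email-shaped value, so mask the whole thing.
        return "***"
    digest = hashlib.sha256(local.encode("utf-8")).hexdigest()[:8]
    return f"{digest}@{domain}"


def clean_and_mask(rows, masked_fields=("email",)):
    """Trim string values, drop rows with no id, and mask the selected fields.

    `rows` is a list of dicts; the "id" and "email" keys are assumptions
    made for this example only.
    """
    out = []
    for row in rows:
        cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        if not cleaned.get("id"):
            continue  # cleaning step: skip records without an identifier
        for name in masked_fields:
            if isinstance(cleaned.get(name), str):
                cleaned[name] = mask_email(cleaned[name])
        out.append(cleaned)
    return out


if __name__ == "__main__":
    sample = [
        {"id": "1", "email": " alice@example.com "},
        {"id": "", "email": "nobody@example.com"},
    ]
    print(clean_and_mask(sample))
```

Hashing the local part rather than blanking it keeps records joinable on the masked value, which is one common reason to mask instead of delete.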
You author your activities, combine them into a pipeline, set an execution schedule, and you’re done. Data Factory also provides an up-to-the-moment monitoring dashboard, so you can deploy your data pipelines and immediately begin to view them there.
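The author, combine, and schedule flow described above can be pictured with a small conceptual model. This sketch is not Data Factory's authoring format or API. The Activity and Pipeline classes, their fields, and the hourly interval are assumptions chosen only to show how the three steps relate.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Callable, List


@dataclass
class Activity:
    """A named processing step (hypothetical stand-in for a real activity)."""
    name: str
    run: Callable[[list], list]


@dataclass
class Pipeline:
    """An ordered chain of activities plus a simple fixed-interval schedule."""
    name: str
    start: datetime
    interval: timedelta
    activities: List[Activity] = field(default_factory=list)

    def run_once(self, data: list) -> list:
        # Each activity consumes the output of the previous one.
        for activity in self.activities:
            data = activity.run(data)
        return data

    def scheduled_runs(self, count: int) -> List[datetime]:
        # The first `count` execution times implied by the schedule.
        return [self.start + i * self.interval for i in range(count)]


if __name__ == "__main__":
    pipeline = Pipeline(
        name="demo",
        start=datetime(2015, 1, 1),
        interval=timedelta(hours=1),
        activities=[
            Activity("drop_empty", lambda rows: [r for r in rows if r]),
            Activity("upper", lambda rows: [r.upper() for r in rows]),
        ],
    )
    print(pipeline.run_once(["a", "", "b"]))
    print(pipeline.scheduled_runs(3))
```

In the real service the activities run on external compute and the schedule is handled for you. The model only shows that a pipeline is an ordered set of activities with a recurrence attached.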
Within the Azure Preview Portal, you get a visual layout of all of your pipelines and data inputs and outputs. You can see all the relationships and dependencies of your data pipelines across all of your sources so you always know where data is coming from and where it is going. You get a historical accounting of job execution, data production status, and system health in a single monitoring dashboard.
Data Factory gives customers a central place to manage processing for workloads such as weblog analytics, clickstream analysis, social sentiment, sensor data analysis, and geo-location analysis.