Cloud Photo

Azure Data Architect | DBA

Taking Inventory of Azure Data Factory V1 with PowerShell

,

I recently had to take inventory of an Azure Data Factory V1 to help identify pipelines that were no longer required as well as to audit corporate naming conventions. Originally, the plan was to examine each ADF component in the portal. This is cumbersome and tedious. We could also examine the Visual Studio ADF project and review the JSON there. However, to quickly generate a list of Linked Services, Datasets and Pipelines, I used the PowerShell script below.

###Login
Login-AzureRmAccount

### Set Subscription
#Select-AzureRmSubscription -SubscriptionId "SUBSCR1PT10N-G03S-H3R3"

$resourceGroupName = "myDataFactoryResourceGroup"
$dataFactoryName = "myDataFactory"

### Get Linked Services
$linkedServices = Get-AzureRmDataFactoryLinkedService -ResourceGroupName $resourceGroupName -DataFactoryName $dataFactoryName
Write-Host "--------------------------------"
Write-Host "Linked Services"
ForEach($ls in $linkedServices){
    Write-Host $ls.LinkedServiceName: $ls.Properties.Type
}

### Get Datasets
$dataSets = Get-AzureRmDataFactoryDataset -ResourceGroupName $resourceGroupName -DataFactoryName $dataFactoryName
Write-Host "--------------------------------"
Write-Host "Datasets"
ForEach($dataset in $dataSets){
    Write-Host $dataset.DatasetName: $dataset.Properties.LinkedServiceName 
}

### Get Pipelines
$pipelines = Get-AzureRmDataFactoryPipeline -ResourceGroupName $resourceGroupName -DataFactoryName $dataFactoryName
Write-Host "--------------------------------"
Write-Host "Pipelines"
ForEach($pl in $pipelines){
    Write-Host `n $pl.PipelineName
    ForEach($activity in  $pl.Properties.Activities){
        ForEach($inName in $activity.Inputs){
        Write-Host "INPUT: "$inName.Name
        }
        ForEach($outName in $activity.Outputs){
        Write-Host "OUTPUT: "$outName.Name
        }
    }
}

 

Below is sample output from the script.

Azure Data Factory
——————————–
Linked Services
LS_ORA_EBS : OnPremisesOracle
LS_DB2_AS400 : OnPremisesOdbc
LS_ADLA_USQL : AzureDataLakeAnalytics
LS_SQLDW : AzureSqlDW
LS_ADLS : AzureDataLakeStore
LS_SQL_OLTPDB : OnPremisesSqlServer
LS_BLOB_DATAFACTORY : AzureStorage

——————————–
Datasets
DS_IN_OLTPDB_CONFIG : LS_SQL_OLTPDB
DS_IN_OLTPDB_CONFIG_DAILY : LS_SQL_OLTPDB
DS_IN_OLTP_ITEM_DATA : LS_SQL_OLTPDB
DS_IN_SQLDW_SALES : LS_SQL_OLTPDB

——————————–
Pipelines
PL_ADLS_RAW_COST_ADLS_STAGING
INPUT: DS_OUT_ADLS_RAW_COST
OUTPUT: DS_OUT_ADLS_STAGING_COST

PL_ADLS_STAGING_COST_ADLS_CURATED
INPUT: DS_OUT_ADLS_RAW_COST
INPUT: DS_OUT_ADLS_RAW_ITEMS
OUTPUT: DS_OUT_ADLS_CURATED_COST

Leave a Reply