Create a Data Pipeline in Azure

I have a class that performs extract, transform and load (ETL) on a dataset spread across several JSON files.
The process works fine, but I have to run it manually every month: I submit a Spark application from IntelliJ (a Scala singleton object containing the transformation).
I'm trying to automate this process, but I haven't found documentation or a tutorial that explains which service is best suited to it.
The process should:
Create a HDInsight Spark Cluster
Run The process (An Scala Class)
Delete the HDInsight Spark Cluster created before
I've searched, but the only links I found (looking for "Create on demand HD insight spark cluster") are the following:
Access datalake from Azure datafactory V2 using on demand HD Insight cluster
How to create Azure on demand HD insight Spark cluster using Data Factory
Other options I've looked at:
Host and run your PowerShell scripts in Azure
Azure Logic Apps
Azure Automation
Thanks!

Here are the steps for the process you describe.
Create a HDInsight Spark Cluster
With PowerShell it is easy to create an HDInsight cluster. Here is sample code:
### Create a Spark 2.3 cluster in Azure HDInsight
# Default cluster size (# of worker nodes), version, and type
$clusterSizeInNodes = "1"
$clusterVersion = "3.6"
$clusterType = "Spark"
# Create the resource group
$resourceGroupName = Read-Host -Prompt "Enter the resource group name"
$location = Read-Host -Prompt "Enter the Azure region to create resources in, such as 'Central US'"
$defaultStorageAccountName = Read-Host -Prompt "Enter the default storage account name"
New-AzResourceGroup -Name $resourceGroupName -Location $location
# Create an Azure storage account and container
# Note: Storage account kind BlobStorage can only be used as secondary storage for HDInsight clusters.
New-AzStorageAccount `
-ResourceGroupName $resourceGroupName `
-Name $defaultStorageAccountName `
-Location $location `
-SkuName Standard_LRS `
-Kind StorageV2 `
-EnableHttpsTrafficOnly 1
$defaultStorageAccountKey = (Get-AzStorageAccountKey `
-ResourceGroupName $resourceGroupName `
-Name $defaultStorageAccountName)[0].Value
$defaultStorageContext = New-AzStorageContext `
-StorageAccountName $defaultStorageAccountName `
-StorageAccountKey $defaultStorageAccountKey
# Create a Spark 2.3 cluster
$clusterName = Read-Host -Prompt "Enter the name of the HDInsight cluster"
# Cluster login is used to secure HTTPS services hosted on the cluster
$httpCredential = Get-Credential -Message "Enter Cluster login credentials" -UserName "admin"
# SSH user is used to remotely connect to the cluster using SSH clients
$sshCredentials = Get-Credential -Message "Enter SSH user credentials" -UserName "sshuser"
# Set the storage container name to the cluster name
$defaultBlobContainerName = $clusterName
# Create a blob container. This holds the default data store for the cluster.
New-AzStorageContainer `
-Name $clusterName `
-Context $defaultStorageContext
$sparkConfig = New-Object "System.Collections.Generic.Dictionary``2[System.String,System.String]"
$sparkConfig.Add("spark", "2.3")
# Create the HDInsight cluster
New-AzHDInsightCluster `
-ResourceGroupName $resourceGroupName `
-ClusterName $clusterName `
-Location $location `
-ClusterSizeInNodes $clusterSizeInNodes `
-ClusterType $clusterType `
-OSType "Linux" `
-Version $clusterVersion `
-ComponentVersion $sparkConfig `
-HttpCredential $httpCredential `
-DefaultStorageAccountName "$defaultStorageAccountName.blob.core.windows.net" `
-DefaultStorageAccountKey $defaultStorageAccountKey `
-DefaultStorageContainer $clusterName `
-SshCredential $sshCredentials
Get-AzHDInsightCluster `
-ResourceGroupName $resourceGroupName `
-ClusterName $clusterName
Run The process (An Scala Class)
You can refer to this link to submit an application job remotely to the Spark cluster:
https://learn.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-create-standalone-application#run-the-application-on-the-apache-spark-cluster
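For an unattended run, one option is to submit the job through the Livy REST endpoint that HDInsight Spark clusters expose at /livy/batches, instead of submitting it from IntelliJ. The following is a minimal sketch, not a definitive implementation. The function names, the jar path and the class name are hypothetical; it assumes the compiled Scala application has already been uploaded as a jar to the cluster's default storage.

```powershell
# Build the JSON body for a Livy batch request.
# JarPath and ClassName are placeholders for wherever the compiled Scala job lives.
function New-LivyBatchBody {
    param(
        [Parameter(Mandatory)] [string] $JarPath,
        [Parameter(Mandatory)] [string] $ClassName,
        [string[]] $Arguments = @()
    )
    $body = [ordered]@{ file = $JarPath; className = $ClassName }
    if ($Arguments.Count -gt 0) { $body["args"] = $Arguments }
    return ($body | ConvertTo-Json -Compress)
}

# POST the batch to the cluster. HttpCredential is the cluster login
# (the same credential passed as -HttpCredential when the cluster was created).
function Submit-LivyBatch {
    param(
        [Parameter(Mandatory)] [string] $ClusterName,
        [Parameter(Mandatory)] [pscredential] $HttpCredential,
        [Parameter(Mandatory)] [string] $Body
    )
    Invoke-RestMethod `
        -Method Post `
        -Uri "https://$ClusterName.azurehdinsight.net/livy/batches" `
        -Credential $HttpCredential `
        -ContentType "application/json" `
        -Headers @{ "X-Requested-By" = $HttpCredential.UserName } `
        -Body $Body
}

# Example (hypothetical jar path and class name):
# $body = New-LivyBatchBody -JarPath "wasbs:///example/jars/etl.jar" -ClassName "com.example.MonthlyEtl"
# Submit-LivyBatch -ClusterName $clusterName -HttpCredential $httpCredential -Body $body
```

The response contains a batch id; the batch state can then be polled with a GET on /livy/batches/{id} before the cluster is deleted.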
Delete the HDInsight Spark Cluster created before
You can also clean up the cluster with PowerShell. Here is sample code:
# Removes the specified HDInsight cluster from the current subscription.
Remove-AzHDInsightCluster `
-ResourceGroupName $resourceGroupName `
-ClusterName $clusterName
# Removes the specified storage container.
Remove-AzStorageContainer `
-Name $clusterName `
-Context $defaultStorageContext
# Removes a Storage account from Azure.
Remove-AzStorageAccount `
-ResourceGroupName $resourceGroupName `
-Name $defaultStorageAccountName
# Removes a resource group.
Remove-AzResourceGroup `
-Name $resourceGroupName
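Because an HDInsight cluster is billed for as long as it exists, the three steps are safer when the delete step runs even if the job fails. A minimal sketch of that pattern with try/finally follows. The wrapper function name is hypothetical, and the script blocks stand in for the create, run and delete code above.

```powershell
# Run a job on a short-lived cluster and always attempt cleanup afterwards.
function Invoke-WithEphemeralCluster {
    param(
        [Parameter(Mandatory)] [scriptblock] $Create,
        [Parameter(Mandatory)] [scriptblock] $Run,
        [Parameter(Mandatory)] [scriptblock] $Remove
    )
    # If creation itself throws, nothing is removed here; handle that case separately.
    & $Create
    try {
        & $Run
    }
    finally {
        # Executes whether the job succeeded or threw.
        & $Remove
    }
}

# Example wiring (the bodies are placeholders for the commands shown above):
# Invoke-WithEphemeralCluster `
#     -Create { New-AzHDInsightCluster ... } `
#     -Run    { <submit the Spark job and wait for it> } `
#     -Remove { Remove-AzHDInsightCluster -ResourceGroupName $resourceGroupName -ClusterName $clusterName }
```

To run this monthly without a person at the keyboard, the Read-Host and Get-Credential prompts in the sample would have to be replaced with parameters or stored secrets. The script could then be scheduled, for example as an Azure Automation runbook, which is one of the options listed in the question.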
Additional reference:
https://learn.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-jupyter-spark-sql-use-powershell
https://github.com/MicrosoftDocs/azure-docs/blob/master/articles/data-factory/v1/data-factory-build-your-first-pipeline-using-powershell.md
Hope it helps.

Related

Azure VM templates

So, I exported a VM template and I'm trying to build more VMs based on that template.
How can I define the subscription, resource group and region in the template or in the parameters file?
You can use Azure PowerShell to deploy a virtual machine. I reproduced this in my environment, following the Microsoft documentation.
First, if you do not have a resource group, create one with the cmdlet below:
New-AzResourceGroup -Name 'rithwik' -Location 'EastUS'
(rithwik is the resource group name)
Then use the cmdlet below to deploy the VM:
New-AzVm `
-ResourceGroupName 'myResourceGroup' `
-Name 'myVM' `
-Location 'East US' `
-VirtualNetworkName 'myVnet' `
-SubnetName 'mySubnet' `
-SecurityGroupName 'myNetworkSecurityGroup' `
-PublicIpAddressName 'myPublicIpAddress' `
-OpenPorts 80,3389
After running the command, enter a username and password for the VM when prompted.
The code is taken from:
https://learn.microsoft.com/en-us/azure/virtual-machines/windows/quick-create-powershell#code-try-1
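To deploy the exported template itself rather than calling New-AzVm, note that for a resource group deployment the subscription and resource group are not set inside the template or the parameters file. They are chosen when the deployment is started, and the region is normally passed as a template parameter or inherited from the resource group. A minimal sketch follows; the function name and all argument values are hypothetical.

```powershell
# Deploy an exported ARM template to a chosen subscription, resource group and region.
function Deploy-VmTemplate {
    param(
        [Parameter(Mandatory)] [string] $SubscriptionId,
        [Parameter(Mandatory)] [string] $ResourceGroupName,
        [Parameter(Mandatory)] [string] $Location,
        [Parameter(Mandatory)] [string] $TemplateFile,
        [Parameter(Mandatory)] [string] $TemplateParameterFile
    )
    # The subscription comes from the active Az context, not from the template.
    Set-AzContext -Subscription $SubscriptionId | Out-Null

    # Create the target resource group in the chosen region (or update it if it exists).
    New-AzResourceGroup -Name $ResourceGroupName -Location $Location -Force | Out-Null

    # Deploy the template into that resource group.
    New-AzResourceGroupDeployment `
        -ResourceGroupName $ResourceGroupName `
        -TemplateFile $TemplateFile `
        -TemplateParameterFile $TemplateParameterFile
}

# Example (placeholder values):
# Deploy-VmTemplate -SubscriptionId "<subscription id>" -ResourceGroupName "myResourceGroup" `
#     -Location "EastUS" -TemplateFile ".\template.json" -TemplateParameterFile ".\parameters.json"
```

An exported template often has the VM name and region hard-coded, so these may need to be turned into parameters before the template can be reused for additional VMs.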

Is it possible to install application while creating Azure VM from PowerShell?

I am trying to create a new Azure VM from PowerShell.
I am currently using the below script to create VM:
$location = "EastUS"
$rgName = "TestRG"
$credential = Get-Credential
New-AzResourceGroup -Name $rgName -Location $location
New-AzVm `
-ResourceGroupName $rgName `
-Name "TestVM" `
-Location $location `
-VirtualNetworkName "TestVnet" `
-SubnetName "TestSubnet" `
-SecurityGroupName "TestNsg" `
-PublicIpAddressName "TestPip" `
-OpenPorts 80,3389 `
-Credential $credential
Is it possible to install applications on an Azure VM from PowerShell while creating it? If so, how?
Thanks in Advance.
You cannot install an application or add an extension while the VM is being created. You can do so only once the VM has been provisioned.
You can refer to this Microsoft document to install the Custom Script Extension using Set-AzVMExtension.
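As a rough illustration of that second step, the sketch below adds the Custom Script Extension to a Windows VM that already exists. The function names and the extension name "InstallApp" are hypothetical, and the install command is only an example; replace it with whatever installs your application.

```powershell
# Build the settings JSON expected by the Custom Script Extension.
function New-CustomScriptSettings {
    param([Parameter(Mandatory)] [string] $CommandToExecute)
    return (@{ commandToExecute = $CommandToExecute } | ConvertTo-Json -Compress)
}

# Run a command on an existing Windows VM through the Custom Script Extension.
function Install-AppOnVm {
    param(
        [Parameter(Mandatory)] [string] $ResourceGroupName,
        [Parameter(Mandatory)] [string] $VMName,
        [Parameter(Mandatory)] [string] $Location,
        [Parameter(Mandatory)] [string] $CommandToExecute
    )
    Set-AzVMExtension `
        -ResourceGroupName $ResourceGroupName `
        -VMName $VMName `
        -Location $Location `
        -Name "InstallApp" `
        -Publisher "Microsoft.Compute" `
        -ExtensionType "CustomScriptExtension" `
        -TypeHandlerVersion "1.10" `
        -SettingString (New-CustomScriptSettings -CommandToExecute $CommandToExecute)
}

# Example, run after New-AzVm has finished (installs IIS):
# Install-AppOnVm -ResourceGroupName "TestRG" -VMName "TestVM" -Location "EastUS" `
#     -CommandToExecute "powershell Add-WindowsFeature Web-Server"
```

Placing this call directly after New-AzVm in the same script gives the effect of installing the application as part of VM creation, even though it is technically a separate step.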

How do you associate an Azure web app with a vnet using PowerShell Az?

I know it can be done using Azure CLI like this:
az webapp vnet-integration add -g $resourceGroupName -n $applicationName --vnet $vnetName --subnet $subnetName
Is there an equivalent command using PowerShell Az?
If you look at the docs at https://learn.microsoft.com/en-us/azure/app-service/web-sites-integrate-with-vnet, at the bottom there is a link to the Script Center gallery, where there is a full PS1 script at https://gallery.technet.microsoft.com/scriptcenter/Connect-an-app-in-Azure-ab7527e3 that shows how to integrate a web app with a vnet.
The final lines of interest (it uses AzureRM, but should be easy to convert to Az):
$PropertiesObject = @{
    "vnetName" = $VirtualNetworkName; "vpnPackageUri" = $packageUri
}
New-AzureRmResource -Location $location -Properties $PropertiesObject -ResourceName "$($webAppName)/$($vnetName)/primary" -ResourceType "Microsoft.Web/sites/virtualNetworkConnections/gateways" -ApiVersion 2015-08-01 -ResourceGroupName $webAppResourceGroup -Force
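With the Az module, a different route that is sometimes used for regional VNet integration (the kind the az webapp vnet-integration add command sets up) is to set the virtualNetworkSubnetId property on the site resource. The sketch below is an assumption-laden illustration, not a confirmed equivalent of the script above: the function name is hypothetical, and it assumes the subnet already exists and is delegated to Microsoft.Web/serverFarms.

```powershell
# Point a web app at a subnet by updating the site's generic resource properties.
function Add-WebAppVnetIntegration {
    param(
        [Parameter(Mandatory)] [string] $ResourceGroupName,
        [Parameter(Mandatory)] [string] $WebAppName,
        [Parameter(Mandatory)] [string] $SubnetResourceId
    )
    $webApp = Get-AzResource `
        -ResourceType "Microsoft.Web/sites" `
        -ResourceGroupName $ResourceGroupName `
        -ResourceName $WebAppName

    # Add or overwrite the property, whether or not it is already present on the object.
    $webApp.Properties | Add-Member `
        -NotePropertyName "virtualNetworkSubnetId" `
        -NotePropertyValue $SubnetResourceId `
        -Force

    $webApp | Set-AzResource -Force
}

# Example (placeholder subnet resource id):
# Add-WebAppVnetIntegration -ResourceGroupName $resourceGroupName -WebAppName $applicationName `
#     -SubnetResourceId "/subscriptions/<id>/resourceGroups/<rg>/providers/Microsoft.Network/virtualNetworks/<vnet>/subnets/<subnet>"
```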

How to create a Linux AppService Plan with New-AzAppServicePlan?

What is the equivalent of this code using New-AzAppServicePlan?
az appservice plan create --resource-group $ServerFarmResourceGroupName `
--name $AppServicePlanName `
--is-linux `
--location $ResourceGroupLocation `
--sku $AppServicePlanTier `
--number-of-workers $NumberOfWorkers
Is there really no way to create an App Service Plan using Az Powershell? Why can it only be done via Azure CLI or ARM?
I only found this answer, which basically uses ARM directly: How do I use Powershell to create an Azure Web App that runs on Linux?
There are some open issues about this, so for now I assume it is not supported by New-AzureRmAppServicePlan. However, you can use New-AzureRmResource to create a Linux plan. Try the command below.
New-AzureRmResource -ResourceGroupName <group name> -Location "Central US" -ResourceType microsoft.web/serverfarms -ResourceName <plan name> -kind linux -Properties @{reserved="true"} -Sku @{name="S1";tier="Standard"; size="S1"; family="S"; capacity="1"} -Force
I originally used my script to create a Consumption plan (Y1) through PowerShell and Azure CLI, because I don't like that Azure assigns a generated name when it creates a Consumption plan for you.
Please find my solution to create a Linux App Service Plan (B1) using New-AzResource:
$fullObject = @{
    location = "West Europe"
    sku = @{
        name = "B1"
        tier = "Basic"
    }
    kind = "linux"
    properties = @{
        reserved = $true
    }
}
$resourceGroupName = "rg-AppServicePlanLinux"
$serverFarmName = "aspl-test"
Write-Host "Step 1: CREATING APP SERVICE PLAN B1:Basic named [$serverFarmName]"
# Create a server farm which will host the function app in the resource group specified
New-AzResource -ResourceGroupName $resourceGroupName -ResourceType "Microsoft.Web/serverfarms" -Name $serverFarmName -IsFullObject -PropertyObject $fullObject -Force
I used the ARM template to work out which information needs to be provided in the -PropertyObject parameter.
It now also seems possible to create a Linux App Service plan with the New-AzAppServicePlan command, using the -Linux parameter, since Az PowerShell 4.3.0 (June 2020). From the release notes:
Az.Websites
Added safeguard to delete created webapp if restore failed in 'Restore-AzDeletedWebApp'
Added 'SourceWebApp.Location' for 'New-AzWebApp' and 'New-AzWebAppSlot'
Fixed bug that prevented changing Container settings in 'Set-AzWebApp' and 'Set-AzWebAppSlot'
Fixed bug to get SiteConfig when -Name is not given for Get-AzWebApp
Added a support to create ASP for Linux Apps
Added exceptions for clone across resource groups
Release Note: https://learn.microsoft.com/en-us/powershell/azure/release-notes-azureps?view=azps-5.6.0&viewFallbackFrom=azps-4.3.0#azwebsites-7
New-AzAppServicePlan: https://learn.microsoft.com/en-us/powershell/module/az.websites/new-azappserviceplan?view=azps-5.6.0
If you get "The Service is unavailable" after deploying your new Function App (Consumption plan) with Azure CLI, please check the following statement from Microsoft:
https://github.com/Azure/Azure-Functions/wiki/Creating-Function-Apps-in-an-existing-Resource-Group
I wasted a whole day because I had another Function App (Premium plan) in the same resource group I used to deploy the Consumption one.
This worked for me:
Adding -Linux as a parameter to my command
New-AzAppServicePlan -ResourceGroupName $RESOURCE_GROUP_NAME -Name $APP_SERVICE_PLAN_NAME -Location $RESOURCE_LOCATION -Linux -Tier $APP_SERVICE_PLAN_TIER -NumberofWorkers $APP_SERVICE_PLAN_WORKERS -WorkerSize $APP_SERVICE_PLAN_WORKER_SIZE
Example:
New-AzAppServicePlan -ResourceGroupName 'MyResourceGroup' -Name 'MyServicePlan' -Location 'northeurope' -Linux -Tier 'PremiumV2' -NumberofWorkers 2 -WorkerSize Medium
That's all.
I hope this helps

Azure - VMSS - Data disk

I have created a VMSS and added an additional disk from disk storage, but it is not available in the VM's Disk Management, as it usually is for a normal VM (one that is not part of a VMSS).
So how can I attach the new data disk to my VM in the VMSS?
There are a bunch of tutorials (in the official docs) available for this:
https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/tutorial-use-disks-cli
https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/tutorial-use-disks-powershell
https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-attached-disks
PowerShell:
# Get scale set object
$vmss = Get-AzVmss `
-ResourceGroupName "myResourceGroup" `
-VMScaleSetName "myScaleSet"
# Attach a 128 GB data disk to LUN 2
Add-AzVmssDataDisk `
-VirtualMachineScaleSet $vmss `
-CreateOption Empty `
-Lun 2 `
-DiskSizeGB 128
# Update the scale set to apply the change
Update-AzVmss `
-ResourceGroupName "myResourceGroup" `
-Name "myScaleSet" `
-VirtualMachineScaleSet $vmss
Azure CLI:
az vmss disk attach \
--resource-group myResourceGroup \
--name myScaleSet \
--size-gb 128
For an ARM template example, check this link: Data Disk Link
You can find detailed options for using data disks, including prepopulated data disks, here:
Azure virtual machine scale sets and attached data disks
Azure VM Scale Sets attach-detach disk preview
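Two things commonly explain a disk that was added to the scale set but is not usable inside an instance. If the scale set uses a manual upgrade policy, existing instances do not pick up the new model until they are upgraded. And a disk attached with -CreateOption Empty is raw until it is initialized and formatted inside the guest. The following sketch covers both; the function names are hypothetical, and the second function is meant to run inside a Windows instance, not from your workstation.

```powershell
# Host side: bring every existing instance up to the latest scale set model.
function Update-VmssInstancesToLatestModel {
    param(
        [Parameter(Mandatory)] [string] $ResourceGroupName,
        [Parameter(Mandatory)] [string] $VMScaleSetName
    )
    $instances = Get-AzVmssVM -ResourceGroupName $ResourceGroupName -VMScaleSetName $VMScaleSetName
    foreach ($instance in $instances) {
        Update-AzVmssInstance `
            -ResourceGroupName $ResourceGroupName `
            -VMScaleSetName $VMScaleSetName `
            -InstanceId $instance.InstanceId
    }
}

# Guest side (run inside the Windows instance): initialize and format any raw disks.
function Initialize-RawDataDisk {
    Get-Disk |
        Where-Object PartitionStyle -eq 'RAW' |
        Initialize-Disk -PartitionStyle GPT -PassThru |
        New-Partition -AssignDriveLetter -UseMaximumSize |
        Format-Volume -FileSystem NTFS -Confirm:$false
}

# Example:
# Update-VmssInstancesToLatestModel -ResourceGroupName "myResourceGroup" -VMScaleSetName "myScaleSet"
```

The guest-side step could also be pushed to all instances with the Custom Script Extension instead of being run by hand on each one.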

Resources