Access Azure Function App system keys in Terraform - azure

I want to create an Azure EventGrid subscription using Terraform.
resource "azurerm_eventgrid_system_topic_event_subscription" "function_app" {
name = "RunOnBlobUploaded"
system_topic = azurerm_eventgrid_system_topic.function_app.name
resource_group_name = azurerm_resource_group.rg.name
included_event_types = [
"Microsoft.Storage.BlobCreated"
]
subject_filter {
subject_begins_with = "/blobServices/default/containers/input"
}
webhook_endpoint {
url = "https://thumbnail-generator-function-app.azurewebsites.net/runtime/webhooks/blobs?functionName=Create-Thumbnail&code=<BLOB-EXTENSION-KEY>"
}
}
By following this doc, I successfully deployed it and it works. However, the webhook_endpoint URL needs the <BLOB-EXTENSION-KEY>, which is hardcoded right now; I copied it from the Function App's system keys (App Keys blade) in the portal.
To avoid committing a secret to GitHub, I want to get this value by reference, ideally using Terraform.
According to my research, it seems there is no way in Terraform to reference that value.
The closest thing is the azurerm_function_app_host_keys data source in Terraform. However, it doesn't cover the blobs_extension key!
Is there any good way to reference blobs_extension in Terraform without a hardcoded value?
Thanks in advance!

If Terraform does not support it yet, you can create your own external data source that uses the Azure CLI or SDK to fetch the value you want and return it to Terraform for further use.
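For example, here is a minimal sketch of that approach using the hashicorp/external provider and the az CLI. The function app name is taken from the webhook URL above, the resource group reference reuses azurerm_resource_group.rg from the question, and az must already be authenticated with access to the app:

data "external" "blobs_extension_key" {
  program = [
    "bash", "-c",
    # `az functionapp keys list` returns functionKeys, masterKey and systemKeys;
    # the JMESPath --query reshapes the output into the flat JSON map the
    # external data source expects, i.e. {"key": "<systemKeys.blobs_extension>"}
    "az functionapp keys list --name thumbnail-generator-function-app --resource-group ${azurerm_resource_group.rg.name} --query '{key: systemKeys.blobs_extension}' -o json"
  ]
}

The subscription's webhook_endpoint can then reference it instead of the hardcoded key:

webhook_endpoint {
  url = "https://thumbnail-generator-function-app.azurewebsites.net/runtime/webhooks/blobs?functionName=Create-Thumbnail&code=${data.external.blobs_extension_key.result.key}"
}

Keep in mind that the key still ends up in the Terraform state, so this keeps it out of Git but the state file itself must be treated as sensitive.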

Related

How to parameterise cluster type (basic/standard/dedicated) on confluent cloud

I am trying to automate the cluster creation process of Confluent Cloud using Terraform. I would like to spawn different clusters for each environment using the same code, driven by parameters.
resource "confluent_kafka_cluster" "standard" {
display_name = "standard_kafka_cluster"
availability = "SINGLE_ZONE"
cloud = "AZURE"
region = "centralus"
standard {}
environment {
id = confluent_environment.development.id
}
lifecycle {
prevent_destroy = true
}
}
I would like to parameterize standard/basic/dedicated so that I can have basic in dev/staging and standard/dedicated in uat/prod.
I have tried to do it using a dynamic block, but haven't had any success yet. Any help would be really appreciated.
The resource name cannot be dynamic; that needs to be saved in the state file as a static resource ID.
You could create a Terraform module that defines a generic "azure-centralus-confluentcloud" cluster and parameterizes the rest, or you can use for_each to loop over each environment and then use accessors like confluent_kafka_cluster.clusters["dev"] when you need a specific one.
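A hedged sketch of the for_each variant, assuming a map variable that records the desired cluster type per environment (the variable and resource names are illustrative) and using dynamic blocks to emit exactly one type block per cluster:

variable "clusters" {
  # e.g. { dev = "basic", staging = "basic", uat = "standard", prod = "standard" }
  type = map(string)
}

resource "confluent_kafka_cluster" "clusters" {
  for_each     = var.clusters
  display_name = "${each.key}_kafka_cluster"
  availability = "SINGLE_ZONE"
  cloud        = "AZURE"
  region       = "centralus"

  # exactly one of the cluster-type blocks is rendered for each environment
  dynamic "basic" {
    for_each = each.value == "basic" ? [1] : []
    content {}
  }

  dynamic "standard" {
    for_each = each.value == "standard" ? [1] : []
    content {}
  }

  # a dedicated cluster would additionally need a cku argument inside its block

  environment {
    # kept from the question; in practice this would likely vary per environment too
    id = confluent_environment.development.id
  }

  lifecycle {
    prevent_destroy = true
  }
}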

Terraform: How to obtain VPCE service name when it was dynamically created

I am trying to obtain (via Terraform) the DNS name of a dynamically created VPC endpoint using a data resource, but the problem I am facing is that the service name is not known until the resources have been created. See notes below.
Is there any way of retrieving this information? A hard-coded service name just doesn't work for automation.
For example, this will not work, as the service_name is dynamic:
resource "aws_transfer_server" "sftp_lambda" {
count = local.vpc_lambda_enabled
domain = "S3"
identity_provider_type = "AWS_LAMBDA"
endpoint_type = "VPC"
protocols = ["SFTP"]
logging_role = var.loggingrole
function = var.lambda_idp_arn[count.index]
endpoint_details = {
security_group_ids = var.securitygroupids
subnet_ids = var.subnet_ids
vpc_id = var.vpc_id
}
tags = {
NAME = "tf-test-transfer-server"
ENV = "test"
}
}
data "aws_vpc_endpoint" "vpce" {
count = local.vpc_lambda_enabled
vpc_id = var.vpc_id
service_name = "com.amazonaws.transfer.server.c-001"
depends_on = [aws_transfer_server.sftp_lambda]
}
output "transfer_server_dnsentry" {
value = data.aws_vpc_endpoint.vpce.0.dns_entry[0].dns_name
}
Note: the VPCE was created automatically by an AWS SFTP transfer server resource configured with an endpoint type of VPC (not VPC_ENDPOINT, which is now deprecated). I had no control over the naming of the endpoint service; it was all created in the background.
A minimum AWS provider version of 3.69.0 is required.
Here is an example CloudFormation script to set up an SFTP transfer server using Lambda as the IdP.
This will create the VPCE automatically.
So my aim here is to output the DNS name from the auto-created VPC endpoint using Terraform, if at all possible.
Example setup in CloudFormation
Data source: aws_vpc_endpoint
Resource: aws_transfer_server
I had a response from HashiCorp Terraform Support on this, and this is what they suggested:
You can get the service name of the SFTP-server-created VPC endpoint from the following exported attribute of the vpc_endpoint_service resource [a].
NOTE: There are certain setups that cause AWS to create additional resources outside of what you configured. The AWS SFTP transfer service is one of them. This behavior is outside Terraform's control and is due to how AWS designed the service.
You can bring that VPC endpoint back under Terraform's control, however, by importing the VPC endpoint it creates on your behalf AFTER the transfer service has been created - via the VPCe ID [b].
If you want more ideas for pulling the service name from your current AWS setup, feel free to check out this example [c].
Hope that helps! Thank you.
[a] https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/vpc_endpoint_service#service_name
[b] https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/vpc_endpoint#import
[c] https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/vpc_endpoint#gateway-load-balancer-endpoint-type
There is a way forward, as I shared earlier with the imports, but it is not going to be fully automated, unfortunately.
Optionally, you can use a provisioner [1] and the aws ec2 describe-vpc-endpoint-services --service-names command [2] to get the service names you need.
I'm afraid that's the last workaround I can provide; as explained in our doc here [3], as much as we'd like it to, Terraform isn't able to solve all use cases.
[1] https://www.terraform.io/language/resources/provisioners/remote-exec
[2] https://awscli.amazonaws.com/v2/documentation/api/latest/reference/ec2/describe-vpc-endpoint-services.html
[3] https://www.terraform.io/language/resources/provisioners/syntax
I've finally found the solution:
data "aws_vpc_endpoint" "transfer_server_vpce" {
count = local.is_enabled
vpc_id = var.vpc_id
filter {
name = "vpc-endpoint-id"
values = ["${aws_transfer_server.transfer_server[0].endpoint_details[0].vpc_endpoint_id}"]
}
}
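The DNS name can then be exported the same way as in the earlier output attempt, just pointing at the filtered data source (the index assumes the same count pattern):

output "transfer_server_dnsentry" {
  value = data.aws_vpc_endpoint.transfer_server_vpce[0].dns_entry[0].dns_name
}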

How to create Azure Databricks Notebook via Terraform?

So I am completely new to Terraform, and I found that by using this in my main.tf I can create the Azure Databricks infrastructure:
resource "azurerm_databricks_workspace" "bdcc" {
depends_on = [
azurerm_resource_group.bdcc
]
name = "dbw-${var.ENV}-${var.LOCATION}"
resource_group_name = azurerm_resource_group.bdcc.name
location = azurerm_resource_group.bdcc.location
sku = "standard"
tags = {
region = var.BDCC_REGION
env = var.ENV
}
}
And I also found here that by using this I can even create a particular notebook in this Azure Databricks workspace:
resource "databricks_notebook" "notebook" {
content_base64 = base64encode(<<-EOT
# created from ${abspath(path.module)}
display(spark.range(10))
EOT
)
path = "/Shared/Demo"
language = "PYTHON"
}
But since I am new to this, I am not sure in what order I should put these pieces of code together.
It would be nice if someone could point me to a full example of how to create a notebook on Azure Databricks via Terraform.
Thank you beforehand!
In general you can put these objects in any order - it's Terraform's job to detect dependencies between objects and create/update them in the correct order. For example, you don't need depends_on in the azurerm_databricks_workspace resource, because Terraform will see that the resource group must exist before the workspace can be created, so workspace creation will follow creation of the resource group. Terraform also tries to make changes in parallel where possible.
But because of this, things become slightly more complex when you have the workspace resource together with workspace objects such as notebooks, clusters, etc. As there is no explicit dependency, Terraform will try to create the notebook in parallel with the workspace, and this will fail because the workspace doesn't exist yet - usually you will get a message about an authentication error.
The solution is to have an explicit dependency between the notebook and the workspace, plus you need to configure the Databricks provider to authenticate against the newly created workspace (there are differences between user and service principal authentication - you can find more information in the docs). In the end your code would look like this:
resource "azurerm_databricks_workspace" "bdcc" {
name = "dbw-${var.ENV}-${var.LOCATION}"
resource_group_name = azurerm_resource_group.bdcc.name
location = azurerm_resource_group.bdcc.location
sku = "standard"
tags = {
region = var.BDCC_REGION
env = var.ENV
}
}
provider "databricks" {
host = azurerm_databricks_workspace.bdcc.workspace_url
}
resource "databricks_notebook" "notebook" {
depends_on = [azurerm_databricks_workspace.bdcc]
...
}
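For the notebook from the question, combining the two snippets above gives roughly the following (same content as before, just with the explicit dependency added):

resource "databricks_notebook" "notebook" {
  # make sure the workspace exists before the notebook is created
  depends_on = [azurerm_databricks_workspace.bdcc]

  content_base64 = base64encode(<<-EOT
    # created from ${abspath(path.module)}
    display(spark.range(10))
    EOT
  )
  path     = "/Shared/Demo"
  language = "PYTHON"
}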
Unfortunately, there is no way to put depends_on at the provider level, so you will need to put it into every Databricks resource that is created together with the workspace. The usual best practice is to have a separate module for workspace creation and a separate module for the objects inside the Databricks workspace.
P.S. I would recommend reading a book or the documentation on Terraform. For example, Terraform: Up & Running is a very good intro.

Using databricks workspace in the same configuration as the databricks provider

I'm having some trouble getting the azurerm and databricks providers to work together.
With the azurerm provider, I set up my workspace:
resource "azurerm_databricks_workspace" "ws" {
name = var.workspace_name
resource_group_name = azurerm_resource_group.rg.name
location = azurerm_resource_group.rg.location
sku = "premium"
managed_resource_group_name = "${azurerm_resource_group.rg.name}-mng-rg"
custom_parameters {
virtual_network_id = data.azurerm_virtual_network.vnet.id
public_subnet_name = var.public_subnet
private_subnet_name = var.private_subnet
}
}
No matter how I structure this, I can't seem to get azurerm_databricks_workspace.ws.id to work in the provider statement for databricks in the same configuration. If it did work, the above workspace would be defined in the same configuration and I'd have a provider statement that looks like this:
provider "databricks" {
azure_workspace_resource_id = azurerm_databricks_workspace.ws.id
}
Error:
I have my ARM_* environment variables set to identify as a Service Principal with Contributor on the subscription.
I've tried it in the same configuration and in a module, consuming outputs. The only way I can get it to work is by running one configuration for the workspace and a second configuration to consume the workspace.
This is super suboptimal in that I have a fair amount of repeated values across those configurations, and it would be ideal to have just one.
Has anyone been able to do this?
Thank you :)
I've had the exact same issue with a non-working databricks provider because I was working with modules. I separated the Databricks infra (azurerm) from the Databricks application (databricks provider).
In my databricks module I added the following code at the top; otherwise it would use my Azure setup:
terraform {
  required_providers {
    databricks = {
      source  = "databrickslabs/databricks"
      version = "0.3.1"
    }
  }
}
In my normal provider setup I have the following settings for databricks:
provider "databricks" {
azure_workspace_resource_id = module.databricks_infra.databricks_workspace_id
azure_client_id = var.ARM_CLIENT_ID
azure_client_secret = var.ARM_CLIENT_SECRET
azure_tenant_id = var.ARM_TENANT_ID
}
And of course I have the azure one. Let me know if it worked :)
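If the Databricks objects live in a child module as described, the configured provider can also be passed into that module explicitly to make the mapping obvious; a minimal sketch (the module name and source path are hypothetical):

module "databricks_app" {
  # hypothetical path to the module containing the databricks_* resources
  source = "./modules/databricks-app"

  # hand the already-configured databricks provider to the child module
  providers = {
    databricks = databricks
  }
}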
If you experience technical difficulties with rolling out resources in this example, please make sure that environment variables don't conflict with other provider block attributes. When in doubt, run TF_LOG=DEBUG terraform apply to enable debug mode through the TF_LOG environment variable. Look specifically for "Explicit and implicit attributes" lines, which should indicate the authentication attributes used. The other common cause of technical difficulties is a missing alias attribute in provider "databricks" {} blocks or a missing provider attribute in resource "databricks_..." {} blocks. Please make sure to read the "alias: Multiple Provider Configurations" documentation article.
From the error message, it looks like authentication is not configured for the provider; could you please configure it through one of the options mentioned above.
For more details, refer to Databricks provider - Authentication.
For passing the custom_parameters, you may check out the SO thread addressing a similar issue.
In case you need more help on this issue, I would suggest opening an issue here: https://github.com/terraform-providers/terraform-provider-azurerm/issues

How to create an Azure Data factory Azure SQL Database dataset using terraform

I am trying to create an Azure SQL Database dataset using Terraform for my Azure Data Factory. The code below works fine to define a linked service:
resource "azurerm_data_factory_linked_service_azure_sql_database" "example" {
name = "example"
resource_group_name = azurerm_resource_group.example.name
data_factory_name = azurerm_data_factory.example.name
connection_string = "data source=serverhostname;initial catalog=master;user id=testUser;Password=test;integrated security=False;encrypt=True;connection timeout=30"
}
But I can't find a way to create the dataset, since the only resource for SQL datasets is azurerm_data_factory_dataset_sql_server, which does not work with azurerm_data_factory_linked_service_azure_sql_database because it is meant to be used with azurerm_data_factory_linked_service_sql_server.
It's been a while, but maybe someone can still use this solution.
You could use azurerm_data_factory_custom_dataset:
resource "azurerm_data_factory_custom_dataset" "DatasetSource" {
name = "Your ADF Name"
data_factory_id = "Your ADF id"
type = "AzureSqlTable"
linked_service {
name = azurerm_data_factory_linked_service_azure_sql_database.LinkedServicesDBSource.name
}
type_properties_json = <<JSON
{
}
JSON
}
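If the dataset should point at a specific table, the empty type_properties_json above is presumably where the table reference goes; a hedged sketch with placeholder schema and table names:

  # a sketch: for an AzureSqlTable dataset the table reference is expected to
  # live in typeProperties; the schema/table values here are placeholders
  type_properties_json = <<JSON
{
  "schema": "dbo",
  "table": "YourTable"
}
JSON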
It looks like this isn't supported in tf yet (as I'm discovering is the case for a lot of things).
You could raise it at https://github.com/terraform-providers/terraform-provider-azurerm as an enhancement - there are a couple of kind folk actively adding tf resources in this area.
We made a decision to implement our linked services in tf (because they contain/use secrets we can inject in the pipeline), but we are deploying datasets as JSON from the repo. Any reason why you want to deploy them with tf specifically? Did we miss something?
