Securely Configure Azure Storage Credentials for Flink - azure

The Apache Flink docs on the Azure filesystem say that it's discouraged to put storage account keys in flink-conf.yaml. Putting them in plain text wouldn't be secure, but it's not clear to me how to store them securely. The link to the Azure documentation doesn't help. Assuming I wanted to run the Flink word-count example (local:///opt/flink/examples/batch/WordCount.jar in the Flink container) on Azure, how would I set up secure storage access? (There is the option to specify a single key as an environment variable, which could be a Kubernetes secret, but if I have more than one storage account that wouldn't work.)
A good starting point might be this Flink on Azure quickstart or the Flink operator.
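One possible approach, sketched under the assumption that you deploy with the official Flink Docker image (whose entrypoint appends the contents of the FLINK_PROPERTIES environment variable to flink-conf.yaml at startup): keep all account keys in a single Kubernetes Secret and inject that variable from it, so the keys never appear in the image or in a checked-in manifest. Account names, key values, and the secret name below are placeholders.

    # Kubernetes Secret holding the Flink configuration lines for several storage accounts
    apiVersion: v1
    kind: Secret
    metadata:
      name: flink-azure-keys
    stringData:
      FLINK_PROPERTIES: |
        fs.azure.account.key.accountone.blob.core.windows.net: <key-for-account-one>
        fs.azure.account.key.accounttwo.blob.core.windows.net: <key-for-account-two>
    ---
    # In the JobManager/TaskManager container spec (or the operator's podTemplate),
    # reference the secret so FLINK_PROPERTIES is set at startup:
    #   envFrom:
    #     - secretRef:
    #         name: flink-azure-keys

With this setup the keys only ever materialize in flink-conf.yaml inside the running pods, and rotating them means updating the Secret and restarting the deployment.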

Related

WriteStream a dataFrame to Elasticsearch behind an Azure API Management that requires a client certificate?

We have an environment where we have Elasticsearch that is protected behind Azure API Management. We have this locked down with client certificate requirements (as well as other security measures). Calls that come into APIM w/o the client cert are rejected.
I have a new system I am bringing online where data is stored in Delta Lake tables and processed with PySpark (using Azure Synapse). At the end of the processing, I want to push the final product to Elasticsearch. I know that I can write to ES using org.elasticsearch.spark, but I don't see any way to include a client certificate so that I can get past APIM.
Are any of these possible?
Include a certificate when making the connection to Elasticsearch for the writeStream.
Use .NET to do the streaming reads and writes. I am not yet sure what capabilities Microsoft.Spark has and whether it can read from Delta tables with structured streaming. If it does work, I can use my existing libraries for calling ES.
Find a way to peer the VNets so that I can call ES via a local IP address. I am doing this in another system, but in that case I have access to both VNets. With Synapse, the Spark pool is managed, and I don't think it supports Azure VNet peering.
Something else?
Thanks!
The only solution I have found was to enable a Private Link Service on the Load Balancer in the cluster where I am running Elasticsearch, then create a Managed Private Endpoint connected to the PLS in Synapse.
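For completeness, a rough sketch of what the write itself can look like once the Managed Private Endpoint resolves to the Elasticsearch load balancer; the endpoint address, paths, and index name are placeholders, and authentication/SSL options will depend on your cluster:

    // Assumes the elasticsearch-spark (es-hadoop) connector is on the classpath.
    val stream = spark.readStream
      .format("delta")
      .load("/delta/final_product")            // placeholder Delta table path

    stream.writeStream
      .format("es")                            // org.elasticsearch.spark.sql streaming sink
      .option("es.nodes", "10.1.0.4")          // private IP reached through the Managed Private Endpoint
      .option("es.port", "9200")
      .option("es.nodes.wan.only", "true")     // talk only to the configured address
      .option("checkpointLocation", "/delta/_checkpoints/es_sink")
      .start("final-product-index")            // target index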

moving locally stored documents to azure

I want to spike whether Azure and the cloud are a good fit for us.
We have a website where users upload documents to our currently hosted website.
Every document has an equivalent record in a database.
I am using terraform to create the azure infrastructure.
What is the best way of migrating the documents from the local file path on the server to Azure?
Should I be using File Storage or Blob Storage? I am confused about the difference.
Is there anything in terraform that can help with this?
Based on your comments, I would recommend storing them in Blob Storage. This service is suited for storing and serving unstructured data like files and images. There are many other features like redundancy, archiving etc. that you may find useful in your scenario.
File Storage is more suitable for lift-and-shift scenarios where you're moving an on-prem application to the cloud and the application writes data to either a local or a network-attached disk.
You may also find this article useful: https://learn.microsoft.com/en-us/azure/storage/common/storage-decide-blobs-files-disks
UPDATE
Regarding uploading files from a local computer to Azure Storage, there are actually many options available:
Use a GUI tool such as Microsoft's Azure Storage Explorer.
Use AzCopy command-line tool.
Use Azure PowerShell Cmdlets.
Use Azure CLI.
Write your own code using any available Storage Client libraries or directly consuming REST API.
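On the Terraform part of the question: Terraform is well suited to creating the storage account and container, but it is not a bulk-copy tool, so the migration of the existing documents is usually done with AzCopy afterwards. A minimal sketch (resource and account names are placeholders, and azurerm_resource_group.main is assumed to exist):

    # Storage account + private blob container for the uploaded documents
    resource "azurerm_storage_account" "docs" {
      name                     = "mydocsstorageacct"   # must be globally unique
      resource_group_name      = azurerm_resource_group.main.name
      location                 = azurerm_resource_group.main.location
      account_tier             = "Standard"
      account_replication_type = "LRS"
    }

    resource "azurerm_storage_container" "documents" {
      name                  = "documents"
      storage_account_name  = azurerm_storage_account.docs.name
      container_access_type = "private"
    }

The existing files can then be copied in bulk with something like azcopy copy "/var/www/uploads" "https://mydocsstorageacct.blob.core.windows.net/documents?<SAS token>" --recursive, after which the database records only need the new blob URLs.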

is it possible to back up Azure Key Vault secrets to blob storage?

Good morning!
I want to take a backup of Azure Key Vault secrets to blob storage using PowerShell.
I'm able to take a backup to my local machine. Team, any help or suggestions, please?
There isn't a direct mechanism to achieve this. You will indeed need to have an intermediary PowerShell process to download the secrets and upload them to blob storage.
Using blob storage as a medium for backup is okay provided you fully understand the implications and mitigate the risks. You should at the very least ensure your storage account resides in a different region from your Key Vault for continuity reasons, and have appropriate controls in place to prevent unauthorized access. You must also appreciate that the transportation of secrets is ultimately protected by an RSA 2048-bit key encrypting key (KEK), so apply key-equivalency principles when considering the security of your secrets in transit outside of the Microsoft network. You should also consider the security of the machine from which you run PowerShell; using an Automation account in Azure to run the PowerShell under a service principal may be better.
To send a file to blob storage using PowerShell, please see this article:
https://learn.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-powershell
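Putting those pieces together, a minimal sketch using the Az PowerShell modules (vault, storage account, and container names are placeholders; note that Backup-AzKeyVaultSecret produces an encrypted backup file, not the plain secret value):

    # Back up every secret in a Key Vault and upload the backup files to a blob container.
    Connect-AzAccount

    $vault     = "my-keyvault"
    $container = "kv-backups"
    # Uses the signed-in identity for storage access; a key or SAS context works too.
    $ctx       = New-AzStorageContext -StorageAccountName "mybackupaccount" -UseConnectedAccount

    foreach ($secret in Get-AzKeyVaultSecret -VaultName $vault) {
        $file = Join-Path $env:TEMP "$($secret.Name).blob"
        Backup-AzKeyVaultSecret -VaultName $vault -Name $secret.Name -OutputFile $file -Force
        Set-AzStorageBlobContent -File $file -Container $container -Blob "$($secret.Name).blob" -Context $ctx -Force
        Remove-Item $file
    }

Running the same loop from an Automation account runbook, as suggested above, avoids the backup files ever touching a local machine.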

Is Azure Cloud Service Local Storage encrypted?

Is Azure Cloud Service Local Storage encrypted?
I would like to utilize Local Storage for my worker role as a scratch disk for image manipulation. I'm currently using an encrypted Azure file share, but the performance isn't great. I'm concerned that if I start using Local Storage my data may not be encrypted at rest. I haven't been able to find definitive information about encryption and Local Storage.
Microsoft clearly supports encrypting data in the Blob and File services using Storage Service Encryption (SSE), and it's trivial to enable this via the Azure Portal.
When configuring local storage in the cloud service definition I don't see any options related to encryption. There's also no mention of Local Storage in the Azure data at rest white paper.
It looks like Azure Disk Encryption supports encrypting both OS and data drives, but again, I didn't see any mention of Local Storage or PaaS in the Azure Disk Encryption page.

How to get a list of files from Azure Blob Storage using Spark/Scala?

How can I get a list of files from Azure Blob Storage in Spark and Scala?
I have no idea how to approach this.
I don't know whether your Spark runs on Azure or locally, so there are two cases, but they are similar.
For Spark running locally, there is an official blog which introduces how to access Azure Blob Storage from Spark. The key is that you need to configure the Azure Storage account as HDFS-compatible storage in the core-site.xml file and add the two jars hadoop-azure & azure-storage to your classpath, so that you can access it over the wasb[s] protocol. You can refer to the official tutorial on HDFS-compatible storage with wasb, and the blog about configuration for HDInsight, for more details.
For Spark running on Azure, the only difference is that you simply access HDFS with wasb; the other preparation was already done by Azure when the HDInsight cluster with Spark was created.
Files can then be read with SparkContext methods such as wholeTextFiles, or listed through the Hadoop FileSystem API, as in the sketch below.
Hope it helps.
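As a concrete illustration, a sketch of the listing itself once the wasb(s) configuration is in place; the account, container, and the environment variable holding the key are placeholders, sc is the SparkContext, and the same property could equally be set in core-site.xml:

    // List blobs through the wasbs:// Hadoop connector using the Hadoop FileSystem API.
    import java.net.URI
    import org.apache.hadoop.fs.{FileSystem, Path}

    val account   = "mystorageaccount"
    val container = "mycontainer"

    // Make the storage account key visible to the Hadoop layer
    // (equivalent to setting it in core-site.xml).
    sc.hadoopConfiguration.set(
      s"fs.azure.account.key.$account.blob.core.windows.net",
      sys.env("AZURE_STORAGE_KEY"))

    val fs = FileSystem.get(
      new URI(s"wasbs://$container@$account.blob.core.windows.net"),
      sc.hadoopConfiguration)

    fs.listStatus(new Path("/")).foreach(status => println(status.getPath))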
If you are using Databricks, try the below:
dbutils.fs.ls("blob_storage_location")
