I want to download large files (>300 MB) from Azure Blob Storage within an Azure Container App using the Dapr output binding. This works without any issues for small files (<100 MB), but it keeps failing for files larger than 300 MB.
My container has 4 GiB RAM and 2 CPUs, and according to the metrics dashboard I am way under the resource limits. I can reproduce the error by connecting to the container with a bash shell and retrieving the file through the Dapr HTTP API with the following command:
curl -L -d '{ "operation": "get", "metadata": { "blobName": "XY_2217.ziprt" }}' http://localhost:3500/v1.0/bindings/raw-data-storage --output testlarge.zip
Here is a screenshot that shows two downloads: the first, with a small file, works; the second, with a large file, keeps failing:
Any ideas?
I have some issues with deployment.
When I run the command below to push the configuration, it pushes corrupted data:
az webapp config container set --resource-group ${AZURE_RG_NAME} --name ${AZURE_APP_NAME} --multicontainer-config-type compose --multicontainer-config-file deploy/docker-compose.yml
As far as I can see, the data that is sent (encoded as base64) cannot be decoded properly:
{
    "name": "DOCKER_CUSTOM_IMAGE_NAME",
    "value": "COMPOSE|dmvyc2lvbjogjzmncnnlcnzpy2..." // here
}
When I dump a base64 file that I have encoded myself, it decodes correctly. I have checked the encoding of both files and both are UTF-8.
This is how it looks on the Azure configuration page.
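For reference, the local round-trip check mentioned above can be reproduced with something like the following sketch (GNU coreutils base64 assumed; the file path is taken from the command above):
# Encode the compose file yourself and confirm it decodes back to the original.
base64 -w0 deploy/docker-compose.yml > compose.b64
base64 -d compose.b64 | diff - deploy/docker-compose.yml && echo "local round-trip OK"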
I have talked with Azure support about this; they have confirmed the bug, and a release is on its way. Here is the bug report with a link to the fix: https://github.com/Azure/azure-cli/issues/14208
No known ETA for the deployment, but the support engineer guessed around the beginning of August.
In the meantime, we have worked around this bug by pasting the contents of the Docker Compose file under "Container Settings" in the Azure Portal.
"fileUris": [
"https://files.blob.core.windows.net/extensions/test.sh"]
In an Azure scale set, does this part of the extension download the file test.sh to the VM, or does it call it directly from blob storage?
I'm assuming you are talking about the custom script extension for Azure virtual machines.
On its documentation page it reads:
The Custom Script Extension downloads and executes scripts on Azure virtual machines. This extension is useful for post deployment configuration, software installation, or any other configuration/management task. Scripts can be downloaded from Azure storage or GitHub, or provided to the Azure portal at extension run time. The Custom Script extension integrates with Azure Resource Manager templates, and can also be run using the Azure CLI, PowerShell, Azure portal, or the Azure Virtual Machine REST API.
The relevant parts are the download and execution steps: the extension works by first downloading and then executing the scripts you provide to it.
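For illustration only, here is roughly how the extension could be attached to a scale set with the Azure CLI; the resource group and scale set names are placeholders, the script URL is the one from the question, and commandToExecute is what runs the locally downloaded copy:
# Hedged sketch: attach the Linux CustomScript extension to a scale set.
# MyResourceGroup / MyScaleSet are placeholders; fileUris is downloaded first,
# then commandToExecute runs against the local copy on each VM instance.
az vmss extension set \
    --resource-group MyResourceGroup \
    --vmss-name MyScaleSet \
    --name CustomScript \
    --publisher Microsoft.Azure.Extensions \
    --settings '{
        "fileUris": ["https://files.blob.core.windows.net/extensions/test.sh"],
        "commandToExecute": "bash test.sh"
    }'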
Edit: If you need to deploy some external resources, you can upload them to your GitHub account or an Azure Storage blob and download/read them from there.
See for example this answer for more details on how to download a file from a blob.
Invoke-WebRequest -Uri https://jasondisk2.blob.core.windows.net/msi/01.PNG -OutFile 'C:\01.PNG'
If you simply want to read the JSON file, you can do as described in this other answer.
$response = Invoke-RestMethod -Uri "https://yadayada:8080/bla"
$response.flag
Note: Invoke-RestMethod automatically converts the JSON response to a PSObject.
As for the working directory: the extension downloads its files into the following directory
C:\Packages\Plugins\Microsoft.Compute.CustomScriptExtension\1.*\Downloads\<n>
where <n> is a decimal integer which may change between executions of the extension. The 1.* value matches the actual, current typeHandlerVersion value of the extension.
For example, the actual directory could be
C:\Packages\Plugins\Microsoft.Compute.CustomScriptExtension\1.8\Downloads\2
See the troubleshooting section in the Azure documentation for more information.
Alternatively, for a Linux-based system the path is similar to
/var/lib/waagent/custom-script/download/0/
See this page for more information.
I am developing a Logic App which is scheduled every minute and creates a storage blob with logging data. My problem is that I must create the container for the blob manually to get it working. If I create a blob within a non-existent container,
I get the following error:
"body": {
"status": 404,
"message": "Specified container tmp does not exist.\r\nclientRequestId: 1111111-2222222-3333-00000-4444444444",
"source": "azureblob-we.azconn-we.p.azurewebsites.net"
}
This Stack Overflow question suggested putting the container name in the blob name,
but if I do so I get the same error message (also with /tmp/log1.txt):
{
    "status": 404,
    "message": "Specified container tmp does not exist.\r\nclientRequestId: 1234-8998989-a93e-d87940249da8",
    "source": "azureblob-we.azconn-we.p.azurewebsites.net"
}
You may say that is not a big deal, but I have to deploy this Logic App multiple times with an ARM template, and there is no possibility to create a container in a storage account (see this link).
Do I really need to create the container manually or write an extra Azure Function to check whether the container exists?
I have run into this before; you have a couple of options:
You write something that runs after the ARM template to provision the container. This is simple enough if you are provisioning through VSTS release management, where you can just add another step (see the sketch after this list).
You move from ARM templates to provisioning in PowerShell, where you can create the container yourself. See New-AzureStorageContainer.
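For example, the post-deployment step from the first option can be as small as a single Azure CLI call. A sketch, assuming placeholder account credentials; the container name tmp is taken from the error above:
# Hedged sketch of a post-deployment step: create the container the Logic App
# expects. Replace the account name and key with your own values.
az storage container create \
    --name tmp \
    --account-name mystorageaccount \
    --account-key "$STORAGE_ACCOUNT_KEY"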
Say I have a bunch of similar tasks running in parallel in an Azure Batch pool of VMs. These tasks connect to a SQL database and extract data for an individual table using sqlcmd. The table output is then compressed by piping it to 7z.exe. So the command line is similar to the following (mind the quotes):
cmd /c sqlcmd -i .\table.sql -S . -E -s "," -I -h -1 -W | "c:\Program Files\7-Zip\7z.exe" a -tbzip2 -si "out.csv.bz2"
The catch here is that the data is normally saved to each VM's local storage as an out.csv.bz2 file. However, under Azure Batch, once the tasks are finished, the VMs allocated from the pool are gone. So I need a mechanism to collect all these out.csv.bz2 files into an Azure storage account (e.g. Azure Blob Storage or Data Lake Storage). I can't seem to find a mechanism in Azure Batch to redirect/persist the output of my command-line task(s) directly to Azure Storage instead of local VM storage.
Does anyone know how to accomplish this?
If you are on a VirtualMachineConfiguration pool, you can use OutputFile to upload to Azure Storage Blobs. This functionality is not available in CloudServiceConfiguration pools yet.
For CloudServiceConfiguration pools, please see this related question.
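For the VirtualMachineConfiguration case, here is a sketch of what the outputFiles part of the task could look like when the task is created with the Azure CLI. The job ID, the command line, and the container SAS URL are placeholders; the JSON structure follows the Batch REST API:
# Hedged sketch: declare out.csv.bz2 as an output file so the Batch agent
# uploads it to blob storage when the task succeeds (all values are placeholders).
cat > task.json <<'EOF'
{
  "id": "extract-table-1",
  "commandLine": "cmd /c <your sqlcmd | 7z pipeline here>",
  "outputFiles": [
    {
      "filePattern": "out.csv.bz2",
      "destination": {
        "container": { "containerUrl": "https://<account>.blob.core.windows.net/<container>?<sas-token>" }
      },
      "uploadOptions": { "uploadCondition": "tasksuccess" }
    }
  ]
}
EOF
az batch task create --job-id myjob --json-file task.json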
Is there a way to upload multiple files to Azure Blob Storage from a Linux machine, either using the terminal or an application (web based or not)?
Thank you for your interest – there are two options to upload files to Azure Blob Storage from Linux:
Set up and use the XPlat CLI by following the steps below:
Install the OS X Installer from http://azure.microsoft.com/en-us/documentation/articles/xplat-cli/
Open a Terminal window and connect to your Azure subscription by either downloading and using a publish settings file or by logging in to Azure using an organizational account (find instructions here)
Create an environment variable AZURE_STORAGE_CONNECTION_STRING and set its value (you will need your account name and account key): "DefaultEndpointsProtocol=https;AccountName=enter_your_account;AccountKey=enter_your_key"
Upload a file into Azure Blob Storage by using the following command: azure storage blob upload [file] [container] [blob] (a concrete example is shown after this list)
Use one of the third-party web-based Azure Storage explorers, like CloudPortam: http://www.cloudportam.com/.
You can find the full list of Azure Storage explorers here: http://blogs.msdn.com/b/windowsazurestorage/archive/2014/03/11/windows-azure-storage-explorers-2014.aspx.
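As a concrete example of the first option, with placeholder account, container, and file names:
# Hedged sketch: set the connection string, then upload one file with the XPlat CLI.
# Account name/key, container, and file names are placeholders.
export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;AccountName=enter_your_account;AccountKey=enter_your_key"
azure storage blob upload ./logs.tar.gz mycontainer logs.tar.gz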
You can use the find command with the -exec option to run the upload command for each file, as described here:
find *.csv -exec az storage blob upload --file {} --container-name \
CONTAINER_NAME --name {} --connection-string 'CONNECTION_STRING' \;
where CONNECTION_STRING is the connection string of your Azure storage account, available from portal.azure.com, and CONTAINER_NAME is the target container. This will upload all CSV files in your directory to the blob container associated with that connection string.
If you prefer the command line and have a recent Python interpreter, the Azure Batch and HPC team has released a code sample with some AzCopy-like functionality in Python called blobxfer. It allows full recursive directory ingress into Azure Storage as well as full container copy back out to local storage. [Full disclosure: I'm a contributor to this code.]