I am developing an Azure WebJob that monitors a blob storage account for newly inserted blobs. My storage account consists of multiple containers, all holding similar information. Currently I'm using a separate BlobTrigger for every container.
Is there a way to monitor the whole account for new blobs instead of every single container? If not, can I automatically iterate over the containers in a storage account and call the WebJob with the container names as parameters?
No, currently each BlobTrigger monitors a single container for changes. At startup time, the blob containers indicated by your BlobTrigger-annotated functions result in multiple "listeners" being started, one monitoring each container. So there's no runtime way for you to iterate over containers and set this up yourself, short of codegen/ILGen of SDK methods with the appropriate attributes.
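For reference, the per-container approach described in the question looks roughly like this. This is just a minimal sketch against the WebJobs SDK; the container and function names are placeholders:

```csharp
using System.IO;
using Microsoft.Azure.WebJobs;

public class Functions
{
    // One BlobTrigger per container: each attribute results in a separate
    // listener for that container being started by the JobHost.
    public static void ProcessContainerA(
        [BlobTrigger("container-a/{name}")] Stream blob, string name, TextWriter log)
    {
        log.WriteLine($"New blob in container-a: {name}");
    }

    public static void ProcessContainerB(
        [BlobTrigger("container-b/{name}")] Stream blob, string name, TextWriter log)
    {
        log.WriteLine($"New blob in container-b: {name}");
    }
}
```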
If you'd like, you can add a feature suggestion here: https://github.com/Azure/azure-webjobs-sdk/issues, and we can review it for the next release. However, I've never heard of anyone else needing this functionality, so it seems like a pretty corner case :)
This is a general question about the usage of ACI and how its pricing is calculated.
I checked the MSFT link: ACI pricing
It states that ACI is charged based on memory and CPU consumption, and the calculation for my scenario is not available in the MSFT pricing calculator.
From some background reading I did: if I stop/de-allocate my ACI, I should no longer be charged. But a lot of tutorials actually guide us to create/delete a NEW ACI, and the start/stop functions only exist in the Azure CLI.
The Logic App ACI custom connector does not have start/stop functions.
Azure PowerShell does not have start/stop functions.
Questions:
Given the above, what is the difference between stopping and deleting an ACI in terms of pricing and performance?
For example:
I can use an Azure Automation runbook to call the Azure CLI to stop/start my ACI.
I can use a Logic App ACI custom connector to delete/create a new ACI.
Wouldn't it be faster to just stop the ACI and start it again, avoiding the bandwidth cost of pulling the image from Docker Hub (or of keeping an image in Azure Container Registry)? It should also be faster than provisioning a new instance each time.
Why are the stop/start functions not available in the Azure PowerShell module and the Logic App custom connector? That sounds like a better approach for handling this.
From here, I would say there is no difference. Also stopped instances should incur no cost.
When the containers are recycled, the resources are deallocated and billing stops for the container group.
This, however, also means that restarting a stopped instance will not really be faster than creating it from scratch. Starting might happen on a new host, so the image may need to be pulled again as well.
We are using an Azure Storage account to store some files that are downloaded by our app on the user's demand.
Even though there should be no write operations (at least none I can think of), we are exceeding the included write operations just a few days into the billing period (see image).
Regarding the price, it's still within limits, but I'd still like to know whether this is normal and how I can analyze the matter. Besides the storage, we are using
Functions and
App Service (mobile app)
but none of them should cause that many write operations. I've checked the logs of our functions, and none of those that access the queues or the blobs have been active lately. There are some functions that run every now and then, but only once every few minutes, and those do not access the storage at all.
I don't know if this is related, but there is a kind of periodic ingress on our blob storage (see the image below). The period is roughly 1 hour, but there is a baseline of 100 kB per 5 minutes.
Analyzing the metrics of the storage account further, I found that there is a constant stream of 1.90k transactions per hour for blobs and 1.3k transactions per hour for queues, which seems quite exceptional to me. (Please note that the resolution of this graph is 1 hour, while the former has a resolution of 5 minutes.)
Is there anything else I can do to analyze where the write operations come from? It kind of bothers me, since it does not seem as if it's supposed to be like that.
I've had the exact same problem; after enabling Storage Analytics and inspecting the $logs container, I found many log entries indicating that upon every request towards my Azure Functions, these write operations occur against the following container object:
https://[function-name].blob.core.windows.net:443/azure-webjobs-hosts/locks/linkfunctions/host?comp=lease
In my Azure Functions code I do not explicitly write to any container or file, but I have the following two application settings configured:
AzureWebJobsDashboard
AzureWebJobsStorage
So I filed a support ticket in Azure with the following questions:
Are the write operations triggered by these application settings? I believe so, but could you please confirm.
Will the write operations stop if I delete these application settings?
Could you please describe, at a high level, in what context these operations occur (e.g. logging, resource locking, other)?
and I got the following answers from the Azure support team, respectively:
Yes, you are right. According to the logs information, we can see “https://[function-name].blob.core.windows.net:443/azure-webjobs-hosts/locks/linkfunctions/host?comp=lease”.
This azure-webjobs-hosts folder is associated with the function app and is created by default when the function app is created. When the function app is running, it records these logs in the storage account that is configured with AzureWebJobsStorage.
You can't stop the write operations, because these operations record logs that the Azure Functions runtime needs in that storage account. Please do not remove the application setting AzureWebJobsStorage. The Azure Functions runtime uses this storage account connection string for all functions except HTTP-triggered functions; removing this application setting will leave your function app unable to start. By the way, you can remove AzureWebJobsDashboard, but that only stops the monitoring, not the operations above.
These operations record runtime logs of the function app. They occur when our backend allocates an instance for running the function app.
The best place to find information about storage usage is Storage Analytics, especially Storage Analytics Logging.
There's a special blob container called $logs in the same storage account which will have detailed information about every operation performed against that storage account. You can view the blobs in that blob container and find the information.
If you don't see this blob container in your storage account, then you will need to enable storage analytics on your storage account. However considering you can see the metrics data, my guess is that it is already enabled.
Regarding the source of these write operations: have you enabled diagnostics for your Functions and App Service? Those write diagnostics logs to blob storage. Storage Analytics itself also writes to the same account, and that will cause write operations as well.
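If you'd rather dig through $logs programmatically than in the portal, something along these lines should work. This is a hedged sketch using the Azure.Storage.Blobs client; the connection string variable and the operation names being filtered are assumptions for illustration:

```csharp
using System;
using Azure.Storage.Blobs;

class LogReader
{
    static void Main()
    {
        // Placeholder: connection string of the storage account you are analyzing.
        var connectionString = Environment.GetEnvironmentVariable("STORAGE_CONNECTION_STRING");
        var container = new BlobContainerClient(connectionString, "$logs");

        // Storage Analytics logs are organized by service and date, e.g. "blob/2023/01/31/...".
        foreach (var blob in container.GetBlobs(prefix: "blob/"))
        {
            var content = container.GetBlobClient(blob.Name).DownloadContent().Value.Content.ToString();

            // Each line is one logged operation; filter for write operations such as PutBlob.
            foreach (var line in content.Split('\n'))
            {
                if (line.Contains("PutBlob") || line.Contains("PutBlock"))
                    Console.WriteLine(line);
            }
        }
    }
}
```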
In my case, I had Azure Application Insights causing 10K transactions per minute on the storage account for my Functions and App Services, even though there were only a few HTTP requests among them. I'm not sure what triggered them, but once I removed Application Insights, everything went back to normal.
We have a static class in the WebApp that contains a static dictionary of current sessions and usernames. We need access to the data in that dictionary from the WebJob, because we want to update data based on who currently has an active session. The WebJob runs every 5 minutes and needs the current list of sessions/users.
I can access the dictionary from the WebJob, but it's always null. We have logging in the WebApp that verifies there are entries in the dictionary, but when the WebJob accesses the dictionary, it's null.
How can I get that object in the WebJob and read its data? Do we need to use Azure Storage (Queue/Table) for this to work?
An "Azure AppService" is hosted on an "AppService Plan", which in turn consists of a number of virtual machines. WebJobs ("your.webjob.exe") and WebApps(usually "w3wp.exe") are completely independent processes on theses systems. They may run on the same machine, but there is no guarantee for it. Either way, communication between them would be difficult and can definitely not be achieved by using a common static variable.
For your use case, you should use a common store. Azure Storage could work, but Azure Redis Cache or plain SQL might also do the trick, depending on your framework and requirements.
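As a rough illustration of the "common store" idea, here is a hedged sketch using Azure Table Storage via Azure.Data.Tables. The table, partition, and column names are made up; Redis or SQL would follow the same pattern, with the web app writing session entries and the WebJob reading them:

```csharp
using System;
using Azure.Data.Tables;

public static class SessionStore
{
    // Placeholder connection string / table name; both processes point at the same storage account.
    static readonly TableClient Table = new TableClient(
        Environment.GetEnvironmentVariable("STORAGE_CONNECTION_STRING"), "ActiveSessions");

    // Called from the web app whenever a session is created or refreshed.
    public static void Upsert(string sessionId, string userName)
    {
        Table.CreateIfNotExists();
        var entity = new TableEntity("sessions", sessionId)
        {
            { "UserName", userName },
            { "LastSeen", DateTimeOffset.UtcNow }
        };
        Table.UpsertEntity(entity);
    }

    // Called from the WebJob every 5 minutes to read the current sessions.
    public static void PrintActiveSessions()
    {
        foreach (var e in Table.Query<TableEntity>(x => x.PartitionKey == "sessions"))
        {
            Console.WriteLine($"{e.RowKey}: {e.GetString("UserName")}");
        }
    }
}
```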
Scenario:
An Azure Function hosted on an App Service Plan and scaled out to 5 instances. The Azure Function is blob-triggered.
Question:
Is there any documentation that explains the mechanism that prevents a scaled-out Azure Function from processing the same blob multiple times? I am asking because more than one instance of the function is running.
Agree with @Peter; here are my understandings for reference, correct me if something doesn't make sense.
Blob trigger mechanism related info is stored in the Azure storage account of our Function app (defined by the app setting AzureWebJobsStorage). Locks live in a blob container named azure-webjobs-hosts, and there's a queue azure-webjobs-blobtrigger-<FunctionAppName> for internal use.
See another part in the same comment.
Normally only 1 of N host instances is scanning for new blobs (based on a singleton host id lock). When it finds a new blob it adds a queue message for it and one of the N hosts processes it.
So in the first step, scanning for new blobs, the scale-out feature doesn't participate. The singleton host id lock is implemented with a blob lease as @Peter mentioned (check the blob locks/<FunctionAppName>/host in azure-webjobs-hosts).
Once the internal queue starts receiving messages about new blobs, the scale-out feature begins to work, as host instances fetch and process messages together. While a blob message is being processed it can't be seen by other instances, and it is deleted once processing completes.
Besides, to ensure that a blob that has already been processed never triggers the function again later (e.g. in the next round of scanning), there is another mechanism: blob receipts.
As far as I can tell blob leases are used.
This is backed by this comment made by an MS engineer working on the Azure Functions team.
The singleton mechanism used under the covers to ensure only one host processes a blob is based on the HostId. In regular scale out scenarios, the HostId is the same for all instances, so they collaborate via blob leases behind the scenes using the same lock blob scoped to the host id.
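To illustrate the lease idea, here is a minimal sketch of how a blob lease gives you a cross-instance singleton using Azure.Storage.Blobs. This is not the Functions runtime's actual code; the lock blob path and host id are placeholders:

```csharp
using System;
using Azure;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Specialized;

class SingletonScanner
{
    static void Main()
    {
        var container = new BlobContainerClient(
            Environment.GetEnvironmentVariable("AzureWebJobsStorage"), "azure-webjobs-hosts");
        container.CreateIfNotExists();

        // A well-known lock blob, roughly analogous to locks/<host-id>/host.
        var lockBlob = container.GetBlobClient("locks/my-host-id/host");
        if (!lockBlob.Exists())
            lockBlob.Upload(BinaryData.FromString(string.Empty));

        var lease = lockBlob.GetBlobLeaseClient();
        try
        {
            // Only one instance can hold the lease at a time; the others fail and back off.
            lease.Acquire(TimeSpan.FromSeconds(60));
            Console.WriteLine("I am the scanner: look for new blobs and enqueue them.");
        }
        catch (RequestFailedException ex) when (ex.Status == 409)
        {
            Console.WriteLine("Another instance holds the lease; skip scanning this round.");
        }
    }
}
```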
I am trying to deploy an app which has a frontend app and a backend worker. The worker runs a CPU-intensive process. My requirements are to run the web app on an Azure A0 instance while the CPU-intensive process runs on a D2 instance. Both instances must be able to share files. I have read in a few places about SBS.
I tried creating the Linux VMs in the same cloud service but couldn't figure out how to SSH into them separately, since they use the same cloud service URL. I followed this http://azure.microsoft.com/en-us/documentation/articles/cloud-services-connect-virtual-machine/ to create the 2nd VM.
Can anyone suggest how to achieve this setup? Also, if possible, how do I check whether the disks are available to both instances?
Azure docs aren't as helpful as AWS's. :(
If the two VMs just need to share files and you don't want to go to the extra effort of coding against blob storage, then consider Azure Files, which exposes an SMB share against a blob storage back end. This allows you to do standard file I/O operations instead of writing custom blob storage code. See http://blogs.msdn.com/b/windowsazurestorage/archive/2014/05/12/introducing-microsoft-azure-file-service.aspx which shows how to create the file share on Windows and Linux VMs.
[Probably easier to give an answer here]
Blob storage is a universal storage service that can effectively act as the common drive you are looking for. Access to a blob storage container is made over HTTP/HTTPS, either through a blob storage client or over REST, and you will have functions to upload, download, list objects, etc.
For Python, you'll hopefully find this article sufficient, although I have no experience with Python on Azure to comment; or, if choosing REST and HTTP requests, that should work fine.
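Whichever SDK you pick, the shape of the calls is the same: upload, download, list. Here is a hedged C# sketch using the Azure.Storage.Blobs client (the Python client exposes equivalent operations); the container name, blob paths, and connection string variable are placeholders:

```csharp
using System;
using Azure.Storage.Blobs;

class SharedDrive
{
    static void Main()
    {
        // Placeholder connection string; both VMs use the same storage account.
        var container = new BlobContainerClient(
            Environment.GetEnvironmentVariable("STORAGE_CONNECTION_STRING"), "shared-files");
        container.CreateIfNotExists();

        // The frontend VM uploads a work item.
        container.GetBlobClient("jobs/input-001.dat")
            .Upload(BinaryData.FromString("job payload"), overwrite: true);

        // The worker VM lists and downloads the files it needs to process.
        foreach (var blob in container.GetBlobs(prefix: "jobs/"))
        {
            var data = container.GetBlobClient(blob.Name).DownloadContent().Value.Content;
            Console.WriteLine($"{blob.Name}: {data.ToArray().Length} bytes");
        }
    }
}
```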
HTH