Reading files from Azure Blob Storage into a Function App in Python - python-3.x

I'm trying to recursively read multiple files of the same type from a container in Azure Blob Storage, in Python, with a Function App. How can that be done using the bindings in the orchestrator's function.json, as shown below? And what changes should be made in the local settings, given that I've already specified the connection strings and blob paths there?
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "context",
      "type": "orchestrationTrigger",
      "direction": "in"
    },
    {
      "name": "inputblob",
      "type": "blob",
      "dataType": "string",
      "path": "test/{file_name}.pdf{queueTrigger}",
      "connection": "CONTAINER_CONN_STR",
      "direction": "in"
    }
  ]
}
test: the directory I have.
CONTAINER_CONN_STR: the connection string already specified in the local settings.
Also, when doing so with the normal method (without a binding), downloading the files to the local system gives the error below:
Exception: PermissionError: [Errno 13] Permission denied: 'analytics_durable_activity/'
Stack: File "C:\Program Files\Microsoft\Azure Functions Core Tools\workers\python\3.8\WINDOWS\X64\azure_functions_worker\dispatcher.py", line 271, in _handle__function_load_request
func = loader.load_function(

How can that be done using the bindings in the orchestrator's function.json, as shown below? What changes should be made in the local settings?
The configuration that you have used looks good. For more information, you can refer to this Example.
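If you need to process every matching file rather than a single bound blob, one workaround is to skip the input binding and list the blobs with the storage SDK inside an activity function instead. A minimal sketch, assuming the container is test and the connection string lives in the CONTAINER_CONN_STR app setting as in the question (the function and parameter names are illustrative):

import os
from azure.storage.blob import ContainerClient

def main(prefix: str) -> list:
    # Build a client for the "test" container from the app setting
    # referenced in the question's binding.
    container = ContainerClient.from_connection_string(
        os.environ["CONTAINER_CONN_STR"], container_name="test")
    # Listing by name prefix walks the virtual folder hierarchy
    # recursively, so nested "directories" are included.
    return [blob.name
            for blob in container.list_blobs(name_starts_with=prefix)
            if blob.name.endswith(".pdf")]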
Also, when doing so with the normal method (without a binding), downloading the files to the local system gives the error below:
You might get this error when you are trying to open a file but your path is a folder, or when you don't have the required permissions.
You can refer to this SO thread which discusses a similar issue.
REFERENCES:
Set, View, Change, or Remove Permissions on Files and Folders | Microsoft Docs

You can keep the state of the trigger in an entity and check it every time the function is triggered. The function will process the file only when the state matches, i.e. the previous file has already been received but not yet processed.
Please refer to https://learn.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-overview?tabs=csharp - Pattern #6: Aggregator (stateful entities)
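A rough Python sketch of such a stateful entity, using the durable entities API (the operation names and state shape here are illustrative, not from the question):

import azure.durable_functions as df

def entity_function(context: df.DurableEntityContext):
    # State tracks the last file received and whether it was processed.
    state = context.get_state(lambda: {"received": None, "processed": True})
    operation = context.operation_name
    if operation == "record_received":
        state["received"] = context.get_input()
        state["processed"] = False
    elif operation == "mark_processed":
        state["processed"] = True
    context.set_state(state)
    # Callers can read the state back to decide whether to process.
    context.set_result(state)

main = df.Entity.entity_function(entity_function)

The triggered function would then signal record_received on each trigger and only process the new file once the previous one is marked processed.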

Related

Azure TimerTrigger Multiple instances with different configuration

Using VS Code + the "Azure Functions" extension, I generated the default Python 3.7 timer-trigger function with the following settings:
// function.json
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "mytimer",
      "type": "timerTrigger",
      "direction": "in",
      "schedule": "0 0 */6 * * *"
    }
  ]
}
I have also set up two environment variables, "USER" and "PASSWORD", in the Configuration of the App Service:
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "****************",
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "USER": "********",
    "PASSWORD": "*********"
  }
}
Goal:
I want to run two instances of the same function, but using two different Configs, i.e. Users+Passwords.
Problem:
I believe that the Configuration/App Settings might not be sufficient for this. I can't find a way to run the function twice with multiple different parameters.
Question: What options do I have to reach my goal? One idea I had was to put the user/password into function.json, but I could not figure out how to access that information from within the function.
You have two options:
1. Read a custom JSON file (not necessarily function.json): you can add a custom JSON file to the function app, read the values you want according to the file's hierarchy, and then use those values in the trigger.
2. Use deployment slots. (This is the official method; I think it is completely suitable for your current needs.) In a newly created slot you can use completely different environment variables in the Configuration settings.
This is the doc:
https://learn.microsoft.com/en-us/azure/azure-functions/functions-deployment-slots
I'd probably do it by having a single setting that holds a JSON array, viz
"Credentials": "[{'username':'***','password':'***'},{'username':'******','password':'******'}]"
Then, assuming you want to process them all at the same time, make a single function that parses the array and iterates over each username and password.
If you need to run them on different schedules, create a shared Python function DoTheThing(credentialIndex) that actually does the work and then multiple Azure Functions that simply call DoTheThing(0), DoTheThing(1), ...
(Security note: not immediately relevant to the problem at hand, but secrets are best kept in a secret store such as Key Vault rather than directly in the settings)
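A minimal sketch of that approach, assuming a Credentials app setting shaped like the JSON array above (do_the_thing stands in for the actual work and is illustrative):

import os
import json

# Parse the JSON array stored in the "Credentials" app setting.
credentials = json.loads(os.environ["Credentials"])

def do_the_thing(credential_index):
    cred = credentials[credential_index]
    # ... do the actual work with cred["username"] and cred["password"] ...

# Process every credential set in a single timer invocation.
for i in range(len(credentials)):
    do_the_thing(i)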
EDIT/SOLUTION:
I ended up with the following keys in my environment variables:
"USERS": "[\"UserA\", \"UserB\"]"
"UserA_USER": "Username1"
"UserA_PW": "Password1"
"UserB_USER": "Username2"
"UserB_PW": "Password2"
Then I iterated over the USERS array and retrieved the keys for each user like so:
import os
import json

users = json.loads(os.environ["USERS"])
for u in users:
    user = os.environ[u + "_USER"]
    pw = os.environ[u + "_PW"]
    doStuff(user, pw)

Virtual Assistant throwing 'Sorry, it looks like something went wrong'

I have created a virtual assistant using the Microsoft Virtual Assistant template. When testing in the emulator, whatever message I send, I get a 'something went wrong' reply.
I am new to the entire Bot Framework ecosystem and it is becoming very difficult to proceed.
In the log, what I can see is:
[11:26:32] Emulator listening on http://localhost:65233
[11:26:32] ngrok not configured (only needed when connecting to remotely hosted bots)
[11:26:32] Connecting to bots hosted remotely
[11:26:32] Edit ngrok settings
[11:26:32] POST 201 directline.startConversation
[11:26:39] <- message application/vnd.microsoft.card.adaptive
[11:26:39] POST 200 conversations.replyToActivity
[11:26:54] -> message hi
[11:26:55] <- trace The given key 'en' was not present in the dictiona...
[11:26:55] POST 200 conversations.replyToActivity
[11:26:55] <- trace at System.Collections.Generic.Dictionary`2.get_...
[11:26:55] POST 200 conversations.replyToActivity
[11:26:55] <- message Sorry, it looks like something went wrong.
[11:26:55] POST 200 conversations.replyToActivity
[11:26:55] POST 200 directline.postActivity
[11:27:48] -> message hello
[11:27:48] <- trace The given key 'en' was not present in the dictiona...
[11:27:48] POST 200 conversations.replyToActivity
[11:27:48] <- trace at System.Collections.Generic.Dictionary`2.get_...
[11:27:48] POST 200 conversations.replyToActivity
[11:27:48] <- message Sorry, it looks like something went wrong.
[11:27:48] POST 200 conversations.replyToActivity
[11:27:48] POST 200 directline.postActivity
From what I understood, the 'en' key is not present in some dictionary, and I am not sure what that means. I checked in the Responses folder and could not see an 'en' file; I'm not sure if that is the issue.
My emulator screenshot is attached.
Any help would be useful.
I believe the issue you are experiencing is a problem on the following lines inside MainDialog.cs:
var locale = CultureInfo.CurrentUICulture.TwoLetterISOLanguageName;
var cognitiveModels = _services.CognitiveModelSets[locale];
This tries to use the locale (retrieved from the current thread as per this documentation) as the key to access the cognitive models in your cognitivemodels.json file.
Inside your cognitivemodels.json file it should look like:
{
  "cognitiveModels": {
    // This line below here is what could be missing/incorrect in yours
    "en": {
      "dispatchModel": {
        "type": "dispatch",
        "region": "westus",
        ...
      },
      "knowledgebases": [
        {
          "id": "chitchat",
          "name": "chitchat",
          ...
        },
        {
          "id": "faq",
          "name": "faq",
          ...
        }
      ],
      "languageModels": [
        {
          "id": "general",
          "name": "msag-test-va-boten_general",
          "region": "westus",
          ...
        }
      ]
    }
  },
  "defaultLocale": "en-us"
}
The en key inside the cognitiveModels object is what the code is trying to use to retrieve your cognitive models, so if the locale pulled out in the code doesn't match the locale keys in your cognitivemodels.json, you will get the dictionary key error.
EDIT
The issue the OP had was a failed deploy. The steps we took were to:
1. Check the deploy_log.txt inside the Deployment folder for errors. In this case it was empty - not a good sign.
2. Check the deploy_cognitive_models_log.txt inside the Deployment folder for errors. There was an error present: Error: Cannot find module 'C:\Users\dip_chatterjee\AppData\Roaming\npm\node_modules\botdispatch\bin\dispatch.js'.
To fix this error we reinstalled all of the required npm packages as per step 5 of this guide, then ran the deploy script as per this guide.

Is it possible to create a blob-triggered Azure Function with a file name pattern?

I am developing a blob-triggered Azure Function. Following is the configuration of my "function.json" file:
{
  "disabled": false,
  "bindings": [
    {
      "name": "myBlob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "input/{name}",
      "connection": "BlobConnectionString"
    }
  ]
}
My function is working fine; it is triggered for all files in the "input" container. Now I want to filter files by their naming pattern. For example, I want to trigger my Azure Function only for those files which contain "~123~" in their name.
Is it possible to do this with some change to the "path" property of the "function.json" file?
If yes, what should the value of the "path" property be?
If not, please let me know if there is any other workaround possible.
Thanks,
input/{prefix}~123~{suffix} should work. In the function's method signature, instead of name, use prefix and suffix to get the parts of the blob name if needed.
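Applied to the question's function.json, only the path value changes:

{
  "disabled": false,
  "bindings": [
    {
      "name": "myBlob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "input/{prefix}~123~{suffix}",
      "connection": "BlobConnectionString"
    }
  ]
}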

Azure Data Factory specify custom output filename when copying to Blob Storage

I'm currently using ADF to copy files from an SFTP server to Blob Storage on a scheduled basis.
The filename structure is AAAAAA_BBBBBB_CCCCCC.txt.
Is it possible to rename the file before copying to Blob Storage so that I end up with a folder-like structure like below?
AAAAAA/BBBBBB/CCCCCC.txt
Here is what worked for me.
I created 3 parameters in my Blob Storage dataset, see the image below:
I specified the name of my file and added the file extension. You can put anything in the Timestamp parameter just to bypass the ADF requirement, since a parameter can't be empty.
Next, click on the Connection tab and add the following code in the FileName box: #concat(dataset().FileName,dataset().Timestamp,dataset().FileExtension). This code concatenates all the parameters, so you end up with something like "FileName_Timestamp_FileExtension". See the image below:
Next, click on your pipeline, then select your Copy Data activity. Click on the Sink tab, find the parameter Timestamp under Dataset properties, and add this code: #pipeline().TriggerTime. See the image below:
Finally, publish your pipeline and run/debug it. If it worked for me then I am sure it will work for you as well :)
With ADF V2 you can do that. First, use a Lookup activity to get all the filenames from your source.
Then chain a ForEach activity to iterate over the source file names. The ForEach activity contains a Copy activity. Both the source dataset and the sink dataset of the Copy activity have parameters for filename and folder path.
You can use split and replace functions to generate the sink folder path and filename based on your source file names, as sketched below.
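For a source name like AAAAAA_BBBBBB_CCCCCC.txt exposed as item().name inside the ForEach, expressions along these lines (a sketch, untested, using the same expression prefix as the answer below) could produce the sink values:

folderPath: #concat(split(item().name, '_')[0], '/', split(item().name, '_')[1])
fileName: #split(item().name, '_')[2]

That yields the folder AAAAAA/BBBBBB and the file CCCCCC.txt.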
First you have to get the filenames with a Get Metadata activity. You can then use them as a parameter in a Copy activity and rename the files.
As mentioned in the previous answer, you can use a replace function to do this:
{
  "name": "TgtBooksBlob",
  "properties": {
    "linkedServiceName": {
      "referenceName": "Destination-BlobStorage-data",
      "type": "LinkedServiceReference"
    },
    "folder": {
      "name": "Target"
    },
    "type": "AzureBlob",
    "typeProperties": {
      "fileName": {
        "value": "#replace(item().name, '_', '\\')",
        "type": "Expression"
      },
      "folderPath": "data"
    }
  },
  "type": "Microsoft.DataFactory/factories/datasets"
}

In Azure Functions, using a Bash script, is it possible to access properties from the queue message trigger?

Using Azure Functions, I'd like to use the properties from a dequeued message as arguments in my Bash script. Is this possible? And if so, how? Documentation on Bash Azure Functions seems a bit sparse.
I have looked at:
This documentation on binding to custom input properties. It gives C#/JavaScript examples, but no Bash samples.
And this GitHub sample with a similar Batch function.
However, after trying to apply similar concepts to my function, I've come up short.
Here is my setup:
Functions.json
{
  "bindings": [
    {
      "name": "inputMessage",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "some-queue",
      "connection": "AzureWebJobsStorage"
    }
  ],
  "disabled": false
}
Run.sh
echo "My name is $FirstName $LastName"
Sample Queue Message
{
  "FirstName": "John",
  "LastName": "Doe"
}
Actual Result
My name is:
What I'm hoping for
My name is: John Doe
Any thoughts on how to accomplish this, either by updating Functions.json or Run.sh?
For a Bash queue trigger, the queue message is passed as a string and you need to parse the JSON yourself in run.sh. Note that the Bash queue trigger is experimental. Parsing JSON in pure Bash is not easy, since you can't install third-party tools like jq in the Function App sandbox.
You can extract JSON properties from the queue message much more easily in other languages (JS/C#/PowerShell).
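For comparison, a minimal sketch of the same logic as a Python queue trigger (one of the other supported languages; the binding would mirror the one above):

import json
import azure.functions as func

def main(inputMessage: func.QueueMessage) -> None:
    # The queue payload arrives as bytes; parse the JSON ourselves.
    body = json.loads(inputMessage.get_body().decode("utf-8"))
    print(f"My name is {body['FirstName']} {body['LastName']}")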
