PIL UnidentifiedImageError from Azure Blob Trigger though image opens in 'watch' - azure

I am trying to debug an Azure Function locally using a Blob trigger. When I upload an image file to Azure, the trigger is received by my function running locally.
def main(blobin: func.InputStream, blobout: func.Out[bytes], context: func.Context):
    logging.info(f"Python blob trigger function processed blob \n"
                 f"Name: {blobin.name}\n"
                 f"Blob Size: {blobin.length} bytes")
    image_file = Image.open(blobin)
My function.json
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "blobin",
      "type": "blobTrigger",
      "direction": "in",
      "path": "uploads/{name}",
      "connection": "STORAGE"
    },
    {
      "name": "blobout",
      "type": "blob",
      "direction": "out",
      "path": "uploads/{blob_name}_resized.jpg",
      "connection": "STORAGE"
    }
  ]
}
The error I get when the Image.open(blobin) line runs is:
System.Private.CoreLib: Exception while executing function:
Functions.ResizePhoto. System.Private.CoreLib: Result: Failure
Exception: UnidentifiedImageError: cannot identify image file
<_io.BytesIO object at 0x0000017FD4FD7F40>
The interesting thing is that the image itself does open in the VSCode watch window, but fails when the code is run. It also gives the same error as above if I add it to the watch again (probably triggering a watch refresh).

If you want to resize an image and then save it from a blob-triggered function, try the code below:
import logging
from PIL import Image
import azure.functions as func
import tempfile
import ntpath
import os


def main(blobin: func.InputStream, blobout: func.Out[func.InputStream], context: func.Context):
    logging.info(f"Python blob trigger function processed blob \n"
                 f"Name: {blobin.name}\n"
                 f"Blob Size: {blobin.length} bytes")
    temp_file_path = tempfile.gettempdir() + '/' + ntpath.basename(blobin.name)
    print(temp_file_path)
    image_file = Image.open(blobin)
    image_file.resize((50, 50)).save(temp_file_path)
    blobout.set(open(temp_file_path, "rb").read())
    os.remove(temp_file_path)
function.json:
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "blobin",
      "type": "blobTrigger",
      "direction": "in",
      "path": "samples-workitems/{name}",
      "connection": "STORAGE"
    },
    {
      "name": "blobout",
      "type": "blob",
      "direction": "out",
      "path": "resize/{name}",
      "connection": "STORAGE"
    }
  ]
}
Note that you should not store the resized image in the same container, as that leads to an endless loop (the new image triggers the blob trigger and is resized again and again). Your issue is that the newly resized image is not output correctly, which is why the exception occurs when Image.open(blobin) runs.
Anyway, the code above works perfectly for me; see the results below.
Upload a big image:
Resize the image and save it to another container:

It turns out that setting a breakpoint at the Image.open(blobin) line breaks the function somehow. Removing it from there and adding it to the next line no longer produces the error. Probably Azure doesn't like to wait and times out the stream? Who knows.
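If stream timing really is the culprit, one defensive pattern (my own sketch, not something confirmed by the answers above) is to consume the trigger stream once up front and keep the whole resize in memory, which also avoids the temp file; the (50, 50) size and JPEG output are just placeholders:

import io
import logging

import azure.functions as func
from PIL import Image


def main(blobin: func.InputStream, blobout: func.Out[bytes], context: func.Context):
    logging.info(f"Resizing blob {blobin.name} ({blobin.length} bytes)")

    # Read the trigger stream exactly once, up front, into an in-memory buffer.
    data = blobin.read()
    image_file = Image.open(io.BytesIO(data))

    # Resize and write the result to another in-memory buffer instead of a temp file.
    # convert("RGB") guards against RGBA sources (e.g. PNG) when saving as JPEG.
    out_buffer = io.BytesIO()
    image_file.resize((50, 50)).convert("RGB").save(out_buffer, format="JPEG")

    blobout.set(out_buffer.getvalue())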

Related

Process csv file stored in blob storage using Python Azure functions

I have a csv file with 'n' records stored in blob storage. I want to read the new records being added to the csv file, process them, and store them back to another container in blob storage. I want to achieve this flow using Python Azure Functions, but I am unable to write the code for the inbound and outbound bindings.
Please help. Thanks
Below is the code that worked for me. I'm using an HTTP trigger to achieve your requirement.
function.json
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "authLevel": "anonymous",
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "methods": [
        "get",
        "post"
      ]
    },
    {
      "type": "blob",
      "direction": "in",
      "name": "inputblob",
      "path": "input/input.csv",
      "connection": "storageacc_STORAGE"
    },
    {
      "type": "blob",
      "direction": "out",
      "name": "outputblob",
      "path": "output/output.csv",
      "connection": "storageacc_STORAGE"
    }
  ]
}
__init__.py
import csv
import logging
import azure.functions as func


def main(req: func.HttpRequest, inputblob: func.InputStream, outputblob: func.Out[bytes]) -> func.InputStream:
    # Reading from the input binding
    input_file = inputblob.read()
    # Processing the csv file
    # Writing to output binding
    outputblob.set(input_file)
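The "# Processing the csv file" step above is left empty; as a minimal sketch (my own illustration, not part of the original answer), the bytes from the input binding could be parsed with the standard csv module, filtered, and re-serialized before being written to the output binding:

import csv
import io
import logging
import azure.functions as func


def main(req: func.HttpRequest, inputblob: func.InputStream, outputblob: func.Out[bytes]) -> None:
    # Decode the blob bytes and parse them as CSV rows.
    text = inputblob.read().decode("utf-8")
    rows = list(csv.reader(io.StringIO(text)))

    # Hypothetical processing step: keep the header plus rows with a non-empty first column.
    header, body = rows[0], rows[1:]
    kept = [header] + [r for r in body if r and r[0].strip()]
    logging.info("Keeping %d of %d data rows", len(kept) - 1, len(body))

    # Re-serialize and write the result to the output blob binding.
    out = io.StringIO()
    csv.writer(out).writerows(kept)
    outputblob.set(out.getvalue().encode("utf-8"))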
local.settings.json
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "",
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "storageacc_STORAGE": "<Your_Connection_String>"
  }
}
RESULTS:

Excel file is not triggering Azure function

I have set up a very simple Azure Function that should be triggered whenever an Excel file is dropped into my bucket. However, nothing is happening. Any suggestions?
Here is my function.json
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "myblob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "excel/{name}.xlsx",
      "connection": "jbucket123_STORAGE"
    }
  ]
}
Here is my __init__.py file. Any recommendations?
import logging
import pandas as pd
import azure.functions as func


def main(myblob: func.InputStream):
    logging.info(f"Python blob trigger function processed blob \n"
                 f"Name: {myblob.name}\n"
                 f"Blob Size: {myblob.length} bytes")
    df = pd.read_excel(myblob.read())
    logging.info(f"{df}")
Your configuration looks correct; a few things to double-check:
Do you use a "general-purpose" storage account? Blob triggers are not supported for the legacy "BlobStorage" account types.
Does a container "excel" exist in your storage account and are you putting your file into it? Does your file end with .xlsx (and not e.g. .xls)?
If you test locally, is the storage account connection string stored in local.settings.json like so:
{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "AzureWebJobsStorage": "DefaultEndpointsProtocol=https;AccountName=[...]",
    "jbucket123_STORAGE": "DefaultEndpointsProtocol=https;AccountName=[...]"
  }
}
If you run the function in Azure, is the "jbucket123_STORAGE" property set under "Configuration > Application settings"? Did you put all dependencies (e.g. pandas, openpyxl) into your requirements.txt before you published the function?
You might also want to check the "Log stream" of your function to get more details about what it is doing.
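For the dependency point above, a minimal requirements.txt for this function might look like the following (package names inferred from the imports; pinning exact versions is up to you):

azure-functions
pandas
openpyxl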

Http Trigger-Python-Log Specific Message Constantly and not URL body or payload

Building on an earlier question: the following code is an HTTP trigger that listens to edits and updates on a GIS layer and logs the URL payload to the queue. Instead of the payload, I want to log a specific, repetitive message so that it gets overwritten every time, because I do not want to dequeue every now and then. How can I go about this?
import logging
import azure.functions as func


def main(req: func.HttpRequest, msg: func.Out[str]) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    input_msg = req.params.get('message')
    logging.info(input_msg)
    msg.set(req.get_body())
    return func.HttpResponse(
        "This is a test.",
        status_code=200
    )
function.json
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "authLevel": "anonymous",
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "methods": [
        "get",
        "post"
      ]
    },
    {
      "type": "http",
      "direction": "out",
      "name": "$return"
    },
    {
      "type": "queue",
      "direction": "out",
      "name": "msg",
      "queueName": "outqueue1",
      "connection": "AzureStorageQueuesConnectionString"
    }
  ]
}
Instead of the payload, I want to log a specific, repetitive message so that it gets overwritten every time, because I do not want to dequeue every now and then.
No, when you put in the same message, it will not be overwritten; it is just added as another message in the queue storage.
If you want to process the messages in the queue, use the queue client directly or use a queue trigger in Azure Functions (the function queue trigger is built on the queue client; they are basically the same).
This is the API reference for the queue storage SDK:
https://learn.microsoft.com/en-us/python/api/azure-storage-queue/azure.storage.queue?view=azure-python
You can use it to process message in the queue with python code.
And this is the queue trigger for Azure Functions (already integrated, it can be used directly):
https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-queue-trigger?tabs=python
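As a rough sketch of the queue-trigger option (my own illustration based on the linked docs; the queue name and connection setting are copied from the function.json above), a second function could drain outqueue1 like this:

function.json:
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "type": "queueTrigger",
      "direction": "in",
      "name": "msg",
      "queueName": "outqueue1",
      "connection": "AzureStorageQueuesConnectionString"
    }
  ]
}

__init__.py:
import logging
import azure.functions as func


def main(msg: func.QueueMessage) -> None:
    # Each message is dequeued and handled here, so nothing piles up in the queue.
    logging.info("Dequeued message: %s", msg.get_body().decode("utf-8"))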

How to set the output bindings for name and location in an Azure function using python?

I can't quite seem to get the output bindings to enable a file to be saved to blob storage. I have created an Azure Function in Python that uses a Cosmos DB change feed trigger, and I need to save that document to blob storage.
I've set up the function.json file as follows:
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "type": "cosmosDBTrigger",
      "name": "documents",
      "direction": "in",
      "leaseCollectionName": "leases",
      "connectionStringSetting": "cosmos_dev",
      "databaseName": "MyDatabase",
      "collectionName": "MyCollection",
      "createLeaseCollectionIfNotExists": "true"
    },
    {
      "type": "blob",
      "direction": "out",
      "name": "outputBlob",
      "path": "raw/changefeedOutput/{blobname}",
      "connection": "blobStorageConnection"
    }
  ]
}
So the trigger will get documents like the following:
{ "id": "documentId-12345",
other sections here
"entity": "customer"
}
In the __init__.py file I have the base code of:
def main(documents: func.DocumentList) -> func.Document:
    logging.info(f"CosmosDB trigger executed!")
    for doc in documents:
        blobName = doc['id'] + '.json'
        blobFolder = doc['entity']
        blobData = doc.to_json()
I think I need to add something like 'outputBlob: func.Out' to the def, but I am unsure how to proceed.
Looking at the examples on GitHub
https://github.com/yokawasa/azure-functions-python-samples/tree/master/v2functions/blob-trigger-watermark-blob-out-binding
it looks like I have to call
outputBlob.set(something)
So I'm looking for how to set up the def part and send the blob to the location I've set, using the data in the Cosmos DB document.
I have tried the following:
def main(documents: func.DocumentList, outputBlob: func.Out[str]) -> func.Document:
    logging.info(f"CosmosDB trigger executed!")
    for doc in documents:
        blobName = doc['id'] + '.json'
        outputBlob.set(blobName)
and get the result:
CosmosDB trigger executed!
Executed 'Functions.CosmosTrigger_py' (Failed, Id=XXXXX)
System.Private.CoreLib: Exception while executing function: Functions.CosmosTrigger_py. Microsoft.Azure.WebJobs.Host: No value for named parameter 'blobname'.
I could just pull the connection string from os.environ instead, I think, and use the standard create_blob_from_text with location, name and blob data:
block_blob_service.create_blob_from_text(blobLocation, blobName, formattedBlob)
Any pointers would be great
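For reference, a minimal sketch of that SDK fallback (my own illustration; it assumes the legacy azure-storage-blob 2.x BlockBlobService API, a container named raw, and a connection string exposed through an app setting named blobStorageConnection):

import logging
import os

import azure.functions as func
from azure.storage.blob import BlockBlobService  # legacy azure-storage-blob 2.x


def main(documents: func.DocumentList) -> None:
    logging.info("CosmosDB trigger executed!")

    # Build the client from a connection string stored in application settings.
    block_blob_service = BlockBlobService(connection_string=os.environ["blobStorageConnection"])

    for doc in documents:
        # Use the entity as the folder and the document id as the file name.
        blob_name = doc["entity"] + "/" + doc["id"] + ".json"
        block_blob_service.create_blob_from_text("raw", blob_name, doc.to_json())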

Trying to upload a file with ftp using azure functions

I am trying to send a file using the external file protocol and an FTP API connection. The configuration and code are straightforward and the app runs successfully; however, no data is sent to the FTP server and I cannot see any trace that the function even tried to send data over FTP. What is wrong? And, more important, where can I monitor the progress of the external file API?
My code follows (Note: I have tried Stream and string as input and output)
run.csx
public static void Run(Stream myBlobInput, string name, out Stream myFTPOutput, TraceWriter log)
{
    myFTPOutput = myBlobInput;
    //log.Info($"C# Blob trigger function Processed blob\n Name:{name} \n Content:{myBlob}");
    log.Info($"C# Blob trigger function Processed blob\n Name:{name} \n Size:{myBlobInput.Length} \n Content:{myBlobInput.ToString()}");
}
function.json
"bindings": [
{
"name": "myBlobInput",
"type": "blobTrigger",
"direction": "in",
"path": "input/{name}",
"connection": "blob_STORAGE"
},
{
"name": "myFTPOutput",
"type": "apiHubFile",
"direction": "out",
"path": "/output/{name}",
"connection": "ftp_FTP"
}
],
"disabled": false
}
I could make it work. If we want the same file content on the output FTP server and the same file name, then here is the code and function.json:
public static void Run(string myBlob, string name, TraceWriter log, out string outputFile)
{
    log.Info($"2..C# Blob trigger function Processed blob\n Name:{name} \n Size:{myBlob.Length} \n Content:{myBlob.ToString()}");
    outputFile = myBlob;
}
Also here is the function.json
{
  "bindings": [
    {
      "name": "myBlob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "myblobcontainer/{name}",
      "connection": "AzureWebJobsDashboard"
    },
    {
      "type": "apiHubFile",
      "name": "outputFile",
      "path": "LogFiles/{name}",
      "connection": "ftp_FTP",
      "direction": "out"
    }
  ],
  "disabled": false
}
The input binding should have a valid container name from the blob storage account at the start of its path (here, myblobcontainer).
In the output binding for FTP, the path should be a folder in the root of the FTP server (what you see in the FTP login UI/console) followed by the file name, in this case {name}, which lets us keep the same output file name as the input blob name.
OK, so I changed the FTP connection to some other server and it works like a charm. That means the original server was refusing the Azure Function at the firewall. The sad thing is that no error message is triggered that I can spot. Thanks for all the support.
