I have set up a very simple Azure Function that should be triggered whenever an Excel file is dropped into my container. However, nothing is happening. Any suggestions?
Here is my function.json:
{
"scriptFile": "__init__.py",
"bindings": [
{
"name": "myblob",
"type": "blobTrigger",
"direction": "in",
"path": "excel/{name}.xlsx",
"connection": "jbucket123_STORAGE"
}
]
}
Here is my __init__.py file. Any recommendations?
import logging
import pandas as pd
import azure.functions as func
def main(myblob: func.InputStream):
    logging.info(f"Python blob trigger function processed blob \n"
                 f"Name: {myblob.name}\n"
                 f"Blob Size: {myblob.length} bytes")
    df = pd.read_excel(myblob.read())
    logging.info(f"{df}")
Your configuration looks correct; here are a few things to double-check:
Do you use a "general-purpose" storage account? Blob triggers are not supported for the legacy "BlobStorage" account types.
Does a container "excel" exist in your storage account and are you putting your file into it? Does your file end with .xlsx (and not e.g. .xls)?
If you test locally, is the storage account connection string stored in local.settings.json like so:
{
"IsEncrypted": false,
"Values": {
"FUNCTIONS_WORKER_RUNTIME": "python",
"AzureWebJobsStorage": "DefaultEndpointsProtocol=https;AccountName=[...]",
"jbucket123_STORAGE": "DefaultEndpointsProtocol=https;AccountName=[...]"
}
}
If you run the function in Azure, is the "jbucket123_STORAGE" property set under "Configuration > Application settings"? Did you put all dependencies (e.g. pandas, openpyxl) into your requirements.txt before you published the function?
You might also want to check the "Log stream" of your function to get more details about what it is doing.
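Related to the dependency point above: if the trigger does fire but reading the workbook fails, a minimal sketch of the handler (assuming openpyxl is listed in requirements.txt; wrapping the payload in a BytesIO is just one way to hand it to pandas) could look like this:
import io
import logging
import pandas as pd
import azure.functions as func

def main(myblob: func.InputStream):
    logging.info("Processing blob %s (%s bytes)", myblob.name, myblob.length)
    # Wrap the raw bytes in a file-like object; openpyxl is required for .xlsx files.
    df = pd.read_excel(io.BytesIO(myblob.read()), engine="openpyxl")
    logging.info("Loaded %d rows and %d columns", len(df), len(df.columns))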
I want to create an Azure Function using Python which will read data from an Azure Event Hub.
Fortunately, Visual Studio Code provides a way to create an Azure Functions skeleton that can be edited according to the requirements.
I am able to create a demo HTTP trigger Azure Function with the help of the Microsoft documentation, but I don't know what changes I should make in the function below so that it can read data from the Event Hub and write it to Azure Blob Storage.
Also, it would help if someone could suggest a blog to get more details on Azure Functions and standard practices.
UPDATE:
I tried to update my code based on the suggestion from @Stanley, but it possibly still needs further changes.
I have written the following code in my Azure Function.
local.settings.json
{
"IsEncrypted": false,
"Values": {
"AzureWebJobsStorage": "Storage account connection string",
"FUNCTIONS_WORKER_RUNTIME": "python",
"EventHub_ReceiverConnectionString": "Endpoint Connection String of the EventHubNamespace",
"Blob_StorageConnectionString": "Storage account connection string"
}
}
function.json
{
"scriptFile": "__init__.py",
"bindings": [
{
"authLevel": "function",
"type": "eventHubTrigger",
"direction": "in",
"name": "event",
"eventHubName": "pwo-events",
"connection": "EventHub_ReceiverConnectionString",
"cardinality": "many",
"consumerGroup": "$Default",
"dataType": "binary"
}
]
}
__init__.py
import logging
import azure.functions as func
from azure.storage.blob import BlobClient
storage_connection_string='Storage account connection string'
container_name = ''
def main(event: func.EventHubEvent):
    logging.info(f'Function triggered to process a message: {event.get_body().decode()}')
    logging.info(f'  SequenceNumber = {event.sequence_number}')
    logging.info(f'  Offset = {event.offset}')
    blob_client = BlobClient.from_connection_string(
        storage_connection_string, container_name, str(event.sequence_number) + ".txt")
    blob_client.upload_blob(event.get_body().decode())
The following is a screenshot of my blob container:
After executing the above code, something got written to the blob container, but instead of a .txt file it got saved in some other format. Also, if I trigger the Azure Function multiple times, the files get overwritten.
I want to perform an append operation instead of an overwrite.
Also, I want to save my file in a user-defined location, for example: container/Year=/month=/date=
Thanks !!
If you want to read data from the Azure Event Hub, using the Event Hub trigger will be much easier. This is my test code (read data and write it into storage):
import logging
import azure.functions as func
from azure.storage.blob import BlobClient
import datetime
storage_connection_string=''
container_name = ''
today = datetime.datetime.today()
def main(event: func.EventHubEvent):
    logging.info(f'Function triggered to process a message: {event.get_body().decode()}')
    logging.info(f'  SequenceNumber = {event.sequence_number}')
    logging.info(f'  Offset = {event.offset}')
    blob_client = BlobClient.from_connection_string(
        storage_connection_string, container_name,
        str(today.year) + "/" + str(today.month) + "/" + str(today.day) + ".txt")
    blob_client.upload_blob(event.get_body().decode(), blob_type="AppendBlob")
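If you prefer the Year=/month=/date= layout mentioned in the question, the blob name can be built accordingly; a small sketch (the events.txt file name is just an assumed example, since "folders" in Blob Storage are simply part of the blob name):
import datetime

today = datetime.datetime.today()
blob_name = f"Year={today.year}/month={today.month:02d}/date={today.day:02d}/events.txt"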
I use the code below to send events to the event hub:
import asyncio
from azure.eventhub.aio import EventHubProducerClient
from azure.eventhub import EventData
async def run():
    # Create a producer client to send messages to the event hub.
    # Specify a connection string to your event hubs namespace and
    # the event hub name.
    producer = EventHubProducerClient.from_connection_string(conn_str="<conn string>", eventhub_name="<hub name>")
    async with producer:
        # Create a batch.
        event_data_batch = await producer.create_batch()

        # Add events to the batch.
        event_data_batch.add(EventData('First event '))
        event_data_batch.add(EventData('Second event'))
        event_data_batch.add(EventData('Third event'))

        # Send the batch of events to the event hub.
        await producer.send_batch(event_data_batch)

loop = asyncio.get_event_loop()
loop.run_until_complete(run())
My local.settings.json:
{
"IsEncrypted": false,
"Values": {
"AzureWebJobsStorage": "<storage account conn str>",
"FUNCTIONS_WORKER_RUNTIME": "python",
"testhubname0123_test_EVENTHUB": "<event hub conn str>"
}
}
My function.json is just as this doc indicated:
{
"scriptFile": "__init__.py",
"bindings": [{
"type": "eventHubTrigger",
"name": "event",
"direction": "in",
"eventHubName": "test01(this is my hubname, pls palce yours here)",
"connection": "testhubname0123_test_EVENTHUB"
}]
}
Result
Run the function and send data to the event hub using the code above:
Data has been saved into storage successfully:
Download the .txt file and check its content; we can see that the content of all 3 events has been written:
I started playing around with Azure Functions and am running into the issue that my Function is not being triggered by events entering my Event Hub.
This is the code for my Function:
host.json:
"version": "2.0",
"logging": {
"applicationInsights": {
"samplingSettings": {
"isEnabled": true,
"excludedTypes": "Request"
}
}
},
"extensionBundle": {
"id": "Microsoft.Azure.Functions.ExtensionBundle",
"version": "[2.*, 3.0.0)"
}
}
function.json:
"scriptFile": "__init__.py",
"bindings": [
{
"type": "eventHubTrigger",
"name": "events",
"direction": "in",
"eventHubName": "eventhub",
"connection": "eventhub_connection",
"cardinality": "many",
"consumerGroup": "$Default",
"dataType": "stream"
}
]
}
__init__.py:
import logging
import azure.functions as func
def main(events: List[func.EventHubEvent]):
    for event in events:
        logging.info('Python EventHub trigger processed an event: %s',
                     event.get_body().decode('utf-8'))
        logging.info(f'Function triggered to process a message: {event.get_body().decode()}')
        logging.info(f'  EnqueuedTimeUtc = {event.enqueued_time}')
        logging.info(f'  SequenceNumber = {event.sequence_number}')
        logging.info(f'  Offset = {event.offset}')

# def main(event: func.EventHubEvent):
#     logging.info(f'Function triggered to process a message: {event.get_body().decode()}')
#     logging.info(f'  EnqueuedTimeUtc = {event.enqueued_time}')
#     logging.info(f'  SequenceNumber = {event.sequence_number}')
#     logging.info(f'  Offset = {event.offset}')
#
#     # Metadata
#     for key in event.metadata:
#         logging.info(f'Metadata: {key} = {event.metadata[key]}')
"IsEncrypted": false,
"Values": {
"FUNCTIONS_WORKER_RUNTIME": "python",
"AzureWebJobsStorage": "DefaultEndpointsProtocol=https;AccountName=storageaccount;AccountKey=storageacciuntaccesskey=;EndpointSuffix=core.windows.net",
"eventhub_connection": "Endpoint=sb://eventhub01.servicebus.windows.net/;SharedAccessKeyName=function;SharedAccessKey=0omitted;EntityPath=eventhub"
}
}
I started out with the basic Event Hub Python code provided by the Azure Functions Core Tools, and have been testing different pieces of code found in online examples from people's blogs and the Microsoft docs.
When switching to cardinality: one, I switch to the code which is currently commented out. I don't know if that is how it's supposed to be done; it just feels right to me.
In any case, regardless of the cardinality setting, or the dataType being changed between binary, stream, or string, my Function simply does not trigger.
I can query my Event Hub and see/read the events, so I know my policy, shared key, and such work fine. I am also only using the $Default consumer group.
I also tried setting up an HTTP-triggered function, and this function gets triggered from Azure Monitor. I can see each request entering the function in the logs.
Am I doing something wrong in the code for my Event Hub function?
Am I missing some other configuration setting, perhaps? I already checked the access rules on the function, but that really doesn't matter, does it? The function is pulling the events from the Event Hub; it's not being sent data by an initiator.
Edit: Added the local.settings.json file configuration and updated the function.json
Edit 2: solution to my specific issue is in the comments of the answer.
Update:
__init__.py of the function:
from typing import List
import logging
import azure.functions as func
def main(events: List[func.EventHubEvent]):
    for event in events:
        logging.info('Python EventHub trigger processed an event: %s',
                     event.get_body().decode('utf-8'))
Send message to event hub:
import asyncio
from azure.eventhub.aio import EventHubProducerClient
from azure.eventhub import EventData
async def run():
    # Create a producer client to send messages to the event hub.
    # Specify a connection string to your event hubs namespace and
    # the event hub name.
    producer = EventHubProducerClient.from_connection_string(conn_str="Endpoint=sb://testbowman.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxxxx;EntityPath=test", eventhub_name="test")
    async with producer:
        # Create a batch.
        event_data_batch = await producer.create_batch()

        # Add events to the batch.
        event_data_batch.add(EventData('First event '))
        event_data_batch.add(EventData('Second event'))
        event_data_batch.add(EventData('Third event'))

        # Send the batch of events to the event hub.
        await producer.send_batch(event_data_batch)

loop = asyncio.get_event_loop()
loop.run_until_complete(run())
And please make sure you give the right event hub name:
It seems your function.json has a problem: the connection string should not be put directly in the binding item.
It should be like below:
function.json
{
"scriptFile": "__init__.py",
"bindings": [
{
"type": "eventHubTrigger",
"name": "events",
"direction": "in",
"eventHubName": "test",
"connection": "testbowman_RootManageSharedAccessKey_EVENTHUB",
"cardinality": "many",
"consumerGroup": "$Default",
"dataType": "binary"
}
]
}
local.settings.json
{
"IsEncrypted": false,
"Values": {
"AzureWebJobsStorage": "DefaultEndpointsProtocol=https;AccountName=0730bowmanwindow;AccountKey=xxxxxx;EndpointSuffix=core.windows.net",
"FUNCTIONS_WORKER_RUNTIME": "python",
"testbowman_RootManageSharedAccessKey_EVENTHUB": "Endpoint=sb://testbowman.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxxxx;EntityPath=test"
}
}
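For completeness, the handler signature has to match the cardinality setting in function.json; a minimal sketch of both variants (assuming the binding is named events for "many" and event for "one", as in the configs above):
from typing import List
import logging
import azure.functions as func

# cardinality "many": the function receives a batch of events.
def main(events: List[func.EventHubEvent]):
    for event in events:
        logging.info('Event body: %s', event.get_body().decode('utf-8'))

# cardinality "one": the function receives a single event instead.
# def main(event: func.EventHubEvent):
#     logging.info('Event body: %s', event.get_body().decode('utf-8'))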
Check the configuration of the Function App and the Event Hub. The number of pre-warmed instances of the Function App should be less than or equal to the partition count of the Event Hub. This worked for me, and I was able to receive events properly after this configuration.
I am trying to debug an Azure Function locally using a Blob trigger. When uploading an image file to Azure, the trigger is received by my function running locally.
import logging
import azure.functions as func
from PIL import Image

def main(blobin: func.InputStream, blobout: func.Out[bytes], context: func.Context):
    logging.info(f"Python blob trigger function processed blob \n"
                 f"Name: {blobin.name}\n"
                 f"Blob Size: {blobin.length} bytes")
    image_file = Image.open(blobin)
My function.json
{
"scriptFile": "__init__.py",
"bindings": [
{
"name": "blobin",
"type": "blobTrigger",
"direction": "in",
"path": "uploads/{name}",
"connection": "STORAGE"
},
{
"name": "blobout",
"type": "blob",
"direction": "out",
"path": "uploads/{blob_name}_resized.jpg",
"connection": "STORAGE"
}
]
}
The error I get when the Image.open(blobin) line runs is :
System.Private.CoreLib: Exception while executing function:
Functions.ResizePhoto. System.Private.CoreLib: Result: Failure
Exception: UnidentifiedImageError: cannot identify image file
<_io.BytesIO object at 0x0000017FD4FD7F40>
The interesting thing is that the image itself does open in the VS Code watch window, but fails when the code is run. It also gives the same error as above if I add it to the watch again (probably triggering a watch refresh).
If you want to resize an image and then save it with a blob-triggered function, try the code below:
import logging
from PIL import Image
import azure.functions as func
import tempfile
import ntpath
import os
def main(blobin: func.InputStream, blobout: func.Out[func.InputStream], context: func.Context):
    logging.info(f"Python blob trigger function processed blob \n"
                 f"Name: {blobin.name}\n"
                 f"Blob Size: {blobin.length} bytes")
    temp_file_path = tempfile.gettempdir() + '/' + ntpath.basename(blobin.name)
    print(temp_file_path)
    image_file = Image.open(blobin)
    image_file.resize((50, 50)).save(temp_file_path)
    blobout.set(open(temp_file_path, "rb").read())
    os.remove(temp_file_path)
function.json :
{
"scriptFile": "__init__.py",
"bindings": [
{
"name": "blobin",
"type": "blobTrigger",
"direction": "in",
"path": "samples-workitems/{name}",
"connection": "STORAGE"
},
{
"name": "blobout",
"type": "blob",
"direction": "out",
"path": "resize/{name}",
"connection": "STORAGE"
}
]
}
Note that you should not store the resized image in the same container, as that leads to an endless loop (the new image triggers the blob trigger and gets resized again and again). Your issue is due to the newly resized image not being outputted correctly, so the exception occurs when Image.open(blobin) runs.
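If for some reason you must keep the input and output in the same container, a hedged workaround (a sketch, not part of the code above; the _resized.jpg suffix is just an assumed naming convention) is to skip blobs the function has already produced:
import logging
import azure.functions as func

def main(blobin: func.InputStream, blobout: func.Out[bytes], context: func.Context):
    # Skip blobs this function wrote itself; otherwise every output
    # retriggers the function and the loop never ends.
    if blobin.name.endswith("_resized.jpg"):
        logging.info("Skipping already-resized blob: %s", blobin.name)
        return
    # ... resize and write the blob as shown above ...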
Anyway, the code above works perfectly for me; see the result below:
Upload a big image:
resize the image and save it to another container:
It turns out setting a breakpoint at the Image.open(blobin) line breaks the function somehow. Removing it from there and adding it to the next line does not prompt the error anymore. Probably Azure doesn't like to wait and times out the stream? Who knows.
I'm trying to create a blob-triggered Azure Function in Python that automatically splits all sheets in a specific Excel file into separate .csv files in the same Azure Blob container. My __init__.py and function.json files look something like this:
function.json file:
{
"scriptFile": "__init__.py",
"bindings": [
{
"name": "blobin",
"type": "blobTrigger",
"direction": "in",
"path": "folder/ExcelFile.xlsx",
"connection": "blobSTORAGE"
},
{
"name": "blobout",
"type": "blob",
"direction": "out",
"path": "folder/{name}",
"connection": "blobSTORAGE"
}
]
}
__init__.py file:
import logging
from xlrd import open_workbook
import csv
import azure.functions as func
def main(blobin: func.InputStream, blobout: func.Out[bytes]):
    logging.info(f"Python blob trigger function processed blob \n")
    try:
        wb = open_workbook('ExcelFile.xlsx')
        for i in range(0, wb.nsheets):
            sheet = wb.sheet_by_index(i)
            print(sheet.name)
            with open("Sheet_%s.csv" % (sheet.name.replace(" ", "")), "w") as file:
                writer = csv.writer(file, delimiter=",")
                print(sheet, sheet.name, sheet.ncols, sheet.nrows)
                header = [cell.value for cell in sheet.row(0)]
                writer.writerow(header)
                for row_idx in range(1, sheet.nrows):
                    row = [int(cell.value) if isinstance(cell.value, float) else cell.value
                           for cell in sheet.row(row_idx)]
                    writer.writerow(row)
            blobout.set(file.read)
            logging.info(f"Split sheet %s into CSV successfully.\n" % (sheet.name))
    except:
        print("Error")
I tried to run the pure Python code on my PC without the Azure Function wrapper and it succeeded. However, when I deployed the function onto Azure, it does not "trigger" when I try to upload the Excel file. I am thinking that the config I put in is wrong, but I don't know how to confirm or fix it. Any suggestions?
Since you don't have a problem locally, I think the problem comes from the blobSTORAGE setting, which has not been set in the environment variables.
Locally, the environment variable is set in the Values section of local.settings.json. But when you deploy the function app, it will get its environment variables from the Function App's Configuration > Application settings:
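For reference, a minimal local.settings.json for local runs might look like the sketch below (assuming blobSTORAGE holds the storage account connection string); when the function runs in Azure, the same blobSTORAGE key has to be added under the Function App's Configuration > Application settings:
{
"IsEncrypted": false,
"Values": {
"FUNCTIONS_WORKER_RUNTIME": "python",
"AzureWebJobsStorage": "DefaultEndpointsProtocol=https;AccountName=[...]",
"blobSTORAGE": "DefaultEndpointsProtocol=https;AccountName=[...]"
}
}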
I have an activity function that should store a message in Blob storage. I can overwrite a file in blob storage, but I need to store the data under a different name. How can I do that? Azure Functions doesn't support dynamic binding in Node.js.
I found one workaround; see whether it's useful.
Along with the blob output binding, there's an activity trigger to receive the message msg; we can put a self-defined blob name in msg for the blob binding path to consume.
In your orchestrator function which calls the activity function:
yield context.df.callActivity("YourActivity", {'body':'messagecontent','blobName':'myblob'});
Then the activity function code should be modified:
context.bindings.myOutputBlob = context.bindings.msg.body;
And its function.json can use blobName as expected:
{
"bindings": [
{
"name": "msg",
"type": "activityTrigger",
"direction": "in"
},
{
"name":"myOutputBlob",
"direction": "out",
"type": "blob",
"connection": "AzureWebJobsStorage",
"path": "azureblob/{blobName}"
}
],
"disabled": false
}