How can I automatically get the event when a file is uploaded to AWS S3? - node.js

I'm using Node.js with SNS on AWS and SQS to handle queues. How can I know when a file has been uploaded to S3 so that a message is automatically sent to my Node.js app via SNS? Sorry, my English is not good.

You can do that by configuring an S3 event notification that publishes to SNS (or SQS, or Lambda) whenever a file is uploaded to the bucket:
https://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html
require 'aws-sdk-s3' # v2: require 'aws-sdk'
req = {}
req[:bucket] = bucket_name
events = ['s3:ObjectCreated:*']
notification_configuration = {}
# Add function
lc = {}
lc[:lambda_function_arn] = 'my-function-arn'
lc[:events] = events
lambda_configurations = []
lambda_configurations << lc
notification_configuration[:lambda_function_configurations] = lambda_configurations
# Add queue
qc = {}
qc[:queue_arn] = 'my-queue-arn'
qc[:events] = events
queue_configurations = []
queue_configurations << qc
notification_configuration[:queue_configurations] = queue_configurations
# Add topic
tc = {}
tc[:topic_arn] = 'my-topic-arn'
tc[:events] = events
topic_configurations = []
topic_configurations << tc
notification_configuration[:topic_configurations] = topic_configurations
req[:notification_configuration] = notification_configuration
req[:use_accelerate_endpoint] = false
s3 = Aws::S3::Client.new(region: 'us-west-2')
s3.put_bucket_notification_configuration(req)
You can refer to this example code as well (it uses the AWS SDK for Ruby, but the same notification configuration can be applied from any SDK or from the S3 console).
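If you would rather set the same configuration from Python, here is a rough boto3 sketch; the bucket name and topic ARN are placeholders, not values from the original post:

import boto3

s3 = boto3.client('s3', region_name='us-west-2')
s3.put_bucket_notification_configuration(
    Bucket='my-bucket',  # placeholder bucket name
    NotificationConfiguration={
        'TopicConfigurations': [{
            'TopicArn': 'arn:aws:sns:us-west-2:123456789012:my-topic',  # placeholder topic ARN
            'Events': ['s3:ObjectCreated:*'],
        }],
    },
)

Once the notification is in place, the SNS topic can fan out to the SQS queue that your Node.js worker polls.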

Related

AWS SQS FIFO Received message was empty

I'm learning AWS SQS and I've sent 1 message to a FIFO queue. But when I try to receive messages, I can't get the message.
What is my mistake?
Sender's code (Lambda function):
# Imports assumed by this snippet; 'URL' is defined elsewhere in the handler (not shown).
import datetime
import time
from urllib.parse import quote

import boto3

XURL = quote(URL)
client = boto3.client('sqs')
url_fifo = 'https://sqs.xxxxxx.com/xxxxxx/que.fifo'
now = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
response = client.send_message(
    QueueUrl=url_fifo,
    MessageBody=f'URL: {XURL}, EventTime: {now}',
    MessageDeduplicationId=str(time.time_ns()),
    MessageGroupId='Group1'
)
Receiver's code (Lambda function):
import boto3  # import assumed by this snippet (not shown in the original post)

name = 'que.fifo'
sqs = boto3.resource('sqs')
url_fifo = 'https://sqs.xxxxxx.amazonaws.com/xxxxxxx/que.fifo'
queue = sqs.get_queue_by_name(QueueName=name)
msg_list = queue.receive_messages(QueueUrl=url_fifo, MaxNumberOfMessages=1)
print(msg_list)
if msg_list:
    for message in msg_list:
        print(message.body)
        URL = message.body
        message.delete()
else:
    print("no message")
Result log:
2023-02-07T13:47:50.326+09:00 []
2023-02-07T13:47:50.326+09:00 no message
You are calling the Queue resource's receive_messages method, but passing the parameters expected by the low-level client's receive_message method; the two are documented in different sections of the boto3 documentation.
Therefore, your code should be:
client = boto3.client('sqs')
url_fifo = 'https://sqs.xxxxxx.amazonaws.com/xxxxxxx/que.fifo'
msg_list = client.receive_message(QueueUrl=url_fifo, MaxNumberOfMessages=1)
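If you would rather keep the resource-style code from the question, the same advice applies: drop the explicit QueueUrl argument, since the Queue object already knows its own URL. A sketch along those lines, with WaitTimeSeconds added as an assumption to enable long polling:

import boto3

sqs = boto3.resource('sqs')
queue = sqs.get_queue_by_name(QueueName='que.fifo')

# The Queue resource already carries its URL, so only the receive
# parameters are passed; WaitTimeSeconds turns on long polling.
messages = queue.receive_messages(MaxNumberOfMessages=1, WaitTimeSeconds=10)
for message in messages:
    print(message.body)
    message.delete()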

Problems with concurrent tasks in FastAPI

Dear community,
I've tried to execute an async def in a FastAPI app.
The workflow is: create a FastAPI service that receives requests from end users and forwards them to another service, such as a DB writer service.
First, I created an async def that sends the request with the aiosonic library.
Here's the code:
import aiosonic
from aiosonic.timeout import Timeouts

async def db_writer_requests(arrival_input, prediction_out):
    client = aiosonic.HTTPClient()
    timeouts_settings = Timeouts(sock_connect=10, sock_read=10)
    await client.post('http://127.0.0.1:8082/api/motor/writer/test1',
                      headers={'Content-Type': 'application/json'},
                      json=arrival_input,
                      timeouts=timeouts_settings)
    client.shutdown()
Here's the main app:
@app.post('/api/motor/test')
async def manager_api(raw_input: arrival_requests):
    depart_json = dict(raw_input)
    inc_preds, model_error = await predict_requests(depart_json)
    if (inc_preds == None) or (inc_preds['status_code'] != 200):
        return inc_preds if model_error == None else model_error
    else:
        mlid = uuid.uuid4()
        pred_output = model_output(
            predict_uuid=str(mlid),
            net_premium=str(inc_preds["result"]["data"]["net_premium"]),
            car_repair=str(inc_preds["result"]["data"]["car_repair"]),
            place_repair=str(inc_preds["result"]["data"]["place_repair"]),
            insurer_tier=str(inc_preds["result"]["data"]["insurer_tier"])
        )
        send_msg = good_response_msg(
            status='OK',
            status_code=200,
            result=data(
                data=pred_output
            )
        )
        await db_writer_requests(depart_json, dict(pred_output))
        return send_msg
When I try to send a request:
Case 1: if I don't use "await", the program does not send the request to the writer service and nothing shows up on the endpoint service.
Case 2: if I use await, it works normally, but if the endpoint service is not available the main service returns "Internal Server Error".
Thanks in advance.
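Case 1 behaves that way because calling an async def without await only creates a coroutine object; it is never actually scheduled or run. For case 2, here is a minimal sketch of one way to keep the endpoint from returning "Internal Server Error" when the writer service is unreachable; the wrapper name, the broad except clause, and the use of asyncio.create_task are illustrative assumptions, not part of the original code:

import asyncio
import logging

async def db_writer_requests_safe(arrival_input, prediction_out):
    # Wrap the original coroutine so a failed connection to the writer
    # service is logged instead of bubbling up as a 500 from manager_api.
    try:
        await db_writer_requests(arrival_input, prediction_out)
    except Exception as exc:
        logging.warning("DB writer service unavailable: %s", exc)

# Inside manager_api, scheduling the wrapper as a task (instead of awaiting it
# inline) lets the response return without waiting on the writer service:
#     asyncio.create_task(db_writer_requests_safe(depart_json, dict(pred_output)))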

EventHub and Receive

All,
I modified the sample Receive Python script for Azure Event Hubs a bit, but when I run it, it goes into a loop fetching the same events over and over. I'm not sending any events to the event hub, since I only want to read what is already there, and I don't see a while loop here. How is this happening, and how do I stop after it has read all the events currently in the event hub?
Thanks
grajee
# https://learn.microsoft.com/en-us/python/api/overview/azure/eventhub-readme?view=azure-python#consume-events-from-an-event-hub
import logging
from azure.eventhub import EventHubConsumerClient
connection_str = 'Endpoint=sb://testhubns01.servicebus.windows.net/;SharedAccessKeyName=getevents;SharedAccessKey=testtestest='
consumer_group = '$Default'
eventhub_name = 'testpart'
client = EventHubConsumerClient.from_connection_string(connection_str, consumer_group, eventhub_name=eventhub_name)
logger = logging.getLogger("azure.eventhub")
logging.basicConfig(level=logging.INFO)
def on_event(partition_context, event):
    logger.info("Received event from partition: \"{}\" : \"{}\"".format(partition_context.partition_id, event.body_as_str()))
    partition_context.update_checkpoint(event)

with client:
    client.receive(
        on_event=on_event,
        starting_position="-1",  # "-1" is from the beginning of the partition.
    )
    # receive events from specified partition:
    # client.receive(on_event=on_event, partition_id='0')
client.close()
The piece of code below, from the Azure Event Hubs SDK sample, makes it clearer: it adds a BlobCheckpointStore, so the consumer persists its position (checkpoints) in blob storage and does not start again from the beginning of the partition on every run.
import asyncio
from azure.eventhub.aio import EventHubConsumerClient
from azure.eventhub.extensions.checkpointstoreblobaio import BlobCheckpointStore

connection_str = '<< CONNECTION STRING FOR THE EVENT HUBS NAMESPACE >>'
consumer_group = '<< CONSUMER GROUP >>'
eventhub_name = '<< NAME OF THE EVENT HUB >>'
storage_connection_str = '<< CONNECTION STRING FOR THE STORAGE >>'
container_name = '<< NAME OF THE BLOB CONTAINER >>'

async def on_event(partition_context, event):
    # do something
    await partition_context.update_checkpoint(event)  # Or update_checkpoint every N events for better performance.

async def receive(client):
    await client.receive(
        on_event=on_event,
        starting_position="-1",  # "-1" is from the beginning of the partition.
    )

async def main():
    checkpoint_store = BlobCheckpointStore.from_connection_string(storage_connection_str, container_name)
    client = EventHubConsumerClient.from_connection_string(
        connection_str,
        consumer_group,
        eventhub_name=eventhub_name,
        checkpoint_store=checkpoint_store,  # For load balancing and checkpointing. Leave None for no load balancing.
    )
    async with client:
        await receive(client)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

How to publish and subscribe to a .pdf file in Google Pub/Sub (GCP)

In the code below, a big .pdf file is split into single pages, and each page is uploaded to a bucket and enqueued to Pub/Sub at the same time.
import glob
import os

from google.cloud import pubsub_v1, storage

def publish_messages(project_id, topic_id, enqueue_file):
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project_id, topic_id)
    data = enqueue_file
    # Data must be a bytestring
    data = data.encode("utf-8")
    # When you publish a message, the client returns a future.
    future = publisher.publish(topic_path, data=data)
    print(future.result())
    print(enqueue_file + " has been enqueued to Pub/Sub.")

def upload_local_directory_to_gcs(local_path, bucket, gcs_path):
    assert os.path.isdir(local_path)
    for local_file in glob.glob(local_path + '/**'):
        if not os.path.isfile(local_file):
            continue
        remote_path = os.path.join(gcs_path, local_file[1 + len(local_path):])
        storage_client = storage.Client()
        buck = storage_client.bucket(bucket)
        blob = buck.blob(remote_path)
        blob.upload_from_filename(local_file)
        print("Uploaded " + local_file + " to gs bucket " + bucket)
        publish_messages("Project1", "my-topic", local_file)
I receive messages using the code below:
def receive_messages(project_id, subscription_id, timeout=None):
    from concurrent.futures import TimeoutError
    from google.cloud import pubsub_v1

    subscriber = pubsub_v1.SubscriberClient()
    subscription_path = subscriber.subscription_path(project_id, subscription_id)

    def callback(message):
        print("Received message: {}".format(message))
        message.ack()

    streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
    print("Listening for messages on {}..\n".format(subscription_path))

    with subscriber:
        try:
            streaming_pull_future.result(timeout=timeout)
        except TimeoutError:
            streaming_pull_future.cancel()

if __name__ == "__main__":
    receive_messages("Project1", "my-sub")
But when I receive, I get just the string data:
Received message: Message {
  data: b'/tmp/doc_pages/document-page17.pdf'
  ordering_key: ''
  attributes: {}
}
My idea is to get that pdf file and perform some OCR operations on it using the Vision API. Is it possible to get the pdf file itself? If there is another methodology, please let me know.
Thanks!
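One common pattern, sketched below under a few assumptions: keep publishing only the object path (Pub/Sub messages are capped at 10 MB, so whole PDFs are usually not sent through the topic), publish the GCS object name (remote_path) rather than the local path, and have the subscriber download the page from the bucket before calling the Vision API. The bucket name and the switch to publishing remote_path are assumptions, not part of the original code:

from google.cloud import storage

BUCKET_NAME = "my-bucket"  # assumption: the same bucket the uploader writes to

def callback(message):
    # Assumes the publisher sends the GCS object name (remote_path) instead of
    # the local /tmp path, so the subscriber can locate the uploaded page.
    object_name = message.data.decode("utf-8")
    blob = storage.Client().bucket(BUCKET_NAME).blob(object_name)
    pdf_bytes = blob.download_as_bytes()
    # ... hand pdf_bytes (or the gs:// URI) to the Vision API for OCR here ...
    message.ack()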

Get message count from Azure SubscriptionClient with Python

I am new to the Azure SubscriptionClient; I am trying to get the total message count from an Azure subscription with Python.
Please try something like the following:
# Note: SubscriptionClient comes from the older azure-servicebus SDK (pre-7.0).
from azure.servicebus import SubscriptionClient

conn_str = "Endpoint=sb://<service-bus-namespace-name>.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=access-key="
topic_name = "test"
subscription_name = "test"

client = SubscriptionClient.from_connection_string(conn_str, subscription_name, topic_name)
props = client.get_properties()
message_count = props['message_count']
print(message_count)
This worked for me:
from azure.servicebus.management import ServiceBusAdministrationClient

CONNECTION_STR = "<your_connection_string>"
TOPIC_NAME = "<your_topic_name>"
SUBSCRIPTION_NAME = "<your_subscription_name>"

number_of_messages_in_subscription = 0

with ServiceBusAdministrationClient.from_connection_string(CONNECTION_STR) as servicebus_mgmt_client:
    get_subscription_runtime_properties = servicebus_mgmt_client.get_subscription_runtime_properties(TOPIC_NAME, SUBSCRIPTION_NAME)
    number_of_messages_in_subscription = get_subscription_runtime_properties.active_message_count
Source: https://github.com/Azure/azure-sdk-for-python/blob/1709ec7898c87e4369f5324302f274f254857dc3/sdk/servicebus/azure-servicebus/samples/async_samples/mgmt_subscription_async.py
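For reference, the runtime-properties object also exposes other counters, including the total count the question asks about; a short sketch reusing the names above, placed inside the same with block (attribute names per the current azure-servicebus management API):

    props = servicebus_mgmt_client.get_subscription_runtime_properties(TOPIC_NAME, SUBSCRIPTION_NAME)
    print("active:", props.active_message_count)
    print("dead-lettered:", props.dead_letter_message_count)
    print("total:", props.total_message_count)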
