Capture SSE (Server-Sent Events) in Azure

I am having trouble identifying the best "tool" to solve this problem. I am using a Python library that publishes its data via Server-Sent Events (SSE) (see https://github.com/wattsight/wapi-python/blob/development/wapi/events.py).
I would like to listen constantly for new events. However, I am not sure which tool in Azure is appropriate. An Azure Function would have to run continuously, which seems like a misuse; SignalR requires control over the "sender" of events; and I don't know whether Event Hubs would be able to manage that job.
Thank you for letting me learn from your experience.

Azure Event Hubs is the right service for receiving new events. It also provides other benefits such as scalability and event storage.
You can also consider using an Azure Function with an Event Hub trigger, but note that it has a limit on the maximum size of incoming messages.
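One way to connect the two pieces is a small long-running listener that reads the SSE stream and forwards each event to an Event Hub. The sketch below is only an illustration under stated assumptions, not the library's own API: `parse_sse` and `forward_to_event_hub` are hypothetical helper names, the parser covers only the basic `field: value` framing of an event stream, and the forwarding step assumes the `azure-eventhub` package plus a connection string and hub name of your own.

```python
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per SSE event from an iterable of raw text lines."""
    event: Dict[str, str] = {}
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                event["data"] = "\n".join(data)
                yield event
            event, data = {}, []
            continue
        if line.startswith(":"):
            # Comment line, commonly used as a keep-alive.
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "data":
            data.append(value)
        else:
            event[field] = value


def forward_to_event_hub(events: Iterable[Dict[str, str]],
                         connection_string: str, hub_name: str) -> None:
    """Send each parsed event to an Event Hub (assumes azure-eventhub is installed)."""
    # Imported lazily so the parser above works without the Azure SDK.
    from azure.eventhub import EventData, EventHubProducerClient

    producer = EventHubProducerClient.from_connection_string(
        connection_string, eventhub_name=hub_name
    )
    with producer:
        for event in events:
            batch = producer.create_batch()
            batch.add(EventData(event["data"]))
            producer.send_batch(batch)
```

The listener still has to run somewhere that stays up; the sketch only covers parsing and forwarding, not hosting or reconnecting.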

Related

Send messages to clients using Azure Service Bus - Topics

Using Azure Service Bus Topics, I want to implement a solution in which end consumers are notified as soon as the producer sends a message to the Topic (as with Queues).
I understand that Topics follow a pub/sub model in which subscribers need to read messages from subscriptions. But I'm looking for a workaround that works somewhat like a Queue (where a WebJob or service is triggered whenever a message is received).
I have a few thoughts:
1. Use Auto-Forwarding in subscriptions to forward messages to Queues, but I wonder whether this defeats the purpose of Topics.
2. Schedule a job to process these requests, but I worry that this delays processing.
First, I want to know whether a Service Bus Topic is the right option to go with. Next, if a workaround is possible, what is the best way to implement it?
PS: I have to send messages that carry information, so I guess I can't use Relays.
Just to be clear, Queues and Topics in Service Bus are different. As you noted, Topics are useful in publish/subscribe scenarios.
Since you are looking for something that gets triggered, Azure Functions might be what you need.
Azure Functions supports trigger and output bindings for Service Bus queues and topics:
https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-service-bus
I think @William is right: you can attach another process to the subscription to do what you are trying to do.
He mentioned Azure Functions, which is a good tool, and I want to suggest Azure Logic Apps as well, in case you want to make decisions based on the message you received.
With Azure Logic Apps you can create a workflow and integrate many services using the connectors this tool provides.
You will find more at:
https://learn.microsoft.com/en-us/azure/connectors/connectors-create-api-servicebus
To answer your question:
"First, I want to know if Service Bus Topic is right option to go with?"
The quick answer is yes; using messaging patterns is the best way to create reliable solutions.
In your case, you also want to notify another system after receiving a message.
The one thing you need to be aware of is what you will do when a notification is not received. You need to plan for that scenario.
From discussion above.
Azure functions with Queues/Topics
Regardless of whether you use queues or topics, you can trigger an Azure Function with both. The function processes the message. You can create two methods in the same function, SendEmail() and SendPhoneNotification(), and parallelize the tasks using the C# Task Parallel Library, so the same function performs both tasks in parallel.
Every time a message arrives, your function is triggered; it processes the message and notifies the user. The benefit is that the function scales automatically if a large number of messages come through the queue.
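The answer above refers to the C# Task Parallel Library. As a rough analogue in Python, the sketch below uses `asyncio.gather` to run both notifications concurrently for each message. The function names `send_email`, `send_phone_notification`, and `process` are hypothetical, and the senders are stubs standing in for real email or SMS calls; in an actual Azure Function the trigger binding would supply the message.

```python
import asyncio
from typing import List


async def send_email(message: str) -> str:
    # Stub: a real implementation would call an email service here.
    await asyncio.sleep(0)
    return f"email:{message}"


async def send_phone_notification(message: str) -> str:
    # Stub: a real implementation would call an SMS or push service here.
    await asyncio.sleep(0)
    return f"phone:{message}"


async def handle_message(message: str) -> List[str]:
    # Run both notifications concurrently and wait for both to finish.
    results = await asyncio.gather(
        send_email(message),
        send_phone_notification(message),
    )
    return list(results)


def process(message: str) -> List[str]:
    """Synchronous entry point, e.g. called once per triggered message."""
    return asyncio.run(handle_message(message))
```

`asyncio.gather` returns results in the order the coroutines were passed, so the caller can tell which notification produced which result.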

Azure architecture best suited to save JSON from API to a Data Lake Store?

I am looking to build an endpoint capable of receiving JSON objects and saving them into ADLS. So far I have tried several different combinations using Functions, Event Hubs, and Stream Analytics. The problem is that no solution so far seems ideal.
TL;DR In my scenario, I have a small set of users who will send me JSON data through an API, and I need to save it inside ADLS, separated by user. What is the best way of doing so?
Could anyone shed some light on this? Thanks in advance.
WARNING: LONG TEXT AHEAD
Let me explain my findings so far:
Functions
Advantages
single solution approach - solving the scenario with a single service
built-in authorization
organization - saving user's files to separate folders inside ADLS
HTTP endpoint - to send data only a POST is required
cheap & pay-as-you-go - charged per request
Disadvantages
bindings & dependencies - Functions has no ADLS bindings. To authorize and use ADLS, I need to install extra dependencies and manage credentials manually. I was only able to do this with C# and haven't tested other languages, which may also be a drawback, although I can't confirm it.
file management - ADLS does not recommend saving 1 file per request. The alternative would be to append to files and manage their size, which means more code than the other solutions.
Event Hub
Advantages
no code at all - all I need is enabling data capture
Disadvantages
one event hub per user - the only way of separating data inside ADLS through event hub's capture capability requires using one event hub per user
price - capturing with one event hub per user increases the price drastically
authorization - sending events is not as trivial as doing a POST
Functions + Event Hub
Using Event Hub with Functions mitigates the disadvantages of Functions, but has the same drawbacks as Event Hub (except auth)
Functions + Event Hub + Stream Analytics
Although I would be able to have a single event hub without capture, using Stream Analytics SQL as a filter to direct each user's data to its specific folder, it would be a limiting factor. I have tried it and it gets slower as the SQL gets bigger.
IoT Hub
IoT Hub has routing, but it is not as dynamic as I require.
I don't quite see the disadvantages of using only Azure Functions to write data to ADLS.
As long as you don't write lots of small files, writing 1 file per request should not really be an issue
Using the .NET SDK should be pretty straightforward even without an existing binding
To solve the authentication piece: use Managed Service Identity (MSI), and Key Vault to store your client secrets. MSI support in the SDK is apparently on the roadmap and would then make this very easy indeed.
You save yourself the extra cost of an Event Hub and I don't see a real value add through it
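To illustrate the "separated by user, without lots of small files" idea from the question, here is a minimal sketch that buffers JSON records per user and writes one larger file once a size threshold is reached. Everything in it is an assumption for illustration: `UserBatcher` is a hypothetical class, the `write` callback stands in for whatever ADLS SDK call you use, and the `user/part-NNNNN.jsonl` layout is just one possible naming scheme. An in-memory buffer does not survive between Function invocations, so a real deployment would need a durable place to accumulate records.

```python
import json
from collections import defaultdict
from typing import Callable, Dict, List


class UserBatcher:
    """Accumulate JSON lines per user and flush them as one file per batch."""

    def __init__(self, write: Callable[[str, bytes], None],
                 max_bytes: int = 4 * 1024 * 1024) -> None:
        self._write = write          # hypothetical sink, e.g. an ADLS upload
        self._max_bytes = max_bytes  # flush threshold per user
        self._buffers: Dict[str, List[bytes]] = defaultdict(list)
        self._sizes: Dict[str, int] = defaultdict(int)
        self._seq: Dict[str, int] = defaultdict(int)

    def add(self, user_id: str, record: dict) -> None:
        line = json.dumps(record, separators=(",", ":")).encode("utf-8") + b"\n"
        self._buffers[user_id].append(line)
        self._sizes[user_id] += len(line)
        if self._sizes[user_id] >= self._max_bytes:
            self.flush(user_id)

    def flush(self, user_id: str) -> None:
        if not self._buffers[user_id]:
            return
        # One folder per user, one numbered file per flushed batch.
        path = f"{user_id}/part-{self._seq[user_id]:05d}.jsonl"
        self._write(path, b"".join(self._buffers[user_id]))
        self._seq[user_id] += 1
        self._buffers[user_id] = []
        self._sizes[user_id] = 0
```

Calling `flush(user_id)` on a timer or at shutdown writes out whatever is still buffered for that user.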

Which Azure service to use for processing data from Event Hub?

I would appreciate some help picking out the best suited Azure services for my scenario - I am just beginning with Azure services and my knowledge is pretty limited.
I have data from multiple sources, and of different shapes, coming into an Event Hub. I need to subscribe to the events from the Event Hub and, based on their format, process them and ultimately save them into an SQL Database. All components - events consumers, the SQL Database - need to be hosted in the cloud.
How would I implement this in an "Azure Orientated Architecture"?
In an off cloud application, I would have competing consumers subscribing to the Event Hub. They would be some console applications or Windows services, and each would be processing the events asynchronously (this is further simplified by the event processing being idempotent).
Ideally, the Azure equivalent of the above consumers would scale up and down automatically, so I would like to not have to use VMs that host console applications (where I would need to keep an eye on the VM's resources myself). Scaling and deployment wise they would have to behave like App Services, however I'm under the impression that those are just for web applications. I've also briefly looked at Web Jobs, but those seem to be polling data at various intervals, whereas I need a proper event subscriber that the Event Hub pushes data into.
Any help will be greatly appreciated!
Thank you.
Later Edit:
I've looked into Web Jobs and they do allow continuous processing, hence it looks like they can be used as automatically scaling subscribers.
Ideally I would like to write the code for the subscribers in F#. C# is the other option if that is not available.
You can see my post regarding IoT Hub. It's basically the same for Event Hub (each of the examples in the post can be used on Event Hubs):
https://stackoverflow.com/a/38682324/6659347
In addition, for Event Hub you can also use an Azure Function with an Event Hub trigger: a function that runs whenever the event hub receives a new event. It also meets your scaling requirement.
If you are working with multiple consumers, make sure to use Event Hub Consumer Groups so that each consumer can read the stream independently.
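As a rough sketch of what such a consumer could do with events of different shapes, the code below picks a handler based on the fields present and writes the result with an upsert keyed by event id, so reprocessing the same event is harmless. The event shapes (`reading`, `alert`), the field names, and the in-memory `store` standing in for the SQL database are all assumptions for illustration. `run_consumer` assumes the `azure-eventhub` package and reads with the consumer group you pass in.

```python
import json
from typing import Callable, Dict

Handler = Callable[[dict], dict]


def normalize_reading(event: dict) -> dict:
    return {"id": event["id"], "kind": "reading", "value": float(event["value"])}


def normalize_alert(event: dict) -> dict:
    return {"id": event["id"], "kind": "alert", "value": event["message"]}


HANDLERS: Dict[str, Handler] = {
    "reading": normalize_reading,
    "alert": normalize_alert,
}


def detect_shape(event: dict) -> str:
    # Hypothetical rule: decide the shape from the fields that are present.
    if "value" in event:
        return "reading"
    if "message" in event:
        return "alert"
    raise ValueError(f"unknown event shape: {sorted(event)}")


def process(body: str, store: Dict[str, dict]) -> None:
    event = json.loads(body)
    row = HANDLERS[detect_shape(event)](event)
    # Upsert keyed by id: handling the same event twice leaves one row.
    store[row["id"]] = row


def run_consumer(conn_str: str, hub_name: str, store: Dict[str, dict],
                 consumer_group: str = "$Default") -> None:
    """Blocking receive loop (assumes azure-eventhub is installed)."""
    from azure.eventhub import EventHubConsumerClient

    def on_event(partition_context, event):
        process(event.body_as_str(), store)

    client = EventHubConsumerClient.from_connection_string(
        conn_str, consumer_group=consumer_group, eventhub_name=hub_name
    )
    with client:
        client.receive(on_event=on_event, starting_position="-1")
```

With an Azure Function and an Event Hub trigger, only the `process` part would be needed, since the trigger delivers the event body.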
I'd say use a WebJob in combination with an EventProcessor. I wrote some demo code that can easily be transferred to a WebJob: https://github.com/DeHeerSoftware/SemanticLogging.EventHub/tree/master/SemanticLogging.EventHub.Processor
See https://azure.microsoft.com/en-us/documentation/articles/event-hubs-csharp-ephcs-getstarted/#receive-messages-with-eventprocessorhost for official documentation.
I've created a WebJob myself using this approach. Works like a charm.

Sending Azure Application Insights data to Event Hub

I have a static .NET web application with the Application Insights SDK. How do I send Application Insights data to Azure Event Hub? I have successfully used the Continuous Export feature, but I would rather send the telemetry data to the Event Hub.
To send data to Event Hub explicitly, you will need to use the Event Hub SDK, which is currently available for .NET/C#, Java, REST, and Node.js. For your case, a web application, sending via the REST API might be the easiest way. Take a look at the API reference for more information: https://msdn.microsoft.com/en-us/library/azure/dn790674.aspx
One catch is that receiving events is not currently supported over REST; you would still need a .NET or Java application on the receiving side.
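For the REST route, the main extra step compared with a plain POST is building a Shared Access Signature token. The sketch below shows the general shape, using only the standard library. The helper names, the namespace `ns`, the hub `hub`, and the policy name are hypothetical, and the `/messages` path and content type reflect my understanding of the "Send event" operation, so check them against the API reference linked above before relying on them.

```python
import base64
import hashlib
import hmac
import json
import time
import urllib.parse
import urllib.request
from typing import Optional


def build_sas_token(resource_uri: str, key_name: str, key: str,
                    expiry: Optional[int] = None) -> str:
    """Sign '<url-encoded uri>\\n<expiry>' with HMAC-SHA256 and format the token."""
    if expiry is None:
        expiry = int(time.time()) + 3600  # valid for one hour
    encoded_uri = urllib.parse.quote_plus(resource_uri)
    string_to_sign = f"{encoded_uri}\n{expiry}".encode("utf-8")
    digest = hmac.new(key.encode("utf-8"), string_to_sign, hashlib.sha256).digest()
    signature = urllib.parse.quote(base64.b64encode(digest), safe="")
    return (f"SharedAccessSignature sr={encoded_uri}"
            f"&sig={signature}&se={expiry}&skn={key_name}")


def build_send_request(namespace: str, hub_name: str, key_name: str, key: str,
                       payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST carrying one JSON event."""
    resource_uri = f"https://{namespace}.servicebus.windows.net/{hub_name}"
    return urllib.request.Request(
        url=f"{resource_uri}/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": build_sas_token(resource_uri, key_name, key),
            "Content-Type": "application/atom+xml;type=entry;charset=utf-8",
        },
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would perform the actual send.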
If you are looking for a common logging framework - which can be configured to send data to "whatever data destination" you want to - you should consider looking at log4net.
Here's a good implementation of log4net-appender for EventHubs.
@greypanda,
As you know, Continuous Export currently only exports Application Insights data to blob storage, from which you can pick up the data for use in any workstream you want. Exporting directly to an Event Hub is something that could be a future feature, so please log this at our UserVoice site: https://visualstudio.uservoice.com/forums/357324-application-insights.
We will also have a set of REST APIs for Application Insights soon (see https://visualstudio.uservoice.com/forums/357324-application-insights/suggestions/4999529), which might help you.
I would like to learn more about your scenario so I can better help you in this instance and improve our export and API features. Feel free to reply here or if you want, shoot me a mail offline.
Thank you
Dale Koetke (dalek@microsoft.com)
We don't really support that. It's a lot easier to let the SDK send the data to App Insights portal, then you can use Continuous Export to move it out into Storage. If you want, you can use Stream Analytics to move it from there.
What do you plan to ultimately do with the data? (I mean, why event hub...?)

Monitoring Azure Event Hub

I have been researching Microsoft Azure Event Hubs. My goal is to figure out a way to provide automatic scalability. This is experimental work and I am really only trying to find out what I can do with Azure Event Hubs. I do not have access to the Azure platform to test anything :( .
So far, I have found that through the REST API and Service Bus PowerShell I can add Throughput Units (to increase performance; I am relying on this: Scale Azure Service Bus through Powershell or API) and increase or decrease the event expiration time (which might influence capacity - https://msdn.microsoft.com/en-us/library/azure/dn790675.aspx).
The problem is that, presuming the previous techniques work and I am able to scale the event hub's performance automatically, I still need a way to know when to trigger the scaling mechanisms. To know when and how to trigger scaling, I need functions that rely on the event hub's metrics (or some way of monitoring it), but I can't really find any metrics. The only things I have found are this: https://azure.microsoft.com/en-us/documentation/articles/cloud-services-how-to-monitor/ - which does not solve my problem because, although it may present some interesting metrics, it does not serve the purposes of my "application" (which will come if I can prove that I can successfully scale Azure automatically); and this: Azure service bus statistics/Monitoring - whose links are not working.
Surely I can find more information about Service Bus Explorer, and it may provide some interesting insights into the event hub metrics. I am just wondering whether there is something like this: https://github.com/HBOCodeLabs/incubator-storm/blob/master/STORM-UI-REST-API.md that would allow me to access some kind of metrics, rather than creating my own.
Thanks in advance
Best regards
You can retrieve metrics about Event Hubs (an Event Hub is a Service Bus entity) using the Service Bus Entity Metrics REST APIs (https://msdn.microsoft.com/library/azure/dn163589.aspx). Using these, you can retrieve the same metrics displayed in the portal, such as:
Number of incoming messages
Incoming throughput
Outgoing throughput
These should help you determine when you need to scale your application up or down.
This video is useful for getting started https://channel9.msdn.com/Blogs/Subscribe/Service-Bus-Namespace-Management-and-Analytics
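Once those metrics are available, the scaling decision itself can be a small calculation. The sketch below is one possible approach, not an official formula: it assumes each Throughput Unit allows roughly 1 MB/s or 1,000 events/s of ingress and 2 MB/s of egress (treating 1 MB as 1,000,000 bytes), and `required_throughput_units` and `headroom` are hypothetical names. Verify the per-unit limits against the current Event Hubs documentation for your tier.

```python
import math

# Assumed per-Throughput-Unit limits (check the current documentation).
INGRESS_BYTES_PER_TU = 1_000_000
INGRESS_EVENTS_PER_TU = 1_000
EGRESS_BYTES_PER_TU = 2_000_000


def required_throughput_units(ingress_bytes_per_sec: float,
                              ingress_events_per_sec: float,
                              egress_bytes_per_sec: float,
                              headroom: float = 0.2) -> int:
    """Return the smallest TU count that covers all three observed rates."""
    factor = 1.0 + headroom  # spare capacity on top of the observed load
    needed = max(
        ingress_bytes_per_sec * factor / INGRESS_BYTES_PER_TU,
        ingress_events_per_sec * factor / INGRESS_EVENTS_PER_TU,
        egress_bytes_per_sec * factor / EGRESS_BYTES_PER_TU,
    )
    # A namespace always has at least one Throughput Unit.
    return max(1, math.ceil(needed))
```

Comparing the result with the currently configured number of units tells you whether to scale up or down; the actual change would still go through the REST API or PowerShell mentioned in the question.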
If third-party services are an option, look into CloudMonix @ http://cloudmonix.com
It can monitor Event Hubs (among a gazillion other Azure-related things) and execute Azure Automation runbooks (among a gazillion other actions) in reaction to load conditions or throughput of a whole hub or individual partitions, optionally based on any other metrics in your environment.
Your Azure Automation runbooks could contain the logic to increase your EH's throughput, etc.
Disclaimer: I'm affiliated with the product.
HTH
Service Bus Explorer is great. I actually use this.
ServiceBus Explorer
