Non-Azure input/source options for Stream Analytics - node.js

Does Stream Analytics support input sources other than products in the Azure family?
For example, can I set up a REST endpoint and send events that way? Are there client libraries for node.js?
Documentation is somewhat scant in this regard; I wanted to check here before assuming no on both fronts.

I believe the answer is no: Azure Stream Analytics does not currently support non-Azure sources.
One recommended approach is to write to an Azure Event Hub and then let Azure Stream Analytics read from there.
You can write to an Event Hub from Node.js:
http://hypernephelist.com/2014/09/16/sending-data-to-azure-event-hubs-from-nodejs.html
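For reference, a minimal sketch of what that could look like today with the @azure/event-hubs package (the linked post predates it and uses raw HTTP); the connection string, hub name, and payload below are placeholders, not values from the question:

```js
// Minimal sketch: send one event to Azure Event Hubs from Node.js
// using the @azure/event-hubs package.
const { EventHubProducerClient } = require("@azure/event-hubs");

async function sendTelemetry() {
  // Placeholder connection string and hub name.
  const producer = new EventHubProducerClient(
    "Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=<keyName>;SharedAccessKey=<key>",
    "<event-hub-name>"
  );

  const batch = await producer.createBatch();
  batch.tryAdd({ body: { deviceId: "device-1", temperature: 22.5, ts: new Date().toISOString() } });

  await producer.sendBatch(batch);
  await producer.close();
}

sendTelemetry().catch(console.error);
```

Once the events land in the Event Hub, you add that hub as an input to the Stream Analytics job and query it from there.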

Revision of my old answer:
As @PanagiotisKanavos said, Azure Stream Analytics (ASA) is just the processing engine, not the ingestion endpoint, so it doesn't need to accept non-Azure input sources the way Event Hubs does; the real question is how to feed ASA with data.
Event Hubs can be used as the ASA input, has a variety of client libraries that work on many different machines and form factors, and can be used from any OS and many frameworks. Worst case, simple HTTP works as well; AMQP is not mandatory, but it is definitely preferable in terms of performance.
The correct route is PRODUCER -> EventHub -> ASA or PRODUCER -> STORAGE -> ASA. So if there is a storage client library for the device in question, that route works as well, but Event Hubs is usually the better choice.
Thanks a lot to @PanagiotisKanavos for the help.
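For completeness, here is a rough sketch of the plain-HTTP route mentioned above: signing a SAS token yourself and POSTing an event to the Event Hubs REST endpoint, with no SDK or AMQP. The namespace, hub name, and key are placeholders:

```js
// Rough sketch: POST one event to an Event Hub's REST endpoint using a
// self-signed SAS token, no SDK or AMQP required.
const https = require("https");
const crypto = require("crypto");

function createSasToken(resourceUri, keyName, key, ttlSeconds = 3600) {
  const expiry = Math.floor(Date.now() / 1000) + ttlSeconds;
  const stringToSign = encodeURIComponent(resourceUri) + "\n" + expiry;
  const signature = crypto.createHmac("sha256", key).update(stringToSign).digest("base64");
  return `SharedAccessSignature sr=${encodeURIComponent(resourceUri)}` +
         `&sig=${encodeURIComponent(signature)}&se=${expiry}&skn=${keyName}`;
}

// Placeholders for the namespace, hub and key.
const hubUri = "https://<namespace>.servicebus.windows.net/<event-hub-name>";
const token = createSasToken(hubUri, "<keyName>", "<key>");

const payload = JSON.stringify({ deviceId: "device-1", temperature: 22.5 });
const req = https.request(`${hubUri}/messages`, {
  method: "POST",
  headers: {
    Authorization: token,
    "Content-Type": "application/json",
    "Content-Length": Buffer.byteLength(payload),
  },
}, res => console.log("Event Hubs responded with", res.statusCode)); // expect 201 Created

req.on("error", console.error);
req.end(payload);
```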

Some circumstantial evidence below seems to confirm that Azure does not support non-Azure services as inputs for Stream Analytics.
According to the Stream Analytics Create Input REST API (https://msdn.microsoft.com/en-us/library/azure/dn835010.aspx), there are only three supported data sources: Event Hub, Blob Storage, and IoT Hub.
Screenshots from the old and new Azure portals for adding an input:
Fig 1. The input options on Azure old portal (Step 1)
Fig 2. The options for Data stream (Step 2)
Fig 3. The option for Reference data (Step 2)
Fig 4. The input options on Azure new portal

Related

Azure how to get events shown in CLI from IoT to a database

I am having some issues actually retrieving and using the data I send to the IoT Hub in Azure. When I run 'az iot hub monitor-events --hub-name ' in the CLI I can see my events, and I can also send messages to my devices in the IoT Hub.
I have then tried to create a stream to forward the messages to my SQL database, but without any luck. Do you have any suggestions on how to retrieve this data?
There are multiple ways to go about this. The two most common scenarios are probably using an Azure Function or using a Stream Analytics job. I don't know what you've tried up to this point, but a Stream Analytics job is probably the easiest way to go.
Stream Analytics
This answer on SO could be what you're looking for; it also links to this tutorial, which you could follow from "Create a new Azure SQL Database" onwards. It covers creating an IoT Hub input and an Azure SQL output on your Stream Analytics job, and using a simple query to link the two together. There is more info in the Microsoft docs here.
Azure Function
While looking this one up I found this answer, which is mine, awkward. But it describes how you can go about creating an Azure Function that accepts IoT Hub messages and forwards them to your database. For a few devices, this option is a lot more cost-efficient (or even free, if you use the consumption plan for the Function).
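As a rough sketch of that Function option (not the exact code from the linked answer), a Node.js Azure Function with an IoT Hub trigger could insert each message into Azure SQL roughly like this; the table, columns, app-setting names, and message shape are made up for illustration:

```js
// Rough sketch: Azure Function (Node.js) triggered by the IoT Hub's
// Event Hub-compatible endpoint, inserting each message into Azure SQL.
//
// function.json would look roughly like:
// {
//   "bindings": [{
//     "type": "eventHubTrigger",
//     "name": "messages",
//     "direction": "in",
//     "eventHubName": "<iot-hub-name>",
//     "connection": "IoTHubEventHubEndpoint",  // app setting holding the
//     "consumerGroup": "$Default",             // Event Hub-compatible
//     "cardinality": "many"                    // connection string
//   }]
// }

const sql = require("mssql"); // npm install mssql

// Create the pool once per host instance and reuse it across invocations.
const poolPromise = sql.connect(process.env.SQL_CONNECTION_STRING);

module.exports = async function (context, messages) {
  const pool = await poolPromise;

  for (const msg of messages) {
    // Assumes the device sends JSON like { "deviceId": "...", "temperature": 22.5 }.
    await pool.request()
      .input("deviceId", sql.NVarChar, msg.deviceId)
      .input("temperature", sql.Float, msg.temperature)
      .query("INSERT INTO Telemetry (DeviceId, Temperature, ReceivedAt) VALUES (@deviceId, @temperature, SYSUTCDATETIME())");
  }

  context.log(`Inserted ${messages.length} message(s) into SQL`);
};
```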

Azure Functions vs Azure Stream Analytics

I noticed that both Azure Functions and Azure Stream Analytics can take an input, modify or transform that input, and put it into an output.
When would I use one versus the other? Are there any general rules I can use to decide?
I tried looking at the pricing of each to guide me, but I'm having trouble discerning how my logic would affect the compute-time cost of Functions, the App Service plan cost of Functions, or the streaming-unit cost of Stream Analytics.
Azure Stream Analytics is a real-time analytics service that can "run massively parallel real-time analytics on multiple IoT or non-IoT streams of data", whereas Azure Functions is a (serverless) service for hosting functions (little pieces of code) that can be used for, e.g., event-driven applications.
A general rule is always difficult since everything depends on your requirements, but I would say: if you have to analyze a data stream, take a look at Azure Stream Analytics; if you want to implement something like a serverless event-driven or timer-based application, check out Azure Functions or Logic Apps.

Using partitionId or partitionKey with Azure IoT Hub

We are developing an application where IoT devices will publish events to Azure IoT Hub using the MQTT protocol (using one topic to push messages). We want to consume these messages using the Stream Analytics service, and to scale Stream Analytics it is recommended to use the PARTITION BY clause.
Since we are not using the Azure Event Hubs SDK, can we somehow attach a partition ID to the events?
Thanks in advance.
As Rita mentioned in the comments, Event Hub will automatically associate each device with a particular partition.
Then you can use PARTITION BY PartitionId for the steps closer to the input to efficiently parallelize processing of the input and reduce/aggregate the data.
Then you can have another, non-partitioned step that outputs some aggregated data to SQL.
Doing that, you will be able to assign more than 6 SUs, even with an output to SQL.
We will update our documentation to give more info about scaling ASA jobs and describe the different possible scenarios.
Thanks,
JS - Azure Stream Analytics

Pulling data from Stream Analytics to Azure Machine Learning

Working on an IoT telemetry project that receives humidity and weather-pollution data from different sites in the field. I will then apply machine learning to the collected data. I'm using Event Hubs and Stream Analytics. Is there a way of pulling the data into Azure Machine Learning without the hassle of writing an application to get it from Stream Analytics and push it to the AML web service?
Stream Analytics has a feature called "Functions". You can call any web service you've published using AML from within Stream Analytics and apply it within your Stream Analytics query. Check this link for a tutorial.
Example workflow in your case would be like the following:
Telemetry arrives and reaches Stream Analytics
Stream Analytics (SA) calls the Machine Learning function to apply it to the data
SA redirects it to the output accordingly; here you can use Power BI to create a predictions dashboard.
Another way would be to use R; here's a good tutorial showing that: https://blogs.technet.microsoft.com/machinelearning/2015/12/10/azure-ml-now-available-as-a-function-in-azure-stream-analytics/ .
It is more work, of course, but it can give you more control since you control the code.
Yes,
this is actually quite easy, as it is well supported by ASA.
You can call a custom Azure ML function from your ASA query once you create that function from the portal.
See the following tutorial on how to achieve something like this.

Does Microsoft Azure IoT Hub store data?

I have just started learning Azure IoT and it's quite interesting. I am confused about whether IoT Hub stores data somewhere.
For example, suppose I am sending room temperature to IoT Hub and want to store it in a database for further use. How is that possible?
I am clear on how device-to-cloud and cloud-to-device messaging works with IoT Hub.
IoT Hub exposes device-to-cloud messages through an Event Hubs-compatible endpoint. Event Hubs has a retention time expressed in days; it's a stream of data that the reading client can re-read multiple times because the cursor is on the client side (not on the server side as with queues and topics). With IoT Hub the retention time is 1 day by default, but you can change it.
If you want to store the messages received from devices, you need a client reading from the exposed Event Hubs endpoint (for example an Event Processor Host) that has the business logic to process the messages and store them into a database, for example.
Of course you could add another decoupling layer so that the client reads from Event Hubs and stores the messages into queues; another client then reads from the queues at its own pace and stores them into the database. In this way you keep a fast path reading from Event Hubs.
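As a rough illustration of that reading client (using the current @azure/event-hubs consumer client rather than the older Event Processor Host), a Node.js sketch like the following could read from the IoT Hub's Event Hubs-compatible endpoint and hand each message to your own storage logic; the connection string and the saveToDatabase function are placeholders:

```js
// Rough sketch: read device-to-cloud messages from the IoT Hub's
// Event Hubs-compatible endpoint and pass them to custom storage logic.
const { EventHubConsumerClient, earliestEventPosition } = require("@azure/event-hubs");

// Connection string from the IoT Hub's "Built-in endpoints" blade (placeholder).
const connectionString =
  "Endpoint=sb://<event-hub-compatible-endpoint>/;SharedAccessKeyName=iothubowner;SharedAccessKey=<key>;EntityPath=<event-hub-compatible-name>";

async function saveToDatabase(body) {
  // Stand-in for the real store (SQL, Cosmos DB, blob, ...).
  console.log("storing", body);
}

async function main() {
  const client = new EventHubConsumerClient("$Default", connectionString);

  const subscription = client.subscribe({
    processEvents: async (events) => {
      for (const event of events) {
        await saveToDatabase(event.body); // e.g. { temperature: 22.5, ... }
      }
    },
    processError: async (err) => {
      console.error("Error reading from the endpoint:", err);
    },
  }, { startPosition: earliestEventPosition });

  // Keep reading until the process is stopped; call subscription.close()
  // and client.close() to shut down cleanly.
}

main().catch(console.error);
```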
This is pretty much the use case for all IoT scenarios.
Step 1: High-scale data ingestion via Event Hub.
Step 2: Create and use a stream-processing engine (Stream Analytics or HDInsight/Storm). You can run conditions (SQL-like queries) to filter and store the appropriate data in either a cold or a hot store for further analytics.
Step 3: Storage for cold-path analytics can be Azure Blob storage. Stream Analytics can be configured to write the data into it directly. The cold path can contain all other data that doesn't require querying, and it is cheap.
Step 4: Processing for hot-path analytics. This is data that is queried more regularly, or data on which real-time analytics needs to be carried out, like in your case checking for temperature values that go beyond a threshold and need an urgent trigger!
Let me know if you face any challenges while configuring the Stream Analytics job! :)
If you take a look at the IoT Suite remote monitoring preconfigured solution (https://azure.microsoft.com/documentation/articles/iot-suite-remote-monitoring-sample-walkthrough/) you'll see that it persists telemetry in blob storage and maintains device status information in DocumentDb. This preconfigured solution gives you a working illustration of the points made in the previous answers.
