I'm using an Azure Stream Analytics job to process data from an IoT hub. The data comes from my simulated device to the IoT hub, and I have configured the hub as the input to the Stream Analytics job and blob storage as the output.
My question is: if I stop the Stream Analytics job and restart it, do I lose the data that arrived between stop and start? Or is the data stored on the IoT hub, so that when I restart the job and select the time it stopped as the start time, I get all the data?
As krishmam said, you can choose the outputStartMode when starting a job. It specifies whether the job should start producing output at a given timestamp or at the point when the job starts.
You won't lose any data. Stream Analytics gives you an option to start a job from the previous stop time when you start the job.
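To make the start options concrete, here is a minimal sketch of the request body used when starting a job. The helper name build_start_request is hypothetical and only builds the JSON payload; it does not call Azure. The three mode values are the ones I believe the start operation accepts, so check them against the current API reference before relying on them.

```python
from datetime import datetime, timezone
from typing import Optional

# Start modes assumed to be accepted by the Stream Analytics "start job" operation.
VALID_MODES = {"JobStartTime", "CustomTime", "LastOutputEventTime"}


def build_start_request(mode: str, start_time: Optional[datetime] = None) -> dict:
    """Build the JSON body for starting a job (hypothetical helper, no network calls)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown outputStartMode: {mode}")
    body = {"outputStartMode": mode}
    if mode == "CustomTime":
        if start_time is None:
            raise ValueError("CustomTime requires a start time")
        # The custom start time is sent as an ISO-8601 UTC timestamp.
        body["outputStartTime"] = start_time.astimezone(timezone.utc).isoformat()
    return body


# Resume from the point where the job last produced output before it was stopped.
resume_body = build_start_request("LastOutputEventTime")
```

Picking the "last output" mode is what corresponds to restarting from the previous stop time; a custom time lets you pick the timestamp yourself.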
I need to know:
Does stopping an Azure Stream Analytics job also stop the billing?
As per the answer from MSFT: For Azure Stream Analytics, there is no charge when the job is stopped.
But for Azure Stream Analytics on IoT Edge: Billing starts when an ASA job is deployed to devices, no matter what the job status is (running/failed/stopped).
Welcome to Stackoverflow!
Note: There are no charges for stopped jobs. Billing is based on streaming units in the cloud and on jobs per device on Edge.
Detailed explanation:
As a cloud service, Stream Analytics is optimized for cost. There are no upfront costs involved - you only pay for the streaming units you consume, and the amount of data processed. There is no commitment or cluster provisioning required, and you can scale the job up or down based on your business needs.
If you create a Stream Analytics job with streaming units = 1, it is billed at $0.11/hour.
Pricing:
Azure Stream Analytics on Cloud: If you create a Stream Analytics job with N streaming units, it is billed at $0.11 * N per hour.
Azure Stream Analytics on Edge: Azure Stream Analytics on IoT Edge is priced by the number of jobs that have been deployed on a device. For instance, if you have two devices and the first device has one job whereas the second device has two jobs, your monthly charge will be (1 job)(1 device)($1/job/device) + (2 jobs)(1 device)($1/job/device) = $1 + $2 = $3 per month.
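The two pricing rules above can be sketched as a small calculation. The rates are the figures quoted in this answer and may be out of date; the function names are made up for illustration.

```python
from typing import Sequence

CLOUD_RATE_PER_SU_HOUR = 0.11    # USD per streaming unit per hour, as quoted above
EDGE_RATE_PER_JOB_DEVICE = 1.0   # USD per job per device per month, as quoted above


def cloud_cost(streaming_units: int, running_hours: float) -> float:
    """Cost of a cloud job; hours while the job is stopped are not counted."""
    return CLOUD_RATE_PER_SU_HOUR * streaming_units * running_hours


def edge_monthly_cost(jobs_per_device: Sequence[int]) -> float:
    """Monthly Edge cost: one entry per device, giving its number of deployed jobs."""
    return EDGE_RATE_PER_JOB_DEVICE * sum(jobs_per_device)


# The example from the text: one device with 1 job, another with 2 jobs.
edge_total = edge_monthly_cost([1, 2])

# A 1-SU cloud job left running for a 30-day month.
always_on = cloud_cost(streaming_units=1, running_hours=24 * 30)
```

Because only running hours are billed in the cloud, stopping the job when it is not needed directly reduces the running_hours term.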
Hope this helps. If you have any further query do let us know.
In order to write sensor data from an IoT device to a SQL database in the cloud, I use an Azure Stream Analytics job. The SA job has an IoT Hub input and a SQL database output. The query is trivial; it just passes all data through.
According to the MS price calculator, the cheapest way of accomplishing this (in western Europe) is around 75 euros per month (see screenshot).
Actually, only 1 message per minute is sent through the hub, and the price is fixed per month (regardless of the number of messages). I am surprised by the price for such a trivial task on small data. Would there be a cheaper alternative for such low capacity needs? Perhaps an Azure Function?
If you are not processing the data in real time, then SA is not needed; you could just use an Event Hub to ingest your sensor data and forward it on. There are several options to move data from the Event Hub to SQL. As you mentioned in your question, you could use an Azure Function, or if you want a no-code solution, you could use a Logic App.
https://learn.microsoft.com/en-us/azure/connectors/connectors-create-api-azure-event-hubs
https://learn.microsoft.com/en-us/azure/connectors/connectors-create-api-sqlazure
In addition to Ken's answer, the "cold path" can be your solution: Azure IoT Hub stores the telemetry data in blob storage every 720 seconds (the maximum batch frequency).
Using Azure Event Grid on the blob storage, each new blob triggers an EventGridTrigger subscriber, where we can start a streaming process for this batch (or for a group of batches within one hour). After the batch has been processed, the ASA job can be stopped.
Note that the ASA job is billed for its active processing time (the time between Start and Stop), so the cost of using an ASA job can drop significantly.
I have created two Stream Analytics jobs for one IoT hub that has multiple devices.
But data is received only by the job I created first. Even if I stop that job, no data is sent to the second Stream Analytics job.
Is that a bug, or am I missing something? Or can one IoT hub simply have only one Stream Analytics job?
It looks like both of your Stream Analytics jobs are using the same consumer group, such as $Default, on the same IoT Hub.
So create two consumer groups on the Azure IoT Hub, one dedicated to each ASA job; in other words, each ASA job gets its own consumer group.
The following screen snippet shows an example of connecting an ASA job to the IoT Hub, where the consumer group can be selected specifically for each job.
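As a quick sanity check of the rule above, a sketch like the following could flag jobs that share a consumer group. The job and group names are invented for illustration, and the mapping would have to be filled in by hand from your job input settings.

```python
from collections import defaultdict


def find_shared_consumer_groups(job_to_group: dict) -> dict:
    """Return every consumer group that is used by more than one job."""
    jobs_by_group = defaultdict(list)
    for job, group in job_to_group.items():
        jobs_by_group[group].append(job)
    return {group: jobs for group, jobs in jobs_by_group.items() if len(jobs) > 1}


# Both jobs read through $Default: this is the situation described in the question.
shared = find_shared_consumer_groups(
    {"asa-job-1": "$Default", "asa-job-2": "$Default"}
)

# Each job has its own (hypothetical) consumer group: nothing is reported.
fixed = find_shared_consumer_groups(
    {"asa-job-1": "asa-job-1-cg", "asa-job-2": "asa-job-2-cg"}
)
```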
Our Azure event hub has been running for a few months now in a production environment. Today the event hub stopped receiving inputs and processing any outputs. We are not sure what the problem is.
Our Stream Analytics job also can't connect to the event hub anymore, and one of the stream jobs has a status of 'degraded'. Could this be the cause of the whole hub going down?
Most likely the degraded status of your Stream Analytics job is caused by the Event Hub access issue. You can confirm this from the job's activity log.
I have just started learning Azure IoT and it's quite interesting. I am confused about whether IoT Hub stores data somewhere.
For example, suppose I am passing room temperature to IoT Hub and want to store it in a database for further use. How is that possible?
I am clear on how device-to-cloud and cloud-to-device works with IoT hub.
IoT Hub exposes device-to-cloud messages through an Event Hubs endpoint. Event Hubs has a retention time expressed in days. It's a stream of data that the reading client can re-read multiple times, because the cursor is on the client side (not on the server side, as with queues and topics). With IoT Hub the retention time is 1 day by default, but you can change it.
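To illustrate what the retention time means for a reader, here is a minimal sketch that checks whether a message is still inside the retention window. It is a simplification under the assumption that anything older than the retention time can no longer be read; the function name is made up.

```python
from datetime import datetime, timedelta, timezone


def still_readable(enqueued_at: datetime, now: datetime, retention_days: int = 1) -> bool:
    """True if the message is still within the retention window (default: 1 day)."""
    return now - enqueued_at <= timedelta(days=retention_days)


now = datetime(2020, 1, 3, 12, 0, tzinfo=timezone.utc)
six_hours_old = now - timedelta(hours=6)
two_days_old = now - timedelta(days=2)

recent_ok = still_readable(six_hours_old, now)                   # inside the 1-day default
old_ok = still_readable(two_days_old, now)                       # outside the 1-day default
old_ok_longer = still_readable(two_days_old, now, retention_days=3)
```

A client that was offline for longer than the retention time should therefore expect a gap in what it can re-read.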
If you want to store messages received from devices, you need a client reading from the exposed Event Hubs endpoint (for example with an Event Processor Host) that has the business logic to process the messages and store them, for example in a database.
Of course you could use another decoupling layer, so that the client reads from Event Hubs and stores messages in queues. Then another client reads from the queues at its own pace and stores the messages in the database. In this way you have a fast path reading from Event Hubs.
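The decoupling idea can be sketched locally with the standard library. This is only an illustration: an in-process queue.Queue stands in for the cloud queue, an in-memory SQLite table stands in for the database, and the message fields deviceId and temperature are assumed, not taken from any real payload.

```python
import json
import queue
import sqlite3


def fast_reader(raw_messages, out_queue):
    """Stand-in for the Event Hubs reader: hand messages off without doing slow work."""
    for raw in raw_messages:
        out_queue.put(raw)


def slow_writer(in_queue, conn):
    """Drain the queue at its own pace and persist each reading; return the count stored."""
    stored = 0
    while not in_queue.empty():
        reading = json.loads(in_queue.get())
        conn.execute(
            "INSERT INTO telemetry (device_id, temperature) VALUES (?, ?)",
            (reading["deviceId"], reading["temperature"]),
        )
        stored += 1
    conn.commit()
    return stored


buffer = queue.Queue()
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE telemetry (device_id TEXT, temperature REAL)")

fast_reader(
    [
        '{"deviceId": "room-1", "temperature": 21.5}',
        '{"deviceId": "room-2", "temperature": 23.0}',
    ],
    buffer,
)
stored_count = slow_writer(buffer, conn)
```

The reader never waits on the database, which is the point of the extra layer: a slow or temporarily unavailable database only delays the writer.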
This is pretty much the use case for all IoT scenarios.
Step 1: High scale data ingestion via Event Hub.
Step 2: Create and use a stream processing engine (Stream Analytics or HDInsight/Storm). You can run conditions (SQL-like queries) to filter the data and store the appropriate data in either a cold or a hot store for further analytics.
Step 3: Storage for cold-path analytics can be Azure Blob storage. Stream Analytics can be configured to write the data into it directly. The cold store can hold all other data that doesn't require querying, and it will be cheap.
Step 4: Processing for hot-path analytics. This is data that is queried more regularly, or data on which real-time analytics needs to be carried out. In your case, for example, a temperature value going beyond a threshold needs an urgent trigger!
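The hot/cold split in the steps above can be sketched in plain Python. In a real setup this condition would live in the stream processing query; the threshold value and the field names here are assumptions made up for the example.

```python
TEMPERATURE_THRESHOLD = 30.0  # assumed alert threshold, in degrees Celsius


def route(reading: dict) -> str:
    """Send readings above the threshold to the hot path, everything else to the cold path."""
    return "hot" if reading["temperature"] > TEMPERATURE_THRESHOLD else "cold"


def split_paths(readings):
    """Partition readings into (hot, cold) lists, preserving arrival order."""
    hot, cold = [], []
    for reading in readings:
        (hot if route(reading) == "hot" else cold).append(reading)
    return hot, cold


readings = [
    {"deviceId": "room-1", "temperature": 21.5},
    {"deviceId": "room-2", "temperature": 35.2},
    {"deviceId": "room-1", "temperature": 22.0},
]
hot, cold = split_paths(readings)
```

Here the hot list would feed whatever raises the urgent trigger, while the cold list would be written to cheap storage such as blobs.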
Let me know if you face any challenges while configuring the Stream analytics job! :)
If you take a look at the IoT Suite remote monitoring preconfigured solution (https://azure.microsoft.com/documentation/articles/iot-suite-remote-monitoring-sample-walkthrough/) you'll see that it persists telemetry in blob storage and maintains device status information in DocumentDb. This preconfigured solution gives you a working illustration of the points made in the previous answers.