How to Separate Data from Multiple Devices on Microsoft Azure Stream Analytics

I am currently trying to connect two different devices to the IoT Hub, and I need to separate the data from each device. To do so, I tried configuring my Stream Analytics query like this:
SELECT
deviceId, temperature, humidity, CAST(iothub.EnqueuedTime AS datetime) AS event_date
INTO
NodeMCUOutput
FROM
iothubevents
WHERE
deviceId = "NodeMCU1"
However, for some reason, no output is shown when the WHERE clause is in the query (output is shown without it, but then the data is not filtered). I need the WHERE clause to filter the data the way I want. Am I missing something? Are there any solutions to this? Thanks a lot. Cheers!

The device ID and other properties that are not in the message itself are included as metadata on the message. You can read that metadata using the GetMetadataPropertyValue() function. This should work for you:
SELECT
GetMetadataPropertyValue(iothubevents, 'IoTHub.ConnectionDeviceId') as deviceId,
temperature,
humidity,
CAST(GetMetadataPropertyValue(iothubevents, 'IoTHub.EnqueuedTime') AS datetime) AS event_date
INTO
NodeMCUOutput
FROM
iothubevents
WHERE
GetMetadataPropertyValue(iothubevents, 'IoTHub.ConnectionDeviceId') = 'NodeMCU1'

I noticed you use double quotes in the WHERE clause.
You need single quotes to get a match on strings. In this case it would be
WHERE deviceId = 'NodeMCU1'
If the deviceId is the one from the IoT Hub metadata, Matthijs' answer will help you retrieve it.
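
Since the underlying goal is to separate the two devices, one common pattern is to run one SELECT ... INTO per device in the same job. A minimal sketch combining both answers; the second device name and both output names are assumptions:
SELECT
    GetMetadataPropertyValue(iothubevents, 'IoTHub.ConnectionDeviceId') AS deviceId,
    temperature,
    humidity
INTO
    NodeMCU1Output -- hypothetical output for the first device
FROM
    iothubevents
WHERE
    GetMetadataPropertyValue(iothubevents, 'IoTHub.ConnectionDeviceId') = 'NodeMCU1'

SELECT
    GetMetadataPropertyValue(iothubevents, 'IoTHub.ConnectionDeviceId') AS deviceId,
    temperature,
    humidity
INTO
    NodeMCU2Output -- hypothetical output for the second device
FROM
    iothubevents
WHERE
    GetMetadataPropertyValue(iothubevents, 'IoTHub.ConnectionDeviceId') = 'NodeMCU2'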

Related

send datetime with offset field in Stream Analytics

I'm trying to send a Timestamp field which is ISO 8601 with an offset ("2023-02-01T11:11:12.2220000+03:00").
Azure doesn't really work with offsets; I first encountered that when sending data to Event Hub.
I was hoping to resolve this by splitting the timestamp field into two fields:
timestamp: 2023-02-01T11:11:12.2220000
offset: +03:00
and combining them in the SA query.
This seemed to work in the query editor, where the test output is shown as a correct timestamp+offset.
However, when the data is sent to the output (in this case SQL, field type datetimeoffset), the value looks like this:
2023-02-01T08:11:12.2220000+00:00
I suspect this is because the timestamp field type in SA is datetime (as seen in the query editor's test results window);
even if I cast to nvarchar, the field type is still datetime.
Is there a way to force SA to use specific types for fields (in this case, treat the field as a string and not a datetime)?
Or, in general, how can I pass a value like "2023-02-01T11:11:12.2220000+03:00" through SA without altering it? Bonus points if it can be done in Event Hub as well.
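
A minimal sketch of the combining step described above, in ASA SQL; the input and output names and the field names [timestamp] and [offset] are assumptions:
SELECT
    -- Rebuild the original value as text; casting to nvarchar(max) first
    -- keeps the result a string rather than a datetime.
    CONCAT(CAST([timestamp] AS nvarchar(max)), [offset]) AS TimestampWithOffset
INTO
    [SqlOutput]
FROM
    [EventHubInput]
As the question notes, if ASA itself infers datetime for the expression, it will normalize the value to UTC before the output ever sees it, which is why keeping the value as a string inside the query matters.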

Azure Data Explorer & IoT Hub Table Mapping Configuration

Recently I have been trying to get data from IoT Hub into Data Explorer, along with the MXChip AZ3166. However, I am unable to map the EnqueuedTime variable onto the Data Explorer table, whereas the other variables are mapped just fine. I've inserted my code and screenshots to help describe the issue. May I know what the issue might be? I've tried using different types such as datetime and string for the EnqueuedTime variable, but it still did not show up in the table. Thank you.
.create table TelemetryIoTHub (EnqueuedTime: datetime, Temperature: real, Humidity: real, Pressure: real, GyroX: real, GyroY: real, GyroZ: real, AccelX: real, AccelY: real, AccelZ: real, MagX: real, MagY: real, MagZ: real)
.create table TelemetryIoTHub ingestion json mapping 'DataMapping' '[{"column":"EnqueuedTime","path":"$.enqueuedTime","datatype":"datetime"},{"column":"Humidity","path":"$.humidity","datatype":"real"},{"column":"Pressure","path":"$.pressure","datatype":"real"},{"column":"Temperature","path":"$.temperature","datatype":"real"},{"column":"AccelX","path":"$.accelX","datatype":"real"},{"column":"AccelY","path":"$.accelY","datatype":"real"},{"column":"AccelZ","path":"$.accelZ","datatype":"real"},{"column":"GyroX","path":"$.gyroX","datatype":"real"},{"column":"GyroY","path":"$.gyroY","datatype":"real"},{"column":"GyroZ","path":"$.gyroZ","datatype":"real"},{"column":"MagX","path":"$.magnetX","datatype":"real"},{"column":"MagY","path":"$.magnetY","datatype":"real"},{"column":"MagZ","path":"$.magnetZ","datatype":"real"}]'
(Screenshots: Table Output, Telemetry Output.)
Updated:
You should use the system name for this column; in this case the path should be $.iothub-enqueuedtime. You also need to enable iothub-enqueuedtime under the Event system properties of the data connection. See the example in the ingest-from-IoT-Hub doc:
{ "column": "enqueuedtime", "Properties": { "Path": "$.iothub-enqueuedtime" } }

Azure Stream Analytics: Regex in Reference Data

I have an Azure Stream Analytics job that uses an EventHub and a Reference data in Blob storage as 2 inputs. The reference data is CSV that looks something like this:
REGEX_PATTERN,FRIENDLY_NAME
115[1-2]{1}9,Name 1
115[3-9]{1}9,Name 2
I then need to look up an attribute of the incoming event in EventHub against this CSV to get the FRIENDLY_NAME.
The typical way of using reference data is the JOIN clause. But in this case I cannot use it, because such regex matching is not supported with the LIKE operator.
UDF is another option, but I cannot seem to find a way of using reference data as a CSV inside the function.
Is there any other way of doing this in an Azure Stream Analytics job?
As far as I know, JOIN is not supported in your scenario: the join key has to be a specific value, it can't be a regex.
Thus, reference data is not a good fit here, because it is meant to be used in ASA SQL like below:
SELECT I1.EntryTime, I1.LicensePlate, I1.TollId, R.RegistrationId
FROM Input1 I1 TIMESTAMP BY EntryTime
JOIN Registration R
ON I1.LicensePlate = R.LicensePlate
WHERE R.Expired = '1'
A join key is required, which is why a reference data input doesn't really help you here.
Your idea is to use a UDF script and load the data inside the UDF to compare against hardcoded regex data. That is not easy to maintain. Maybe you could consider my workaround:
1. Since you have different sets of reference data, group them and store each group as a JSON array, assigning a group id to every group. For example:
Group Id 1:
[
{
"REGEX":"115[1-2]{1}9",
"FRIENDLY_NAME":"Name 1"
},
{
"REGEX":"115[3-9]{1}9",
"FRIENDLY_NAME":"Name 2"
}
]
....
2. Add a column carrying the group id and set an Azure Function as the output of your ASA SQL (see the sketch below). In the Azure Function, read the group id column, load the corresponding JSON array group, then loop over its rows to match the regex and save the data to its final destination.
I think an Azure Function is more flexible than a UDF in an ASA SQL job. Additionally, this solution may be easier to maintain.
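
A sketch of what the ASA side of step 2 might look like; the input and output names and the Code field are hypothetical, and the Azure Function bound to [AzureFunctionOutput] does the actual regex matching:
SELECT
    e.Code,      -- hypothetical event attribute to match against REGEX_PATTERN
    1 AS GroupId -- hypothetical group id telling the Function which pattern group to load
INTO
    [AzureFunctionOutput] -- an ASA output bound to the Azure Function
FROM
    [EventHubInput] e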

filtering in Azure SAQL if an element does not exist in the input data

I am trying to write a SAQL query on data which is coming from Event Hub in JSON format.
The input to the Azure Stream Analytics job is shown below.
{"ver":"2019-12-28 18:41:45.4184730","Data":"Data01","d":{"IDNUM":"XXXXX01","Time1":"2017-12-20T00:00:00.0000000Z","abc":"610000","efg":"0000","XYZ":"00000","ver":"2017-12-20T18:41:45.4184730Z"}}
{"ver":"2019-12-28 18:41:45.4184730","Data":"Data01","d":{"IDNUM":"XXXXX02","Time1":"2017-12-20T00:00:00.0000000Z","abc":"750000","efg":"0000","XYZ":"90000","ver":"2017-12-20T18:41:45.4184730Z"}}
{"ver":"2017-01-01 06:28:52.5041237","Data":"Data02","d":{"IDNUM":"XXXXX03","acc":-10.7000,"PQR":35.420639038085938,"XYZ":139.95817565917969,"ver":"2017-01-01T06:28:52.5041237Z"}}
{"ver":"2017-01-01 06:28:52.5041237","Data":"Data02","d":{"IDNUM":"XXXXX04","acc":-8.5999,"PQR":35.924240112304688,"XYZ":139.6097412109375,"ver":"2017-01-01T06:28:52.5041237Z"}}
In the first two rows the attribute Time1 is available, whereas in the last two rows the Time1 attribute itself is not present.
I have to store the data into Cosmos DB based on the Time1 attribute in the input data.
Path in the JSON data: input.d.Time1.
Data that has Time1 must go into one Cosmos DB container, and data without Time1 into another container.
I tried the below SAQL.
SELECT [input].ver,
[input].Data,
d.*
INTO [cosmosDB01]
FROM [input] PARTITION BY PartitionId
WHERE [input].Data is not null
AND [input].d.Time1 is not null
SELECT [input].ver,
[input].Data,
d.*
INTO [cosmosDB02]
FROM [input] PARTITION BY PartitionId
WHERE [input].Data is not null
AND [input].d.Time1 is null
Is there any other way, like an IS EXISTS keyword, in a Stream Analytics query?
To my knowledge, there is no is_exists or is_defined SQL built-in keyword in ASA so far. You have to follow the approach you mentioned in the question to deal with the multiple-outputs scenario.
(Similar case: Azure Stream Analytics How to handle multiple output table?)
Of course, you could submit feedback to the ASA team to push this forward.
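
If you want to avoid repeating the projection in both statements, you could factor it into a shared step with a WITH clause. A sketch based on the queries above, with [cosmosDB02] as the output for the second container:
WITH Base AS (
    SELECT [input].ver,
        [input].Data,
        d.*
    FROM [input] PARTITION BY PartitionId
    WHERE [input].Data IS NOT NULL
)
SELECT * INTO [cosmosDB01] FROM Base WHERE Time1 IS NOT NULL
SELECT * INTO [cosmosDB02] FROM Base WHERE Time1 IS NULL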

Forcing a string field to DateTime in WCF query with Azure Table Storage

So, a quick overview of what I'm doing:
We're currently storing events to Azure Table storage from a Node.js cloud service using the "azure-storage" npm module. We're storing our own timestamps for these events in storage (as opposed to using the Azure defined one).
Now, we have coded a generic storage handler script that for the moment just stores all values as strings. To save refactoring this script, I was hoping there would be a way to tweak the query instead.
So, my question is, is it possible to query by datetime where the stored value is not actually a datetime field and instead a string?
My original query included the following:
.where( "_timestamp ge datetime'?'", timestamp );
In the above code I need to somehow have the query treat _timestamp as a datetime instead of a string...
Would something like the following work, or what's the best way to do it?
.where( "datetime _timestamp ge datetime'?'", timestamp );
AFAIK, if the attribute type is String in an Azure Table, you can't convert that to DateTime. Thus you won't be able to use .where( "_timestamp ge datetime'?'", timestamp );
If you're storing your _timestamp in yyyy-MM-ddTHH:mm:ssZ format, then you could simply do a string-based query like
.where( "_timestamp ge '?'", timestamp );
and that should work just fine, because ISO 8601 timestamps of equal length sort lexicographically in chronological order. The caveat is that this query is going to do a full table scan and will not be an optimized query. However, if you're storing in some other format, you may get different results.
