I'm using Stream Analytics to process some RFID data in real time. The events from the RFID reader are sent to Event Hub as an input. The problem I'm facing is that the time in the events is in Unix time format, e.g. "TimeStamp":1460471242.22402. Oddly, when I test the query (not starting the job, but using the sample data from the input), the Unix time is converted to "2016-04-12T14:48:00.0000000Z", but when I start the SA job it fails, saying that the column 'timestamp' doesn't conform to the ISO 8601 standard. Is there any way to convert Unix time to a standard date format in SA without changing the raw input data?
My query is simple:
SELECT
EPCValue, Antenna, System.TimeStamp AS Time
INTO
dataoutput
FROM
datainput timestamp by TimeStamp
Please take a look at the sample from this page. It describes how to convert UNIX time to SQL datetime format
https://msdn.microsoft.com/en-us/library/mt573293.aspx
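In short, a minimal sketch of that approach for the query above, assuming TimeStamp is Unix epoch time in seconds (as the sample value suggests), is to compute the datetime from the epoch inside the TIMESTAMP BY clause:
SELECT
EPCValue, Antenna, System.TimeStamp AS Time
INTO
dataoutput
FROM
datainput
-- assumes TimeStamp holds Unix epoch seconds; multiplying by 1000 keeps the fractional part as milliseconds
TIMESTAMP BY DATEADD(millisecond, TimeStamp * 1000, '1970-01-01T00:00:00Z')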
I have an Event Hub that sends data to Time Series Insights, with the following message format:
{
"deviceId" : "Device1",
"time" : "2022-03-30T21:27:29Z"
}
I want to calculate the difference in seconds between the Event Hub EnqueuedTimeUtc property and time property.
I created a Time Series Insights environment with an Event Source without specifying the Timestamp property name; that way, the Time Series Insights Timestamp ($ts) property will be the EnqueuedTimeUtc property of the event.
Now with those two properties, using TSX (Time Series Expression Language), I want to do something like this:
$event.$ts - $event.time.DateTime
The problem I'm facing is that the result of that operation is a DateTime, but in Time Series Expressions there isn't a function to convert a DateTime to seconds, or to a Unix timestamp. Time Series Expression Doc
Is there a way of achieving this using Time Series Insights and TSX (Time Series Expression)?
Thanks!
TSI is a deprecated service in Azure, and it does not offer many features (built-in functions) to explore data. Therefore, I suggest you use Azure Data Explorer to work with the Event Hub data.
Azure Data Explorer provides a built-in datetime_diff() function, which lets you calculate the difference in many supported periods to fit your requirement, using simple Kusto Query Language.
datetime_diff(): Calculates calendarian difference between two datetime values.
Syntax:
datetime_diff(period,datetime_1,datetime_2)
Example:
second = datetime_diff('second',datetime(2017-10-30 23:00:10.100),datetime(2017-10-30 23:00:00.900))
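Applied to your original question, a minimal Kusto sketch would look like the following (assuming the Event Hub events are ingested into a table named Events with a datetime column time, and that the enqueued time is mapped into a column named enqueuedTime; those names are assumptions, not part of your payload):
// lag between the enqueue time and the device-reported time, in seconds
Events
| extend lagSeconds = datetime_diff('second', enqueuedTime, ['time'])
| project deviceId, ['time'], enqueuedTime, lagSeconds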
When running stream analytics, I get an error message:
"Dropping events due to improper timestamps. Stream Analytics only
supports ISO8601 format for DateTime values"
I have tried the following formats:
2017-09-19T13:17:29.0111070Z
2017-09-19T13:17:29.123456
2017-09-19 13:17:29.123456
2017-09-19T13:17:29.123
2017-09-19 13:17:29.123
However, when I use the Test button on the query in Stream Analytics, the output comes out fine. Also, when I comment out the TIMESTAMP BY clause, the query works, but then System.Timestamp in the SELECT statement does not return the correct time.
Is this a formatting issue or something else?
Firstly, as Vignesh Chandramohan mentioned, you can try using CAST to convert the expression to datetime, and check whether it returns a data conversion error indicating that some input value cannot be cast to type 'datetime'.
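For example, a throwaway test query along these lines (a sketch only; eventTime and [test-output] are placeholders for your column and output names):
SELECT
-- if any value cannot be converted, the job reports a data conversion error for it
CAST(eventTime AS datetime) AS CastedTime
INTO
[test-output]
FROM
[input]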
Secondly, many factors can cause a no-output issue. For example: a WHERE clause in the query filters out all events, preventing any output from being generated; the timestamp of the events is before the job start time, so the events are dropped; and so on.
For detailed steps to debug Azure Stream Analytics jobs, please check Diagnose and solve problems in the Azure portal, or this article: Troubleshooting guide for Azure Stream Analytics.
I am using Azure Event Hub for the collection of time-based events and have connected Azure Stream Analytics (ASA) to it.
This results in losing the timezone info at the ASA level.
What I ascertained is the following:
I have sent data in JSON format containing a string with a timestamp compatible with ISO 8601. e.g.:
"event_timestamp": "2016-09-02T19:51:38.657+02:00"
I checked by means of ServiceBus Explorer (thanks to the guys who wrote this tool) that this string arrived exactly as-is in Event Hub.
In Stream Analytics I added the Event Hub as an input. When I use the SAMPLE DATA option in the Azure portal, this results in data containing: "event_timestamp":"2016-09-02T17:51:38.6570000"
Why is Stream Analytics removing timezone info???
According to ISO 8601, not specifying a timezone in a timestamp means that the timestamp is interpreted as local time. Does that mean the timezone where the Azure resource is running? How can I use geo-replication in that case?
Does this mean that after consuming the data and presenting it in a dashboard, all times are relative to the time of the server where Stream Analytics runs?
Do I need to add the timezone information separately in the JSON payload and reconstruct it afterwards?
My conclusion is that ASA actually removes/destroys information from my data stream.
Imagine this ASA query: SELECT * INTO [myoutput] FROM [myinput]
This would change the content (*) of my data. All strings that appear to be a datetime with timezone info will be converted.
In my opinion this is very unwanted behaviour.
I am very interested in the opinions of others in this forum.
Everything in Azure runs in the UTC timezone, unless otherwise supported and explicitly configured (not many services support setting a timezone).
If you look at your quoted samples closely, you will notice that the timestamp is converted to UTC in ASA; that's why the timezone info is missing:
Sent to Event Hub: "event_timestamp": "2016-09-02T19:51:38.657+02:00"
Received in ASA: "event_timestamp":"2016-09-02T17:51:38.6570000"
Note that your event is sent as 19:51:38.657 +02:00 and ASA reads 17:51:38.6570000, which is exactly the same instant expressed in UTC.
UPDATE
I am not an expert on the ISO standard, but here are some excerpts from the ASA documentation:
Azure Stream Analytics data types
datetime: Defines a date that is combined with a time of day with fractional seconds, based on a 24-hour clock and relative to UTC (time zone offset 0).
Conversions:
datetime: a string is converted to datetime following the ISO 8601 standard
It is documented that datetime is in UTC, hence there is no need to specify it explicitly. Whether this conforms to the ISO standard I cannot tell, first because Wikipedia is not an ISO document, and second because I am not an ISO expert.
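If your dashboard needs the original local time, one option (a sketch only, assuming you add a separate numeric field such as offsetMinutes to the payload; ASA itself does not keep the offset) is to carry the offset next to the timestamp and re-apply it in the query:
SELECT
event_timestamp,
-- offsetMinutes is a hypothetical field added by the sender, e.g. 120 for +02:00
DATEADD(minute, offsetMinutes, CAST(event_timestamp AS datetime)) AS LocalTime
INTO
[myoutput]
FROM
[myinput]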
I have an Event Hub in the Azure cloud that takes messages containing a timestamp value and other parameters.
The timestamp is aligned to Stream Analytics using the clause
TIMESTAMP BY [TimeStamp]
This is the Stream Analytics query (the input is the Event Hub, the output in this case is a blob)
SELECT
DateAdd(minute, -1, System.Timestamp) as FromTimestamp, System.Timestamp as ToTimestamp,
[MachineType], [MachineNumber], [Part], [PartNumber], [ValueKind], AVG(Value) AS AverageValue
INTO
[blob-avg]
FROM
[input]
TIMESTAMP BY [TimeStamp]
WHERE [ValueKind]='RPM' OR [ValueKind]='CUR' OR [ValueKind]='POW'
GROUP BY [MachineType], [MachineNumber], [Part], [PartNumber], [ValueKind], SlidingWindow(minute, 1)
I think the timestamp of the message will be used as the timestamp to compare, but how is it evaluated?
At the UTC time? Say that in the message I have a timestamp of 12:00 (GMT+2), and UTC now is 10:00.
Does the window consider the data as having arrived 2 hours ago instead of now, i.e. at timestamp 10:00 (GMT+2)? (It actually seems to me to behave like that.)
And what happens if a message arrives with a delay greater than 2 hours? Say a message arrives with one day of delay; will the window be recalculated?
The [TimeStamp] column will be converted to datetime; if the value carries a GMT offset, it is taken into account, and it is safe to assume that everything is converted to UTC when time-related calculations are made.
Azure Stream Analytics continuously reads data from the source, and the late arrival policy plus the window decide how late events are handled.
Please have a look at
https://msdn.microsoft.com/en-us/library/azure/mt674682.aspx
and
https://blogs.msdn.microsoft.com/streamanalytics/2015/05/17/out-of-order-events/
for more details about out of order policies.
For your specific example, if a message arrives with a delay of more than 2 hours and your late arrival policy is set to drop events, the events will be dropped. If it is set to adjust, the timestamp will be adjusted to the current processing time.
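To see how the policy affects your events, a small diagnostic sketch (assuming the Event Hub input exposes the standard EventEnqueuedUtcTime property; [debug-output] is a placeholder) emits the raw timestamp, the enqueue time, and the timestamp ASA actually assigns:
SELECT
[TimeStamp], EventEnqueuedUtcTime,
-- System.Timestamp reflects any adjustment made by the late arrival policy
System.Timestamp AS AssignedTimestamp
INTO
[debug-output]
FROM
[input] TIMESTAMP BY [TimeStamp]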
I have a very basic setup, in which I never get any output if I use the TIMESTAMP BY statement.
I have a stream analytics job which is reading from Event Hub and writing to the table storage.
The query is the following:
SELECT
*
INTO
MyOutput
FROM
MyInput TIMESTAMP BY myDateTime;
If the query uses the TIMESTAMP BY clause, I never get any output events. I do see incoming events in the monitoring, and there are no errors in either the monitoring or the maintenance logs. I am pretty sure that the source data has the right column in the right format.
If I remove the TIMESTAMP BY clause, then everything works fine. The reason I need TIMESTAMP BY in the first place is that I need to write a number of queries in the same job, writing various aggregations to different outputs, and if I use TIMESTAMP BY in one query, I am required to use it in all the other queries as well.
Am I doing something wrong? Perhaps SELECT * does not play well with TIMESTAMP BY? I just did not find any documentation explaining that...
{"myDateTime":"2015-08-02T10:59:02.0000000Z", "EventEnqueuedUtcTime":"2015-08-07T10:59:07.6980000Z"}
Late tolerance window: 00.00:00:05
All of your events are considered late arriving because myDateTime is 5 days before EventEnqueuedUtcTime. Can you try sending new events where myDateTime is in UTC and is "now" so it matches within a couple of seconds?
Also, when you started the job, what did you pick as the job start date and time? Can you make sure you pick a date before the myDateTime values? You might try this first.
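To confirm the diagnosis before changing anything, a quick check (a sketch; [debug-output] is a placeholder output) is to emit the gap between myDateTime and EventEnqueuedUtcTime without a TIMESTAMP BY clause, so nothing gets dropped:
SELECT
myDateTime, EventEnqueuedUtcTime,
-- large positive values explain why events fall outside the late tolerance window
DATEDIFF(second, CAST(myDateTime AS datetime), EventEnqueuedUtcTime) AS LatenessSeconds
INTO
[debug-output]
FROM
MyInput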