I'm trying to stream data from my device to Azure IoT Hub to Stream Analytics to Power BI.
Power BI implemented a new way to display streaming data. I would like to generate a line chart via the "Add tile" button on a Power BI dashboard, which takes care of auto-refreshing the streaming data chart.
My current streaming data (which works perfectly well when displayed statically in Power BI via "Create report") produces a rather weird line chart in streaming data mode:
(screenshot of the resulting line chart)
My guess is that the arrival of new data in Power BI is not in chronological order. New data may be placed in the line chart in the correct temporal position, but the line connecting the values is drawn in the order of arrival. This might cause the line to "jump back" in time?!
To minimize wrong ordering, I am trying to prevent "adjusting other events" as well as any acceptance of wrongly ordered events in the Stream Analytics event ordering configuration (screenshot of the configuration).
The problem: with this configuration the Stream Analytics Job creates no output.
My ASA Query looks like this:
SELECT
    Name,
    Value,
    Timecreated,
    CAST(latest AS float) AS latest,
    COUNT(*)
INTO
    [ToPowerBI]
FROM
    [Eing-CANdata] TIMESTAMP BY Timecreated
GROUP BY
    Name, Value, Timecreated, latest,
    TumblingWindow(Duration(second, 1))
The "Timecreated" is formatted this way:
2017-03-06T11:51:22.246235Z
its accepted by Azure as timestamp.
Changing the configuration to accepting "out of order events with a timestamp in" the range of 10 seconds doesn't produce any output either.
The only way to create output is changing the configuration to "adjusting other events." But the Azure information tells me that "Adjust keeps the events and changes their timestamps". This would reorder the data which is not what I want.
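One way to check whether the events really arrive later than the configured tolerance is to compare the application timestamp with the arrival time recorded by the input. A minimal diagnostic sketch, assuming the same [Eing-CANdata] input and a hypothetical [DebugOutput] blob output; EventEnqueuedUtcTime is the arrival-time property Stream Analytics exposes for Event Hub/IoT Hub inputs, and TIMESTAMP BY is deliberately left out so the ordering policies cannot drop anything:
SELECT
    Timecreated,
    EventEnqueuedUtcTime,
    DATEDIFF(second, Timecreated, EventEnqueuedUtcTime) AS ArrivalLagSeconds
INTO
    [DebugOutput]
FROM
    [Eing-CANdata]
If ArrivalLagSeconds is consistently larger than the configured tolerance, a "Drop" policy silently discards those events, which would explain the empty output.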
My goals:
get data through Stream Analytics as fast as possible
avoid adjusting the timestamps, as I need the original ones!
ultimately get a proper (& "real-time-like") streaming data line chart in PowerBI
My question(s): Why is Stream Analytics not outputting any data in "Drop other events" mode? How can I get output from Stream Analytics in this mode?
(I have an important presentation coming up and your help would be greatly appreciated!)
Related
My Azure Stream Analytics Job does not detect any input events if I use reference data in the query. When I'm using only streaming data it works well.
Here is my query:
SELECT
    v.localization AS Station,
    v.lonn AS Station_Longitude,
    v.latt AS Station_Latitude,
    d.lat AS My_Latitude,
    d.lon AS My_Longitude
INTO [closest-station]
FROM eventhub d
CROSS JOIN [stations] v
WHERE ST_DISTANCE(CreatePoint(d.lat, d.lon), CreatePoint(v.latt, v.lonn)) < 300
I used both Event Hub and blob as the input and the result was the same: it works only without reference data.
Before you ask:
When I test the query with sample reference data (uploading the exact same file that is stored in the reference data location), it returns the expected values.
I've tested both inputs and the tests completed successfully.
The data comes from a Logic App which copies it from Dropbox to the Event Hub or the storage account (I've tested both scenarios) that are used as inputs in Azure Stream Analytics. Even though I can see this ran successfully, still no input events appear in ASA.
The idea is to get the coordinates of the stations closer than 300 m to my location.
Solved: you have to explicitly specify the reference file in the reference data input path pattern. Specifying only the container doesn't work, even if there is only one file inside (see the example below).
The Stream Analytics job will wait indefinitely for the blob to become available.
As described here: Use reference data for lookups in Stream Analytics
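For illustration (the container and file names here are made up): a path pattern such as
stations-reference/stations.csv
lets the job resolve the blob, whereas pointing the input at the container stations-reference alone leaves the job waiting for a blob that never appears.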
I am using Stream Analytics to insert data into table storage. This works when all I want to do is add new rows. However, I now want to insert or update existing rows. Is this possible with Stream Analytics/Table storage?
The current implementation of Stream Analytics output to Azure Table uses the InsertOrReplace API. So as long as your new data is cumulative (not just the deltas), it should simply work.
On the other hand, if you only need upsert (insert or update), you could consider DocumentDB output.
If you would like something more customized, you could also consider a trigger on your SQL table output.
cheers
Chetan
In short, no. Stream Analytics isn't an ETL tool.
However, you might be able to pass the output to a downstream SQLDB table. Then have a second stream job and query that joins the first to the table using left/right and inner joins. Just an idea, not tested, and not recommended.
OR
Maybe output the streamed data to a SQL DB landing table or Data Lake Store. Then perform a merge there before producing the output dataset. This would be a more natural approach.
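A minimal sketch of such a merge, assuming a landing table dbo.StreamLanding (filled by the Stream Analytics SQL output) and a target table dbo.DeviceState keyed on DeviceId; all object and column names here are made up:
-- Upsert the latest landing row per device into the target table.
MERGE dbo.DeviceState AS target
USING (
    -- keep only the newest row per DeviceId in case the landing table holds several
    SELECT DeviceId, Value, Timecreated
    FROM (
        SELECT DeviceId, Value, Timecreated,
               ROW_NUMBER() OVER (PARTITION BY DeviceId ORDER BY Timecreated DESC) AS rn
        FROM dbo.StreamLanding
    ) newest
    WHERE rn = 1
) AS source
ON target.DeviceId = source.DeviceId
WHEN MATCHED THEN
    UPDATE SET target.Value = source.Value,
               target.Timecreated = source.Timecreated
WHEN NOT MATCHED THEN
    INSERT (DeviceId, Value, Timecreated)
    VALUES (source.DeviceId, source.Value, source.Timecreated);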
I created a Stream Analytics query with Event Hub as the input and a Power BI dataset as the output. The Stream Analytics query dumps all the logs into the dataset. As the data is continuous, the size of the dataset becomes huge and the visualizations take a lot of time to render.
Is there any way to retain only the latest 2000 values in the dataset to reduce visualization rendering time?
I tried using sliding window but that also does not seem to solve the problem.
You can use the developer portal to purge old data in your table: http://docs.powerbi.apiary.io/#reference/datasets/table-rows/clear-the-rows-in-a-table?console=1
I have Stream Analytics in Azure that sends data to Power BI. The data is complex JSON and I am trying to write a proper Stream Analytics query to get the data I need.
I wrote a query, but it returns a record and I cannot figure out what fields are inside it. In Power BI, if I drag the field in, it just says Record. I assume this is probably a list of JSON records. Is there any way to see the record as JSON or raw text in Power BI (or anywhere else)? I want to see the output of Stream Analytics so I can improve my query.
Instead of outputting to Power BI, you could output to blob storage, where you'll be able to see the raw data.
Alternatively, you could use the Query tab in the old Azure portal to "test" your Stream Analytics query on a sample input file. The testing feature will show you the schema of your output.
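Once the output schema is visible, the nested record can usually be flattened in the Stream Analytics query itself, using dot notation or GetRecordPropertyValue, so Power BI receives plain columns instead of a Record. A small sketch with made-up input and field names:
SELECT
    deviceId,
    telemetry.temperature AS Temperature,
    GetRecordPropertyValue(telemetry, 'humidity') AS Humidity
INTO
    [ToPowerBI]
FROM
    [MyEventHub]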
So I am trying to use Azure Data Factory to replace the SSIS system we have in place, and I am having some trouble...
The process I want to follow is to take a list of projects and a list of clients and create a report of the clients and projects we have. These lists update frequently, so I want to update this report every hour. To combine the data, I will be using Power BI Pro, so Data Factory just needs to load the data into a usable format.
My source right now is a call to an API that returns a list of projects. However, this data isn't separated by time at all. I don't see any sort of history. Same goes for the list of clients.
What should the availability for my dataset be?
You may use a custom activity in ADF to call the API that returns the list of projects. The custom activity will then write that data in the right format to the destination.
Example of a custom activity in ADF: https://azure.microsoft.com/en-us/documentation/articles/data-factory-use-custom-activities/
The availability (frequency) will be the cadence at which you wish to run this operation; in your case, hourly.
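For the hourly refresh described above, a sketch of the availability section in the Data Factory (v1) dataset JSON; the rest of the dataset definition is omitted and the values are only an illustration:
"availability": {
    "frequency": "Hour",
    "interval": 1
}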