So I am querying data directly from OMS Log analytics using PowerBI Desktop, and I believe there is an 8MB hard limit on the data returned by the query. The problem I have is that I need to query about 30 000 rows, but hit the 8MB limit around 18 000 rows. Is it possible to break the query up, for example, query1 would return rows 1 - 18 000, query2 would return 18 001 - 28 000 and so on, then I can merge the queries in PowerBI to give me a view of all the data?
The problem is that my experience in this field, DAX in particular, is quite limited, so I don't know how to specify this in the Advanced Editor. Any help here would be highly appreciated.
Thanks!
Same Issue. Solved it.
My Need:
I have a table in Azure Log Analytics (LA) that accumulates ~35K rows per day. I needed to get all rows from LA into Power BI for analysis.
My Issue:
I crafted the KQL query I wanted in the LA Logs web UX. I then selected the "Export -> Export to Power BI M query" feature, pasted it into a BLANK query in Power BI, and authed. And I noticed a few bizarre behaviors:
1) - Like you said, I was getting a rolling ~35K rows of data; each query would trim just a bit off the first date in my KQL range.
2) - Also, I found that for each day, the query would opportunistically trim off some rows - like it was 'guessing' what data I didn't need to fit into a limit.
3) - No matter what KQL |where TimeGenerated >= ago(xd) clause I wrote, I clearly wasn't going to get back more than the limits it held me to.
My Solution - and it works great.
In Power Query, I created a new blank table in Power Query/M (not a DAX table!). In that table I used DateTimeZone.UtcNow() to start it off with today's date, then I added a column called [Days Back] and added rows for -1, -2, -3 ... -7. Then, with some M, I added another column that offsets today's date by [Days Back], giving me a history of dates.
Now I have a table from which I can iterate over each date in history and pass it to my KQL query as a parameter for: | where TimeGeneratedDate == todatetime('"& Date.ToText(TimeGeneratedDateLoop) & "')
As you can see, after I edited my main LA query to take TimeGeneratedDateLoop as a parameter, I can now get each full day's worth of records without hitting the LA limit. Note that in my case no single day breaches the 8MB limit. If yours does, you can attack this problem by making 12-hour breakdowns instead of a full day.
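For reference, here's a minimal sketch of that driver/date table in M (the column names [Days Back] and [Date] and the 7-day range are just what I used; adjust them to your own window):

let
    // Today's date (UTC) as the starting point
    Today = DateTimeZone.UtcNow(),
    // One row per day of history to pull (0 = today, -1 = yesterday, ...)
    DaysBackList = {0, -1, -2, -3, -4, -5, -6, -7},
    DaysBackTable = Table.FromList(DaysBackList, Splitter.SplitByNothing(), {"Days Back"}),
    // Offset today by [Days Back] and keep just the date part
    WithDates = Table.AddColumn(
        DaysBackTable,
        "Date",
        each Date.From(Date.AddDays(Today, [Days Back])),
        type date
    )
in
    WithDates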
Here's my final M query for the function:
NOTE: I also removed this line from the pre-generated query: "prefer"="ai.response-thinning=true" <- I don't know if it helped, but setting it to false didn't work.
let
    FxDailyQuery = (TimeGeneratedDateLoop as date) =>
        let
            AnalyticsQuery =
                let
                    // Call the Log Analytics query API, injecting the loop date into the KQL text
                    Source = Json.Document(Web.Contents(
                        "https://api.loganalytics.io/v1/workspaces/xxxxx-202d-xxxx-a351-xxxxxxxxxxxx/query",
                        [
                            Query = [#"query" = "YourLogAnalyticsTbl
                                | extend TimeGeneratedDate = bin(TimeGenerated, 1d)
                                | where notempty(Col1)
                                | where notempty(Col2)
                                | where TimeGenerated >= ago(30d)
                                | where TimeGeneratedDate == todatetime('" & Date.ToText(TimeGeneratedDateLoop) & "')
                                ", #"x-ms-app" = "OmsAnalyticsPBI"],
                            Timeout = #duration(0, 0, 4, 0)
                        ]
                    )),
                    // Map Log Analytics column types to Power Query types
                    TypeMap = #table({"AnalyticsTypes", "Type"}, {
                        {"string", Text.Type},
                        {"int", Int32.Type},
                        {"long", Int64.Type},
                        {"real", Double.Type},
                        {"timespan", Duration.Type},
                        {"datetime", DateTimeZone.Type},
                        {"bool", Logical.Type},
                        {"guid", Text.Type},
                        {"dynamic", Text.Type}
                    }),
                    // Convert the JSON response into a typed table
                    DataTable = Source[tables]{0},
                    Columns = Table.FromRecords(DataTable[columns]),
                    ColumnsWithType = Table.Join(Columns, {"type"}, TypeMap, {"AnalyticsTypes"}),
                    Rows = Table.FromRows(DataTable[rows], Columns[name]),
                    Table = Table.TransformColumnTypes(Rows, Table.ToList(
                        ColumnsWithType,
                        (c) => {c{0}, c{3}}
                    ))
                in
                    Table
        in
            AnalyticsQuery
in
    FxDailyQuery
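To tie it together, I invoke the function once per row of the date table and stack the results. Something like the following works; "DateHistory" is just my name for the driver table built above, so swap in whatever you called yours:

let
    // DateHistory is the date/driver table built earlier, with a [Date] column of type date
    Source = DateHistory,
    // Call the daily query function once per date; each call returns only that day's rows
    WithDailyData = Table.AddColumn(Source, "DailyData", each FxDailyQuery([Date])),
    // Stack all the per-day tables into a single result set
    Combined = Table.Combine(WithDailyData[DailyData])
in
    Combined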
Related
In Cognos Analytics I have a dataset containing rows with data (used disk space in MB), with each row being either February or June. Because I want to compare the two months, I want to create two new variables: one with the February data and one with the June data.
In the Query editor I've tried: count (MB) when month = 'February'. This, and a couple of other entries, don't work.
I wonder if anyone can provide me the right line of code.
Thanks in advance!
Try this:
Go to query explorer
Create a query for each month
Join the 2 queries (this will result in a 3rd query)
At this point you should be able to handle each month as a separate data item
I have a Kusto Query to analyze my app data.
My query shall only show the data from the live app from store, and not the data from our test installations of QA.
So far, my only way to distinguish the data is that the live data is continuous: there is data every day, without a gap, from a certain date up to now.
The testing data is also generated data, but it shows up as chunks of data for about one or two days in a row, followed by a gap.
Here is a screenshot to show how it looks.
So basically what I want is to cut off all the data before the app went live.
And no, I don't want to manually edit my script each time we go live and change the release date. I want to somehow find out the release date with a sophisticated Kusto query.
Like: Get me all timestamps where every consecutive day has data
I just have no idea how to put this into Kusto.
Can you guys help me here?
Best Regards,
Maverick
You can find the last gap in the data using a combination of summarize and prev function, then filter to include only the data after the gap (assuming T is the source dataset):
let lastGap = toscalar(T
| summarize by Timestamp=bin(Timestamp, 1d)
| order by Timestamp asc
| extend gap = Timestamp - prev(Timestamp)
| where gap > 1d
| summarize lastGap = max(Timestamp));
T
| where Timestamp >= lastGap
I am using make-series to create an error dashboard showing events over a given period at a specified interval like so:
make-series dcount(id) default=0 on timestamp from ago(30d) to now() step 8h
This works great, and displays the data as expected. However this specifies an exact date range (30 days ago to now), and I would like to make this use the time range picked by the user on the dashboard (24 hours, 48 hours, etc.).
I know it is possible to get this behavior using summarize, however summarize does not easily allow for setting a default value of zero per timestamp bin (as far as I know).
Is it possible to use the make-series operator without defining a hardcoded date range, and instead use the time range set for a dashboard?
Unfortunately, it is impossible as of now.
You can take a look at this user feedback and upvote for it: Retrieve the portal time span and use it inside the kusto query.
Whilst this is not officially supported (i.e. there is no variable you can use to retrieve the values), you can work around this with a bit of a hack.
For context, I am displaying some aggregations from Azure Container Insights on my dashboards and I wanted to use make-series instead of summarize - the latter does not return empty bins so leaves gaps in graphs where you have no data returned in that bin; however, make-series requires explicit start/end times and a grain.
Given the nature of the above, I have access to a large table of data that is constantly updated (ContainerLog), which gives me a way to find a close approximation of the date range (and any inaccuracy is not a problem as I am reporting on the data of this table anyway).
// All tables with Timestamp or TimeGenerated columns are implicitly filtered, so we can retrieve a very close approximation of min and max here
let startDate = toscalar(ContainerLog | summarize min(TimeGenerated));
let endDate = toscalar(ContainerLog | summarize max(TimeGenerated));
// The regular query sits here, and the above variables can be passed in to make-series
MyLogFunction
| make-series Count=count() default=0 on Timestamp in range(startDate, endDate, 30m) by Severity
| render columnchart with ( legend=hidden )
I have a column of data [Sales ID] that is bringing in duplicate data for an analysis. My goal is to limit the data to pull unique Sales IDs for the max day of every month in the analysis only (instead of daily). I'm basically trying to get it to only pull in unique Sales ID values for the last day of every month in the analysis, and if the current day is the last day so far, then it should pull that in. So it should pull in the MAX date in any given month. How do I write an expression with the [Sales ID] column and [Date] column to achieve this?
Probably the two easiest options are to
1) Adjust the SQL as niko mentioned
2) Limit the visualization with the "Limit Data Using Expression" option, using the following:
Rank(Day([DATE]), "desc", Month([DATE]), Year([DATE])) = 1
If you had to do it in the Data on Demand section (maybe the IL itself is a usp or you don't have permission to edit it), my preference would be to create another data table that only has the max dates for each month, and then filter your first data table by that.
However, if you really need to do it in the Data on Demand section, then I'm guessing you don't have the ability to create your own information links. This would mean you can't key off additional data tables, and you're probably going to have to get creative.
Constraints of creativity include needing to know the "rules" of your data -- are you pulling the data in daily? Once a week? Do you have today's data, or today - 2? You could probably write a python script to grab the last day of every month for the last 10 years, and then whatever yesterday's date was, and throw all those values into a document property. This would allow you to do a "Values from Property".
(Side Note: I want to say you could also do it directly in the expression portion with something like an extremely long
Date(DateTimeNow()),DateAdd("dd",-1,Date(Year(DateTimeNow()), Month(DateTimeNow()), 1))
But Spotfire is refusing to accept that as multiple values. Interestingly, when I pull the logic for a StringList property, it gives this: $map("${udDates}", ","), which suggests commas are an accurate methodology, but I get an error reading "Expected 'End of expression' but found ','". Uncertain if this is a Spotfire issue, or related to my database connection)
tl;dr -- Doing it in the Data on Demand section is probably convoluted. Recommend adjusting in SQL if possible, and otherwise limiting in the visualization
Working in Cognos Report Studio 10.2.1, I have two query items. The first query item is the base table, which results in some million records. The second query item comes from a different table. I need to LEFT OUTER JOIN the first query item with the other. In the third query item, after the join, I am filtering on a date column in format YYYYMM to give me records falling under 201406, i.e. the current month and year. This is the common column in both tables, apart from AcctNo, which is used to join them. The problem is, when I try to view Tabular Data, the report takes forever to run. After waiting patiently for 30 mins, I just have to cancel the report. When I add the same filter criteria to the 1st query item on the date column and then view the third query item, it gives me the output. But in the long run I have to join multiple tables with this base table, and in one of the tables the filter criteria needs to give output for two months. I am converting SAS code to Cognos; in the SAS code there is no filter on the base table, and even then the join query takes a few seconds to run.
My question is: Is there any way to improve the performance of the query so that it runs, and more importantly runs in less time? Please note: Modelling my query in FM is not an option in this case.
I was able to get this resolved myself after much trial and error.
What I did was create a copy of the 1st query item, filter the 1st query item on the current month and year, and for the copy of the 1st query item add a filter for two months. That way I was able to run my query and get the desired results.
Though this is a rare scenario, I hope it helps someone else.