Performance tuning in Cognos Report Studio

I am working in Cognos Report Studio 10.2.1 with two query items. The first is the base table, which returns several million records. The second comes from a different table. I need to LEFT OUTER JOIN the first query item with the second. In a third query item, built on the join, I filter on a date column in YYYYMM format to keep only the records for 201406, i.e. the current month and year. Apart from AcctNo, which is used for the join, this date column is the only column common to both tables.
The problem is that when I try to view Tabular Data, the report takes forever to run. After waiting patiently for 30 minutes, I have to cancel it. When I add the same filter to the date column in the first query item and then view the third query item, I get the output. In the long run, though, I have to join multiple tables to this base table, and for one of those tables the filter needs to return two months of data. I am converting SAS code to Cognos. In the SAS code there is no filter on the base table, and even so the join takes only a few seconds to run.
My question is: is there any way to improve the performance of the query so that it runs, and more importantly runs in less time? Please note: modelling my query in FM (Framework Manager) is not an option in this case.

I was able to resolve this myself after much trial and error.
I created a copy of the first query item, filtered the original on the current month and year, and added a two-month filter to the copy. That way I was able to run my query and get the desired results.
Though this is a rare scenario, I hope it helps someone else.
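The idea of filtering each input before joining can be sketched outside Cognos. The following is a minimal Python illustration only, not Cognos syntax; the field names (AcctNo, Period, Balance, Fee) and the sample rows are hypothetical, and it assumes the second table is also joined on the YYYYMM column.

```python
def left_join_filtered(base_rows, other_rows, periods):
    """Filter both inputs to the wanted periods, then left outer join
    on (AcctNo, Period). All field names here are hypothetical."""
    wanted = set(periods)
    # Filter the large base table first so the join sees fewer rows.
    base = [r for r in base_rows if r["Period"] in wanted]
    # Index the second table by the join key for fast lookups.
    index = {}
    for r in other_rows:
        if r["Period"] in wanted:
            index.setdefault((r["AcctNo"], r["Period"]), []).append(r)
    joined = []
    for b in base:
        # Unmatched base rows are kept with None on the right side.
        for m in index.get((b["AcctNo"], b["Period"]), [None]):
            joined.append((b, m))
    return joined


base_rows = [
    {"AcctNo": 1, "Period": "201406", "Balance": 100},
    {"AcctNo": 2, "Period": "201406", "Balance": 250},
    {"AcctNo": 1, "Period": "201405", "Balance": 90},
]
other_rows = [
    {"AcctNo": 1, "Period": "201406", "Fee": 5},
    {"AcctNo": 1, "Period": "201405", "Fee": 4},
]

# One filtered copy per requirement, mirroring the two query items above.
current_month = left_join_filtered(base_rows, other_rows, ["201406"])
two_months = left_join_filtered(base_rows, other_rows, ["201405", "201406"])
```

Each call plays the role of one filtered copy of the base query, so the one-month and two-month requirements do not interfere with each other.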

Related

Power Query - Alternative for Join to filter Records

I have two tables:
Table 1: one row per order, with the status (Online / Offline)
Table 2: multiple rows per order
Now I would like to reduce the number of rows in the second table based on the status (Offline) from Table 1.
Is there any alternative to a right join? The first table is filtered on status 'Offline'.
We are talking about several million rows, which take some time to join.
Any thoughts on this from your side?
Some thoughts:
1) Create a relationship between these two tables and filter to "Offline".
2) Create a join (Merge Queries) in Power Query and append only the Online/Offline status column. The import then takes longer, but you get a flat dataset in Power BI.
3) Create a new column in Power BI with DAX and use LOOKUPVALUE.
Without seeing the data, I think I would try the first one. If it's too slow, then I think the only way is the second, even though the import takes more time.
The third one might be the slowest.
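Whichever option is used, the underlying operation is a semi join: keep the detail rows whose order appears in the filtered status table. A minimal Python sketch of that logic, assuming hypothetical column names OrderID, Status, and Item (this is not Power Query M code):

```python
def keep_offline_lines(orders, order_lines):
    """Return the detail rows that belong to an order with status 'Offline'."""
    # Collect the IDs of offline orders (the first table has one row per order).
    offline_ids = {o["OrderID"] for o in orders if o["Status"] == "Offline"}
    # Keep only detail rows whose order ID is in that set.
    return [line for line in order_lines if line["OrderID"] in offline_ids]


orders = [
    {"OrderID": 10, "Status": "Offline"},
    {"OrderID": 11, "Status": "Online"},
]
order_lines = [
    {"OrderID": 10, "Item": "A"},
    {"OrderID": 10, "Item": "B"},
    {"OrderID": 11, "Item": "C"},
]

offline_lines = keep_offline_lines(orders, order_lines)
```

Because each order appears only once in the first table, the filter never duplicates detail rows, which is the same guarantee a merge on a unique key gives.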

Spotfire - limiting Information Link column expression

I have a column of data, [Sales ID], that is bringing duplicate data into an analysis. My goal is to limit the data to unique sales IDs for the last day of every month in the analysis (instead of daily). If the current month is not over yet, the latest day available so far should be pulled in. In other words, it should pull in the MAX date in any given month. How do I write an expression with the [Sales ID] column and the [Date] column to achieve this?
Probably the two easiest options are to
1) Adjust the SQL as niko mentioned
2) Limit the visualization with the "Limit Data Using Expression" option, using the following:
Rank(Day([DATE]), "desc", Month([DATE]), Year([DATE])) = 1
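The rule that the expression above encodes (keep only rows dated on the latest day present in each month) can be sketched in Python for clarity. This is an illustration of the logic only, not Spotfire code; the row layout and sample values are hypothetical.

```python
from datetime import date


def last_day_rows(rows):
    """Keep rows whose date is the latest date present in their month."""
    latest = {}
    for r in rows:
        key = (r["Date"].year, r["Date"].month)
        # Track the maximum date seen so far for each (year, month).
        if key not in latest or r["Date"] > latest[key]:
            latest[key] = r["Date"]
    return [r for r in rows if r["Date"] == latest[(r["Date"].year, r["Date"].month)]]


rows = [
    {"Sales ID": "A1", "Date": date(2014, 5, 30)},
    {"Sales ID": "A2", "Date": date(2014, 5, 31)},
    {"Sales ID": "A3", "Date": date(2014, 6, 10)},
    {"Sales ID": "A4", "Date": date(2014, 6, 12)},  # latest day so far in June
    {"Sales ID": "A5", "Date": date(2014, 6, 12)},
]

kept_ids = [r["Sales ID"] for r in last_day_rows(rows)]
```

Rows that share the latest date in a month are all kept, which corresponds to several rows tying for rank 1.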
If you had to do it in the Data on Demand section (maybe the IL itself is a usp or you don't have permission to edit it), my preference would be to create another data table that only has the max dates for each month, and then filter your first data table by that.
However, if you really need to do it in the Data on Demand section, then I'm guessing you don't have the ability to create your own information links. This would mean you can't key off additional data tables, and you're probably going to have to get creative.
Constraints of creativity include needing to know the "rules" of your data -- are you pulling the data in daily? Once a week? Do you have today's data, or today - 2? You could probably write a Python script to grab the last day of every month for the last 10 years, plus whatever yesterday's date was, and put all those values into a document property. This would allow you to do a "Values from Property".
(Side Note: I want to say you could also do it directly in the expression portion with something like an extremely long
Date(DateTimeNow()),DateAdd("dd",-1,Date(Year(DateTimeNow()), Month(DateTimeNow()), 1))
But Spotfire refuses to accept that as multiple values. Interestingly, when I pull the logic for a StringList property, it gives this: $map("${udDates}", ","), which suggests that comma-separated values are the right approach, but I get an error reading "Expected 'End of expression' but found ','". I am not sure whether this is a Spotfire issue or related to my database connection.)
tl;dr -- Doing it in the Data on Demand section is probably convoluted. I recommend adjusting the SQL if possible, and otherwise limiting in the visualization.

How to invert a merge query in Power Query

I have a single-column table of customer account numbers and a main table containing 400,000 records pulled from an Access database. I want to remove all records from the main table where the customer account number can be found in the single-column table.
The merge query capability in Power Query allows me to return only the records where there is a match on the customer list (in addition to a variety of other variations on this theme), but I would like to know whether there is a way to invert this so that I return all records where the customer number does not appear in this list.
I have achieved this already by using the List.Contains function and adding a custom column to identify the rows to exclude and then filtering them out, but I think this is severely impacting the performance of my workbook. Refreshing the table that initially has 400,000 rows prior to this series of transformations takes a very long time, and all queries that depend on this table then also take a long time to refresh.
Thank you
If you do a Left Anti Join of your main table with the single-column table, you get the main table filtered down to only the rows that have no match in the single-column table.
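For readers unfamiliar with the term, a left anti join keeps the left-hand rows whose key has no match on the right. A minimal Python sketch of the semantics, using a hypothetical Account field and made-up sample rows (this is not Power Query M code):

```python
def left_anti_join(records, excluded_accounts):
    """Return the records whose account number is NOT in the exclusion list."""
    # Build a set once; membership checks are then fast on average.
    excluded = set(excluded_accounts)
    return [r for r in records if r["Account"] not in excluded]


records = [
    {"Account": "C001", "Amount": 10},
    {"Account": "C002", "Amount": 20},
    {"Account": "C003", "Amount": 30},
]
exclusion_list = ["C002"]

remaining = left_anti_join(records, exclusion_list)
```

The set lookup is the design point here: testing every row against a plain list repeats a linear scan for each of the 400,000 records, which may be part of why the List.Contains approach felt slow.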

Excel Query looking up multiple values for the same name and presenting averages

Apologies if this has been asked before. I would be surprised if it hasn't but I am just not hitting the correct syntax to search and get the answer.
I have a table of raw data for my staff. It contains the name of the employee who completed a job and the start and finish times, among other things. I have no unique IDs other than name, and I can't change that, as I'm part of a large organisation and have to make do with the data I'm given.
What I would like to do is present a table (Table 2) that shows the name of each employee, takes the start/finish times for all of their jobs in Table 1, and presents the average time taken across all of their jobs.
I have used VLOOKUP in the past, but I'm not sure it will cut it here. The raw data table contains approximately 6000 jobs each month.
In Table 1, I work out the time taken for each job with this formula:
=IF(V6>R6,V6-R6,24-R6+V6) (R = Started Time, V = Completed Time, in 24-hour clock).
I have gone this route because some jobs are started before midnight and completed afterwards. My raw data also contains dates (started/completed) in separate columns, though, so I am open to an expert's feedback on whether there is a better way to work out the total time from start to completion.
I believe the easiest way to tackle this would be with a Pivot Table. Calculate the time taken for each Name and Job combination in Table 1, then create a pivot table with the Name in the Row Labels and the Time in the Values, and change the Time values to be an average instead of a sum.
Alternatively, you could create a unique list of names, perhaps with Data > Remove Duplicates, and then use an =AVERAGEIF formula.
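The calculation behind both suggestions is an average of durations grouped by name. The sketch below shows that logic in Python, under the assumption that the separate date and time columns are combined into full timestamps, so jobs that cross midnight need no special case. The names and times are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime


def average_hours_by_name(jobs):
    """jobs: iterable of (name, started, completed) with datetime values.
    Returns {name: average duration in hours}."""
    totals = defaultdict(lambda: [0.0, 0])  # name -> [total hours, job count]
    for name, started, completed in jobs:
        hours = (completed - started).total_seconds() / 3600
        totals[name][0] += hours
        totals[name][1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


jobs = [
    # Starts before midnight, ends after: 2 hours.
    ("Ann Lee", datetime(2024, 1, 5, 23, 0), datetime(2024, 1, 6, 1, 0)),
    ("Ann Lee", datetime(2024, 1, 6, 9, 0), datetime(2024, 1, 6, 13, 0)),  # 4 hours
    ("Bo Chen", datetime(2024, 1, 6, 8, 0), datetime(2024, 1, 6, 9, 30)),  # 1.5 hours
]

averages = average_hours_by_name(jobs)
```

Subtracting full timestamps also handles jobs that run longer than 24 hours, which a time-of-day comparison cannot detect.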
Thanks, this gives me the thread to pull on. I have unique names, as it's the person's full name, but I'll try pivot tables to hopefully make it a little more future-proof for other things to be reported on later.

Access Crosstab Report

I am fairly new to Access and have a database report issue I can't seem to figure out (even after reading several posts on the topic :/ )
The database houses audit information from 200+ stores. The audit answers are text, not numeric, and the audit date can be any day. I want to create a report that lists the audit question as row headers, the most recent three audit dates as column headers, and the audit answers as the data.
I have a form that allows the user to select the store, and that feeds the crosstab query. It works, except it does not limit to most recent three dates. The table that feeds the query also lists visit number, so I thought I could do something with the max of visit number but to no avail.
The main issue though is now I cannot get a crosstab report to generate any data. I have found several example pieces of code from back in the day that I have tried, but each tries to generate a row and grand total. Since these are text data fields, the totals will not work and while I have tried removing the pieces of code I think are appropriate, it still does not generate the correct report.
A second option I thought of trying was to export the crosstab query to Excel, but I am also stuck there.
Any help would be VERY appreciated. Thank you!
Depending on your skill level, this may be difficult. My approach would be a bunch of subqueries, but it won't be pretty. Let's say your stores are uniquely identified by StoreID, your audit dates are in AuditDate, and your audit results are in a single text field, AuditResults. It'll be up to you to figure out the details, but here's a rough outline:
1) Get all your data into a single table with 1-3 rows per StoreID (1 per audit date):
  a) Make a simple query that groups by StoreID and returns Max(AuditDate). Call it Qry1a.
  b) Join Qry1a back to the audit table to return the results from the most recent audit. Include a DateRank column that is hardcoded to 1. Call it Qry1b.
  c) Repeat a and b two more times to get the results from the 2nd and 3rd most recent audits (you can left join onto Qry1 and Qry2 to remove those results). Call those Qry2b and Qry3b.
  d) Using inner joins with your source table and Qry1b, Qry2b, and Qry3b, make a query that returns a maximum of 3 rows per StoreID.
2) With this new table, make a new query and group by StoreID:
  a) For the first column, use something like LastAudit: Max(IIf(DateRank=1,AuditResults,""))
  b) Repeat the previous step to get the next 2 audits, changing DateRank to 2 and 3 respectively.
The advantage of this approach is that it can handle stores with fewer than 3 audits, or a constantly changing list of stores (because StoreIDs aren't hardcoded anywhere).
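The outline above amounts to ranking each store's audit dates and pivoting the three most recent into columns. A rough Python sketch of that logic follows; it is not Access SQL. It assumes rows are (StoreID, AuditDate, question, answer) tuples with ISO-formatted date strings so that string order equals date order. The question field and the sample data are hypothetical additions for illustration.

```python
from collections import defaultdict


def latest_three_audits(rows):
    """Return {store_id: {question: [answer for rank 1, rank 2, rank 3]}}."""
    dates_by_store = defaultdict(set)
    for store_id, audit_date, _question, _answer in rows:
        dates_by_store[store_id].add(audit_date)
    # Rank 1 is the most recent date, like the DateRank column above.
    rank = {}
    for store_id, dates in dates_by_store.items():
        for i, d in enumerate(sorted(dates, reverse=True)[:3], start=1):
            rank[(store_id, d)] = i
    # Missing audits stay as empty strings, like the "" in the IIf expression.
    report = defaultdict(lambda: defaultdict(lambda: ["", "", ""]))
    for store_id, audit_date, question, answer in rows:
        r = rank.get((store_id, audit_date))
        if r is not None:  # dates older than the third most recent are skipped
            report[store_id][question][r - 1] = answer
    return report


rows = [
    ("S1", "2024-01-10", "Clean floor?", "Yes"),
    ("S1", "2024-02-12", "Clean floor?", "No"),
    ("S1", "2024-03-15", "Clean floor?", "Yes"),
    ("S1", "2023-12-01", "Clean floor?", "No"),  # fourth most recent, dropped
    ("S2", "2024-03-01", "Clean floor?", "Yes"),  # store with a single audit
]

report = latest_three_audits(rows)
```

As with the query-based outline, no store IDs are hardcoded, and a store with fewer than three audits simply leaves the remaining columns empty.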
