Speeding up SQL Query with CASE in JOIN - excel

working on a DB2 database using SQL in VBA to make some custom reports. I came up with the below query, but it took like 45-minutes to run with 80,000 results... the table BILLPRC probably has twice that in total before the WHERE clause. Just wondering if there is a better way to write the query to speed it up. This might be a pretty ambiguous question, so I can explain further if you need more info.
SELECT b.BLCO# || RIGHT('00000' || b.BLACCT, 5) Acct, b.BLRECT Type, b.BLREC# Record, i.INAME || ' ' || i.INAME2 Desc, b.BLPPGM# Promo, SUBSTR(b.BLPEFFD, 4, 2) || SUBSTR(b.BLPEFFD, 6, 2) || SUBSTR(b.BLPEFFD, 2, 2) Eff, SUBSTR(b.BLPENDD, 4, 2) || SUBSTR(b.BLPENDD, 6, 2) || SUBSTR(b.BLPENDD, 2, 2) Exp, CASE
WHEN (p.PPPRC1 = '0') THEN (p.PPPRC2)
WHEN (p.PPPRC1 != '0') THEN (p.PPPRC1) END Price,
CASE
WHEN (p.PPPRC1 = '0') THEN (p.PPCST2)
WHEN (p.PPPRC1 != '0') THEN (p.PPCST1) END Allow, i.ILASTC Cost
FROM QS36F.BILLPRC b
LEFT JOIN QS36F.PROMO p
ON b.BLREC# || b.BLPPGM# = p.PPREC || p.PPPGM#
LEFT JOIN QS36F.ITEM i
ON CASE
WHEN (b.BLRECT = 'I') AND (b.BLREC# = i.IMFGR || i.ICOLOR || i.IPATT) THEN 1
WHEN (b.BLRECT = 'P') AND (b.BLREC# = i.IPRCCD) THEN 1
END = 1
WHERE (b.BLPPGM# != '') AND (b.BLSTS != 'H')
ORDER BY (b.BLACCT)
I have a feeling there isn't given the amount of records and CASEs being checked.

You may likely get better performance by changing your WHERE predicate (b.BLPPGM# != '') AND (b.BLSTS != 'H') by rewriting it in the join condition(s).
Try not to concatenate your fields used for joining, if possible. For example, instead of
LEFT JOIN QS36F.PROMO p
ON b.BLREC# || b.BLPPGM# = p.PPREC || p.PPPGM#
you could say
LEFT JOIN QS36F.PROMO p
ON b.BLREC# = p.PPREC
and b.BLPPGM# = p.PPPGM#
or equivalently
LEFT JOIN QS36F.PROMO p
ON (b.BLREC#, b.BLPPGM#) = (p.PPREC, p.PPPGM#)
adding in the logic from your WHERE clause
LEFT JOIN QS36F.PROMO p
ON (b.BLREC#, b.BLPPGM#) = (p.PPREC, p.PPPGM#)
Of course, this assumes that the corresponding fields are the same type, or compatible types, and ideally the fields will be the same size. DB2 can often automatically convert one type to another compatible type. String types are often compatible with each other, and numeric types often with each other.
Then, most importantly, make sure you have the right indexes available. (I'm not too familiar with the S/36 mode environment, but I'm assuming you can create indexes over it.) Indexes can also be built over a derived column, ie. an expression such as concatenation.

For reporting purposes, use a materialized query table, unless if you can't really afford the CPU time. I know it's a bit lazy, but MQTs are meant for when you are okay with getting for instance hourly updated results, and don't want to start normalizing and optimizing your database. You have a working logic, and you apparently understand it, so go with it.

This looks like it's DB2 for i....you might want to tag it appropriately
Assuming it is, have you used the "Run & Explain" option in iNav's Run SQL scripts to see if the DB recommends any new indexes?

Related

Cosmos DB paginated query with custom order by clause

I want to do a select query in Cosmos DB that returns a maximum number of results (say 50) and then gives me the continuation token so I can continue the search where I left off.
Now let's say my query has 2 equality conditions in my where clause, e.g.
where prop1 = "a" and prop2 = "w" and prop3 = "g"
In the results that are returned, I want the records that satisfy prop1 = "a" to appear first, followed by the results that have prop2 = "w" followed by the ones with prop3 = "g".
Why do I need it? Because while I could just get all the data to my application and sort it there, I can't pull all records obviously as that would mean pulling in too much data. So if I can't order it this way in cosmos itself, in the results that I get, I might only have those records that don't have prop1 = "a" at all. Now I could keep retrying this till I get the ones with prop1 = "a" (I need this because I want to show the results with prop1 = "a" as the first set of results to the user) but I might have to pull like a 100 times to get the first record since I have a huge dataset sitting in my Cosmos DB.
How can I handle this scenario in Cosmos? Thanks!
So if I am understanding your question correctly, you want to accomplish this:
SELECT * FROM c
WHERE
c.prop1 = 'a'
AND
c.prop2 = 'b'
AND
c.prop3 = 'c'
ORDER BY
c.prop1, c.prop2, c.prop3
OFFSET 0 LIMIT 25
Now, luckily you can now do this in CosmosDB SQL. But, there is a caveat. You have to set up a composite index in your collection to allow for this.
So, for this collection, my composite index would look like this:
Now, if I wanted to change it to this:
SELECT * FROM c
WHERE
c.prop1 = 'a'
AND
c.prop2 = 'b'
AND
c.prop3 = 'c'
ORDER BY
c.prop1 DESC, c.prop2, c.prop3
OFFSET 0 LIMIT 25
I could add another composite index to cover that use-case. You can see in your settings it's an array of arrays so you can add as many combinations as you'd like.
This should get you to where you need to be if I understood your question correctly.

Many to many AQL query

I have 2 collections and one edge collection. USERS, FILES and FILES_USERS.
Im trying to get all FILES documents, that has the field "what" set to "video", for a specific user, but also embed another document, also from the collection FILES, but where the "what" is set to "trailer" and belongs to the "video" into the results.
I have tried the below code but its not working correctly, im getting a lot of duplicate results...its a mess. Im definitely doing it wrong.
FOR f IN files
FILTER f.what=="video"
LET trailer = (
FOR f2 IN files
FILTER f2.parent_key==f._key
AND f2.what=="trailer"
RETURN f2
)
FOR x IN files_users
FILTER x._from=="users/18418062"
AND x.owner==true
RETURN DISTINCT {f,trailer}
There may be a better way to do this with graph query syntax, but try this. Adjust the UNIQUE functions based on your data-model.
LET user_files = UNIQUE(FOR u IN FILES_USERS
FILTER u._from == "users/18418062" AND u.owner
RETURN u._to)
FOR uf IN user_files
FOR f IN files
FILTER f._key == uf AND f.what == "video"
LET trailers = UNIQUE(FOR t IN files
FILTER t.parent_key == f._key AND t.what == "trailer"
RETURN t)
RETURN {"video": f, "trailers": trailers}
Well, check to see If you have duplicate data as suggested by TMan, however check your query syntax too. It appears that you have no link between your f subquery and the x in the main query. That would cause the query to potentially return a lot of dups if there are multiple records in collection files_users for user users/18418062
Try adding a join in the main query. Something like:
FOR x IN files_users
FILTER x._from=="users/18418062"
AND x.owner==true
AND x._to == f._id
RETURN DISTINCT {f,trailer}
On a related note, if you run into performance issues doing a subquery for trailers , you could instead try just doing a join and array expansion and see if that works for your case

Core Data predicate for any row of the associated entity

In the following I will describe a simplification of my Core Data schema that I am pretty sure to be equivalent to my real situation.
I have two entity First and Second linked by a many-to-many relationship r.
Second has got two Boolean attribute, let us call it take and you_should_not_take.
I want to make a query that select a row of First iff it exits an associated row of Second for with take = true.
I have tried this predicate:
NSPredicate(format: "ANY (%K == YES && %K == NO)", "r.take","r.you_should_not_take")
but Core Data gives me the following error: "[General] Unable to parse the format string".
Maybe I have to use this:
NSPredicate(format: "((ANY %K == YES) && (ANY %K == NO))", "r.take","r.you_should_not_take")
but I am afraid that this last query will select a row of First iff it exists a row x of Second for which x.take == YES && it exists a row y if Second for which y.you_should_not_take == NO, with no guarantee that x == y.
An SQL query would be very simple to be made, but I am not so experienced with Core Data, so while I will try some more queries (and I will test if the second query does what I think it does) I have also asked here hoping the answer would be as simple as in SQL.
To select all First objects which are related to (at least one) Second object for which take==true and you_should_not_take==false you have
to use a SUBQUERY. Something like (untested):
SUBQUERY(r, $x, $x.take == YES AND $x.you_should_not_take == NO).#count > 0

Nested subquery in FOR ALL ENTRIES

Consultant sent me this code example, here is something he expects to get
SELECT m1~vbeln_im m1~vbelp_im m1~mblnr smbln
INTO CORRESPONDING FIELDS OF TABLE lt_mseg
FROM mseg AS m1
INNER JOIN mseg AS m2 ON m1~mblnr = m2~smbln
AND m1~mjahr = m2~sjahr
AND m1~zeile = m2~smblp
FOR ALL ENTRIES IN lt_vbfa
WHERE
AND m2~bwart = '102'
AND 0 = ( select SUM( ( CASE
when SHKZG = 'S' THEN 1
when SHKZG = 'H' THEN -1
else 0
END ) *MENGE ) MENGE
into lt_mseg-summ
from mseg
where
VBELN_IM = m1~vbeln_im
and VBELP_IM = m1~vbelp_im
).
The problem is I don't see how that should work in current syntax. I think about deriving internal select and using it as condition to main one, but is there a proper way to write this nested construction?
As i get it, if nested statement = 0, then main query executes. The problem here is the case inside nested statement. Is it even possible in ABAP? And in my opinion this check could be used outside from main SQL query.
Any suggestions are welcome.
the logic that you were given is part of Native/Open SQL and has some shortcomings that you need to be aware of.
the statement you are showing has to be placed between EXEC SQL and ENDEXEC.
the logic is platform dependent.
there is no syntax checking performed between the EXEC and ENDEXEC
the execution of this bypasses the database buffering process, so its slower
To me, I would investigate a better way to capture the data that performs better outside of open/native sql.
If you want to move forward with this type of logic, below are a couple of links which should be helpful. There is an example select using a nested select with a case statement.
Test Program
Example Logic
This is probably what you need, it works at least since ABAP 750.
SELECT vbeln UP TO 100 ROWS
FROM vbfa
INTO TABLE #DATA(lt_vbfa).
DATA(rt_vbeln) = VALUE range_vbeln_va_tab( FOR GROUPS val OF <line> IN lt_vbfa GROUP BY ( low = <line>-vbeln ) WITHOUT MEMBERS ( sign = 'I' option = 'EQ' low = val-low ) ).
SELECT m1~vbeln_im, m1~vbelp_im, m1~mblnr, m2~smbln
INTO TABLE #DATA(lt_mseg)
FROM mseg AS m1
JOIN mseg AS m2
ON m1~mblnr = m2~smbln
AND m1~mjahr = m2~sjahr
AND m1~zeile = m2~smblp
WHERE m2~bwart = '102'
AND m1~vbeln_im IN ( SELECT vbelv FROM vbfa WHERE vbelv IN #rt_vbeln )
GROUP BY m1~vbeln_im, m1~vbelp_im, m1~mblnr, m2~smbln
HAVING SUM( CASE m1~shkzg WHEN 'H' THEN 1 WHEN 'S' THEN -1 ELSE 0 END * m1~menge ) = 0.
Yes, aggregating and FOR ALL ENTRIES is impossible in one SELECT, but you can trick the system with range and subquery. Also you don't need three joins for summarizing reversed docs, your SUM subquery is redundant here.
If you need to select documents not only by delivery number but also by position this will be more complicated for sure.

Complex queries in linq to nhibernate

We are using accountability pattern for organizational structure. I using linq to nhibernate to find some departments and position but I have two problem.
var query =
Repository<Party>.Find(p => p.IsInternal == true)
.Where(p => p.Parents.Any(c => c.Parent.PartyId == id))
.Where(p =>
(
p.PartyType.PartyTypeId == (int)PartyTypeDbId.Department &&
_secretariat.Departments.Select(c => c.PartyId).Contains(p.PartyId)
)
||
(
p.PartyType.PartyTypeId == (int)PartyTypeDbId.Position &&
p.Children.Any(c =>
c.AccountabilityType.AccountabilityTypeId == (int)AccountabilityTypeDbId.TenurePersonOfPosition &&
((Person)c.Child).UserName != null)
)
);
First : I got 'Unhandled Expression Type: 1003' for this part of query : '_secretariat.Departments.Select(c => c.PartyId).Contains(p.PartyId)'
and I got Property not found 'UserName'
We have many complex queries i think we need to use stored procedure.
Sorry for bad Inglish!
One nice thing that you can do with LINQ is break your queries into multiple parts. Since you are building an expression tree that won't get executed until the results are enumerated, you don't have to do it all in one line (like SQL).
You can even make some reusable "filters" that you can apply to IQueryable. These filter functions accept an IQueryable as an argument, and return one as a result. You can build these as extension methods if you like (I like to).
As for your immediate problem, you may want to try a join on _secretariat instead of attempting a subquery. I've seen them work in scenarios where subqueries don't.
In addition to Lance's comments you might want to look at a compiled Linq query and breaking up some of the responsibilties to follow SOLID principles.
I've just also found out that there are issues with the Contains when containing Any linq methods. However, Any seems to work well within Any, hence:
_secretariat.Departments.Select(c => c.PartyId).Any(x => x == p.PartyId)

Resources