Informix - Count and paginate in the same query

Currently I'm doing this (pagination and count) in Informix:
select a.*, b.total from (select skip 0 first 10 * from TABLE) a, (select count(*) total from TABLE) b
The problem is that I'm repeating the same pattern: I get the first ten results, and then I count all the results again.
I want to write something like this:
select *, count(*) from TABLE;
so the query runs faster. Is that possible?
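One option, assuming a recent Informix version with OLAP window-function support (check your server's version), is COUNT(*) OVER (), which attaches the total row count to every returned row so the page and the count come from one query. The sketch below demonstrates the pattern with SQLite (3.25+), which supports the same window expression; the Informix SKIP/FIRST syntax is noted in a comment.

```python
import sqlite3

# Single-query pagination + count via a window function.
# COUNT(*) OVER () is evaluated over the whole result set before
# LIMIT is applied, so every row on the page carries the total.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO t (name) VALUES (?)",
                 [("row%d" % i,) for i in range(25)])

rows = conn.execute("""
    SELECT id, name, COUNT(*) OVER () AS total
    FROM t
    ORDER BY id
    LIMIT 10 OFFSET 0   -- Informix: SELECT SKIP 0 FIRST 10 ...
""").fetchall()

print(len(rows))    # 10 rows on this page
print(rows[0][2])   # total = 25 on every row
```

Whether this beats the two-subquery version depends on the plan; the win is that the table is stated once and the count can share the page's scan.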

Related

Correct way to get the last value for a field in Apache Spark or Databricks using SQL (correct behavior of last and last_value)?

What is the correct behavior of the last and last_value functions in Apache Spark/Databricks SQL? The way I'm reading the documentation (here: https://docs.databricks.com/spark/2.x/spark-sql/language-manual/functions.html), it sounds like it should return the last value of whatever is in the expression.
So if I have a select statement that does something like
select
person,
last(team)
from
(select * from person_team order by date_joined)
group by person
I should get the last team a person joined, yes/no?
The actual query I'm running is shown below. It is returning a different number each time I execute the query.
select count(distinct patient_id) from (
select
patient_id,
org_patient_id,
last_value(data_lot) data_lot
from
(select * from my_table order by data_lot)
where 1=1
and org = 'my_org'
group by 1,2
order by 1,2
)
where data_lot in ('2021-01','2021-02')
;
What is the correct way to get the last value for a given field (for either the team example or my specific example)?
--- EDIT -------------------
I'm thinking collect_set might be useful here, but I get the error shown when I try to run this:
select
patient_id,
last_value(collect_set(data_lot)) data_lot
from
covid.demo
group by patient_id
;
Error in SQL statement: AnalysisException: It is not allowed to use an aggregate function in the argument of another aggregate function. Please use the inner aggregate function in a sub-query.;;
Aggregate [patient_id#89338], [patient_id#89338, last_value(collect_set(data_lot#89342, 0, 0), false) AS data_lot#91848]
+- SubqueryAlias spark_catalog.covid.demo
The posts shown below discuss how to get max values, which is not the same as the last value in a list ordered by a different field. I want the last team a player joined: if the player joined the Reds, the A's, the Zebras, and the Yankees, in that order timewise, I'm looking for the Yankees. Those posts get to the solution procedurally using Python/R; I'd like to do this in SQL.
Getting last value of group in Spark
Find maximum row per group in Spark DataFrame
--- SECOND EDIT -------------------
I ended up using something like this based upon the accepted answer.
select
row_number() over (order by provided_date, data_lot) as row_num,
demo.*
from demo
You can assign row numbers based on an ordering on data_lot if you want to get its last value:
select count(distinct patient_id) from (
select * from (
select *,
row_number() over (partition by patient_id, org_patient_id, org order by data_lot desc) as rn
from my_table
where org = 'my_org'
)
where rn = 1
)
where data_lot in ('2021-01','2021-02');
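The row_number() pattern above is deterministic because the ordering is stated in the OVER clause, unlike last_value over an unspecified order. Here is a runnable sketch of it in SQLite (3.25+), with a tiny hypothetical dataset standing in for my_table:

```python
import sqlite3

# Pick the row with the greatest data_lot per (patient_id, org_patient_id, org),
# then count distinct patients whose latest lot falls in the wanted set.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE my_table
    (patient_id TEXT, org_patient_id TEXT, org TEXT, data_lot TEXT)""")
conn.executemany("INSERT INTO my_table VALUES (?,?,?,?)", [
    ("p1", "o1", "my_org", "2020-12"),
    ("p1", "o1", "my_org", "2021-01"),  # latest lot for p1 -> in wanted set
    ("p2", "o2", "my_org", "2021-02"),  # latest lot for p2 -> in wanted set
    ("p3", "o3", "my_org", "2020-11"),  # latest lot for p3 -> not in set
])

(count,) = conn.execute("""
    SELECT COUNT(DISTINCT patient_id) FROM (
        SELECT * FROM (
            SELECT *,
                   ROW_NUMBER() OVER (PARTITION BY patient_id, org_patient_id, org
                                      ORDER BY data_lot DESC) AS rn
            FROM my_table
            WHERE org = 'my_org'
        )
        WHERE rn = 1
    )
    WHERE data_lot IN ('2021-01', '2021-02')
""").fetchone()
print(count)  # 2 -> p1 and p2
```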

STRING_SPLIT on INNER JOIN takes different indexes and bad performance

I have a query using STRING_SPLIT in an INNER JOIN, and it shows very different behaviour depending on how I join.
EXAMPLE 1 (11 seconds - 161K rows)
DECLARE @SucursalFisicaIDs AS TABLE (value int)
INSERT INTO @SucursalFisicaIDs
SELECT value FROM STRING_SPLIT('16531,16532,16533,16534,16536,16537,16538,16539,16541,16543,16591,16620,17071',',')
SELECT ArticuloID, SUM(Existencias) AS Existencias
FROM ArticulosExistenciasSucursales_VIEW WITH (NOEXPAND)
INNER JOIN @SucursalFisicaIDs AS SucursalFisicaIDs ON SucursalFisicaIDs.value = ArticulosExistenciasSucursales_VIEW.SucursalFisicaID
GROUP BY ArticuloID
EXAMPLE 2 (2 seconds - 161K rows)
SELECT ArticuloID, SUM(Existencias) AS Existencias
FROM ArticulosExistenciasSucursales_VIEW WITH (NOEXPAND)
WHERE SucursalFisicaID NOT IN (16531,16532,16533,16534,16536,16537,16538,16539,16541,16543,16591,16620,17071)
GROUP BY ArticuloID
Both Queries read from the view (it is indexed).
EXAMPLE 3 (3 seconds - 161K rows)
SELECT ArticuloID, SUM(Existencias) AS Existencias
FROM ArticulosExistenciasSucursales_VIEW WITH (NOEXPAND)
INNER JOIN STRING_SPLIT('16531,16532,16533,16534,16536,16537,16538,16539,16541,16543,16591,16620,17071',',') AS SucursalFisicaIDs ON SucursalFisicaIDs.value = ArticulosExistenciasSucursales_VIEW.SucursalFisicaID
GROUP BY ArticuloID
EXAMPLE 4 (6 seconds - 161K rows)
DECLARE @SucursalFisicaIDs AS TABLE (value int)
INSERT INTO @SucursalFisicaIDs
SELECT value FROM STRING_SPLIT('16531,16532,16533,16534,16536,16537,16538,16539,16541,16543,16591,16620,17071',',')
SELECT ArticuloID, SUM(Existencias) AS Existencias
FROM ArticulosExistenciasSucursales_VIEW WITH (NOEXPAND)
WHERE SucursalFisicaID NOT IN (SELECT value FROM @SucursalFisicaIDs)
GROUP BY ArticuloID
Can anyone tell me why the first example performs so badly? It should be equivalent to, and no worse than, the others. Is there something I should do with the types?
Note in example 4 the table scan on the @SucursalFisicaIDs table and the estimated number of rows (3M?).
Regards.
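For what it's worth, the INNER JOIN forms are semantically a positive IN filter over the split-out IDs (note that examples 2 and 4 use NOT IN, which selects the complement), so the timing differences come from the plan, not the result. A commonly suggested tweak, which I have not verified against this schema, is declaring the table variable with a primary key (DECLARE @SucursalFisicaIDs TABLE (value int PRIMARY KEY)) so the optimizer has a unique, ordered structure to join against. The equivalence of the JOIN and IN shapes can be sketched in SQLite:

```python
import sqlite3

# JOIN against a split-out ID list vs. IN over the same list:
# same result set; only the execution plan differs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (ArticuloID INT, SucursalFisicaID INT, Existencias INT)")
conn.executemany("INSERT INTO stock VALUES (?,?,?)", [
    (1, 16531, 5), (1, 16999, 7), (2, 16532, 3),
])

ids_csv = "16531,16532,16533"
conn.execute("CREATE TABLE ids (value INT PRIMARY KEY)")  # keyed, like a PK'd table variable
conn.executemany("INSERT INTO ids VALUES (?)",
                 [(int(v),) for v in ids_csv.split(",")])  # stand-in for STRING_SPLIT

join_rows = conn.execute("""
    SELECT s.ArticuloID, SUM(s.Existencias)
    FROM stock s JOIN ids ON ids.value = s.SucursalFisicaID
    GROUP BY s.ArticuloID ORDER BY s.ArticuloID
""").fetchall()

in_rows = conn.execute("""
    SELECT ArticuloID, SUM(Existencias)
    FROM stock WHERE SucursalFisicaID IN (SELECT value FROM ids)
    GROUP BY ArticuloID ORDER BY ArticuloID
""").fetchall()

print(join_rows == in_rows)  # True: identical results, 16999 excluded
```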

Cassandra paging: How to page through an entire column family (table) and keep part of the compound key together in the result set

I have a table as follows:
CREATE TABLE someTable (
user_id uuid,
id uuid,
someField text,
anotherField text,
PRIMARY KEY (user_id, id)
);
I know that there's a way to do paging in cassandra (https://docs.datastax.com/en/developer/java-driver/2.1/manual/paging/)
However, what I need to do is:
page through the entire table (it's large, so paging is required)
get all rows of a user_id
do something with these rows.
In short, I need to fetch all the results of one user and do this for every user there is. (No, I don't have a unique list of user_ids.)
Also, I know I could do this programmatically: page through all the pages, assume the results are ordered by user_id, and carry the last user_id (where the rows were cut off) over to the next page of results so that user's data ends up in the same set.
However, I was hoping there would be a more elegant solution for this?
However, what I need to do is:
page through the entire table (it's large, so paging is required).
Assuming you don't know the user_id and want to fetch all users' data, use the token function to walk the user_ids with range queries. With an unordered partitioner, rows come back in token order, so you page with something like select * from someTable where token(user_id) > token(previous_user_id);
get all rows of a user_id
Now you know the user_id and want to fetch all of its rows. Use a range query on the clustering key id, starting from the minimum UUID. Like:
select * from someTable where user_id = 123 and id > MIN_UUID limit 100
After that query, take the 100th uuid and use it to fetch the next rows, such that:
select * from someTable where user_id = 123 and id > [previous query's 100th id (uuid)] limit 100
Keep querying until you have fetched all the rows.
do something with these rows.
That depends on what you want to do with all of those rows. Use your driver's ResultSet and iterate over the rows to process them.
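Because rows within a token range arrive grouped by partition key, the per-user step can be done in a single streaming pass over the paged rows instead of tracking page boundaries by hand. The sketch below uses itertools.groupby on a flat row stream; the hard-coded list is a stand-in for an actual paged Cassandra result set, which a driver would normally yield transparently across pages.

```python
from itertools import groupby

# Stand-in for a paged result stream: rows for one user_id arrive
# contiguously, and a page boundary may fall inside a user's rows.
paged_rows = [
    {"user_id": "u1", "id": 1}, {"user_id": "u1", "id": 2},  # page 1
    {"user_id": "u2", "id": 3}, {"user_id": "u2", "id": 4},  # boundary inside u2
    {"user_id": "u2", "id": 5}, {"user_id": "u3", "id": 6},  # page 2
]

users_seen = []
for user_id, rows in groupby(paged_rows, key=lambda r: r["user_id"]):
    batch = list(rows)  # all rows of this user, across page boundaries
    users_seen.append((user_id, len(batch)))

print(users_seen)  # [('u1', 2), ('u2', 3), ('u3', 1)]
```

This assumes the stream really is grouped by user_id, which holds within a single token-range scan but must be re-checked if you parallelize across ranges.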

Get the average value between a specified set of rows

Everyone, I'm building a report using Visual Studio 2012.
I want to be able to average a group of values between a specific set of rows.
What I have so far is something like this:
=(Count(Fields!SomeField.Value))*.1
and
=(Count(Fields!SomeField.Value))*.9
I want to use those two values to get the average of Fields!SomeField.Value between those two row numbers. Basically, I'm removing the top and bottom 10% of the data and averaging the middle 80%. Maybe there is a better way to do this? Thanks for any help.
Handle it in SQL itself.
Method 1:
Use NTILE function. Go through this link to learn more about NTILE.
Try something like this
WITH someCTE AS (
SELECT SomeField, NTILE(10) OVER (ORDER BY SomeField) as percentile
FROM someTable)
SELECT AVG(SomeField) as myAverage
FROM someCTE WHERE percentile BETWEEN 2 and 9
If your dataset is bigger:
WITH someCTE AS (
SELECT SomeField, NTILE(100) OVER (ORDER BY SomeField) as percentile
FROM someTable)
SELECT AVG(SomeField) as myAverage
FROM someCTE WHERE percentile BETWEEN 11 AND 90
Method 2:
SELECT Avg(SomeField) myAvg
From someTable
Where SomeField NOT IN
(
SELECT Top 10 percent someField From someTable order by someField ASC
UNION ALL
SELECT Top 10 percent someField From someTable order by someField DESC
)
Note:
Test for boundary conditions to make sure you are getting what you need; tweak the SQL above if necessary.
For NTILE: make sure the NTILE parameter is less than or equal to the number of rows in the table.
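Method 1 can be checked end to end with SQLite (3.28+ supports NTILE). With the ten values 1..10, NTILE(10) puts one value per bucket, so keeping buckets 2 through 9 trims the lowest and highest 10% and averages the middle 80%:

```python
import sqlite3

# Trimmed mean via NTILE: bucket values into 10 percentile groups,
# then average buckets 2..9 (the middle 80%).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE someTable (SomeField REAL)")
conn.executemany("INSERT INTO someTable VALUES (?)",
                 [(float(i),) for i in range(1, 11)])

(avg,) = conn.execute("""
    WITH someCTE AS (
        SELECT SomeField, NTILE(10) OVER (ORDER BY SomeField) AS percentile
        FROM someTable)
    SELECT AVG(SomeField) FROM someCTE WHERE percentile BETWEEN 2 AND 9
""").fetchone()
print(avg)  # 5.5: mean of 2..9 after trimming 1 and 10
```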

How to order results based on number of search term matches?

I am using the following InnoDB tables in mysql to describe records that can have multiple searchtags associated with them:
TABLE records
ID
title
desc
TABLE searchTags
ID
name
TABLE recordSearchTags
recordID
searchTagID
To SELECT records based on arbitrary search input, I have a statement that looks sort of like this:
SELECT
recordSearchTags.recordID
FROM
recordSearchTags
LEFT JOIN searchTags
ON recordSearchTags.searchTagID = searchTags.ID
WHERE
searchTags.name LIKE CONCAT('%','$search1','%') OR
searchTags.name LIKE CONCAT('%','$search2','%') OR
searchTags.name LIKE CONCAT('%','$search3','%') OR
searchTags.name LIKE CONCAT('%','$search4','%');
I'd like to ORDER this resultset, so that rows that match with more search terms are displayed in front of rows that match with fewer search terms.
For example, if a row matches all 4 search terms, it will be top of the list. A row that matches only 2 search terms will be somewhere in the middle. And a row that matches just one search term will be at the end.
Any suggestions on what is the best way to do this?
Thanks!
* Replaced answer, since fulltext isn't an option
Alright, it's not pretty, but you should be able to do something like this:
ORDER BY ((searchTags.name LIKE CONCAT('%','$search1','%'))
+ (searchTags.name LIKE CONCAT('%','$search2','%'))
+ (searchTags.name LIKE CONCAT('%','$search3','%'))
+ (searchTags.name LIKE CONCAT('%','$search4','%')))
DESC;
LIKE returns 1 on a match or 0 if there is no match, so you can just add the results together. Each comparison is parenthesized because + binds tighter than LIKE.
This isn't very pretty, but another way would be to union the four LIKEs in four statements (use UNION ALL, so a row matching several terms is kept once per match), like:
select ... where searchTags.name LIKE CONCAT('%','$search1','%')
union all
select ...
and so on. Wrap that in a:
select recordID, count(*) from (<inner unions>) matches
group by recordID
order by count(*) desc
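The boolean-sum ORDER BY can be demonstrated in SQLite, where (as in MySQL) a LIKE comparison evaluates to 1 on a match and 0 otherwise, so adding the parenthesized terms counts matched search terms:

```python
import sqlite3

# Rank rows by how many search terms they match:
# each LIKE yields 1 or 0, the sum is the match count,
# and ORDER BY ... DESC puts the best matches first.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (recordID INT, name TEXT)")
conn.executemany("INSERT INTO tags VALUES (?,?)", [
    (1, "red green blue"),  # matches 3 terms
    (2, "red green"),       # matches 2 terms
    (3, "blue"),            # matches 1 term
])

rows = conn.execute("""
    SELECT recordID,
           (name LIKE '%red%') + (name LIKE '%green%') + (name LIKE '%blue%') AS hits
    FROM tags
    ORDER BY hits DESC
""").fetchall()
print(rows)  # [(1, 3), (2, 2), (3, 1)]
```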
