Select data of a partition with DolphinDB?

This is the table I used for querying.
dbName = "dfs://trade"
tbName = "trade"
if(existsDatabase(dbName)){
dropDatabase(dbName)
}
db1 = database(, VALUE, 2020.01.01..2022.01.01)
db2 = database(, HASH, [SYMBOL, 5])
db = database(dbName, COMPO, [db1, db2], , "TSDB")
schemaTable = table(
array(SYMBOL, 0) as SecurityID,
array(SYMBOL, 0) as Market,
array(TIMESTAMP, 0) as TradeTime,
array(DOUBLE, 0) as TradePrice,
array(INT, 0) as TradeQty,
array(DOUBLE, 0) as TradeAmount,
array(INT, 0) as BuyNum,
array(INT, 0) as SellNum
)
db.createPartitionedTable(schemaTable, tbName, `TradeTime`SecurityID, {TradeTime:"delta"}, sortColumns=`SecurityID`TradeTime, keepDuplicates=ALL)
With TradeTime and SecurityID as the partitioning columns, how can I retrieve data for each stock in a specified partition on a day?

To get the data in HASH partition 1 of SecurityID for the date 2020.01.02, filter on the date and pass the partition number to the partition function in the where clause:
select count(*) from loadTable("dfs://trade", "trade") where date(TradeTime)=2020.01.02, partition(SecurityID, 1)
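To make the idea of selecting one HASH partition on one day concrete, here is a toy Python sketch. It is only an illustration: `bucket_of` and `select_partition` are hypothetical helpers, and `zlib.crc32` is a stand-in hash. DolphinDB uses its own hash function internally, so the bucket numbers here will not match a real database.

```python
import zlib
from datetime import date

NUM_BUCKETS = 5  # mirrors the HASH [SYMBOL, 5] scheme in the question


def bucket_of(security_id: str) -> int:
    # Stand-in hash only; this is NOT DolphinDB's hash function.
    return zlib.crc32(security_id.encode()) % NUM_BUCKETS


def select_partition(rows, day, bucket):
    # rows: iterable of (security_id, trade_date) tuples.
    # Keep rows that fall on `day` and whose id hashes into `bucket`.
    return [(s, d) for s, d in rows if d == day and bucket_of(s) == bucket]


rows = [(f"S{i:03d}", date(2020, 1, 2)) for i in range(50)]
rows += [(f"S{i:03d}", date(2020, 1, 3)) for i in range(50)]
selected = select_partition(rows, date(2020, 1, 2), 1)
print(len(selected), "rows in bucket 1 on 2020-01-02")
```

The two conditions play the same roles as the date filter and the partition(SecurityID, 1) condition in the query above: one narrows the date partition, the other narrows the hash partition.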

Query limitations due to composite partition keys in cassandra?

Suppose I have two table structures. Table A has:
PRIMARY KEY (measureid, statename, reportyear, countyname)
and table B has:
PRIMARY KEY ((measureid, statename, reportyear), countyname)
What are the query limitations of table structure B compared with A?
For which queries does the composite partition key pose a problem?
In table A where:
PRIMARY KEY (measureid, statename, reportyear, countyname)
measureid is the partition key and the remaining columns are clustering columns. You can query the table with just measureid, which returns all the rows of that partition:
SELECT ... FROM ... WHERE measureid = ?
Alternatively, you can narrow the result by adding the clustering columns in order:
SELECT ... FROM ... WHERE measureid = ? AND statename = ?
SELECT ... FROM ... WHERE measureid = ? AND statename = ? AND reportyear = ?
SELECT ... FROM ... WHERE measureid = ? AND statename = ? AND reportyear = ? AND countyname = ?
In table B where:
PRIMARY KEY ((measureid, statename, reportyear), countyname)
You must specify all of measureid, statename and reportyear to query the data, because together they form the partition key. This returns all the rows of one partition:
SELECT ... FROM ... WHERE measureid = ? AND statename = ? AND reportyear = ?
To retrieve one specific row of one partition:
SELECT ... FROM ... WHERE measureid = ? AND statename = ? AND reportyear = ? AND countyname = ?
To be clear, you cannot query table B with either of the following:
SELECT ... FROM ... WHERE measureid = ?
SELECT ... FROM ... WHERE measureid = ? AND statename = ?
since you must specify all three columns of the partition key. I've explained why in this post https://community.datastax.com/questions/7866/. Cheers!
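The rule described above can be sketched as a small Python check. This is a simplified model, not Cassandra's actual validation logic: it only considers equality restrictions and ignores ALLOW FILTERING, secondary indexes, and IN or range restrictions. The function name `can_query` is hypothetical.

```python
def can_query(partition_key, clustering_key, restricted):
    """Return True if equality restrictions on `restricted` columns are
    acceptable under the simplified rule: all partition key columns must be
    restricted, and clustering columns only as a leading prefix."""
    restricted = set(restricted)
    # Every partition key column must be restricted.
    if not set(partition_key) <= restricted:
        return False
    remaining = restricted - set(partition_key)
    # Count how many leading clustering columns are restricted.
    prefix_len = 0
    for col in clustering_key:
        if col not in remaining:
            break
        prefix_len += 1
    # Anything restricted beyond that prefix makes the query invalid.
    return remaining == set(clustering_key[:prefix_len])


# Table A: PRIMARY KEY (measureid, statename, reportyear, countyname)
A = (["measureid"], ["statename", "reportyear", "countyname"])
# Table B: PRIMARY KEY ((measureid, statename, reportyear), countyname)
B = (["measureid", "statename", "reportyear"], ["countyname"])

print(can_query(*A, ["measureid"]))               # allowed on A
print(can_query(*B, ["measureid"]))               # rejected on B
print(can_query(*B, ["measureid", "statename"]))  # rejected on B
```

In this model, table B trades query flexibility for smaller partitions: each (measureid, statename, reportyear) combination is its own partition, so all three values are needed to locate it.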

How to pass main query column value to nested sub query Where condition?

I am writing this query with a nested subquery to find PREPARED_BY, VERIFIED_BY and AUTHORIZED_BY depending on CONDATE from the Expenditure table, but inside the subquery the Expenditure column CONDATE is not recognized and the query fails with this error:
ORA-00904: "EX"."CONDATE": invalid identifier.
Code:
SELECT ex.conno,
ex.itemno,
ex.adv_no || ' ' || to_char(ex.condate, 'DD-MON-YYYY') chequenodate,
ex.conname,
ex.apaid,
ex.dpayment,
gf.gf_name,
expenditure_type,
ex.off_code,
ofc.officename,
ex.remarks,
(SELECT prepared_by
FROM (SELECT prepared_by
FROM authorization
WHERE (pre_last_date >= ex.condate OR pre_last_date IS NULL)
AND project_id = 128
ORDER BY id ASC)
WHERE rownum = 1) AS prepared_by,
(SELECT verified_by
FROM (SELECT verified_by
FROM authorization
WHERE (ve_last_date >= ex.condate OR ve_last_date IS NULL)
AND project_id = 128
ORDER BY id ASC)
WHERE rownum = 1) AS verified_by,
(SELECT authorized_by
FROM (SELECT authorized_by
FROM authorization
WHERE (au_last_date >= ex.condate OR au_last_date IS NULL)
AND project_id = 128
ORDER BY id ASC)
WHERE rownum = 1) AS authorized_by
FROM expenditure ex
INNER JOIN officecode ofc
ON ofc.off_code = ex.off_code
INNER JOIN coa_category ca
ON ca.coa_cat_id = ex.coa_cat_id
INNER JOIN g_fund_type gf
ON gf.gf_type_id = ca.gf_type_id
WHERE ex.conno = 'MGSP/PMU/NON/145'
AND ex.itemno = 149;
The problem you're experiencing is that a parent table can only be referenced by a subquery one level down. You're trying to access columns from the parent table in a subquery two levels down, which is why you're getting the error.
To access the parent column in your subquery, you need to rewrite it so that it's only one level down.
This can be achieved with the KEEP FIRST/LAST form of an aggregate function, e.g.:
SELECT ex.conno,
ex.itemno,
ex.adv_no || ' ' || to_char(ex.condate, 'DD-MON-YYYY') chequenodate,
ex.conname,
ex.apaid,
ex.dpayment,
gf.gf_name,
expenditure_type,
ex.off_code,
ofc.officename,
ex.remarks,
(SELECT MAX(a.prepared_by) KEEP (dense_rank FIRST ORDER BY a.id ASC)
FROM authorization a
WHERE (a.pre_last_date >= ex.condate OR a.pre_last_date IS NULL)
AND a.project_id = 128) prepared_by,
(SELECT MAX(a.verified_by) KEEP (dense_rank FIRST ORDER BY a.id ASC)
FROM authorization a
WHERE (a.ve_last_date >= ex.condate OR a.ve_last_date IS NULL)
AND a.project_id = 128) verified_by,
(SELECT MAX(a.authorized_by) KEEP (dense_rank FIRST ORDER BY a.id ASC)
FROM authorization a
WHERE (a.au_last_date >= ex.condate OR a.au_last_date IS NULL)
AND a.project_id = 128) authorized_by
FROM expenditure ex
INNER JOIN officecode ofc ON ofc.off_code = ex.off_code
INNER JOIN coa_category ca ON ca.coa_cat_id = ex.coa_cat_id
INNER JOIN g_fund_type gf ON gf.gf_type_id = ca.gf_type_id
WHERE ex.conno = 'MGSP/PMU/NON/145'
AND ex.itemno = 149;
N.B. I have used MAX and FIRST here; this means that if there are multiple rows with the same lowest id, the highest value of the prepared_by column will be used. You could change this to MIN if you wanted the lowest value. This is only relevant if you have more than one row per id, otherwise it simply returns the value of the prepared_by column for the lowest id.
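The tie-breaking behaviour described in the note can be illustrated with a small Python emulation. This is only a sketch of the semantics, not Oracle code: `keep_first` is a hypothetical helper, the rows are made-up sample data, and ids are assumed to be non-null.

```python
def keep_first(rows, agg=max):
    """Emulate agg(value) KEEP (DENSE_RANK FIRST ORDER BY id ASC).

    rows: iterable of (id, value) pairs that already passed the WHERE clause.
    """
    rows = list(rows)
    if not rows:
        return None  # an aggregate over no rows yields NULL
    lowest_id = min(row_id for row_id, _ in rows)
    # Only rows tied on the lowest id take part; NULL values are ignored.
    candidates = [v for row_id, v in rows if row_id == lowest_id and v is not None]
    return agg(candidates) if candidates else None


sample = [(1, "alice"), (1, "bob"), (2, "zed")]
print(keep_first(sample))           # MAX among the rows with the lowest id
print(keep_first(sample, agg=min))  # MIN among the rows with the lowest id
```

With one row per id, the choice between MAX and MIN makes no difference, because only a single candidate remains after ranking.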

Could not add table 'SELECT('

I get this error when adding a query in MS Query, although the same query runs in SQL Developer. MS Query seems not to like my nested query.
Code I am using:
SELECT ORDER_DATE
,SALES_ORDER_NO
,CUSTOMER_PO_NUMBER
,DELIVER_TO
,STATUS
,ITEM_NUMBER
,DESCRIPTION
,ORD_QTY
,SUM(QUANTITY) AS ON_HAND
,PACKAGE_ID
,PACKAGE_STATUS
,MAX(TRAN_DATE) AS LAST_TRANSACTION
,MIN(DAYS) AS DAYS
FROM (
SELECT TRUNC(SH.ORDER_DATE) AS ORDER_DATE
,SH.SALES_ORDER_NO
,SH.CUSTOMER_PO_NUMBER
,SH.SHIP_CODE AS DELIVER_TO
,SH.STATUS
,SB.ITEM_NUMBER
,IM.DESCRIPTION
,SB.ORD_QTY
,BID.QUANTITY
,SPM.PACKAGE_ID
,CASE
WHEN SPM.SHIPPED = 'Y'
THEN 'SHIPPED'
WHEN SPM.STATUS = 'C'
THEN 'PACKED'
WHEN BID.QUANTITY IS NOT NULL
THEN 'AVAILABLE'
WHEN BID.QUANTITY IS NULL
THEN 'UNAVAILABLE'
END AS PACKAGE_STATUS
,CASE
WHEN SPM.SHIPPED = 'Y'
THEN TRUNC(SPM.BILLING_DATE)
WHEN SPM.STATUS = 'C'
THEN TRUNC(SPM.END_TIME)
WHEN BID.QUANTITY IS NOT NULL
THEN TRUNC(BID.ACTIVATION_TIME)
END AS TRAN_DATE
,CASE
WHEN SPM.SHIPPED = 'Y'
THEN ROUND(SYSDATE - SPM.BILLING_DATE, 0)
WHEN SPM.STATUS = 'C'
THEN ROUND(SYSDATE - SPM.END_TIME, 0)
WHEN BID.QUANTITY IS NOT NULL
THEN ROUND(SYSDATE - BID.ACTIVATION_TIME, 0)
END AS DAYS
FROM SO_HEADER SH
LEFT JOIN SO_BODY SB ON SB.SO_HEADER_TAG = SH.SO_HEADER_TAG
LEFT JOIN SO_PACKAGE_MASTER SPM ON SPM.PACKAGE_ID = SB.PACKAGE_ID
LEFT JOIN ITEM_MASTER IM ON IM.ITEM_NUMBER = SB.ITEM_NUMBER
LEFT JOIN V_BIN_ITEM_DETAIL BID ON BID.ITEM_NUMBER = SB.ITEM_NUMBER
WHERE SH.ORDER_TYPE = 'MSR'
AND (
SB.REASON_CODE IS NULL
OR SB.REASON_CODE NOT LIKE 'CANCEL%'
)
AND (
IM.DESCRIPTION NOT LIKE '%CKV%'
OR IM.DESCRIPTION IS NULL
AND IM.ITEM_NUMBER IS NOT NULL
)
)
WHERE PACKAGE_STATUS <> 'SHIPPED'
GROUP BY ORDER_DATE
,SALES_ORDER_NO
,CUSTOMER_PO_NUMBER
,DELIVER_TO
,STATUS
,ITEM_NUMBER
,DESCRIPTION
,ORD_QTY
,PACKAGE_ID
,PACKAGE_STATUS
ORDER BY (
CASE PACKAGE_STATUS
WHEN 'AVAILABLE'
THEN 1
WHEN 'UNAVAILABLE'
THEN 2
WHEN 'PACKED'
THEN 3
END
)
,LAST_TRANSACTION;
Is there any option I can select that will allow me to run this query?

Count query in GridGain

Is there a count query in GridGain?
GridCacheQuery<Map.Entry<Long, Person>> qry =
queries.createSqlQuery(Person.class, "select count(*) from Person where street = ?");
int count = qry.execute("streetname").get();
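Try an SQL fields query, which can select specific columns instead of the whole class:
GridCacheQuery<List<?>> qry = queries.createSqlFieldsQuery(
    "select count(*) from Person where street = ?");
Collection<List<?>> rows = qry.execute("streetname").get();
// A Collection has no get(int), so take the first row through the iterator.
List<?> firstRow = rows.iterator().next();
// Cast through Number, since the SQL engine may return the count as a Long.
int count = ((Number)firstRow.get(0)).intValue();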
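The shape of the result (a collection of rows, each row a list of column values, with the count in the first column of the first row) can be seen in a self-contained analogue. The sketch below uses Python's built-in sqlite3 module rather than GridGain, so it only illustrates how to read a count from a fields-style result; the table contents are made up.

```python
import sqlite3

# In-memory database standing in for the cache; not GridGain.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (id INTEGER PRIMARY KEY, street TEXT)")
conn.executemany(
    "INSERT INTO Person (street) VALUES (?)",
    [("streetname",), ("streetname",), ("other",)],
)

# A count query returns exactly one row with exactly one column.
rows = conn.execute(
    "SELECT count(*) FROM Person WHERE street = ?", ("streetname",)
).fetchall()
first_row = rows[0]
count = first_row[0]
print(count)  # 2
```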

How to do multiget in CQL3 for composite row key?

CF schema:
CREATE TABLE mytable (
upperId int,
lowerId int,
hour timestamp,
counter text,
succ int,
fail int,
PRIMARY KEY ((upperId, lowerId), hour, counter));
Each record is keyed by the composite id upperId:lowerId. How can I do a multiget with CQL3?
This is not valid:
select * from mytable where (upperid, lowerid) in ((10000, 1), (10000, 2), (20000, 1));
I can't do this either:
select * from mytable where (upperid = 10000 and lowerid in (1, 2)) or (upperid = 20000 and lowerid = 1);
I get the error: missing EOF at ')'.
Please point me to an effective way to do a multiget for a composite row key in CQL3.
Thanks,
William
CQL does not yet support a logical "or" in select statements.
Instead, you could run one query per upperid and combine the result sets in your application:
select * from mytable where upperid = 10000 and lowerid in (1, 2);
select * from mytable where upperid = 20000 and lowerid = 1;
Reference:
SO question: Alternative for OR condition after where clause in select statement Cassandra
Latest CQL docs
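The application-side approach can be sketched in Python as follows. This only builds the statements and their bind values; executing them and merging the rows is left to whichever driver you use. `plan_multiget` is a hypothetical helper, and the `?` placeholder style is an assumption about the driver's prepared-statement syntax.

```python
from collections import defaultdict


def plan_multiget(keys, table="mytable"):
    """Group (upperid, lowerid) pairs by upperid and return one
    (cql, params) pair per upperid, using IN on lowerid."""
    by_upper = defaultdict(list)
    for upper, lower in keys:
        if lower not in by_upper[upper]:  # skip duplicate keys
            by_upper[upper].append(lower)
    plans = []
    for upper, lowers in by_upper.items():
        placeholders = ", ".join("?" for _ in lowers)
        cql = (
            f"SELECT * FROM {table} "
            f"WHERE upperid = ? AND lowerid IN ({placeholders})"
        )
        plans.append((cql, (upper, *lowers)))
    return plans


for cql, params in plan_multiget([(10000, 1), (10000, 2), (20000, 1)]):
    print(cql, params)
```

For the three keys in the question this yields two statements, matching the two queries shown above. The application then runs each statement and concatenates the returned rows.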
