How to concatenate selective query results as a string in Neo4j?

Initial situation
I’ve written a working Cypher query, which returns four distinct quantities.
MATCH
<complex statement>
WITH
count(DISTINCT typeA) AS amountA,
count(DISTINCT typeB) AS amountB,
count(DISTINCT typeC) AS amountC,
count(DISTINCT typeD) AS amountD
RETURN
amountA, amountB, amountC, amountD;
Target solution
Instead of a four-column table, I now want to return a single text string in which all four quantities are concatenated, each with a descriptive label. However, a quantity must only be part of the string if its amount is greater than zero.
╒════════════════════════════════════════════════════╕
│"formattedQuantities" │
╞════════════════════════════════════════════════════╡
│"amountA: 123456, amountC: 9876543, amountD: 2018" │
└────────────────────────────────────────────────────┘
(Because the value of amountB is 0, it is omitted in the result.)
I run this Cypher query over several million rows. Because I am concerned about the performance impact, I don't want to create and call a custom plugin.
So, how can I return the quantities as a string with Cypher and Neo4j? Can you please advise me on how to solve this? Many thanks in advance for pointing me in the right direction!
Approach to the problem / preliminary result
Cypher statement:
MATCH
<complex statement>
WITH
count(DISTINCT typeA) AS amountA,
count(DISTINCT typeB) AS amountB,
count(DISTINCT typeC) AS amountC,
count(DISTINCT typeD) AS amountD
WITH
['amountA: ', amountA, ', amountB: ', amountB, ', amountC: ', amountC, ', amountD: ', amountD] AS quantities
RETURN
reduce(result = toString(head(quantities)), n IN tail(quantities) | result + n) AS formattedQuantities;
Result:
╒═════════════════════════════════════════════════════════════════╕
│"formattedQuantities" │
╞═════════════════════════════════════════════════════════════════╡
│"amountA: 123456, amountB: 0, amountC: 9876543, amountD: 2018"   │
└─────────────────────────────────────────────────────────────────┘
Still open:
filtering of amountB because of value 0

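You want to use the filter function. Note that filter() was removed in Neo4j 4.0; on newer versions, the list comprehension [x IN quantities WHERE x.value > 0] does the same job.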
MATCH
<complex statement>
WITH
count(DISTINCT typeA) AS amountA,
count(DISTINCT typeB) AS amountB,
count(DISTINCT typeC) AS amountC,
count(DISTINCT typeD) AS amountD
// Reformat to list
WITH
[{name:'amountA', value:amountA}, {name:'amountB', value:amountB}, {name:'amountC', value:amountC}, {name:'amountD', value:amountD}] AS quantities
// Filter out 0's
WITH filter(x IN quantities WHERE x.value > 0) AS quantities
// Convert list to string
RETURN
reduce(result = quantities[0].name + ": " + quantities[0].value, n IN tail(quantities) | result + ", " + n.name + ": " + n.value) AS formattedQuantities;
Note that this returns null if all values are 0, because the filtered list is empty (null + string = null).
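To make the filter-then-join logic easier to follow outside of Cypher, here is a minimal Python sketch of the same idea. The function name format_quantities and the tuple layout are assumptions for illustration only; returning None for the all-zero case mirrors the null behaviour noted above.

```python
def format_quantities(quantities):
    """Join 'name: value' pairs, skipping entries whose value is not > 0."""
    # Step 1: filter out the zero (or negative) quantities.
    kept = [(name, value) for name, value in quantities if value > 0]
    # Step 2: nothing left means there is no first element to start from,
    # which is the case where the Cypher query yields null.
    if not kept:
        return None
    # Step 3: convert the remaining pairs to one comma-separated string.
    return ", ".join(f"{name}: {value}" for name, value in kept)


example = [
    ("amountA", 123456),
    ("amountB", 0),
    ("amountC", 9876543),
    ("amountD", 2018),
]
print(format_quantities(example))
```

If an empty string is preferable to null in the all-zero case, the result can be wrapped in coalesce(..., '') on the Cypher side.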

Related

MssqlRow to json string without knowing structure and data type on compile time [duplicate]

Using PostgreSQL I can have multiple rows of json objects.
select (select ROW_TO_JSON(_) from (select c.name, c.age) as _) as jsonresult from employee as c
This gives me this result:
{"age":65,"name":"NAME"}
{"age":21,"name":"SURNAME"}
But in SqlServer when I use the FOR JSON AUTO clause it gives me an array of json objects instead of multiple rows.
select c.name, c.age from customer c FOR JSON AUTO
[{"age":65,"name":"NAME"},{"age":21,"name":"SURNAME"}]
How can I get the same result format in SQL Server?
By constructing separate JSON in each individual row:
SELECT (SELECT [age], [name] FOR JSON PATH, WITHOUT_ARRAY_WRAPPER)
FROM customer
There is an alternative form that doesn't require you to know the table structure (but likely has worse performance because it may generate a large intermediate JSON):
SELECT [value] FROM OPENJSON(
(SELECT * FROM customer FOR JSON PATH)
)
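The difference between the two output shapes discussed here (one JSON array for the whole result set versus one JSON object per row) can be illustrated outside the database. This is only a sketch with the sample data from the question; the helper to_json is a made-up name, and Python's json module stands in for the database's JSON functions.

```python
import json

# Sample rows taken from the question's expected output.
rows = [{"age": 65, "name": "NAME"}, {"age": 21, "name": "SURNAME"}]


def to_json(obj):
    """Serialize compactly, without spaces after separators."""
    return json.dumps(obj, separators=(",", ":"))


# Shape 1: the whole result set serialized once, as a single array.
as_array = to_json(rows)

# Shape 2: each row serialized on its own, giving one object per row.
per_row = [to_json(row) for row in rows]

print(as_array)
for line in per_row:
    print(line)
```

The OPENJSON variant corresponds to building shape 1 first and then splitting it back into its elements, which is why the intermediate JSON may become large.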
A variant that does not require knowing the structure and should perform better:
SELECT c.id, jdata.*
FROM customer c
cross apply
(SELECT * FROM customer jc where jc.id = c.id FOR JSON PATH , WITHOUT_ARRAY_WRAPPER) jdata (jdata)
Same as Barak Yellin but more lazy:
1-Create this proc
CREATE PROC PRC_SELECT_JSON(@TBL VARCHAR(100), @COLS VARCHAR(1000)='D.*') AS BEGIN
EXEC('
SELECT X.O FROM ' + @TBL + ' D
CROSS APPLY (
SELECT ' + @COLS + '
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
) X (O)
')
END
2-Can use either all columns or specific columns:
CREATE TABLE #TEST ( X INT, Y VARCHAR(10), Z DATE )
INSERT #TEST VALUES (123, 'TEST1', GETDATE())
INSERT #TEST VALUES (124, 'TEST2', GETDATE())
EXEC PRC_SELECT_JSON #TEST
EXEC PRC_SELECT_JSON #TEST, 'X, Y'
If you're using PHP, add SET NOCOUNT ON; as the first line.

Athena query results show null values despite is not null condition in query

I have the following query which I run in Athena. I would like to receive all the results that contain a tag in the 'resource_tags_aws_cloudformation_stack_name'. However, when I run the query my results show me rows where the 'resource_tags_aws_cloudformation_stack_name' is empty and I don't know what I am doing wrong.
SELECT
cm.line_item_usage_account_id,
pr.line_of_business,
cm.resource_tags_aws_cloudformation_stack_name,
SUM(CASE WHEN cm.line_item_product_code = 'AmazonEC2'
THEN line_item_unblended_cost * 0.97
ELSE cm.line_item_unblended_cost END) AS discounted_cost,
CAST(cm.line_item_usage_start_date AS DATE) AS start_day
FROM cost_management cm
JOIN prod_cur_metadata pr ON cm.line_item_usage_account_id = pr.line_item_usage_account_id
WHERE cm.line_item_usage_account_id IN ('1234504482')
AND cm.resource_tags_aws_cloudformation_stack_name IS NOT NULL
AND cm.line_item_usage_start_date
BETWEEN date '2020-01-01'
AND date '2020-01-30'
GROUP BY cm.line_item_usage_account_id,pr.line_of_business, cm.resource_tags_aws_cloudformation_stack_name, CAST(cm.line_item_usage_start_date AS DATE), pr.line_of_business
HAVING sum(cm.line_item_blended_cost) > 0
ORDER BY cm.line_item_usage_account_id
I modified my query to exclude ' ' and that seems to work:
SELECT
cm.line_item_usage_account_id,
pr.line_of_business,
cm.resource_tags_aws_cloudformation_stack_name,
SUM(CASE WHEN cm.line_item_product_code = 'AmazonEC2'
THEN line_item_unblended_cost * 0.97
ELSE cm.line_item_unblended_cost END) AS discounted_cost,
CAST(cm.line_item_usage_start_date AS DATE) AS start_day
FROM cost_management cm
JOIN prod_cur_metadata pr ON cm.line_item_usage_account_id = pr.line_item_usage_account_id
WHERE cm.line_item_usage_account_id IN ('1234504482')
AND NOT cm.resource_tags_aws_cloudformation_stack_name = ' '
AND cm.line_item_usage_start_date
BETWEEN date '2020-01-01'
AND date '2020-01-30'
GROUP BY cm.line_item_usage_account_id,pr.line_of_business, cm.resource_tags_aws_cloudformation_stack_name, CAST(cm.line_item_usage_start_date AS DATE), pr.line_of_business
HAVING sum(cm.line_item_blended_cost) > 0
ORDER BY cm.line_item_usage_account_id
You can handle the single-space case as below:
AND Coalesce(cm.resource_tags_aws_cloudformation_stack_name,' ') !=' '
Or, if the value may consist of multiple spaces, try the following. In Athena an empty string is not NULL, so the result has to be compared against '' rather than tested with IS NOT NULL. The query below is not suitable if spaces are required in the actual data.
AND Regexp_replace(cm.resource_tags_aws_cloudformation_stack_name,' ') != ''
In addition, the data may contain special characters such as CR or LF, although that is a rare scenario.
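The underlying point, that IS NOT NULL does not exclude empty or blank strings, can be checked with a small sketch. SQLite is used here only as a stand-in for Athena (both treat '' as a value distinct from NULL); the table cost and its columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cost (id INTEGER, stack_name TEXT)")
conn.executemany(
    "INSERT INTO cost VALUES (?, ?)",
    [(1, "my-stack"), (2, ""), (3, "   "), (4, None)],
)

# IS NOT NULL only removes the real NULL (id 4); the empty and
# blank strings (ids 2 and 3) still come back.
not_null_ids = [row[0] for row in conn.execute(
    "SELECT id FROM cost WHERE stack_name IS NOT NULL ORDER BY id")]

# Trimming first and comparing against '' removes NULL, empty and
# all-space values in one condition.
non_blank_ids = [row[0] for row in conn.execute(
    "SELECT id FROM cost "
    "WHERE COALESCE(TRIM(stack_name), '') <> '' ORDER BY id")]

print(not_null_ids)
print(non_blank_ids)
```

TRIM only strips spaces here, so values consisting of CR or LF characters would still need separate handling.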

How to pass main query column value to nested sub query Where condition?

I am writing this query with nested subquery to find PREPARED_BY, VERIFIED_BY, AUTHORIZED_BY depending on CONDATE from Expenditure table, but in my sub query the Expenditure table object CONDATE is not recognized and throws this error :
ORA-00904: "EX"."CONDATE": invalid identifier.
Code:
SELECT ex.conno,
ex.itemno,
ex.adv_no || ' ' || to_char(ex.condate, 'DD-MON-YYYY') chequenodate,
ex.conname,
ex.apaid,
ex.dpayment,
gf.gf_name,
expenditure_type,
ex.off_code,
ofc.officename,
ex.remarks,
(SELECT prepared_by
FROM (SELECT prepared_by
FROM authorization
WHERE (pre_last_date >= ex.condate OR pre_last_date IS NULL)
AND project_id = 128
ORDER BY id ASC)
WHERE rownum = 1) AS prepared_by,
(SELECT verified_by
FROM (SELECT verified_by
FROM authorization
WHERE (ve_last_date >= ex.condate OR ve_last_date IS NULL)
AND project_id = 128
ORDER BY id ASC)
WHERE rownum = 1) AS verified_by,
(SELECT authorized_by
FROM (SELECT authorized_by
FROM authorization
WHERE (au_last_date >= ex.condate OR au_last_date IS NULL)
AND project_id = 128
ORDER BY id ASC)
WHERE rownum = 1) AS authorized_by
FROM expenditure ex
INNER JOIN officecode ofc
ON ofc.off_code = ex.off_code
INNER JOIN coa_category ca
ON ca.coa_cat_id = ex.coa_cat_id
INNER JOIN g_fund_type gf
ON gf.gf_type_id = ca.gf_type_id
WHERE ex.conno = 'MGSP/PMU/NON/145'
AND ex.itemno = 149;
The problem you're experiencing is that a parent table can only be referenced by a subquery one level down. You're trying to access columns from the parent table in a subquery two levels down, which is why you're getting the error.
In order to access the parent column in your subquery, you're going to need to rewrite it so that it's only one level down.
This can be achieved by using the KEEP FIRST/LAST aggregate function, e.g.:
SELECT ex.conno,
ex.itemno,
ex.adv_no || ' ' || to_char(ex.condate, 'DD-MON-YYYY') chequenodate,
ex.conname,
ex.apaid,
ex.dpayment,
gf.gf_name,
expenditure_type,
ex.off_code,
ofc.officename,
ex.remarks,
(SELECT MAX(a.prepared_by) KEEP (dense_rank FIRST ORDER BY a.id ASC)
FROM authorization a
WHERE (a.pre_last_date >= ex.condate OR a.pre_last_date IS NULL)
AND a.project_id = 128) prepared_by,
(SELECT MAX(a.verified_by) KEEP (dense_rank FIRST ORDER BY a.id ASC)
FROM authorization a
WHERE (a.ve_last_date >= ex.condate OR a.ve_last_date IS NULL)
AND a.project_id = 128) verified_by,
(SELECT MAX(a.authorized_by) KEEP (dense_rank FIRST ORDER BY a.id ASC)
FROM authorization a
WHERE (a.au_last_date >= ex.condate OR a.au_last_date IS NULL)
AND a.project_id = 128) authorized_by
FROM expenditure ex
INNER JOIN officecode ofc ON ofc.off_code = ex.off_code
INNER JOIN coa_category ca ON ca.coa_cat_id = ex.coa_cat_id
INNER JOIN g_fund_type gf ON gf.gf_type_id = ca.gf_type_id
WHERE ex.conno = 'MGSP/PMU/NON/145'
AND ex.itemno = 149;
N.B. I have used MAX and FIRST here; this means that if there are multiple rows with the same lowest id, the highest value of the prepared_by column will be used. You could change this to MIN if you wanted the lowest value. This is only relevant if you have more than one row per id, otherwise it simply returns the value of the prepared_by column for the lowest id.
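To illustrate the tie-breaking behaviour described in the N.B., here is a rough Python sketch of what MAX(...) KEEP (DENSE_RANK FIRST ORDER BY id) computes. The function name max_keep_first and the sample rows are invented for the example; this models the semantics only and is not how Oracle implements it.

```python
def max_keep_first(rows, order_key, value_key):
    """Return the MAX of value_key among the rows with the lowest order_key."""
    if not rows:
        return None  # an aggregate over no rows yields NULL
    # DENSE_RANK FIRST: keep only the rows that share the lowest sort value.
    lowest = min(row[order_key] for row in rows)
    # MAX ignores NULLs, so drop None before taking the maximum.
    candidates = [
        row[value_key]
        for row in rows
        if row[order_key] == lowest and row[value_key] is not None
    ]
    return max(candidates) if candidates else None


sample = [
    {"id": 1, "prepared_by": "alice"},
    {"id": 1, "prepared_by": "bob"},    # same lowest id: MAX picks this one
    {"id": 2, "prepared_by": "carol"},  # higher id: never considered
]
print(max_keep_first(sample, "id", "prepared_by"))
```

Swapping max for min in the last line of the function corresponds to using MIN instead of MAX in the query.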

Assign Column Name for Simple Coalesce Statement

While attempting to create a list of all ID's made since _____ I am able to get the results I want from the following:
DECLARE @BoID varchar(max)
SELECT @BoID = COALESCE(@BoID + ', ', '') +
CAST(ApplicationID AS varchar(10))
FROM BoList as "ID"
WHERE CreatedDate > '2017-07-01 18:14:09.210'
However, I am having issues with establishing a column name for the above statement. Where does the as "ID" need to be located at in order to give the above result a column name of "ID"?
As the query stands now, you are giving the table BoList an alias of "ID" instead of the column. Since you are selecting the value into a variable there is no output. You can do it like this...
SELECT COALESCE(@BoID + ', ', '') +
CAST(ApplicationID AS varchar(10)) as "ID"
FROM BoList
WHERE CreatedDate > '2017-07-01 18:14:09.210'
Or if you really do need to stash the value in a variable to return later as part of another query...
DECLARE @BoID varchar(max)
SELECT @BoID = COALESCE(@BoID + ', ', '') +
CAST(ApplicationID AS varchar(10))
FROM BoList
WHERE CreatedDate > '2017-07-01 18:14:09.210'
SELECT @BoID AS "ID", other columns... FROM whatever
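For readers unfamiliar with the COALESCE accumulation pattern used above, this Python sketch walks through what happens to the variable row by row. The function name concat_ids is made up; None plays the role of the variable's initial NULL.

```python
def concat_ids(ids):
    """Mimic SELECT @v = COALESCE(@v + ', ', '') + CAST(id AS varchar)."""
    acc = None  # a freshly declared variable starts out as NULL
    for application_id in ids:
        # NULL + ', ' is NULL, so COALESCE falls back to '' on the first row;
        # on later rows the separator is appended to what is already there.
        prefix = "" if acc is None else acc + ", "
        acc = prefix + str(application_id)
    return acc


print(concat_ids([101, 102, 103]))
```

If no row matches the WHERE clause, the variable is never assigned and stays NULL, which corresponds to the None returned for an empty list.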

Count null columns as zeros with oracle

I am running a query with Oracle:
SELECT
c.customer_number,
COUNT(DISTINCT o.ORDER_NUMBER),
COUNT(DISTINCT q.QUOTE_NUMBER)
FROM
Customer c
JOIN Orders o on c.customer_number = o.party_number
JOIN Quote q on c.customer_number = q.account_number
GROUP BY
c.customer_number
This works beautifully and I can get the customer and their order and quote counts.
However, not all customers have orders or quotes but I still want their data. When I use LEFT JOIN I get this error from Oracle:
ORA-24347: Warning of a NULL column in an aggregate function
Seemingly this error is caused by the eventual COUNT(NULL) for customers that are missing orders and/or quotes.
How can I get a COUNT of null values to come out to 0 in this query?
I can do COUNT(DISTINCT NVL(o.ORDER_NUMBER, 0)) but then the counts will come out to 1 if orders/quotes are missing which is no good. Using NVL(o.ORDER_NUMBER, NULL) has the same problem.
Try using inline views:
SELECT
c.customer_number,
NVL(o.order_count, 0) AS order_count,
NVL(q.quote_count, 0) AS quote_count
FROM
customer c,
( SELECT
party_number,
COUNT(DISTINCT order_number) AS order_count
FROM
orders
GROUP BY
party_number
) o,
( SELECT
account_number,
COUNT(DISTINCT quote_number) AS quote_count
FROM
quote
GROUP BY
account_number
) q
WHERE 1=1
AND c.customer_number = o.party_number (+)
AND c.customer_number = q.account_number (+)
;
Sorry, but I'm not working with any databases right now to test this, or to test whatever the ANSI SQL version might be. Just going on memory.
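As for the ANSI join form mentioned above, the following is an untested-on-Oracle sketch that uses SQLite as a stand-in to show the idea: aggregate each child table first, LEFT JOIN the results, and turn missing counts into 0. The sample data is invented, and COALESCE is used in place of NVL because SQLite does not have NVL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_number INTEGER);
CREATE TABLE orders (party_number INTEGER, order_number INTEGER);
CREATE TABLE quote (account_number INTEGER, quote_number INTEGER);
INSERT INTO customer VALUES (1), (2);
INSERT INTO orders VALUES (1, 10), (1, 11);
INSERT INTO quote VALUES (1, 20);
""")

# Customer 2 has neither orders nor quotes, so both outer joins find
# no match and COALESCE turns the resulting NULLs into 0.
rows = conn.execute("""
SELECT c.customer_number,
       COALESCE(o.order_count, 0) AS order_count,
       COALESCE(q.quote_count, 0) AS quote_count
FROM customer c
LEFT JOIN (SELECT party_number,
                  COUNT(DISTINCT order_number) AS order_count
           FROM orders
           GROUP BY party_number) o
       ON c.customer_number = o.party_number
LEFT JOIN (SELECT account_number,
                  COUNT(DISTINCT quote_number) AS quote_count
           FROM quote
           GROUP BY account_number) q
       ON c.customer_number = q.account_number
ORDER BY c.customer_number
""").fetchall()

print(rows)
```

Aggregating before joining also avoids multiplying order rows by quote rows for the same customer.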

Resources