I have two tables in Excel PowerPivot related on the attribute Sys_ID.
Sys_ValueLookUp in Table2 is a lookup from Table1 ( =RELATED(Table1[Sys_Value]) ).
Table1
Sys_ID Sys_Value
Sys-1 10
Sys-2 20
Table2
ID Org_ID Sys_ID_FK Sys_ValueLookUp
1 Org-1 Sys-1 10
2 Org-2 Sys-1 10
3 Org-3 Sys-1 10
4 Org-2 Sys-2 20
5 Org-3 Sys-2 20
In a PowerPivot chart I need Sys_ID_FK and Sys_ValueLookUp, and I need to filter on Org_ID.
I am getting the following result in the pivot chart/table:
Filter: Not set (all)
Result:
Sys-1 30
Sys-2 40
This is wrong; the correct result should be:
Filter: Not set (all)
Result:
Sys-1 10
Sys-2 20
Or, as a second example:
Filter: Org-1
Result:
Sys-1 10
How can I get a result that counts only one value per "Sys"?
Or is there a way to apply the Org filter from Table2 to Table1?
The pivot table is summing Sys_ValueLookUp across all selected rows. If you don't want that, you can switch the aggregation from Sum to Max under Value Field Settings.
I have string values fetched from a table using listagg(column, ',').
I want to loop over this string list and put each value into the WHERE clause of a query against another table,
then get a count of how many of those values have no records in that table (the number of times with no record).
I'm writing this inside a PL/SQL procedure.
1st table (orders):
order_id  name
10        test1
20        test2
22        test3
25        test4
2nd table:
col_id  product  order_id
1       pro1     10
2       pro2     30
3       pro2     38
Expected result: the count (number of times with no record) in the 2nd table.
count = 3
because there is no record for order_ids 20, 22, 25 in the 2nd table;
only order_id 10 has a record.
My queries:
SELECT listagg(ord.order_id, ',')
into wk_orderids
from orders ord
where ord.id_no = wk_id_no;
loop
  -- do my stuff
end loop;
wk_orderids now holds the string '10,20,22,25'
I want to loop over this (wk_orderids), put the values one by one into a SELECT query's WHERE clause,
and then get the count of the number of times there is no record.
If you want to count ORDER_IDs in the 1st table that don't exist in the ORDER_ID column of the 2nd table, then your current approach looks as if you were given the task of doing it in the most complicated way possible. Aggregating values, looping through them, injecting values into a where clause (which then requires dynamic SQL) ... OK, but - why? Why not simply
select count(*)
from (select order_id from first_table
minus
select order_id from second_table
);
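If this has to run inside your PL/SQL procedure, here is a minimal sketch of the same idea. It assumes a NUMBER variable wk_missing_count declared in the procedure, keeps your ord.id_no = wk_id_no filter, and uses order_products as a placeholder name for your (unnamed) 2nd table - adjust both names to your real ones:
-- count order_ids for this id_no that have no row in the 2nd table
select count(*)
  into wk_missing_count
  from (select ord.order_id
          from orders ord
         where ord.id_no = wk_id_no
        minus
        select op.order_id
          from order_products op);
No listagg, no loop, and no dynamic SQL are needed for this.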
Assume I have a table A with 100 records in it in Teradata. Now I have to pass 20 rows at a time, 5 times, to a specific process. I am struggling to segment that whole table of 100 records into 5 subparts; is there any SQL that can give me such data?
Example:
table A
A AA
B BB
C CC
D DD
E EE
F FF
Here I have 6 records; I want to fetch the first 2, then the second 2, and then the last 2 records, one batch at a time. Any SQL help?
If there's some unique column (or combination of columns), you can apply ROW_NUMBER:
select *
from table
QUALIFY
ROW_NUMBER() OVER (ORDER BY unique_column(s)) BETWEEN 3 AND 4
;
Of course, this is not very efficient on a big table.
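If you need all five 20-row segments, one option is to derive a batch number once and filter on it per pass. A sketch, assuming the table is called tableA and has some unique ordering column id (both names are placeholders for your real ones):
select *
from (
    select t.*,
           (row_number() over (order by t.id) - 1) / 20 + 1 as batch_no  -- integer division: 1..5 for 100 rows
    from tableA t
) x
where batch_no = 2;  -- run with batch_no = 1 .. 5 to fetch each 20-row segment
The same ordering column must be used on every pass so the segments don't overlap or skip rows.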
I have a Delta table. This Delta table contains duplicate keys. For example:
id age
1 22
1 23
1 25
2 22
2 11
When merging a new table, which looks like this, into the Delta table:
id age
1 23
1 24
1 23
2 21
2 12
Using this function:
def upsertToDelta(microBatchOutputDF):
    # student_table is the target DeltaTable; microBatchOutputDF is the incoming batch
    (student_table.alias("t")
     .merge(microBatchOutputDF.alias("s"), "s.id = t.id")
     .whenMatchedUpdateAll()
     .whenNotMatchedInsertAll()
     .execute())
It throws an error:
Cannot perform Merge as multiple source rows matched and attempted to modify the same
I understand why this is happening, but what I'd like to know is how I can remove the old keys and insert the new keys even though the ids are the same. So the resulting table should look like this:
id age
1 23
1 24
1 23
2 21
2 12
Is there a way to do this?
This looks like an SCD type 1 change, where the old data is overwritten with the new data. To handle it you need at least one unique value to act as a merge key; a simple row_number is enough to make the incoming rows unique in your case, like this:
Before the merge:
Add a row_number, partitioned by the id column, to the new data if you want each incoming row to be uniquely identifiable. (The merge statement below does not strictly depend on it; the step is only sketched here for understanding.)
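A sketch of that step in Spark SQL, assuming the incoming batch has been registered as a temporary view named microBatchOutputDF (the merge below makes the same assumption); the ORDER BY inside the window is arbitrary:
SELECT id,
       age,
       ROW_NUMBER() OVER (PARTITION BY id ORDER BY age) AS row_num
FROM   microBatchOutputDF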
Merge SQL:
MERGE INTO student_table AS target
USING (
    -- Copies of the incoming rows whose id already exists in the target:
    -- they carry merge_key = id, so they MATCH and the old target rows are deleted.
    SELECT id AS merge_key, id, age
    FROM microBatchOutputDF
    WHERE id IN (SELECT DISTINCT id FROM student_table)
    UNION ALL
    -- One copy of every incoming row with merge_key = NULL:
    -- these never satisfy the ON condition, so they are all inserted as the new data.
    SELECT NULL AS merge_key, id, age
    FROM microBatchOutputDF
) AS source
ON target.id = source.id
AND target.id = source.merge_key
WHEN MATCHED THEN
    DELETE
WHEN NOT MATCHED AND source.merge_key IS NULL THEN
    INSERT (id, age)
    VALUES (source.id, source.age)
;
The result matches the table shown above as the expected output.
I want, for each customer, the row with the latest sale_date before '2019-02-01' (this is in Cloud Spanner). What I thought would work is:
SELECT *
FROM customer_sale
WHERE sale_date < '2019-02-01'
GROUP BY customer_id
HAVING sale_date = MAX(sale_date)
But running this results in an error
HAVING clause expression references column sale_date which is
neither grouped nor aggregated
Is there another way to achieve this in Spanner? And more generally, why isn't the above allowed?
Edit
Example of data in customer_sale table:
customer_id sale_date
-------------------------------
1 Jan 15
1 Jan 30
1 Feb 2
1 Feb 4
2 Jan 15
2 Feb 2
And the expected result:
customer_id sale_date
-------------------------------
1 Jan 30
2 Jan 15
A HAVING clause in SQL specifies that an SQL SELECT statement should
only return rows where aggregate values meet the specified conditions.
It was added to the SQL language because the WHERE keyword could not
be used with aggregate functions
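In other words, HAVING may only reference grouped columns or aggregates. As a hypothetical illustration against the same table, something like this is fine, whereas filtering on the bare sale_date is not:
select customer_id, max(sale_date) as max_date
from customer_sale
group by customer_id
having count(*) > 1;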
This is the test table I am using:
index  customer_id  sale_date
1 1 2017-08-25T07:00:00Z
2 1 2017-08-26T07:00:00Z
3 1 2017-08-27T07:00:00Z
4 1 2017-08-28T07:00:00Z
5 2 2017-08-29T07:00:00Z
6 2 2017-08-30T07:00:00Z
With this query:
Select customer_id, max(sale_date) as max_date
from my_test_table
group by customer_id;
I get this result:
customer_id max_date
1 2017-08-28T07:00:00Z
2 2017-08-30T07:00:00Z
Also, including the WHERE filter:
Select customer_id, max(sale_date) as max_date
from my_test_table
where sale_date < '2017-08-28'
group by customer_id;
I had the same problem and this is how I was able to solve it. If you have quite a big table it might take some time.
Basically, joining your normal table with a derived table that holds the maximum value per customer solves it.
select c.*
from (select * from customer_sale where sale_date < '2019-02-01') c
inner join
     (select customer_id, max(sale_date) as max_sale_date
      from customer_sale
      where sale_date < '2019-02-01'
      group by customer_id) max_c
on c.customer_id = max_c.customer_id and c.sale_date = max_c.max_sale_date
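The same filter can also be written without the join, as a correlated subquery. A sketch, using the same customer_sale table and cut-off date; check that correlated expression subqueries of this shape are supported and perform acceptably in your Spanner setup before relying on it:
select c.*
from customer_sale c
where c.sale_date < '2019-02-01'
  and c.sale_date = (select max(s.sale_date)
                     from customer_sale s
                     where s.customer_id = c.customer_id
                       and s.sale_date < '2019-02-01');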
This should be a fairly easy question for Power Pivot users, since I'm a newbie. I am trying to do the following: after pivoting a table I get a crosstab table like this
rating count of id
A 1
B 2
Grand Total 3
You can imagine the original table only has two columns (rating and id) and three rows (one id for A and two different ids for the B rating). What DAX formula do I have to write in order to create a measure that simply shows
rating percent of id
A 1/3
B 2/3
Grand Total 3/3
By 1/3 I of course mean 0.3333; I wrote it like that so it is clear that I simply want percent of id to be the count for each rating divided by the total count. Thank you very much.
You need to divide the count for each row by the total count.
DIVIDE (
COUNT ( Table1[ID] ),
CALCULATE ( COUNT ( Table1[ID] ), ALL ( Table1 ) )
)
For this particular calculation, you don't have to write DAX though. You can just set it in the Value Field Settings.
Summarize Value By : Count
Show Values As : % of Column Total