Include Summation Row with Group By Clause - sap-ase

Query:
SELECT aType, SUM(Earnings - Expenses) "Rev"
FROM aTable
GROUP BY aType
ORDER BY aType ASC
Results:
| aType | Rev |
| ----- | ----- |
| A | 20 |
| B | 150 |
| C | 250 |
Question:
Is it possible to display a summary row at the bottom such as below by using Sybase syntax within my initial query, or would it have to be a separate query altogether?
| aType | Rev |
| ----- | ----- |
| A | 20 |
| B | 150 |
| C | 250 |
=================
| All | 320 |
I couldn't get the ROLLUP function from SQL to translate over to Sybase successfully but I'm not sure if there is another way to do this, if at all.
Thanks!

Have you tried just using a UNION ALL similar to this:
select aType, Rev
from
(
SELECT aType, SUM(Earnings - Expenses) "Rev", 0 SortOrder
FROM aTable
GROUP BY aType
UNION ALL
SELECT 'All', SUM(Earnings - Expenses) "Rev", 1 SortOrder
FROM aTable
) src
ORDER BY SortOrder, aType
See SQL Fiddle with Demo. This gives the result:
| ATYPE | REV |
---------------
| A | 10 |
| B | 150 |
| C | 250 |
| All | 410 |

Maybe you can use the COMPUTE clause in Sybase, like:
create table #tmp1( name char(9), earning int , expense int)
insert into #tmp1 values("A",30,20)
insert into #tmp1 values("B",50,30)
insert into #tmp1 values("C",60,30)
select name, (earning-expense) resv from #tmp1
group by name
order by name,resv
compute sum(earning-expense)
OR
select name, convert(varchar(15),(earning-expense)) resv from #tmp1
group by name
union all
SELECT "------------------","-----"
union all
select "ALL",convert(varchar(15),sum(earning-expense)) from #tmp1
Thanks,
Gopal

Not all versions of Sybase support ROLLUP. You can do it the old fashioned way:
with t as
(SELECT aType, SUM(Earnings - Expenses) "Rev"
FROM aTable
GROUP BY aType
)
select t.*
from ((select aType, rev from t) union all
(select NULL, sum(rev) from t)
) t
ORDER BY (case when atype is NULL then 1 else 0 end), aType ASC
This is the yucky, brute force approach. If this version of Sybase doesn't support with, you can do:
select t.aType, t.Rev
from ((SELECT aType, SUM(Earnings - Expenses) "Rev"
FROM aTable
GROUP BY aType
) union all
(select NULL, SUM(Earnings - Expenses) FROM aTable)
) t
ORDER BY (case when atype is NULL then 1 else 0 end), aType ASC
This is pretty basic, standard SQL.

Related

Spark SQL - best way to programmatically loop over a table

Say I have the following spark dataframe:
| Node_id | Parent_id |
|---------|-----------|
| 1 | NULL |
| 2 | 1 |
| 3 | 1 |
| 4 | NULL |
| 5 | 4 |
| 6 | NULL |
| 7 | 6 |
| 8 | 3 |
This dataframe represents a tree structure consisting of several disjoint trees. Now, say that we have a list of nodes [8, 7], and we want to get a dataframe containing just the nodes that are roots of the trees containing the nodes in the list. The output looks like:
| Node_id | Parent_id |
|---------|-----------|
| 1 | NULL |
| 6 | NULL |
What would be the best (fastest) way to do this with spark queries and pyspark?
If I were doing this in plain SQL I would just do something like this:
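CREATE TABLE #Tmp (
Node_id int,
Parent_id int
)
-- seed with the starting child nodes
INSERT INTO #Tmp
SELECT Node_id, Parent_id FROM Nodes WHERE Node_id IN (8, 7)
DECLARE @num int
SELECT @num = COUNT(*) FROM #Tmp WHERE Parent_id IS NOT NULL
WHILE @num > 0
BEGIN
-- add each parent that is not in #Tmp yet
INSERT INTO #Tmp
SELECT DISTINCT
p.Node_id,
p.Parent_id
FROM
#Tmp t
JOIN Nodes p
ON t.Parent_id = p.Node_id
WHERE p.Node_id NOT IN (SELECT Node_id FROM #Tmp)
-- stop once a pass adds no new rows
SELECT @num = @@ROWCOUNT
END
SELECT Node_id FROM #Tmp WHERE Parent_id IS NULL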
Just wanted to know if there's a more spark-centric way of doing this using pyspark, beyond the obvious method of simply looping over the dataframe using python.
parent_nodes = spark.sql("select Parent_id from table_name where Node_id in (2, 7)").distinct()
You can join the above dataframe with the table to get the Parent_id of those nodes as well.
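The looping idea from the question can be sketched in plain Python before porting it to Spark: every pass replaces each node that still has a parent with that parent, until nothing changes. This is only an illustration of the logic on the sample data, not a PySpark implementation; in Spark each pass would roughly correspond to one join against the node table. The names find_roots and PARENT_OF are hypothetical, and the sketch assumes the data contains no cycles.

```python
def find_roots(parent_of, start_nodes):
    """Climb from each start node to the root of its tree.

    parent_of maps Node_id -> Parent_id (None for a root).
    Assumes the parent links form trees (no cycles).
    """
    current = set(start_nodes)
    while True:
        # One pass: move every non-root node up to its parent.
        climbed = {
            parent_of[node] if parent_of[node] is not None else node
            for node in current
        }
        if climbed == current:  # only roots are left
            return sorted(current)
        current = climbed


# Sample data from the question.
PARENT_OF = {1: None, 2: 1, 3: 1, 4: None, 5: 4, 6: None, 7: 6, 8: 3}

roots = find_roots(PARENT_OF, [8, 7])
print(roots)
```

The number of passes equals the depth of the deepest start node, which is also how many joins an iterative Spark version would need.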

How to combine two columns into one in Sqlite and also get the underlying value of the Foreign Key?

I want to combine two columns from a table into one column and then get the actual values behind the foreign keys. I can do these things individually but not together.
Following the answer below I was able to combine the two columns into one using the first sql statement below.
How to combine 2 columns into a new one in sqlite
The combining process is shown below:
+---+---+
|HT | AT|
+---+---+
|1 | 2 |
|5 | 7 |
|9 | 5 |
+---+---+
into one column as shown:
+---+
|HT |
+---+
| 1 |
| 5 |
| 9 |
| 2 |
| 7 |
| 5 |
+---+
The second SQL statement shows the actual value corresponding to each foreign key id. The foreign key table:
+-----+------------------------+
|T_id | TN |
+-----+------------------------+
| 1 | 'Dallas Cowboys' |
| 2 | 'Chicago Bears' |
| 5 | 'New England Patriots' |
| 7 | 'New York Giants' |
| 9 | 'New York Jets' |
+-----+------------------------+
sql = "SELECT * FROM (SELECT M.HT FROM M UNION SELECT M.AT FROM M)t"
The second sql statement lets me get the foreign key values for each value in M.HT.
sql = "SELECT M.HT, T.TN FROM M INNER JOIN T ON M.HT = T.Tid WHERE strftime('%Y-%m-%d', M.ST) BETWEEN \'2015-08-01\' AND \'2016-06-30\' AND M.Comp = 6 ORDER BY M.ST"
Result of second SQL statement:
+-----+------------------------+
| HT | TN |
+-----+------------------------+
| 1 | 'Dallas Cowboys' |
| 5 | 'New England Patriots' |
| 9 | 'New York Jets' |
+-----+------------------------+
But try as I might I have not been able to combine these queries!
I believe the following will work (assuming that the tables are Match and T, and barring the WHERE and ORDER BY clauses for brevity/ease) :-
SELECT DISTINCT(m.ht), t.tn
FROM
(SELECT Match.HT FROM Match UNION SELECT Match.AT FROM Match) AS m
JOIN T ON t.tid = m.ht
JOIN Match ON (m.ht = Match.ht OR m.ht = Match.at)
/* WHERE and ORDER BY clauses using Match as m only has columns ht and at */
WHERE strftime('%Y-%m-%d', Match.ST)
BETWEEN '2015-08-01' AND '2016-06-30' AND Match.Comp = 6
ORDER BY Match.ST
;
Note only tested without the WHERE and ORDER BY clause.
That is using :-
DROP TABLE IF EXISTS Match;
DROP TABLE IF EXISTS T;
CREATE TABLE IF NOT EXISTS Match (ht INTEGER, at INTEGER, st TEXT DEFAULT (datetime('now')));
CREATE TABLE IF NOT EXISTS t (tid INTEGER PRIMARY KEY, tn TEXT);
INSERT INTO T (tn) VALUES('Cows'),('Bears'),('a'),('b'),('Pats'),('c'),('Giants'),('d'),('Jets');
INSERT INTO Match (ht,at) VALUES (1,2),(5,7),(9,5);
/* Directly without the Common Table Expression */
SELECT
DISTINCT(m.ht), t.tn,
Match.st /*<<<<< Added to show results of obtaining other values from Matches >>>>> */
FROM
(SELECT Match.HT FROM Match UNION SELECT Match.AT FROM Match) AS m
JOIN T ON t.tid = m.ht
JOIN Match ON (m.ht = Match.ht OR m.ht = Match.at)
/* WHERE and ORDER BY clauses here using Match */
;
Noting that limited data (just the one extra column) was used for brevity
Results in :-
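Independently of the test above, the core of the combined query (the UNION subquery joined to the team table) can be checked from Python. This is a minimal sketch using the sample data from the question; the table names M and T follow the question's second statement, while the alias team_id and the function name team_names are hypothetical. The date and competition filters are left out because the sample data has no such columns.

```python
import sqlite3


def team_names():
    """Merge HT and AT into one column and look up each team's name."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE T (Tid INTEGER PRIMARY KEY, TN TEXT);
        CREATE TABLE M (HT INTEGER, AT INTEGER);
        INSERT INTO T VALUES
            (1, 'Dallas Cowboys'), (2, 'Chicago Bears'),
            (5, 'New England Patriots'), (7, 'New York Giants'),
            (9, 'New York Jets');
        INSERT INTO M VALUES (1, 2), (5, 7), (9, 5);
        """
    )
    rows = conn.execute(
        """
        SELECT u.team_id, T.TN
        FROM (SELECT HT AS team_id FROM M
              UNION
              SELECT AT FROM M) AS u
        JOIN T ON T.Tid = u.team_id
        ORDER BY u.team_id
        """
    ).fetchall()
    conn.close()
    return rows


for team_id, name in team_names():
    print(team_id, name)
```

Note that UNION removes duplicates, so team 5 (which appears once as HT and once as AT) is listed once; UNION ALL would keep both occurrences.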

With hive's complex struct data type, how to write query with where clause

I have the following Hive table with a complex data type, STRUCT. Can you please help me write a Hive query with a WHERE clause for a specific city?
CREATE EXTERNAL TABLE user_t (
name STRING,
id BIGINT,
isFTE BOOLEAN,
role VARCHAR(64),
salary DECIMAL(8,2),
phones ARRAY<INT>,
deductions MAP<STRING, FLOAT>,
address ARRAY<STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>>,
others UNIONTYPE<FLOAT,BOOLEAN,STRING>,
misc BINARY
)
I'm able to use the STRUCT data type in the SELECT clause but not in the WHERE clause.
Working:
select address.city from user_t;
Not working:
select address.city from user_t where address.city = 'XYZ'
The documentation says there is a limitation when using it in a GROUP BY or WHERE clause and gives a workaround as well, but I didn't understand it clearly.
Link: Documentation
Please suggest. Thank you.
Demo
create table user_t
(
id bigint
,address array<struct<street:string, city:string, state:string, zip:int>>
)
;
insert into user_t
select 1
,array
(
named_struct('street','street_1','city','city_1','state','state_1','zip',11111)
,named_struct('street','street_2','city','city_1','state','state_1','zip',11111)
,named_struct('street','street_3','city','city_3','state','state_3','zip',33333)
)
union all
select 2
,array
(
named_struct('street','street_4','city','city_4','state','state_4','zip',44444)
,named_struct('street','street_5','city','city_5','state','state_5','zip',55555)
)
;
Option 1: explode
select u.id
,a.*
from user_t as u
lateral view explode(address) a as details
where details.city = 'city_1'
;
+----+---------------------------------------------------------------------+
| id | details |
+----+---------------------------------------------------------------------+
| 1 | {"street":"street_1","city":"city_1","state":"state_1","zip":11111} |
| 1 | {"street":"street_2","city":"city_1","state":"state_1","zip":11111} |
+----+---------------------------------------------------------------------+
Option 2: inline
select u.id
,a.*
from user_t as u
lateral view inline(address) a
where a.city = 'city_1'
;
+----+----------+--------+---------+-------+
| id | street | city | state | zip |
+----+----------+--------+---------+-------+
| 1 | street_1 | city_1 | state_1 | 11111 |
| 1 | street_2 | city_1 | state_1 | 11111 |
+----+----------+--------+---------+-------+
Option 3: self join
select u.*
from user_t as u
join (select distinct
u.id
from user_t as u
lateral view inline(address) a
where a.city = 'city_1'
) as u2
on u2.id = u.id
;
+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | address |
+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 1 | [{"street":"street_1","city":"city_1","state":"state_1","zip":11111},{"street":"street_2","city":"city_1","state":"state_1","zip":11111},{"street":"street_3","city":"city_3","state":"state_3","zip":33333}] |
+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

SQL Transpose rows to columns (group by key variable)?

I am trying to transpose rows into columns, grouping by a unique identifier (CASE_ID).
I have a table with this structure:
CASE_ID AMOUNT TYPE
100 10 A
100 50 B
100 75 A
200 33 B
200 10 C
And I am trying to query it to produce this structure...
| CASE_ID | AMOUNT1 | TYPE1 | AMOUNT2 | TYPE2 | AMOUNT3 | TYPE3 |
|---------|---------|-------|---------|-------|---------|--------|
| 100 | 10 | A | 50 | B | 75 | A |
| 200 | 33 | B | 10 | C | (null) | (null) |
(assume much larger dataset with large number of possible values for CASE_ID, TYPE and AMOUNT)
I tried to use pivot but I don't need an aggregate function (simply trying to restructure the data). Now I'm trying to somehow use row_number but not sure how.
I'm basically trying to replicate an SPSS command called Casestovars, but need to be able to do it in SQL. Thanks.
You can get the result by creating a sequential number with row_number() and then use an aggregate function with CASE expression:
select case_id,
max(case when seq = 1 then amount end) amount1,
max(case when seq = 1 then type end) type1,
max(case when seq = 2 then amount end) amount2,
max(case when seq = 2 then type end) type2,
max(case when seq = 3 then amount end) amount3,
max(case when seq = 3 then type end) type3
from
(
select case_id, amount, type,
row_number() over(partition by case_id
order by case_id) seq
from yourtable
) d
group by case_id;
See SQL Fiddle with Demo.
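The row_number() plus conditional aggregation pattern can be reproduced with Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). This is a minimal sketch on the question's sample data. One assumption differs from the query above: an autoincrementing id column is added and used in the window's ORDER BY, because ordering by case_id inside a partition on case_id leaves the sequence within each case undefined. The function name transpose_cases is hypothetical.

```python
import sqlite3


def transpose_cases():
    """Spread up to three (amount, type) pairs per case_id across columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourtable ("
        "id INTEGER PRIMARY KEY, case_id INTEGER, amount INTEGER, type TEXT)"
    )
    conn.executemany(
        "INSERT INTO yourtable (case_id, amount, type) VALUES (?, ?, ?)",
        [(100, 10, "A"), (100, 50, "B"), (100, 75, "A"),
         (200, 33, "B"), (200, 10, "C")],
    )
    rows = conn.execute(
        """
        SELECT case_id,
               MAX(CASE WHEN seq = 1 THEN amount END) AS amount1,
               MAX(CASE WHEN seq = 1 THEN type END)   AS type1,
               MAX(CASE WHEN seq = 2 THEN amount END) AS amount2,
               MAX(CASE WHEN seq = 2 THEN type END)   AS type2,
               MAX(CASE WHEN seq = 3 THEN amount END) AS amount3,
               MAX(CASE WHEN seq = 3 THEN type END)   AS type3
        FROM (
            SELECT case_id, amount, type,
                   ROW_NUMBER() OVER (PARTITION BY case_id ORDER BY id) AS seq
            FROM yourtable
        ) d
        GROUP BY case_id
        ORDER BY case_id
        """
    ).fetchall()
    conn.close()
    return rows


for row in transpose_cases():
    print(row)
```

MAX() is only there to satisfy GROUP BY: each (case_id, seq) pair matches exactly one row, so no real aggregation happens.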
If you are using a database product that has the PIVOT function, then you can use row_number() with PIVOT, but I would suggest that you first unpivot the amount and type columns. The basic syntax for a limited number of values in SQL Server would be:
select case_id, amount1, type1, amount2, type2, amount3, type3
from
(
select case_id, col+cast(seq as varchar(10)) as col, value
from
(
select case_id, amount, type,
row_number() over(partition by case_id
order by case_id) seq
from yourtable
) d
cross apply
(
select 'amount', cast(amount as varchar(20)) union all
select 'type', type
) c (col, value)
) src
pivot
(
max(value)
for col in (amount1, type1, amount2, type2, amount3, type3)
) piv;
See SQL Fiddle with Demo.
If you have an unknown number of values, then you can use dynamic SQL to get the result - SQL Server syntax would be:
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT ',' + QUOTENAME(col+cast(seq as varchar(10)))
from
(
select row_number() over(partition by case_id
order by case_id) seq
from yourtable
) d
cross apply
(
select 'amount', 1 union all
select 'type', 2
) c (col, so)
group by col, so
order by seq, so
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT case_id,' + @cols + '
from
(
select case_id, col+cast(seq as varchar(10)) as col, value
from
(
select case_id, amount, type,
row_number() over(partition by case_id
order by case_id) seq
from yourtable
) d
cross apply
(
select ''amount'', cast(amount as varchar(20)) union all
select ''type'', type
) c (col, value)
) x
pivot
(
max(value)
for col in (' + @cols + ')
) p '
execute sp_executesql @query;
See SQL Fiddle with Demo. Each version will give the result:
| CASE_ID | AMOUNT1 | TYPE1 | AMOUNT2 | TYPE2 | AMOUNT3 | TYPE3 |
|---------|---------|-------|---------|-------|---------|--------|
| 100 | 10 | A | 50 | B | 75 | A |
| 200 | 33 | B | 10 | C | (null) | (null) |

use row data as columns in PostgreSQL 9.1

--------------------
|bookname |author |
--------------------
|book1 |author1 |
|book1 |author2 |
|book2 |author3 |
|book2 |author4 |
|book3 |author5 |
|book3 |author6 |
|book4 |author7 |
|book4 |author8 |
---------------------
But I want the book names as columns and the authors as their rows, e.g.:
----------------------------------
|book1 |book2 |book3 |book4 |
----------------------------------
|author1|author3 |author5|author7|
|author2|author4 |author6|author8|
----------------------------------
Is this possible in Postgres? How can I do this?
I tried crosstab but I failed to do this.
You can get the result using an aggregate function with a CASE expression but I would first use row_number() so you have a value that can be used to group the data.
If you use row_number() then the query could be:
select
max(case when bookname = 'book1' then author end) book1,
max(case when bookname = 'book2' then author end) book2,
max(case when bookname = 'book3' then author end) book3,
max(case when bookname = 'book4' then author end) book4
from
(
select bookname, author,
row_number() over(partition by bookname
order by author) seq
from yourtable
) d
group by seq;
See SQL Fiddle with Demo. I added the row_number() so you will return each distinct value for the books. If you exclude the row_number(), then using an aggregate with a CASE will return only one value for each book.
This query gives the result:
| BOOK1 | BOOK2 | BOOK3 | BOOK4 |
-----------------------------------------
| author1 | author3 | author5 | author7 |
| author2 | author4 | author6 | author8 |
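The query above hard-codes one CASE expression per book. When the set of books is not known in advance, the column list can be generated from the data first. The following is a rough sketch of that idea using Python's sqlite3 module as a stand-in for PostgreSQL (window functions need SQLite 3.25 or newer); the function name authors_by_book is hypothetical, and the book names are passed as bound parameters rather than pasted into the SQL text.

```python
import sqlite3


def authors_by_book():
    """Pivot authors into one column per book, building the columns from the data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourtable (bookname TEXT, author TEXT)")
    conn.executemany(
        "INSERT INTO yourtable VALUES (?, ?)",
        [("book1", "author1"), ("book1", "author2"),
         ("book2", "author3"), ("book2", "author4"),
         ("book3", "author5"), ("book3", "author6"),
         ("book4", "author7"), ("book4", "author8")],
    )
    # Step 1: discover the output columns.
    books = [row[0] for row in conn.execute(
        "SELECT DISTINCT bookname FROM yourtable ORDER BY bookname")]
    # Step 2: one conditional aggregate per book, each with a placeholder.
    select_list = ", ".join(
        "MAX(CASE WHEN bookname = ? THEN author END)" for _ in books)
    query = f"""
        SELECT {select_list}
        FROM (
            SELECT bookname, author,
                   ROW_NUMBER() OVER (PARTITION BY bookname
                                      ORDER BY author) AS seq
            FROM yourtable
        ) d
        GROUP BY seq
        ORDER BY seq
    """
    rows = conn.execute(query, books).fetchall()
    conn.close()
    return books, rows


header, rows = authors_by_book()
print(header)
for row in rows:
    print(row)
```

Grouping by seq is what produces one output row per author position: the first author of every book lands in row 1, the second in row 2, and so on.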
