Oracle Query - Join with comma separated data - string

Table Name : crm_mrdetails
id | mr_name | me_email | mr_mobile | mr_doctor|
-----------------------------------------------------
1 | John |abc@gmail.com | 1234555555 | ,1,2,3 |
Table Name : crm_mr_doctor
id | dr_name | specialization|
----------------------------------
1 | Abhishek | cordiologist |
2 | Krishnan | Physician |
3 | Krishnan | Nurse |
The concatenated values in mrdetails.mr_doctor are the foreign keys for mr_doctor.id. I need to join on them to produce output like this:
id | mr_name | me_email |Doctor_Specialization|
-------------------------------------------------
1 | John |abc@gmail.com |cordiologist,Physician,Nurse|
I'm new to Oracle, I'm using Oracle 12C. Any help much appreciated.

First of all, we must acknowledge that this is a bad data model. The column mr_doctor violates First Normal Form. This is not some abstruse theoretical point. Not being in 1NF means we must write more code to look up the meaning of the keys instead of using standard SQL join syntax. It also means we cannot depend on the column containing valid IDs: mr_doctor can contain any old nonsense and we must write a query which can handle that. See Is storing a delimited list in a database column really that bad? for more on this.
Anyway. Here is a solution which uses regular expressions to split the mr_doctor column into IDs and then joins them to the mr_doctor table. The specialization column is concatenated to produce the required output.
select mrdet.id,
       mrdet.mr_name,
       mrdet.me_email,
       listagg(mrdoc.specialization, ',')
           within group (order by mrdoc.specialization) as doctor_specialization
from crm_mrdetails mrdet
join (
    -- one row per (id, extracted doctor id); capture group 2 is the digits
    select distinct id,
           regexp_substr(mr_doctor, '(,?)([0-9]+)(,?)', 1, level, null, 2) as dr_id
    from crm_mrdetails
    connect by level <= regexp_count(mr_doctor, '(,?)([0-9]+)')
) mrids  -- no AS here: Oracle does not accept AS before a table alias
    on mrids.id = mrdet.id
left outer join crm_mr_doctor mrdoc
    on mrids.dr_id = mrdoc.id
group by mrdet.id,
         mrdet.mr_name,
         mrdet.me_email
/
This solution is reasonably resilient despite the data model being brittle. It will return results if the string has too many commas, or spaces. It will ignore values which are letters or otherwise aren't numbers. It won't hurl if the extracted number doesn't match an ID in the mr_doctor table. Obviously the results are untrustworthy for those reasons, but that's part of the price of a shonky data model.
Can you please explain the following: (,?)([0-9]+)(,?)
The pattern matches zero or one comma, followed by one or more digits, followed by zero or one comma. Perhaps the (,?) groups in the pattern aren't strictly necessary. However, without them, the string 2 3 4 would match the same three IDs as the string 2,3,4. Maybe that's correct, maybe it isn't. When the foreign keys are stored in a CSV column instead of being enforced through a proper constraint, what does 'correct' even mean?
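To see what the pattern extracts, here is a minimal Python sketch. It assumes Python's re module behaves like Oracle's REGEXP_SUBSTR for this simple pattern, and extract_ids is a hypothetical helper name, not something from the query above.

```python
import re

# Same pattern as in the query: optional comma, digits, optional comma.
PATTERN = re.compile(r"(,?)([0-9]+)(,?)")

def extract_ids(mr_doctor):
    """Return the digit runs (capture group 2) found in the string, in order."""
    return [m.group(2) for m in PATTERN.finditer(mr_doctor)]

print(extract_ids(",1,2,3"))       # ['1', '2', '3']
print(extract_ids("2 3 4"))        # ['2', '3', '4'] (spaces separate IDs too)
print(extract_ids(",1,,x,22 ,3"))  # ['1', '22', '3'] (letters and extra commas are skipped)
```

The extracted values are still strings at this point, just as dr_id is in the SQL.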

You have to split the data in the mr_doctor column into rows, join the table crm_mr_doctor, and then use listagg().
How to split data? Splitting string into multiple rows in Oracle
select t.id, max(mr_name) mr_name,
listagg(specialization, ', ') within group (order by rn) specs
from (
select id, mr_name, levels.column_value rn,
trim(regexp_substr(mr_doctor, '[^,]+', 1, levels.column_value)) as did
from crm_mrdetails t,
table(cast(multiset(select level
from dual
connect by level <=
length(regexp_replace(t.mr_doctor, '[^,]+')) + 1)
as sys.odcinumberlist)) levels) t
left join crm_mr_doctor d on t.did = d.id
group by t.id
Demo and result:
with crm_mrdetails (id, mr_name, mr_doctor) as (
select 1, 'John', ',1,2,3' from dual union all
select 2, 'Anne', ',4,2,6,5' from dual union all
select 3, 'Dave', ',4' from dual),
crm_mr_doctor (id, dr_name, specialization) as (
select 1, 'Abhishek', 'cordiologist' from dual union all
select 2, 'Krishnan', 'Physician' from dual union all
select 3, 'Krishnan', 'Nurse' from dual union all
select 4, 'Krishnan', 'Onkologist' from dual union all
select 5, 'Krishnan', 'Surgeon' from dual union all
select 6, 'Krishnan', 'Nurse' from dual
)
select t.id, max(mr_name) mr_name,
listagg(specialization, ', ') within group (order by rn) specs
from (
select id, mr_name, levels.column_value rn,
trim(regexp_substr(mr_doctor, '[^,]+', 1, levels.column_value)) as did
from crm_mrdetails t,
table(cast(multiset(select level
from dual
connect by level <=
length(regexp_replace(t.mr_doctor, '[^,]+')) + 1)
as sys.odcinumberlist)) levels) t
left join crm_mr_doctor d on t.did = d.id
group by t.id
Output:
ID MR_NAME SPECS
------ ------- -------------------------------------
1 John cordiologist, Physician, Nurse
2 Anne Onkologist, Physician, Nurse, Surgeon
3 Dave Onkologist
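The three steps (split, look up, aggregate) can be sketched outside the database as well. This is only an illustration in Python using the demo data above; the function name specs and the decision to skip non-numeric tokens are assumptions of the sketch, not part of the SQL.

```python
# Demo data copied from the WITH clause above.
crm_mrdetails = [
    (1, "John", ",1,2,3"),
    (2, "Anne", ",4,2,6,5"),
    (3, "Dave", ",4"),
]
crm_mr_doctor = {
    1: "cordiologist", 2: "Physician", 3: "Nurse",
    4: "Onkologist", 5: "Surgeon", 6: "Nurse",
}

def specs(mr_doctor):
    """Split on commas, look each ID up, and join the matches in list order."""
    found = []
    for token in mr_doctor.split(","):
        token = token.strip()
        # Empty tokens and unknown IDs contribute nothing, as NULLs do in LISTAGG.
        # Skipping non-numeric tokens is an extra assumption of this sketch.
        if token.isdigit() and int(token) in crm_mr_doctor:
            found.append(crm_mr_doctor[int(token)])
    return ", ".join(found)

report = [(mr_id, name, specs(ids)) for mr_id, name, ids in crm_mrdetails]
for row in report:
    print(row)
```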

You can use a recursive sub-query and simple string functions (which may be faster than using regular expressions and a correlated hierarchical query):
Oracle Setup:
CREATE TABLE crm_mrdetails (id, mr_name, mr_doctor) as
select 1, 'John', ',1,2,3' from dual union all
select 2, 'Anne', ',4,2,6,5' from dual union all
select 3, 'Dave', ',4' from dual;
CREATE TABLE crm_mr_doctor (id, dr_name, specialization) as
select 1, 'Abhishek', 'cordiologist' from dual union all
select 2, 'Krishnan', 'Physician' from dual union all
select 3, 'Krishnan', 'Nurse' from dual union all
select 4, 'Krishnan', 'Onkologist' from dual union all
select 5, 'Krishnan', 'Surgeon' from dual union all
select 6, 'Krishnan', 'Nurse' from dual;
Query:
WITH crm_mrdetails_bounds ( id, mr_name, mr_doctor, start_pos, end_pos ) AS (
SELECT id,
mr_name,
mr_doctor,
2,
INSTR( mr_doctor, ',', 2 )
FROM crm_mrdetails
UNION ALL
SELECT id,
mr_name,
mr_doctor,
end_pos + 1,
INSTR( mr_doctor, ',', end_pos + 1 )
FROM crm_mrdetails_bounds
WHERE end_pos > 0
),
crm_mrdetails_specs ( id, mr_name, start_pos, specialization_id ) AS (
SELECT id,
mr_name,
start_pos,
TO_NUMBER(
CASE end_pos
WHEN 0
THEN SUBSTR( mr_doctor, start_pos )
ELSE SUBSTR( mr_doctor, start_pos, end_pos - start_pos )
END
)
FROM crm_mrdetails_bounds
)
SELECT s.id,
MAX( s.mr_name ) AS mr_name,
LISTAGG( d.specialization, ',' )
WITHIN GROUP ( ORDER BY s.start_pos )
AS doctor_specialization
FROM crm_mrdetails_specs s
INNER JOIN crm_mr_doctor d
ON ( s.specialization_id = d.id )
GROUP BY s.id
Output:
ID | MR_NAME | DOCTOR_SPECIALIZATION
-: | :------ | :---------------------------------
1 | John | cordiologist,Physician,Nurse
2 | Anne | Onkologist,Physician,Nurse,Surgeon
3 | Dave | Onkologist
db<>fiddle here
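For readers who find the recursive sub-query hard to follow, here is a rough Python sketch of the same start_pos / end_pos walk. The helper instr imitates Oracle's 1-based INSTR (0 when nothing is found); both function names are illustrative. Like the SQL, it assumes every value starts with a comma and contains only numbers.

```python
def instr(s, sub, start):
    """1-based position of sub in s at or after start, or 0 if absent (like INSTR)."""
    return s.find(sub, start - 1) + 1

def split_ids(mr_doctor):
    """Walk the string the way crm_mrdetails_bounds does and return the IDs."""
    ids = []
    start_pos = 2                        # anchor row: skip the leading comma
    end_pos = instr(mr_doctor, ",", 2)
    while True:
        if end_pos == 0:                 # no further comma: take the rest of the string
            ids.append(int(mr_doctor[start_pos - 1:]))
            break
        ids.append(int(mr_doctor[start_pos - 1:end_pos - 1]))
        start_pos = end_pos + 1          # recursive row: move past the comma just found
        end_pos = instr(mr_doctor, ",", start_pos)
    return ids

print(split_ids(",1,2,3"))    # [1, 2, 3]
print(split_ids(",4,2,6,5"))  # [4, 2, 6, 5]
print(split_ids(",4"))        # [4]
```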

Please change the column names according to your requirement.
CREATE OR REPLACE Function ReplaceSpec
(String_Inside IN Varchar2)
Return Varchar2 Is
outputString Varchar2(5000);
tempOutputString crm_doc.specialization%TYPE;
Begin
FOR i in 1..(LENGTH(String_Inside)-LENGTH(REPLACE(String_Inside,',',''))+1)
LOOP
Select specialization into tempOutputString From crm_doc
Where id = PARSING_STRING(String_Inside,i);
If i != 1 Then
outputString := outputString || ',';
end if;
outputString := outputString || tempOutputString;
END LOOP;
Return outputString;
End;
/
The PARSING_STRING function helps split the comma-separated values:
CREATE OR REPLACE Function PARSING_STRING
(String_Inside IN Varchar2, Position_No IN Number)
Return Varchar2 Is
OurEnd Number; Beginn Number;
Begin
If Position_No < 1 Then
Return Null;
End If;
OurEnd := Instr(String_Inside, ',', 1, Position_No);
If OurEnd = 0 Then
OurEnd := Length(String_Inside) + 1;
End If;
If Position_No = 1 Then
Beginn := 1;
Else
Beginn := Instr(String_Inside, ',', 1, Position_No-1) + 1;
End If;
Return Substr(String_Inside, Beginn, OurEnd-Beginn);
End;
/
Please note that I have given only a basic function to get your output. You might need to add some exceptions etc.
Eg. When the doc_id [mr_doctor] is empty, what to do.
Usage
select t1.*,ReplaceSpec(doc_id) from crm_details t1
If your mr_doctor data always starts with a comma, use:
Select t1.*,ReplaceSpec(Substr(doc_id,2)) from crm_details t1
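To illustrate why the leading comma matters, here is a hedged Python transcription of the PARSING_STRING logic. The helper instr_nth stands in for Oracle's INSTR with an occurrence argument; the names are only for this sketch.

```python
def instr_nth(s, sub, nth):
    """1-based position of the nth occurrence of sub in s, or 0 if there is none."""
    pos = -1
    for _ in range(nth):
        pos = s.find(sub, pos + 1)
        if pos == -1:
            return 0
    return pos + 1

def parsing_string(string_inside, position_no):
    """Return the comma-separated token at position_no (1-based)."""
    if position_no < 1:
        return None
    our_end = instr_nth(string_inside, ",", position_no)
    if our_end == 0:
        our_end = len(string_inside) + 1     # last token runs to the end
    if position_no == 1:
        begin = 1
    else:
        begin = instr_nth(string_inside, ",", position_no - 1) + 1
    return string_inside[begin - 1:our_end - 1]

print(parsing_string("1,2,3", 2))    # 2
print(repr(parsing_string(",1,2,3", 1)))  # '' (empty first token because of the leading comma)
```

With the leading comma left in place, the first token is empty, so the lookup for that position finds no doctor. Stripping it with Substr(doc_id,2) avoids that.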

Please go through https://oracle-base.com/articles/misc/string-aggregation-techniques
String Aggregation Techniques
or
SELECT deptno,
LTRIM(MAX(SYS_CONNECT_BY_PATH(ename,','))
KEEP (DENSE_RANK LAST ORDER BY curr),',') AS employees
FROM (SELECT deptno,
ename,
ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY ename) AS curr,
ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY ename) -1 AS prev
FROM emp)
GROUP BY deptno
CONNECT BY prev = PRIOR curr AND deptno = PRIOR deptno
START WITH curr = 1
or
listagg and wm_concat can also be used, as other answers have shown.

How about this one? I have not tested it, so there may be syntax errors.
select id,mr_name,me_email,listagg(specialization,',') within group (order by specialization) as Doctor_Specialization
from
(select dtls.id,dtls.mr_name,dtls.me_email,dr.specialization
from crm_mrdetails dtls,
crm_mr_doctor dr
where INSTR(','||dtls.mr_doctor||',' , ','||dr.id||',') > 0
) group by id,mr_name,me_email;
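The reason for wrapping both sides in commas is easy to miss, so here is a small Python sketch of the same membership test. The function name is_listed is made up for the illustration.

```python
def is_listed(mr_doctor, doctor_id):
    """True if doctor_id appears as a whole element of the comma-separated list."""
    wrapped_list = "," + mr_doctor + ","       # ','||dtls.mr_doctor||','
    wrapped_id = "," + str(doctor_id) + ","    # ','||dr.id||','
    return wrapped_id in wrapped_list          # INSTR(...) > 0

print(is_listed(",1,2,3", 1))   # True
print(is_listed(",11,12", 1))   # False: ',1,' does not occur in ',,11,12,'
print("1" in ",11,12")          # True: the unwrapped substring test gives a false match
```

Note that this only works when the stored list has no spaces around the IDs.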

Related

Insert new rows, continue existing rowset row_number count

I'm attempting to perform some sort of upsert operation in U-SQL where I pull data every day from a file, and compare it with yesterdays data which is stored in a table in Data Lake Storage.
I have created an ID column in the table in DL using row_number(), and it is this "counter" I wish to continue when appending new rows to the old dataset. E.g.
Last inserted row in DL table could look like this:
ID | Column1 | Column2
---+------------+---------
10 | SomeValue | 1
I want the next rows to have the following ascending ids
11 | SomeValue | 1
12 | SomeValue | 1
How would I go about making sure that the next X rows continues the ID count incrementally such that the next rows each increases the ID column by 1 more than the last?
You could use ROW_NUMBER and then add it to the max value from the original table (i.e. using CROSS JOIN and MAX). A simple demo of the technique:
DECLARE @outputFile string = @"\output\output.csv";

@originalInput =
    SELECT *
    FROM ( VALUES
        ( 10, "SomeValue 1", 1 )
    ) AS x ( id, column1, column2 );

@newInput =
    SELECT *
    FROM ( VALUES
        ( "SomeValue 2", 2 ),
        ( "SomeValue 3", 3 )
    ) AS x ( column1, column2 );

@output =
    SELECT id, column1, column2
    FROM @originalInput
    UNION ALL
    SELECT (int)(x.id + ROW_NUMBER() OVER()) AS id, column1, column2
    FROM @newInput
    CROSS JOIN ( SELECT MAX(id) AS id FROM @originalInput ) AS x;

OUTPUT @output
TO @outputFile
USING Outputters.Csv(outputHeader:true);
You will have to be careful if the original table is empty and add some additional conditions / null checks but I'll leave that up to you.
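As a rough illustration of the numbering logic, including the empty-table case, here is a Python sketch. The function name and the choice to start from 0 when the original table is empty are assumptions; pick whatever starting value fits your data.

```python
def append_with_ids(original, new_rows):
    """Return original plus new_rows, numbering new rows after the current max ID."""
    # An empty table has no MAX(id); falling back to 0 is an assumption of this sketch.
    max_id = max((row[0] for row in original), default=0)
    numbered = [
        (max_id + n, *row)               # n plays the role of ROW_NUMBER()
        for n, row in enumerate(new_rows, start=1)
    ]
    return original + numbered

original_input = [(10, "SomeValue 1", 1)]
new_input = [("SomeValue 2", 2), ("SomeValue 3", 3)]

for row in append_with_ids(original_input, new_input):
    print(row)
```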

Select one column (with multiple rows) 5 times from the same table with different dates in the where clause

The DB records all user activity daily. I am trying to compile a summary report that displays the total number of actions per day per user. The problem is that I want to stack the results next to each other. I have referred to the following Stack Overflow questions.
mysql Select one column twice from the same table with different dates in the where clause
Select two columns from same table with different WHERE conditions
but I still continue to get the "subquery returns more than one row error #1242". All help is appreciated. Thank you.
This is my query, just for 2 days to start with.
SELECT LOGGEDIN_USER AS EnquiryHero,
( SELECT COUNT(user_id) from applications
DATE_TIME like "2016-08-24%" group by user_id ) as Day1,
( SELECT COUNT(user_id)from applications
WHERE DATE_TIME like "2016-08-25%" group by user_id ) as Day2,
from applications WHERE DATE_TIME like "2016-08-24%" group by user_id;
--
SELECT user_id,
( SUM( IF( the_day ='2016-08-24', ct, 0 ))) AS 2016-08-24,
( SUM( IF( the_day ='2016-08-25', ct, 0 ))) AS 2016-08-25,
( SUM( IF( the_day ='2016-08-26', ct, 0 ))) AS 2016-08-26,
( SUM( IF( the_day ='2016-08-27', ct, 0 ))) AS 2016-08-27,
FROM ( select user_id, DATE(date_time) AS the_day, loggedin_user, COUNT(*) AS ct
FROM applications GROUP BY 1,2 ) AS x
GROUP BY user_id;
First focus on getting the data; then focus on "pivoting" the data.
SELECT user_id,
DATE(`date_time`) AS the_day,
COUNT(*) AS ct
FROM applications
GROUP BY 1, 2;
See if that gives you the data desired; then look at how to "pivot". See the extra tag I added.
Then pivot
SELECT user_id,
SUM(IF(the_day = '2016-08-24', ct, 0)) AS '2016-08-24',
SUM(IF(the_day = '2016-08-25', ct, 0)) AS '2016-08-25',
SUM(IF(the_day = '2016-08-26', ct, 0)) AS '2016-08-26',
SUM(IF(the_day = '2016-08-27', ct, 0)) AS '2016-08-27',
...
FROM (
the query above
) AS x
GROUP BY user_id;
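The same two steps, count first and pivot second, can be sketched in Python to check the logic. The sample rows below are invented for the illustration, and pivot_counts is a hypothetical name.

```python
from collections import Counter

def pivot_counts(rows, days):
    """rows: (user_id, 'YYYY-MM-DD hh:mm:ss') pairs; returns {user_id: [count per day]}."""
    # Step 1: the GROUP BY user_id, DATE(date_time) query.
    per_user_day = Counter((user_id, date_time[:10]) for user_id, date_time in rows)
    # Step 2: the pivot, one column per requested day, 0 where a user has no rows.
    users = sorted({user_id for user_id, _ in rows})
    return {u: [per_user_day[(u, d)] for d in days] for u in users}

# Invented sample data, not taken from the question.
applications = [
    (1, "2016-08-24 09:00:00"),
    (1, "2016-08-24 10:30:00"),
    (1, "2016-08-25 08:15:00"),
    (2, "2016-08-25 11:00:00"),
]
print(pivot_counts(applications, ["2016-08-24", "2016-08-25"]))
# {1: [2, 1], 2: [0, 1]}
```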

DB2 splitting comma separated String to use in a IN clause.. Update: WITH clause query inside IN clause

I have a table TableA with values in ColumnA as below:
ColumnA
__________________
a,b,c
d,e
I have table TableB with values as:
ColumnB ColumnC
____________________
a 1
b 2
c 3
d 4
e 5
x 9
I want to use above values in another query:
SELECT columnC FROM TableB where ColumnB in (select ColumnA from TableA)
Obviously above query won't work.
The output should be 1, 2, 3, 4, 5.
How to do this without function i.e. in a simple query?
Update:
Based on mustoccio's comment below, I made it work using the WITH clause:
With split_data as (select ColumnA as split_string, ',' as split from TableA),
rec
(
split_string, split, row_num, column_value, pos
)
as
(
select
split_string,
split,
1,
varchar(substr(split_string, 1, decode(instr(split_string, split, 1),0,length(split_string), instr(split_string, split, 1)-1)), 255),
instr(split_string, split, 1) + length(split)
from split_data
union
all
select
split_string,
split,
row_num+1,
varchar(substr(split_string, pos, decode(instr(split_string, split, pos),0, length(split_string)-pos+1, instr(split_string, split, pos)-pos)), 255),
instr(split_string, split, pos)+length(split)
from rec
where row_num < 300000
and pos > length(split)
)
select
column_value as data
from rec
order by row_num
However, when I try to use above query inside the IN clause of my query:
SELECT columnC FROM TableB where ColumnB in (/* WITH query here */)
I get error as:
Error: DB2 SQL Error: SQLCODE=-104, SQLSTATE=42601, SQLERRMC=as;in ( With split_data;JOIN, DRIVER=3.50.152
SQLState: 42601
ErrorCode: -104
Error: DB2 SQL Error: SQLCODE=-727, SQLSTATE=56098, SQLERRMC=2;-104;42601;as|in ( With split_data|JOIN, DRIVER=3.50.152
SQLState: 56098
ErrorCode: -727
Can't we use WITH clause query inside IN clause ?
If NO, what is the solution ?
Inner join must work here.. you can use this one
select columnC from tableB inner join tableA on tableB.columnB=tableA.columnA;
you can see the result here
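Whichever SQL form is used, the comparison has to be made against the individual tokens of ColumnA, since a stored value such as 'a,b,c' is never equal to 'a'. A minimal Python sketch of that split-then-match logic, using the sample data from the question (the variable names are illustrative):

```python
# Sample data from the question.
table_a = ["a,b,c", "d,e"]
table_b = [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5), ("x", 9)]

# Split every ColumnA value into its tokens first.
wanted = {token.strip() for value in table_a for token in value.split(",")}

# Then keep the ColumnC values whose ColumnB is one of those tokens.
result = [column_c for column_b, column_c in table_b if column_b in wanted]
print(result)  # [1, 2, 3, 4, 5]

# Comparing against the unsplit values matches nothing in this data set.
unsplit = [column_c for column_b, column_c in table_b if column_b in table_a]
print(unsplit)  # []
```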

Convert row to column using pivot

This is my table now.
Name PhoneType Number
a Cellular 303-333-3333
a WorkPHone 444-444-4444
b Workphone 222-222-2222
c Cellular 111-111-1111
c WorkPHone 333-333-3333
c HomePhone 888-888-8888
d Cellular 999-999-9999
d WorkPHone 777-777-7777
d HomePhone 111-222-3333
I want to convert to:
Name Cellular Workphone Homephone
a 303-333-3333 444-444-4444 222-222-2222
b 222-222-2222
c 111-111-1111 333-333-3333 888-888-8888
d 999-999-9999 777-777-7777 111-222-3333
I tried PIVOT but couldn't get it to work, because I can't use an aggregate function on the phone number since it is nvarchar. How can I convert it?
As an alternative to PIVOT, you can use a plain SELECT with conditional aggregation, as below.
with xx as(
select 'a' namee,'cellular' typee,'303-333-3333' numberr from dual union all
select 'a' namee,'WorkPHone' typee,'444-444-4444' numberr from dual union all
select 'b' namee,'WorkPHone' typee,'222-222-2222' numberr from dual union all
select 'c' namee,'cellular' typee,'111-111-1111' numberr from dual union all
select 'c' namee,'WorkPHone' typee,'333-333-3333' numberr from dual union all
select 'c' namee,'HomePhone' typee,'888-888-8888' numberr from dual
)
select
x.namee,
max(case when x.typee='cellular' then x.numberr else '' end) as CELLULAR,
max(case when x.typee='WorkPHone' then x.numberr else '' end) as WORKPHONE,
max(case when x.typee='HomePhone' then x.numberr else '' end) as HOMEPHONE
from xx x
group by x.namee
order by x.namee;
EDIT:
The xx table should be replaced with your table name. Assuming your table is named my_table, the final query would be:
select x.Name ,
max(case when x.PhoneType='cellular' then x.Number else '' end) as CELLULAR,
max(case when x.PhoneType='WorkPHone' then x.Number else '' end) as WORKPHONE,
max(case when x.PhoneType='HomePhone' then x.Number else '' end) as HOMEPHONE
from
my_table x
group by x.Name
order by x.Name;
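Here is a hedged Python sketch of what the MAX(CASE ...) columns compute, using a few rows from the question. The case-insensitive comparison is an assumption added because the sample data mixes 'WorkPHone' and 'Workphone'; whether the SQL comparison is case-sensitive depends on your database and collation. The name pivot_phones is made up.

```python
def pivot_phones(rows, phone_types=("Cellular", "WorkPhone", "HomePhone")):
    """rows: (name, phone_type, number); returns {name: {type: number or ''}}."""
    result = {}
    for name, phone_type, number in rows:
        slots = result.setdefault(name, {t: "" for t in phone_types})
        for t in phone_types:
            # Assumption: match phone types case-insensitively.
            if phone_type.lower() == t.lower():
                slots[t] = max(slots[t], number)   # MAX() keeps the largest string
    return result

phones = [
    ("a", "Cellular", "303-333-3333"),
    ("a", "WorkPHone", "444-444-4444"),
    ("b", "Workphone", "222-222-2222"),
]
for name, columns in pivot_phones(phones).items():
    print(name, columns)
```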

How to retrieve two columns data in A,B format in Oracle

I have two columns in oracle database
+---------+---------+
| Column1 | Column2 |
+---------+---------+
| A | 1 |
| A | 2 |
+---------+---------+
I want to retrieve the data so that I get the following result:
+---------+---------+
| Column1 | Column2 |
+---------+---------+
| A | 1,2 |
+---------+---------+
Please provide me the solution.
Tim Hall has a pretty canonical list of string aggregation techniques in Oracle.
Which technique you use depends on a number of factors including the version of Oracle and whether you are looking for a purely SQL solution. If you are using Oracle 11.2, I'd probably suggest using LISTAGG
SELECT column1, listagg( column2, ',' ) WITHIN GROUP( order by column2 )
FROM table_name
GROUP BY column1
If you are using an earlier version of Oracle, assuming you don't need a purely SQL solution, I would generally prefer using the user-defined aggregate function approach.
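For anyone unsure what the aggregation produces, this is a small Python sketch of the same grouping, using the two rows from the question. The function name listagg is borrowed only for readability; it is not an Oracle API call.

```python
def listagg(rows, sep=","):
    """rows: (column1, column2) pairs; returns {column1: 'v1,v2,...'} ordered by column2."""
    grouped = {}
    for column1, column2 in rows:
        grouped.setdefault(column1, []).append(column2)
    # sorted() plays the role of WITHIN GROUP (ORDER BY column2).
    return {key: sep.join(str(v) for v in sorted(values))
            for key, values in grouped.items()}

print(listagg([("A", 1), ("A", 2)]))  # {'A': '1,2'}
```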
All the answers above are correct; I want to add one case that solves a small problem. In my case the type of my_column1 was nvarchar2 but the text was numeric, and the code below did not work and displayed only whitespace:
select group_id, listagg( t.my_column1 || '-' || to_char(t.doc_date,'dd.mm.yyyy') || ' ') within group(order by doc_date)
from my_table t
group by group_id
When I wrote it like this, it worked:
select group_id, listagg( to_char(t.my_column1) || '-' || to_char(t.doc_date,'dd.mm.yyyy') || ' ') within group(order by doc_date)
from my_table t
group by group_id
I hope my feedback saves someone some time.
If you have got 10g, then you have to go through the function below:
CREATE OR REPLACE FUNCTION get_comma_separated_value (input_val in number)
RETURN VARCHAR2
IS
return_text VARCHAR2(10000) := NULL;
BEGIN
FOR x IN (SELECT col2 FROM table_name WHERE col1 = input_val) LOOP
return_text := return_text || ',' || x.col2 ;
END LOOP;
RETURN LTRIM(return_text, ',');
END;
/
Then you can call it like this:
select col1, get_comma_separated_value(col1) from table_name
Fiddle here
If you have got oracle 11g, you can use listagg :
SELECT
col1,
LISTAGG(col2, ', ') WITHIN GROUP (ORDER BY col2) "names"
FROM table_x
GROUP BY col1
Fiddle here for Listagg
For MySQL, it's simple:
SELECT col1, GROUP_CONCAT(col2) FROM table_name GROUP BY col1
On my Oracle version 10, this does the job:
SELECT column1, wm_concat( column2)
FROM table_name
GROUP BY column1

Resources