Get the 3rd part of the string in Postgres - string

I have data like,
ab-volt-ssn-dev
ab-volt-lnid-dev
ab-volt-ssn-hamp-dev
ab-volt-cf-apnt-test
I need output to be like,
ssn
lnid
ssn
cf

You can use split_part()
select split_part('ab-volt-ssn-hamp-dev', '-', 3);
If you need to access multiple parts, then converting it to an array might be easier:
select elements[1],
elements[2],
elements[3],
elements[4]
from (
select string_to_array(the_column, '-') as elements
from the_table
) t;
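For anyone who wants to check the logic outside the database, here is a minimal Python sketch of what split_part() does with the sample values. The Python helper below only borrows the name split_part for illustration (it is not a built-in), and returning an empty string for a missing field is meant to mirror how Postgres behaves.

```python
def split_part(text: str, delimiter: str, n: int) -> str:
    """Return the n-th (1-based) field of text, or '' if there is no such field."""
    fields = text.split(delimiter)
    return fields[n - 1] if 1 <= n <= len(fields) else ""


samples = [
    "ab-volt-ssn-dev",
    "ab-volt-lnid-dev",
    "ab-volt-ssn-hamp-dev",
    "ab-volt-cf-apnt-test",
]

# Third hyphen-separated field of every sample row.
third_parts = [split_part(s, "-", 3) for s in samples]
print(third_parts)
```

The number of fields differs between rows, but the third one is always in the same position, which is why a fixed index is enough here.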

Related

REPLACE doesn't work for strings in BigQuery

I tried to use REPLACE to delete some words, like 'HOMEMAKER' and 'HOUSEWIFE', from strings, but the words were not replaced. Why is that happening?
I want to delete strings like 'HOMEMAKER' and 'HOUSEWIFE' from the column EMPLOYER. REPLACE did not work, and neither did REGEXP_REPLACE. This is the table I have (sorry, I wanted to build a table here but somehow it doesn't work).
EMPLOYER
RETIRED/HOMEMAKER
HOMEMAKER/HOMEMAKER
SELF-EMPLOYED/HOMEMAKER
My code is listed below:
SELECT EMPLOYER,
CASE WHEN EMPLOYER LIKE '%HOUSE%WIFE%'
THEN REGEXP_REPLACE(EMPLOYER,r'HOUSE%WIFE',' ')
WHEN EMPLOYER LIKE '%HOME%MAKER%'
THEN REGEXP_REPLACE(EMPLOYER, r'HOME%MAKER', ' ')
ELSE '0'
END AS SIGN
FROM fec.work
WHERE EMPLOYER LIKE '%HOME%MAKER%'
OR EMPLOYER LIKE '%HOUSE%WIFE%'
GROUP BY 1,2;
The result I want is like:
SIGN
RETIRED/
/
SELF-EMPLOYED/
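But the values in SIGN are exactly the same as in EMPLOYER. Can anyone tell me why the replace function did not make any changes?
The % wildcard only has a special meaning in LIKE. In a regular expression it is a literal percent sign, so r'HOUSE%WIFE' never matches 'HOUSEWIFE' and nothing is replaced. You should use something like below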
REGEXP_REPLACE(EMPLOYER, r'HOUSEWIFE|HOMEMAKER',' ')
You can use the i flag to make this replacement case insensitive
REGEXP_REPLACE(EMPLOYER, r'(?i)HOUSEWIFE|HOMEMAKER',' ')
Either of these options should work for your case.
with test as (
select 'EMPLOYER' as my_str union all
select 'RETIRED/HOMEMAKER' as my_str union all
select 'HOMEMAKER/HOMEMAKER' as my_str union all
select 'SELF-EMPLOYED/HOMEMAKER' as my_str
)
select
my_str,
REPLACE(REPLACE(my_str, 'HOUSEWIFE', ' '), 'HOMEMAKER', ' ') as replaced_str,
REGEXP_REPLACE(my_str, r'HOUSEWIFE|HOMEMAKER', ' ') as regexed_str
from test
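The difference between the LIKE-style pattern from the question and the alternation pattern can be reproduced with a small Python sketch. Python's re module is not the same engine as BigQuery's, so treat this only as an illustration of the two constructs involved (a literal % and the | alternation). The replacement here is an empty string rather than a space, which is an assumption based on the output requested in the question.

```python
import re

employers = ["RETIRED/HOMEMAKER", "HOMEMAKER/HOMEMAKER", "SELF-EMPLOYED/HOMEMAKER"]

# '%' is a wildcard only in LIKE; inside a regex it matches a literal percent
# sign, so this pattern finds nothing and the strings come back unchanged.
like_style = [re.sub(r"HOME%MAKER", "", e) for e in employers]

# Alternation matches either word, and every occurrence is replaced.
alternation = [re.sub(r"HOUSEWIFE|HOMEMAKER", "", e) for e in employers]

print(like_style)
print(alternation)
```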

Oracle PLSQL : How to remove duplicate data in string

Step 01 : I have a column A in table tab_T contains that strings :
SELECT A FROM tab_T;
((<123>+<123>+<123>)(*<213>+<213>+<213>+<354>+<354>+<354>+1)(*<985>))(+<654>+<654>+1)
(<599>*<592>*<591>)
(<10945>)
(<736>+<736>+1)
(<216>*<518>)
(<598>*<593>)(*<594>+<594>+<594>+<597>+<595>+<595>+<595>)
...
...
I want to get :
((<123>)(*<213>+<354>+1)(*<985>))(+<654>+1)
(<599>*<591>)
(<10945>)
(<736>)
(<216>*<518>)
(<598>*<593>)(*<594>+<597>+<595>)
...
...
Step 02 : Then I will replace '+' by 'AND' and '*' by 'OR' and delete the number '1' from my string.
This is my query (it works well, and I am sharing it in case it helps someone):
SELECT RTRIM(RTRIM(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(A,'+','AND'),'*','OR'),'(OR','OR('),'(AND','AND('),'(1)','')
,'OR1',''),'AND1',''),'1OR',''),'1AND',''),'ANDAND','AND'),'OROR','OR'),'AND'),'OR') AS logic
FROM tab_T
Result :
((<123>AND<123>AND<123>)OR(<213>AND<213>AND<213>AND<354>AND<354>AND<354>)OR(<985>))OR(<654>AND<654>)
(<599>OR<592>OR<591>)
(<10945>)
(<736>AND<736>)
(<216>OR<518>)
(<598>OR<593>)OR(<594>AND<594>AND<594>AND<597>AND<595>AND<595>AND<595>)
...
...
so when i apply step 01 and step 2 i will have this result
((<123>)OR(<213>AND<354>)OR(<985>))AND(<654>)
(<599>OR<591>)
(<10945>)
(<736>)
(<216>OR<518>)
(<598>OR<593>)OR(<594>AND<597>AND<595>)
...
...
I need help or an idea for step 01, please.
Thx
This will preserve the plus signs in-between the bracketed numbers:
select A original, regexp_replace(A, '(<\d+>)(\+?\1){1,}', '\1') fixed
from tab_T;
The regex can be read as: remember a group of one or more digits inside angle brackets, then match one or more repetitions of that SAME remembered group, each preceded by an optional plus sign. Whenever such a run is found, replace it with the first remembered group.
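To see the backreference at work without a database, here is a rough Python equivalent of the same pattern. The function name collapse_duplicates is made up for this sketch, and Python's regex engine is not Oracle's, although the constructs used here (a capture group, an optional plus, and a backreference) are available in both.

```python
import re

# One bracketed number, followed by one or more repeats of that same number,
# each repeat optionally preceded by '+'.
DUPLICATE_RUN = re.compile(r"(<\d+>)(?:\+?\1)+")


def collapse_duplicates(expr: str) -> str:
    """Replace every run of identical bracketed numbers with a single copy."""
    return DUPLICATE_RUN.sub(r"\1", expr)


print(collapse_duplicates("(<598>*<593>)(*<594>+<594>+<594>+<597>+<595>+<595>+<595>)"))
```

Because the remembered group includes the angle brackets, <12> is never mistaken for a repeat of <123>. Note that '+1' is left alone by this step, so it still has to be removed separately.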
EDIT: For the sake of completeness, here's the whole thing done with successive CTEs breaking the replaces into logical groupings. This way it's a complete answer, and I believe it reduces the number of REPLACE() calls. You could do it as a bunch of nested REPLACEs, but I think this is arguably cleaner and easier to understand and maintain down the road.
with tab_T(A) as (
select '((<123>+<123>+<123>)(*<213>+<213>+<213>+<354>+<354>+<354>+1)(*<985>))(+<654>+<654>+1)' from dual union all
select '(<599>*<592>*<591>)' from dual union all
select '(<10945>)' from dual union all
select '(<736>+<736>+1)' from dual union all
select '(<216>*<518>)' from dual union all
select '(<598>*<593>)(*<594>+<594>+<594>+<597>+<595>+<595>+<595>)' from dual
),
-- Remove dups and '+1'
pass_1(original, fixed) as (
select A original, replace(regexp_replace(A, '(<\d+>)(\+?\1){1,}', '\1'), '+1') fixed
from tab_T
),
replace_ors(original, fixed) as (
select original, replace(replace(fixed, '(*', 'OR('), '*', 'OR')
from pass_1
),
replace_ands(original, fixed) as (
select original, replace(replace(fixed, '(+', 'AND('), '+', 'AND')
from replace_ors
)
select original, fixed
from replace_ands
;
I know this is not a full answer to your question, but maybe it can help you:
with t as (select '((<123>+<123>+<123>)(*<213>+<213>+<213>+<354>+<354>+<354>+1)(*<985>))(+<654>+<654>+1)' as exp from dual)
, t1 as ( select distinct regexp_substr(exp, '[^+]+', 1, level) names
from t
connect by level <= length(regexp_replace(exp, '[^*+]'))+1
)
SELECT
RTrim(listagg(t1.names,'+') WITHIN GROUP (order by names desc)) string
from t1
I found it :)
select REGEXP_REPLACE
(A,
'(<[^>]+>)(\+|\*?\1)*',
'\1') as logic
FROM tab_T
Thank you anyway ;)

Find Specific number from list

I have millions of records like this, but I'm sharing only a few here.
What I need is to take just 8 characters from these records. Many contain (.) and some contain (/), so those should be removed. Please see the sample output.
Records in Table
GBR.FCL.AT.245448C.A
GBR.FCL.AT.225405L.A
at286623da
EASA UK/AT/311969F/A
AT/332092H/A
AT238691G/A
Output should be like this
245448CA
225405LA
286623da
311969FA
332092HA
Assuming we can rely on the sample as complete and representative (not always a safe assumption in SO) the desired output is the last eight characters of the string, ignoring . and /.
So the simplest thing that could possibly work would be to strip out the unwanted characters using translate() then return the last eight characters:
select substr(translate(str, 'a./', 'a'), -8) as extracted_str
from your_table
A slightly more engineered solution would apply regex to find a string of the format 999999AA (six digits followed by two letters):
select regexp_replace(translate(str, 'a./', 'a'),
'^(.*)([[:digit:]]{6}[[:alpha:]]{2})(.*)$', '\2'
) as extracted_str
from your_table
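The "strip, then take the last eight characters" idea is easy to sanity-check against the sample rows with a few lines of Python. This is only a sketch of the logic, not a replacement for the SQL; the function name last_eight is invented for the example.

```python
# Translation table that deletes '.' and '/' (nothing is mapped, two chars removed).
STRIP_SEPARATORS = str.maketrans("", "", "./")


def last_eight(record: str) -> str:
    """Drop the separators, then keep the final eight characters."""
    return record.translate(STRIP_SEPARATORS)[-8:]


records = [
    "GBR.FCL.AT.245448C.A",
    "GBR.FCL.AT.225405L.A",
    "at286623da",
    "EASA UK/AT/311969F/A",
    "AT/332092H/A",
    "AT238691G/A",
]

extracted = [last_eight(r) for r in records]
print(extracted)
```

This relies entirely on the wanted value sitting at the end of every record, which is the same assumption the SQL version makes.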
Assuming that you need to get 8 characters, excluding / and ., starting from the string AT ( no matter the case) and that there is exactly one occurrence of AT (in any case combination) in the input string, this should be what you need:
with input(x) as (
select 'GBR.FCL.AT.245448C.A' from dual union all
select 'GBR.FCL.AT.225405L.A' from dual union all
select 'at286623da' from dual union all
select 'EASA UK/AT/311969F/A' from dual union all
select 'AT/332092H/A' from dual union all
select 'AT238691G/A' from dual
)
select x as yourString,
substr(translate(x, 'x/.', 'x'), instr(translate(upper(x), 'x/.', 'x'), 'AT')+2, 8) as result
from input
Which gives:
YOURSTRING           RESULT
-------------------- --------------------------------
GBR.FCL.AT.245448C.A 245448CA
GBR.FCL.AT.225405L.A 225405LA
at286623da           286623da
EASA UK/AT/311969F/A 311969FA
AT/332092H/A         332092HA
AT238691G/A          238691GA

Using REGEXP_SUBSTR to get key-value pair data

I have a column with below values,
User_Id=446^User_Input=L307-60#/25" AP^^
I am trying to get each individual value based on a specified key.
All value after User_Id= until it encounters ^
All value after User_Input= until it encounters ^
So far I have this:
SELECT LTRIM(REGEXP_SUBSTR('User_Id=446^User_Input=L307-60#/25" AP^'
,'[0-9]+',1,1),'^') User_Id
from dual
How do I get the value for the User_Input??
P.S: User input can have anything, like ',", *,% including a ^ in the middle of the string (that is, not as a delimiter).
Any help would be greatly appreciated..
This can be easily solved using boring old INSTR to calculate the offsets of the start and end points for the KEY and VALUE strings.
The trick is to use the optional occurrence parameter to identify the correct instance of =. Because the input can contain carets which aren't intended as delimiters we need to use a negative position to identify the last ^.
with cte as (
select kv
, instr(kv, '=', 1, 1)+1 as k_st -- first occurrence
, instr(kv, '^', 1) as k_end
, instr(kv, '=', 1, 2)+1 as v_st -- second occurrence
, instr(kv, '^', -1) as v_end -- counting from back
from t23
)
select substr(kv, k_st, k_end - k_st) as user_id
, substr(kv, v_st, v_end - v_st) as user_input
from cte
/
Here is the requisite SQL Fiddle to prove it works. I think it's much easier to understand than any regex equivalent.
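The same offset arithmetic can be followed step by step in Python, where index() plays the role of a forward INSTR and rindex() the role of INSTR with a negative position. The function name parse_pair and the second test string are made up for illustration.

```python
from typing import Tuple


def parse_pair(kv: str) -> Tuple[str, str]:
    """Return (user_id, user_input) using the same offsets as the INSTR query."""
    k_st = kv.index("=") + 1        # just past the first '='
    k_end = kv.index("^")           # first '^' ends the User_Id value
    v_st = kv.index("=", k_st) + 1  # just past the second '='
    v_end = kv.rindex("^")          # last '^' ends the User_Input value
    return kv[k_st:k_end], kv[v_st:v_end]


print(parse_pair('User_Id=446^User_Input=L307-60#/25" AP^'))
print(parse_pair("User_Id=7^User_Input=a^b^"))  # caret inside the value is kept
```

One thing this makes visible: if the stored string ends in two carets, as in the column value quoted in the question, only the very last one is treated as the delimiter, so the extracted value keeps one trailing ^.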
If there is no particular need to use Regex, something like this returns the value.
WITH rslt AS (
SELECT 'User_Id=446^User_Input=L307-60#/25" AP^' val
FROM dual
)
SELECT LTRIM(SUBSTR(val
,INSTR(val, '=', 1, 2) + 1
,INSTR(val, '^', 1, 2) - (INSTR(val, '=', 1, 2) + 1)))
FROM rslt;
Of course, if you can't guarantee that there will not be any carets that are valid text characters, this will possibly return partial results.
Assuming that you will always have 'User_Id=' and 'User_Input=' in your string, I would use a character group approach to parsing
Use the starting anchor,^, and ending anchor, $. Look for 'User_Id=' and 'User_Input='
Associate the value you are searching for with a character group.
SCOTT#dev>
1 SELECT REGEXP_SUBSTR('User_Id=446^User_Input=L307-60#/25" AP^','^User_Id=(.*\^)User_Input=(.*\^)$',1, 1, NULL, 1) User_Id
2* FROM dual
SCOTT#dev> /
USER
====
446^
SCOTT#dev>
1 SELECT REGEXP_SUBSTR('User_Id=446^User_Input=L307-60#/25" AP^','^User_Id=(.*\^)User_Input=(.*\^)$',1, 1, NULL, 2) User_Input
2* FROM dual
SCOTT#dev> /
USER_INPUT
================
L307-60#/25" AP^
SCOTT#dev>
Got this answer from a friend of mine.. Looks simple and works great...
SELECT
regexp_replace('User_Id=446^User_Input=L307-60#/25" AP^^', '.*User_Id=([^\^]+).*', '\1') User_Id,
regexp_replace('User_Id=446^User_Input=L307-60#/25" AP^^', '.*User_Input=(.*)[\^]$', '\1') User_Input
FROM dual
Posting here just in case any of you find it interesting..

MYSQL: Using GROUP BY with string literals

I have the following table with these columns:
shortName, fullName, ChangelistCount
Is there a way to group them by a string literal within their fullName? The fullname represents file directories, so I would like to display results for certain parent folders instead of the individual files.
I tried something along the lines of:
GROUP BY fullName like "%/testFolder/%" AND fullName like "%/testFolder2/%"
However it only really groups by the first match....
Thanks!
Perhaps you want something like:
GROUP BY IF(fullName LIKE '%/testfolder/%', 1, IF(fullName LIKE '%/testfolder2/%', 2, 3))
The key idea to understand is that an expression like fullName LIKE foo AND fullName LIKE bar will necessarily evaluate, as a whole, to either TRUE or FALSE, so you can only get two total groups out of that.
Using an IF expression to return one of several different values will let you get more groups.
Keep in mind that this will not be particularly fast. If you have a very large dataset, you should explore other ways of storing the data that will not require LIKE comparisons to do the grouping.
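To make the "one bucket value per folder" idea concrete, here is a small Python sketch of what the nested IF computes and how the grouping then falls out. The sample rows are invented, and lower-casing the path is an assumption that imitates LIKE under a case-insensitive collation.

```python
from collections import Counter


def bucket(full_name: str) -> int:
    """Mirror IF(... LIKE '%/testfolder/%', 1, IF(... LIKE '%/testfolder2/%', 2, 3))."""
    lowered = full_name.lower()  # assumption: case-insensitive matching
    if "/testfolder/" in lowered:
        return 1
    if "/testfolder2/" in lowered:
        return 2
    return 3


# Hypothetical (fullName, ChangelistCount) rows.
rows = [
    ("/src/testFolder/a.c", 4),
    ("/src/testFolder/b.c", 1),
    ("/src/testFolder2/c.c", 2),
    ("/docs/readme.txt", 7),
]

# Sum ChangelistCount per bucket, like SUM(...) with the GROUP BY above.
totals = Counter()
for full_name, changelist_count in rows:
    totals[bucket(full_name)] += changelist_count

print(dict(totals))
```

A boolean expression could only ever produce two keys here, which is why the buckets are numbered instead.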
You'd have to use a subquery to derive the column values you'd like to ultimately group on:
FROM (SELECT SUBSTR(fullname, ?) AS derived_column
FROM YOUR_TABLE ) x
GROUP BY x.derived_column
Either use WHEN/THEN conditions, or have another temporary table containing all the matches you wish to find and group. Here is a sample from my database.
Here I wanted to group all users based on their cities which was inside address field.
SELECT ut.* , c.city, ua.*
FROM `user_tracking` AS ut
LEFT JOIN cities AS c ON ut.place_name LIKE CONCAT( "%", c.city, "%" )
LEFT JOIN users_auth AS ua ON ua.id = ut.user_id
