Oracle PLSQL : How to remove duplicate data in string

Oracle PLSQL : How to remove duplicate data in string - string

Step 01 : I have a column A in table tab_T contains that strings :
SELECT A FROM tab_T;
((<123>+<123>+<123>)(*<213>+<213>+<213>+<354>+<354>+<354>+1)(*<985>))(+<654>+<654>+1)
(<599>*<592>*<591>)
(<10945>)
(<736>+<736>+1)
(<216>*<518>)
(<598>*<593>)(*<594>+<594>+<594>+<597>+<595>+<595>+<595>)
...
...
I want to get :
((<123>)(*<213>+<354>+1)(*<985>))(+<654>+1)
(<599>*<591>)
(<10945>)
(<736>)
(<216>*<518>)
(<598>*<593>)(*<594>+<597>+<595>)
...
...
Step 02 : Then i will replace '+' by 'AND' and '*' by 'OR' and delete the number '1' from my string
this is my query (it works good and i share it with you if you need a help)
SELECT RTRIM(RTRIM(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(A,'+','AND'),'*','OR'),'(OR','OR('),'(AND','AND('),'(1)','')
,'OR1',''),'AND1',''),'1OR',''),'1AND',''),'ANDAND','AND'),'OROR','OR'),'AND'),'OR') AS logic
FROM tab_T
Result :
((<123>AND<123>AND<123>)OR(<213>AND<213>AND<213>AND<354>AND<354>AND<354>)OR(<985>))OR(<654>AND<654>)
(<599>OR<592>OR<591>)
(<10945>)
(<736>AND<736>)
(<216>OR<518>)
(<598>OR<593>)OR(<594>AND<594>AND<594>AND<597>AND<595>AND<595>AND<595>)
...
...
so when i apply step 01 and step 2 i will have this result
((<123>)OR(<213>AND<354>)OR(<985>))AND(<654>)
(<599>OR<591>)
(<10945>)
(<736>)
(<216>OR<518>)
(<598>OR<593>)OR(<594>AND<597>AND<595>)
...
...
I need a help or an idea for the step 01 please?
Thx

This will preserve the plus signs in-between the bracketed numbers:
select A original, regexp_replace(A, '(<\d+>)(\+?\1){1,}', '\1') fixed
from tab_T;
The regex can be read as: Remember a group of one or more digits inside of brackets when followed by a group of one or more of the SAME group of remembered numbers preceded by an optional plus sign. When this group is encountered, replace it with the first remembered group.
EDIT: For the sake of completeness, here's the whole thing done with successive CTE's breaking the replaces into logical groupings. This way it's a complete answer and I believe reduced the number of REPLACE() calls. You could do it as a bunch of nested REPLACE's, but I think this is arguably cleaner and easier to understand and maintain down the road.
with tab_T(A) as (
select '((<123>+<123>+<123>)(*<213>+<213>+<213>+<354>+<354>+<354>+1)(*<985>))(+<654>+<654>+1)' from dual union all
select '(<599>*<592>*<591>)' from dual union all
select '(<10945>)' from dual union all
select '(<736>+<736>+1)' from dual union all
select '(<216>*<518>)' from dual union all
select '(<598>*<593>)(*<594>+<594>+<594>+<597>+<595>+<595>+<595>)' from dual
),
-- Remove dups and '+1'
pass_1(original, fixed) as (
select A original, replace(regexp_replace(A, '(<\d+>)(\+?\1){1,}', '\1'), '+1') fixed
from tab_T
),
replace_ors(original, fixed) as (
select original, replace(replace(fixed, '(*', 'OR('), '*', 'OR')
from pass_1
),
replace_ands(original, fixed) as (
select original, replace(replace(fixed, '(+', 'AND('), '+', 'AND')
from replace_ors
)
select original, fixed
from replace_ands
;

I know this is not full answer for your question. But maybe it can help you:
with t as (select '((<123>+<123>+<123>)(*<213>+<213>+<213>+<354>+<354>+<354>+1)(*<985>))(+<654>+<654>+1)' as exp from dual)
, t1 as ( select distinct regexp_substr(exp, '[^+]+', 1, level) names
from t
connect by level <= length(regexp_replace(exp, '[^*+]'))+1
)
SELECT
RTrim(listagg(t1.names,'+') WITHIN GROUP (order by names desc)) string
from t1

I found it :)
select REGEXP_REPLACE
(A,
'(<[^>]+>)(\+|\*?\1)*',
'\1') as logic
FROM tab_T
Thank you anyway ;)

Related

Get the 3rd part of the string in Postgres

I have data like,
ab-volt-ssn-dev
ab-volt-lnid-dev
ab-volt-ssn-hamp-dev
ab-volt-cf-apnt-test
I need output to be like,
ssn
lnid
ssn
cf

You can use split_part()
select split_part('ab-volt-ssn-hamp-dev', '-', 3);
If you need to access multiple parts, then converting it to an array might be easier:
select elements[1],
elements[2],
elements[3],
elements[4]
from (
select string_to_array(the_column, '-') as elements
from the_table
) t;

Find Specific number from list

I have millions records like this but im sharing here few records
what i need is just take 8 charchers from this recodrs so many have (.) and some have (/) so remove (.) abd (/) please see the sample output
Records in Table
GBR.FCL.AT.245448C.A
GBR.FCL.AT.225405L.A
at286623da
EASA UK/AT/311969F/A
AT/332092H/A
AT238691G/A
Output should be like this
245448CA
225405LA
286623da
311969FA
332092HA

Assuming we can rely on the sample as complete and representative (not always a safe assumption in SO) the desired output is the last eight characters of the string, ignoring . and \.
So the simplest thing that could possibly work would be to strip out the unwanted characters using translate() then return the last eight characters:
select substr(translate(str, 'a.\', 'a'), -8) as extracted_str
from your_table
A slightly more engineered solution would apply regex to fine a string of the format 999999AA:
select regexp_replace(translate(str, 'a.\', 'a'),
'^(.*)([[:digit:]]{6}[[:alpha:]]{2})(.*)$', '\2'
) as extracted_str
from your_table

Assuming that you need to get 8 characters, excluding / and ., starting from the string AT ( no matter the case) and that there is exactly one occurrence of AT (in any case combination) in the input string, this should be what you need:
with input(x) as (
select 'GBR.FCL.AT.245448C.A' from dual union all
select 'GBR.FCL.AT.225405L.A' from dual union all
select 'at286623da' from dual union all
select 'EASA UK/AT/311969F/A' from dual union all
select 'AT/332092H/A' from dual union all
select 'AT238691G/A' from dual
)
select x as yourString,
substr(translate(x, 'x/.', 'x'), instr(translate(upper(x), '/.x', 'x'), 'AT')+2, 8) as result
from input
Which gives:
YOURSTRING RESULT
-------------------- --------------------------------
GBR.FCL.AT.245448C.A 245448CA
GBR.FCL.AT.225405L.A 225405LA
at286623da 286623da
EASA UK/AT/311969F/A 11969FA
AT/332092H/A 332092HA
AT238691G/A 238691GA

Very strange column string "single-quote-comment-brace-time-char"-bug substitutions in postgres selects - maybe related to JDBC only

If you ever had something similar and ripped your hair out to find the problem - in short:
--'
select '{t'::text
text
-----
TIME
???

What a nasty bug and interesting how this could lead to really messed up data!
It seems to be known since 9.1, may be related to the JDBC driver only and happened with our 9.3 as well!:
Here are some more details I found out (for helping to find and fix this ad hoc in your code or eventually by the "source hackers" ;) ) so far with example code below:
a single quote must appear somewhere in a single-line or multi-line comment above the select (e.g. --', /*'*/, -- foo's cool)
explicitely given strings must contain {t or {d to be substituted by TIME or DATE respectively
one of the strings must be explicitely (maybe also implicitely) of type text
a closing brace in the same or another string somewhere is necessary to continue this substition (e.g. select 'foo } bar')
a closing brace in a comment disables the behaviour again (e.g. --})
.
--'
select '{t'::text
union all select '{ta}'
union all select '{tfoo bar'
-- these are untouched
union all select '{ t}'
union all select 'foo { t}'
-- there seems to be an opening/closing "{" "}" match behaviour behind
-- it since the 2nd row below
union all select '{t'
union all select '{ta}'
union all select '{tfoo bar'
-- also "d" seems to be a "trigger"
union all select '}{d}'
-- a closing brace in a comment seems to disable it completely again
union all select '{d'
union all select '{d'
-- }
union all select 'a}{d}'
text
------------
TIME
{ta
TIME foo bar
{ t
foo { t}
TIME
{ta
TIME foo bar
DATE
DATE
{d
a}{d}

compare string sets in access

I have a problem where I think you can also use when needed. I wish to compare word sets on separate tables identified by their Item No and Order (ordinality) and the word or value.
Here is a snapshot of the table:
Then the result I wish to accomplish was like this:
The comparison is LIKE rather than equality.

You can do a join and have like in predicate:
select * from table1 t1
inner join table2 t2 on t1.ValueA like '%' + t2.ValueB + '%' or
t2.ValueB like '%' + t1.ValueA + '%'
You may need use * instead of % and & instead of +. I don't remember the correct syntax for Access, but it is along the lines.

Using REGEXP_SUBSTR to get key-value pair data

I have a column with below values,
User_Id=446^User_Input=L307-60#/25" AP^^
I am trying to get each individual value based on a specified key.
All value after User_Id= until it encounters ^
All value after User_Input= until it encounters ^
I tried for and so far I have this,
SELECT LTRIM(REGEXP_SUBSTR('User_Id=446^User_Input=L307-60#/25" AP^'
,'[0-9]+',1,1),'^') User_Id
from dual
How do I get the value for the User_Input??
P.S: User input can have anything, like ',", *,% including a ^ in the middle of the string (that is, not as a delimiter).
Any help would be greatly appreciated..

This can be easily solved using boring old INSTR to calculate the offsets of the start and end points for the KEY and VALUE strings.
The trick is to use the optional occurrence parameter to identify each the correct instance of =. Because the input can contain carets which aren't intended as delimiters we need to use a negative position to identify the last ^.
with cte as (
select kv
, instr(kv, '=', 1, 1)+1 as k_st -- first occurrence
, instr(kv, '^', 1) as k_end
, instr(kv, '=', 1, 2)+1 as v_st -- second occurrence
, instr(kv, '^', -1) as v_end -- counting from back
from t23
)
select substr(kv, k_st, k_end - k_st) as user_id
, substr(kv, v_st, v_end - v_st) as user_input
from cte
/
Here is the requisite SQL Fiddle to prove it works. I think it's much easier to understand than any regex equivalent.

If there is no particular need to use Regex, something like this returns the value.
WITH rslt AS (
SELECT 'User_Id=446^User_Input=L307-60#/25" AP^' val
FROM dual
)
SELECT LTRIM(SUBSTR(val
,INSTR(val, '=', 1, 2) + 1
,INSTR(val, '^', 1, 2) - (INSTR(val, '=', 1, 2) + 1)))
FROM rslt;
Of course, if you can't guarantee that there will not be any carets that are valid text characters, this will possibly return partial results.

Assuming that you will always have 'User_Id=' and 'User_Input=' in your string, I would use a character group approach to parsing
Use the starting anchor,^, and ending anchor, $. Look for 'User_Id=' and 'User_Input='
Associate the value you are searching for with a character group.
SCOTT#dev>
1 SELECT REGEXP_SUBSTR('User_Id=446^User_Input=L307-60#/25" AP^','^User_Id=(.*\^)User_Input=(.*\^)$',1, 1, NULL, 1) User_Id
2* FROM dual
SCOTT#dev> /
USER
====
446^
SCOTT#dev>
1 SELECT REGEXP_SUBSTR('User_Id=446^User_Input=L307-60#/25" AP^','^User_Id=(.*\^)User_Input=(.*\^)$',1, 1, NULL, 2) User_Input
2* FROM dual
SCOTT#dev> /
USER_INPUT
================
L307-60#/25" AP^
SCOTT#dev>

Got this answer from a friend of mine.. Looks simple and works great...
SELECT
regexp_replace('User_Id=446^User_Input=L307-60#/25" AP^^', '.*User_Id=([^\^]+).*', '\1') User_Id,
regexp_replace('User_Id=446^User_Input=L307-60#/25" AP^^', '.*User_Input=(.*)[\^]$', '\1') User_Input
FROM dual
Posting here just in case any of you find it interesting..

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Oracle PLSQL : How to remove duplicate data in string - string

I found it :) select REGEXP_REPLACE (A, '(<[^>]+>)(\+|\?\1)', '\1') as logic FROM tab_T Thank you anyway ;)

Related

Get the 3rd part of the string in Postgres

Find Specific number from list

Very strange column string "single-quote-comment-brace-time-char"-bug substitutions in postgres selects - maybe related to JDBC only

compare string sets in access

Using REGEXP_SUBSTR to get key-value pair data

Categories

Resources

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Oracle PLSQL : How to remove duplicate data in string - string

I found it :) select REGEXP_REPLACE (A, '(<[^>]+>)(\+|\*?\1)*', '\1') as logic FROM tab_T Thank you anyway ;)

Related

Get the 3rd part of the string in Postgres

Find Specific number from list

Very strange column string "single-quote-comment-brace-time-char"-bug substitutions in postgres selects - maybe related to JDBC only

compare string sets in access

Using REGEXP_SUBSTR to get key-value pair data

Categories

Resources

I found it :) select REGEXP_REPLACE (A, '(<[^>]+>)(\+|\?\1)', '\1') as logic FROM tab_T Thank you anyway ;)