Compare string sets in Access

I have a problem whose solution I think others may also find useful. I want to compare word sets in separate tables, where each row is identified by its Item No, its Order (ordinality), and the word or value.
Here is a snapshot of the table:
The result I want to get looks like this:
The comparison is LIKE rather than equality.

You can do a join with LIKE in the join predicate:
select * from table1 t1
inner join table2 t2 on t1.ValueA like '%' + t2.ValueB + '%' or
t2.ValueB like '%' + t1.ValueA + '%'
In Access you may need to use * instead of % and & instead of +. I don't remember the exact Access syntax, but it is along these lines.
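As a rough, runnable illustration of the two-way LIKE join, here is a sketch using Python's sqlite3 module as a stand-in for Access (an assumption; SQLite concatenates with || and uses % as the wildcard). The tables and rows are hypothetical, since the original snapshot is not available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ItemNo INTEGER, ValueA TEXT);
    CREATE TABLE table2 (ItemNo INTEGER, ValueB TEXT);
    -- Hypothetical sample rows, not the asker's data.
    INSERT INTO table1 VALUES (1, 'steel bolt'), (2, 'nut');
    INSERT INTO table2 VALUES (1, 'bolt'), (2, 'hex nut'), (3, 'washer');
""")

# A pair matches when either value contains the other.
rows = conn.execute("""
    SELECT t1.ValueA, t2.ValueB
    FROM table1 t1
    INNER JOIN table2 t2
        ON t1.ValueA LIKE '%' || t2.ValueB || '%'
        OR t2.ValueB LIKE '%' || t1.ValueA || '%'
    ORDER BY t1.ValueA
""").fetchall()

for value_a, value_b in rows:
    print(value_a, "<->", value_b)
```

Checking both directions matters here: 'steel bolt' contains 'bolt', while 'nut' is contained in 'hex nut', and a one-way LIKE would miss one of the two pairs.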

Related

REPLACE doesn't work for strings in BigQuery

I tried to use REPLACE to delete some words, like 'HOMEMAKER' and 'HOUSEWIFE', from strings, but the words were not replaced. Why is that happening?
I want to delete strings like 'HOMEMAKER' and 'HOUSEWIFE' from the EMPLOYER column using the REPLACE function, but it failed. I also tried REGEXP_REPLACE, but that failed too. This is the table I have (sorry, I wanted to build a table here but somehow it doesn't work):
EMPLOYER
RETIRED/HOMEMAKER
HOMEMAKER/HOMEMAKER
SELF-EMPLOYED/HOMEMAKER
My code is listed below:
SELECT EMPLOYER,
CASE WHEN EMPLOYER LIKE '%HOUSE%WIFE%'
THEN REGEXP_REPLACE(EMPLOYER,r'HOUSE%WIFE',' ')
WHEN EMPLOYER LIKE '%HOME%MAKER%'
THEN REGEXP_REPLACE(EMPLOYER, r'HOME%MAKER', ' ')
ELSE '0'
END AS SIGN
FROM fec.work
WHERE EMPLOYER LIKE '%HOME%MAKER%'
OR EMPLOYER LIKE '%HOUSE%WIFE%'
GROUP BY 1,2;
The result I want is like:
SIGN
RETIRED/
/
SELF-EMPLOYED/
But I got exactly the same values in SIGN as in EMPLOYER. Can anyone tell me why the replace function did not make any changes?
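The % wildcard only has meaning in LIKE; in a regular expression it is matched literally, so r'HOME%MAKER' never matches 'HOMEMAKER'. You should use something like the following:
REGEXP_REPLACE(EMPLOYER, r'HOUSEWIFE|HOMEMAKER',' ')
You can use the i flag to make the replacement case-insensitive: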
REGEXP_REPLACE(EMPLOYER, r'(?i)HOUSEWIFE|HOMEMAKER',' ')
Either of these options should work for your case.
with test as (
select 'EMPLOYER' as my_str union all
select 'RETIRED/HOMEMAKER' as my_str union all
select 'HOMEMAKER/HOMEMAKER' as my_str union all
select 'SELF-EMPLOYED/HOMEMAKER' as my_str
)
select
my_str,
REPLACE(REPLACE(my_str, 'HOUSEWIFE', ' '), 'HOMEMAKER', ' ') as replaced_str,
REGEXP_REPLACE(my_str, r'HOUSEWIFE|HOMEMAKER', ' ') as regexed_str
from test
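To see why the original patterns changed nothing, here is a small sketch using Python's re module as a stand-in for BigQuery's regex engine (an assumption; both treat % as an ordinary character). It uses an empty replacement string, since the asker's expected output has nothing where the word was.

```python
import re

employers = ["RETIRED/HOMEMAKER", "HOMEMAKER/HOMEMAKER", "SELF-EMPLOYED/HOMEMAKER"]

# '%' is not a regex wildcard, so this only matches the literal text "HOME%MAKER".
unchanged = [re.sub(r"HOME%MAKER", "", e) for e in employers]

# Alternation matches either word; the empty replacement deletes it.
cleaned = [re.sub(r"HOUSEWIFE|HOMEMAKER", "", e) for e in employers]

print(unchanged)
print(cleaned)
```

The first list comes back identical to the input, which mirrors the SIGN column the asker saw; the second has the words removed.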

Oracle PLSQL : How to remove duplicate data in string

Step 01 : I have a column A in table tab_T that contains these strings:
SELECT A FROM tab_T;
((<123>+<123>+<123>)(*<213>+<213>+<213>+<354>+<354>+<354>+1)(*<985>))(+<654>+<654>+1)
(<599>*<592>*<591>)
(<10945>)
(<736>+<736>+1)
(<216>*<518>)
(<598>*<593>)(*<594>+<594>+<594>+<597>+<595>+<595>+<595>)
...
...
I want to get :
((<123>)(*<213>+<354>+1)(*<985>))(+<654>+1)
(<599>*<591>)
(<10945>)
(<736>)
(<216>*<518>)
(<598>*<593>)(*<594>+<597>+<595>)
...
...
Step 02 : Then I will replace '+' with 'AND' and '*' with 'OR', and delete the number '1' from my string.
This is my query (it works well, and I share it in case it helps someone):
SELECT RTRIM(RTRIM(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(A,'+','AND'),'*','OR'),'(OR','OR('),'(AND','AND('),'(1)','')
,'OR1',''),'AND1',''),'1OR',''),'1AND',''),'ANDAND','AND'),'OROR','OR'),'AND'),'OR') AS logic
FROM tab_T
Result :
((<123>AND<123>AND<123>)OR(<213>AND<213>AND<213>AND<354>AND<354>AND<354>)OR(<985>))AND(<654>AND<654>)
(<599>OR<592>OR<591>)
(<10945>)
(<736>AND<736>)
(<216>OR<518>)
(<598>OR<593>)OR(<594>AND<594>AND<594>AND<597>AND<595>AND<595>AND<595>)
...
...
So when I apply step 01 and step 02, I should get this result:
((<123>)OR(<213>AND<354>)OR(<985>))AND(<654>)
(<599>OR<591>)
(<10945>)
(<736>)
(<216>OR<518>)
(<598>OR<593>)OR(<594>AND<597>AND<595>)
...
...
I need help or an idea for step 01, please.
Thanks
This will preserve the plus signs in-between the bracketed numbers:
select A original, regexp_replace(A, '(<\d+>)(\+?\1){1,}', '\1') fixed
from tab_T;
The regex can be read as: Remember a group of one or more digits inside of brackets when followed by a group of one or more of the SAME group of remembered numbers preceded by an optional plus sign. When this group is encountered, replace it with the first remembered group.
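The same pattern can be tried outside the database. The sketch below uses Python's re module (an assumption; Oracle's regex engine is different, but backreferences behave the same way for this pattern) on a few of the sample strings. The function name is hypothetical.

```python
import re

# A bracketed number followed by one or more repeats of itself,
# each repeat optionally preceded by '+'.
DUPLICATES = re.compile(r"(<\d+>)(\+?\1)+")

def remove_duplicates(expr: str) -> str:
    # Replace each run of repeats with a single copy of the remembered group.
    return DUPLICATES.sub(r"\1", expr)

samples = [
    "((<123>+<123>+<123>)(*<213>+<213>+<213>+<354>+<354>+<354>+1)(*<985>))(+<654>+<654>+1)",
    "(<736>+<736>+1)",
    "(<216>*<518>)",
]
fixed = [remove_duplicates(s) for s in samples]
for before, after in zip(samples, fixed):
    print(before, "->", after)
```

Note that the trailing '+1' survives this step; it is removed separately in the later replaces.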
EDIT: For the sake of completeness, here's the whole thing done with successive CTEs that break the replaces into logical groupings. This makes it a complete answer and, I believe, reduces the number of REPLACE() calls. You could do it as a bunch of nested REPLACEs, but I think this is arguably cleaner and easier to understand and maintain down the road.
with tab_T(A) as (
select '((<123>+<123>+<123>)(*<213>+<213>+<213>+<354>+<354>+<354>+1)(*<985>))(+<654>+<654>+1)' from dual union all
select '(<599>*<592>*<591>)' from dual union all
select '(<10945>)' from dual union all
select '(<736>+<736>+1)' from dual union all
select '(<216>*<518>)' from dual union all
select '(<598>*<593>)(*<594>+<594>+<594>+<597>+<595>+<595>+<595>)' from dual
),
-- Remove dups and '+1'
pass_1(original, fixed) as (
select A original, replace(regexp_replace(A, '(<\d+>)(\+?\1){1,}', '\1'), '+1') fixed
from tab_T
),
replace_ors(original, fixed) as (
select original, replace(replace(fixed, '(*', 'OR('), '*', 'OR')
from pass_1
),
replace_ands(original, fixed) as (
select original, replace(replace(fixed, '(+', 'AND('), '+', 'AND')
from replace_ors
)
select original, fixed
from replace_ands
;
I know this is not a full answer to your question, but maybe it can help you:
with t as (select '((<123>+<123>+<123>)(*<213>+<213>+<213>+<354>+<354>+<354>+1)(*<985>))(+<654>+<654>+1)' as exp from dual)
, t1 as ( select distinct regexp_substr(exp, '[^+]+', 1, level) names
from t
connect by level <= length(regexp_replace(exp, '[^*+]'))+1
)
SELECT
RTrim(listagg(t1.names,'+') WITHIN GROUP (order by names desc)) string
from t1
I found it :)
select REGEXP_REPLACE
(A,
'(<[^>]+>)((\+|\*)?\1)*',
'\1') as logic
FROM tab_T
Thank you anyway ;)

how to use like and substring in where clause in sql

I hope someone can help me by explaining this.
Why does the first query return results but the second does not?
EDIT:
first query:
select name from Items where name like '%abc%'
second Query:
select name from Items where name like substring('''%abc%''',1,10)
Why does the first return results while the second returns nothing, given that
substring('''%abc%''',1,10)='%abc%'
If there is a logic behind that, is there another approach to do something like the second query?
My purpose is to transform a string like '''abc''' to 'abc' in order to use it in a LIKE statement.
You can concatenate strings to form your LIKE pattern. To trim the first 3 and last 3 characters from a string, use the SUBSTRING and LEN functions. The following example assumes your match string is held in a variable called @input and starts and ends with 3 quote marks that need to be removed to find a match:
select name from Items where name like '%' + SUBSTRING(@input, 4, LEN(@input) - 6) + '%'
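A small sketch of why the second query finds nothing, using Python's sqlite3 module as a stand-in for the asker's database (an assumption; SQLite spells the function substr, but it handles doubled quotes inside a literal the same way). The Items rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Items (name TEXT)")
conn.executemany("INSERT INTO Items VALUES (?)", [("xxabcxx",), ("other",)])

# '''%abc%''' is a literal whose value keeps one quote at each end,
# so the pattern is the 7 characters '%abc%' including the quotes.
pattern = conn.execute("SELECT substr('''%abc%''', 1, 10)").fetchone()[0]

# LIKE now looks for names that start and end with a quote character.
with_quotes = conn.execute(
    "SELECT name FROM Items WHERE name LIKE ?", (pattern,)
).fetchall()

# Dropping the surrounding quotes gives the pattern the first query used.
without_quotes = conn.execute(
    "SELECT name FROM Items WHERE name LIKE ?", (pattern[1:-1],)
).fetchall()

print(repr(pattern), with_quotes, without_quotes)
```

The quotes that delimit a literal in SQL text are not part of the value, but quotes produced by doubling them inside the literal are, and LIKE treats those as ordinary characters to match.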

Check if a sentence is in a string in pl/pgsql

I have a pl/pgsql script that needs to check whether a word/sentence is in a string; it must respect word boundaries and be case-insensitive.
Example:
String: "my label xx zz yy", Pattern: "my label", MATCH
String: "xx my label zz", Pattern: "my label", MATCH
String: "my labelxx zz", Pattern: "my label", NO MATCH
So the obvious solution is to use a regex, like this:
select _label ~* (E'\\y' || _pattern || E'\\y') into _match;
It works but is slow, compared to a simple
select _label ilike '%' || _pattern || '%' into _match;
This is wrapped in a function that my script calls A LOT (in the tens of millions, I do a lot of recursion), and with this requirement the overall runtime doubled.
Now my question is, is there a faster way to implement this ?
Thanks.
EDIT: ended up using this:
if _label ilike '%' || _pattern || '%' then
select _label ~* (E'\\m' || _pattern || E'\\M') into _match;
end if;
and it is significantly faster.
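The idea behind this edit is to run a cheap substring test first and only pay for the regex when it passes. A minimal sketch of that idea in Python follows (an assumption: Python's \b stands in for PostgreSQL's \m and \M, and re.escape is an addition that the original code does not have). The function name is hypothetical.

```python
import re

def contains_phrase(label: str, pattern: str) -> bool:
    # Cheap check first: most labels fail here and never reach the regex.
    if pattern.lower() not in label.lower():
        return False
    # Word-boundary check; re.escape keeps characters in the pattern literal.
    regex = r"\b" + re.escape(pattern) + r"\b"
    return re.search(regex, label, re.IGNORECASE) is not None

results = [
    contains_phrase("my label xx zz yy", "my label"),
    contains_phrase("xx my label zz", "my label"),
    contains_phrase("my labelxx zz", "my label"),
]
print(results)
```

The three calls correspond to the MATCH, MATCH, NO MATCH examples in the question.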
I would consider the full text search capabilities, but from what you're describing, I'd likely implement this using PostgreSQL arrays.
First: define a function that takes a label, lowercases it (or uppercase if you prefer), splits it on word boundaries, and returns an array. Say:
CREATE OR REPLACE FUNCTION label_to_array(text) RETURNS text[] AS $$
SELECT regexp_split_to_array(lower($1), E'\\W');
$$ LANGUAGE sql IMMUTABLE;
$ select label_to_array('my label xx zz yy');
label_to_array
---------------------
{my,label,xx,zz,yy}
Now, create a GIN index over this function:
CREATE INDEX sometable_label_array_key ON sometable
USING GIN (label_to_array(label));
From here, PostgreSQL can use this index for many queries involving array operators, such as "contains":
SELECT *
FROM sometable
WHERE label_to_array(label) @> label_to_array('my label');
This query would split 'my label' into {my,label}, and would then use the index to find a list of rows containing my, intersect that with the list of rows containing label, and then return the result. This isn't exactly equivalent to your original query (since it doesn't check their order), but since it uses an index to eliminate most of the rows in the table, adding the original check on the end would work just fine:
SELECT *
FROM sometable
WHERE label_to_array(label) @> label_to_array('my label')
AND label ~* (E'\\y' || 'my label' || E'\\y');
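The two-step filter can be sketched without a database. The Python below only models the logic (word-set containment first, then the order-sensitive regex); it does not model the GIN index itself, and the function names and sample labels are hypothetical.

```python
import re

def label_to_words(text: str) -> set:
    # Lowercase, split on non-word characters, drop empty pieces.
    return {w for w in re.split(r"\W", text.lower()) if w}

def matches(label: str, pattern: str) -> bool:
    # Step 1 (what the index would answer): every pattern word is in the label.
    if not label_to_words(pattern) <= label_to_words(label):
        return False
    # Step 2: confirm order and adjacency with the word-boundary regex.
    regex = r"\b" + re.escape(pattern) + r"\b"
    return re.search(regex, label, re.IGNORECASE) is not None

labels = ["my label xx zz yy", "label of my own", "my labelxx zz"]
word_hits = [l for l in labels if label_to_words("my label") <= label_to_words(l)]
full_hits = [l for l in labels if matches(l, "my label")]
print(word_hits)
print(full_hits)
```

The label "label of my own" passes the containment step but fails the regex, which is the case the final AND clause exists to catch.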

MYSQL: Using GROUP BY with string literals

I have the following table with these columns:
shortName, fullName, ChangelistCount
Is there a way to group them by a string literal within their fullName? The fullname represents file directories, so I would like to display results for certain parent folders instead of the individual files.
I tried something along the lines of:
GROUP BY fullName like "%/testFolder/%" AND fullName like "%/testFolder2/%"
However it only really groups by the first match....
Thanks!
Perhaps you want something like:
GROUP BY IF(fullName LIKE '%/testfolder/%', 1, IF(fullName LIKE '%/testfolder2/%', 2, 3))
The key idea is that an expression like fullName LIKE foo AND fullName LIKE bar necessarily evaluates to either TRUE or FALSE as a whole, so you can only get two groups out of it.
Using an IF expression to return one of several different values will let you get more groups.
Keep in mind that this will not be particularly fast. If you have a very large dataset, you should explore other ways of storing the data that will not require LIKE comparisons to do the grouping.
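A runnable sketch of grouping on a multi-valued expression, using Python's sqlite3 module in place of MySQL (an assumption; CASE is used because it works in both, whereas IF() is MySQL-specific). The table name files and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (fullName TEXT, ChangelistCount INTEGER)")
conn.executemany("INSERT INTO files VALUES (?, ?)", [
    ("/src/testFolder/a.c", 2),
    ("/src/testFolder/b.c", 3),
    ("/src/testFolder2/c.c", 5),
    ("/docs/readme.txt", 1),
])

# Each row is mapped to exactly one label, and the label is the grouping key.
rows = conn.execute("""
    SELECT CASE
             WHEN fullName LIKE '%/testFolder/%'  THEN 'testFolder'
             WHEN fullName LIKE '%/testFolder2/%' THEN 'testFolder2'
             ELSE 'other'
           END AS folder,
           SUM(ChangelistCount) AS total
    FROM files
    GROUP BY folder
    ORDER BY folder
""").fetchall()

print(rows)
```

Including the slashes in each pattern keeps '/testFolder2/' from being counted under '/testFolder/'.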
You'd have to use a subquery to derive the column values you'd like to ultimately group on:
SELECT x.derived_column
FROM (SELECT SUBSTR(fullname, ?) AS derived_column
FROM YOUR_TABLE) x
GROUP BY x.derived_column
Either use WHEN/THEN conditions, or keep a temporary table containing all the matches you wish to find and group by. Here is a sample from my database, where I wanted to group all users by their cities, which appear inside the address field:
SELECT ut.* , c.city, ua.*
FROM `user_tracking` AS ut
LEFT JOIN cities AS c ON ut.place_name LIKE CONCAT( "%", c.city, "%" )
LEFT JOIN users_auth AS ua ON ua.id = ut.user_id
