REPLACE doesn't work for strings in BigQuery

I tried to use REPLACE to delete some words, like 'HOMEMAKER' and 'HOUSEWIFE', from strings, but the words were not replaced with an empty space. Why is that happening?
I want to delete strings like 'HOMEMAKER' and 'HOUSEWIFE' from the column EMPLOYER using the REPLACE function, but it failed. I also tried REGEXP_REPLACE, which failed as well. This is the table I have (sorry, I wanted to build a table here but somehow it doesn't work).
EMPLOYER
RETIRED/HOMEMAKER
HOMEMAKER/HOMEMAKER
SELF-EMPLOYED/HOMEMAKER
My code is listed below:
SELECT EMPLOYER,
CASE WHEN EMPLOYER LIKE '%HOUSE%WIFE%'
THEN REGEXP_REPLACE(EMPLOYER,r'HOUSE%WIFE',' ')
WHEN EMPLOYER LIKE '%HOME%MAKER%'
THEN REGEXP_REPLACE(EMPLOYER, r'HOME%MAKER', ' ')
ELSE '0'
END AS SIGN
FROM fec.work
WHERE EMPLOYER LIKE '%HOME%MAKER%'
OR EMPLOYER LIKE '%HOUSE%WIFE%'
GROUP BY 1,2;
The result I want is like:
SIGN
RETIRED/
/
SELF-EMPLOYED/
But I got exactly the same values in SIGN as in EMPLOYER. Can anyone tell me why the replace function did not make any changes?

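The % wildcard only works in LIKE; in a regular expression it is a literal percent sign, so r'HOME%MAKER' never matches. Use something like the following instead:
REGEXP_REPLACE(EMPLOYER, r'HOUSEWIFE|HOMEMAKER',' ')
You can use the i flag to make the replacement case-insensitive:
REGEXP_REPLACE(EMPLOYER, r'(?i)HOUSEWIFE|HOMEMAKER',' ')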
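A quick way to check both patterns outside BigQuery is a few lines of Python. This is only a sketch: Python's re module is not RE2, but for these simple patterns the two should behave the same. The sample rows come from the question, and the variable names are made up for illustration.

```python
import re

employers = ["RETIRED/HOMEMAKER", "HOMEMAKER/HOMEMAKER", "SELF-EMPLOYED/HOMEMAKER"]

# The question's pattern: '%' is matched literally, so nothing is replaced
unchanged = [re.sub(r"HOME%MAKER", " ", e) for e in employers]

# Alternation matches either word; (?i) makes the match case-insensitive
replaced = [re.sub(r"(?i)HOUSEWIFE|HOMEMAKER", " ", e) for e in employers]

print(unchanged)
print(replaced)
```

Note that the replacement string is a single space, as in the answer; pass an empty string instead if nothing should be left behind.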

Either of these options should work for your case.
with test as (
select 'EMPLOYER' as my_str union all
select 'RETIRED/HOMEMAKER' as my_str union all
select 'HOMEMAKER/HOMEMAKER' as my_str union all
select 'SELF-EMPLOYED/HOMEMAKER' as my_str
)
select
my_str,
REPLACE(REPLACE(my_str, 'HOUSEWIFE', ' '), 'HOMEMAKER', ' ') as replaced_str,
REGEXP_REPLACE(my_str, r'HOUSEWIFE|HOMEMAKER', ' ') as regexed_str
from test


how do I get rid of leading/trailing spaces in SAS search terms?

I have had to look up hundreds (if not thousands) of free-text answers on google, making notes in Excel along the way and inserting SAS-code around the answers as a last step.
The output looks like this:
This output contains an unnecessary number of blank spaces, which seems to confuse SAS's search to the point where the observations can't be properly located.
It works if I manually erase the superfluous spaces, but that will probably take hours. Is there an automated fix for this, either in SAS or in Excel?
I tried using the STRIP function, to no avail:
else if R_res_ort_txt=strip(" arild ") and R_kom_lan=strip(" skåne ") then R_kommun=strip(" Höganäs " );
If you want to generate a string like:
if R_res_ort_txt="arild" and R_kom_lan="skåne" then R_kommun="Höganäs";
from three variables, let's call them A B C, then just use code like:
string=catx(' ','if R_res_ort_txt=',quote(trim(A))
,'and R_kom_lan=',quote(trim(B))
,'then R_kommun=',quote(trim(C)),';') ;
Or, if you are just writing that string to a file, use this PUT statement syntax:
put 'if R_res_ort_txt=' A :$quote. 'and R_kom_lan=' B :$quote.
'then R_kommun=' C :$quote. ';' ;
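The underlying idea is to trim each value first and only then wrap it in quotes. Here is a minimal sketch of that idea in Python, purely as an illustration; build_rule is a hypothetical helper, and doubling embedded double quotes is an assumption about how such values should be escaped.

```python
def build_rule(res_ort: str, kom_lan: str, kommun: str) -> str:
    """Build one SAS-style IF statement from three free-text values."""
    def quoted(value: str) -> str:
        # Trim surrounding blanks, then double any embedded double quotes
        return '"' + value.strip().replace('"', '""') + '"'

    return (f"if R_res_ort_txt={quoted(res_ort)} "
            f"and R_kom_lan={quoted(kom_lan)} "
            f"then R_kommun={quoted(kommun)};")

rule = build_rule(" arild ", " skåne ", " Höganäs ")
print(rule)
```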
A saner solution would be to continue using the free-text answers as data and perform your matching criteria for transformations with a left join.
proc import out=answers datafile='my-free-text-answers.xlsx';
data have;
attrib R_res_ort_txt R_kom_lan length=$100;
input R_res_ort_txt ...;
datalines4;
... whatever all those transforms will be performed on...
;;;;
proc sql;
create table want as
select
have.* ,
answers.R_kommun_answer as R_kommun
from
have
left join
answers
on
have.R_res_ort_txt = answers.res_ort_answer
& have.R_kom_lan = answers.kom_lan_answer
;
I solved this by adding quotes in Excel using the Flash Fill function:
https://www.youtube.com/watch?v=nE65QeDoepc

Get the 3rd part of the string in Postgres

I have data like,
ab-volt-ssn-dev
ab-volt-lnid-dev
ab-volt-ssn-hamp-dev
ab-volt-cf-apnt-test
I need output to be like,
ssn
lnid
ssn
cf
You can use split_part()
select split_part('ab-volt-ssn-hamp-dev', '-', 3);
If you need to access multiple parts, then converting it to an array might be easier:
select elements[1],
elements[2],
elements[3],
elements[4]
from (
select string_to_array(the_column, '-') as elements
from the_table
) t;
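For intuition, split_part(s, '-', 3) behaves like splitting on the delimiter and taking the third piece, counting from 1. A rough Python equivalent for positive field numbers, applied to the sample rows from the question (the helper is written here for illustration only; like split_part, it returns an empty string when the field does not exist):

```python
def split_part(text: str, delimiter: str, n: int) -> str:
    """Return the n-th (1-based) field of text, or '' if there is none."""
    parts = text.split(delimiter)
    return parts[n - 1] if 1 <= n <= len(parts) else ""

rows = ["ab-volt-ssn-dev", "ab-volt-lnid-dev", "ab-volt-ssn-hamp-dev", "ab-volt-cf-apnt-test"]
third = [split_part(r, "-", 3) for r in rows]
print(third)
```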

Cognos Report Studio: CASE and IF Statements

I'm very new to Cognos Report Studio and am trying to filter some values and replace them with others.
I currently have values that come out as blanks and want to replace them with the string "Property Claims".
What I'm trying to use in my main query is:
CASE WHEN [Portfolio] is null
then 'Property Claims'
ELSE [Portfolio]
which gives me an error. I also have a different filter I want to add, to replace the windscreen flag with a string value rather than a number. For example, if the flag is 1, I want to show it as 'Windscreen Claims'.
if [Claim Windscreen Flag] = 1
then ('Windscreen')
Else [Claim Windscreen Flag]
Neither of these works; both give the same error. Can someone give me a hand?
Your first CASE statement is missing the END. The error message should be pretty clear. But there is a simpler way to do that:
coalesce([Portfolio], 'Property Claims')
The second problem is similar: Your IF...THEN...ELSE statement is missing a bunch of parentheses. But after correcting that you may have problems with incompatible data types. You may need to cast the numbers to strings:
case
when [Claim Windscreen Flag] = 1 then ('Windscreen')
else cast([Claim Windscreen Flag], varchar(50))
end
In future, please include the error messages.
It might be a syntax issue. A few things to check:
Use IS NULL (instead of = null).
NULL is not the same as blank, so you might also want to test for = ' '.
The CASE might need an ELSE, and it needs an END at the bottom.
Treating a data item as a different type can cause errors, for example comparing a numeric item such as [Sales] with 'Jane Doe'.
For example (assuming the result is a string and data item 2 is also a string),
case
when([data item 1] IS NULL)Then('X')
when([data item 1] = ' ')Then('X')
else([data item 2])
end
Also, if you want to show a data item as a different type, you can use CAST

how to use like and substring in where clause in sql

I hope someone can help me and explain this query:
why does the first query return results but the second does not?
EDIT:
first query:
select name from Items where name like '%abc%'
second Query:
select name from Items where name like substring('''%abc%''',1,10)
Why does the first return results but the second return nothing, while
substring('''%abc%''',1,10)='%abc%'
If there is a logic behind that, is there another approach to do something like the second query?
My purpose is to transform a string like '''abc''' to 'abc' in order to use it in a LIKE statement.
You can concatenate strings to form your LIKE pattern. To trim the first 3 and last 3 characters from a string, use the SUBSTRING and LEN functions. The following example assumes your match string is in a variable called @input and starts and ends with 3 quote marks that need to be removed to find a match:
select name from Items where name like '%' + SUBSTRING(@input, 4, LEN(@input) - 6) + '%'
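On why the second query in the question returns nothing: inside a SQL string literal, two single quotes stand for one quote character, so '''%abc%''' is the seven-character string '%abc%' with the quotes included, and LIKE then only matches names that begin and end with a quote. A small Python sketch of that reasoning (sql_literal_value and like_to_regex are hypothetical helpers written for this illustration; they handle only the % and _ wildcards and ignore collation):

```python
import re

def sql_literal_value(literal: str) -> str:
    """Value of a single-quoted SQL literal: drop outer quotes, undouble inner ones."""
    return literal[1:-1].replace("''", "'")

def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a LIKE pattern (% and _ only) into a regex for fullmatch."""
    pieces = [".*" if c == "%" else "." if c == "_" else re.escape(c) for c in pattern]
    return re.compile("".join(pieces), re.DOTALL)

first = sql_literal_value("'%abc%'")        # pattern used by the first query
second = sql_literal_value("'''%abc%'''")   # string passed to substring()

name = "xx abc yy"
first_matches = bool(like_to_regex(first).fullmatch(name))
second_matches = bool(like_to_regex(second).fullmatch(name))
print(repr(first), first_matches)
print(repr(second), second_matches)
```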

Pl/Sql using instr to find exact match

I am trying to find out whether a string exists as a word in a name and extract it. I have used the instr() function, but it works like LIKE: if part of a word or the whole word exists, it returns it.
Here I want to get the string 'Services' out. It works, but if I change 'Services' to 'Service' it still works. I don't want that. If 'Service' is entered it should return null and not 'Services'.
Modified:
What I am trying to do here is abbreviate certain parts of the company name.
This is what my database table looks like :
Word | Abb
---------+-----
Company | com
Limited | ltd
Service | serv
Services | servs
Here is the code:
DECLARE
CURSOR Words IS
SELECT word, abb
FROM abbWords;
wordPosition PLS_INTEGER;
processingWord VARCHAR2(50);
abbreviatedName VARCHAR2(120);
fullName VARCHAR2(120) := 'A.D Company Services Limited';
BEGIN
FOR eachWord IN Words LOOP
-- find the position of the word in the name
wordPosition := INSTR(fullName, eachWord.word);
-- extract the word from the full name that matches the database
processingWord := SUBSTR(fullName, INSTR(fullName, eachWord.word), LENGTH(eachWord.word));
-- only process words that exist in the name
IF wordPosition > 0 THEN
abbreviatedName := REPLACE(fullName, eachWord.word, eachWord.abb);
END IF;
END LOOP;
END;
So if the user enters 'Service' I don't want 'Services' to be returned. By this I mean the word position should be 0 if the word 'Service' is not found, instead of returning the position of the word 'Services'.
One way of doing it:
DECODE(INSTR('A.D Company Services Limited','Services'),
0,
NULL,
SUBSTR('A.D Company Services Limited',
INSTR('A.D Company Services Limited','Services'),
length('Services')))
INSTR() returns 0 if the text is not found. DECODE() evaluates the first argument and compares it to the second; if they match, it returns the third argument, otherwise the fourth. (sqlfiddle link)
Arguably not the most elegant way, but matches your requirement.
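Note that any substring search, INSTR included, still finds 'Service' inside 'Services', so the word has to be compared as a whole to tell the two apart. A minimal Python sketch of that idea, using the abbreviation table from the question (the function and variable names are illustrative, and splitting on single spaces is an assumption about the input):

```python
ABBREVIATIONS = {"Company": "com", "Limited": "ltd", "Service": "serv", "Services": "servs"}

def abbreviate(full_name: str) -> str:
    """Replace each whole word that appears in the table with its abbreviation."""
    words = full_name.split(" ")
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)

name = "A.D Company Services Limited"
substring_hit = "Service" in name              # substring test, like INSTR > 0
whole_word_hit = "Service" in name.split(" ")  # whole-word test
abbreviated = abbreviate(name)
print(substring_hit, whole_word_hit, abbreviated)
```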
I think you're over-complicating this. You can do everything with regular expressions. For instance; given the following table:
create table names ( name varchar2(100));
insert into names values ('A.D Company Services Limited');
insert into names values ('A.D Company Service Limited');
This query will only return the name 'A.D Company Services Limited'.
select *
from names
where regexp_like( name
, '(^|[[:space:]])services($|[[:space:]])'
, 'i' )
This means: match the beginning of the string, ^, or a space, followed by services, followed by the end of the string, $, or a space. This is what differentiates regular expressions from using instr etc. You can easily make your matches conditional on other factors.
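The same boundary idea can be tried outside Oracle. A sketch in Python, where \s stands in for the POSIX class [[:space:]] (Python's re module does not support that bracket-class syntax) and re.IGNORECASE plays the role of the 'i' match parameter:

```python
import re

# Start of string or whitespace, then the word, then whitespace or end of string
WHOLE_WORD = re.compile(r"(^|\s)services($|\s)", re.IGNORECASE)

names = ["A.D Company Services Limited", "A.D Company Service Limited"]
matches = [n for n in names if WHOLE_WORD.search(n)]
print(matches)
```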
However, though this seems to be your question I don't think this is what you're trying to do. You're trying to replace the string 'serv' in your wider string without replacing 'services' or 'service'. For this you need to use regexp_replace().
If I add the following row to the table:
insert into names values ('A.D Company Serv Limited');
and run this query:
select regexp_replace( name
, '(^|[[:space:]])serv($|[[:space:]])'
, ' Services '
, 1, 0, 'i' )
from names
The only thing that will change is ' Serv ', which, in this newest row, will be replaced with ' Services '. Note the spaces; they are very important, as you don't want to turn 'Services' into 'Servicesices'.
Here's a little SQL Fiddle to demonstrate.
Another alternative is to use something like:
select replace(name,' Serv ', ' Services ')
from names;
This will replace only the word 'Serv' situated between 2 spaces.
Thank you,
Alex.
INSTR returns a number: the index of the first occurrence of the matching string. You should use regexp_substr instead (10g+):
SQL> select regexp_substr('A.D Company Services Limited', 'Services') match,
2 regexp_substr('A.D Company Service Limited', 'Services') unmatch
3 from dual;
MATCH UNMATCH
-------- -------
Services
