SPSS: select cases by starting character (string)

I have a data set with a variable that contains text. I need to select the cases that do not contain "RT" at the start. How can I go about doing this? I do not want to compute a new variable; I just want to set a filter.
I know that I need to use Select Cases -> If condition is satisfied..., but I cannot figure out what function or formula to use to achieve this.
Help is much appreciated.

SELECT IF CHAR.SUBSTR(UPCASE(MyVar),1,2) NE "RT".
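For anyone who wants to sanity-check the condition outside SPSS, the same test can be sketched in Python. This is only an illustration: the function name and the sample values are hypothetical, and, like the SPSS expression above, it compares the first two characters after upper-casing them.

```python
def lacks_rt_prefix(text: str) -> bool:
    """Return True when the text does NOT start with "RT" (case-insensitive)."""
    # Mirrors CHAR.SUBSTR(UPCASE(MyVar),1,2) NE "RT": take two characters, upper-case, compare.
    return text[:2].upper() != "RT"


# Hypothetical sample values standing in for the text variable.
cases = ["RT @user hello", "rt lowercase retweet", "Original post", ""]
selected = [c for c in cases if lacks_rt_prefix(c)]
print(selected)
```

Only the values that do not begin with "RT" in any letter case are kept, which is the set of cases the filter is meant to select.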

Related

How to fill missing values with a specific set of string values

This is my DB.
It has too many missing cases for me to fill in manually, and I cannot use Flash Fill in this case,
so I want to randomly fill these cases with a specific set of strings: "TCS", "TCLSH", "TCEO", and "TCT".
How can I do this in Excel? Please help, and thanks a lot in advance.
You can do this in a separate column using INDEX with RANDBETWEEN (per screenshot / this sheet) as follows:
=IF(B3="",INDEX($H$3:$H$6,RANDBETWEEN(1,COUNTA($H$3:$H$6))),B3)
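The logic of that formula (keep a value if present, otherwise pick one of the allowed strings at random) can also be sketched in Python. The names `fill_missing`, `CHOICES` and the sample column are hypothetical, and treating both empty strings and `None` as "missing" is an assumption made for the illustration.

```python
import random

# The allowed replacement strings from the question.
CHOICES = ["TCS", "TCLSH", "TCEO", "TCT"]


def fill_missing(values, choices=CHOICES, rng=None):
    """Return a copy of values where empty entries are replaced by a random choice."""
    rng = rng or random.Random()
    # Existing values are kept as they are; only "" and None are filled.
    return [v if v not in ("", None) else rng.choice(choices) for v in values]


# Hypothetical column with gaps; a seeded generator makes the fill repeatable.
column = ["TCS", "", "TCT", None, ""]
filled = fill_missing(column, rng=random.Random(0))
print(filled)
```

Unlike RANDBETWEEN, which recalculates whenever the sheet changes, the list produced here stays fixed once it has been generated.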

Creating a variable if key term appears in text

I am trying to create a variable "thromboembolism death": 0 if it is not the cause of death, 1 if it is.
Is there any way to search this data set in SPSS or Excel in order to create a new variable if one of the key terms (e.g. DVT, pulmonary embolism, thromboembolism) appears in the line of text? Here is what my data looks like right now.
https://i.stack.imgur.com/WDrBs.png
Also, the data set is very large: 250,000+ cases. I am new to data analysis, thanks for the help!
In SPSS, assuming you have a variable named death_cause with description verbatims:
COMPUTE thromboembolism_death = (INDEX(UPCASE(death_cause),'DVT') > 0)
OR (INDEX(UPCASE(death_cause),'PULMONARY EMBOLISM') > 0)
OR (INDEX(UPCASE(death_cause),'THROMBOEMBOLISM') > 0).
EXECUTE.
In Excel, you could take a similar approach. Assuming your text verbatims are in column A:
=IF(OR(ISNUMBER(SEARCH("DVT",A1)),ISNUMBER(SEARCH("PULMONARY EMBOLISM",A1)),ISNUMBER(SEARCH("THROMBOEMBOLISM",A1))),1,0)
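Alternatively, if you're comfortable using SUMPRODUCT(), the formula gets a bit shorter. Assuming you list your "strings to search for" in cells C2:C5, with no blank cells in that range (SEARCH treats an empty search string as a match, so a blank cell would flag every row):
=SUMPRODUCT(--ISNUMBER(SEARCH($C$2:$C$5,A1)))>0
The absolute reference keeps the list fixed when the formula is filled down. This version returns TRUE/FALSE; wrap it as --(...) if you need 1/0.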
Note that all of the above options are case-insensitive.
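As a rough cross-check of the same keyword logic, here is a minimal Python sketch. The function name and the example descriptions are hypothetical; the key terms are the three from the question, and the match is a plain case-insensitive substring test, just like the INDEX/SEARCH versions.

```python
# Key terms from the question, stored upper-case for case-insensitive matching.
KEY_TERMS = ("DVT", "PULMONARY EMBOLISM", "THROMBOEMBOLISM")


def thromboembolism_death(death_cause: str) -> int:
    """Return 1 if any key term appears anywhere in the description, else 0."""
    text = death_cause.upper()
    return int(any(term in text for term in KEY_TERMS))


# Hypothetical descriptions, not taken from the real data set.
examples = [
    "Cardiac arrest following pulmonary embolism",
    "Complications of dvt",
    "Pneumonia",
]
flags = [thromboembolism_death(e) for e in examples]
print(flags)
```

Because this is a substring test, a description such as "no evidence of DVT" would also be flagged, so a sample of the matches is worth reviewing by hand.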

Ignore numeric values in string, Presto

I have a column in a database that contains only postcodes. I want to use that column to get statistical data about specific regions. To do this, I want to extract just the leading non-numeric characters of the postcode (B for Birmingham, BT for Belfast).
I can see solutions in other SQL dialects using a CASE WHEN with ISNUMERIC, but that function doesn't work in Presto. Are there any solutions to this?
As always, any advice would be greatly appreciated.
Many thanks,
Barry
I think you'll want to look at the regular expression functions: either extract the leading non-numeric characters using regexp_extract, or remove everything from the first digit onward using regexp_replace.
See https://prestodb.io/docs/current/functions/regexp.html
I managed to get around this using
select concat(substr(postcode,1,1), case when substr(postcode,2,1) in ('1','2','3','4','5','6','7','8','9','0') then '' else substr(postcode,2,1) end)
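The regular-expression idea can be tried out quickly in Python before writing the query. This is a sketch under assumptions: the function name is hypothetical, the pattern `^[A-Za-z]+` assumes the area code is made up only of leading ASCII letters, and the sample postcodes are illustrative.

```python
import re

# One or more letters anchored at the start of the string.
LEADING_LETTERS = re.compile(r"^[A-Za-z]+")


def postcode_area(postcode: str) -> str:
    """Return the leading letters of a postcode, or "" if it starts with a non-letter."""
    match = LEADING_LETTERS.match(postcode.strip())
    return match.group(0).upper() if match else ""


# Illustrative postcodes only.
for code in ["B15 2TT", "BT7 1NN", "bt1 5gs"]:
    print(code, "->", postcode_area(code))
```

Unlike the substr-based workaround, a pattern like this does not depend on the area code being exactly one or two characters long.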

Looking at multiple values with one statement w/o OR statement

Say I have multiple tasks: quoting, binding, rating that have same response time of 3 hours... I was wondering if there was a way to make an IF statement such that I could just say for example:
=IF(B2="*Quoting,Binding,Rating", C2+3, NA)
I haven't been able to get it to work, and I'm trying to avoid using an OR statement with the IF statement to get the values, but is it possible to do it this way? It sounds simple, "If it's task x,y,z then add 3 hours to the start time column (C2)". Any advice guys? Thanks!
This may achieve what you're after
=IF(NOT(ISERROR(SEARCH(B2,"Quoting,Binding,Rating"))), C2+3, "NA")
Hope that helps
Use the OR statement: =IF(OR(B2="Quoting",B2="Binding",B2="Rating"),C2+3,NA())
If you're looking to shorten the formula, you can put the values you want to check into a named range (I used "List" to reference "I:I", but you could put the list on another sheet) and use SUMPRODUCT.
=IF(SUMPRODUCT(--(B2=List))>0,C2+3,NA())
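The "is the task in my list?" check behind the named-range formula is a plain membership test, sketched here in Python. The names are hypothetical, and the start time is assumed to be a plain number of hours rather than an Excel date/time value.

```python
# Tasks that share the 3-hour response time, as listed in the question.
THREE_HOUR_TASKS = {"Quoting", "Binding", "Rating"}


def due_time(task, start_hours):
    """Return start + 3 for a listed task, or None (standing in for #N/A) otherwise."""
    if task in THREE_HOUR_TASKS:
        return start_hours + 3
    return None


print(due_time("Binding", 9))   # listed task
print(due_time("Auditing", 9))  # not in the list
```

This is an exact match on the whole task name, like B2=List. The SEARCH-based formula is looser: any fragment of the combined string, such as "ing", would also count as a hit.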

How to include Multiple IFs in Excel formula?

I have the formula below that has two IF statements; however, it is giving an error. Any help is appreciated.
=IF((AND(B11>65,D15>65)),Sheet!D$47,VLOOKUP(D15,Sheet!$A$2:$K$51,4,FALSE)),IF((AND(B11>61,D15>61)),Sheet!D$44,VLOOKUP(D15,Sheet!$A$2:$K$51,4,FALSE))
What I am trying to achieve is the following:
IF B11>65 and D15>65, then select the value from D$47. IF B11<65 and D15<65, then select the value from D$44. Else VLOOKUP(D15,Sheet!$A$2:$K$51,4,FALSE)
You can have multiple IF() calls embedded such as this:
=IF(AND(), Sheet!D$47, IF(AND(), Sheet!D$44, VLOOKUP()))
The complete form in your case would be:
=IF(AND(B11>65,D15>65), Sheet!D$47, IF(AND(B11>61,D15>61), Sheet!D$44, VLOOKUP(D15,Sheet!$A$2:$K$51,4,FALSE)))
Basically, this uses the 'Else' clause as a means to add additional clauses. It is possible to do it the other way around but I personally find that harder to read.
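Written out as ordinary branching, the nested formula reads as an if / else-if / else chain. The Python below is only a sketch of that structure: `pick_value` and its parameters are hypothetical stand-ins for the cells, and `lookup` stands in for the VLOOKUP call.

```python
def pick_value(b11, d15, value_d47, value_d44, lookup):
    """Mirror IF(AND(...), D47, IF(AND(...), D44, VLOOKUP(...)))."""
    if b11 > 65 and d15 > 65:
        return value_d47          # first IF: both values above 65
    elif b11 > 61 and d15 > 61:
        return value_d44          # nested IF: both values above 61
    else:
        return lookup(d15)        # final else: fall back to the lookup


# Hypothetical lookup table standing in for Sheet!$A$2:$K$51.
table = {60: "row for 60", 50: "row for 50"}
print(pick_value(70, 66, "D47", "D44", table.get))
print(pick_value(63, 62, "D47", "D44", table.get))
print(pick_value(60, 60, "D47", "D44", table.get))
```

The order of the tests matters: the stricter condition (above 65) has to come first, otherwise the looser one (above 61) would catch those cases too.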

Resources