I have had to look up hundreds (if not thousands) of free-text answers on google, making notes in Excel along the way and inserting SAS-code around the answers as a last step.
The output looks like this:
This output contains an unnecessary number of blank spaces, which seems to confuse SAS's search to the point where the observations can't be properly located.
It works if I manually erase superflous spaces, but that will probably take hours. Is there an automated fix for this, either in SAS or in excel?
I tried using the STRIP-function, to no avail:
else if R_res_ort_txt=strip(" arild ") and R_kom_lan=strip(" skåne ") then R_kommun=strip(" Höganäs " );
If you want to generate a string like:
if R_res_ort_txt="arild" and R_kom_lan="skåne" then R_kommun="Höganäs";
from three variables, let's call them A B C, then just use code like:
string=catx(' ','if R_res_ort_txt=',quote(trim(A))
,'and R_kom_lan=',quote(trim(B))
,'then R_kommun=',quote(trim(C)),';') ;
Or if you are just writing that string to a file just use this PUT statement syntax.
put 'if R_res_ort_txt=' A :$quote. 'and R_kom_lan=' B :$quote.
'then R_kommun=' C :$quote. ';' ;
A saner solution would be to continue using the free-text answers as data and perform your matching criteria for transformations with a left join.
proc import out=answers datafile='my-free-text-answers.xlsx';
data have;
attrib R_res_ort_txt R_kom_lan length=$100;
input R_res_ort_txt ...;
datalines4;
... whatever all those transforms will be performed on...
;;;;
proc sql;
create table want as
select
have.* ,
answers.R_kommun_answer as R_kommun
from
have
left join
answers
on
have.R_res_ort_txt = answers.res_ort_answer
& have.R_kom_lan = abswers.kom_lan_answer
;
I solved this by adding quotes in excel using the flash fill function:
https://www.youtube.com/watch?v=nE65QeDoepc
I need to validate for valid code name.
So, my string can have values like below:
String test = "C000. ", "C010. ", "C020. ", "C030. ", "CA00. ","C0B0. ","C00C. "
So my function needs to validate below conditions:
It should start with C
After that next 3 characters should be numeric before .
Rest it can be anything.
So in above string values, only ["C000.", "C010.", "C020.", "C030."] are valid ones.
EDIT:
Below is the code I tried:
if (nameObject.Title.StartsWith(String.Format("^[C][0-9]{3}$",nameObject.Title)))
I'd suggest a regex, for example (written off the top of my head, may need work):
string s = "C030.";
Regex reg = new Regex("C[0-9]{3,3}\\.");
bool isMatch = reg.IsMatch(s);
This regex should do the trick:
Regex.IsMatch(input, #"C[0-9]{3}\..*")
Check out http://www.techotopia.com/index.php/Working_with_Strings_in_C_Sharp
for a quick tutorial on (among other things) individual access of string elements, so you can test each element for your criteria.
If you think your criteria may change, using regular expressions gives you maximum flexibility (but is more runtime intensive than regular string-element evaluation). In your case, it may be overkill, IMHO.
I have a column with a long list of folder and file names. The folders and file names vary. I want to extract the file name from the column into another column but I struggling to do this in Excel.
Example of column data:(files and folder altered to hide details that should not be public)
c:\data\1\nc2\media\ss\system media\ne\d - wnd enging works v5.swf
c:\data\1\nc2\media\ss\special campaigns\samns dec 2012\trainerv5.swf
C:\Local\Messages\17362~000000001~20131231235910~4.MUF
c:\data\1\nc2\media\ss\system media\tl\nd - tfl statusv4.swf
c:\data\1\nc2\media\ss\system media\core\ss_bagage v2.swf
I know I should be able to search from the right to the first occurence of "\" but I can't figure out the syntax.
Many thanks
UPDATE:
Formula =RIGHT(B2,LEN(B2)-SEARCH("\",B2,1)) should work, but it shows incorrect results. But If I change it to search for "." it pulls out the file extension. So there is a key item I'm missing
=RIGHT(A1,LEN(A1)-FIND("~",SUBSTITUTE(A1,"\","~",LEN(A1)-LEN(SUBSTITUTE(A1,"\","")))))
copy it in any column say b drag down,you are done
VBA is a more efficient option if you have many files to parse. Create a module and add the below:
Function GetFileName(file As String) As String
Set fso = CreateObject("Scripting.FileSystemObject")
GetFileName = fso.GetFileName(file)
End Function
There are several different ways to get the text following the last slash in a string, including the following formula. In this example, H15 is the cell containing the string to search. If it can't find a slash, it returns the "-" (dash) character.
=iferror(RIGHT(H15,LEN(H15)-SEARCH("|",SUBSTITUTE(H15,"/","|",LEN(H15)-LEN(SUBSTITUTE(H15,"/",""))))),"-")
The formula first finds the number of slashes in the string. LEN gives the total length of the string, and LEN of the string without slashes after using SUBSTITUTE to eliminate the slashes in the original string - the difference is the number of slashes.
Then, you substitute in a marker character(I used "|") for the last slash. By searching for the marker, you find where the bit after the slash starts. The total length of the string minus where the marker starts tells you how many characters to take from the right, which you then do.
If you need more generic string parsing and are willing to use a little bit of VBA, you can use the split function as suggested by Jamie Bull in his answer to this question on SuperUser.
His function will use any character you choose to split the string into segments and return whichever segment you choose.
I've copied Jamie's function here for convenient reference:
Function STR_SPLIT(str, sep, n) As String
Dim V() As String
V = Split(str, sep)
STR_SPLIT = V(n - 1)
End Function
I have an Excel spreadsheet containing a list of strings. Each string is made up of several words, but the number of words in each string is different.
Using built in Excel functions (no VBA), is there a way to isolate the last word in each string?
Examples:
Are you classified as human? -> human?
Negative, I am a meat popsicle -> popsicle
Aziz! Light! -> Light!
This one is tested and does work (based on Brad's original post):
=RIGHT(A1,LEN(A1)-FIND("|",SUBSTITUTE(A1," ","|",
LEN(A1)-LEN(SUBSTITUTE(A1," ","")))))
If your original strings could contain a pipe "|" character, then replace both in the above with some other character that won't appear in your source. (I suspect Brad's original was broken because an unprintable character was removed in the translation).
Bonus: How it works (from right to left):
LEN(A1)-LEN(SUBSTITUTE(A1," ","")) – Count of spaces in the original string
SUBSTITUTE(A1," ","|", ... ) – Replaces just the final space with a |
FIND("|", ... ) – Finds the absolute position of that replaced | (that was the final space)
Right(A1,LEN(A1) - ... )) – Returns all characters after that |
EDIT: to account for the case where the source text contains no spaces, add the following to the beginning of the formula:
=IF(ISERROR(FIND(" ",A1)),A1, ... )
making the entire formula now:
=IF(ISERROR(FIND(" ",A1)),A1, RIGHT(A1,LEN(A1) - FIND("|",
SUBSTITUTE(A1," ","|",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))))
Or you can use the =IF(COUNTIF(A1,"* *") syntax of the other version.
When the original string might contain a space at the last position add a trim function while counting all the spaces: Making the function the following:
=IF(ISERROR(FIND(" ",B2)),B2, RIGHT(B2,LEN(B2) - FIND("|",
SUBSTITUTE(B2," ","|",LEN(TRIM(B2))-LEN(SUBSTITUTE(B2," ",""))))))
This is the technique I've used with great success:
=TRIM(RIGHT(SUBSTITUTE(A1, " ", REPT(" ", 100)), 100))
To get the first word in a string, just change from RIGHT to LEFT
=TRIM(LEFT(SUBSTITUTE(A1, " ", REPT(" ", 100)), 100))
Also, replace A1 by the cell holding the text.
A more robust version of Jerry's answer:
=TRIM(RIGHT(SUBSTITUTE(TRIM(A1), " ", REPT(" ", LEN(TRIM(A1)))), LEN(TRIM(A1))))
That works regardless of the length of the string, leading or trailing spaces, or whatever else and it's still pretty short and simple.
I found this on google, tested in Excel 2003 & it works for me:
=IF(COUNTIF(A1,"* *"),RIGHT(A1,LEN(A1)-LOOKUP(LEN(A1),FIND(" ",A1,ROW(INDEX($A:$A,1,1):INDEX($A:$A,LEN(A1),1))))),A1)
[edit] I don't have enough rep to comment, so this seems the best place...BradC's answer also doesn't work with trailing spaces or empty cells...
[2nd edit] actually, it doesn't work for single words either...
=RIGHT(TRIM(A1),LEN(TRIM(A1))-FIND(CHAR(7),SUBSTITUTE(" "&TRIM(A1)," ",CHAR(7),
LEN(TRIM(A1))-LEN(SUBSTITUTE(" "&TRIM(A1)," ",""))+1))+1)
This is very robust--it works for sentences with no spaces, leading/trailing spaces, multiple spaces, multiple leading/trailing spaces... and I used char(7) for the delimiter rather than the vertical bar "|" just in case that is a desired text item.
This is very clean and compact, and works well.
{=RIGHT(A1,LEN(A1)-MAX(IF(MID(A1,ROW(1:999),1)=" ",ROW(1:999),0)))}
It does not error trap for no spaces or one word, but that's easy to add.
Edit:
This handles trailing spaces, single word, and empty cell scenarios. I have not found a way to break it.
{=RIGHT(TRIM(A1),LEN(TRIM(A1))-MAX(IF(MID(TRIM(A1),ROW($1:$999),1)=" ",ROW($1:$999),0)))}
=RIGHT(A1,LEN(A1)-FIND("`*`",SUBSTITUTE(A1," ","`*`",LEN(A1)-LEN(SUBSTITUTE(A1," ","")))))
New answer 9/28/2022
Considering the new excel function: TEXTAFTER (check availability) you can achieve it with a simple formula:
=TEXTAFTER(A1," ", -1)
To add to Jerry and Joe's answers, if you're wanting to find the text BEFORE the last word you can use:
=TRIM(LEFT(SUBSTITUTE(TRIM(A1), " ", REPT(" ", LEN(TRIM(A1)))), LEN(SUBSTITUTE(TRIM(A1), " ", REPT(" ", LEN(TRIM(A1)))))-LEN(TRIM(A1))))
With 'My little cat' in A1 would result in 'My little' (where Joe and Jerry's would give 'cat'
In the same way that Jerry and Joe isolate the last word, this then just gets everything to the left of that (then trims it back)
Copy into a column, select that column and HOME > Editing > Find & Select, Replace:
Replace All.
There is a space after the asterisk.
Imagine the string could be reversed. Then it is really easy. Instead of working on the string:
"My little cat" (1)
you work with
"tac elttil yM" (2)
With =LEFT(A1;FIND(" ";A1)-1) in A2 you get "My" with (1) and "tac" with (2), which is reversed "cat", the last word in (1).
There are a few VBAs around to reverse a string. I prefer the public VBA function ReverseString.
Install the above as described. Then with your string in A1, e.g., "My little cat" and this function in A2:
=ReverseString(LEFT(ReverseString(A1);IF(ISERROR(FIND(" ";A1));
LEN(A1);(FIND(" ";ReverseString(A1))-1))))
you'll see "cat" in A2.
The method above assumes that words are separated by blanks. The IF clause is for cells containing single words = no blanks in cell. Note: TRIM and CLEAN the original string are useful as well. In principle it reverses the whole string from A1 and simply finds the first blank in the reversed string which is next to the last (reversed) word (i.e., "tac "). LEFT picks this word and another string reversal reconstitutes the original order of the word (" cat"). The -1 at the end of the FIND statement removes the blank.
The idea is that it is easy to extract the first(!) word in a string with LEFT and FINDing the first blank. However, for the last(!) word the RIGHT function is the wrong choice when you try to do that because unfortunately FIND does not have a flag for the direction you want to analyse your string.
Therefore the whole string is simply reversed. LEFT and FIND work as normal but the extracted string is reversed. But his is no big deal once you know how to reverse a string. The first ReverseString statement in the formula does this job.
=LEFT(A1,FIND(IF(
ISERROR(
FIND("_",A1)
),A1,RIGHT(A1,
LEN(A1)-FIND("~",
SUBSTITUTE(A1,"_","~",
LEN(A1)-LEN(SUBSTITUTE(A1,"_",""))
)
)
)
),A1,1)-2)
I translated to PT-BR, as I needed this as well.
(Please note that I've changed the space to \ because I needed the filename only of path strings.)
=SE(ÉERRO(PROCURAR("\",A1)),A1,DIREITA(A1,NÚM.CARACT(A1)-PROCURAR("|", SUBSTITUIR(A1,"\","|",NÚM.CARACT(A1)-NÚM.CARACT(SUBSTITUIR(A1,"\",""))))))
Another way to achieve this is as below
=IF(ISERROR(TRIM(MID(TRIM(D14),SEARCH("|",SUBSTITUTE(TRIM(D14)," ","|",LEN(TRIM(D14))-LEN(SUBSTITUTE(TRIM(D14)," ","")))),LEN(TRIM(D14))))),TRIM(D14),TRIM(MID(TRIM(D14),SEARCH("|",SUBSTITUTE(TRIM(D14)," ","|",LEN(TRIM(D14))-LEN(SUBSTITUTE(TRIM(D14)," ","")))),LEN(TRIM(D14)))))
You can achieve this also by reversing the string and finding the first space
=MID(C3,2+LEN(C3)-SEARCH(" ",CONCAT(MID(C3,SEQUENCE(LEN(C3),,LEN(C3),-1),1))),LEN(A1))
Reverse the string
CONCAT(MID(C3,SEQUENCE(LEN(C3),,LEN(C3),-1),1))
Find the first space in the reversed string
SEARCH(" ",...
Take the position of the space found in the reversed string off the length of the string and return that portion
=MID(C3,2+LEN(C3)-SEARCH...
I also had a task like this and when I was done, using the above method, a new method occured to me: Why don't you do this:
Reverse the string ("string one" becomes "eno gnirts").
Use the good old Find (which is hardcoded for left-to-right).
Reverse it into readable string again.
How does this sound?