Deleting part of a strings-cell array - string

This is my sample data (cell array)
>A_
'CUST_1627_PI425479659' 'Pri-miR-100u2' 'u2'
'CUST_2430_PI425479649' 'Pri-miR-L7a-3d' 'd'
'CUST_226_PI425479649' 'Pri-miR-3130-4u1' 'u1'
'CUST_1618_PI425479649' 'Pri-miR-147bu' 'u'
'CUST_1449_PI425479659' 'Pri-miR-107u' 'u'
'CUST_1546_PI425479659' 'Pri-miR-4299d1' 'd1'
The last one or two characters of each string in the second column are repeated in the third column. I would like to remove them from the strings in the second column.
The result should look like this:
>A_
'CUST_1627_PI425479659' 'Pri-miR-100' 'u2'
'CUST_2430_PI425479649' 'Pri-miR-L7a-3' 'd'
'CUST_226_PI425479649' 'Pri-miR-3130-4' 'u1'
'CUST_1618_PI425479649' 'Pri-miR-147b' 'u'
'CUST_1449_PI425479659' 'Pri-miR-107' 'u'
'CUST_1546_PI425479659' 'Pri-miR-4299' 'd1'
I tried it this way, but it doesn't work.
s= {'u','u1','u2','d','d1'};
for i=1:length(A_(:,2))
A_(i,2)= erase(A_(i,2),s)
end

Use regexprep to replace the occurrences of the third column in the second column with ''.
A_(:,2) = regexprep( A_(:,2), A_(:,3), '');
Or, to fix your code, which uses erase (introduced in R2016b):
for k=1:length(A_(:,2))
A_(k,2) = erase(A_(k,2), A_(k,3)); %You need A_(k,3) here
end
However, since erase can be applied directly to cell arrays, you don't need a loop here:
A_(:,2) = erase(A_(:,2), A_(:,3));
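
For comparison, here is a minimal Python sketch of the same idea (not MATLAB code). It assumes the data is held as a list of three-element tuples and removes the third-column text only when it appears at the very end of the second column. The names strip_suffix and rows are hypothetical and used only for illustration.

```python
def strip_suffix(rows):
    """Return copies of rows with column 3's text removed from the end of column 2."""
    cleaned = []
    for ident, name, suffix in rows:
        # Only strip when the name really ends with the suffix.
        if suffix and name.endswith(suffix):
            name = name[: -len(suffix)]
        cleaned.append((ident, name, suffix))
    return cleaned


rows = [
    ("CUST_1627_PI425479659", "Pri-miR-100u2", "u2"),
    ("CUST_2430_PI425479649", "Pri-miR-L7a-3d", "d"),
    ("CUST_1546_PI425479659", "Pri-miR-4299d1", "d1"),
]
for row in strip_suffix(rows):
    print(row)
```

Restricting the removal to the end of the string avoids touching an identical substring that happens to occur earlier in the name.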

split String Variable in few numeric Variables in SPSS

I have a string variable with comma separated numbers that I want to split into four numeric variables.
makeArr     var1a  var1b  var1c  var1d
6,8,13,10   6      8      13     10
10,11,2     10     11     2
7,1,14,3    7      1      14     3
With:
IF (CHAR.INDEX(makeArr,',') >= 1)
f12a=CHAR.SUBSTR(makeArr,1,CHAR.INDEX(makeArr,',')-1).
EXECUTE.
IF (CHAR.INDEX(makeArr,',') >= 1)
f12b=CHAR.SUBSTR(makeArr,CHAR.INDEX(makeArr,',')+1,CHAR.INDEX(makeArr,',')-1).
EXECUTE.
The first variable is always written without any problems.
The second one does not work, because its length varies and the comma ends up in the result as well.
So I need to split the string at each comma and distribute the numbers across the variables.
Since CHAR.INDEX only tells you the location of the first occurrence of the search string, you need to start the second search from a new location, AFTER the first occurrence, and this gets more and more complicated as you continue. My suggestion is to create a copy of your array variable and cut pieces off it as you proceed, so that you are only ever searching for the first occurrence of "," each time.
First I recreate your example data to demonstrate on.
data list free/makeArr (a20).
begin data
"6,8,13,10" "10,11,2" "7,1,14,3"
end data.
Now I copy your array into a new variable, #tmp. Note that I add a "," at the end so the syntax stays the same for all parts of the array. The "#" at the beginning of the name makes it a scratch variable, so it does not show up in the dataset; you can remove it if you want.
It is possible to do the following calculation in separate steps, as you started to do, but it is nicer to loop through the steps (especially if this is an example of a longer array).
string f12a f12b f12c f12d #tmp (a20).
compute #tmp=concat(rtrim(makeArr),",").
do repeat nwvr=f12a f12b f12c f12d.
do IF #tmp<>"".
compute nwvr=CHAR.SUBSTR(#tmp,1,CHAR.INDEX(#tmp,',')-1).
compute #tmp=CHAR.SUBSTR(#tmp,CHAR.INDEX(#tmp,',')+1).
end if.
end repeat.
EXECUTE.
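
The same "cut pieces off the front" logic can be sketched in Python to make the steps easier to follow. This is only an illustration of the loop above, not SPSS syntax; the function name split_by_cutting and the parameter n_vars are hypothetical.

```python
def split_by_cutting(make_arr, n_vars=4):
    """Peel off the text before the first comma, n_vars times."""
    tmp = make_arr.strip() + ","      # trailing comma keeps every step identical
    pieces = []
    for _ in range(n_vars):
        if tmp != "":
            cut = tmp.index(",")      # position of the first comma
            pieces.append(tmp[:cut])  # text before that comma
            tmp = tmp[cut + 1:]       # keep only what follows it
        else:
            pieces.append("")         # no values left: the variable stays empty
    return pieces


for value in ["6,8,13,10", "10,11,2", "7,1,14,3"]:
    print(value, "->", split_by_cutting(value))
```

Because the working copy shrinks after every step, the search never has to skip past earlier commas.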
Here I found a different solution for what I think is the same problem:
https://www.ibm.com/mysupport/s/question/0D50z00006PsP3tCAF/splitting-a-string-variable-divided-by-commas-into-new-single-variables?language=es
One line of code does the job:
spssinc trans result=var_1 to var_4 type=20/formula 're.split(", *", makeArr)'.
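
The formula string in that command is a Python expression, so its effect can be tried in plain Python. The sketch below shows what re.split returns for the sample values; padding the result to four entries is an assumption made here for illustration (split_values and n_vars are hypothetical names), not a statement about how SPSSINC TRANS treats short lists.

```python
import re


def split_values(make_arr, n_vars=4):
    # Split on a comma followed by any number of spaces.
    parts = re.split(", *", make_arr.strip())
    # Assumption: pad with empty strings so every row yields n_vars entries.
    parts += [""] * (n_vars - len(parts))
    return parts[:n_vars]


for value in ["6,8,13,10", "10, 11,2", "7,1,14,3"]:
    print(value, "->", split_values(value))
```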

How to modify my code in order to list all the tweets that contain the word "grammy"

I wrote a Python program to list all the tweets in an Excel file that contain the word "grammy", but it didn't work, and I don't know how to fix it.
In this case, line contains a list of strings. You can iterate over each cell value in line and search for your keyword as follows:
for line in csvdata:
    for cell in line:
        if 'grammy' in cell:
            print('this line contains my word:', line)
The problem with your approach is that you initialize the counter variable i and then increment it, but you never reset its value to zero. If you want to use your approach, you'd need to reset the i counter variable to 0 for each line:
for line in csvdata:
    i = 0
    t = t + 1
    i = i + 1
    # ... [ rest of your code ]
If you don't reset i for each line, then the first line will increase i by 1, and all subsequent lines will just keep increasing the value of i. Eventually this will cause the index error that you're wisely catching!
Another problem with this approach is each line only checks a single value for i. So for each row, you're only checking a single cell's value. One way around this is to use the for loop approach I list above (for cell in line). Another way would be to use a while loop, but if you're new to programming I'd start by mastering the for loop before you move on to while loops.
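
Here is a self-contained sketch of the for-loop approach that can be run without your file. It assumes the data is available as CSV text, and it lowercases each cell so that "Grammy" also matches; that case-insensitive comparison, the function name rows_containing, and the sample data are my own additions for illustration.

```python
import csv
import io


def rows_containing(csv_text, keyword="grammy"):
    """Return every CSV row in which at least one cell contains keyword."""
    matches = []
    for line in csv.reader(io.StringIO(csv_text)):
        # any() stops at the first matching cell, so each row is kept once.
        if any(keyword in cell.lower() for cell in line):
            matches.append(line)
    return matches


sample = "id,tweet\n1,Watching the Grammy awards tonight\n2,Nothing to see here\n"
for row in rows_containing(sample):
    print("this line contains my word:", row)
```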
I hope this helps!

SQLite - Left-pad zeros in returned Text field

I have a text field in my SQLite database that stores a Time value, but for unrelated reasons I can't change the data type to TIME.
The values are stored in HH:MM format, and I'm having trouble trying to sort results by time because the values below '10:00' are missing a leading zero. I would prefer not to store the data with leading zero for the same unrelated reasons.
I'd like to add something to the Query that would pad the missing character if necessary, causing the results to read '08:30' when collected. I've been searching through the command and function lexicon though and I'm not finding what I need.
Is there a simple way to do this inside a query?
Thanks
I think this would work:
select your_col, case when length(your_col) < 5
then '0' || your_col else your_col end from your_table
Demo using Python
>>> conn.execute('''select c, case when length(c) < 5
then '0' || c else c end from t''').fetchall()
[(u'10:00', u'10:00'), (u'8:00', u'08:00')]
SELECT REPLACE(PRINTF('%5s', your_col), ' ', '0') FROM your_table
The PRINTF call left-pads the value with spaces until it is 5 characters long, and the REPLACE call replaces those spaces with zeros.
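
A small way to try the PRINTF/REPLACE query is with Python's built-in sqlite3 module and an in-memory database. The table name t, the column name c, and the helper padded_times are placeholders chosen for this sketch; the ORDER BY on the padded value is added here because sorting was the original goal.

```python
import sqlite3


def padded_times(values):
    """Return the stored times left-padded to HH:MM and sorted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (c TEXT)")
    conn.executemany("INSERT INTO t (c) VALUES (?)", [(v,) for v in values])
    rows = conn.execute(
        "SELECT REPLACE(PRINTF('%5s', c), ' ', '0') AS padded "
        "FROM t ORDER BY padded"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(padded_times(["10:00", "8:30", "9:05"]))
```

The stored values stay untouched; only the query result carries the leading zero.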

Count specified character from field(s) in excel

I have an excel sheet that recorded data like following:
_____|__A__|__B__
__1__|__x__|_____
__2__|__x__|_____
__3__|__y__|_____
__4__|__x__|_____
__5__|__x__|_____
__6__|__y__|_____
__7__|__x__|_____
__8__|__ __|_____
__9__|__x__|_____
_10__|__y__|_____
_11__|__ __|_____
_12__|__x__|_____
I would like to count all cells containing 'y' or ' ' from A1 to A12. Here's what I have so far:
=COUNTIF(A1:A12, "y") + COUNTIF(A1:A12, "")
It will become longer if I need to count more characters...
Can you suggest a better way?
You can use this shorthand to achieve your result
=SUM(COUNTIF(A1:A12,{"y",""}))
This is exactly the same as
=COUNTIF(A1:A12, "y") + COUNTIF(A1:A12, "")
The shorthand allows you to easily add more characters you want to count.
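
To show what the shorthand computes, here is a rough Python analogue: one count per criterion, then the sum of those counts. The list column_a mirrors A1:A12 from the question with empty strings for the blank cells, and count_any is a hypothetical helper. Unlike COUNTIF, list.count compares case-sensitively.

```python
def count_any(cells, criteria):
    # One count per criterion, then summed, like SUM(COUNTIF(range, {...})).
    return sum(cells.count(criterion) for criterion in criteria)


column_a = ["x", "x", "y", "x", "x", "y", "x", "", "x", "y", "", "x"]
print(count_any(column_a, ["y", ""]))
```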
You can read more about this here:
https://excelxor.com/2014/09/28/countifs-multiple-or-criteria-for-one-or-two-criteria_ranges/

How can I perform a reverse string search in Excel without using VBA?

I have an Excel spreadsheet containing a list of strings. Each string is made up of several words, but the number of words in each string is different.
Using built in Excel functions (no VBA), is there a way to isolate the last word in each string?
Examples:
Are you classified as human? -> human?
Negative, I am a meat popsicle -> popsicle
Aziz! Light! -> Light!
This one is tested and does work (based on Brad's original post):
=RIGHT(A1,LEN(A1)-FIND("|",SUBSTITUTE(A1," ","|",
LEN(A1)-LEN(SUBSTITUTE(A1," ","")))))
If your original strings could contain a pipe "|" character, then replace both in the above with some other character that won't appear in your source. (I suspect Brad's original was broken because an unprintable character was removed in the translation).
Bonus: How it works (from right to left):
LEN(A1)-LEN(SUBSTITUTE(A1," ","")) – Count of spaces in the original string
SUBSTITUTE(A1," ","|", ... ) – Replaces just the final space with a |
FIND("|", ... ) – Finds the absolute position of that replaced | (that was the final space)
RIGHT(A1,LEN(A1) - ... ) – Returns all characters after that |
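
The four steps can be mirrored in Python to check the reasoning. This is a sketch of the logic rather than a translation of Excel's functions: rpartition stands in for SUBSTITUTE with an instance number, last_word_marked is a hypothetical name, and returning the text unchanged when it has no spaces is an assumption (the formula as written would return an error in that case).

```python
def last_word_marked(text, marker="|"):
    # LEN(A1)-LEN(SUBSTITUTE(A1," ","")): number of spaces in the text
    space_count = len(text) - len(text.replace(" ", ""))
    if space_count == 0:
        return text  # assumption: without spaces the whole text is the word
    # SUBSTITUTE(A1," ","|",n): swap only the final space for the marker
    head, _, tail = text.rpartition(" ")
    marked = head + marker + tail
    # FIND("|",...): 1-based position of the marker, as Excel counts it
    position = marked.index(marker) + 1
    # RIGHT(A1,LEN(A1)-position): everything after that position
    return text[position:]


for sample in ["Are you classified as human?", "Aziz! Light!"]:
    print(sample, "->", last_word_marked(sample))
```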
EDIT: to account for the case where the source text contains no spaces, add the following to the beginning of the formula:
=IF(ISERROR(FIND(" ",A1)),A1, ... )
making the entire formula now:
=IF(ISERROR(FIND(" ",A1)),A1, RIGHT(A1,LEN(A1) - FIND("|",
SUBSTITUTE(A1," ","|",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))))
Or you can use the =IF(COUNTIF(A1,"* *") syntax of the other version.
If the original string might end with a space, apply TRIM when counting the spaces, which makes the formula:
=IF(ISERROR(FIND(" ",B2)),B2, RIGHT(B2,LEN(B2) - FIND("|",
SUBSTITUTE(B2," ","|",LEN(TRIM(B2))-LEN(SUBSTITUTE(B2," ",""))))))
This is the technique I've used with great success:
=TRIM(RIGHT(SUBSTITUTE(A1, " ", REPT(" ", 100)), 100))
To get the first word in a string, just change from RIGHT to LEFT
=TRIM(LEFT(SUBSTITUTE(A1, " ", REPT(" ", 100)), 100))
Also, replace A1 by the cell holding the text.
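
The padding trick is easy to verify outside Excel. In this Python sketch (last_word_padded and width are hypothetical names), every space is blown up to 100 spaces, the rightmost 100 characters are taken, and the surrounding spaces are stripped. As with the formula, it assumes the last word is no longer than 100 characters.

```python
def last_word_padded(text, width=100):
    # SUBSTITUTE(A1, " ", REPT(" ", 100)): widen every space
    padded = text.replace(" ", " " * width)
    # TRIM(RIGHT(..., 100)): the last 100 characters hold only the final word
    return padded[-width:].strip()


for sample in ["Negative, I am a meat popsicle", "Aziz! Light!", "word"]:
    print(sample, "->", last_word_padded(sample))
```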
A more robust version of Jerry's answer:
=TRIM(RIGHT(SUBSTITUTE(TRIM(A1), " ", REPT(" ", LEN(TRIM(A1)))), LEN(TRIM(A1))))
That works regardless of the length of the string, leading or trailing spaces, or whatever else and it's still pretty short and simple.
I found this on Google, tested it in Excel 2003, and it works for me:
=IF(COUNTIF(A1,"* *"),RIGHT(A1,LEN(A1)-LOOKUP(LEN(A1),FIND(" ",A1,ROW(INDEX($A:$A,1,1):INDEX($A:$A,LEN(A1),1))))),A1)
[edit] I don't have enough rep to comment, so this seems the best place...BradC's answer also doesn't work with trailing spaces or empty cells...
[2nd edit] actually, it doesn't work for single words either...
=RIGHT(TRIM(A1),LEN(TRIM(A1))-FIND(CHAR(7),SUBSTITUTE(" "&TRIM(A1)," ",CHAR(7),
LEN(TRIM(A1))-LEN(SUBSTITUTE(" "&TRIM(A1)," ",""))+1))+1)
This is very robust--it works for sentences with no spaces, leading/trailing spaces, multiple spaces, multiple leading/trailing spaces... and I used char(7) for the delimiter rather than the vertical bar "|" just in case that is a desired text item.
This is very clean and compact, and works well.
{=RIGHT(A1,LEN(A1)-MAX(IF(MID(A1,ROW(1:999),1)=" ",ROW(1:999),0)))}
It does not error trap for no spaces or one word, but that's easy to add.
Edit:
This handles trailing spaces, single word, and empty cell scenarios. I have not found a way to break it.
{=RIGHT(TRIM(A1),LEN(TRIM(A1))-MAX(IF(MID(TRIM(A1),ROW($1:$999),1)=" ",ROW($1:$999),0)))}
=RIGHT(A1,LEN(A1)-FIND("`*`",SUBSTITUTE(A1," ","`*`",LEN(A1)-LEN(SUBSTITUTE(A1," ","")))))
New answer 9/28/2022
With the new Excel function TEXTAFTER (check availability in your version), you can achieve this with a simple formula:
=TEXTAFTER(A1," ", -1)
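
For readers who want to check the behavior outside Excel, a rough Python counterpart of "text after the last delimiter" is str.rpartition. The helper name text_after_last is hypothetical; raising an error when the delimiter is missing is meant to parallel TEXTAFTER returning #N/A by default in that case.

```python
def text_after_last(text, delimiter=" "):
    head, sep, tail = text.rpartition(delimiter)
    if not sep:
        # No delimiter present: signal it instead of guessing a result.
        raise ValueError("delimiter not found")
    return tail


print(text_after_last("Are you classified as human?"))
```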
To add to Jerry and Joe's answers, if you're wanting to find the text BEFORE the last word you can use:
=TRIM(LEFT(SUBSTITUTE(TRIM(A1), " ", REPT(" ", LEN(TRIM(A1)))), LEN(SUBSTITUTE(TRIM(A1), " ", REPT(" ", LEN(TRIM(A1)))))-LEN(TRIM(A1))))
With 'My little cat' in A1, this would result in 'My little' (where Joe and Jerry's would give 'cat').
In the same way that Jerry and Joe isolate the last word, this then just gets everything to the left of that (then trims it back)
Copy the strings into a column, select that column, and go to HOME > Editing > Find & Select > Replace. Enter an asterisk followed by a space as the search text and click Replace All.
Imagine the string could be reversed. Then it is really easy. Instead of working on the string:
"My little cat" (1)
you work with
"tac elttil yM" (2)
With =LEFT(A1;FIND(" ";A1)-1) in A2 you get "My" with (1) and "tac" with (2), which is reversed "cat", the last word in (1).
There are a few VBAs around to reverse a string. I prefer the public VBA function ReverseString.
Install the above as described. Then with your string in A1, e.g., "My little cat" and this function in A2:
=ReverseString(LEFT(ReverseString(A1);IF(ISERROR(FIND(" ";A1));
LEN(A1);(FIND(" ";ReverseString(A1))-1))))
you'll see "cat" in A2.
The method above assumes that words are separated by blanks. The IF clause handles cells containing a single word, i.e. no blanks in the cell. Note: applying TRIM and CLEAN to the original string is useful as well. In principle, the formula reverses the whole string from A1 and finds the first blank in the reversed string, which follows the last (reversed) word (i.e., "tac "). LEFT picks this word, and another string reversal restores the original order of its characters ("cat"). The -1 at the end of the FIND statement removes the blank.
The idea is that it is easy to extract the first(!) word in a string with LEFT and FINDing the first blank. However, for the last(!) word the RIGHT function is the wrong choice when you try to do that because unfortunately FIND does not have a flag for the direction you want to analyse your string.
Therefore the whole string is simply reversed. LEFT and FIND work as normal, but the extracted string is reversed. But this is no big deal once you know how to reverse a string. The first ReverseString statement in the formula does this job.
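
The reverse, find, reverse idea can be sketched in a few lines of Python, where slicing with [::-1] plays the role of ReverseString. The function name last_word_reversed is hypothetical, and the single-word branch corresponds to the IF clause in the formula.

```python
def last_word_reversed(text):
    reversed_text = text[::-1]         # "My little cat" -> "tac elttil yM"
    cut = reversed_text.find(" ")      # first blank in the reversed string
    if cut == -1:                      # single word: keep everything
        cut = len(reversed_text)
    return reversed_text[:cut][::-1]   # reverse the extracted piece back


print(last_word_reversed("My little cat"))
```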
=LEFT(A1,FIND(IF(
ISERROR(
FIND("_",A1)
),A1,RIGHT(A1,
LEN(A1)-FIND("~",
SUBSTITUTE(A1,"_","~",
LEN(A1)-LEN(SUBSTITUTE(A1,"_",""))
)
)
)
),A1,1)-2)
I translated to PT-BR, as I needed this as well.
(Please note that I've changed the space to \ because I needed the filename only of path strings.)
=SE(ÉERRO(PROCURAR("\",A1)),A1,DIREITA(A1,NÚM.CARACT(A1)-PROCURAR("|", SUBSTITUIR(A1,"\","|",NÚM.CARACT(A1)-NÚM.CARACT(SUBSTITUIR(A1,"\",""))))))
Another way to achieve this is as below
=IF(ISERROR(TRIM(MID(TRIM(D14),SEARCH("|",SUBSTITUTE(TRIM(D14)," ","|",LEN(TRIM(D14))-LEN(SUBSTITUTE(TRIM(D14)," ","")))),LEN(TRIM(D14))))),TRIM(D14),TRIM(MID(TRIM(D14),SEARCH("|",SUBSTITUTE(TRIM(D14)," ","|",LEN(TRIM(D14))-LEN(SUBSTITUTE(TRIM(D14)," ","")))),LEN(TRIM(D14)))))
You can achieve this also by reversing the string and finding the first space
=MID(C3,2+LEN(C3)-SEARCH(" ",CONCAT(MID(C3,SEQUENCE(LEN(C3),,LEN(C3),-1),1))),LEN(C3))
Reverse the string
CONCAT(MID(C3,SEQUENCE(LEN(C3),,LEN(C3),-1),1))
Find the first space in the reversed string
SEARCH(" ",...
Take the position of the space found in the reversed string off the length of the string and return that portion
=MID(C3,2+LEN(C3)-SEARCH...
I also had a task like this, and when I was done using the above method, a new method occurred to me. Why not do this:
Reverse the string ("string one" becomes "eno gnirts").
Use the good old Find (which is hardcoded for left-to-right).
Reverse it into readable string again.
How does this sound?
