I have a few thousand rows in an Excel file with a line of text in each cell. In this line of text, there is sometimes a word that starts with the character "&". I would like to avoid using VBA.
If the words that start with "&" were always the same length, I would use "LEFT" or "RIGHT". What Excel function would you advise me to use to extract these words?
Other question: If I have two words that start with "&" in the same cell, is there any way to have two different functions, in two other cells, one looking for the first one starting from the beginning, the other one looking for the last one starting from the end?
Thanks.
Regarding your first question. Say in A1 there is your first string. Put in B1 this formula:
=IF(LEFT(SUBSTITUTE(A1," &"," "),1)="&",MID(SUBSTITUTE(A1," &"," "),2,10000),SUBSTITUTE(A1," &"," "))
Then drag down (copy the formula down) for cells A2, A3, etc.
This takes care of every word that is preceded by a space, as well as the first word in the cell. Watch out for special cases (punctuation, etc.) such as "bla bla,&Word".
LEFT and RIGHT are still great functions to use.
Say the word under inspection is K8.
You can test whether the first character is "&" with =IF(LEFT(K8,1)="&",TRUE,FALSE).
You may obtain all the characters but the first by using =RIGHT(K8,LEN(K8)-1).
Of course, you may replace the TRUE in the first statement with the RIGHT... of the second statement; I have broken them out for clarity.
Try the InStr function (VBA) to find the first occurrence of &:
InStr(string, "&") 'returns the position of the first occurrence of &
Then, if you need to find another occurrence:
InStr(n, string, "&") 'returns the first occurrence at or after position n, which can be 1 + the result of the prior line
InStrRev(string, "&") will find the last occurrence.
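To check the first/last logic outside Excel, here is a minimal Python sketch of the same idea. The function name amp_words is hypothetical, and the sketch assumes a word runs from the "&" up to the next space or the end of the text.

```python
def amp_words(text):
    """Return (first, last) words starting with '&', or ('', '') if there are none."""
    first_pos = text.find("&")    # like InStr(text, "&"), but 0-based; -1 if absent
    if first_pos == -1:
        return "", ""
    last_pos = text.rfind("&")    # like InStrRev(text, "&")

    def word_at(pos):
        # Assumption: the word ends at the next space (or at the end of the text).
        end = text.find(" ", pos)
        return text[pos:] if end == -1 else text[pos:end]

    return word_at(first_pos), word_at(last_pos)
```

For example, amp_words("buy &apples and &pears now") gives ("&apples", "&pears"). A cell with a single "&" word returns that word in both positions.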
I have a document in google sheets and the column consists of the name and version, like NLog.Config.4.3.0, NLog.Config.4.4.9 and so on.
See the image below for other examples.
I need to divide this into two columns, name and version, but I'm not familiar enough with regular expressions to extract this info.
I can use excel and then import it to the Google doc, it doesn't matter for me how to do that.
You can try something like this:
Suppose you have your string in A1, then in B1 you can enter this:
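=LEFT(A1,MIN(SEARCH({0,1,2,3,4,5,6,7,8,9},A1&"0123456789"))-2)
and in C1 this:
=RIGHT(A1,LEN(A1)-MIN(SEARCH({0,1,2,3,4,5,6,7,8,9},A1&"0123456789"))+1)
Here MIN(...) is the position of the first digit, so subtracting 2 drops that digit and the period before it. You may need some adjustments for cells that do not fit the name.version pattern. A cell that starts with a digit makes the LEFT length negative and produces an error, which you can trap with IFERROR as shown below. A cell with no digits at all does not raise an error; LEFT then drops its last character, so check for that case if it can occur.
=IFERROR(LEFT(A1,MIN(SEARCH({0,1,2,3,4,5,6,7,8,9},A1&"0123456789"))-2),A1)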
Note: appending "0123456789" to A1 is a trick that keeps SEARCH from returning an error. SEARCH looks for every digit in the array, so each one must be found somewhere. We only need the position of the first digit, hence the MIN().
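The first-digit idea behind these formulas can be sketched in Python as a quick check. The name split_at_first_digit is hypothetical, and the sketch assumes the version starts at the first digit in the text, so a package name that itself contains a digit would be split too early.

```python
def split_at_first_digit(text):
    """Split text into (name, version) at the first digit."""
    # Position of the first digit, or len(text) if there is none. This is the
    # role the appended "0123456789" plays in the worksheet formula.
    pos = next((i for i, ch in enumerate(text) if ch.isdigit()), len(text))
    name = text[:pos].rstrip(".")   # drop the separator period, if present
    version = text[pos:]
    return name, version
```

With "NLog.Config.4.3.0" this returns ("NLog.Config", "4.3.0"). Text with no digits comes back unchanged as the name, with an empty version.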
Supposing that your raw data were in A2:A, place this in B2:
=ArrayFormula(IFERROR(REGEXEXTRACT(A2:A,"(\D+)\.(.+)"),A2:A))
The regular expression reads "Extract any number of non-digits up to but not including a period as group one, and everything remaining into group two." (In other words, "As soon as you run into a digit after a period, start group two.")
The IFERROR clause means, "If this pattern can't be found, just return the original cell data."
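To experiment with the pattern outside Sheets, here is a small sketch using Python's re module. It assumes Python's engine treats this particular pattern the same way Sheets does, and split_name_version is a hypothetical name.

```python
import re

# Same pattern as in the REGEXEXTRACT call above.
PATTERN = re.compile(r"(\D+)\.(.+)")

def split_name_version(text):
    """Return (name, version), or (text,) when the pattern does not match."""
    match = PATTERN.search(text)    # search, not fullmatch: a partial match is enough
    return match.groups() if match else (text,)
```

split_name_version("NLog.Config.4.3.0") returns ("NLog.Config", "4.3.0"). In this sketch, a value with a period but no digits, such as "NLog.Config", is still split at its last period, so the pattern relies on every value ending in a version number.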
Assuming your content is in column A (Google Sheets), try this arrayformula in any cell other than column A:
=arrayformula(iferror(split(REGEXREPLACE($A:$A,"(\.)(\d+.+$)",char(6655)&"$2"),char(6655)),))
There are two regex groups denoted in ():
(\.) and (\d+.+$).
The first group looks for a literal dot; it is escaped as \. because an unescaped . matches any character. The second group looks for one or more digits (\d+), followed by one or more of any character (.+), up to the end of the string ($).
The replacement is char(6655) (wouldn't usually be found in your dataset), and the contents of group two $2.
Then the split function divides the text into two columns by the char(6655) character.
iferror returns nothing if nothing is split.
The arrayformula works down the sheet.
I'm trying to achieve the below:
I would like cell AD3 to pull in the last 2 lines of text from cell AC3, which will be variable and often changing. The text in cell AC3 is all separated by line breaks.
In case you're wondering, currently I just have the values typed into cell AD3 for demonstration of my goal.
Thank you!
Let's break this down. For ease of writing, I use A1.
You first want to know how many rows are in the cell.
=LEN(A1)-LEN(SUBSTITUTE(A1,CHAR(10),""))
This formula returns the number of CHAR(10) characters, which is the character used for a line break in a cell. If the cell has 5 rows, there will be 4 line break characters. To return the last 2 rows, you want everything after the second-to-last line break. Its instance number is the count minus 1.
=LEN(A1)-LEN(SUBSTITUTE(A1,CHAR(10),""))-1
Replace the second-to-last line break with a character that you know will not otherwise appear in your cell, for example the character with code 160, which is a non-breaking space.
=SUBSTITUTE(A1,CHAR(10),CHAR(160),LEN(A1)-LEN(SUBSTITUTE(A1,CHAR(10),""))-1)
Next, you want to find the position of the character 160
=FIND(CHAR(160),SUBSTITUTE(A1,CHAR(10),CHAR(160),LEN(A1)-LEN(SUBSTITUTE(A1,CHAR(10),""))-1))
Now that you know the position of that character, you can use MID() to return the text after that character (add 1 to the position of that character). Assume that the last two rows of text in A1 are never more than 99 characters, use that for how many characters you want to return. Or use your favourite big number that will do that.
=MID(A1,FIND(CHAR(160),SUBSTITUTE(A1,CHAR(10),CHAR(160),LEN(A1)-LEN(SUBSTITUTE(A1,CHAR(10),""))-1))+1,99)
Remember to format the cell with the formula to wrap!
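The steps above can be traced in a short Python sketch. The name last_two_lines is hypothetical, and the early return for cells with two lines or fewer is an assumption added here, not something the formula does.

```python
def last_two_lines(text):
    """Return the last two lines of a multi-line string."""
    breaks = text.count("\n")     # LEN(A1)-LEN(SUBSTITUTE(A1,CHAR(10),""))
    if breaks < 2:
        return text               # assumption: nothing to cut with two lines or fewer
    # Walk forward to the second-to-last line break (instance number breaks - 1).
    pos = -1
    for _ in range(breaks - 1):
        pos = text.find("\n", pos + 1)
    return text[pos + 1:]         # everything after it, like MID(..., pos + 1, 99)
```

For "a\nb\nc\nd\ne" there are 4 breaks, the third one is at index 5, and the result is "d\ne".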
Many ways to do this. Here's one that is easily adaptable to returning other lines.
If you have the functions available in your version of Excel, you can use FILTERXML and TEXTJOIN
=TEXTJOIN(CHAR(10),,FILTERXML("<t><s>"&SUBSTITUTE(A1,CHAR(10),"</s><s>")&"</s></t>","//s[position()>last()-2]"))
"<t><s>"&SUBSTITUTE(A1,CHAR(10),"</s><s>")&"</s></t>" creates an XML where the line breaks divide the nodes
the xpath argument "//s[position()>last()-2]" returns the last two nodes. This can be modified in many ways to return other nodes
TEXTJOIN then joins those nodes with a linebreak.
Return the last 2 lines
In B1, enter formula :
=TRIM(RIGHT(SUBSTITUTE(A1,CHAR(10),REPT(" ",399)),799))
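A rough Python model of this padding trick shows what it returns. The name last_two_padded is hypothetical, and the whitespace collapsing is only an approximation of TRIM.

```python
def last_two_padded(text, pad=399):
    """Model of TRIM(RIGHT(SUBSTITUTE(text, CHAR(10), REPT(" ", pad)), 2*pad+1))."""
    padded = text.replace("\n", " " * pad)   # each line break becomes a wide gap
    tail = padded[-(2 * pad + 1):]           # rightmost 799 characters when pad is 399
    # TRIM strips the ends and collapses runs of spaces to a single space.
    return " ".join(tail.split())
```

Note that the two lines come back joined by a single space rather than a line break, and the trick only holds while the last two lines together are shorter than the padding width.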
Let's say I have the following value in a cell "test1_test2_test3_test4_test5". In another cell it could be "test1_test2_test3" or even "test 1_t est2".
What I would like is a 'general' function where I can specify which part to return: e.g. all characters before the first underscore, between the first and second underscore, etc., and all the characters after the last underscore. And if nothing is found, it should return an empty string rather than an error.
Thus far I've googled a working format for a maximum of 2 underscores (a different formula for each part):
For locating and displaying the characters before the first underscore: =LEFT(D32; SEARCH("_";D32;1)-1)
For locating the characters after the first and before the second underscore: =MID(D32;SEARCH("_";D32;1)+1;SEARCH("_";D32;SEARCH("_";D32;1)+1)-(SEARCH("_";D32;1))-1)
For locating the characters after the second underscore (not limited by whether another one follows): =RIGHT(D32;LEN(D32)-SEARCH("_";D32;SEARCH("_";D32;1)+1))
Ps: because my native (excel) language is Dutch, I've done my best to translate my working Excel functions to the English syntax.
With data in A1, in B1 enter:
=TRIM(MID(SUBSTITUTE($A1,"_",REPT(" ",999)),COLUMNS($A:A)*999-998,999))
and copy across.
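The behaviour the formula aims for, returning the n-th piece or nothing when there is no such piece, can be sketched in Python. The name field is hypothetical, and n is 1-based like the column count in the formula.

```python
def field(text, n, sep="_"):
    """Return the n-th (1-based) piece of text split on sep, or '' if it is missing."""
    parts = text.split(sep)
    return parts[n - 1] if 1 <= n <= len(parts) else ""
```

field("test1_test2_test3", 2) returns "test2", while field("test 1_t est2", 3) returns an empty string instead of an error.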
I suggest Text to Columns with underscore as the delimiter, count how many pieces result (COUNTA) and then pick to suit accordingly. Use IF to return blank ("") if say you want the text after the second underscore and the count is 1.
I need to replace a character, "-", with ":" in all of our product names. It needs to work for the entire column A, not just a single cell, AND there are multiple "-"'s in some of the products, but I just need the first in each product. For example, we have product Spuhr-SP-3016 and product AI-0497. I need to replace only the first - in the Spuhr one and the - in the AI one. And again, this will need to be for thousands of products, not just a single cell but a range (A2:A3000, or all of column A if range doesn't work). Is there a formula using either Replace or Substitute? All instructions I found demonstrate for one cell only instead of a range.
You can achieve this by using the FIND function in combination with LEFT and RIGHT.
FIND returns the position of the first "-". LEFT takes that position minus 1 characters, so the "-" itself is excluded. RIGHT then takes LEN minus the same FIND position (without the -1) to get the rest of the string. Between the two, concatenate the ":".
=LEFT(A1,FIND("-",A1)-1)&":"&RIGHT(A1,LEN(A1)-FIND("-",A1))
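To check the expected results for a list of product names, here is a minimal Python sketch of "replace only the first dash". The name replace_first_dash is hypothetical, and returning names without a dash unchanged is a choice made in this sketch.

```python
def replace_first_dash(name):
    """Replace only the first '-' in name with ':'; later dashes are kept."""
    return name.replace("-", ":", 1)    # the count argument limits it to one replacement

# Apply it to a whole list, the way the formula is copied down a column.
products = ["Spuhr-SP-3016", "AI-0497"]
converted = [replace_first_dash(p) for p in products]
```

Here converted is ["Spuhr:SP-3016", "AI:0497"].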
I wish to split a cell value
The cell value will be in the form of ###x###
Where # is a 0-9 number.
Also, the number of places can vary so it could be ###x#### or ####x####
This makes me unable to use LEFT or MID function
The common thing is the two numbers I want to extract separately are joined by 'x'
I tried using search function but could not find an answer.
How can I do this? With or without VBA? Thanks!
Let's assume your string is in cell A1.
You may use either FIND or SEARCH function to find the position of the 'x' character.
=FIND("x", A1)
You get the first number by taking the characters from the left:
=LEFT(A1, FIND("x", A1) - 1)
You get the second number by taking the characters after the 'x' through to the end:
=MID(A1, FIND("x", A1)+1, 999)
(Of course, the parameter separator depends on your regional settings, so you may need to replace all the commas with semicolons.)
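The same split can be sketched in Python to check the logic. The name split_dimensions is hypothetical, and the sketch assumes a lowercase "x" with digits on both sides.

```python
def split_dimensions(text):
    """Split a string such as '640x480' into two integers at the first 'x'."""
    left, sep, right = text.partition("x")    # like LEFT(...) and MID(...) around FIND("x", A1)
    if not sep:
        raise ValueError(f"no 'x' separator in {text!r}")
    return int(left), int(right)
```

split_dimensions("1024x768") returns (1024, 768), however many digits sit on each side.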