Excel: extract only substrings beginning with a certain character - excel

I have a column of cells containing a variable amount of text up to 140 characters in length. What I would like to do is write a function that will parse these strings for only words beginning with "#" and organize them in a single adjacent cell separated by spaces.
These substrings vary in length and in their position within the string. And there might be more than one substring beginning with "#" in the cell to pull.
I have tried it in many different ways which have been unsuccessful. Thanks in advance for your advice!
Here is a way that seems to work, but is probably not the "correct" way:
The objective was to parse column C, containing tweets, for all the "mentions" (strings beginning with #) and put them in an adjacent cell in column D.
I took all the content from column C and pasted it into a new sheet. Then I did text-to-column so that the tweet was put into cells word by word. On these cells I used the function (dragged out) =IF(ISNUMBER(FIND("#",B3)),CONCATENATE(B3," "),"") in order to separate the twitter handles only into columns below. I think there needs to be a space added in case there are multiple handles to join.
Then I used another nested CONCATENATE function, =CONCATENATE(IF(B34="","",B34),IF(B35="","",B35)...,IF(B65="","",B65)), to put the handles, now followed by spaces, together in a single cell. It had to be written this way as a workaround for the #VALUE! error that the CONCATENATE function gave for blank cells.
Then I selected the whole row, copied and transposed it into a column. Then selected the column, pasted values only into my original sheet in column D. The handles all line up with the corresponding tweet.
I would love to learn how to do this in the proper way.

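Dim regEx As Object
Dim strPattern As String: strPattern = "^#" ' matches only when the text starts with "#"
Set regEx = CreateObject("VBScript.RegExp")
regEx.Pattern = strPattern
' valueOfCellToCheck is a placeholder for the text being tested
If regEx.Test(valueOfCellToCheck) Then
    ' do your logic here
Else
    ' skip cell
End If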
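For the original goal (collecting every word that starts with "#" into one adjacent cell), a small user-defined function may be simpler than testing cell by cell. The following is only a sketch: the name ExtractTagged and its parameters are made up for illustration, and it assumes the words in the cell are separated by single spaces.

```vba
' Hypothetical helper (not from the answers above): returns every
' space-separated word in txt that starts with prefix, joined by spaces.
Function ExtractTagged(ByVal txt As String, Optional ByVal prefix As String = "#") As String
    Dim words() As String
    Dim result As String
    Dim i As Long

    words = Split(txt, " ")
    For i = LBound(words) To UBound(words)
        ' Keep the word only if it is longer than the prefix and starts with it
        If Len(words(i)) > Len(prefix) And Left$(words(i), Len(prefix)) = prefix Then
            result = result & IIf(Len(result) > 0, " ", "") & words(i)
        End If
    Next i
    ExtractTagged = result
End Function
```

With the tweets in column C, entering =ExtractTagged(C2) in D2 and filling down should put the matching words next to each tweet.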

Related

Populate cell with substring following last / in the string

I have a CSV in which column A is populated with strings, such as:
ABCDE/FGHI/JKL/MNOPQR
I need to populate column C with everything after the last occurrence of the "/". In this example, it would have to be "MNOPQR".
Is there a function that could be used for this? "RIGHT" doesn't seem to do the trick. I don't know what the length of the substring will be in each row, so I definitely have to look for the "/".
If your text is in A4, put this in another cell:
=MID(A4,LEN(LEFT(A4,FIND(CHAR(1),SUBSTITUTE(A4,"/",CHAR(1),LEN(A4)-LEN(SUBSTITUTE(A4,"/",""))))))+1,LEN(A4)-LEN(LEFT(A4,FIND(CHAR(1),SUBSTITUTE(A4,"/",CHAR(1),LEN(A4)-LEN(SUBSTITUTE(A4,"/",""))))))+1)
I think that should work. Thanks to @Jerry for the main part, where it finds the last / in a string.
edit:
Per @ScottCraner, this is shorter: =MID(A1,SEARCH("}}}",SUBSTITUTE(A1,"/","}}}",LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))))+1,LEN(A1))
Here's a bit shorter formula to return the last delimited substring in a string.
=TRIM(RIGHT(SUBSTITUTE(A1,"/",REPT(" ",99)),99))
Replace each delimiter with 99 spaces, then return the rightmost 99 characters. Any leading characters in that result are spaces, so TRIM gets rid of them. This works as long as the last substring is shorter than 99 characters.
A simple formula approach using FilterXML might be:
=FILTERXML("<items><i>" & SUBSTITUTE(A1,"/","</i><i>") & "</i></items>","//i[position()=last()]")
Taking advantage of the dynamic array features of newer versions (Microsoft 365, Excel 2021+), you can change the cell address to a range input, e.g. A1:A10, so that the output lands in a so-called spill range.
VBA approach
As the VBA tag has been recently added to OP,
an obvious VBA approach would be to split the string input and get the last array element via Ubound():
Dim tmp As Variant
tmp = Split("ABCDE/FGHI/JKL/MNOPQR", "/")
Debug.Print tmp(UBound(tmp)) ' ~~> MNOPQR
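The same Split/UBound idea can be wrapped in a function so it can be called from a worksheet cell. This is a minimal sketch; the name LastSegment and the optional delimiter parameter are assumptions added for illustration.

```vba
' Hypothetical wrapper around the Split/UBound approach shown above.
Function LastSegment(ByVal txt As String, Optional ByVal delimiter As String = "/") As String
    Dim parts() As String

    parts = Split(txt, delimiter)
    ' Split returns an empty array (UBound = -1) for an empty string
    If UBound(parts) >= 0 Then
        LastSegment = parts(UBound(parts))
    End If
End Function
```

With the strings in column A, =LastSegment(A1) in C1 could then be filled down the column.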

Formula to extract 10 or 11 digit phone numbers from random complex test string

I am a newbie in the field of Excel formulas and need your help with a complex formula: I need to extract phone numbers from strings of random text. The strings do not have a fixed format.
Example set of strings:
Dring to data add9724516002
add 08107936777 to me pler
8000069633 plz add. Me
9000088106 mujhe bhi add karo dosto
I have already tried many formulas but nothing seems to work. The only fixed thing is the length of the number: it should be either 10 digits or 11 (including an initial 0).
You could use a RegExp via VBA (which seems to be coming to Excel as a formula option sometime down the track, according to UserVoice).
Code:
Function GetCode(strIn As String) As String
    Dim objRegex As Object
    Set objRegex = CreateObject("vbscript.regexp")
    With objRegex
        .Pattern = "\d{10,11}\b"
        If .test(strIn) Then
            GetCode = .Execute(strIn)(0)
        Else
            GetCode = "no match"
        End If
    End With
End Function
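One thing to be aware of with the pattern above: a longer run of digits (say 12 or more) would still yield its last 11 digits. If that matters, a stricter variant can require that the number is not part of a longer digit run. The sketch below is an assumption-laden illustration, not the answerer's code: the name GetPhone is made up, and like the original it relies on VBScript.RegExp, which is only available on Windows.

```vba
' Hypothetical stricter variant: match exactly 10 or 11 digits that are
' not preceded or followed by another digit.
Function GetPhone(ByVal strIn As String) As String
    Dim re As Object
    Dim matches As Object

    Set re = CreateObject("VBScript.RegExp")
    ' (^|\D)  start of text or a non-digit before the number
    ' (\d{10,11})  the number itself (second capture group)
    ' (?!\d)  no digit immediately after it
    re.Pattern = "(^|\D)(\d{10,11})(?!\d)"
    Set matches = re.Execute(strIn)
    If matches.Count > 0 Then
        GetPhone = matches(0).SubMatches(1)
    Else
        GetPhone = "no match"
    End If
End Function
```

It is used the same way as GetCode, e.g. =GetPhone(A1).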
If they all look like the strings you have provided above, you could use Text to Columns. Let's say all of those strings were in A1:A4.
Select all four cells
Data - "Text to Columns"
Delimited
Use "space" to separate values
Finish
You will now have a large majority of your phone numbers pulled out, and it will look something like this:
(I've added a header row above the data so that every column is its own set of data: Column 1, 2, 3, 4, 5 and 6. I've also added another column, Sort, in place of column A. This will be useful at a later stage.)
Next, select A1:G5.
Click "Insert - Table"
"My table has headers"
OK
Your range is now a table, meaning you can sort the data via ascending order. I'm assuming you have hundreds of strings that you're sorting through. When you sort via ascending order, all numbers will show up first.
In the pic below I've sorted the first column of actual data, and there are two phone numbers at the top I can pull out:
If you ever want to revert to your original order of data, sort the Sort column in ascending order.
I hope this is a good workaround to avoid VBA. You may not get all of the phone numbers, but probably a good chunk. You can also copy and paste columns C:G to the bottom of column A and sort everything at once if you only need all of the phone numbers.
If the strings that have numbers and letters attached are similar, you can also look into the RIGHT and LEFT formulas to pull out the numbers from the alphanumeric strings

Is it possible to count the number of unique characters within a cell?

Software used: Excel Mac 2011
I have a column of cells containing alphanumeric strings, and I'm looking to count the number of unique characters that appear in each cell. I'd like to have it function as pictured below:
Because of the data I'm working with, it doesn't matter whether spaces are included in the character count, and no distinction needs to be made between uppercase and lowercase characters.
Thanks for your help.
Try this:
=SUM(IF((LEN(G13)-LEN(SUBSTITUTE(UPPER(G13),{"A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z","1","2","3","4","5","6","7","8","9","0"},""))),1,0))
As you can see, it is quite long. It counts the English alphanumeric characters, treating upper and lower case as the same, but as helena4 pointed out, any other symbol must be added to the array in the formula or it will not be counted.
Obviously change the G13 reference to the cell with the desired text to count.
If you want a UDF use this:
Function Uniquecount(Rng As Range) As Integer
    Dim cUnique As Collection
    Dim i As Integer
    Set cUnique = New Collection
    ' Adding a duplicate key raises an error, which is deliberately ignored
    On Error Resume Next
    For i = 1 To Len(Rng)
        cUnique.Add CStr(Mid(Rng, i, 1)), CStr(Mid(Rng, i, 1))
    Next i
    On Error GoTo 0
    Uniquecount = cUnique.Count
End Function
Put it in a module attached to the workbook. Then call it like any other formula: =Uniquecount(G13)
This will count everything once, including spaces.
I have come up with a new, much cleaner version of the formula first put forward by Scott.
=SUM(IF(LEN(A2)-LEN(SUBSTITUTE(UPPER(A2),CHAR(ROW(INDIRECT("1:255"))),"")),1,0))
This will find all unique ASCII characters. The ROW(INDIRECT("1:255")) creates an array of numbers from 1 to 255, which I use CHAR to convert into the full character set without having to manually type all 255 characters.
If this is my only contribution to society, I can die happy.
This counts how many of the characters listed on another sheet, which I called Data, appear in the cell.
=SUMPRODUCT(--ISNUMBER(SEARCH(Data!$a$1:$a$26,A1)))
The best part is that you can easily add to or change the list of characters, as long as you keep the range correct.
=ROWS(UNIQUE(MID(A1,SEQUENCE(LEN(A1)),1)))
should work assuming you have access to these newer functions (i.e. UNIQUE and SEQUENCE)

Excel: Formula to extract a string of text delimited by markers from cells

I'm messing with a spreadsheet containing postal addresses that have been inserted in the cells' comments
Each comment contain an address composed of a variable number of lines (damn UK addresses, they can have up to 7 lines!) in the following format:
Line1,
Line2,
Line3,
[...],
State
With my poor skills, I've managed to extract the comment with a VBA script, obtaining the following string on a single cell:
Line1,Line2,Line3,[...],State
At this point each string between commas must be extracted to its own cell.
I've managed to extract the 1st 3 lines with the following formulas:
For Line1:
=LEFT(A8;(SEARCH(",";A8))-1)
For Line2:
=MID(A8; SEARCH(",";A8)+1; SEARCH(","; A8; SEARCH(","; A8)+1)-SEARCH(",";A8)-1)
For Line3:
=MID(A8; SEARCH(",";A8;SEARCH(",";A8;SEARCH(",";A8;SEARCH(",";A8)))+1)+1;SEARCH(","; A8; SEARCH(","; A8;SEARCH(",";A8)+1)+1)-SEARCH(",";A8;SEARCH(",";A8)+1)-1)
From this point I start to get overflow errors from my brain... I probably need some days of sleep.
Can anybody help me get to "line6", and finally suggest how to pull out the "State" line, which ends without a comma?
I thought I could pull out the "State" line with =RIGHT(",";SEARCH(",";A8)-1) but I'm obviously doing something wrong because that pulls out a comma instead of a string.
I guess I could do everything with a VBA script, but I'm not that skilled yet :(
With comma separated data in A1, in B1 enter:
=TRIM(MID(SUBSTITUTE($A1,",",REPT(" ",999)),COLUMNS($A:A)*999-998,999))
and copy across. For example:
Note:
Why not use Text to Columns?
The row of formulas re-calculates automatically if A1 changes.
The row of formulas will work even if A1, itself, contains a formula.
If you are wanting to do this programmatically instead of using a built-in, check out the split function for chopping up your comma separated string. It will split up your input string into an array. Then you can do whatever you like with the array.
Dim Names() As String
Dim i As Long
Names = Split(inputValue, ",") ' inputValue is your comma-separated string
For i = 0 To UBound(Names)
    ' do what you want with each piece, e.g. Names(i)
Next i
Gary's Student's answer is great for using the built-in functions.
If you want a VBA solution:
Sub splitString()
    Dim sourceRange As Range
    Dim stringArr() As String
    Dim i As Integer
    Set sourceRange = ActiveSheet.Range("A1")
    stringArr = Split(sourceRange.Value, ",")
    ' Write each piece into the cells to the right of the source cell
    For i = LBound(stringArr) To UBound(stringArr)
        sourceRange.Offset(0, i + 1).Value = stringArr(i)
    Next i
End Sub
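If you would rather keep a formula in each cell (so the results recalculate when A1 changes) but still use Split, a small user-defined function can return one field at a time. This is a rough sketch; the name NthField and its parameters are hypothetical.

```vba
' Hypothetical helper: returns the n-th (1-based) delimited field of txt,
' or an empty string if there is no such field.
Function NthField(ByVal txt As String, ByVal n As Long, Optional ByVal delimiter As String = ",") As String
    Dim parts() As String

    parts = Split(txt, delimiter)
    ' Split returns a zero-based array, so field n lives at index n - 1
    If n >= 1 And n - 1 <= UBound(parts) Then
        NthField = Trim$(parts(n - 1))
    End If
End Function
```

For example, =NthField($A1,COLUMNS($A:A)) entered in B1 and copied across should give Line1, Line2, and so on, with the last non-empty cell holding the "State" line.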
You could avoid using comments altogether: are you aware that users can add line breaks inside a cell by pressing ALT+RETURN?
If tall rows are a problem and you don't like that formatting, an alternative approach might be to write a simple bit of code that changes the height of the current row when a user clicks in a certain range, and perhaps makes the other rows less tall.
Just a thought. It has the benefit of keeping things simple.
Harvey.

How to merge rows in a column into one cell in excel?

E.g
A1:I
A2:am
A3:a
A4:boy
I want to merge them all to a single cell "Iamaboy"
This example merges 4 cells into 1, but I have many cells (more than 100), so I can't type them one by one using A1 & A2 & A3 & A4. What can I do?
If you prefer to do this without VBA, you can try the following:
Have your data in cells A1:A999 (or such)
Set cell B1 to "=A1"
Set cell B2 to "=B1&A2"
Copy cell B2 all the way down to B999 (e.g. by copying B2, selecting cells B3:B999 and pasting)
Cell B999 will now contain the concatenated text string you are looking for.
I present to you my ConcatenateRange VBA function (thanks Jean for the naming advice!). It will take a range of cells (any dimension, any direction, etc.) and merge them together into a single string. As an optional second parameter, you can add a separator (like a space or a comma).
In this case, you'd write this to use it:
=ConcatenateRange(A1:A4)
Function ConcatenateRange(ByVal cell_range As Range, _
                          Optional ByVal separator As String) As String
    Dim newString As String
    Dim cell As Variant
    For Each cell In cell_range
        If Len(cell) <> 0 Then
            newString = newString & (separator & cell)
        End If
    Next
    ' Strip the separator that was prepended to the first value
    If Len(newString) <> 0 Then
        newString = Right$(newString, (Len(newString) - Len(separator)))
    End If
    ConcatenateRange = newString
End Function
Inside CONCATENATE you can use TRANSPOSE if you expand it (F9) and then remove the surrounding {} brackets.
=CONCATENATE(TRANSPOSE(B2:B19))
Becomes
=CONCATENATE("Oh ","combining ", "a " ...)
You may need to add your own separator on the end, say create a column C and transpose that column.
=B1&" "
=B2&" "
=B3&" "
In simple cases you can use the following method, which doesn't require you to create a function or to copy code to several cells:
In any cell, enter the following formula
=Transpose(A1:A9)
Where A1:A9 are cells you would like to merge.
Without leaving the cell press F9
After that, the cell will contain the string:
={A1,A2,A3,A4,A5,A6,A7,A8,A9}
Source: http://www.get-digital-help.com/2011/02/09/concatenate-a-cell-range-without-vba-in-excel/
Update: One part can be ambiguous. "Without leaving the cell" means keeping the cell in edit mode. Alternatively, you can press F9 while in the formula bar (normally found above the spreadsheet).
Use VBA's already existing Join function. VBA functions aren't exposed in Excel, so I wrap Join in a user-defined function that exposes its functionality. The simplest form is:
Function JoinXL(arr As Variant, Optional delimiter As String = " ")
    'arr must be a one-dimensional array.
    JoinXL = Join(arr, delimiter)
End Function
Example usage:
=JoinXL(TRANSPOSE(A1:A4)," ")
entered as an array formula (using Ctrl-Shift-Enter).
Now, JoinXL accepts only one-dimensional arrays as input. In Excel, ranges return two-dimensional arrays. In the above example, TRANSPOSE converts the 4×1 two-dimensional array into a 4-element one-dimensional array (this is the documented behaviour of TRANSPOSE when it is fed with a single-column two-dimensional array).
For a horizontal range, you would have to do a double TRANSPOSE:
=JoinXL(TRANSPOSE(TRANSPOSE(A1:D1)))
The inner TRANSPOSE converts the 1×4 two-dimensional array into a 4×1 two-dimensional array, which the outer TRANSPOSE then converts into the expected 4-element one-dimensional array.
This usage of TRANSPOSE is a well-known way of converting 2D arrays into 1D arrays in Excel, but it looks terrible. A more elegant solution would be to hide this away in the JoinXL VBA function.
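One possible way to hide the TRANSPOSE step inside the function is sketched below. It is not the answerer's code: the name JoinRange is made up, and it assumes the input is either a Range or an array of values that can be converted to strings (error values are not handled).

```vba
' Hypothetical variant of JoinXL that accepts a Range or an array of any
' shape and flattens it itself, so no TRANSPOSE is needed in the formula.
Function JoinRange(ByVal source As Variant, Optional ByVal delimiter As String = " ") As String
    Dim data As Variant
    Dim item As Variant
    Dim parts() As String
    Dim n As Long

    ' A Range argument is converted to its values; arrays pass through as-is
    If IsObject(source) Then
        data = source.Value
    Else
        data = source
    End If

    ' A single cell yields a scalar rather than an array
    If Not IsArray(data) Then
        JoinRange = CStr(data)
        Exit Function
    End If

    ' For Each visits every element of a 1-D or 2-D array.
    ' Note: a multi-column block is visited column by column.
    ReDim parts(0 To 0)
    For Each item In data
        ReDim Preserve parts(0 To n)
        parts(n) = CStr(item)
        n = n + 1
    Next item
    JoinRange = Join(parts, delimiter)
End Function
```

With this, =JoinRange(A1:A4," ") and =JoinRange(A1:D1," ") should both work as ordinary formulas, without array entry.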
For those who have Excel 2016 (and I suppose next versions), there is now directly the CONCAT function, which will replace the CONCATENATE function.
So the correct way to do it in Excel 2016 is :
=CONCAT(A1:A4)
which will produce :
Iamaboy
For users of older versions of Excel, the other answers are relevant.
For Excel 2011 on Mac it's different. I did it as a three step process.
Create a column of values in column A.
In column B, to the right of the first cell, create a rule that uses the concatenate function on the column value and ",". For example, assuming A1 is the first row, the formula for B1 is =A1. For the next row to row N, the formula is =Concatenate(",",A2). You end up with:
QA
,Sekuli
,Testing
,Applitools
,Visual Testing
,Test Automation
,Selenium
In column C create a formula that concatenates all previous values. Because it is additive you will get all at the end. The formula for cell C1 is =B1. For all other rows to N, the formula is =Concatenate(C1,B2). And you get:
QA,Sekuli
QA,Sekuli,Testing
QA,Sekuli,Testing,Applitools
QA,Sekuli,Testing,Applitools,Visual Testing
QA,Sekuli,Testing,Applitools,Visual Testing,Test Automation
QA,Sekuli,Testing,Applitools,Visual Testing,Test Automation,Selenium
The last cell of the list will be what you want. This is compatible with Excel on Windows or Mac.
I use the CONCATENATE method to take the values of a column and wrap quotes around them, with commas in between, in order to quickly populate the WHERE IN () clause of a SQL statement.
I always just type =CONCATENATE("'",B2,"'",",") and then select that and drag it down, which creates =CONCATENATE("'",B3,"'",","), =CONCATENATE("'",B4,"'",","), etc. Then I highlight that whole column, copy it, paste it into a plain text editor and paste it back if needed, thus stripping the row separation. It works, but only as a one-time deal; this is not a good solution for someone who needs this all the time.
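For repeated use, the quote-and-comma step could be moved into a small function. The following is only a sketch under assumptions: the name SqlInList is invented, the values are treated as text, and embedded single quotes are doubled, which is the usual SQL escaping but may need adjusting for your database.

```vba
' Hypothetical helper: builds 'a','b','c' from a range (or array) of values,
' skipping blanks, for pasting into a SQL IN (...) clause.
Function SqlInList(ByVal items As Variant) As String
    Dim item As Variant
    Dim result As String

    For Each item In items
        If Len(item) <> 0 Then
            ' Double any embedded single quotes, then wrap the value in quotes
            result = result & IIf(Len(result) > 0, ",", "") & _
                     "'" & Replace(CStr(item), "'", "''") & "'"
        End If
    Next item
    SqlInList = result
End Function
```

A formula such as =SqlInList(B2:B50) would then produce the whole list in one cell, with no trailing comma to remove.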
I know this is a really old question, but I was trying to do the same thing and stumbled upon a new Excel function called "TEXTJOIN".
For the question, the following formula solves the problem
=TEXTJOIN("",TRUE,(a1:a4))
The signature of "TEXTJOIN" is explained as TEXTJOIN(delimiter,ignore_empty,text1,[text2],[text3],...)
I needed a general purpose Concatenate With Separator (since I don't have TEXTJOIN) so I wrote this:
Public Function ConcatWS(separator As String, ParamArray cell_range()) As String
    '---concatenate with separator
    Dim n As Long
    Dim cell As Variant
    For n = LBound(cell_range) To UBound(cell_range)
        For Each cell In cell_range(n)
            If Len(cell) <> 0 Then
                ConcatWS = ConcatWS & IIf(ConcatWS <> "", separator, "") & cell
            End If
        Next
    Next n
End Function
Which allows us to go crazy with flexibility in including cell ranges:
=ConcatWS(" ", Fields, E1:G2, L6:M9, O6)
NOTE: "Fields" is a Named Range and the separator may be blank

Resources