Splitting a very large comma-separated string into rows of 50 items each - Excel

I have a very long string in the first row. It contains many comma-separated items, like this:
12345,54322,44444,222222222,444444,121,333,44444,........
I need to split it so that each row holds at most 50 items. Say there are 700 items separated by commas: I want to keep the first 50 in the 1st row, the next 50 in the 2nd row, and so on.
I tried the code below, which gets me as far as item 50, but I'm not sure it will work going forward, so I need help with this.
OutData = Split(InpData, ",")(50)
MsgBox OutData

There are many ways to do this; one is to replace every nth comma with a different separator, for example through a regular expression:
Sub Test()
    Dim s As String: s = "1,2,3,4,5,6,7,8,9,10,11"
    Dim n As Long: n = 2    ' items per chunk; use 50 for the question's case
    Dim arr() As String
    With CreateObject("vbscript.regexp")
        .Global = True
        ' n items (n - 1 of them preceded by a comma), then the comma to replace
        .Pattern = "([^,]*(?:,[^,]*){" & n - 1 & "}),"
        ' "|" must not occur anywhere in s
        arr = Split(.Replace(s, "$1|"), "|")
    End With
End Sub
The pattern used means:
( - Open 1st capture group;
[^,]* - Match 0+ (Greedy) characters other than comma;
(?: - Open a nested non-capture group;
,[^,]* - Match a comma and again 0+ characters other than comma;
){1} - Close the non-capture group and match n-1 times (1 time in the given example);
), - Close the capture group and match a literal comma.
Replace every match with the content of the 1st capture group followed by a character you know does not occur in the full string ("|" here), so we can split on that character.
I suppose you can do whatever you like with the resulting array. You probably want to transpose it into the worksheet.
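To get from there onto the worksheet, a minimal sketch could look like the following. It assumes the long string sits in cell A1 of the active sheet, that the output should start in A3, and that each output row keeps its 50 items as one comma-separated string in a single cell. The addresses and the names ChunkCsv and WriteChunks are made up for illustration. It uses a plain loop instead of the regular expression, so it does not rely on a separator character that is absent from the data.

```vba
Option Explicit

' Returns a 1-based array of strings, each holding up to n comma-separated items of s.
' For an empty s it returns an empty array (UBound = -1).
Function ChunkCsv(ByVal s As String, ByVal n As Long) As String()
    Dim items() As String
    Dim chunks() As String
    Dim i As Long, k As Long, itemCount As Long, nChunks As Long

    If n < 1 Then Err.Raise 5, , "n must be at least 1"

    items = Split(s, ",")                   ' zero-based array of all items
    itemCount = UBound(items) + 1
    If itemCount = 0 Then
        ChunkCsv = items
        Exit Function
    End If

    nChunks = (itemCount + n - 1) \ n       ' ceiling division
    ReDim chunks(1 To nChunks)
    For i = 0 To itemCount - 1
        k = i \ n + 1                       ' chunk that item i belongs to
        If i Mod n = 0 Then
            chunks(k) = items(i)            ' first item of a chunk
        Else
            chunks(k) = chunks(k) & "," & items(i)
        End If
    Next i
    ChunkCsv = chunks
End Function

' Writes one chunk per row, starting at A3 (assumed location).
Sub WriteChunks()
    Dim chunks() As String
    Dim r As Long

    chunks = ChunkCsv(CStr(ActiveSheet.Range("A1").Value), 50)
    For r = LBound(chunks) To UBound(chunks)
        With ActiveSheet.Range("A3").Offset(r - 1, 0)
            .NumberFormat = "@"             ' keep the chunk as text
            .Value = chunks(r)
        End With
    Next r
End Sub
```

For example, ChunkCsv("1,2,3,4,5", 2) returns the three strings "1,2", "3,4" and "5".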

Related

Extracting barcode data using VBA

I need help. I developed an Excel sheet in which, when I scan an employee barcode, formulas extract the base-32 code so I can get the employee ID number, first name and last name.
The only problem is that the formulas to extract this data differ depending on how the code starts. I can use the IFS function in Excel on O365, but all of our agencies use the standard desktop version of Excel.
My question: is there a way to code this in VBA so that when an ID is scanned, regardless of what the scanned code starts with, it performs the right extraction for the three items I need (ID, first name and last name)? Below are the formulas I use:
Scan starting with "M"
Base-32 =MID(A2,2,7)
First Name =PROPER(MID(A2,17,20))
Last Name =PROPER(MID(A2,38,20))
Scan Starting with "N"
Base-32 =MID(A3,9,7)
First Name =PROPER(MID(A3,16,20))
Last Name =PROPER(MID(A3,36,26))
Scan Starting with "C"
Base-32 =MID(A4,8,7)
First Name =PROPER(MID(A4,15,20))
Last Name =PROPER(MID(A4,35,20))
ID NUMBER
The ID number for each of them is calculated the same (based on the cell the scan goes in to) using:
=IF(C2="","0",SUMPRODUCT(POWER(32,LEN(C2)-ROW(INDIRECT("1:"&LEN(C2)))),(CODE(UPPER(MID(C2,ROW(INDIRECT("1:"&LEN(C2))),1)))-48*(CODE(MID(C2,ROW(INDIRECT("1:"&LEN(C2))),1))<58)-55*(CODE(MID(C2,ROW(INDIRECT("1:"&LEN(C2))),1))>64))))
Thank you in advance to anyone that can help.
I'm not sure if this is exactly in line with your requirements, but the following UDF could be used to retrieve your data:
Function GetData(inp As String, grp As Long) As String
    With CreateObject("VBScript.RegExp")
        .Pattern = "^(?:M|N.{7}|C.{6})(.{7})\S*([A-Z][a-z]+)\s*(\S+)"
        If .Test(inp) Then
            ' Submatches is zero-based, so capture group 1 is Submatches(0)
            GetData = .Execute(inp)(0).Submatches(grp - 1)
        Else
            GetData = "No Data Found"
        End If
    End With
End Function
The pattern matches:
^ - Start-of-string anchor.
(?:M|N.{7}|C.{6}) - A non-capture group that matches either a literal 'M', or a literal 'N' followed by 7 characters other than newline, or a literal 'C' followed by 6 of those characters.
(.{7}) - A 1st capture group of 7 characters to mimic the MID() functionality, capturing the Base-32 code.
\S* - 0+ (greedy) non-whitespace characters, up to:
([A-Z][a-z]+) - A 2nd capture group that captures the last name through a single uppercase letter and 1+ (greedy) lowercase ones.
\s* - 0+ (greedy) whitespace characters, up to:
(\S+) - A 3rd capture group that catches the first name through 1+ (greedy) non-whitespace characters.
Call this function in your sheet as =GetData(A1,1) to get the Base-32 code; use 2 to get the last name and 3 to get the first name. I hope that helps a bit.
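If you would rather stay close to the original worksheet formulas, a sketch along these lines might also do. It picks the MID() offsets from the first character of the scan and converts the Base-32 code with the same digit mapping as the SUMPRODUCT formula ("0"-"9" become 0-9, "A"-"V" become 10-31). The names ScanField and Base32ToNumber are made up for illustration, the offsets are taken from the question as-is, and StrConv with vbProperCase only approximates PROPER().

```vba
Option Explicit

' field: 1 = Base-32 code, 2 = first name, 3 = last name
Function ScanField(ByVal scan As String, ByVal field As Long) As String
    Dim starts As Variant, lengths As Variant

    ScanField = "No Data Found"
    If field < 1 Or field > 3 Then Exit Function

    ' Start positions and lengths copied from the MID() formulas in the question
    Select Case Left$(scan, 1)
        Case "M": starts = Array(2, 17, 38): lengths = Array(7, 20, 20)
        Case "N": starts = Array(9, 16, 36): lengths = Array(7, 20, 26)
        Case "C": starts = Array(8, 15, 35): lengths = Array(7, 20, 20)
        Case Else: Exit Function
    End Select

    ScanField = Mid$(scan, starts(field - 1), lengths(field - 1))
    If field > 1 Then ScanField = StrConv(ScanField, vbProperCase)
End Function

' Converts a Base-32 string (digits 0-9, then A-V) to a number.
' Double is used because a 7-character code can exceed the range of Long.
Function Base32ToNumber(ByVal code As String) As Double
    Dim i As Long, c As Long, digit As Long
    Dim total As Double

    For i = 1 To Len(code)
        c = Asc(UCase$(Mid$(code, i, 1)))
        Select Case c
            Case 48 To 57: digit = c - 48       ' "0"-"9" -> 0-9
            Case 65 To 86: digit = c - 55       ' "A"-"V" -> 10-31
            Case Else: Err.Raise 5, , "Not a base-32 digit: " & Mid$(code, i, 1)
        End Select
        total = total * 32 + digit
    Next i
    Base32ToNumber = total                      ' empty code gives 0
End Function
```

In the sheet this could be used as =ScanField(A2,1) for the code, =ScanField(A2,2) and =ScanField(A2,3) for the names, and =Base32ToNumber(ScanField(A2,1)) for the ID number.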

How to remove duplicates in a string

I have a file containing 38,000 records, and each row has 2 or more ';' at the end. Is there a formula to remove the repeated ';' at the end, in Excel or any other tool?
To remove repeated characters (semicolons in this case):
1. Hit CTRL+H.
2. Find what: ;; (two semicolons)
3. Replace with: ; (one semicolon)
4. Click Replace All.
5. When it finishes, repeat Step 4 until no more matches are found.
Now the document will have no more than one semicolon in a row.
Remove repeated characters using a VBA function:
The following function does the same thing using VBA, for any character you choose:
Function removeDoubleChars(ByVal txt As String, ByVal doubleChar As String) As String
    'collapses every run of consecutive [doubleChar] within [txt] to a single one
    If Len(doubleChar) = 0 Then
        removeDoubleChars = txt    'nothing to collapse; also avoids an endless loop
        Exit Function
    End If
    Do While InStr(txt, doubleChar & doubleChar) > 0
        txt = Replace(txt, doubleChar & doubleChar, doubleChar)
    Loop
    removeDoubleChars = txt
End Function
You would use this like Range("A1") = removeDoubleChars(Range("A1"), ";") to remove consecutive semicolons from cell A1.
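Both approaches collapse repeats anywhere in the text. If only the run at the very end of each record should be touched, as the question seems to suggest, a sketch like the one below may be closer to what is needed. The name CollapseTrailing and the keepOne parameter are made up for illustration, and whether one semicolon should remain at the end is an assumption.

```vba
Option Explicit

' Collapses the run of ch at the end of txt to a single ch
' (or removes it entirely when keepOne is False).
Function CollapseTrailing(ByVal txt As String, ByVal ch As String, _
                          Optional ByVal keepOne As Boolean = True) As String
    Dim n As Long

    If Len(ch) = 0 Then
        CollapseTrailing = txt
        Exit Function
    End If

    n = Len(txt)
    ' Walk back over every trailing copy of ch
    Do While n >= Len(ch)
        If Mid$(txt, n - Len(ch) + 1, Len(ch)) <> ch Then Exit Do
        n = n - Len(ch)
    Loop

    CollapseTrailing = Left$(txt, n)
    ' Put one copy back only if at least one was stripped
    If keepOne And n < Len(txt) Then CollapseTrailing = CollapseTrailing & ch
End Function
```

As a worksheet formula this would be =CollapseTrailing(A1,";"), which turns a;b;;; into a;b; and leaves a;;b unchanged.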

Count Patterns In one Cell Excel

I'd like your help. I'm currently extracting some data, and I have to count a specific set of Call IDs. A Call ID has the following format: 9129572520020000711. The pattern is 19 characters that start with 9 and end with 1.
I want to count how many times this pattern appears in one cell.
For example, this is the value in one cell, and I want to count how many times the pattern appears in it:
1912957252002000071129129545183410000711391295381628700007114912959791875000071159129597085000000711691295892838400007117912958908933000071189129452513730000711
To solve this with formulae you need to know:
The starting character
The ending character
The length of your Call ID
Finding all possible Call IDs
Let B1 be your number string and B2 be the call ID (or pattern) you are looking for. In B5 enter the formula =MID($B$2,1,1) to find the starting character you are looking for. In B6 enter =RIGHT($B$2,1) for the end character. In B7 enter =LEN($B$2) for the length of the call ID.
In Column A we'll enter the position of every starting character. The first formula is a simple Find() in A10: =FIND($B$5,$B$1,1). To find the other starting characters, start the Find() at the position after the previous starting character: =FIND($B$5,$B$1,$A10+1) in A11. Copy this down the column a few dozen times (or more).
In Column B we'll see if the next X characters (where X is the length of the Call ID) meets the criteria for a Call ID:
=IF(MID($B$1,$A10+($B$7-1),1)=$B$6,TRUE,FALSE)
The MID($B$1,$A10+($B$7-1),1)=$B$6 part checks whether the character at the end of this possible Call ID is the end character we're looking for. $A10+($B$7-1) calculates the position of the last character of the possible Call ID and $B$6 is the end character.
In Column C we can return the actual Call ID if there is a match. This isn't necessary to find the count, but will be useful later. Simply check if the value in Column B is True and, if yes, return the calculated string: =IF(B10,MID($B$1,$A10,$B$7),"").
To actually count the number of valid Call IDs, do a CountIf() on Column B to count the True values, for example =COUNTIF(B10:B100,TRUE), with the range adjusted to however far down you copied the formulas.
If you don't want all the #VALUE! errors, just wrap each formula in IFERROR(...,"").
Finding all consecutive Call IDs
However, some of these Call IDs overlap. Operating on the assumption that Call IDs cannot overlap, we simply have to start each search after the end character of the previously found ID, not after its start. Insert an "Ending Position" column as Column B (the existing references shift from Column B to Column C) with the formula =$A10+($C$7-1), starting in B10. Alter A11 to =FIND($C$5,$C$1,$B10+1) and copy down. Don't change A10, as it finds the first starting position and does not depend on anything but the original text.
Which ones are valid?
I don't know, that depends on other criteria for your Call IDs. If you receive them consecutively, then the second method is best and the other possible ones found are by coincidence. If not, then you'll have to apply some other validation criteria to the first method, hence why we identified each ID.
You can solve this simply with a UDF using a regular expression.
Option Explicit
Function callIDcount(S As String) As Long
    Dim RE As Object, MC As Object
    Const sPat As String = "9\d{17}1"
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True
        .Pattern = sPat
        Set MC = .Execute(S)
        callIDcount = MC.Count
    End With
End Function
Using your example, this returns a count of 8
The regular expression engine captures all of the matches that match the pattern, into the match collection. To see how many are there, we merely return the count of that collection.
Trivial modifications would allow one to return the actual ID's also, should that be necessary.
The regex:
9\d{17}1
9 - Match the character “9” literally
\d{17} - Match a single character that is a “digit” (ASCII 0–9 only), exactly 17 times
1 - Match the character “1” literally
Created with RegexBuddy
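As for returning the IDs themselves, one possible sketch is shown below. It reuses the pattern from callIDcount and joins the matches into one delimited string; the name callIDlist and the sep parameter are made up for illustration.

```vba
Option Explicit

' Returns the non-overlapping call IDs found in S, joined with sep.
' Returns an empty string when there is no match.
Function callIDlist(ByVal S As String, Optional ByVal sep As String = ", ") As String
    Dim RE As Object, MC As Object
    Dim ids() As String
    Dim i As Long

    Set RE = CreateObject("vbscript.regexp")
    RE.Global = True
    RE.Pattern = "9\d{17}1"
    Set MC = RE.Execute(S)

    If MC.Count = 0 Then Exit Function
    ReDim ids(0 To MC.Count - 1)
    For i = 0 To MC.Count - 1
        ids(i) = MC(i).Value        ' the match collection is zero-based
    Next i
    callIDlist = Join(ids, sep)
End Function
```

In a cell, =callIDlist(A1) lists the IDs separated by a comma and a space; pass a different second argument to change the separator.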
EDIT Reading through TheFizh's post, he considered that you might want the count to include overlapping CallID's. In other words, given:
9129572520020000711291
We see that includes:
9129572520020000711
9572520020000711291
where the second overlaps with the first, but both meet your requirements.
Should that be what you want, merely change the regex so it does not "consume" the match:
Const sPat As String = "9(?=\d{17}1)"
and the function will return 15 instead of 8, the latter being the count of non-overlapping matches.
Do you mean something like the following?
Sub CallID_noPatterns()
    Dim CallID As String, CallIDLen As Long
    CallID = "9#################1"   'Like pattern: each # matches one digit
    CallIDLen = Len(CallID)          'the CallID's length

    'Say that you want to get the value of the "A1" cell and deal with it
    Dim CellVal As String, CellLen As Long
    CellVal = CStr(Range("A1").Text) 'get its value as a string
    CellLen = Len(CellVal)           'get its length

    'Slide a window of CallID's length along the value and compare each
    'window with the pattern. If the value is shorter than a CallID,
    'the loop simply never runs.
    Dim pos As Long, num_patterns As Long
    pos = 1
    Do While pos <= CellLen - CallIDLen + 1
        If Mid(CellVal, pos, CallIDLen) Like CallID Then
            num_patterns = num_patterns + 1
            pos = pos + CallIDLen    'continue after this ID (no overlaps)
        Else
            pos = pos + 1
        End If
    Loop

    'Display your result
    MsgBox "Number of Patterns: " & num_patterns
End Sub

Excel replace characters in string before and after 'x'

Hello, I have a column with strings (names of products) in it.
These are formatted as Name LengthxWidth, for example Green box 20x30. I need to swap the 20 and the 30 in this example so that I get Green box 30x20. Any ideas how I can achieve this?
Thanks
Here are both a formula solution and a VBA solution using regular expressions:
Formula
=LEFT(A1,FIND(TRIM(RIGHT(SUBSTITUTE(A1," ",REPT(" ",99)),99)),A1)-1)&
MID(TRIM(RIGHT(SUBSTITUTE(A1," ",REPT(" ",99)),99)),SEARCH("x",TRIM(RIGHT(SUBSTITUTE(A1," ",REPT(" ",99)),99)))+1,99)&
"x"&
LEFT(TRIM(RIGHT(SUBSTITUTE(A1," ",REPT(" ",99)),99)),SEARCH("x",TRIM(RIGHT(SUBSTITUTE(A1," ",REPT(" ",99)),99)))-1)
UDF
Option Explicit
Function RevWL(S As String)
    Dim RE As Object
    'The decimal point is escaped (\.) so that it matches only a literal "."
    Const sPat As String = "(\d+\.?\d*)x(\d+\.?\d*)"
    'If L or W might start with a decimal point, and not a digit,
    'then change sPat to: (\d*\.?\d+)x(\d*\.?\d+)
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True
        .IgnoreCase = True
        .Pattern = sPat
        RevWL = .Replace(S, "$2x$1")
    End With
End Function
The formula works by looking at the last space-separated substring, which would be LxW, then swapping the portions after and before the x, then concatenating everything back together.
The regex pattern captures the two numbers (integers or decimals, so long as they start with a digit, although that could be changed if needed) and reverses them.
Here is a more detailed explanation of the regex (and the replacement string):
(\d+\.?\d*)x(\d+\.?\d*)
Options: Case insensitive; ^$ don’t match at line breaks
(\d+\.?\d*) - Match the regex below and capture its match into backreference number 1
\d+ - Match a single character that is a “digit”, between one and unlimited times, as many times as possible, giving back as needed (greedy)
\.? - Match the character “.” literally, between zero and one times, as many times as possible, giving back as needed (greedy)
\d* - Match a single character that is a “digit”, between zero and unlimited times, as many times as possible, giving back as needed (greedy)
x - Match the character “x” literally
(\d+\.?\d*) - Match the regex below and capture its match into backreference number 2
\d+ - Match a single character that is a “digit”, between one and unlimited times, as many times as possible, giving back as needed (greedy)
\.? - Match the character “.” literally, between zero and one times, as many times as possible, giving back as needed (greedy)
\d* - Match a single character that is a “digit”, between zero and unlimited times, as many times as possible, giving back as needed (greedy)
$2x$1
$2 - Insert the text that was last matched by capturing group number 2
x - Insert the character “x” literally
$1 - Insert the text that was last matched by capturing group number 1
Created with RegexBuddy
Here is a VBA solution that will work for you:
Option Explicit
' Named SwitchDims rather than Switch so that it cannot clash with
' Excel's built-in SWITCH worksheet function.
Function SwitchDims(r As Range) As String
    Dim measurement As String
    Dim firstPart As String
    Dim secondPart As String
    ' the last space-separated word, e.g. "20x30"
    measurement = Right(r, Len(r) - InStrRev(r, " "))
    secondPart = Right(measurement, Len(measurement) - InStr(1, measurement, "x"))
    firstPart = Left(measurement, InStr(1, measurement, "x") - 1)
    SwitchDims = Left(r, InStrRev(r, " ") - 1) & " " & secondPart & "x" & firstPart
End Function
You can paste this into a regular module in the VBE (Visual Basic Editor) and use it like a regular worksheet function. If your value is in cell A1, type =SwitchDims(A1) in cell B1. Hope it helps!
OK, it really is easier to use VBA, but if you only want formulas, you can use some helper columns to split your text and then concatenate the cells.
Of course, the helper cells (B1-4) are optional. They are only there to make things more readable; you can also use a single formula:
=CONCATENATE(LEFT(A1, SEARCH(" ",A1,1)-1)," ",RIGHT(RIGHT(A1,LEN(A1)-SEARCH(" ",A1,1)),LEN(RIGHT(A1,LEN(A1)-SEARCH(" ",A1,1)))-SEARCH("x",RIGHT(A1,LEN(A1)-SEARCH(" ",A1,1)),1)),"x",LEFT(RIGHT(A1,LEN(A1)-SEARCH(" ",A1,1)), SEARCH("x",RIGHT(A1,LEN(A1)-SEARCH(" ",A1,1)),1)-1))
If you have several spaces in your names, you can use this formula, which searches for the last space in the text:
=CONCATENATE(LEFT(A1, SEARCH("^^",SUBSTITUTE(A1," ","^^",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))-1)," ",RIGHT(RIGHT(A1,LEN(A1)-SEARCH("^^",SUBSTITUTE(A1," ","^^",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))),LEN(RIGHT(A1,LEN(A1)-SEARCH("^^",SUBSTITUTE(A1," ","^^",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))))-SEARCH("x",RIGHT(A1,LEN(A1)-SEARCH("^^",SUBSTITUTE(A1," ","^^",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))),1)),"x",LEFT(RIGHT(A1,LEN(A1)-SEARCH("^^",SUBSTITUTE(A1," ","^^",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))), SEARCH("x",RIGHT(A1,LEN(A1)-SEARCH("^^",SUBSTITUTE(A1," ","^^",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))),1)-1))

Deleting variable number of leading characters from a variable-length string

If I have G4ED7883666, I want the output to be 7883666.
I have to apply this to a range of cells whose values are not all the same length; the only thing they have in common is that I have to delete everything before the number that follows the last letter.
This formula finds the last number in a string, that is, all digits to the right of the last alpha character in the string.
=RIGHT(A1,MATCH(99,IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99),0)-1)
Note that this is an array formula and must be entered with the Control-Shift-Enter keyboard combination.
How the formula works
Let's assume that the target string is fairly simple: "G4E78"
Working outward from the middle of the formula, the first thing to do is create an array with the elements 1 through 25. (Although this might seem to limit the formula to strings with no more than 25 characters, it actually places a limit of 25 digits on the size of the number that may be extracted by the formula.)
ROW($1:$25) = {1;2;3;4;5;6;7; etc.}
Subtracting this array from (1 + the length of the target string) produces a new array, the elements of which count down from the length of the string. The first five elements correspond to the positions of the characters of the string, in reverse order!
LEN(A1)+1-ROW($1:$25) = {5;4;3;2;1;0;-1;-2;-3;-4; etc.}
The MID function then creates a new array that reverses the order of the characters of the string.
For example, the first element of the new array is the result of MID(A1, 5, 1), the second of MID(A1, 4, 1) and so on. The #VALUE! errors reflect the fact that MID cannot evaluate 0 or negative values as the position of a string, e.g., MID(A1,0,1) = #VALUE!.
MID(A1,LEN(A1)+1-ROW($1:$25),1) = {"8";"7";"E";"4";"G";#VALUE!;#VALUE!; etc.}
Multiplying the elements of the array by 1 converts the digit characters to numbers and turns the remaining character elements into #VALUE! errors as well.
1*MID(A1,LEN(A1)+1-ROW($1:$25),1) = {8;7;#VALUE!;4;#VALUE!;#VALUE!;#VALUE!; etc.}
And the IFERROR function turns the #VALUE! errors into 99, which is just an arbitrary number greater than the value of a single digit.
IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99) = {8;7;99;4;99;99;99; etc.}
Matching on the 99 gives the position of the first non-digit character counting from the right end of the string. In this case, "E" is the first non-digit in the reversed string "87E4G", at position 3. This is equivalent to saying that the number we are looking for at the end of the string, plus the "E", is 3 characters long.
MATCH(99,IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99),0) = 3
So, for the final step, we take 3 - 1 (for the "E") characters from the right of the string.
RIGHT(A1,MATCH(99,IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99),0)-1) = "78"
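One more submission for you to consider. This VBA function returns the run of digits at the right end of the string, i.e. everything after the last non-numeric character:
Public Function GetRightNumbers(str As String) As String
    Dim i As Long
    'Walk back from the last character; stop at 1, since Mid(str, 0, 1) would raise an error
    For i = Len(str) To 1 Step -1
        If Not IsNumeric(Mid(str, i, 1)) Then
            Exit For
        End If
    Next i
    'i is now the position of the last non-digit (0 if there is none)
    GetRightNumbers = Mid(str, i + 1)
End Function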
You can write some VBA to format the data (just starting at the end and working back until you hit a non-number.)
Or, if you're happy to install an add-in like Excelicious, you can use regular expressions to format the text via a formula. An expression like [0-9]+$ would return all the digits at the end of a string, IIRC.
NOTE: This uses the regex pattern in James Snell's answer, so please upvote his answer if you find this useful.
Your best bet is to use a regular expression. You need to set a reference to Microsoft VBScript Regular Expressions 5.5 for this to work: Tools --> References...
Now you can use regex in your VBA.
This will find the numbers at the end of each cell. I am placing the result next to the original so that you can verify it is working the way you want. You can modify it to replace the cell as soon as you feel comfortable with it. The code works regardless of the length of the string you are evaluating, and will skip the cell if it doesn't find a match.
Sub GetTrailingNumbers()
    Dim ws As Worksheet
    Dim rng As Range
    Dim cell As Range
    Dim result As Object, results As Object
    Dim regEx As New VBScript_RegExp_55.RegExp
    Set ws = ThisWorkbook.Sheets("Sheet1")
    ' range is hard-coded here, but you can define
    ' it programmatically based on the shape of your data
    Set rng = ws.Range("A1:A3")
    ' pattern from James Snell's answer
    regEx.Pattern = "[0-9]+$"
    For Each cell In rng
        If regEx.Test(cell.Value) Then
            Set results = regEx.Execute(cell.Value)
            For Each result In results
                ' write the match into the column to the right
                cell.Offset(, 1).Value = result.Value
            Next result
        End If
    Next cell
End Sub
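If you would rather call this from a cell than run a macro, the same pattern can be wrapped in a UDF. The sketch below is late-bound (CreateObject), so it should not need the reference set under Tools --> References. The name TrailingDigits is made up for illustration.

```vba
Option Explicit

' Returns the digits at the very end of s, or an empty string if s
' does not end in a digit.
Function TrailingDigits(ByVal s As String) As String
    Static re As Object             ' created once, reused on later calls

    If re Is Nothing Then
        Set re = CreateObject("VBScript.RegExp")
        re.Pattern = "[0-9]+$"
    End If
    If re.Test(s) Then TrailingDigits = re.Execute(s)(0).Value
End Function
```

With G4ED7883666 in A1, =TrailingDigits(A1) returns 7883666 as text; wrap it in VALUE() if a number is needed.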
Takes the last 4 characters of num (counting from the right):
num1 = Right(num, 4)
Takes the first 5 characters of num (counting from the left):
num1 = Left(num, 5)
First takes the first ten characters from the left, then the last four of those:
num1 = Right(Left(num, 10), 4)
In your case:
num = "G4ED7883666"
num1 = Right(num, 7)
