Excel - reverse text to columns

I am struggling to find a good way of doing this.
I have a file where some lines are hundreds of words long (comma separated) and some are only a few words long.
So performing a text to columns produces hundreds of columns, most of which are blank.
I have done my edits on the second column and now need to join everything back so that each line can be read in a text file and all words are comma separated once more.
Is there a formula that will know how many columns are in each line that are not blank and bring them all back into the first cell with comma separation?
I should add, I only have Excel 2010
I'd be happy to try a good powershell script solution if possible
Many thanks,
K
eg.
54325,354354,786756,6543,73644,23323,544,7233,64537,654,56,3456,754,876666,78,788
122,433
655,766
1233,7374,65436,65444,6577,85488,56767,8585876,6755,544445,67,67783,2233,466636
I use text to columns so I can work on contents of column 2, but then need it back in this format once done.
If I simply save as csv and open as text file, there are commas for every blank cell in each column

Given a text file like the one you describe, you could simply split the lines manually:
Get-Content .\input.txt |ForEach-Object {
    # split line into individual values
    $values = $_ -split ','
    # modify as needed (only rows that actually have a second value)
    if($values.Length -ge 2){
        $values[1] = "Make changes to column 2 here"
    }
    # stitch line back together
    $values -join ','
} |Set-Content .\output.txt

Not sure why you need to strip off the trailing commas from the short rows, as they don't produce blank columns, only empty cells. The columns have content from another row. But here is VBA code to do what you request, in case that is a possibility for your usage:
Assumption: Starting point is your worksheet with each element of the CSV data you show above, in a separate cell.
None of your words includes a comma, linefeed, or other character that requires enclosing the word in quotes.
This VBA code:
Option Explicit
Sub createCSV()
    Dim v, x, S As String, I As Long, RE As Object
    'Read the block of data starting at A1 into a 2-D array
    v = ThisWorkbook.Worksheets("Sheet2").Cells(1, 1).CurrentRegion
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Pattern = ",*$"
        .MultiLine = True
        .Global = True
    End With
    ReDim x(1 To UBound(v, 1), 1 To 1)
    With WorksheetFunction
        For I = 1 To UBound(v, 1)
            'Replace trailing commas with nothing
            x(I, 1) = RE.Replace(Join(.Index(v, I, 0), ","), "")
        Next I
    End With
    'Write one joined line per worksheet row
    Open "D:\Users\Ron\Desktop\NewFile.csv" For Output As #1
    For I = 1 To UBound(x)
        Print #1, x(I, 1)
    Next I
    Close #1
End Sub
I used a Regular Expression to remove the trailing commas.
results (in Notepad++):
54325,354354,786756,6543,73644,23323,544,7233,64537,654,56,3456,754,876666,78,788
122,433
655,766
1233,7374,65436,65444,6577,85488,56767,8585876,6755,544445,67,67783,2233,466636
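If the sheet has already been saved as CSV, so that the short rows end in a run of commas, the same trimming can be done outside Excel. Here is a minimal Python sketch, assuming no value contains a comma or needs quoting; strip_trailing_commas is a hypothetical helper name, not something from the answers above:

```python
def strip_trailing_commas(lines):
    """Drop the line ending and the run of commas left by blank trailing cells."""
    return [line.rstrip("\r\n").rstrip(",") for line in lines]


# Sample rows as Excel might write them when other rows are wider.
padded = [
    "122,433,,,,,,,,,,,,,,\n",
    "655,766,,,,,,,,,,,,,,\n",
    "54325,354354,786756\n",
]

for row in strip_trailing_commas(padded):
    print(row)
```

Only commas at the very end of a line are removed, so an empty cell in the middle of a row keeps its position.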

Related

Splitting very large string separated with comma and i need to split 50 items only per row

I have a very long string in the first row. It contains many comma-separated items, like this:
12345,54322,44444,222222222,444444,121,333,44444,........
I need to split it so that each row holds at most 50 items. Assume there are 700 comma-separated items: I want to keep the first 50 in the 1st row, the next 50 in the 2nd row, and so on.
I tried the code below, which splits up to 50 for sure, but I'm not sure whether it will keep working going forward, so I need help with this:
OutData = Split(InpData, ",")(50)
MsgBox OutData
You can do this in many more ways, but one would be to replace every nth comma. For example through Regular Expressions:
Sub Test()
    Dim s As String: s = "1,2,3,4,5,6,7,8,9,10,11"
    Dim n As Long: n = 2
    Dim arr() As String
    With CreateObject("vbscript.regexp")
        .Global = True
        .Pattern = "([^,]*(?:,[^,]*){" & n - 1 & "}),"
        arr = Split(.Replace(s, "$1|"), "|")
    End With
End Sub
The pattern used means:
( - Open 1st capture group;
[^,]* - Match 0+ (Greedy) characters other than comma;
(?: - Open a nested non-capture group;
,[^,]* - Match a comma and again 0+ characters other than comma;
){1} - Close the non-capture group and match n-1 times (1 time in the given example);
), - Close the capture group and match a literal comma.
Replace every match with the content of the 1st capture group and a character you know is not in the full string so we can split on that character.
I suppose you can do whatever you like with the resulting array. You probably want to transpose it into the worksheet.
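For comparison, the same grouping can be done without a regular expression by splitting once and slicing the list. This is a minimal Python sketch rather than VBA; chunk_items is a hypothetical helper name, and it assumes no item contains a comma:

```python
def chunk_items(text, size):
    """Split a comma-separated string into rows of at most `size` items."""
    items = text.split(",")
    return [items[i:i + size] for i in range(0, len(items), size)]


# Same sample string and group size as the VBA example above.
rows = chunk_items("1,2,3,4,5,6,7,8,9,10,11", 2)
for row in rows:
    print(",".join(row))
```

With 700 items and a size of 50 this yields 14 rows; the last row is simply shorter when the item count is not a multiple of the size.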

Find count of multiline in an Excel cell starting with delimiter -

I am looking for a formula that counts how many lines in a multiline cell begin with - (hyphen).
For example, if the cell contains:
how are you keeping up
-I am well and need toy
-"You" are asking wrong question
<you are wrong>
-why should i reply you
then the count of qualifying lines is 3.
Can anyone help me out here, please?
If your first line never starts with a hyphen, or at least should not count towards the total, then try:
Formula in B1:
=(LEN(A1)-LEN(SUBSTITUTE(A1,CHAR(10)&"-","")))/2
If your first line can also start with a hyphen and therefore count towards the total, try:
=(LEN(CHAR(10)&A1)-LEN(SUBSTITUTE(CHAR(10)&A1,CHAR(10)&"-","")))/2
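To see why dividing by 2 works: every line feed followed by a hyphen that SUBSTITUTE removes shortens the text by exactly two characters, so the length difference divided by 2 is the number of such lines. A minimal Python sketch of the same arithmetic, using the sample text from the question (count_hyphen_lines is a hypothetical name):

```python
def count_hyphen_lines(cell_text):
    """Count lines starting with '-' via the length-difference trick."""
    needle = "\n-"              # CHAR(10) & "-" in the worksheet formula
    padded = "\n" + cell_text   # prepend a line feed so the first line can count
    removed = len(padded) - len(padded.replace(needle, ""))
    return removed // len(needle)


sample = "\n".join([
    "how are you keeping up",
    "-I am well and need toy",
    '-"You" are asking wrong question',
    "<you are wrong>",
    "-why should i reply you",
])
print(count_hyphen_lines(sample))  # 3
```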
Here is a VBA solution:
Function CountLines(text As String, Optional flag As String = "") As Long
    'counts all lines in text which start with flag
    Dim i As Long, count As Long
    Dim lines As Variant
    lines = Split(text, vbLf)
    For i = LBound(lines) To UBound(lines)
        If Mid(lines(i), 1, Len(flag)) = flag Then
            count = count + 1
        End If
    Next i
    CountLines = count
End Function
If this is in a standard code module, the example text is in A1, and you enter the formula =CountLines(A1,"-") in B1, it will evaluate to 3.
If you want to include the first line in the potential count, then, in Windows Excel 2013+, you can try:
=COUNTA(FILTERXML("<t><s>" & SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,">","&gt;"),"<","&lt;"),"""","&quot;"),CHAR(10),"</s><s>") & "</s></t>","//s[starts-with(text(),'-')]"))
Replace the illegal XML characters ", <, and > with their entities (&quot;, &lt;, and &gt;)
Create an XML by splitting into nodes based on the LF character
Use xpath //s[starts-with(text(),'-')] to return only those nodes that start with a hyphen.
COUNTA to return the count of those nodes

How to remove duplicates in a string

I have a file containing 38,000 records, and each row has 2 or more ';' at the end. Is there a formula to remove the repeated ';' at the end, in Excel or any other tool, for example
To remove repeated characters (semi-colons in this case)
1. Hit CTRL+H
2. Find What: ;; (two semicolons)
3. Replace with: ; (one semicolon)
4. Click Replace All.
5. When it finishes, repeat Step 4 until there are no more matches found.
Now the document will have no more than one semicolon in a row.
Remove repeated characters using a VBA function:
The following function does the same thing using VBA, and for any character you choose:
Function removeDoubleChars(ByVal txt As String, ByVal doubleChar As String) As String
    'removes all multiple-consecutive [doubleChar] within [txt]
    'ByVal keeps the caller's variable unchanged while txt is rewritten here
    Do
        txt = Replace(txt, doubleChar & doubleChar, doubleChar)
    Loop While InStr(txt, doubleChar & doubleChar) > 0
    removeDoubleChars = txt
End Function
You would use this like Range("A1") = removeDoubleChars(Range("A1"), ";") to remove consecutive semicolons from cell A1.
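The repeat-until-no-match idea is the same in any language. Here is a minimal Python sketch of it, for anyone processing the file outside Excel; remove_double_chars is a hypothetical name, and the guard against an empty character is an addition that the VBA version does not have:

```python
def remove_double_chars(text, ch):
    """Collapse every run of `ch` in `text` down to a single `ch`."""
    if not ch:
        return text  # nothing to collapse; also avoids an endless loop
    doubled = ch + ch
    while doubled in text:
        text = text.replace(doubled, ch)
    return text


print(remove_double_chars("record one;;;;", ";"))
```

Like the Find and Replace steps, this collapses repeated semicolons anywhere in the row, not only at the end.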

Split Cell by Numbers Within Cell

I have some fields that need to be split up into different cells. They are in the following format:
Numbers on Mission 21 0 21
Numbers on Mission 5 1 6
The desired output would be 4 separate cells. The first would contain the words in the string "Numbers on Mission" and the subsequent cells would have each number, which is determined by a space. So for the first example the numbers to extract would be 21, 0, 21. Each would be in its own cell next to the string value. And for the second: 5, 1, 6.
I tried using a split function but wasn't sure how to target the numbers specifically, and to identify the numbers based on the spaces separating them.
Pertinent to your first case (Numbers on Mission), the simple solution could be as shown below:
Sub SplitCells()
    Const RowHeader As String = "Numbers on Mission"
    Dim ArrNum As Variant
    Dim i As Long
    'Element 0 is the empty string left before the first space, so start at 1
    ArrNum = Split(Replace(Range("A1"), RowHeader, ""), " ")
    For i = 1 To UBound(ArrNum)
        Cells(1, i + 2) = ArrNum(i)
    Next
    Cells(1, 2) = RowHeader
End Sub
The same logic is applicable to your second case. Hope this may help.
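If the leading text is not always "Numbers on Mission", one way to target only the numbers is to peel numeric tokens off the end of the string. A minimal Python sketch of that idea, assuming the numbers are always the trailing space-separated tokens (split_label_and_numbers is a hypothetical name):

```python
def split_label_and_numbers(text):
    """Return [label, number, number, ...] for a string that ends in numbers."""
    tokens = text.split()
    numbers = []
    # Walk backwards while the last token consists only of digits.
    while tokens and tokens[-1].isdigit():
        numbers.insert(0, tokens.pop())
    return [" ".join(tokens)] + numbers


print(split_label_and_numbers("Numbers on Mission 21 0 21"))
print(split_label_and_numbers("Numbers on Mission 5 1 6"))
```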
Unless I'm overlooking something, you may not need VBA at all. Have you tried the "Text to Columns" option? If you select the cell(s) with the information you would like to split up, and go to Data -> Text to Columns. There, you can choose "delimited" and choose a space as a delimiter, which will split your data into multiple cells, split by where the space is.
edit: Just realized that will also split up your string. In that case, when you are in the 3rd step of Text to Columns, choose a destination cell that isn't the cell with your data. (I.e. if your data is in A1, choose B1 as the destination, and it'll put the split info there. Then just combine the text columns with something like =B1&" "&C1&" "&D1)
I was able to properly split the values using the following:
If i.Value Like "*on Mission*" Then
    x = Split(i, " ")
    For y = 0 To UBound(x)
        i.Offset(0, y + 1).Value = x(y)
    Next y
End If

Macro that sorts call signs containing letters and numbers

All the call signs will be in column A, and running the macro should sort them. The sort is case-insensitive; the signs are usually in all caps. A call sign consists of 1-2 letters (prefix), 1-2 digits (number), then 1-3 letters (suffix). I want to sort the signs by number, then suffix, then prefix, in that order.
W9K, BB3C, W9GFO, AB8VN, G3G, A77Bc, KB8HTM, K9DOG, W8AER, K1ZZ, W4BFT, W0CQC, WA6FV, W6TRW, AA5B, W4IY, N4C, K5UZ, K4LRG
I will bite. Half the fun of coding is solving a problem for the simple pleasure of knowing you figured it out.
Here is a user defined function (Formula) that you can use to convert the call sign into the format for sorting. Note the numeric portion is zero padded so ones and tens do not sort together before twos and twenties.
Option Explicit
Public Function FormatCallSign(aCell As Range)
    Dim Nbr As String
    Dim i As Integer
    Dim tmp As String
    Dim vList As Variant
    For i = 1 To Len(aCell.Value)
        If InStr(1, "1234567890", UCase(Mid(aCell.Value, i, 1))) > 0 Then
            'Digit: collect it and mark the prefix/suffix boundary with one comma
            Nbr = Nbr & Mid(aCell.Value, i, 1)
            tmp = tmp & ","
            tmp = Replace(tmp, ",,", ",")
        Else
            If InStr(1, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", UCase(Mid(aCell.Value, i, 1))) > 0 Then
                tmp = tmp & Mid(aCell.Value, i, 1)
            End If
        End If
    Next i
    'vList(0) is the prefix, vList(1) is the suffix
    vList = Split(tmp, ",")
    FormatCallSign = vList(1) & Right("0" & Nbr, 2) & vList(0)
End Function
Put the formula in cell B2, for example by using the formulas command on the ribbon and selecting the function from the user defined section.
As asked earlier if the call sign had delimiters in it already, you could use a simple formula to rearrange the parts and exclude the delimiters.
=CONCATENATE(MID(A3,SEARCH("-",A3)+1,4),RIGHT("0"&MID(A3,SEARCH("/",A3)+1,SEARCH("-",A3)-SEARCH("/",A3)-1),2),LEFT(A3,SEARCH("/",A3)-1))
To build a formula like the above, start by constructing it in parts.
First write a SEARCH function to find the "/", then copy it to find the "-".
Then write MID and LEFT functions to get the characters to the right of the dash, the characters to the left of the slash, and the numeric section between them. Paste the formulas into a single formula for your masterpiece.
Since it makes better sense to keep the three elements in separate fields for simplified sorting, the above formula can be split into three separate formulas, one for each column.
=MID(A3,SEARCH("-",A3)+1,4)
=VALUE(MID(A3,SEARCH("/",A3)+1,SEARCH("-",A3)-SEARCH("/",A3)-1))
=LEFT(A3,SEARCH("/",A3)-1)
This corrects sorting problems given the three elements are variable length.
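The ordering the question asks for (number, then suffix, then prefix, case-insensitive) can also be expressed as a sort key without building a padded string. This is a minimal Python sketch rather than a macro; it assumes every sign matches the 1-2 letters, 1-2 digits, 1-3 letters layout from the question, and CALL_SIGN and sort_key are hypothetical names:

```python
import re

# prefix (1-2 letters), number (1-2 digits), suffix (1-3 letters)
CALL_SIGN = re.compile(r"^([A-Za-z]{1,2})(\d{1,2})([A-Za-z]{1,3})$")


def sort_key(call_sign):
    """Return (number, SUFFIX, PREFIX) so tuples compare in the requested order."""
    match = CALL_SIGN.match(call_sign.strip())
    if match is None:
        raise ValueError("unexpected call sign format: " + repr(call_sign))
    prefix, number, suffix = match.groups()
    # int() makes 9 sort before 77 without any zero padding.
    return (int(number), suffix.upper(), prefix.upper())


signs = ["W9K", "BB3C", "W9GFO", "AB8VN", "G3G", "A77Bc", "KB8HTM", "K9DOG",
         "W8AER", "K1ZZ", "W4BFT", "W0CQC", "WA6FV", "W6TRW", "AA5B", "W4IY",
         "N4C", "K5UZ", "K4LRG"]

for sign in sorted(signs, key=sort_key):
    print(sign)
```

Comparing the number as an integer plays the same role as the zero padding in the user defined function above.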
The initial specification for callsign format is inaccurate, since they can begin with numbers or letters and a logical sort would be by ITU assigned prefix. A function would need a table lookup for country after it determined if the string after the forward slash was a valid country designation. This is actually a pretty complicated problem.
