Using VBA, how can I select every other cell in a row range (to be copied and pasted vertically)? - ms-office

I have a 2200+ page text file. It is delivered from a customer through a data exchange to us with asterisks to separate values and tildes (~) to denote the end of a row. The file is sent to me as a text file in Word. Most rows are split in two (1 row covers a full line and part of a second line). I transfer segments (10 page chunks) of it at a time into Excel where, unfortunately, any zeroes that occur at the end of a row get discarded in the "text to columns" procedure. So, I eyeball every "long" row to insure that zeroes were not lost and manually re-enter any that were.
Here is a small bit of sample data:
SDQ EA 92 1551 378 1601 151 1603 157 1604 83
The "SDQ, EA, and 92" are irrelevant (artifacts of data transmission). I want to use Excel and/or VBA to select 1551, 1601, 1603, and 1604 (these are store numbers) so that I can copy those values, and transpose paste them vertically. I will then go back and copy 378, 151, 157, and 83 (sales values) so that I can transpose paste them next to the store numbers. The next two rows of data contain the same store numbers but give the corresponding dollar values. I will only need to copy the dollar values so they can be transpose pasted vertically next to unit values (e.g. 378, 151, 157, and 83).
Just being able to put my cursor on the first cell of interest in the row and run a macro to copy every other cell would speed up my work tremendously. I have tried using ActiveCell and Offset references to select a range to copy, but have not been successful. Does any have any suggestions for me? Thanks in advance for the help.

It's hard to give a complete answer without more information about the file.
I think if your input data is 2200+ pages long, it's unlikely that opening it with the default excel opening functions is the way to go. Especially since Excel has maximum number of rows and columns. If the file is a text file (.txt) I would suggest opening it with VBA and reading each line, one at a time, and processing the data.
Here's an example to get you started. Just keep in mind that this is transposing each row of text into columns of data, so you will quickly fill all the columns of excel long before you run thru 2200 pages of text. But it's just an example.
Sub getData()
dFile = FreeFile
sFile = "c:\code\test.txt"
Open sFile For Input As #dFile
c = 1
'keep doing this until end of file
Do While Not EOF(dFile)
'read line into dataLine
Input #dFile, dataLine
' break up line into words based on spaces
j = Split(dataLine, " ")
jLength = UBound(j)
If jLength > 2 Then
r = 1
'ignore first 3 words
'and get every other word
'transpose rows of text into columns
For word = 3 To jLength Step 2
Cells(r, c) = j(word)
r = r + 1
Next word
End If
c = c + 1
Loop
Close #Data
End Sub

Related

Manually Define Fixed Widths in Data File

I have a fixed width data file from a third party that contains 1,000 records. It came with a separate document that displays all available columns, char start char end and char length for each column. It has thousands of columns.
My data file doesn't have data in every row so defining the fixed widths in Excel isn't feasible as I might erroneously skip a column because I can't see that it has data.
Is there a text editor that lets you manually type/define or import widths?
What does this "separate document" look like? Let's say I have a text file with a column of width values to be read that looks something like this:
20
25
30
10
5
23
25
10
23
I can then read the values from this text file into excel, and adjust the column widths of my spreadsheet using the following vba code:
Sub colWidth()
Dim widthArray() As String
Dim myFile, textline As String
Dim x, y As Integer
'example text file containing column widths
myFile = "C:\qqq\qqq\qqq\widths.txt"
'loop through the file and store each column width in an array
Open myFile For Input As #1
x = 1
Do Until EOF(1)
Line Input #1, textline
ReDim Preserve widthArray(1 To x)
widthArray(x) = textline
x = x + 1
Loop
Close #1
'using the array of column widths to adjust columns
For y = 1 To UBound(widthArray)
Columns(y).ColumnWidth = Int(widthArray(y))
Next y
End Sub

Handle Large Data for Conversion of Hex Data

I have a Text/CSV File of more than 10,000,000 Rows and 3 Columns.
Columns Names: ClientName, CLientMobile, ClientData
ClientData is in Hex format.
Presently I am doing:
Splitting the File in multiple parts of 900,000 rows each
Opening Each File - Say File 1
Pasting the below stated Function as Macro (Macro for Hex2Text)
Public Function HexToText(Text As Range) As String
Dim i As Integer
Dim DummyStr As String
For i = 1 To Len(Text) Step 2
DummyStr = DummyStr & Chr(Val("&H" & (Mid(Text, i, 2))))
DoEvents
Next i
HexToText = DummyStr
End Function
Converting Each Hex Value on Column "ClientData" in Readable Text by using above Function "Hex2Text"
Saving the Sheet.
Issues Faced:
I have to split all such big files in 900,000 row limit due to Excel limitations
It takes lot of time for calculations to run when I copy past formulae for Hex2Text for all 900,000 rows for Hex Values conversion in "ClientData"
Solution Desired:
Is there any other software that I can use for the same purpose to avoid spitting and avoid huge time wasted in Excel calculations for Hext2Text conversion.
Any simple solution/idea's will be welcome.

Add a space after colored text

I'm Using Microsoft Excel 2013.
I have a lot of data that I need to separate in Excel that is in a single cell. The "Text to Columns" feature works great except for one snag.
In a single cell, I have First Name, Last Name & Email address. The last name and email addresses do not have a space between them, but the color of the names are different than the email.
Example (all caps represent colored names RGB (1, 91, 167), lowercase is the email which is just standard black text):
JOHN DOEjohndoe#acmerockets.com
So I need to put a space after DOE so that it reads:
JOHN DOE johndoe#acmerockets.com
I have about 20k rows to go through so any tips would be appreciated. I just need to get a space or something in between that last name and email so I can use the "Text to Columns" feature and split those up.
Not a complete answer, but I would do it way:
Step 1 to get rid of the formatting:
Copy all text that you have to the notepad
Then copy-paste text from Notepad to excel as text
I think this should remove all the formatting issues
Step 2 is to use VBA to grab emails. I assume that you have all your emails as lowercase. Therefore something like this should do the trick (link link2):
([a-z0-9\-_+]*#([a-z0-9\-_+].)?[a-z0-9\-_+].[a-z0-9]{2,6})
Step 3 is to exclude emails that you extracted from Step2 from your main text. Something like this via simple Excel function:
=TRIM(SUBSTITUTE(FULLTEXT,EMAIL,""))
Since you removed all the formatting in Step1, you can apply it back when you done
You can knock this out pretty quickly taking advantage of a how Font returns the Color for a set of characters that do not have the same color: it returns Null! Knowing this, you can iterate through the characters 2 at a time and find the first spot where it throws Null. You now know that the color shift is there and can spit out the pieces using Mid.
Code makes use of this behavior and IsNull to iterate through a fixed Range. Define the Range however you want to get the cells. By default it spits them out in the neighboring two columns with Offset.
Sub FindChangeInColor()
Dim rng_cell As Range
Dim i As Integer
For Each rng_cell In Range("B2:B4")
For i = 1 To Len(rng_cell.Text) - 1
If IsNull(rng_cell.Characters(i, 2).Font.Color) Then
rng_cell.Offset(0, 1) = Mid(rng_cell, 1, i)
rng_cell.Offset(0, 2) = Mid(rng_cell, i + 1)
End If
Next
Next
End Sub
Picture of ranges and results
The nice thing about this approach is that the actual colors involved don't matter. You also don't have to manually search for a switch, although that would have been the next step.
Also your neighboring cells will be blank if no color change was found, so it's decently robust against bad inputs.
Edit adds ability to change original string if you want that instead:
Sub FindChangeInColorAndAddChar()
Dim rng_cell As Range
Dim i As Integer
For Each rng_cell In Range("B2:B4")
For i = 1 To Len(rng_cell.Text) - 1
If IsNull(rng_cell.Characters(i, 2).Font.Color) Then
rng_cell = Mid(rng_cell, 1, i) & "|" & Mid(rng_cell, i + 1)
End If
Next
Next
End Sub
Picture of results again use same input as above.

Split Cell by Numbers Within Cell

I have some fields that need to be split up into different cells. They are in the following format:
Numbers on Mission 21 0 21
Numbers on Mission 5 1 6
The desired output would be 4 separate cells. The first would contain the words in the string "Numbers on Mission" and the subsequent cells would have each number, which is determined by a space. So for the first example the numbers to extract would be 21, 0, 21. Each would be in its own cell next to the string value. And for the second: 5, 1, 6.
I tried using a split function but wasn't sure how to target the numbers specifically, and to identify the numbers based on the spaces separating them.
Pertinent to your first case (Numbers on Mission), the simple solution could be as shown below:
Sub SplitCells()
Const RowHeader As String = "Numbers on Mission"
Dim ArrNum As Variant
ArrNum = Split(Replace(Range("A1"), RowHeader, ""), " ")
For i = 1 To UBound(ArrNum)
Cells(1, i + 2) = ArrNum(i)
Next
Cells(1, 2) = RowHeader
End Sub
The same logic is applicable to your second case. Hope this may help.
Unless I'm overlooking something, you may not need VBA at all. Have you tried the "Text to Columns" option? If you select the cell(s) with the information you would like to split up, and go to Data -> Text to Columns. There, you can choose "delimited" and choose a space as a delimiter, which will split your data into multiple cells, split by where the space is.
edit: Just realized that will also split up your string. In that case, when you are in 3rd part of the Text to Columns, choose a destaination cell that isn't the cell with your data. (I.E. if your data is in A1, choose B1 as destination, and it'll put the split info there. Then just combine the text columns with something like =B1&" "&C1&" "&D1)
I was able to properly split the values using the following:
If i.Value Like "*on Mission*" Then
x = Split(i, " ")
For y = 0 To UBound(x)
i.Offset(0, y + 1).Value = x(y)
Next y
End If

Extract same column from multiple excel files using xlsread

I have a directory on C drive containing a number of excel files of the same format. I would like to copy column H from each file into a new file using the following script I found online:
dirs=dir('C:\xxx\*.xlsx');
dircell=struct2cell(dirs);
filenames=dircell(1,:);
range = 'H:H';
n = (numel(filenames));
for i = 1:n;
Newfile(:,i) = xlsread(filenames{i},range);
end
This gives an error message of "Subscripted assignment dimension mismatch." with only one column extracted in the resulting file (Newfile).
I played around with the range and noticed that error occurs when xlsread reaches the end of the list of the first file and stops when the value is empty. My column H's have different number of filled values (i.e. file 1 has 20, file 2 has 100, file 3 has 3, etc.).
So, my question is whether it is possible to modify this script so that when it encounters an empty cell, either an empty cell or a NaN cell is extracted and most importantly that it will move on to the next column.
Thank you so much for your help in advance!
Not having Matlab at home I have to take it from the top of my head.
Since the column you read, H, has different number of valid entries you should not try to force them into the resulting array NewFile directly but rather use a temporary variable
dirs=dir('C:\xxx\*.xlsx');
dircell=struct2cell(dirs);
filenames=dircell(1,:);
range = 'H:H';
n = numel(filenames);
Newfile = NaN*ones(1, n);
for nf = 1:n;
tempVar = xlsread(filenames{nf},range);
r = size(NewFile,1); % get number of rows in NewFile
if length(tempVar) > r
% Make Newfile big enough to fit column nf
Newfile = [Newfile;NaN*ones(length(tempVar)-r,n)];
end
Newfile(:,nf) = tempVar;
end

Resources