Handle Large Data for Conversion of Hex Data - excel

I have a text/CSV file with more than 10,000,000 rows and 3 columns.
Column names: ClientName, CLientMobile, ClientData
ClientData is in hex format.
Presently I am doing the following:
Splitting the file into multiple parts of 900,000 rows each
Opening each file, say File 1
Pasting the function below into a module as a macro (HexToText)
Public Function HexToText(Text As Range) As String
Dim i As Integer
Dim DummyStr As String
For i = 1 To Len(Text) Step 2
DummyStr = DummyStr & Chr(Val("&H" & (Mid(Text, i, 2))))
DoEvents
Next i
HexToText = DummyStr
End Function
Converting each hex value in the "ClientData" column to readable text by using the above function "HexToText"
Saving the sheet.
Issues faced:
I have to split every such big file into parts of 900,000 rows because of Excel's row limit
The calculation takes a long time when I copy and paste the HexToText formula down all 900,000 rows to convert the hex values in "ClientData"
Solution desired:
Is there any other software I can use for the same purpose, to avoid the splitting and the huge amount of time wasted on Excel calculations for the HexToText conversion?
Any simple solutions or ideas are welcome.
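One option outside Excel is a short script that streams the file row by row, so nothing has to be split. Below is a minimal Python sketch, not a definitive implementation. It assumes the hex value sits in the third column, that each byte maps to one character (decoded here as latin-1, which may differ from what Chr gives on your system), and that values which are not valid hex, such as the header row, should be left unchanged. The names hex_to_text, convert_rows and convert_file are made up for this example.

```python
import csv
import io


def hex_to_text(hex_string, encoding="latin-1"):
    """Decode a string of hex digit pairs (e.g. '48656C6C6F') into text."""
    return bytes.fromhex(hex_string.strip()).decode(encoding)


def convert_rows(reader, writer, hex_column=2):
    """Copy rows from reader to writer, decoding the hex column of each row."""
    for row in reader:
        if len(row) > hex_column:
            try:
                row[hex_column] = hex_to_text(row[hex_column])
            except ValueError:
                pass  # not valid hex (e.g. the header row): keep as-is
        writer.writerow(row)


def convert_file(in_path, out_path):
    """Stream a whole CSV file from in_path to out_path, one row at a time."""
    with open(in_path, newline="", encoding="latin-1") as src, \
         open(out_path, "w", newline="", encoding="latin-1") as dst:
        convert_rows(csv.reader(src), csv.writer(dst))


# Small in-memory demonstration with made-up data.
sample = io.StringIO(
    "ClientName,CLientMobile,ClientData\nAlice,5550100,48656C6C6F\n"
)
out = io.StringIO()
convert_rows(csv.reader(sample), csv.writer(out))
result = out.getvalue()
```

Because only one row is held in memory at a time, the 10,000,000-row file can be processed in a single pass with convert_file.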

Related

Summing the digits in Excel cells (long and short strings)

I'm working on research related to frequencies.
I want to sum all the digits in each cell and reduce them to a single digit.
Some cells have 2 digits, others have 13, like these:
24.0542653897891
25.4846064424057
27
28.6055035477009
I tried several formulas to do that. The best ones gave me a two-digit number, which I couldn't sum again to get a single digit.
For example:
=SUMPRODUCT(MID(SUBSTITUTE(B5,".",""),ROW(INDIRECT("1:"&LEN(B5)-1)),1)+0)
=SUMPRODUCT(1*MID(C5,ROW(INDIRECT("1:"&LEN(C5))),1))
Any suggestions?
Thank you in advance.
EDIT
Based on your explanation and your comments, it seems that what you want is the digital root of all the digits (excluding the decimal point). In other words, repeatedly summing the digits until you get a single digit.
That can be calculated with a simpler formula than adding up the digits:
=1+(SUBSTITUTE(B5,".","")-1)-(INT((SUBSTITUTE(B5,".","")-1)/9)*9)
For long numbers, we can split the number in half and process each half. eg:
=1+MOD(1+MOD(LEFT(SUBSTITUTE(B5,".",""),INT(LEN(SUBSTITUTE(B5,".",""))/2))-1,9)+1+MOD(RIGHT(SUBSTITUTE(B5,".",""),LEN(SUBSTITUTE(B5,".",""))-INT(LEN(SUBSTITUTE(B5,".",""))/2))-1,9)-1,9)
However, the numbers should be stored as TEXT. When numbers are stored as numbers, what we see may not necessarily be what is stored there, and what the formula (as well as the UDF) will process.
The long formula version will correct all the errors on your worksheet EXCEPT for B104. B104 appears to have the value 5226.9332653096000 but Excel is really storing the value 5226.9333265309688. Because of Excel's precision limitations, this will get processed as 5226.93332653097. Hence there will be a disagreement.
Another method that should work would be to round all of the results in your column B to 15 digits (eg: . Combining that with using the long formula version should result in agreement for all the values you show.
Explanation
If a positive number n is divisible by 9, its digital root is 9; otherwise, the digital root is n MOD 9.
The general formula would be: =1+Mod(n-1,9)
In your case, since we are dealing with numbers larger than can be calculated using the MOD function, we need to both remove the dot, and also use the equivalent of mod which is n-(int(n/9)*9)
Notes:
This will work best with numbers stored as text. Since Excel may display and/or convert large numbers, or numbers with many decimal places, differently than expected, working with text strings of digits is the most stable method.
This method will not work reliably with numbers > 15 digits.
If you have numbers > 15 digits, then I suggest a VBA User Defined Function:
Option Explicit
Function digitalRoot(num As String) As Long
Dim S As String, Sum As Long, I As Long
S = num
Do While Len(S) > 1
Sum = 0
For I = 1 To Len(S)
Sum = Sum + Val(Mid(S, I, 1))
Next I
S = Trim(Str(Sum))
Loop
digitalRoot = CLng(S)
End Function
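For anyone who wants to check results outside Excel, the same idea can be sketched in Python, where integers are exact and the 15-digit limit does not apply. This is only an illustration: the function names are invented here, and the input is assumed to be the cell contents as text. It computes the digital root both by repeated digit summing and by the 1+MOD(n-1,9) rule so the two can be compared.

```python
def only_digits(text):
    """Keep the characters 0-9 and drop everything else (such as the dot)."""
    return "".join(ch for ch in text if ch in "0123456789")


def digital_root_by_summing(text):
    """Repeatedly sum the decimal digits until a single digit remains."""
    s = only_digits(text)
    while len(s) > 1:
        s = str(sum(int(ch) for ch in s))
    return int(s) if s else 0


def digital_root_by_formula(text):
    """Same result via 1 + (n - 1) mod 9, using exact integer arithmetic."""
    s = only_digits(text)
    n = int(s) if s else 0
    return 0 if n == 0 else 1 + (n - 1) % 9


values = ["24.0542653897891", "25.4846064424057", "27", "28.6055035477009"]
roots = [digital_root_by_summing(v) for v in values]
```

Passing the values as strings mirrors the advice above about storing the numbers as text.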
You could use a formula like:
=SUMPRODUCT(FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s"))
You might need an extra SUBSTITUTE for changing . to , if that's your decimal delimiter:
=SUMPRODUCT(FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(A1,".",",")," ","</s><s>")&"</s></t>","//s"))
However, maybe a UDF as others proposed is also a possibility for you. Though, something tells me I might have misinterpreted your question...
I hope you are looking for something like the following UDF.
Function SumToOneDigit(ByVal myNumber)
'ByVal so that the caller's variable is not overwritten by the loop below
Dim temp: temp = 0
Dim i As Long
CalcLoop:
For i = 1 To Len(myNumber)
If IsNumeric(Mid(myNumber, i, 1)) Then temp = temp + Mid(myNumber, i, 1)
Next
If Len(temp) > 1 Then
myNumber = temp
temp = 0
GoTo CalcLoop
End If
SumToOneDigit = temp
End Function
UDFs (user-defined functions) are functions written in VBA (Visual Basic for Applications).
When you cannot do a calculation with the built-in Excel functions, like the ones in your question, you can write a UDF in a VBA module in Excel. See this link for UDFs. If you don't have the Developer tab, see this link. Add a module in the VBA editor by right-clicking the workbook, and paste the above code into that module. Remember that this code stays in this workbook only, so if you want to use the UDF in another file, you will have to add a module to that file and paste the code there as well. If you use such a UDF frequently, it is better to make an add-in out of it, like this link.
In addition to using "Text to Columns" as a one-off conversion, this is relatively easy to do in VBA, by creating a user function that accepts the data as a string, splits it into an array separated by spaces, and then loops the elements to add them up.
Add the following VBA code to a new module:
Function fSumData(strData As String) As Double
On Error GoTo E_Handle
Dim aData() As String
Dim lngLoop1 As Long
aData = Split(strData, " ")
For lngLoop1 = LBound(aData) To UBound(aData)
fSumData = fSumData + CDbl(aData(lngLoop1))
Next lngLoop1
fExit:
On Error Resume Next
Exit Function
E_Handle:
MsgBox Err.Description & vbCrLf & vbCrLf & "fSumData", vbOKOnly + vbCritical, "Error: " & Err.Number
Resume fExit
End Function
Then enter this into a cell in the Excel worksheet:
=fSumData(A1)
Regards,
The UDF below will return the sum of all numbers in a cell passed to it as an argument.
Function SumCell(Cell As Range) As Double
Dim Fun As Double ' function return value
Dim Sp() As String ' helper array
Dim i As Integer ' index to helper array
Sp = Split(Cell.Cells(1).Value)
For i = 0 To UBound(Sp)
Fun = Fun + Val(Sp(i))
Next i
SumCell = Fun
End Function
Install the function in a standard code module, created with a name like Module1. Call it from the worksheet with syntax like =SumCell(A2) where A2 is the cell that contains the numbers to be summed up. Copy down as you would a built-in function.
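For comparison, the split-and-sum approach used by both UDFs above can be sketched in a few lines of Python. This is only an illustration under the same assumption the UDFs make, namely that the cell holds numbers separated by spaces; sum_cell is a made-up name.

```python
def sum_cell(text):
    """Split the text on whitespace and add up the numbers it contains."""
    return sum(float(token) for token in text.split())


total = sum_cell("1.5 2 3.25")
```

Unlike Val in VBA, float raises an error on a token that is not a number, so bad input is reported rather than silently counted as 0.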

How do I extract values from a label-value string in Excel VBA?

I am trying to process a large amount of data in VBA (in excel).
I have thousands of lines of strings that look like this:
LABEL_PERCENT XXX.XX% LABEL_DATE mm/dd/yy
I have used split to process line-by-line (so I am looking at an individual string as defined above). All of the lines have that exact formatting. For each line, I'd like to extract the percentage, and date, for populating a spreadsheet. How do I process the string in VBA, such that I can extract the values into two new variables?
You are already using Split()? This function is how you could extract the four values, splitting on the spaces:
Dim str As String
Dim splitted As Variant
str = "LABEL_PERCENT XXX.XX% LABEL_DATE mm/dd/yy"
splitted = Split(str, " ")
Debug.Print splitted(1) 'XXX.XX%
splitted(3) will give you the date. You then might want to parse the values as a percentage and date.
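The parsing step mentioned above (turning the two extracted strings into a number and a date) can be sketched as follows. This is a Python illustration rather than VBA, and it assumes the exact layout from the question: four space-separated tokens, a trailing % on the second, and an mm/dd/yy date as the fourth. The sample values and the name parse_line are invented.

```python
from datetime import datetime


def parse_line(line):
    """Return (fraction, date) from 'LABEL_PERCENT XXX.XX% LABEL_DATE mm/dd/yy'."""
    parts = line.split()
    # Drop the trailing % and convert the percentage to a fraction
    percent = float(parts[1].rstrip("%")) / 100
    # Two-digit years are interpreted by strptime's %y rule
    date = datetime.strptime(parts[3], "%m/%d/%y").date()
    return percent, date


percent, date = parse_line("LABEL_PERCENT 12.50% LABEL_DATE 03/15/21")
```

Keeping the percentage as a fraction matches how Excel stores percentages, so the cell can later be formatted as a percent.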

vba powerpoint formatting % and $ [duplicate]

I am having a challenge with how MSOffice deals with number formats.
While I believe this has a similar root cause to "Stop Excel from automatically converting certain text values to dates", it is different: this is not a date format, and it involves both Excel and PowerPoint with VBA.
I have data that I am pulling out of a dB into CSV files and I am doing a .Replace on certain text markers (e.g. ##ReplaceText##) in a PPT template. (There is a good post on the site on how to do this I can't seem to locate now)
There is one field I need to deal with which is tracking a metric, this field is text in my dB, but it can contain special characters - specifically $ and %.
e.g. I could see the following values in the CSV file:
"increase market share","1234","$10","28%"
I want VBA to treat this all as text, so the % and $ characters are maintained...but... Excel reads the data as a number and keeps the $ or % sign. PowerPoint removes the $ or % sign and converts 28% to 0.28 and $10 to 10.
Per the above question, adding "=""28%""" to the .csv in Excel, will give me that exact literal text in PowerPoint.
Adding a preceding space or ' character works in forcing Excel to read the data as text string. But PowerPoint ignores it and behaves same as above. Eg 28% to 0.28.
I tried using FORMAT as below, but because the data is variable, I don't know which case to apply.
sCurrentText = Format(sCurrentText, "$#")
or
sCurrentText = Format(sCurrentText, "0.0%")
If statements don't work because the $ or % are not present in what VBA sees (e.g the $ or % character is already gone)
If sCurrentText Like "*$*" Then or If sCurrentText Like "*%" Then
So my question is how do I force VBA to take what is in the CSV file as text and ignore processing $ or % as special characters and just maintain them in the CSV?
You didn't specify what exactly you want to do with the data in the CSV file, but I've assumed you're trying to open the file in VBA.
If you are opening the CSV file using OpenText (as below) then Excel will automatically parse the data in the format it sees fit. eg:
Workbooks.OpenText fileName:="directory", DataType:=xlDelimited, Comma:=True
You can use a different method to open the CSV file if you want VBA to handle the data as just text which you can use as you see fit.
Sub OpenCSVFile()
    Dim ff As Long, iRow As Long, iCol As Long
    Dim FilePath As String
    Dim FileBuffer As String 'Entire CSV file as one string
    Dim LineSeparatedFile() As String 'Array of data separated into lines
    Dim LineData() As String 'Array of comma separated values for that line
    FilePath = "C:\path\to\file.csv" 'Placeholder path (assumption); replace with your own file
    ff = FreeFile
    Open FilePath For Binary Access Read As #ff
    FileBuffer = Space$(LOF(ff))
    Get #ff, , FileBuffer
    Close #ff
    LineSeparatedFile = Split(FileBuffer, vbCrLf)
    For iRow = 0 To UBound(LineSeparatedFile)
        'Note: a plain Split keeps the surrounding quotes and also splits on commas inside quoted fields
        LineData = Split(LineSeparatedFile(iRow), ",")
        For iCol = 0 To UBound(LineData)
            'Code to do something with each entry.
            'Eg. print to cell as text ("@" is the Text number format)
            ThisWorkbook.Sheets(1).Cells(iRow + 1, iCol + 1).NumberFormat = "@"
            ThisWorkbook.Sheets(1).Cells(iRow + 1, iCol + 1).Value = LineData(iCol)
        Next iCol
    Next iRow
End Sub
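To illustrate the underlying point, that reading the file as plain text leaves $ and % untouched, here is a small Python sketch using the standard csv module, which returns every field as a string and also handles the quoted fields. It is not part of the VBA solution; the sample line is the one from the question.

```python
import csv
import io

# The sample line from the question, read from memory instead of a file
sample = io.StringIO('"increase market share","1234","$10","28%"\n')

# csv.reader strips the quotes and returns each field as an unconverted string
rows = list(csv.reader(sample))
first_row = rows[0]
```

No number or currency conversion happens at any point, so "28%" stays "28%" until your own code decides otherwise.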

printing serial date and data to .csv or excel file using fprintf?

I just started using Matlab (R2015a) a few weeks ago, and although I have searched for an answer to this question (and tried a few workarounds) I haven't had any luck. Hopefully, it's an easy fix!!
I am trying to write one column of serial dates at high precision (I need milliseconds) and many columns of data to a .csv file. I don't want insane precision for everything, just the first column of dates.
Here's what I've found:
- csvwrite doesn't allow for differing precisions.
- xlswrite doesn't have enough precision (even though my serial date is a double, and yes, I looked at the spreadsheet cell).
- dlmwrite appends data in row format, so writing the dates and then appending the rest of the data doesn't work (though soooo close!).
Now I'm trying with fprintf:
hz_time is the serial date (double)
data1 and data2 are 4x25 (double) and 4x7 (double) respectively
hz_time = 1.0e+05 *
[7.357583607870371, 7.357583607928241, 7.357583607986110, 7.357583608043980]
STR_data = [data1, data2];
filename = (strcat('Processed_',files(k1).name));
file = fopen(filename,'w');
fprintf(file,'%.20f\n',hz_time);
fprintf(file,'%f%f%f%f%f%f%f%f%\n',STR_data);
fclose('all')
Currently, this code appends data1 and data2 in one cell at the end of the STR_date_time column. When I try concatenating hz_time and the data matrices together (using strcat) I fail:
STR_data = strcat([hz_time, data1, data2])
Warning: Out of range or non-integer values truncated during conversion to character.
I'm sure it's probably my formatting...
My end goal is to export this data (into a .csv or excel spreadsheet or something) so that the first column has the serial date (loads of precision) and columns 2-8 have the other data in it.
Any help would be much appreciated.
Thanks in advance!
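The layout described above (one high-precision date column followed by ordinary data columns) comes down to formatting each row with two different format strings. Here is a hedged sketch of that idea in Python rather than MATLAB. The data values, the function name write_rows and the chosen precisions are assumptions for illustration; nine decimal places of a day is roughly 0.09 ms, which is enough for millisecond resolution.

```python
import csv
import io


def write_rows(out, times, data, time_format="%.9f", data_format="%.6g"):
    """Write one CSV row per time stamp: precise date first, then the data."""
    writer = csv.writer(out)
    for t, values in zip(times, data):
        writer.writerow([time_format % t] + [data_format % v for v in values])


# First two serial dates from the question; the data values are made up
hz_time = [735758.3607870371, 735758.3607928241]
data = [[1.5, 2.25], [3.0, 4.125]]

out = io.StringIO()
write_rows(out, hz_time, data)
lines = out.getvalue().splitlines()
```

Formatting row by row keeps each date on the same line as its data, which is the part that writing the date column first and appending the rest afterwards cannot do.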

Using VBA, how can I select every other cell in a row range (to be copied and pasted vertically)?

I have a 2200+ page text file. It is delivered from a customer through a data exchange to us with asterisks to separate values and tildes (~) to denote the end of a row. The file is sent to me as a text file in Word. Most rows are split in two (1 row covers a full line and part of a second line). I transfer segments (10 page chunks) of it at a time into Excel where, unfortunately, any zeroes that occur at the end of a row get discarded in the "text to columns" procedure. So, I eyeball every "long" row to ensure that zeroes were not lost and manually re-enter any that were.
Here is a small bit of sample data:
SDQ EA 92 1551 378 1601 151 1603 157 1604 83
The "SDQ, EA, and 92" are irrelevant (artifacts of data transmission). I want to use Excel and/or VBA to select 1551, 1601, 1603, and 1604 (these are store numbers) so that I can copy those values, and transpose paste them vertically. I will then go back and copy 378, 151, 157, and 83 (sales values) so that I can transpose paste them next to the store numbers. The next two rows of data contain the same store numbers but give the corresponding dollar values. I will only need to copy the dollar values so they can be transpose pasted vertically next to unit values (e.g. 378, 151, 157, and 83).
Just being able to put my cursor on the first cell of interest in the row and run a macro to copy every other cell would speed up my work tremendously. I have tried using ActiveCell and Offset references to select a range to copy, but have not been successful. Does anyone have any suggestions for me? Thanks in advance for the help.
It's hard to give a complete answer without more information about the file.
I think if your input data is 2200+ pages long, it's unlikely that opening it with the default Excel opening functions is the way to go, especially since Excel has a maximum number of rows and columns. If the file is a text file (.txt), I would suggest opening it with VBA, reading one line at a time, and processing the data.
Here's an example to get you started. Just keep in mind that this transposes each row of text into a column of data, so you will fill all the columns of Excel long before you get through 2200 pages of text. But it's just an example.
Sub getData()
    Dim dFile As Integer, sFile As String, dataLine As String
    Dim j() As String, jLength As Long, word As Long, r As Long, c As Long
    dFile = FreeFile
    sFile = "c:\code\test.txt"
    Open sFile For Input As #dFile
    c = 1
    'keep doing this until end of file
    Do While Not EOF(dFile)
        'read the whole line into dataLine (Line Input does not stop at commas)
        Line Input #dFile, dataLine
        ' break up line into words based on spaces
        j = Split(dataLine, " ")
        jLength = UBound(j)
        If jLength > 2 Then
            r = 1
            'ignore first 3 words
            'and get every other word
            'transpose rows of text into columns
            For word = 3 To jLength Step 2
                Cells(r, c) = j(word)
                r = r + 1
            Next word
        End If
        c = c + 1
    Loop
    'close the same file number that was opened above
    Close #dFile
End Sub
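The "skip three words, then take every other word" step can also be shown in isolation. The Python sketch below is only an illustration. It assumes space-separated tokens as in the sample row (the raw file uses asterisks, so the separator would need adjusting) and keeps every value as text so that no zeroes are dropped. The name split_store_sales is made up.

```python
def split_store_sales(line, skip=3):
    """Pair up store numbers and sales values from one row of tokens."""
    tokens = line.split()[skip:]   # drop the leading SDQ, EA, 92
    stores = tokens[0::2]          # 1st, 3rd, 5th, ... remaining token
    sales = tokens[1::2]           # 2nd, 4th, 6th, ... remaining token
    return list(zip(stores, sales))


pairs = split_store_sales("SDQ EA 92 1551 378 1601 151 1603 157 1604 83")
```

Each pair is already in the vertical layout wanted in the question: one store number per row with its sales value next to it.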
