Split function in Excel VBA returning odd characters - excel

I have a text file containing Danish characters:
"Næsby IF afdeling * Badminton * Sport *"
When I split the text and place the parts in an array, the Danish characters get "messed up".
This is the complete text string in a *.TXT file that is to be split into Excel columns:
"Ulrich*wiingreen*BenPauWin05 Aps*Søballehøjen 12*5270*Odense N*+4530212215*ulrich#wiingreen.eu*Næsby IF afdeling*Badminton*Sport* *Hal 1*Hal 2*99*11/03/2022 13:00*11/03/2022 17:00*kkkk"
The code doing this is:
If InStr(FileName, "forespoerg_") <> 0 Then
    OrderArr = Split(OrderDetails, "*")
    OrderRow = OrdersDB.Range("A99999").End(xlUp).Row + 1
    OrdersDB.Cells(OrderRow, 1).Value = Application.WorksheetFunction.Max(Range("A4:A9999")) + 1
    OrdersDB.Cells(OrderRow, 2).Value = Date
    For OrderCol = 3 To 20
        OrdersDB.Cells(OrderRow, OrderCol).Value = OrderArr(OrderCol - 3)
    Next OrderCol
End If
The splitting works just fine. Unfortunately the characters get messed up.
Example: "Søballehøjen 12" imports as: "Søballehøjen 12"
Can anyone give a hint on how to solve this character issue?

I'm not sure, but I suspect an encoding mismatch. Try opening the text file in VS Code and check the encoding shown at the bottom right of the window.
You can then use StrConv(yourTextVar, someconversion), where someconversion is a value like vbUnicode or vbFromUnicode (see options here).
How do you import the text file? If the file comes from Unix or another non-Windows system, you could try reading it with an ADODB.Stream, which offers fine control over the encoding. I quickly found a sample here.
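To see what an encoding mismatch of this kind looks like, here is a minimal sketch in Python (used only for illustration, not the VBA from the question). It assumes the file is stored as UTF-8 and is then read with the Windows-1252 code page; the sample line is a shortened, hypothetical stand-in for the real file content.

```python
# Hypothetical sample line, shortened from the string in the question.
line = "BenPauWin05 Aps*Søballehøjen 12*5270*Odense N"

# Assumption: the file on disk is UTF-8 encoded.
raw = line.encode("utf-8")

# Decoding the bytes with the wrong code page garbles the Danish letters.
wrong = raw.decode("cp1252").split("*")

# Decoding with the matching encoding keeps them intact.
right = raw.decode("utf-8").split("*")

print(wrong[1])
print(right[1])
```

The split itself is innocent in both cases; the damage happens when the bytes are turned into text, which is why the fix belongs in the file-reading step.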

Related

printing txt file from VBA for Matlab

I am working in parallel with Excel and VBA to create txt files that I want to use in MATLAB. However, I am running into format issues I can't resolve.
For instance, the following VBA
Open "example.txt" For Output As #1
For i = 1 To 5
    Print #1, Sheets("Example").Cells(i + 3, 3)
Next i
Close #1 ' release the file handle so other programs can read the file
This does print the numbers (reals) it is supposed to; however, MATLAB struggles to read the resulting example .txt file.
VBA seems to be printing some extra characters, and I don't know how to remove them from within the VBA code.
Example.txt opened in matlab. Note the NaN read by MATLAB from a text file:
VBA text file - Note a line as the first element of a column
Perhaps there is a character that is invisible.
A possible solution is to remove those characters with regex.
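Add a reference to Microsoft VBScript Regular Expressions 5.5.
Then use the following VBA code:
Dim re As RegExp
Set re = New RegExp
re.Pattern = "[^0-9]"
re.Global = True ' replace every match, not just the first one
Open "example.txt" For Output As #1
For i = 1 To 5
    Print #1, re.Replace(Sheets("Example").Cells(i + 3, 3).Value, vbNullString)
Next i
Close #1
This removes every character that is not a digit from the cell value before it is printed to the text file. Note that the pattern also strips decimal points and minus signs, so for reals you may want a pattern such as "[^0-9.-]" instead.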
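The same idea can be sketched in Python (for illustration only, not the VBA above). This assumes the values are decimal numbers, so the pattern keeps digits, the decimal point and a minus sign; the sample strings and the name clean_number are hypothetical.

```python
import re

# Assumption: values are decimal numbers, so keep digits, the decimal
# point and a minus sign; everything else (BOM, spaces, control
# characters) is treated as noise.
NON_NUMERIC = re.compile(r"[^0-9.\-]")


def clean_number(text: str) -> str:
    # Remove every character that is not part of a plain decimal number.
    return NON_NUMERIC.sub("", text)


# Hypothetical cell contents with invisible or stray characters.
samples = ["\ufeff3.14", " -2.5\r", "42"]
cleaned = [clean_number(s) for s in samples]
print(cleaned)
```

A pattern this simple will not handle exponents or locale-specific decimal commas, so it would need extending for such data.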

Importing Excel and using it for Stata do code

For example, suppose I have this Excel file.
I am then manually copying the values from Excel into a do-file like this:
replace A = 1 if B>=1 & B<=6
replace A = 2 if B>=23 & B<=2
replace A = 3 if B>=3 & B<=1
replace A = 4 if B>=5 & B<=3
If this wasn't clear, please see this image to see what I am doing.
But there could be actually hundreds of lines.
How can I write a short piece of code that imports the Excel file, and another short piece that replaces the manual lines I have written?
So the goal here is just to make my code succinct.
You can import this Excel file. Let's suppose the headers are A and B and the import produces those as numeric variables. Then the text of a new do-file is contained within
gen text = "replace A = " + string(_n) + " if inrange(A, " + string(A) + "," + string(B) + ")"
which you must export and then run on your real data.
Not tested. I'd also suggest considering doing this in your favourite text editor.
Note that many of your comparisons in your example will always be false.
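If you go the text-editor or scripting route, the generation step might look like this minimal Python sketch. The bounds list is hypothetical data standing in for the two Excel columns, and the variable names A and B and the helper build_do_lines are assumptions for illustration; the output is plain text to paste into a do-file.

```python
# Hypothetical (lower, upper) bounds standing in for the two Excel columns.
bounds = [(1, 6), (7, 12), (13, 20)]


def build_do_lines(rows, target="A", source="B"):
    # Build one Stata replace command per row of bounds.
    lines = []
    for i, (low, high) in enumerate(rows, start=1):
        lines.append(
            f"replace {target} = {i} if inrange({source}, {low}, {high})"
        )
    return lines


do_lines = build_do_lines(bounds)
print("\n".join(do_lines))
```

Each generated line uses inrange(), as in the answer above, so a row whose lower bound exceeds its upper bound will simply never match.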

VB .Net when exporting to CSV issue when viewing in MS Excel

I have encountered something really weird. When exporting to CSV, my top line shows the quotation marks, yet the lines below don't.
I use UTF-8 encoding and manually add the double quotation marks so that each value is enclosed in quotation marks.
The code being used is:
Dim fs As New IO.FileStream(GenericValueEditorExportFilename.Value, IO.FileMode.Create)
Dim writer As New IO.StreamWriter(fs, Encoding.UTF8)
fs.Write(Encoding.UTF8.GetPreamble(), 0, Encoding.UTF8.GetPreamble().Length)
....
....
....
While reader.Read
    If reader("TargetLanguageID") = targetLanguageID Then
        writer.WriteLine(Encode(reader("SourcePhrase")) & ", " & Encode(reader("TargetPhrase")))
    End If
....
....
....
Friend Shared Function Encode(ByVal value As String) As String
    Return ControlChars.Quote & value.Replace("""", """""") & ControlChars.Quote
End Function
The result when displayed in Excel is shown here: (https://ibb.co/ntMYdw)
When I open the file in Notepad++, each line is displayed differently. Why does the 1st row display the quotation marks while the 2nd does not? The Notepad++ result is shown here: (https://ibb.co/fMkWWG)
Excel is treating the first line as headers.
https://stackoverflow.com/a/24923167/2319909
So the issue was caused by the BOM that I wrote to the file manually, before any other output, in an attempt to set the file's encoding:
fs.Write(Encoding.UTF8.GetPreamble(), 0, Encoding.UTF8.GetPreamble().Length)
Removing this line resolves my issue, and the file remains in the desired UTF-8 encoding because that is set on the stream writer, so there is no need to add the BOM by hand.
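One way to picture why a hand-written BOM can break only the first line is the following Python sketch (for illustration, not the VB.NET above). It assumes the writer also emits a BOM of its own, which is simulated here with Python's "utf-8-sig" codec; the field values are hypothetical.

```python
import codecs
import io

buffer = io.BytesIO()

# Step 1: write the BOM by hand, as the question's code did.
buffer.write(codecs.BOM_UTF8)

# Step 2: "utf-8-sig" stands in for a writer that emits its own BOM.
buffer.write('"source","target"\r\n'.encode("utf-8-sig"))

data = buffer.getvalue()

# A reader strips one BOM at most; the second one survives as U+FEFF.
text = data.decode("utf-8-sig")
first_field_is_quoted = text.startswith('"')
print(repr(text[:3]), first_field_is_quoted)
```

With the stray U+FEFF in front, the first field no longer starts with a quotation mark, so a CSV reader can treat the quotes on that line as literal text while later lines parse normally.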
Something like this should work for you.
Dim str As New StringBuilder
For Each dr As DataRow In Me.NorthwindDataSet.Customers
    For Each field As Object In dr.ItemArray
        str.Append(field.ToString & ",")
    Next
    str.Replace(",", vbNewLine, str.Length - 1, 1)
Next
Try
    My.Computer.FileSystem.WriteAllText("C:\temp\testcsv.csv", str.ToString, False)
Catch ex As Exception
    MessageBox.Show("Write Error")
End Try

VBA PublishObjects. Add character formatting

I found the article about putting Excel cells into an email using the RangetoHTML function in VBA. It works like a charm, but now I'm facing a problem.
If there are umlauts (e.g.: ü, ä, ö) in the cells, the result in the email shows strange symbols (e.g.: ä, …).
I looked at the written temp.htm file. At first glance the umlauts seem to be written correctly, but after inspecting the file with a hex editor I found that the written symbols are not correct.
The function which writes the file is: PublishObjects.Add
So I hope someone can help me with this.
Edit: Added a test file. Word and Office are needed.
Select the table and run the procedure SendMail.
You will always have problems with VBA, foreign characters and the web.
EDIT:
Because you can't separate the cell values from the HTML, the function below will unfortunately not work in this situation. BUT:
If you save a copy of the document with Western European (Windows) encoding, it will work.
(See comments below).
To do that, choose "Save As"; to the left of the Save button there is a dropdown (Tools) that opens a dialog where you can change the encoding.
The image has been lifted from TechNet, and "Always save web.." is not necessary.
EOF EDIT:
This is a function I have used. Unfortunately I can't remember who I got it from, but it's from the olden days of VBA and classic ASP.
Put your email cell formula into this function and it should work, because all the letters are HTML-encoded. It's slow and adds a lot of overhead, but it will work.
Function HtmlEncode(ByVal inText As String) As String
    Dim i As Integer
    Dim sEnc As Integer
    Dim repl As String
    HtmlEncode = inText
    For i = Len(HtmlEncode) To 1 Step -1
        sEnc = Asc(Mid$(HtmlEncode, i, 1))
        Select Case sEnc
            Case 32
                repl = "&nbsp;"
            Case 34
                repl = "&quot;"
            Case 38
                repl = "&amp;"
            Case 60
                repl = "&lt;"
            Case 62
                repl = "&gt;"
            Case 32 To 127
                'Numbers
            Case Else
                repl = "&#" & CStr(sEnc) & ";" 'Encode it all
        End Select
        If Len(repl) Then
            HtmlEncode = Left$(HtmlEncode, i - 1) & repl & Mid$(HtmlEncode, i + 1)
            repl = ""
        End If
    Next
End Function
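For comparison, the same encoding idea can be sketched in a few lines of Python (for illustration only, not a port of the VBA function). The name html_encode and the sample text are hypothetical, and unlike the function above this sketch leaves spaces alone instead of turning them into non-breaking spaces.

```python
import html


def html_encode(text: str) -> str:
    # Escape markup characters first (&, <, >, quotes) ...
    escaped = html.escape(text, quote=True)
    # ... then turn every non-ASCII character into a numeric reference.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


encoded = html_encode('Grüße & "Äpfel" <b>')
print(encoded)
```

Because the result is pure ASCII, it no longer matters which encoding the mail client assumes for the HTML.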

Javafx clipboard double newlines

I'm fairly sure this is a bug in the JavaFX clipboard, but I want to make sure I'm not doing something stupid. I'm programmatically placing plain text onto the clipboard using the following code:
Clipboard clipboard = Clipboard.getSystemClipboard();
ClipboardContent content = new ClipboardContent();
//String test = "1" + System.lineSeparator() + "2"; //Example 1 - Two lines
//String test = "1\r\n2"; //Example 2 - Two lines
String test = "1\n2"; //Example 3 - One line
content.putString(test);
clipboard.setContent(content);
Examples 1 and 2 result in this text after pasting:
1

2
Example 3 results in this text after pasting (as expected)
1
2
Making Notepad++ show line endings confirms that in the first two examples the line endings are being doubled. Running a debugger over it shows the String is fine after it has been placed into the ClipboardContent, but I stopped following it after that.
This is all on Windows 8 (both the running code and the paste operation). My conclusion is that somewhere deep in the system it detects the need for Windows line endings and converts each \r and each \n into \r\n just before the paste happens.
I resolved this issue with a simple replaceAll like this:
final ClipboardContent content = new ClipboardContent();
content.putString(str.replaceAll("\r\n", "\n"));
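The behaviour described in the question can be modelled with a small Python sketch (for illustration only; this is not JavaFX code). It assumes the question's hypothesis is right, namely that some layer converts every single CR or LF into CRLF; the function names are hypothetical.

```python
import re


def naive_to_windows(text: str) -> str:
    # Assumed faulty step: every single CR or LF becomes CRLF.
    return re.sub(r"\r|\n", "\r\n", text)


def normalize(text: str) -> str:
    # The workaround from the answer: collapse CRLF to LF first.
    return text.replace("\r\n", "\n")


doubled = naive_to_windows("1\r\n2")
fixed = naive_to_windows(normalize("1\r\n2"))
print(repr(doubled), repr(fixed))
```

Under that assumption, text that already contains CRLF ends up with two line breaks, while text normalized to LF first ends up with exactly one.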
