Viewing a txt file with row separator "\x00" - excel

I have a program that outputs a txt file with column separator "\t" and row separator "\x00" (hex code).
But when I open the file with MS Excel, Notepad++, or LibreOffice, all the contents are put in one row.
I want to open it in either MS Excel or LibreOffice in the "normal" view so that I can edit it easily.
I looked for an option in LibreOffice to change the row separator but couldn't find one.
I tried every encoding available in Notepad++, but changing the encoding didn't help, at least in Notepad++.
How can I open this file so that the row separator is actually treated as a row break?
I want to see the data in multiple rows and edit it efficiently.

In Notepad++, you can perform a Find and Replace (Ctrl+H).
Set the Search Mode to Extended.
For Find what, enter \\x00 (this matches the literal text \x00; if the file contains real NUL bytes instead, try \0).
For Replace with, enter \n
Click Replace All.
Then, to replace the literal \t text with real tabs:
For Find what, enter \\t
For Replace with, enter \t
Click Replace All.
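If the file contains real NUL bytes (0x00) rather than the literal text \x00, the same replacement can be scripted. The following is a minimal sketch, assuming the file uses a single-byte-compatible encoding such as ASCII or UTF-8 (in UTF-16, NUL bytes are part of ordinary characters, so this would corrupt the data). The function names and file names are hypothetical.

```python
from pathlib import Path


def nul_rows_to_lines(data: bytes) -> bytes:
    """Replace NUL (0x00) row separators with CRLF line endings."""
    # Drop trailing separators so the output does not end in empty rows.
    rows = data.rstrip(b"\x00").split(b"\x00")
    return b"\r\n".join(rows) + b"\r\n"


def convert_file(src: str, dst: str) -> None:
    """Read src as raw bytes and write the converted copy to dst."""
    Path(dst).write_bytes(nul_rows_to_lines(Path(src).read_bytes()))
```

For example, convert_file("export.txt", "export_fixed.txt") would write a tab-delimited copy that Excel or LibreOffice can open as ordinary rows.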

Related

How to end each line with a semicolon (;) in a CSV file saved from Excel?

I am trying to save a file in CSV format from Excel, but the result looks like this when I open it in Notepad:
04/13/2020;20:00;8699;8745;8686;8742;5925
I need each line to end with a semicolon (;), like this:
04/13/2020;20:00;8699;8745;8686;8742;5925;
I tried splitting the text into columns with the Convert Text to Columns Wizard, selecting Delimited with a semicolon as the separator, but I haven't found a way yet.
How can I convert this?
Try this on Windows with Notepad++:
Open your .csv file
Press CTRL+A to select all the lines
Press CTRL+H (Replace)
Select Regular expression under Search Mode
Type $ (which matches the end of each line) in Find what
Type ; in Replace with
Click Replace All
The trailing ; will then be interpreted as a delimiter in Excel, so an extra, empty eighth column appears.
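The same edit can be scripted when there are many files. This is a minimal sketch, not the answerer's method; the function name is hypothetical. Unlike the regex replacement above, it skips empty lines and lines that already end with a semicolon.

```python
def add_trailing_semicolon(text: str) -> str:
    """Append ';' to every non-empty line that does not already end with one."""
    out = []
    for line in text.splitlines():
        if line and not line.endswith(";"):
            line += ";"
        out.append(line)
    return "\n".join(out) + "\n"
```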

Carriage Return in Notepad

I have between one and two thousand Notepad (text) files that I need to add a new line to. I have an Excel macro that can automatically find and replace text in these files, which I can use to add the text I need. The macro has one cell where the user types the text to be found, and another where the user types the replacement text. The problem is that I need to replace one line with two, and putting a line break in the 'replace with' cell in Excel (using Alt+Enter) does not put the text on a new line in Notepad.
Interestingly, when I open the file in Word, the text does show up on a new line, with a carriage return between the two lines, but it is still on the same line in Notepad. Is there any way I can use the Excel macro to add a carriage return that shows up in Notepad?
Alt+Enter only puts a line feed (LF) into the string.
Notepad (before Windows 10 version 1809) does not understand UNIX-style LF-only line endings, but more advanced programs do.
If you replace the line feed with a full DOS newline (CR+LF), you should find your problem goes away:
NewString = Replace(OldString, vbLf, vbCrLf)
vbLf is the VBA constant for the line feed.
vbCrLf is the VBA constant for the DOS newline.
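For comparison, here is the same normalization as a short Python sketch (the function name is hypothetical). It differs from the one-line VBA Replace in one respect: it also matches existing CR+LF pairs, so text that already contains DOS newlines does not end up with doubled carriage returns.

```python
import re


def to_crlf(text: str) -> str:
    """Convert lone LF (and existing CRLF) line endings to CRLF."""
    # \r?\n matches both LF and CRLF, so existing CRLF pairs are not doubled.
    return re.sub(r"\r?\n", "\r\n", text)
```

When writing the result to a file from Python, open the file with newline="" so that Python does not translate the line endings a second time.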

Import txt file with line breaks into Excel

While working on an export to Excel I discovered the following problem.
If you create a table where one cell has a line break and you save the document as a txt file, it will look like this:
"firstLine<LF>secondLine"<TAB>"secondColoumn"
When I open this file in Excel, the line break is gone and the first row has only one cell, with the value firstLine.
Do you know if it is somehow possible to keep the line breaks?
EDIT: Applies to Excel 2010. I don't know if other versions behave differently.
EDIT2: Steps to reproduce:
Open a blank Excel sheet
Enter text (first column with a line break; the second column is not important)
Save as Unicode Text (txt) // none of the other txt formats work either
Close the Excel file
File->Open
Make no changes in the dialog that appears.
The Excel file now has 2 rows, which is wrong.
I was finally able to solve the problem! yay :D
CSV:
The German Excel needs a semicolon as the separator. A comma doesn't work.
Note: This is only true when the file is encoded as UTF-8 with a BOM at the beginning of the file. If it is ASCII encoded, a comma does work as the delimiter.
TXT:
The encoding has to be UTF-16LE, and the file needs to be tab delimited.
Important:
The files will still be displayed incorrectly if you open them with the "File->Open" dialog and "import" them. Dragging them into Excel or opening them with a double click works.
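A file matching the TXT description above (UTF-16LE, tab delimited, cells with line breaks wrapped in double quotes) could be produced as follows. This is a sketch under those stated assumptions, with a hypothetical function name; whether a given Excel version then keeps the line breaks still depends on how the file is opened, as noted above.

```python
import csv


def write_utf16_tsv(path: str, rows: list) -> None:
    """Write rows as a tab-delimited UTF-16LE file with a byte order mark."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-16-le", newline="") as f:
        f.write("\ufeff")  # BOM, so readers can detect UTF-16LE
        writer = csv.writer(f, delimiter="\t", quoting=csv.QUOTE_ALL)
        writer.writerows(rows)
```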
This isn't a bug; it is expected behaviour, inherent in saving text as Unicode Text or as Text (Tab delimited).
If you save the file as Unicode and then either
open it in Notepad, or
import it into Excel,
you will see that the cells with line breaks are surrounded by "".
The example in the original screenshot has two line breaks:
A1 has an entry split using Alt+Enter
B1 has an entry built with the formula function CHAR(10)
The screenshot also shows what Notepad sees in the saved Unicode version.
Suggested Workaround 1 - Manual Method
In Excel, choose Edit > Replace
Click in the Find What box
Hold the Alt key and, on the numeric keypad, type 0010
Replace with a double pipe delimiter (||)
Save as Unicode
Then reverse the process when needed to reinsert the line breaks.
This can be done easily in VBA.
Suggested Workaround 2 - VBA alternative
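Const strDelim = "||"

' Replace line breaks with the placeholder delimiter before saving.
Sub LBtoPIPE()
    ' Line feeds typed into cells with Alt+Enter
    ActiveSheet.UsedRange.Replace Chr(10), strDelim, xlPart
    ' The text CHAR(10), e.g. where it appears inside formulas
    ActiveSheet.UsedRange.Replace "CHAR(10)", strDelim, xlPart
End Sub

' Restore the line breaks after re-opening the file.
Sub PIPEtoLB()
    ActiveSheet.UsedRange.Replace strDelim, Chr(10), xlPart
    ActiveSheet.UsedRange.Replace strDelim, "CHAR(10)", xlPart
End Sub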

How do I export an Excel file with Chinese characters to a CSV?

I have an Excel document with a data table containing Chinese characters. I am trying to export this spreadsheet to a CSV file for importing into a MySQL database.
However, when I save the Excel document as a CSV file, Notepad displays the Chinese characters in the resulting file as question marks. Importing into MySQL preserves the question marks, so the original Chinese characters are lost.
I suspect this has to do with Excel and UTF-8 encoding. Thanks for your help!
The following method has been tested and used to import CSV files into MongoDB, so it should work:
In your Excel worksheet, go to File > Save As.
Name the file, choose Unicode Text (*.txt) from the drop-down list next to "Save as type", and then click Save.
Open the Unicode .txt file in your preferred text editor, for example Notepad.
Since the Unicode text file is tab-delimited and we want a CSV (comma-separated) file, we need to replace all tabs with commas.
Select a tab character, right-click it and choose Copy from the context menu, or simply press CTRL+C.
Press CTRL+H to open the Replace dialog and paste the copied tab (CTRL+V) into the Find what field. When you do this, the cursor will move to the right, indicating that the tab was pasted. Type a comma in the Replace with field and click Replace All.
Click File > Save As, enter a file name and change the encoding to UTF-8.
Change the .txt extension to .csv directly in Notepad's Save As dialog, choose All files (*.*) next to Save as type, and then click Save.
Open the CSV file from Excel by clicking File > Open > Text Files (*.prn, *.txt, *.csv) and verify that the data is OK.
As far as I know, Excel doesn't save CSV files in any Unicode encoding. I had a similar issue recently when trying to export a file containing the £ symbol as CSV. I had the benefit of being able to use another tool altogether.
My version of Excel 2010 can export in Unicode format via File > Save As > Unicode Text (*.txt), but the output is a tab-delimited, UCS-2 encoded file. I don't know MySQL at all, but from a brief look at the specifications it appears to handle tab-delimited imports and UCS-2. It may be worth trying this output.
Edit: Additionally, you could always open this Unicode output in Notepad++ and convert it to UTF-8 (Encoding > Convert to UTF-8 without BOM), and possibly replace all tab characters with commas too (use the Replace dialog in Extended search mode, with \t in the Find box and , in the Replace box).
You might want to try Notepad++; I doubt Notepad will support Unicode characters.
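The conversion from Excel's Unicode Text export to a UTF-8 CSV can also be scripted. The following is a minimal sketch with a hypothetical function name, assuming the export is tab-delimited UTF-16 with a byte order mark. Because it goes through a CSV writer rather than a plain tab-to-comma replacement, fields that themselves contain commas are quoted instead of being split.

```python
import csv


def unicode_text_to_csv(src: str, dst: str) -> None:
    """Convert a tab-delimited UTF-16 text file to a UTF-8 CSV."""
    # "utf-16" reads the byte order mark and picks the matching byte order.
    with open(src, encoding="utf-16", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        reader = csv.reader(fin, delimiter="\t")
        writer = csv.writer(fout)  # comma-delimited, quotes fields as needed
        writer.writerows(reader)
```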
http://notepad-plus-plus.org/
For some people this solution may work: https://support.geekseller.com/knowledgebase/utf-8/
When saving csv, go to lower right Tools > Web Options > Encoding > Unicode (UTF-8)
Or this SO answer: just use Google Sheets to save csv as unicode:
Excel to CSV with UTF8 encoding
I have tried all the above methods, but none of them quite works for my data (Simplified Chinese, over 700 MB). I have tried both Chinese and English Windows, and both English and Chinese Excel. Excel on Windows does not seem to be able to save as UTF-8 even though it claims to. I specified UTF-8 CSV in Save As, but when I used 'open sheet' to detect the encoding, it was neither UTF-8 nor GB*.
Here is my final solution.
(1) Download 'open sheet'.
(2) Open the file in it. You can scroll through the encodings until you see the Chinese characters displayed correctly in the preview window.
(3) Save it as UTF-8 (if you want UTF-8).
PS: You need to figure out the default encoding on your system. As far as I know, Ubuntu deals with UTF-8 fine, but the Windows default for Simplified Chinese is an encoding starting with GB*. Even if you encode the file as UTF-8, you still might not be able to open it correctly. In my case, R could not open my UTF-8 CSV but could open the GB*-encoded one.
This method works well even if your file is very large.
Other workarounds are Google Sheets (though the file size can be limited) and Notepad++, which also works for smaller files.
You can detect the encoding by opening your file and scrolling through the encodings until you see the Chinese displayed correctly.
You should save csv file with:
df.to_csv(file_name, encoding = 'utf_8_sig')
instead of:
df.to_csv(file_name, encoding = 'utf-8')
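The difference between the two encodings is only the byte order mark at the start of the output, which programs such as Excel can use to recognise the file as UTF-8. A small standard-library illustration (the sample text is made up):

```python
import codecs

text = "名字,城市\n"

plain = text.encode("utf-8")
with_bom = text.encode("utf_8_sig")

# utf_8_sig prepends the three-byte UTF-8 byte order mark (EF BB BF);
# the rest of the payload is identical.
assert with_bom == codecs.BOM_UTF8 + plain
```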

import text file containing line breaks into excel

I have a plain text file looking like this:
"some
text
containing
line
breaks"
I'm trying to talk Excel 2004 (Mac, v.11.5) into opening this file correctly. I'd expect to see only one cell (A1) containing all of the above (without the quotes)...
But alas, I can't make it happen, because Excel seems to insist on using the CRs as row delimiters, even if I set the text qualifier to double quote. I was hoping Excel would understand that those line breaks are part of the value, since they are embedded in double quotes. Instead my Excel sheet has 5 rows, which is not what I want.
I also tried this Applescript to no avail:
tell application "Microsoft Excel"
activate
open text file filename ¬
"Users:maximiliantyrtania:Desktop:linebreaks" data type delimited ¬
text qualifier text qualifier double quote ¬
field info {{1, text format}} ¬
origin Macintosh with tab
end tell
If I could tell Excel to use a row delimiter other than CR (or LF), I'd be a happy camper, but Excel seems to allow changing only the field delimiter, not the row delimiter.
Any pointers?
Thanks,
Max
Excel's open
Looks like I just found the solution myself. I need to save the initial file as ".csv". Excel honors the line breaks properly with CSV files. Opening those via applescript works as well.
Thanks again to those who responded.
Max
The other option is to create a macro to handle the opening. Open the file for input, and then read the text into the worksheet, parsing as you need, using a Range object.
If your file has columns separated by list separators (commas, or semicolons for some non-English region settings), rename it to .csv and open it in Excel.
If your file has columns separated by TABs, rename it to .tab and open it in Excel.
Importing (instead of opening) a csv or tab file does not seem to understand line feeds in between text delimiters. :-(
Is it just one file? If so, don't import it. Just copy and paste the content of your text file into the first cell (hit F2, then paste).
If you absolutely must script this, Excel actually uses only one of those two characters (CR, LF) as the row delimiter, but I'm not sure which. Try first stripping out the LFs with an external utility (leave the CRs) and then import the file. If that doesn't work, strip out the CRs (leave the LFs) and then import it.
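To check what a quote-aware parser makes of such a file before handing it to Excel, Python's csv module can be used. A minimal sketch with a hypothetical function name; it keeps line breaks that sit inside double quotes as part of the cell value.

```python
import csv
import io


def read_quoted_rows(text: str) -> list:
    """Parse CSV text, keeping line breaks that sit inside double quotes."""
    # newline="" stops io.StringIO from translating line endings,
    # so the csv module sees embedded breaks exactly as written.
    return list(csv.reader(io.StringIO(text, newline="")))
```

Applied to the sample from the question, this yields a single row with a single cell rather than five rows.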
