Situation:
The software application I am using uploads data using a .txt file. The application supports Arabic using UTF-8.
Problem:
I create an Excel document with the appropriate columns and rows and enter an Arabic value into one of the cells. When I choose Save As > Tab Delimited Text (.txt), Excel saves the file, but the Arabic text originally in the file is replaced with "__".
Question:
How can I create a .txt file in Excel that properly saves the Arabic text? Is it possible?
You cannot do this with Save As, because the Tab Delimited Text export does not preserve the Arabic characters. The SOLUTION is to press Ctrl+A in Excel to select all the text, copy it (Ctrl+C), then go to Notepad or WordPad and paste the whole selection there.
Related
I copy column A from Excel to Notepad++, do some editing,
then copy from Notepad++ back to Excel.
However, when it is pasted back into Excel, the data expands into columns B and C.
As a workaround I tried Microsoft Word: I copy from Notepad++ to Word, then copy from Word to Excel, and the data stays within column A.
I have many files, so it would be best to copy only between Notepad++ and Excel.
So how can I keep the data in column A only when copying from Notepad++ to Excel?
Thanks in advance for any help.
Export your text from Excel as a CSV file.
Make your changes in the exported CSV file. While editing, keep the CSV syntax intact.
Import your text from the CSV file back into Excel.
I strongly recommend explicitly selecting UTF-8 encoding when exporting and importing the CSV. If you look closely, that option is available in both export and import. Otherwise there is a risk of damaging special characters (national characters, symbols, etc.) contained in the file.
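If the edits are repetitive, a script can stand in for the manual editing step. Here is a minimal sketch in Python, assuming the exported file really is UTF-8 and that only the first column needs changing. The names `edit_first_column` and `transform` are made up for illustration.

```python
import csv
import io


def edit_first_column(raw: bytes, transform) -> bytes:
    """Apply `transform` to the first cell of every row of a UTF-8 CSV."""
    # "utf-8-sig" accepts UTF-8 input with or without a byte order mark.
    text = raw.decode("utf-8-sig")
    reader = csv.reader(io.StringIO(text, newline=""))

    out = io.StringIO(newline="")
    writer = csv.writer(out)  # re-quotes cells that contain commas or newlines
    for row in reader:
        if row:
            row[0] = transform(row[0])
        writer.writerow(row)

    # Write a byte order mark so the result is marked as UTF-8.
    return out.getvalue().encode("utf-8-sig")


sample = '"é, b"\r\nc\r\n'.encode("utf-8")
edited = edit_first_column(sample, str.upper)
print(edited.decode("utf-8-sig"))
```

Because the `csv` module keeps the quoting around a cell that contains a comma, that cell should stay a single value when the file is imported again.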
I am trying to make a CSV file from an Excel file. It has English, Korean and Japanese inputs. Right now it's saved as file.xlsx.
But when I try to Save As CSV through Excel as file.csv, all the Korean and Japanese inputs turn into question marks (???????).
I tried importing into Google Spreadsheets and exporting out as csv from there (from reading some other solutions) but it still turns into question marks.
I tried building a CSV file from scratch and just copying/pasting values from the Excel file into the CSV, but after I save it as CSV, the characters always get corrupted.
Does anybody know how to work around this? Thank you.
I don't know that there IS an answer for this. A CSV file does not record which encoding it uses, so the characters get lost when you save in that format.
As a test, I tried saving Chinese characters as a Unicode Text file, and believe it or not, that worked. So you may be able to do that and simply change the file extension to .csv, assuming for some reason you NEED the file to be named .csv.
EDIT: I just ran additional testing on this. I was able to reimport the file with either the .txt or the .csv extension, and the characters stayed just fine. So I think Unicode Text is your answer.
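To illustrate what is probably happening, here is a short Python sketch. It rests on an assumption on my part: that the plain CSV format is written in the system's legacy "ANSI" code page, taken here to be Windows-1252 (yours may differ). Characters outside that code page become question marks, while a Unicode encoding keeps them.

```python
text = "한국어,日本語,English"

# Assumption: the legacy code page is Windows-1252 (cp1252).
# errors="replace" substitutes "?" for every character it cannot encode.
legacy_bytes = text.encode("cp1252", errors="replace")
print(legacy_bytes.decode("cp1252"))  # ???,???,English

# UTF-8 and UTF-16 can represent every character, so nothing is lost.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16")
print(utf8_bytes.decode("utf-8") == text)    # True
print(utf16_bytes.decode("utf-16") == text)  # True
```

Once the question marks are in the file, the original characters are gone, which would explain why re-exporting the damaged CSV through another tool does not bring them back.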
Simply opening a CSV file in Excel only works when default assumptions hold. You may be writing the CSV correctly but not validating it properly.
It is more reliable to open a blank worksheet and then use Data Import. The encoding of the CSV file is one of the parameters you can specify.
To fully retain the characters while saving in CSV format, and to be able to import and reuse the data in the future, you can follow these steps.
In Microsoft Excel, open the *.xlsx file.
Select Menu | Save As.
Enter any name for your file.
Under "Save as type," select Unicode Text.
Click Save.
Open your saved file in Microsoft Notepad.
Replace all tab characters with commas (",").
Select a tab character (select and copy the space between two column headers)
Open the "Find and Replace" window (press Ctrl+H) and replace all tab characters with a comma.
Click Save As.
Name the file, and change the Encoding: to UTF-8.
Change the file extension from .txt to .csv.
Click Save.
Open the .csv file in Excel to view your data.
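The same steps can also be scripted. Here is a rough sketch in Python, assuming the Unicode Text file is tab-delimited UTF-16 with a byte order mark (that is what I would expect Excel to write, but check your own file). The function name `unicode_text_to_utf8_csv` is made up for illustration.

```python
import csv
import io


def unicode_text_to_utf8_csv(raw: bytes) -> bytes:
    """Convert tab-delimited UTF-16 bytes into comma-separated UTF-8 bytes."""
    # The "utf-16" codec reads the byte order mark to pick the byte order.
    text = raw.decode("utf-16")
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t")

    out = io.StringIO(newline="")
    writer = csv.writer(out)  # comma-separated, quotes cells only when needed
    writer.writerows(reader)

    # "utf-8-sig" prepends a byte order mark to mark the file as UTF-8.
    return out.getvalue().encode("utf-8-sig")


# Stand-in for the bytes of a file saved as Unicode Text.
sample = "名前\t住所\r\n田中\t東京, 日本\r\n".encode("utf-16")
result = unicode_text_to_utf8_csv(sample)
print(result.decode("utf-8-sig"))
```

Unlike a plain find-and-replace of tabs, the `csv` module quotes cells that themselves contain a comma, so "東京, 日本" stays in one column. For a real file you could read the bytes with `pathlib.Path(...).read_bytes()` and save the result with `write_bytes()`.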
Had the same issue. The article below shows the workaround in detail:
https://help.salesforce.com/articleView?id=000003837&type=1
However, I decided to go with LibreOffice Calc, as it requires fewer steps to achieve the desired outcome. While exporting, you get to select the character set, field delimiter and text delimiter.
For all other tasks, I prefer Excel.
Download and install the Unicode CSV Addin for Excel.
Save the CSV from the new "Unicode CSV" menu.
I have an Excel document with a data table containing Chinese characters. I am trying to export this Excel spreadsheet to a CSV file for importing into a MySQL database.
However, when I save the Excel document as a CSV file, Notepad displays the resulting CSV file's Chinese characters as question marks. Importing into MySQL preserves the question marks, completely ignoring what the original Chinese characters are.
I suspect this may have to do with how Excel handles UTF-8 encoding. Thanks for your help!
The following method has been tested and used to import CSV files in MongoDB, so it should work:
In your Excel worksheet, go to File > Save As.
Name the file and choose Unicode Text (*.txt) from the drop-down list next to "Save as type", and then click Save.
Open the Unicode .txt file using your preferred text editor, for example Notepad.
Since our unicode text file is a tab-delimited file and we want to convert Excel to CSV (comma-separated) file, we need to replace all tabs with commas.
Select a tab character, right-click it and choose Copy from the context menu, or simply press CTRL+C.
Press CTRL+H to open the Replace dialog and paste the copied tab (CTRL+V) in the Find what field. When you do this, the cursor will move rightwards indicating that the tab was pasted. Type a comma in the Replace with field and click Replace All.
Click File > Save As, enter a file name and change the encoding to UTF-8. Then click the Save button.
Change the .txt extension to .csv directly in Notepad's Save As dialog and choose All files (*.*) next to Save as type.
Open the CSV file from Excel by clicking File > Open > Text files (*.prn, *.txt, *.csv) and verify that the data is okay.
As far as I know Excel doesn't save CSV files in any Unicode encoding. I have had similar issues recently trying to export a file as CSV with the £ symbol. I had the benefit of being able to use another tool altogether.
My version of Excel 2010 can export in Unicode format via File > Save As > Unicode Text (.txt), but the output is a tab-delimited, UCS-2 encoded file. I don't know MySQL at all, but from a brief look at the specifications it appears to handle tab-delimited imports and UCS-2. It may be worth trying this output.
Edit: Additionally, you could always open this Unicode output in Notepad++ and convert it to UTF-8 (Encoding > Convert to UTF-8 without BOM), and possibly replace all tab characters with commas too (use the Replace dialog in Extended search mode, with \t in the Find box and , in the Replace box).
You might want to try Notepad++; I doubt Notepad will support Unicode characters.
http://notepad-plus-plus.org/
For some people this solution may work: https://support.geekseller.com/knowledgebase/utf-8/
When saving as CSV, go to Tools in the lower right of the Save As dialog, then Web Options > Encoding > Unicode (UTF-8).
Or follow this SO answer and just use Google Sheets to save the CSV as Unicode:
Excel to CSV with UTF8 encoding
I tried all of the above methods on my data, but none of them quite worked (Simplified Chinese, over 700 MB). I tried Chinese and English Windows systems, and English and Chinese Excel. Excel on Windows does not seem to be able to save as UTF-8 even though it claims to. I specified UTF-8 CSV in Save As, but when I used 'open sheet' to detect the encoding, it was neither UTF-8 nor GB*.
Here is my final solution.
(1) Download 'open sheet'.
(2) Open the file with it. You can scroll through the encodings until you see the Chinese characters displayed correctly in the preview window.
(3) Save it as UTF-8 (if you want UTF-8).
PS: You need to figure out the default encoding on your system. As far as I know, Ubuntu deals with UTF-8 fine, but the Windows default for Simplified Chinese starts with GB*. Even if you encode the file as UTF-8, you still might not be able to open it correctly. In my case, R could not open my UTF-8 CSV, but could open the GB* encoding.
This method works well even if your file is very large.
Other workarounds are Google Sheets (but the file size can be limited) and Notepad++, which also works for smaller files.
You can detect the encoding by opening your file and scrolling through the encodings until you see the Chinese displayed correctly.
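The trial-and-error detection can be roughed out in code too. Here is a minimal Python sketch, assuming the only candidates are UTF-8, UTF-16 with a byte order mark, and GB18030 (a superset of the GB* family). `guess_encoding` is a made-up name, and a successful decode is only a hint, not proof, so still look at the decoded text.

```python
import codecs
from typing import Optional


def guess_encoding(raw: bytes) -> Optional[str]:
    """Return the first candidate encoding that decodes `raw` without error."""
    # A byte order mark is the most reliable clue, so check it first.
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # Strict UTF-8 rejects most legacy-encoded text, so try it before GB18030.
    for name in ("utf-8", "gb18030"):
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


print(guess_encoding("中文".encode("gb18030")))  # gb18030
print(guess_encoding("中文".encode("utf-8")))    # utf-8
```

For a very large file it should be enough to test a sample of the bytes, although a sample that ends in the middle of a character can fail to decode even in the right encoding.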
You should save the CSV file with:
df.to_csv(file_name, encoding = 'utf_8_sig')
instead of:
df.to_csv(file_name, encoding = 'utf-8')
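The difference between the two encodings can be seen without pandas. In this small sketch using only the standard library, the sample row is made up. It shows that `utf_8_sig` writes the same bytes as `utf-8` plus a three-byte byte order mark at the start, which, as far as I know, is what Excel looks for to recognise the file as UTF-8.

```python
import codecs

row = "名称,数量\r\n"

with_bom = row.encode("utf_8_sig")
without_bom = row.encode("utf-8")

# The only difference is the byte order mark EF BB BF at the start.
print(with_bom[:3] == codecs.BOM_UTF8)  # True
print(with_bom[3:] == without_bom)      # True
```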
My application needs to pass data back and forth with Excel via text files. My text files will contain Unicode text, and will also need some way of indicating multiple lines within a cell (which I believe is the LF character (ASCII 10)).
Excel can read my csv file correctly. However, when I save the csv file in Excel, it replaces the Unicode characters with ?'s. So although it still looks fine in Excel, if I close Excel and re-open the file with Excel, I see ?'s instead of my Unicode characters.
If instead of Excel saving as csv, I save as Unicode text, that produces a tab-delimited file that does have the Unicode characters. However, if I close the file and re-open it with Excel, it takes me through an import wizard that does not recognize the LF character (produced by alt-enter) to indicate a new line within a cell. Instead, it treats the LF as a new row.
How can I get Excel to save in a text format that supports both Unicode and multiple lines within a cell?
To get around this problem, do not open the .txt file from within Excel. Instead, right-click the file in File Explorer and choose Open with > Excel.
If you save the .txt file with an .xls extension, you can double-click the file in File Explorer to open it in Excel.
To Open From Excel
Click File/Open...
Select the .txt file to open.
Hold down the Shift key when clicking on the Open button.
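On the application side, here is a small Python sketch of how a line feed inside a cell is usually represented in delimited text: the cell is wrapped in double quotes so the LF is not taken as a row break. The sample rows are made up, and whether Excel honours the quoting when it opens the file depends on how the file is opened.

```python
import csv
import io

rows = [
    ["id", "note"],
    ["1", "first line\nsecond line"],  # LF inside a cell, as Alt+Enter produces
    ["2", "Ünïcödé"],
]

buf = io.StringIO(newline="")
csv.writer(buf).writerows(rows)  # the multi-line cell gets quoted
csv_text = buf.getvalue()

# Reading it back keeps the embedded LF inside the cell.
parsed = list(csv.reader(io.StringIO(csv_text, newline="")))
print(parsed == rows)  # True

# For a file on disk, encode as UTF-8 with a byte order mark.
csv_bytes = csv_text.encode("utf-8-sig")
```

The same idea applies to a tab-delimited file by passing `delimiter="\t"` to both `csv.writer` and `csv.reader`.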
I am encountering what I believe to be a strange issue with Excel (in this case, Excel 2007, but maybe also Excel 2003, but don't have access to it as I write this).
I can reliably convert some server data over into a tab-delimited format (been doing this for years) and then open it using Excel - no issue.
However, what seems to be happening is that if I have an HTML <table> inside one of the fields, Excel 2007 thinks it should convert the table into rows and columns inside Excel (not what I want). As you might imagine, this throws off the entire spreadsheet.
So the question is: is there any way to set up Excel to NOT do this (perhaps some setting in Excel that pertains to reading tab-delimited files), or am I missing something?
Thanks.
Save your file as .txt.
Now open the file in Excel using drag and drop (rather than double-clicking your hokey .xls).
Slightly more work to open the file, but your tab text formatting will now be respected.
When you open the tab-delimited file, you are shown an import mapping dialog that lets you pick each column's data type (date, text, currency, etc.). For the columns that contain HTML data, choose text. This tells Excel to import the data as-is and not try to automatically parse it into a derived format.
Excel 2003 does the same. I don't think there is a way to do it with a configuration setting, because Excel finds delimiters in the HTML table and breaks the HTML into cells and columns just as it does for the other columns.
If the column containing HTML is always the same, you can use JYelton's suggestion of renaming the file as .csv and record a small VBA macro that loads the file and automatically selects text for the HTML column in the import mapping dialog. You then load the file by calling the macro instead of double-clicking the file.
If nothing else, import it into OpenOffice.org Calc, save as an .xls file, then open in Excel.