How to enable my python code to read from Arabic content in Excel? - excel

I have two related problems. I'm working on an Arabic dataset in Excel. I think Excel somehow reads the contents as ؟؟؟؟؟, because when I tried to replace the character '؟' with '?', it replaced the whole text in the sheet. But when I replace or search for another letter, it works.
Second, I'm trying to edit the sheet using Python, but I'm unable to write Arabic letters (I'm using jGRASP). For example, when I write the letter 'ل' it appears as 0644, and when I run the code this message appears: "Error encoding text. Unable to encode text using charset windows-1252".

0644 is the character code of the character in hex. jGRASP displays that when the font does not contain the character. You can use "Settings" > "Font" in jGRASP to choose a CSD font that contains the characters you need. Finding one that has those characters and also works well as a coding font might not be possible, so you may need to switch between two fonts.
jGRASP uses the system character encoding for loading and saving files by default. Windows-1252 is an 8-bit encoding used on English language Windows systems. You can use "File" > "Save As" to save the file with the same name but a different encoding (charset). Once you do that, jGRASP will remember it (per file) and you can load and save normally. Alternately, you can use "Settings" > "CSD Windows Settings" > "Workspace" and change the "Default Charset" setting to make the default something other than the system default.
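
Both symptoms can be reproduced outside jGRASP. The following is a minimal Python sketch, not jGRASP's own logic, and the helper name `can_encode` is made up for illustration. It shows that 'ل' is code point U+0644, that windows-1252 (Python's `cp1252`) has no byte for it, and that UTF-8 does.

```python
def can_encode(text, charset):
    """Return True if every character in text has a byte form in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# The Arabic letter lam, written as an escape so this file stays plain ASCII.
letter = "\u0644"

code_point = f"{ord(letter):04X}"
print(code_point)                    # 0644
print(can_encode(letter, "cp1252"))  # False
print(can_encode(letter, "utf-8"))   # True
print(letter.encode("utf-8"))        # b'\xd9\x84'
```

Saving the source file as UTF-8, as described above, is what lets the literal 'ل' survive in the file itself.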

Related

Does Talend Support UTF-8 Encoding for Excel Headers?

I am new to Talend, and I have an Excel sheet with UTF-8 letters in the headers that I want to profile using Talend DQ. I was able to import the list, and in the preview everything is shown correctly; however, when I click Next, all the UTF-8 encoded letters are changed into "Column0, Column1, ...".
Any ideas on how to fix this?
Thanks!
In the advanced options of your Excel file input component there is a field that allows you to select the encoding you want to use.
The encoding in Advanced settings is for content; the header is based on field names, which must respect Java naming conventions (no accents, no special characters, and so on).
In short, if you need a header with composed UTF-8 characters, don't use the standard "Include header" option; do it yourself.

How to show characters like .。・゚゚・(>_<)・゚゚・。. (kaomoji) normally?

I grabbed some characters from this kaomoji website.
I checked the page's source and found that it is encoded in utf-8.
But when I copy them in Chrome and paste them into Vim, Vim can't show the characters correctly, even though the file's encoding is also utf-8, as shown at the bottom of the picture. Why does this happen? How can I make these characters show normally in the text file?
[screenshot: 1215200040.bmp]
The symbols should be pasted as is. The most common reason you can't see them is that the font you use doesn't have these symbols. Try using another font in Vim.

How to manually specify Byte Order Mark in CSV

I have a CSV that is encoded in Unicode but lacks a byte order mark at the start. As a result, Excel (2013) opens it without decoding it correctly (I think it assumes ASCII if no BOM is specified...), meaning that certain characters are displayed incorrectly.
From reading around, I gather that a BOM of "\uFEFF" should be entered at the start of the CSV file. I have tried opening the file in a text editor and adding the characters, e.g.
\uFEFF
r1test 1, r1text2, r1text3
r2test 1, r2text2, r2text3
However, this does not solve the problem: the characters "\uFEFF" show up in the first row when I open the file in Excel, rather than being interpreted as a BOM. I am not sure what I am doing wrong, or how the text should be written so that it is interpreted as a BOM rather than as text in the first row of the data.
I have only very limited experience using CSV, and only just heard of a BOM... so I could be implementing this completely wrong!
(For reference, I know that I could specify the encoding if I use the import data option within Excel. However, I really want to work out how to get it correctly specified in advance so that I can just open the CSV. I have several thousand of these files that I am creating and exporting; once I know how to do this 'manually' [i.e. by adding some text at the start of the file], I can configure Python to do it automatically.)
Thanks in advance
For someone else wanting to tell Excel to add a BOM: See if you can "Save as Unicode Text".
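
Typing `\uFEFF` into a text editor adds six ordinary characters, which is why Excel shows them as data. The BOM is the single character U+FEFF, stored in UTF-8 as the bytes EF BB BF. Since the files are generated from Python, one way to write it is Python's `utf-8-sig` codec, which emits those bytes before the first character. A minimal sketch, assuming the CSVs are meant to be UTF-8; the function names and sample rows are made up for illustration:

```python
import codecs
import csv
import os
import tempfile


def write_csv_with_bom(path, rows):
    """Write rows as CSV; the utf-8-sig codec prepends the UTF-8 BOM."""
    with open(path, "w", encoding="utf-8-sig", newline="") as handle:
        csv.writer(handle).writerows(rows)


def starts_with_bom(path):
    """Return True if the file begins with the bytes EF BB BF."""
    with open(path, "rb") as handle:
        return handle.read(3) == codecs.BOM_UTF8


# Hypothetical sample data with one non-ASCII cell.
sample_rows = [["r1text1", "r1text2", "r1text3"],
               ["r2text1", "caf\u00e9", "r2text3"]]

with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "demo.csv")
    write_csv_with_bom(demo_path, sample_rows)
    has_bom = starts_with_bom(demo_path)

print(has_bom)  # True
```

Reading such a file back with `encoding="utf-8-sig"` strips the BOM again, so it does not leak into the first cell.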

String unknown Eclipse

I've changed the encoding in Eclipse, but all my strings with special characters now look like this: "�". The old encoding was the default (Cp1252); now it is UTF-8. How can I fix the strings with special characters?
Thanks.
Well, imagine you switch your brain to only understand Chinese. Could you read an English text anymore?
You changed the way Eclipse interprets the bytes of your source code. So you need to convert the source code from Cp1252 to UTF-8.
I don't know if Eclipse is able to do this, but Notepad++ is.
Open the Java file whose encoding you want to change in Notepad++.
Click on Encoding
Select Convert to UTF-8
Save the file
When you now click on Encoding again, there should be a dot in front of Encode in UTF-8
Edit: Notepad++ recognizes the encoding in use, so you can read the text there. Copying and pasting from Notepad++ to Eclipse won't work, because you copy the same string Eclipse couldn't read. You have to change the encoding of the string.
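
The same conversion can be scripted when there are many files. A minimal Python sketch, assuming the files really are Cp1252 (the function name and the demo file are made up for illustration): it decodes the raw bytes as Cp1252 and writes the same text back as UTF-8.

```python
import os
import tempfile


def convert_cp1252_to_utf8(path):
    """Re-encode one file in place from Cp1252 to UTF-8."""
    with open(path, "rb") as handle:
        raw = handle.read()
    text = raw.decode("cp1252")  # interpret the bytes the old way
    with open(path, "wb") as handle:
        handle.write(text.encode("utf-8"))  # store the same text the new way


# Demo on a throwaway file containing one special character (e with acute).
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "Demo.java")
    with open(demo_path, "wb") as handle:
        handle.write('String s = "caf\u00e9";'.encode("cp1252"))
    convert_cp1252_to_utf8(demo_path)
    with open(demo_path, "rb") as handle:
        converted = handle.read()

print(converted)  # b'String s = "caf\xc3\xa9";'
```

Run it only once per file: a second run would treat the UTF-8 bytes as Cp1252 and garble the special characters.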

VB6 text appears as gibberish in one EXE but Hebrew in another

I have a strange problem and minimal knowledge of VB6.
I have an EXE file that takes text (for button captions) from a text file.
This EXE, whose source code I don't have, displays all Hebrew text correctly.
I have built another EXE file (identical to the previous one), and all the functionality works, but it displays all Hebrew as gibberish!
My text file's encoding is Unicode.
Can you help me? Is there an encoding setting for a VB6 EXE?
The default VB form/control fonts do not support all "foreign" characters.
If you set the font at runtime to "MS Shell Dlg" then Windows will translate this to the default UI font for that version of Windows which should handle most languages.
You'll also need to check the encoding of the file. If it's UTF-8 or a specific code page, then you'll need to use the MultiByteToWideChar() function to convert to UTF-16 for use in VB after reading it.
The solution that resolved this issue is to change Font.Charset to 177, which represents HEBREW_CHARSET.
For example:
Text1.Font.Charset = 177
http://www.example-code.com/vb/vb6-display-unicode.asp
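
Why a charset setting can turn gibberish into Hebrew can be illustrated outside VB6. HEBREW_CHARSET (177) corresponds to the Windows Hebrew code page 1255. The sketch below is in Python rather than VB6 and assumes the caption bytes are in that code page, which the thread does not confirm. It reads the same four bytes once with the Western code page 1252 and once with 1255.

```python
# The Hebrew word "shalom", written as escapes so this file stays plain ASCII.
hebrew = "\u05e9\u05dc\u05d5\u05dd"

raw = hebrew.encode("cp1255")      # four bytes, one per letter
as_western = raw.decode("cp1252")  # same bytes read as Western European
as_hebrew = raw.decode("cp1255")   # same bytes read as Hebrew

print(raw.hex())             # f9ece5ed
print(as_western == hebrew)  # False: accented Latin letters instead
print(as_hebrew == hebrew)   # True
```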
