How should I set up MATLAB for correct display of Russian (Cyrillic) characters on figures in Linux?

I have installed MATLAB R2008b on Ubuntu 12.04.4 LTS and Windows XP.
The system locale in Ubuntu is Unicode - en_US.UTF-8.
For compatibility with Windows, I launch MATLAB on Ubuntu with the ru_RU.CP1251 locale, using a simple launch script:
cat /opt/MATLAB_R2008b/bin/matlab-run
#!/bin/bash
export LANG="ru_RU.CP1251";
export LC_ALL="ru_RU.CP1251";
/opt/MATLAB_R2008b/bin/matlab -desktop
After that, slCharacterEncoding and feature('DefaultCharacterSet') both return the desired windows-1251, as expected.
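To make the encoding difference concrete outside MATLAB, here is a minimal Python sketch (purely illustrative, not part of the MATLAB setup; the variable names are made up). It shows that the same Russian label is a different byte sequence in windows-1251 and in UTF-8, so any component that assumes the wrong one will draw wrong characters:

```python
# Illustrative only: compare the byte representation of one label
# in the two encodings involved (CP1251 locale vs. UTF-8 system locale).
label = "x, в радианах"

cp1251_bytes = label.encode("cp1251")   # one byte per Cyrillic letter
utf8_bytes = label.encode("utf-8")      # two bytes per Cyrillic letter
print(len(cp1251_bytes), len(utf8_bytes))

# Reading CP1251 bytes with a wrong single-byte charset yields mojibake.
mojibake = cp1251_bytes.decode("latin-1")
print(mojibake)

# Reading them as UTF-8 is not even valid.
try:
    cp1251_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(valid_utf8)
```

Which wrong charset the figure renderer actually assumes is not known from the symptoms above; Latin-1 is used here only as an example of a wrong guess.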
There are many fonts on my system, and almost all of them support Russian (Cyrillic) glyphs.
Russian text displays correctly in a uicontrol:
uicontrol('String','Русский=Russian','Position',[0 0 200 200])
but not in figure labels or the title, so
x = linspace(0,2*pi,100); y = sin(x);
xlabel('x, в радианах','interpreter','none');
ylabel('y, значение sin(x)','interpreter','none');
title('y, значение sin(x)','interpreter','none');
produces wrong characters in the labels and title.
I have no idea how to fix this.
How should I set up MATLAB for correct display of Russian (Cyrillic) characters on figures in Linux?

I solved my problem.
I installed all recommended fonts - packages xfonts-100dpi, xfonts-75dpi, xfonts-cyrillic, t1-cyrillic, cm-super, ttf-freefont, gsfonts-x11.
Interestingly, these fonts work only with UTF-8; with that encoding, the following fonts display Russian (Cyrillic) text in figures correctly:
clean
free avant garde
free bookman
free chancery
free courier
free helvetian
free paladin
free schoolbook
free times
oldslavic
tahoma guap
teams
terminus
For my original problem, I found a special TTF font file that renders Russian (Cyrillic) text correctly in the CP1251 (Windows-1251) charset.
I placed this font in /usr/local/share/fonts/truetype, ran mkfontscale, mkfontdir, and fc-cache -vf, and added this location to /etc/X11/xorg.conf:
Section "Files"
FontPath "/usr/share/fonts/truetype"
FontPath "/usr/local/share/fonts/truetype"
EndSection
I installed language-pack-ru and edited /var/lib/locales/supported.d/local as follows:
en_US.UTF-8 UTF-8
ru_RU ISO-8859-5
ru_RU.CP1251 CP1251
ru_RU.KOI8-R KOI8-R

Related

Unicode character not visible while doing cat

I have a CSV file generated by a Windows system. The file is then moved to Linux. The Linux environment is NAME="Red Hat Enterprise Linux Server".VERSION="7.3 (Maipo)".ID="rhel".
When I open the file in vi, all characters are visible. For example, one line reads: "Sarah--bitte nicht löschen".
But when I cat the file, I get something like "Sarah--bitte nicht l▒schen".
This file is consumed by a DataStage application, where these Unicode characters show up as "?". Since cat does not show the character properly either, I believe the issue is on the Linux server. Any help is appreciated.
vi reads the file using the encoding given by its fenc (fileencoding) setting and displays the content according to your locale settings (the $LANG environment variable). If fenc differs from the locale encoding, vi converts between the two.
cat does no such conversion; it always outputs the exact byte stream, unchanged.
Your terminal then displays the output of both vi and cat according to the locale settings of your local PC.
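The difference can be reproduced in a few lines of Python. This is only a sketch under two assumptions that the question does not confirm: that the Windows system wrote the file as cp1252, and that the terminal expects UTF-8.

```python
# Illustrative only: the sample line stored in a single-byte Windows
# encoding (assumed to be cp1252; the question does not say which one).
line = "Sarah--bitte nicht löschen"
raw = line.encode("cp1252")             # 'ö' becomes the single byte 0xF6

# What cat does: pass the bytes through unchanged. A UTF-8 terminal
# cannot decode the lone 0xF6 and shows a replacement glyph instead.
as_seen_by_utf8_terminal = raw.decode("utf-8", errors="replace")
print(as_seen_by_utf8_terminal)

# What vi does (conceptually): decode with the file's encoding, then
# re-encode for the terminal's locale.
converted = raw.decode("cp1252").encode("utf-8")
print(converted.decode("utf-8"))
```

If those assumptions hold, converting the file once (rather than relying on each consumer to guess) is the usual way to make cat and downstream tools agree.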

Sublime Text shows "NUL" characters in build output

I've coded a simple Red "Hello world" program in Sublime Text 3:
Red []
print "Hello world!"
I've also created a build system that I'm trying to use to compile and run the program, where G:\Red Programming Language\redlang.exe is the Red programming language compiler that I downloaded from the Windows link here:
{
"shell_cmd": "\"G:\\Red Programming Language\\redlang\" \"$file\""
}
The problem is that whenever I use my build system on a saved program, a strange NUL character appears between each character of the build output.
This doesn't happen with any other build system I have installed. The output appears fine if I run the redlang.exe from the Command Prompt, so it's probably an issue with my Sublime Text setup; I'm using Sublime Text Build 3083 and Windows 10. How can I get rid of those NUL characters?
Red programs on Windows write their output in the native UTF-16LE encoding, which causes the NUL characters you are seeing, because Sublime's output capture defaults to UTF-8. You need to change this in your build system with the encoding option, as described in the Sublime build system documentation.
So you might try something like:
{
"shell_cmd": "\"G:\\Red Programming Language\\redlang\" \"$file\"",
"encoding": "UTF-16LE"
}
See the supported encodings list here. Hope this helps.
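To see why the mismatch produces exactly this symptom, here is a small Python sketch (illustrative only, independent of Sublime and Red) that encodes the program's output as UTF-16LE and then reads it back as UTF-8:

```python
# Illustrative only: UTF-16LE stores each ASCII character as the
# character byte followed by a zero byte.
text = "Hello world!"
utf16le = text.encode("utf-16-le")

# A consumer that assumes UTF-8 sees every zero byte as a NUL character,
# because ASCII bytes and NUL are both valid single-byte UTF-8.
misread = utf16le.decode("utf-8")
print(repr(misread))

# Decoding with the matching encoding recovers the original text.
print(utf16le.decode("utf-16-le"))
```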

Font in fonts.dir not available in xfontsel or gvim

I am working to add some fonts containing devicons to my $HOME dir for use in vim and gvim. vim needs the font in the terminal, so I'm trying this command and get an xterm: unable to open font <name>, trying "fixed" error:
xterm -u8 -fn '-misc-knack-bold-i-normal--0-0-0-0-p-0-iso8859-15'
I see that specified font in the fonts.dir file and I've refreshed my cache with fc-cache -f -v. fc-list shows Knack:style=NerdFontPlusOcticonsPlusPomicons but using that string yields the same result. xfontsel does NOT show this as an available font but gvim does show this font as an option.
Why does the font appear in fonts.dir (and fonts.scale) but not in xfontsel?
Why does gvim see the font but not X11?
Shell is tcsh on a Suse11 system.
This
-misc-knack-bold-i-normal--0-0-0-0-p-0-iso8859-15
is a scalable font as described in mkfontdir, because all of the sizes are zeros. xterm and xfd need sizes. You can experiment with
#!/bin/sh
FONT=`xfontsel -print`
test -n "$FONT" && xfd -fn "$FONT"
to see what sizes the font server would like to deliver for a non-scaled version of the font, or use the name from fc-list with the -fa option of xterm and xfd:
-fa pattern
This option sets the pattern for fonts selected from the
FreeType library if support for that library was compiled into
xterm. This corresponds to the faceName resource. When a CJK
double-width font is specified, you also need to turn on the
cjkWidth resource.
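As a rough illustration of the "all sizes are zeros" point, the following Python sketch splits an XLFD name into its 14 fields and checks the size fields. The helper names are made up for this example, and whether the font server can actually deliver a given size is not checked here.

```python
# Field order of an X Logical Font Description (XLFD) name.
XLFD_FIELDS = (
    "foundry", "family", "weight", "slant", "setwidth", "add_style",
    "pixel_size", "point_size", "resolution_x", "resolution_y",
    "spacing", "average_width", "registry", "encoding",
)

def parse_xlfd(name):
    """Split an XLFD font name into its 14 named fields."""
    parts = name.split("-")
    # A well-formed name starts with '-' and has exactly 14 fields after it.
    if len(parts) != 15 or parts[0] != "":
        raise ValueError("not a 14-field XLFD name: %r" % name)
    return dict(zip(XLFD_FIELDS, parts[1:]))

def is_scalable(name):
    """True when pixel size, point size and average width are all zero."""
    fields = parse_xlfd(name)
    return all(fields[k] == "0"
               for k in ("pixel_size", "point_size", "average_width"))

def with_pixel_size(name, pixels):
    """Return the name with a concrete pixel size filled in."""
    parts = name.split("-")
    parts[7] = str(pixels)   # index 7 is the pixel-size field
    return "-".join(parts)

name = "-misc-knack-bold-i-normal--0-0-0-0-p-0-iso8859-15"
print(is_scalable(name))
print(with_pixel_size(name, 14))
```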
Further reading:
X Logical Font Description (Arch wiki)
Appendix A. Specifying Fonts (SGI developer books)
Appendix A. Specifying Fonts: Scalable Fonts

Ghostscript not printing accented characters correctly

I have a Bash script that writes a text watermark to PDF files. It does this by generating an overlay PDF with Ghostscript and then using PDFtk to stamp the overlay onto the original.
All this works perfectly, except that Ghostscript is not writing accented characters correctly. If my input text is, for example, "Français", the output on the PDF will be "Franˆ§ais".
My Ghostscript command line is:
/usr/local/bin/gs -q -o "${TEMPFILE}" \
-sDEVICE=pdfwrite -sPAPERSIZE=letter \
-c "60 23 moveto 0.32 0.23 0.22 setrgbcolor /Helvetica-Oblique findfont 9 scalefont setfont (${WATERMARK}) show"
The $WATERMARK variable contains a single line of text to be written. The problem occurs both when running the Bash script that contains this line and also when I run just this command directly.
I'm seeing this problem using Ghostscript 9.06 on Mac OS X (installed via Homebrew) and 9.05 on Ubuntu 12.04 (installed from the Ubuntu package repository). The Bash script and gs command line were both written by someone else; I have no experience using Ghostscript myself.
Changing the font has no effect on the problem and I've been unable to google anything useful related to this. What are we doing wrong here?
Thanks.
You haven't encoded the font correctly (or indeed at all).
You are assuming that the character code which represents the glyph named ccedilla is the same in the font as it is on your computer system. For Latin fonts and the ASCII characters up to 127, this is usually true; for characters beyond that it usually isn't; and for non-Latin languages (e.g. Russian, Arabic, CJKV languages, etc.) it isn't true at all.
Encoding fonts isn't hard, but it is rather lengthy to go into here, so instead let me recommend the excellent series of articles written by John Deubert of Acumen Training, you can find them here:
http://www.acumentraining.com/acumenjournal.html
For your purposes I suggest the November and December 2001 articles.
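The exact garbage in the question can be explained with a short Python sketch. It assumes (this is not confirmed in the question) that the shell passes the watermark as UTF-8 and that the font uses PostScript StandardEncoding. The helper ps_latin1_string is hypothetical and only useful once the font has been re-encoded with ISOLatin1Encoding, which the articles above explain and which is not shown here.

```python
# Illustrative only: why "ç" shows up as "ˆ§".
# Assumption: the watermark arrives as UTF-8 and the font uses
# PostScript StandardEncoding.
utf8_bytes = "ç".encode("utf-8")
print([hex(b) for b in utf8_bytes])        # two bytes, not one character code

# Glyph names StandardEncoding assigns to those two codes.
STANDARD_ENCODING_EXCERPT = {0xC3: "circumflex", 0xA7: "section"}
glyphs_shown = [STANDARD_ENCODING_EXCERPT[b] for b in utf8_bytes]
print(glyphs_shown)

def ps_latin1_string(text):
    """Build a PostScript string literal with octal escapes for
    non-ASCII Latin-1 characters (hypothetical helper)."""
    out = []
    for byte in text.encode("latin-1"):
        ch = chr(byte)
        if ch in "()\\":
            out.append("\\" + ch)          # escape PostScript delimiters
        elif 32 <= byte < 127:
            out.append(ch)                 # printable ASCII passes through
        else:
            out.append("\\%03o" % byte)    # e.g. ç (0xE7) becomes \347
    return "(" + "".join(out) + ")"

print(ps_latin1_string("Français"))
```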

ghostscript fonts

I'm trying to get ghostscript to render a pdf file from a Windows box. The pdf file uses the ComicSansMS font. I've copied the comic.ttf file from my Windows7 box into my /usr/share/ghostscript/fonts directory, and I've created a Fontmap file in that same directory containing this line:
/ComicSansMS (comic.ttf) ;
As nearly as I can tell, the font is not being found despite this. The text comes out very poorly, and some of the smaller font sizes are rendered half the size they should be. Access times and strace show that the Fontmap file is being read, but the font file (comic.ttf) is not being accessed at all. There are no error messages:
hope 78$ gs cards-01.pdf
GPL Ghostscript 9.00 (2010-09-14)
Copyright (C) 2010 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
>>showpage, press <return> to continue<<
If I use -dFAPIDEBUG on the gs command line, I see the following:
hope 74$ gs -dFAPIDEBUG -I/usr/share/ghostscript/fonts cards-01.pdf
GPL Ghostscript 9.00 (2010-09-14)
Copyright (C) 2010 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
FAPIhook --nostringval--
Trying to render the font Font --nostringval-- ( aliased from ComicSansMS ) with FAPI...
Font --nostringval-- ( aliased from ComicSansMS ) is being rendered with FAPI=FreeType
FAPIhook --nostringval--
Font --nostringval-- ( aliased from ComicSansMS ) is mapped to FAPI=FreeType
FAPIhook RVJCAL+SymbolMT
Trying to render the font Font RVJCAL+SymbolMT with FAPI...
Font RVJCAL+SymbolMT is being rendered with FAPI=FreeType
FAPIhook RVJCAL+SymbolMT
Font RVJCAL+SymbolMT is mapped to FAPI=FreeType
FAPIhook HYLUQF+ComicSansMS
Trying to render the font Font HYLUQF+ComicSansMS with FAPI...
Font HYLUQF+ComicSansMS is being rendered with FAPI=FreeType
FAPIhook HYLUQF+ComicSansMS
Font HYLUQF+ComicSansMS is mapped to FAPI=FreeType
>>showpage, press <return> to continue<<
Naturally, the line from the above that most concerns me is this one:
Font --nostringval-- ( aliased from ComicSansMS ) is being rendered with FAPI=FreeType
"gs -h" shows that the font directory is, indeed, in the search path:
hope 77$ gs -h
GPL Ghostscript 9.00 (2010-09-14)
[ ... ]
Search path:
/usr/share/ghostscript/9.00/Resource/Init :
/usr/share/ghostscript/9.00/lib :
/usr/share/ghostscript/9.00/Resource/Font :
/usr/share/ghostscript/fonts : /usr/share/fonts/Type1 : /usr/share/fonts
I've tried several permutations of formatting in the Fontmap file, including:
(Comic Sans MS) (comic.ttf) ;
(ComicSansMS) (comic.ttf) ;
/Comic Sans MS (comic.ttf) ;
/ComicSansMS /comic.ttf ;
I'm fairly sure my original one is the correct one, but I was getting desperate. :-P
Any help would be greatly appreciated. Thanks in advance.
I assume that PDF does not have the ComicSansMS font embedded?
You should consider 2 other possibilities as well:
Your PDF file cards-01.pdf is somehow corrupted. (Are other PDF viewers rendering that file without a problem? Does it display OK in Acrobat Reader on Windows?)
Your fontfile comic.ttf is somehow corrupted. (Which method did you use to transfer it from Windows to Linux?)
You could try to positively prove that both these components get along well enough with each other by using Ghostscript+comic.ttf to create a PDF (with comic.ttf embedded):
gs \
-sFONTPATH=/usr/share/ghostscript/fonts \
-o comic-ttf.pdf \
-sDEVICE=pdfwrite \
-g5950x8420 \
-c "200 700 moveto" \
-c "/ComicSansMS findfont 60 scalefont setfont" \
-c "(comic.ttf) show showpage"
On Windows, use this variation of above command:
gswin32c.exe ^
-o comic-ttf.pdf ^
-sDEVICE=pdfwrite ^
-sFONTPATH=c:/windows/fonts ^
-g5950x8420 ^
-c "200 700 moveto" ^
-c "/ComicSansMS findfont 60 scalefont setfont" ^
-c "(comic.ttf) show showpage"
When I do this, I see:
gswin32c.exe ^
-o comic-ttf.pdf ^
-sDEVICE=pdfwrite ^
-sFONTPATH=c:/windows/fonts ^
-dHaveTrueTypes=true ^
-g5950x8420 ^
-c "200 700 moveto" ^
-c "/ComicSansMS findfont 60 scalefont setfont" ^
-c "(comic.ttf) show showpage"
GPL Ghostscript 9.00 (2010-09-14)
Copyright (C) 2010 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Scanning c:/windows/fonts for fonts... 423 files, 255 scanned, 240 new fonts.
Loading ComicSansMS font from c:/windows/fonts/comic.ttf... 3343720 1813337 2926116 1611207 1 done.
and my output PDF comic-ttf.pdf looks OK and does have the comic.ttf font embedded.
If this also works for you, then your Ghostscript and your comic.ttf are OK, but your PDF file cards-01.pdf is not.
I came back to this problem after a delay. Upon further investigation with a magnifying glass, the problem is different from what I initially thought.
Text is definitely being rendered incorrectly in parts of the document. Each letter is far too small, though the spacing is oddly correct. However, the individual letters are the correct shape for the font.
The font on disk is not being accessed, but that's because the fonts are all embedded within the document. This fact would probably have been obvious to a Ghostscript expert from the output I posted in the original question (I'm guessing the "HYLUQF+" prefix is the smoking gun there), but I don't work with Ghostscript much. My fonts were installed correctly, and other documents were able to access them without trouble.
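That guess about the prefix matches the PDF convention for font subsets: the name of an embedded subset font carries a tag of six uppercase letters followed by a plus sign. A minimal Python sketch (the function name is made up for illustration) that checks font names like those in the -dFAPIDEBUG output:

```python
import re

# Six uppercase letters followed by '+' mark a subset of an embedded font.
SUBSET_TAG = re.compile(r"^[A-Z]{6}\+(.+)$")

def split_subset_tag(font_name):
    """Return (tag, base_name); tag is None when no subset tag is present."""
    match = SUBSET_TAG.match(font_name)
    if match is None:
        return None, font_name
    return font_name[:6], match.group(1)

for name in ("HYLUQF+ComicSansMS", "RVJCAL+SymbolMT", "ComicSansMS"):
    print(name, "->", split_subset_tag(name))
```

A name without a tag does not prove the font is missing from the PDF; it only means this particular hint is absent.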
Of course, this still leaves the question of why my embedded fonts are being rendered incorrectly, but I will investigate that separately and/or post a different question. I maintain that the PDF file is uncorrupted (I have several other PDFs which exhibit the same problem), but I still don't know what's wrong.
@pipitas: Thanks very much for trying. You certainly did help verify that my installed fonts are not the problem. Actually, now that I look again, you even gently suggested the font might be embedded, but I either didn't see it, didn't believe it, or didn't know how to check.

Resources