JavaME internationalization (i18n)

Does anyone have experience with internationalization in JavaME? I'm looking for as much information as possible: examples, experiences and maybe some best practices.
Thanks

A few thoughts. J2ME doesn't support i18n, as the API support is not there (you can't use resource bundles). But we can do it to a limited extent. Here is what I found out.
It is difficult, if not impossible, to support both English and, say, Chinese (logographic characters) in a given J2ME app. It is easier to support English and, say, Spanish (I forgot the correct nomenclature for talking about i18n support, but you get the idea).
We can keep all strings in one config class, so that this one class can be swapped out for different languages.
We can have the text/strings downloaded from a server on the initial launch of the app, and thus have the ability to swap them out from the server.
Because of different screen sizes, it is best to work with custom fonts so that code can be written to calculate the text length while displaying it. This will make multiple language support easier.
Image assets can also be downloaded from the server based on the language. But I don't think we can change the midlet icon, so it should be generic.
With this in mind it is possible to design multiple language support.

omemuhammed's answer is excellent coverage of localization in the mobile space.
I've only had to support EFIGS (English, French, Italian, German, and Spanish). We stored all strings in XML, with one XML pack per language. We would then compile these XML packs into proprietary binary data, which gave us the ability to build all 5 languages into a build, to ship only one language where application size was tight, or to download the binary from a server.
Another consideration with localization is screen layout. I also recommend custom fonts in order to have better control of the display across many different devices. You will need some auto-wrapping code to adjust to different screen resolutions and aspect ratios, and you will need a way to handle strings that run off the screen on some devices. Either paging or scrolling would be a good solution.
Finally, just know that German will screw up your formatting. Try to allow 20-30% padding in English for menus and other UI elements, as the German translations will be much longer than the other languages.
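To make the auto-wrapping idea concrete, here is a minimal sketch of a greedy word wrap, written in Python for brevity rather than as device code. The `wrap_text` name and the `measure` parameter are made up for this illustration; `len()` stands in for whatever width function a custom font would provide.

```python
def wrap_text(text, max_width, measure=len):
    """Greedy word wrap: pack words into lines no wider than max_width.

    `measure` returns the rendered width of a string. len() is only a
    stand-in for a real custom-font width function.
    """
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if measure(candidate) <= max_width or not current:
            # Either it fits, or this single word is wider than the line:
            # keep it alone and let the caller scroll or page.
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


print(wrap_text("Start a new game or load a saved one", 12))
print(wrap_text("Einstellungen speichern", 12))
```

A word that is wider than the line (common with German) is kept on its own line instead of being split, which is where the scrolling or paging fallback would take over.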

See the actual internationalization spec for JavaME: http://www.jcp.org/en/jsr/detail?id=238
Recent Symbian phones should support this.
One obvious piece of advice is to actually try your application on a localized phone: get a phone from Switzerland (it should support at least 4 languages) and another from Hong Kong (with 3 different versions of "Chinese"). It might be worth looking into Eastern Europe and the former USSR too.
When the actual characters aren't your usual ASCII ones, you do need to use a TextBox or TextField in order to have the localized native control on the screen.

Keep in mind that when you use RTL (right-to-left) languages, like Arabic, you should mirror the position of almost everything on the screen. For example, a list would look like this in Latin-script languages:
List item 1
List item 2
List item 3
but the bullets would be on the other side of the screen in Arabic (I tried to show it here, but I couldn't generate an inverted list :P).
One other thing: it is better to store your strings in a class than in a plain text properties file, as the latter may cause errors interpreting the Unicode of some languages, depending on the OS and text editor you are using.
What I usually do is have an I18nManager class that stores the language in startApp, and through this class I get all the strings I need.
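As a rough illustration of that manager idea, here is a small sketch in Python (the same shape would map onto a Java class in a midlet). The method names, the string keys and the fallback behaviour are assumptions for the example, not something taken from the answer above.

```python
class I18nManager:
    """Holds one string table per language and falls back to a default."""

    def __init__(self, tables, default_lang="en"):
        self._tables = tables
        self._default = default_lang
        self._lang = default_lang

    def set_language(self, lang):
        # Unknown languages fall back to the default instead of failing.
        self._lang = lang if lang in self._tables else self._default

    def get(self, key):
        table = self._tables[self._lang]
        if key in table:
            return table[key]
        # Missing translation: use the default language, then the key itself.
        return self._tables[self._default].get(key, key)


# Hypothetical string tables; in a real app these would live in code or
# in a pack downloaded from the server.
STRINGS = {
    "en": {"menu.start": "Start", "menu.exit": "Exit"},
    "es": {"menu.start": "Iniciar"},
}

i18n = I18nManager(STRINGS)
i18n.set_language("es")          # would be called once at startup
print(i18n.get("menu.start"))    # Iniciar
print(i18n.get("menu.exit"))     # Exit (falls back to English)
```

Falling back to the default language keeps a half-translated build usable, at the cost of mixed-language screens.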

Related

as/400: other way for display graphics?

I'm aware of the existence of DDS files which allow programming of display graphics on the as/400, but is there another way?
Specifically, what I want to do is manipulate the terminal buffer directly to be able to display something other than just text.
For example, the terminal looks like this:
Let's say, in memory, there would be a two dimensional char array: text[20][80] for the text menu and lower than that, there would be a pixel buffer array of size [200][800].
Is there a way to access either of those arrays directly?
I would like to be able to create a displayable menu entirely in C without the need of a display file and also display other kind of graphics (images) directly in the pixel buffer.

"Is there a way to access either of those arrays directly?"
That's easy enough, though a "display file" that has no formatted fields will still be needed. The 'file' will be the connection between the program and the physical device (or the emulator). You can define a single large area that contains whatever "text" you want your program to put into it. This can even include display field attributes that delimit input areas.
For the most control, the DDS USRDFN keyword is appropriate. But for simple stuff like lists of menu items, almost any large text field can be output to.
Outputting simple text is easy. For detailed stuff like USRDFN formatting, detailed understanding of the 5250 protocol is needed.
One kind of alternative would be to use User Interface Manager (UIM) APIs to update a PANEL's "text area" (:TEXT) via its USREXIT= application program. The UIM handles everything as far as any "display file" definition and actual I/O goes. The UIM can be thought of as a HTML interface for 5250 and uses a very similar markup language to define PANELs.
Another alternative is the Dynamic Screen Manager (DSM) APIs. These give much finer control than the UIM or DDS methods (though DDS USRDFN gets very close). But as with USRDFN, actual device control will require 5250 protocol knowledge.
"...and also display other kind of graphics (images) directly in the pixel buffer."
There is no "pixel buffer" for 5250 nor even 'pixels'. It's a character-based protocol, like telnet. If you're going for images or 'pixels', you're into browser interfaces, or perhaps Java and NAWT, or X-windows, etc.
Now, granted that with TCP/IP and sockets, you can do essentially anything that you're able to program. Whatever you can figure out how to do, including downloading/installing 3rd-party code libraries, you can do -- within the network restrictions surrounding your server. But it is in fact a server, so GUI kinds of apps generally shouldn't run on it. That's the same as for almost all types of servers. Code the GUI on the client system rather than the server. But you can do it if you really want to.

I'm not sure why you'd want to do this...
Nowadays, it'd be much easier to simply generate your output as HTML and serve it up via the integrated Apache web server.
But if you really want to do graphics via 5250, it can be done...theoretically at least. In 20+ years on the platform, I've never seen it.
But way back when (1994?), IBM added support for Graphical Data Display Manager (GDDM) and Presentation Graphics APIs into OS/400. "GDDM is a means of displaying, printing, or plotting pictures. Presentation Graphics routines are a means of displaying, printing, or plotting business charts."
The support is still in the OS. However, client side support is NOT available in IBM i Access for Windows or the most recently released client, IBM Access Client Solutions (ACS). It appears that the standalone IBM Personal Communications product may support GDDM.
For complete control of the character buffer, take a look at the Dynamic Screen Manager (DSM) APIs. The DSM APIs are "a set of screen I/O interfaces that provide a dynamic way to create and manage screens for the Integrated Language Environment® (ILE) high-level languages. Because the DSM interfaces are bindable, they are accessible to ILE programs only."

There is a way to do it in ILE C/C++. This was very fun to investigate since I haven't tried it myself.
The only documentation on it (page 183+) I could find is from 5.1, but you are able to cross reference the functions used to this 7.3 manual (possibly page vii/7) to see if they're still used the same.
Hope this helped!

Will an English CAPTCHA be an issue for people in other countries?

What if I have a captcha that displays a series of English characters. Will people who don't speak English have trouble interpreting and/or typing these characters? If this is the case then what is the best solution for an internationalized captcha?

Since 99% of URLs are in regular ASCII, I don't think you will have a problem. After all, how would they get to Google or Yahoo if they couldn't type the URL?
That said, I have on occasion run across Chinese characters used in captchas.

Image-based CAPTCHA has two main advantages over text-based CAPTCHA:
International
Harder to solve algorithmically (see PWNtcha - captcha decoder)
There are several flavors, such as:
Classification: see Captcha The Dog, KittenAuth, Microsoft Asirra
3D projection: see 3D images: A human way to create Captchas and 3D-based Captchas become reality
Detection: see Image-Based CAPTCHA from Confident Technologies and Pic-Capture
Rotation: see A Dynamic, User-Friendly Captcha With Pictures
Puzzle: see Key Captcha

It would be a problem for users with a native, non-Latin keyboard layout, for example Russians and Greeks. They would be forced to switch keyboard layout just to fill in a security question.
Another issue is being able to recognize the words at all: somebody who doesn't speak English could have huge problems getting a word right. Even I sometimes do (for less common words), although I am quite proficient...
In other words, don't make this mistake; your application should be easy to use for all users.

It's definitely a concern. Dictionary-based CAPTCHAs should ideally adapt to the user's language preferences and ask them to recognize words that match their language preferences and by extension the character set they are most familiar with.
But in the absence of such internationalization, I would say that numerals and mathematical expressions are the most universal solution, and for word-based CAPTCHAs a random series of ASCII characters (which being random would be culture-neutral) would be the most accessible as pretty much any user around the world has the ability to enter these characters even if some have to switch their input method.
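A minimal sketch of those two "universal" options, random ASCII strings and simple arithmetic, could look like the following. All names are invented for the example, and it only produces the challenge text; rendering it as a distorted image is left out. Note that Python's default `random` module is not cryptographically secure, so a real deployment would pass in something like `secrets.SystemRandom()` as `rng`.

```python
import random
import string

# Leave out characters that are easy to confuse once distorted.
ALPHABET = "".join(c for c in string.ascii_uppercase + string.digits
                   if c not in "O0I1")


def random_ascii_challenge(length=6, rng=random):
    """Return a random, culture-neutral string of ASCII letters and digits."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def arithmetic_challenge(rng=random):
    """Return (question, answer) for a small addition problem."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"{a} + {b} = ?", str(a + b)


def check(expected, typed):
    # Ignore case and surrounding whitespace so input-method quirks matter less.
    return typed.strip().upper() == expected.upper()


print(random_ascii_challenge())
print(arithmetic_challenge())
```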
Now where it really gets tricky is providing accessibility alternatives for visually impaired users. Making a universal audio CAPTCHA seems pretty much impossible (you could consider a set of universally recognized sounds instead of spoken words, but I doubt this would provide sufficient security). And internationalized (multilingual) spoken word generation is far from trivial.

No, because English captchas are ASCII -- ASCII is always available, even if people have a Japanese, Chinese, or Russian keyboard. So this should not be a problem! And image based captchas only require the person to read the letter - and that should be possible for anybody on the web who can see, as SQLMenace pointed out.
The other way around is a problem though.
Google's reCaptcha has a little icon where the user can get a different captcha if for some reason the captcha is not readable or contains foreign characters.
I would recommend that you use Google's reCaptcha, rather than implementing it yourself.
Added Benefit:
Google's reCaptcha is also available for other languages btw. http://www.google.com/recaptcha/faq
which makes it possible for you to internationalize the captcha for the user's default locale.
EDIT:
There is a work-around for Google's reCaptcha to work with flash!
Check here:
http://groups.google.com/group/recaptcha/browse_thread/thread/e22d7e3c91bcc9db

Sure they are a problem. Would a Russian captcha be a problem for you? What about a Chinese one?
The URLs are indeed ASCII, but that is only relevant for geeks.
Regular people go to Google, type some text in their own language, and then click on one of the results. They never get to type a URL.

Yes, this could represent a problem to a small percentage of users. Is it a large enough problem to take into consideration when building the UI for your site to better the UX? That's up to you. If it were up to me, probably not.
To point you in the right direction though, I would use Google's reCAPTCHA. It serves a great cause and works like a charm. There's also a great API where you can customize the language that it displays. You could use PHP to detect the user's country and write some code to change the settings so it displays in their native language.
Here's a sample of changing reCAPTCHA's language. "fr" is French!
<script type="text/javascript">
var RecaptchaOptions = {
    lang: 'fr'
};
</script>
Google reCAPTCHA's API:
http://code.google.com/apis/recaptcha/docs/customization.html#i18n
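For the server-side detection step, one option is to read the browser's Accept-Language header rather than guess from the country. The sketch below is in Python instead of the PHP mentioned above, and the function name and the set of supported codes are placeholders; the returned code is what would be written into the `lang` option.

```python
def pick_language(accept_language, supported, default="en"):
    """Pick the best supported language from an Accept-Language header value."""
    candidates = []
    for part in accept_language.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Compare on the primary subtag only: "fr-CA" counts as "fr".
        candidates.append((quality, tag.strip().split("-")[0].lower()))
    # Highest quality first; the sort is stable, so header order breaks ties.
    for quality, lang in sorted(candidates, key=lambda c: -c[0]):
        if quality > 0 and lang in supported:
            return lang
    return default


# Placeholder list of languages the widget is assumed to offer.
SUPPORTED = {"en", "fr", "de", "ru"}
print(pick_language("fr-CA,fr;q=0.9,en;q=0.8", SUPPORTED))
```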

I believe that the 26 letters of the English alphabet can be typed in something like 90% of the world. We have Chinese, Japanese, Cyrillic and Arabic users; however, all of them have the possibility of switching to an English keyboard within their operating systems.
We have no diacritics in English, which makes everything a lot easier and our system more easily adaptable all over the world. Everyone can type ASCII, and they are able to switch to their own region-specific or language-specific characters.

Brazilian Portuguese website to support Russian, Mandarin and Japanese

We have a website in Brazilian Portuguese developed using ColdFusion (for the user interface), Hibernate (for the business logic) and an Oracle database.
If we want to support Russian, Mandarin and Japanese, what concerns should we have?
Thanks in advance.

The main consideration is to make sure everything (and I mean everything: OS, shell, web server, app server, database, editors) is configured to use UTF-8 or Unicode by default.
If you expect a lot of Asian users, it's slightly better to use UTF-16, as most Chinese characters fit into a single 16-bit UTF-16 code unit but take up 24 (or occasionally 32) bits in UTF-8.
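The size difference is easy to check for yourself. This short sketch (the helper name and sample words are just for illustration) counts the encoded bytes of the same kind of text in both encodings:

```python
def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 byte count) for a string."""
    # "utf-16-le" is used so the byte-order mark is not counted.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


samples = {
    "English": "information",
    "Russian": "информация",
    "Chinese": "信息",
    "Japanese": "情報",
}

for name, text in samples.items():
    utf8, utf16 = encoded_sizes(text)
    print(f"{name}: {len(text)} chars, {utf8} bytes UTF-8, {utf16} bytes UTF-16")
```

Keep in mind that HTML markup itself is ASCII, so for whole pages the advantage of UTF-16 is often smaller than for the raw text.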
With ColdFusion and Oracle this should not present any major problems.
The other main consideration is how you plan to handle the internationalisation issues.
The standard way is to keep language- and culture-specific items in a "bundle". There are several tools out there to support this. Basically, you write your app in Portuguese, making sure all text the user will see is in quoted literals, then run the app through a utility which replaces all literals with a library call and extracts all strings into a "bundle" file. You can then edit the bundle to add other language versions of the strings. The great advantage of this is that these formats are standard and translation agencies will have the tools to edit these files, so you can easily outsource the translation to specialists.
The other option, which requires much more work but IMHO produces a nicer result, is to branch off a version of the front end for each language/culture supported. This gets around a lot of problems with text height and string size. It also handles cultural norms better: different cultures have different ordering and conventions for things like address and title.
A classic example of small differences causing big problems is the Irish Republic and post codes: for a long time they simply didn't have them (Eircode was only introduced in 2015). So if your form validation insists on a Zip code, it will annoy your Irish users. The Brits do have post codes, but these are two alphanumeric groups separated by a space (2 to 4 characters, then 3 characters), not the more usual 5 or 7 digit numeric codes.
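One way to avoid the "insists on a Zip code" trap is to validate per country and to accept anything where no rule is known. The sketch below is only illustrative: the patterns are simplified, the country list is arbitrary, and real postal formats have more exceptions.

```python
import re

# Simplified, illustrative patterns only.
POSTAL_RULES = {
    "US": re.compile(r"\d{5}(-\d{4})?"),
    "BR": re.compile(r"\d{5}-?\d{3}"),
    "GB": re.compile(r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}"),
    "JP": re.compile(r"\d{3}-?\d{4}"),
    "RU": re.compile(r"\d{6}"),
}


def postal_code_ok(country, code):
    """Validate a postal code against its country's rule, if there is one."""
    rule = POSTAL_RULES.get(country)
    code = code.strip().upper()
    if rule is None:
        # No rule for this country: accept anything, including an empty field.
        return True
    return rule.fullmatch(code) is not None


print(postal_code_ok("GB", "sw1a 1aa"))
print(postal_code_ok("US", "1234"))
print(postal_code_ok("IE", ""))
```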

Is there a visual two-dimensional code editor?

Let me explain what I mean by "two-dimensional code editor": imagine using Inkscape or Gimp on a big canvas (say, infinite). The "T - add text" tool is used to write the code. Additionally, all function definitions are framed, and links connect the called functions.
In other words: you have a very large sheet of (virtual) paper where you can write.
It would be really useful. I don't want to write code as a long list of lines, especially now that big monitors are cheaper.
Is such a code editor out there?
What's your opinion? Would you use a 2d code editor?

I've written 3 or 4 visual editors and my second one worked like this. It was for Java and C++ (never published, though I did use it for some published research work).
I still don't like much to write my code 'as a long list of lines'. My point is, after trying a system like this, I tried a windowed system (class outlines in windows, right click to open code editors), then a tree based system...
In the long run (I wrote several apps using all of those), the tree based system with non-overlapping windows felt at once the most scalable (to different monitor sizes) and, foremost, the most productive. In the first version, dragging the text boxes, links and/or windows around was necessary but added little to the programming experience, so it felt wasteful.
If you want to try some of this stuff out, you can google antegram for java (Java only), antegram for web (JavaScript/PHP/ActionScript) and ee-ide (on oogtech.org). I'm not sure if I could dig up the original C++/Java textbox + links editor (which could collapse graphs as well, and had an infinite canvas, so pretty close to what you describe).
I'm not working on this as much as I used to, as few programmers ever seemed to like it except me, but if you like working the tree way, or feel like adding stuff for your own purposes, ee-ide would be the way to go, as it's nicely modular and easy to extend compared to the rest.
On the commercial side, you can configure Visual Studio to work with UML-like diagrams. I have a feeling it might be a little too heavy (although it's definitely more coding than UML oriented), but I'm not sure, as I haven't really tried it yet.

This probably doesn't answer your question exactly, but anyway.
Have a look at the NodeBox beta. It is a visual programming environment, mostly for creating generative graphics. You can program and edit the nodes with Python code, and connect and reuse them in multiple ways. (Windows and Mac OS)
Also worth mentioning (in terms of concept) is Field. It is for programming performances and arranges bits of code on a stage/timeline. Very interesting but also very confusing. (Mac OS only)
The third one is vvvv. It is used a lot by graphical artists to create realtime 3D visuals. Node based. (Windows only)
NodeBox and Field are open-source, so if you are looking to create something yourself you can see how it's done there.

Check this out. I came across it today and remembered this question.
Code Bubbles
Developers spend significant time reading and navigating code fragments spread across multiple locations. The file-based nature of contemporary IDEs makes it prohibitively difficult to create and maintain a simultaneous view of such fragments. We propose a novel user interface metaphor for code understanding and maintenance based on collections of lightweight, editable fragments called bubbles, which form concurrently visible working sets.
The essential goal of this project is to make it easier for developers to see many fragments of code (or other information) at once without having to navigate back and forth. Each of these fragments is shown in a bubble.
A bubble is a fully editable and interactive view of a fragment such as a method or collection of member variables. Bubbles, in contrast to windows, have minimal border decoration, avoid clipping their contents by using automatic code reflow and elision, and do not overlap but instead push each other out of the way. Bubbles exist in a large, pannable 2-D virtual space where a cluster of bubbles comprises a concurrently visible working set. Bubbles support a lightweight grouping mechanism, and further support connections between them.
A quantitative user study indicates that Code Bubbles increased performance significantly for two controlled code understanding tasks. A qualitative user study with 23 professional developers indicates substantial interest and enthusiasm for the approach, despite the radical departure from what developers are used to.
http://www.cs.brown.edu/people/acb/codebubbles_site.htm

At one point, LabVIEW had a programming mode like this. You connected program blocks together in a graphical way.
It's been so long since I've used LabVIEW that I don't know if it is still the same.

For me, the MVVM pattern means that there's no code behind the UI controls anyway. The logic is all in a class with properties.
The properties use WPF databinding to update the UI controls. For example, on the form or window, page, whatever, MySearchButton.IsEnabled is bound to ViewModel.MySearchButtonIsEnabled property. So the app logic runs in the ViewModel class and just sets its own properties and the UI updates automatically.
Although this is specific to MS WPF, the pattern actually stems from Smalltalk and is found across the development field as MVP. Without WPF one would need to write the databinding or 'presenter' logic oneself, which is common.
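To show what hand-written binding logic might amount to outside WPF, here is a small language-neutral sketch in Python. The class and property names are invented for the example and only loosely mirror the MySearchButton case described above.

```python
class ViewModel:
    """Holds UI state and notifies listeners when a property changes."""

    def __init__(self):
        self._listeners = []
        self._query = ""

    def bind(self, listener):
        self._listeners.append(listener)
        listener(self)  # push the initial state to the view

    @property
    def query(self):
        return self._query

    @query.setter
    def query(self, value):
        self._query = value
        for listener in self._listeners:
            listener(self)

    @property
    def search_button_enabled(self):
        # The rule lives here, not in the button's event handlers.
        return bool(self._query.strip())


class FakeButton:
    """Stand-in for a real UI control."""
    enabled = False


button = FakeButton()
vm = ViewModel()
vm.bind(lambda model: setattr(button, "enabled", model.search_button_enabled))
vm.query = "i18n"
print(button.enabled)  # True
```

The view only subscribes and copies values, which is why it can be swapped out without touching the logic.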
This means the UI can be torn off and a new one pasted-in really quickly and with little code knowledge from the UI guy - who, in an ideal world, is a crack creative guy that drives a 70s Citroen.
So my point is that, although it sounds like a neat innovation, a 2D editor like this would be assisting a coding style that is no longer considered optimal.

Libraries of audio samples (spoken text)

For a project we're currently working on, we need a library of spoken words in many different languages.
Two options seem possible: text-to-speech or "real" recordings by native speakers. As the quality is important to us, we're thinking about going the latter path.
In order to create a prototype for our application, we're looking for libraries that contain as many words in different languages as possible. To get a feeling for the quality of our approach, this library should not be made up of synthesized speech.
Do you know of any available/accessible libraries?

A co-worker just found this community based library, which is nice, but rather small in size:
Forvo.com

I've just found this on the Audacity wiki: VoxForge. From their site:
VoxForge was set up to collect transcribed speech for use with Free and Open Source Speech Recognition Engines (on Linux, Windows and Mac).
We will make available all submitted audio files under the GPL license, and then 'compile' them into acoustic models for use with Open Source speech recognition engines such as Sphinx, ISIP, Julius and HTK (note: HTK has distribution restrictions).

There is also Old time radio, not sure if this is the sort of spoken word you're after though.

My guess is that you won't find a library anywhere that consists of just individual words. Whatever you find, you're going to have to open the audio up in an editor (like Pro Tools or Cool Edit) and chop it up into individual words.
You would probably be better off creating a list of all the words you need for each language, and then finding native speakers to read them while you record. You can have them read slowly, so that you'll have an easy time chopping up each individual word.

One I used to use a lot: http://shtooka.net/index.php
Easy access to the recordings.
