Afrikaans collator data missing from ICU 57.1? - icu

I am working on a C++ project which depends on ICU 57.1 for its collator service. As part of this project, I would like to create a .dat ICU archive file that includes all collator data. The ICU data library customizer tool (http://apps.icu-project.org/datacustom/) clearly shows that there is a collation tailoring for Afrikaans in the coll/af.res file. The ICU locale explorer also shows that the Afrikaans locale has a tailoring of the Unicode Collation Algorithm given by the following rule:
&N <<< ʼn
However, when I inspect the source/data/in/icudt57l.dat from a download of the 57.1 source code, it doesn't seem to include coll/af.res:
$ eval $(./source/install/bin/icu-config --invoke=icupkg) -l source/data/in/icudt57l.dat | grep af.res
af.res
curr/af.res
lang/af.res
rbnf/af.res
region/af.res
unit/af.res
zone/af.res
coll/af.res similarly seems to be missing if I build ICU from source using --with-data-packaging=files.
Am I doing something wrong or is there an actual problem with the ICU data?
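For what it's worth, the same check can be scripted so that it reports exactly which resource trees contain a given locale. This is only a minimal sketch: it assumes the listing has one item per line, as in the icupkg -l output above, and the function name trees_containing is made up for illustration.

```python
def trees_containing(listing, locale):
    """Return the resource trees (e.g. 'curr', 'coll') that hold <locale>.res.

    `listing` is assumed to be the text printed by `icupkg -l`,
    one item name per line. Top-level items are reported as '(root)'.
    """
    target = f"{locale}.res"
    trees = set()
    for line in listing.splitlines():
        item = line.strip()
        # Split "curr/af.res" into ("curr", "af.res"); top-level items have no tree.
        tree, _, name = item.rpartition("/")
        if name == target:
            trees.add(tree or "(root)")
    return trees


# The listing from the question, pasted in as sample input.
sample = """af.res
curr/af.res
lang/af.res
rbnf/af.res
region/af.res
unit/af.res
zone/af.res
"""

found = trees_containing(sample, "af")
print(sorted(found))
print("coll present:", "coll" in found)
```

Unlike a plain grep for af.res, this matches the item name exactly, so locales whose names merely end in "af" are not counted.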

Related

Searching for Python MSDOS parser library

Does anyone know a good Python library to parse MSDOS files and obtain metadata and start()'s bytecodes? Something like an alternative version of the pefile library, but for MSDOS? I can't seem to find any via Google.
If there isn't one, is there a good source to refer to on the MSDOS file format? That way, I can create my own parser instead. I know there are tools like IDA Pro and the Reko decompiler, but I need an MSDOS file parser to automate some stuff. Thank you in advance!
Reko decompiler maintainer here. For what it's worth, you can use Reko's MS-DOS source code and translate it to Python. It's not a lot of code and MS-DOS executables aren't that complex to parse -- it's quite a simple format. The relevant files are:
https://github.com/uxmal/reko/blob/master/src/ImageLoaders/MzExe/ExeImageLoader.cs
https://github.com/uxmal/reko/blob/master/src/ImageLoaders/MzExe/MsdosImageLoader.cs
You could also try executing the Reko code directly from Python. The Reko binaries are available as a nuget package: https://www.nuget.org/packages/Reko.Decompiler.Runtime
Use the class Reko.ImageLoaders.MzExe.ExeImageLoader in the Reko.ImageLoaders.MzExe namespace. Integration could be done with http://pythonnet.github.io/
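As a rough illustration of how simple the format is, here is a minimal Python sketch that reads the 28-byte MZ header with the standard struct module. The field names, the MzHeader type and the helper functions are labels chosen for this example (they are not taken from Reko or from any existing library), and the sketch ignores overlays and the extended headers used by NE/PE files.

```python
import struct
from typing import List, NamedTuple, Tuple

# The classic MZ header: a 2-byte signature followed by 13 little-endian words.
MZ_HEADER = struct.Struct("<2s13H")


class MzHeader(NamedTuple):
    magic: bytes              # b"MZ" (rarely b"ZM")
    last_page_bytes: int      # bytes used in the last 512-byte page (0 = full page)
    pages: int                # number of 512-byte pages in the file
    reloc_count: int          # number of relocation entries
    header_paragraphs: int    # header size in 16-byte paragraphs
    min_alloc: int            # minimum extra paragraphs required
    max_alloc: int            # maximum extra paragraphs requested
    ss: int                   # initial SS, relative to the load module
    sp: int                   # initial SP
    checksum: int
    ip: int                   # initial IP
    cs: int                   # initial CS, relative to the load module
    reloc_table_offset: int   # file offset of the relocation table
    overlay_number: int


def parse_mz_header(data: bytes) -> MzHeader:
    """Unpack the fixed-size MZ header from the start of `data`."""
    if len(data) < MZ_HEADER.size:
        raise ValueError("file too short for an MZ header")
    header = MzHeader(*MZ_HEADER.unpack_from(data, 0))
    if header.magic not in (b"MZ", b"ZM"):
        raise ValueError("missing MZ signature")
    return header


def image_size(h: MzHeader) -> int:
    """Size in bytes of header plus load module, as recorded in the header."""
    if h.last_page_bytes == 0:
        return h.pages * 512
    return (h.pages - 1) * 512 + h.last_page_bytes


def entry_point_offset(h: MzHeader) -> int:
    """File offset of the first instruction (CS:IP is relative to the load module)."""
    return h.header_paragraphs * 16 + ((h.cs * 16 + h.ip) & 0xFFFFF)


def read_relocations(data: bytes, h: MzHeader) -> List[Tuple[int, int]]:
    """Return the relocation table as (offset, segment) pairs."""
    entry = struct.Struct("<2H")
    return [
        entry.unpack_from(data, h.reloc_table_offset + i * entry.size)
        for i in range(h.reloc_count)
    ]
```

Reading a file's bytes and passing them to parse_mz_header gives you the metadata; the code bytes at the entry point then start at entry_point_offset(header) in the same buffer.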

wxWidgets fails to build due to missing wxxml.lib

Apparently anything GUI-related in terms of components involves XML, and because of that I cannot get wxWidgets to configure and build from source. I'm new to wxWidgets.
My current setup is on Win10 with MSVC v141 (Visual Studio 2017) with the latest CMake version (currently 3.21).
Inside the config.cmake of the wxWidgets project (using the latest master branch) I see
wx_get_dependencies(EXTRALIBS_XML xml)
I am also calling CMake with -DwxUSE_XML=ON (among other parameters), but the XML dependency is still nowhere to be found and, consequently, it is not built.
Linking then fails with the following error:
LINK : fatal error LNK1104: cannot open file 'wxxml.lib' [C:\Users\...\CMakeBuilds\ef5b5ada-ee42-7735-988a-ae37c735ccff\build\deps\build\wxwidgets\libs\qa\wxqa.vcxproj]
Which library is wxWidgets actually using, and how do I trigger its retrieval, configuration and building? Since I am adding wxWidgets to my CMake project as an ExternalProject component, I would appreciate something along those lines. However, any kind of information regarding this issue is more than welcome, especially since it will shed light on how to configure other features (if I want them in the future) such as WebView.
The wxxml.lib issue is fixed now. While fixing it I also discovered a bug (of sorts) in the build system of wxWidgets.
The reason this library in particular failed to build was actually quite simple and came down to my lack of knowledge about the dependencies of wxWidgets. I thought that wxWidgets, given how much it depends on XML, has its own XML parser. Well, not really. The wxXML component actually uses an underlying third-party dependency called EXPAT, which (as you can see in my question) I had deactivated because it was giving me issues during the build (due to the still-present problem of not being able to automatically retrieve dependencies).
What I did was to clone the libexpat repository, add it as an ExternalProject, set the variables for the libraries and include directory and pass them onto my wxWidgets project. But there is a catch...
The expat.cmake file looks as follows:
#############################################################################
# Name: build/cmake/lib/expat.cmake
# Purpose: Use external or internal expat lib
# Author: Tobias Taschner
# Created: 2016-09-21
# Copyright: (c) 2016 wxWidgets development team
# Licence: wxWindows licence
#############################################################################
if(wxUSE_EXPAT STREQUAL "builtin")
    # TODO: implement building expat via its CMake file, using
    # add_subdirectory or ExternalProject_Add
    wx_add_builtin_library(wxexpat
        src/expat/expat/lib/xmlparse.c
        src/expat/expat/lib/xmlrole.c
        src/expat/expat/lib/xmltok.c
    )
    set(EXPAT_LIBRARIES wxexpat)
    set(EXPAT_INCLUDE_DIRS ${wxSOURCE_DIR}/src/expat/expat/lib)
elseif(wxUSE_EXPAT)
    find_package(EXPAT REQUIRED)
endif()
I use the *.cmake files of the third-party dependencies stored inside <ROOT_OF_WXWIDGETS_PROJECT>/build/cmake/lib to determine which variables need to be set when builtin is selected as the value for the respective library. Since I want to use my own build, I need sys (e.g. -DwxUSE_EXPAT=sys as a CMAKE_ARGS entry inside my wxWidgets ExternalProject) and I also need to pass the headers and libraries accordingly.
Given the file above, one would assume that EXPAT_LIBRARIES is required. However, after failing to build (yet again) and seeing that the cause was the now-activated expat build, which was set as builtin, I checked the log in detail and found the following error:
Could NOT find EXPAT (missing: EXPAT_LIBRARY) (found version "2.2.6")
Notice the EXPAT_LIBRARY. After passing it (-DEXPAT_LIBRARY=...) my build completed. To me this is a bug, or simply an inconsistency between the dependency cmake file and the rest of the wxWidgets project.
It is important to note that I do not retrieve the external dependency through wxWidgets itself (see config.cmake and, more precisely, the macro wx_get_dependencies(...)). This solves the problem for a basic configuration and build of wxWidgets, but if you don't want to tackle every dependency of wxWidgets on your own (why should you?), I recommend looking for a solution where the dependencies (at least the ones you don't want to deal with) are automatically retrieved, configured and built as builtin.

What is maskgen tool in JavaCard

In JavaCard, can somebody please tell me what is the purpose of the maskgen tool?
What I have heard from my senior colleagues is that it is the tool that converts Java code into C code for a particular JavaCard platform. But this answer seems too broad and lacks specifics. If the above-mentioned purpose is correct, then my questions are:
1. How does it convert the Java source code into C code?
2. How can I see the source code of the maskgen tool?
3. How can I convert my Java Card source code using the maskgen tool?
Quoted from Java Card 3 Platform Development Kit User Guide:
What is the Maskgen tool?
The maskgen tool produces a mask file from a set of Java Card Assembly
files produced by the Converter. The format of the output mask file is
targeted to a specific platform. The plug-ins that produce each
different maskgen output format are called generators. The supported
generators are cref, which supports the Java Card RE, and size, which
reports size statistics for the mask. Other generators that are not
supported in this release include jref, which supports the Java
programming language Java Card RE, and a51, which supports the Keil
A51 assembly language interpreter. Java Card Assembly Syntax Example
provides additional information about the contents of a Java Card
Assembly file.
Where can I find the Maskgen tool source?
The maskgen tool is not available or of use outside of a source
release bundle, so [...] if you do not have a source release of the
development kit you would not have the maskgen tool. If you have a source
release, you can localize locale-specific data associated with the
maskgen tool, see Localizing With The Development Kit.
How to convert Java Card sources using Maskgen tool?
Check Oracle's "Running maskgen" page:
maskgen Example
This example uses a text file (args.txt) to pass command line
arguments to maskgen:
maskgen -o mask.c cref #args.txt
where the contents of the file args.txt is:
first.jca second.jca third.jca
This is equivalent to the command line:
maskgen -o mask.c cref first.jca second.jca third.jca
This command produces an output file mask.c that is compiled with a C
compiler to produce mask.o, which is linked with the Java Card RE
interpreter. Refer to Using the Reference Implementation for more
information about this target platform.
The ".JCA" (Java Card Assembly) files above are generated using the Converter tool. Here is its manual.
Some related quoted info from here:
maskgen actually generates a mask.c file which contains VM bytecodes
that are interpreted by the JCVM and the applet is executed. The
mask.c file should be loaded onto the card. This method is used only
for static use of JavaCard.
And
Maskgen takes the CAP file (which is generated by the converter ), and
generates a mask.c file which will be a part of the cref in static
cards. The parameters for memory configuration of your MCU/processor
can be set in maskgen.cfg file.
Anyway, you need a binary release of JCDK to have this tool and its source.

Is there a way to specify all the GDCM libraries in CMakeList at once?

This may be a very basic question, but I am unable to find the answer.
I just installed the GDCM library on my Windows 7 workstation, configured it with CMake, and then built the generated solution using VS2012 Express.
However, I'm unsure about which GDCM libraries to include in the CMakeLists, and I was wondering if there was an easier way to specify all the libraries at once (like VTK_LIBRARIES for VTK). I tried GDCM_LIBRARIES and that doesn't return a value; neither does GDCM.
Specifically, I am looking to replace:
TARGET_LINK_LIBRARIES(TestvtkGDCMImageReader vtkgdcm gdcmMSFF gdcmDSED gdcm2vtk)
with something more general.
Typically, GDCM_LIBRARIES would be defined by the use file that you include after finding the GDCM package in CMake; however, it isn't currently set. You might just do it yourself by setting a variable with all of the GDCM library names.
For example, from looking at the libraries included in the 2.4.0 Windows binary distribution, I could do this in my CMakeLists.txt:
set( GDCM_LIBRARIES
    gdcmDICT gdcmDSED gdcmIOD gdcmMEXD gdcmMSFF gdcmcharls gdcmexpat gdcmgetopt
    gdcmjpeg12 gdcmjpeg16 gdcmjpeg8 gdcmopenjpeg gdcmzlib
    # since you built the vtk component, you might also include them here
    vtkgdcm gdcm2vtk
)
# then you can replace your line with this
target_link_libraries( TestvtkGDCMImageReader ${GDCM_LIBRARIES} )
Check out where the GDCM dlls you built are installed to see that you get all of the libraries.
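If you would rather not type the names by hand, a small script can build the list from whatever is actually in your install directory. This is only a sketch under some assumptions: the import libraries are .lib files whose names start with gdcm or vtkgdcm, and the function name cmake_library_list and the path in the usage note are placeholders, not anything provided by GDCM.

```python
from pathlib import Path


def cmake_library_list(lib_dir, patterns=("gdcm*.lib", "vtkgdcm*.lib"),
                       variable="GDCM_LIBRARIES"):
    """Build a CMake set() command naming every library found in lib_dir.

    The glob patterns are an assumption about how the GDCM import
    libraries are named; adjust them to match your build.
    """
    names = sorted(
        {path.stem for pattern in patterns for path in Path(lib_dir).glob(pattern)}
    )
    lines = [f"set( {variable}"]
    lines += [f"  {name}" for name in names]
    lines.append(")")
    return "\n".join(lines)
```

Calling print(cmake_library_list(r"C:\path\to\gdcm\lib")) with your own install path prints a set() block that you can paste into CMakeLists.txt.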

Why doesn't libicudata.so get smaller when I remove items from the data library?

I'm trying to build a custom ICU with a minimal data set. I've tried to follow the instructions at Reducing the Size of ICU's Data: Conversion Tables, but many of the files referenced don't exist in the ICU 4.8.1 source distribution. Specifically, I cannot find any files that match ucm*.mk.
I've also tried creating reslocal.mk files as indicated in e.g. ICU's source/data/lang/resfiles.mk. That did not help either. My build is the typical:
$ ./configure --prefix=/some/dir
$ make
$ make install
Regardless of what I do, libicudata.so.48.1 is about 17M. It shouldn't matter, but I'm building on Ubuntu 11.04.
See the note at the top of that page:
Note that ICU for C by default comes with pre-built data. The source
data files are not included unless ICU is downloaded from the
source repository. Alternatively, the Data Customizer may be used to
customize the pre-built data.
Your ICU is reading the prebuilt package from icu/source/data/in/*.dat and ignoring the .mk files. We have had requests for the source data to be included as a downloadable .zip and so we plan to do this in the future.
If you have any suggestions for how our instructions can be made more clear, please file a bug. I've added a copy of that notice to the section you referenced.
