Using a final variable or just using a value - JavaCard

For example:
i) using a value
Util.arrayCopyNonAtomic(buffer, (short) (offset + 20), keyTrack, PARAMETER_OFFSET, (short) 6);
ii) using a final variable
final static short length = 6;
Util.arrayCopyNonAtomic(buffer, (short) (offset + 20), keyTrack, PARAMETER_OFFSET, length);
So which one is better for JavaCard development? (Let's just say I'm going to use a lot of "6" later on.)

Using a final static field is better for constants. The size of the generated binary is the same in both cases. However, ii) has an advantage in code readability and is also easier to maintain (if you need to change the value, you only need to change it in one place).
NOTE: to avoid confusion, a variable is written in camelCase while a constant (final static) is written in UPPER_CASE. Example:
Util.arrayCopyNonAtomic(buffer, (short) (offset + 20),
keyTrack, PARAMETER_OFFSET, LEN_OF_KEY);

Fields that are final and static and of a primitive type are inlined by the Java compiler as discussed in the Java Language Specification, section 13.4.9. This is independent of JavaCard because it happens even before the conversion to JavaCard binary format (CAP file) takes place.
So, the final binary code will be strictly identical.
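As a loose analogy outside Java (a C++ sketch, illustrative only, not JavaCard code): a constexpr constant is folded into each use at compile time, the same effect JLS 13.4.9 mandates for final static primitive fields, so the two calls below compile to identical object code.
#include <cstring>

constexpr short LEN_OF_KEY = 6;

void copyKey(char* dst, const char* src) {
    std::memcpy(dst, src, 6);           // literal value
    std::memcpy(dst, src, LEN_OF_KEY);  // named constant: folded to the same code
}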

HashMap implementation - RPGLE

Is it feasible to implement a sort of hash map in RPGLE?
How would you begin thinking about it?
Should I look at the Java source code and "copy" that style?
The HashMap should ultimately be compatible with every data type.
I'd start here: Implementing a HashMap.
You should be able to use the C code as a basis for an RPGLE version.
Or you could just build the procedures in C and call them from RPGLE.
Depending on your needs (if you don't need a specific ordering of your elements), you could also use a tree-based map which already exists: http://rpgnextgen.com/index.php?content=libtree . It uses the red-black tree implementation from the libtree project on GitHub (which is wonderfully compatible C code; congrats to the developer).
The project on RPG Next Gen provides wrappers for character and integer keys. You can store any value in it as you pass a pointer and a length for it.
And yes, there is a need for data structures like lists and maps and trees. I use them often for passing data between procedures where I don't know how many elements may be returned. And in most programming languages lists and maps and trees are part of the language or at least part of the runtime library. Sadly not so in RPG.
In the end I did my own implementation.
You can find it here:
GitHub - HASHMAP.RPGLE
It is based on the JDK implementation, but the hash code is calculated from a SHA-1 hash, and a modulo operation is used instead of bit shifting.
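To illustrate the two bucket-selection strategies (a C++ sketch under assumed names; this is not code from HASHMAP.RPGLE): the JDK can mask because its table size is always a power of two, whereas a modulo works for any table size.
// Modulo works for any table capacity.
unsigned bucketModulo(unsigned hash, unsigned capacity) {
    return hash % capacity;
}

// JDK-style masking requires the capacity to be a power of two.
unsigned bucketMask(unsigned hash, unsigned capacity) {
    return hash & (capacity - 1);
}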

Public fixed-length Strings

I am just summarizing information about implementing a digital tree (Trie) in VBA. I am not asking how to do that, so please do not post your solutions - my specific question regarding fixed-length Strings in class modules comes at the end of this post.
A Trie is all about efficiency and performance, therefore most other programming languages use a Char data type to represent members of TrieNodes. Since VBA does not have a Char datatype, I was thinking about faking it and using a fixed-length String with 1 character.
Note: I can come up with a workaround for this, i.e. use a Byte and a simple function to convert between Chr() and Asc(), or an Enum, or declare a private str As String * 1 and take advantage of Get/Let properties, but that's not the point. Stay tuned though because...
According to the Public Statement page in Microsoft Help, you can't declare a fixed-length String variable in class modules.
I can't find any reasonable explanation for this constraint.
Can anyone give some insight into why such a restriction applies to fixed-length Strings in class modules in VBA?
The VBA/VB6 runtime is heavily reliant on the COM system (oleaut32 et al.), and this enforces some rules.
You can export a class file between VB "stuff", but if you publish it (or could theoretically publish it) as a COM object, it must be able to describe a "fixed-length string" in its interface description/type library so that, say, a C++ client can consume it.
A fixed-length string is "special" because it has active behaviour, i.e. it's not a dumb datatype; it behaves somewhat like a class. For example, it is always padded - if you assign to it, it ends up with trailing spaces - and in VBA the compiler adds generated code to get that behaviour. A C++ consumer would be unaware of the fixed-length nature of the string because the interface can't describe it/does not support a corresponding type (a String is a BSTR), which could lead to problems.
Strings are of type BSTR, and as with a byte array, you would still lose the padding semantics if you used one of those instead.
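A sketch of that "active behaviour" (in C++, purely illustrative; FixedString is my own name, not a VB or COM type): every assignment must truncate or pad with trailing spaces - exactly the generated code the VBA compiler supplies and that a plain BSTR interface cannot describe.
#include <algorithm>
#include <cstring>
#include <string>

// Illustrative only: mimics the assignment semantics of VBA's "String * N".
template <std::size_t N>
struct FixedString {
    char data[N];

    FixedString& operator=(const std::string& s) {
        std::size_t n = std::min(s.size(), N);  // truncate if the source is too long
        std::memcpy(data, s.data(), n);
        std::memset(data + n, ' ', N - n);      // pad with trailing spaces
        return *this;
    }
};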

How does the CLR string indexer work?

A comment buried in some C++ code in the SSCLI claims, referring to the unmanaged internal implementation of the String.Chars property:
This method is not actually used. JIT will generate code for indexer method on string class.
So...what magical code is this? I understand the whole point of jitters is that they produce different code in different situations. But at the very least, for a modern x64 Windows 7+ platform, how might the/a jitter accomplish this? Or is that truly secret sauce?
Additional details
A while ago I was looking for the fastest way to iterate through the individual characters in a string in C#. It turned out that the fastest way, without resorting to unsafe code or duplicating the contents (via ToCharArray()), was the built-in string indexer, which is actually a call to the String.Chars property. Right in my original question I asked if anyone had insight into how the indexer actually worked, but despite bumps from both Skeet and Lippert, I didn't get any responses on that. So I decided to dig into it myself:
Stop 1: mscorlib
By examining mscorlib.dll with ildasm, we can see that String::get_Chars(int32 index) is just an internalcall pointer (plus an attribute):
.method public hidebysig specialname instance char
get_Chars(int32 index) cil managed internalcall
{
.custom instance void System.Security.SecuritySafeCriticalAttribute::.ctor() = ( 01 00 00 00 )
} // end of method String::get_Chars
As noted in the documentation for the MethodImplOptions enumeration, "An internal call is a call to a method that is implemented within the common language runtime itself." Both a 2004 MSDN Magazine article and an SO post indicate that the mapping of internalcall names to unmanaged implementations can be found in ecall.cpp within the Shared Source CLI.
Stop 2: ecall.cpp
Searching an online copy of ecall.cpp reveals that get_Chars is implemented by COMString::GetCharAt:
FCIntrinsic("get_Chars", COMString::GetCharAt, CORINFO_INTRINSIC_StringGetChar)
Stop 3: comstring.cpp
comstring.cpp does indeed contain an implementation of GetCharAt, starting at line 1219. Except, it's preceded by this comment:
/*==================================GETCHARAT===================================
**Returns the character at position index. Thows IndexOutOfRangeException as
**appropriate.
**This method is not actually used. JIT will generate code for indexer method on string class.
**
==============================================================================*/
First of all, see Hans Passant's comment for the critical bit.
In early .NET (CLR 1 and 2), the CLR had considerable special support for the String and StringBuilder types. In fact, the two types worked so closely together that StringBuilder.ToString did not copy the actual characters anywhere, and the string indexer was still fetching the characters from that same memory location, using special jitter support. I assume that jitter support for String.Chars was originally necessary to avoid passing the index integer via the stack, but the jitter seems to have improved since then.
.NET 4 comes with a different implementation of StringBuilder (ropes) that is no longer tied to how String is handled. (It has to copy during ToString, but has much faster appends.) After these changes:
The StringBuilder indexer is dramatically slowed down, to O(log n) on large strings. See here. It is never inlined, not even on short strings.
The String indexer still uses (unpublished) special jitter support. I would expect it to be essentially inlined away into a shift, an addition, and a memory fetch, or something even faster where the surrounding loop allows it.
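For a feel of what that might reduce to, here is a hedged C++ sketch of plausible jitted code, not actual CLR output; the ManagedString layout and both names are assumptions of mine. The essence is a single bounds check followed by a scaled load from the string's inline character data.
#include <cstdint>
#include <stdexcept>

// Assumed layout: a length header followed by UTF-16 characters stored inline.
struct ManagedString {
    std::int32_t length;
    char16_t     chars[1];  // character data continues past the header
};

char16_t getChar(const ManagedString* s, std::int32_t index) {
    // One unsigned comparison rejects both negative and too-large indices.
    if (static_cast<std::uint32_t>(index) >= static_cast<std::uint32_t>(s->length))
        throw std::out_of_range("IndexOutOfRangeException");
    return s->chars[index];  // base + header offset + index * 2
}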

statically/dynamically typed vs static/dynamic binding

What is the difference between these four terms? Can you please give examples?
Static and dynamic are jargon words that refer to the point in time at which some programming element is resolved. Static indicates that resolution takes place at the time a program is constructed. Dynamic indicates that resolution takes place at the time a program is run.
Static and Dynamic Typing
Typing refers to changes in program structure that are due to the differences between data values: integers, characters, floating point numbers, strings, objects and so on. These differences can have many effects, for example:
memory layout (e.g. 4 bytes for an int, 8 bytes for a double, more for an object)
instructions executed (e.g. primitive operations to add small integers, library calls to add large ones)
program flow (simple subroutine calling conventions versus hash-dispatch for multi-methods)
Static typing means that the executable form of a program generated at build time will vary depending upon the types of data values found in the program. Dynamic typing means that the generated code will always be the same, irrespective of type -- any differences in execution will be determined at run-time.
Note that few real systems are either purely one or the other, it is just a question of which is the preferred strategy.
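As a small illustration of the dynamic strategy (a C++ sketch, with std::variant standing in for a dynamically typed value; Value and print are my own names): the same function handles every type, and the branch is picked at run time from a tag stored alongside the value.
#include <iostream>
#include <string>
#include <variant>

// Value can hold an int, a double, or a string; the active type is
// tracked by a runtime tag inside the variant.
using Value = std::variant<int, double, std::string>;

void print(const Value& v) {
    // The branch is selected at run time from the variant's tag.
    std::visit([](const auto& x) { std::cout << x << '\n'; }, v);
}

int main() {
    print(Value{42});                 // int branch, decided at run time
    print(Value{3.14});               // double branch
    print(Value{std::string{"hi"}});  // string branch
}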
Static and Dynamic Binding
Binding refers to the association of names in program text to the storage locations to which they refer. In static binding, this association is predetermined at build time. With dynamic binding, this association is not determined until run-time.
Truly static binding is almost extinct. Early assemblers and FORTRAN, for example, would completely precompute the exact memory location of all variables and subroutine entry points. This situation did not last long, with the introduction of stack and heap allocation for variables and dynamically loaded libraries for subroutines.
So one must take some liberty with the definitions. It is the spirit of the concept that counts here: statically bound programs precompute as much about storage layout as is practical in a modern virtual-memory, garbage-collected, separately compiled application. Dynamically bound programs wait as late as possible.
An example might help. If I attempt to invoke a method MyClass.foo(), a static-binding system will verify at build time that there is a class called MyClass and that class has a method called foo. A dynamic-binding system will wait until run-time to see whether either exists.
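A minimal C++ sketch of the same contrast (illustrative; note that C++ verifies that both methods exist at build time - what dynamic binding defers here is which override runs):
struct MyClass {
    void foo() {}           // statically bound: call target fixed at build time
    virtual void bar() {}   // dynamically bound: override chosen at run time
};

void demo(MyClass& m) {
    m.foo();  // direct call; a missing foo() would be a compile-time error
    m.bar();  // indirect call through the vtable of m's actual type
}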
Contrasts
The main strength of static strategies is that the program translator is much more aware of the programmer's intent. This makes it easier to:
catch many common errors early, during the build phase
build refactoring tools
incur a significant amount of the computational cost required to determine the executable form of the program only once, at build time
The main strength of dynamic strategies is that they are much easier to implement, meaning that:
a working dynamic environment can be created at a fraction of the cost of a static one
it is easier to add language features that might be very challenging to check statically
it is easier to handle situations that require self-modifying code
Typing - refers to variable types and whether variables are allowed to change type during program execution:
http://en.wikipedia.org/wiki/Type_system#Type_checking
Binding - as you can read at the link below, this can refer to variable binding or to library binding:
http://en.wikipedia.org/wiki/Binding_%28computer_science%29#Language_or_Name_binding

What's going on in the 'offsetof' macro?

The Visual C++ 2008 C runtime offers an 'offsetof' operator, which is actually a macro defined as follows:
#define offsetof(s,m) (size_t)&reinterpret_cast<const volatile char&>((((s *)0)->m))
This allows you to calculate the offset of the member variable m within the class s.
What I don't understand in this declaration is:
Why are we casting m to anything at all and then dereferencing it? Wouldn't this have worked just as well:
&(((s*)0)->m)
?
What's the reason for choosing char reference (char&) as the cast target?
Why use volatile? Is there a danger of the compiler optimizing the loading of m? If so, in what exact way could that happen?
An offset is in bytes. So to get a number expressed in bytes, you have to cast the address to a char type, because char is the same size as a byte (on this platform).
The use of volatile is perhaps a cautious step to ensure that no compiler optimisation (either one that exists now or one that may be added in the future) will change the precise meaning of the cast.
Update:
If we look at the macro definition:
(size_t)&reinterpret_cast<const volatile char&>((((s *)0)->m))
With the cast-to-char removed it would be:
(size_t)&((((s *)0)->m))
In other words, get the address of member m in an object at address zero, which does look okay at first glance. So there must be some way that this would potentially cause a problem.
One thing that springs to mind is that the operator & may be overloaded on whatever type m happens to be. If so, this macro would be executing arbitrary code on an "artificial" object that is somewhere quite close to address zero. This would probably cause an access violation.
This kind of abuse may be outside the applicability of offsetof, which is supposed to only be used with POD types. Perhaps the idea is that it is better to return a junk value instead of crashing.
(Update 2: As Steve pointed out in the comments, there would be no similar problem with operator ->)
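To make the operator& hazard concrete, here is a sketch (Evil, S, and both macro names are illustrative; evaluating either form on a null-based object remains formally undefined behaviour):
#include <cstddef>

struct Evil {
    int* operator&() { return nullptr; }  // hijacks the address-of operator
};

struct S {
    Evil m;
};

// Naive form: &(((S*)0)->m) would invoke Evil::operator& on an object
// "located" near address zero - arbitrary user code, probably a crash.
// #define naive_offsetof(s, m) ((std::size_t)&(((s *)0)->m))

// The cast-to-char& form bypasses the overload, because char has no
// operator& to hijack, so the built-in address-of is used instead.
#define cast_offsetof(s, m) \
    ((std::size_t)&reinterpret_cast<const volatile char&>((((s *)0)->m)))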
offsetof is something to be very careful with in C++. It's a relic from C. These days we are supposed to use member pointers. That said, I believe that member pointers to data members are overdesigned and broken - I actually prefer offsetof.
Even so, offsetof is full of nasty surprises.
First, for your specific questions, I suspect the real issue is that they've adapted the traditional C macro (which I thought was mandated in the C++ standard). They probably use reinterpret_cast for "it's C++!" reasons (so why the (size_t) cast?), and a char& rather than a char* to try to simplify the expression a little.
Casting to char looks redundant in this form, but probably isn't. (size_t) is not equivalent to reinterpret_cast, and if you try to cast pointers of other types into integers, you run into problems. I don't think the compiler even allows it, but to be honest, I'm suffering memory failure ATM.
The fact that char is a single-byte type has some relevance in the traditional form, but that may only be why the cast is correct again. To be honest, I seem to remember casting to void*, then char*.
Incidentally, having gone to the trouble of using C++-specific stuff, they really should be using std::ptrdiff_t for the final cast.
Anyway, coming back to the nasty surprises...
VC++ and GCC probably won't use that macro. IIRC, they have a compiler intrinsic, depending on options.
The reason is to do what offsetof is intended to do, rather than what the macro does, which is reliable in C but not in C++. To understand this, consider what would happen if your struct used multiple or virtual inheritance. In the macro, when you dereference a null pointer, you end up trying to access a virtual table pointer that isn't there at address zero, meaning that your app probably crashes.
For this reason, some compilers have an intrinsic that just uses the specified struct's layout instead of trying to deduce a run-time type. But the C++ standard doesn't mandate or even suggest this - it's only there for C compatibility reasons. And you still have to be careful if you're working with class hierarchies, because as soon as you use multiple or virtual inheritance, you cannot assume that the layout of the derived class matches the layout of the base class - you have to ensure that the offset is valid for the exact run-time type, not just a particular base.
If you're working on a data structure library where nodes perhaps use single inheritance, but apps cannot see or use your nodes directly, offsetof works well. But strictly speaking, even then, there's a gotcha. If your data structure is in a template, the nodes may have fields with types taken from template parameters (the contained data type). If that type isn't POD, technically your structs aren't POD either, and all the standard guarantees for offsetof is that it works for POD. In practice it will work - your type hasn't gained a virtual table or anything just because it has a non-POD member - but you have no guarantees.
If you know the exact run-time type when you dereference using a field offset, you should be OK even with multiple and virtual inheritance, but ONLY if the compiler provides an intrinsic implementation of offsetof to derive that offset in the first place. My advice - don't do it.
Why use inheritance in a data structure library? Well, how about...
class node_base { ... };
class leaf_node : public node_base { ... };
class branch_node : public node_base { ... };
The fields in the node_base are automatically shared (with identical layout) in both the leaf and branch, avoiding a common error in C with accidentally different node layouts.
BTW - offsetof is avoidable with this kind of stuff. Even if you are using offsetof for some jobs, node_base can still have virtual methods and therefore a virtual table, so long as it isn't needed to dereference member variables. Therefore, node_base can have pure virtual getters, setters and other methods. Normally, that's exactly what you should do. Using offsetof (or member pointers) is a complication, and should only be used as an optimisation if you know you need it. If your data structure is in a disk file, for instance, you definitely don't need it - a few virtual call overheads will be insignificant compared with the disk access overheads, so any optimisation efforts should go into minimising disk accesses.
Hmmm - went off on a bit of a tangent there. Whoops.
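For comparison, the member-pointer alternative mentioned earlier looks like this (a sketch; Node and readField are illustrative names): a pointer-to-data-member records which field to access without any null dereference, and the compiler applies the correct offset.
struct Node {
    int key;
    int value;
};

// The second parameter is a pointer-to-data-member: "which int field of Node".
int readField(const Node& n, int Node::*field) {
    return n.*field;  // the compiler applies the correct offset
}

// Usage: readField(node, &Node::key); or readField(node, &Node::value);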
char is guaranteed to be the smallest number of bits the architecture can "bite" (a.k.a. a byte).
All pointers are really just numbers, so cast address 0 to the struct type, because it's the beginning.
Take the address of the member starting from 0 (resulting in 0 + location_of_m).
Cast that back to size_t.
1) I also do not know why it is done this way.
2) The char type is special in two ways.
No other type has weaker alignment restrictions than char. This is important for reinterpret_cast between pointers, and between an expression and a reference.
It is also the only type (together with its unsigned variant) for which the specification defines the behaviour when a char is used to access the stored value of a variable of a different type. I do not know if this applies in this specific situation.
3) I think the volatile modifier is used to ensure that no compiler optimisation will result in an attempt to read the memory.
2. What's the reason for choosing char reference (char&) as the cast target?
If the type of m has operator& overloaded, then we can't reliably take its address with &. So we reinterpret_cast it to the primitive type char, because char doesn't have operator& overloaded, and then we can take the address. In C, the reinterpret_cast would not be required.
3. Why use volatile? Is there a danger of the compiler optimizing the loading of m? If so, in what exact way could that happen?
Here, volatile is not about compiler optimization. If the type of m has const or volatile qualifiers (or both), reinterpret_cast can't cast to a plain char& because reinterpret_cast can't remove cv-qualifiers. Casting to <const volatile char&> therefore works for any combination.
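A compact sketch of that point (S and offset_of_m are my own example names, and this carries the same null-dereference caveat as the macro under discussion): reinterpret_cast may add cv-qualifiers but can never remove them, so only the const volatile char& form compiles for every member.
#include <cstddef>

struct S {
    volatile int m;
};

inline std::size_t offset_of_m() {
    // A plain char& would be ill-formed here, because reinterpret_cast
    // cannot drop the member's volatile qualifier. const volatile char&
    // accepts any combination of cv-qualifiers, so this compiles for
    // every member type.
    return (std::size_t)&reinterpret_cast<const volatile char&>(((S*)0)->m);
}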
