In the example of defining a custom hash function on page 114 of Nim in Action, the !$ operator is used to "finalize the computed hash".
import tables, hashes

type
  Dog = object
    name: string

proc hash(x: Dog): Hash =
  result = x.name.hash
  result = !$result

var dogOwners = initTable[Dog, string]()
dogOwners[Dog(name: "Charlie")] = "John"
And in the paragraph below:
The !$ operator finalizes the computed hash, which is necessary when writing a custom hash procedure. The use of the !$ operator ensures that the computed hash is unique.
I am having trouble understanding this. What does it mean to "finalize" something? And what does it mean to ensure that something is unique in this context?
Your questions might be answered if, instead of reading the single description of the !$ operator, you take a look at the beginning of the hashes module documentation. As you can see there, primitive data types have a hash() proc which returns their own hash. But if you have a complex object with many fields, you might want to create a single hash for the object itself, and how do you do that? Without going into hash theory, and treating hashes like black boxes, you need to use two kinds of operators to produce a valid hash: the addition/concatenation operator and the finalization operator. So you end up using !& to keep adding (or mixing) individual hashes into a temporary value, and then use !$ to finalize that temporary value into the final hash. The Nim in Action example might have been easier to understand if the Dog object had more than a single field, thus requiring the use of both operators:
import tables, hashes, sequtils

type
  Dog = object
    name: string
    age: int

proc hash(x: Dog): Hash =
  # mix the field hashes together with `!&`, then finalize with `!$`
  result = x.name.hash !& x.age.hash
  result = !$result

var dogOwners = initTable[Dog, string]()
dogOwners[Dog(name: "Charlie", age: 2)] = "John"
dogOwners[Dog(name: "Charlie", age: 5)] = "Martha"

echo toSeq(dogOwners.keys)
for key, value in dogOwners:
  echo "Key ", key.hash, " for ", key, " points at ", value
As for why hash values are first mixed and only then finalized, that depends on which hashing algorithm the Nim developers chose. You can see from the source code that hash mixing and finalization are mostly bit shifting. Unfortunately the source code doesn't explain, or point to any reference explaining, why this is done or why this specific hashing algorithm was selected over others. You could try asking on the Nim forums, and maybe improve the documentation/source code with your findings.
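To give a rough idea of what that bit shifting looks like: the operators are (or at least used to be) modeled on Jenkins's one-at-a-time hash. Here is a hypothetical sketch of the idea in plain Nim; the proc names mixIn and finish are invented for this illustration, and the exact constants in the stdlib may differ between Nim versions:

import hashes

proc mixIn(h, part: Hash): Hash =
  # roughly what `!&` does: fold one more value into the running hash
  var x = cast[uint](h) + cast[uint](part)
  x = x + (x shl 10)
  x = x xor (x shr 6)
  result = cast[Hash](x)

proc finish(h: Hash): Hash =
  # roughly what `!$` does: scramble the accumulated bits one final time
  var x = cast[uint](h)
  x = x + (x shl 3)
  x = x xor (x shr 11)
  x = x + (x shl 15)
  result = cast[Hash](x)

echo finish(mixIn(mixIn(0, hash("Charlie")), hash(2)))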
I need to evaluate a dynamic logical expression, and I know that this is not possible natively in ABAP.
I found the class cl_java_script, and with this class I could achieve my requirement. I've tried something like this:
result = cl_java_script=>create( )->evaluate( `( 1 + 2 + 3 ) == 6 ;` ).
After executing the evaluate method, result = true as expected. But my happiness was over when I looked into the class documentation, which says: This class is obsolete.
My question is: is there another way to achieve this?
Using any Turing-complete language to parse a "dynamic logical expression" is a terrible idea, as an attacker might be able to run any program inside your expression; e.g. while(true) { } will crash your variant using cl_java_script. Also, although I don't know the details of cl_java_script, I assume it launches a separate JS runtime in a separate thread somewhere, which does not seem to be the most efficient choice for calculating such a small dynamic expression.
Instead you could implement your own small parser. This has the advantage that you can limit what it supports to the bare minimum whilst being able to extend it to everything you need in your use case. Here's a small example using reverse Polish notation which is able to correctly evaluate the expression you've shown (using RPN simplifies parsing a lot, though one can for sure also build a full-fledged expression parser):
REPORT z_expr_parser.

TYPES:
  BEGIN OF multi_value,
    integer TYPE REF TO i,
    boolean TYPE REF TO bool,
  END OF multi_value.

CLASS lcl_rpn_parser DEFINITION.
  PUBLIC SECTION.
    METHODS:
      constructor
        IMPORTING
          text TYPE string,
      parse
        RETURNING VALUE(result) TYPE multi_value.
  PRIVATE SECTION.
    DATA:
      tokens TYPE STANDARD TABLE OF string,
      stack  TYPE STANDARD TABLE OF multi_value.
    METHODS pop_int
      RETURNING VALUE(result) TYPE i.
    METHODS pop_bool
      RETURNING VALUE(result) TYPE abap_bool.
ENDCLASS.

CLASS lcl_rpn_parser IMPLEMENTATION.
  METHOD constructor.
    " a most simple lexer:
    SPLIT text AT ' ' INTO TABLE tokens.
    ASSERT lines( tokens ) > 0.
  ENDMETHOD.

  METHOD pop_int.
    DATA(peek) = stack[ lines( stack ) ].
    ASSERT peek-integer IS BOUND.
    result = peek-integer->*.
    DELETE stack INDEX lines( stack ).
  ENDMETHOD.

  METHOD pop_bool.
    DATA(peek) = stack[ lines( stack ) ].
    ASSERT peek-boolean IS BOUND.
    result = peek-boolean->*.
    DELETE stack INDEX lines( stack ).
  ENDMETHOD.

  METHOD parse.
    LOOP AT tokens INTO DATA(token).
      IF token = '='.
        DATA(comparison) = xsdbool( pop_int( ) = pop_int( ) ).
        APPEND VALUE #( boolean = NEW #( comparison ) ) TO stack.
      ELSEIF token = '+'.
        DATA(addition) = pop_int( ) + pop_int( ).
        APPEND VALUE #( integer = NEW #( addition ) ) TO stack.
      ELSE.
        " assumption: token is an integer
        DATA value TYPE i.
        value = token.
        APPEND VALUE #( integer = NEW #( value ) ) TO stack.
      ENDIF.
    ENDLOOP.
    ASSERT lines( stack ) = 1.
    result = stack[ 1 ].
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  " 1 + 2 + 3 = 6 in RPN:
  DATA(program) = |1 2 3 + + 6 =|.
  DATA(parser) = NEW lcl_rpn_parser( program ).
  DATA(result) = parser->parse( ).
  ASSERT result-boolean IS BOUND.
  ASSERT result-boolean->* = abap_true.
SAP's BRF is an option, but potentially massive overkill in your scenario.
Here is a blog on calling BRF from ABAP.
And here is how Rules/Expressions can be defined dynamically.
BUT, if you know enough about the source problem to generate
1 + 2 + 3 = 6
then it is hard to imagine why a simple custom parser can't be used.
Just how complex should the expressions be?
I'd probably write my own parser before investing in calling BRF.
Since some/many BSPs use server-side JavaScript and not ABAP as the scripting language, I can't see SAP removing the kernel routine anytime soon.
SYSTEM-CALL JAVA SCRIPT EVALUATE.
So maybe consider just calling cl_java_script anyway until it is an issue.
Then worry about a parser if and when it really is no longer a valid call.
But there is definitely some movement in this "obsolete" space.
SAP is pushing/forcing you to the cloud with the SDK to execute such things.
https://sap.github.io/cloud-sdk/docs/js/overview-cloud-sdk-for-javascript
Look at the code below and explain what it illustrates:
hash = { ’one’ : 1, ’two’ : 2, ’three’ : 3 }
hash = {’three’: 3, ’two’: 2, ’one’: 1}
hash = {'one':1,'two':2,'three':3}
hash = {'three':3,'two':2,'one':1}
This is the correct declaration of hash (without typographical errors).
In order to get the value, you are supposed to use hash[key].
Considering the above two hashes:
If you type in hash['three'], it will return the value 3.
The output (3) is the same for both of the above hashes.
A hash is different from a list. In a list, you would use the index/position to get the value.
a = [1,2,3]
a = [3,2,1]
If you type in a[0], the value will be different in the two cases.
But in the case of a hash, you use the assigned key to access the value, so it doesn't make a difference. The order doesn't matter, because the hash lookup doesn't care about position; it works only on the basis of key/value pairs.
In short, both the hashes are the same and would give the same output for a particular key.
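For example, a quick interactive session (the variable names h1, h2, a, b are just for illustration):
>>> h1 = {'one': 1, 'two': 2, 'three': 3}
>>> h2 = {'three': 3, 'two': 2, 'one': 1}
>>> h1['three'], h2['three']
(3, 3)
>>> a = [1, 2, 3]
>>> b = [3, 2, 1]
>>> a[0], b[0]
(1, 3)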
Assuming you remove your obvious typographic errors, they might not necessarily be the same, but they will certainly be indistinguishable.
That is, there is no operation you can apply to either version that will reveal the order in which the elements were inserted into the container.
Basically, you are creating a dictionary, which is a data structure made of key-value pairs. What you see on the left of ":" is the key and what you see on the right of ":" is the value.
In order to get the value for a specific key you use: hash[key]
There is no difference between the two lines, because the order doesn't matter when you create dictionaries, since the key-value pairs are the same.
>>> hash = {'three':3,'two':2,'one':1}
>>> hash1 = {'one':1,'two':2,'three':3}
>>> hash == hash1
True
>>> type(hash) == type(hash1)
True
>>> id(hash) == id(hash1)
False
While studying the Groovy (2.4.4) syntax in the official documentation, I came across the special behavior concerning maps with GStrings as keys. As described in the documentation, GStrings are a bad idea as (hash)map keys, because the hashcode of a non-evaluated GString object differs from that of a regular String object with the same representation as the evaluated GString.
Example:
def key = "id"
def m = ["${key}": "value for ${key}"]
println "id".hashCode() // prints "3355"
println "${key}".hashCode() // prints "3392", different hashcode
assert m["id"] == null // evaluates true
However, my intuitive expectation was that using the actual GString to address a key in the map would in fact deliver the value - but it does not.
def key = "id"
def m = ["${key}": "value for ${key}"]
assert m["${key}"] == null // evaluates also true, not expected
That made me curious. So I came up with several suggestions concerning this issue and did some experiments.
(Please keep in mind that I am new to Groovy and was just brainstorming on the fly - skip to Suggestion #4 if you do not want to read how I tried to track down the cause of the issue.)
Suggestion #1. hashcode for GString objects works/is implemented somewhat non-deterministically for whatever reason and delivers different results depending on the context or the actual object.
That turned out to be nonsense quite fast:
println "${key}".hashCode() // prints "3392"
// do sth else
println "${key}".hashCode() // still "3392"
Suggestion #2. The actual key in the map or the map item does not have the expected representation or hashcode.
I took a closer look at the item in the map, the key, and its hashcode.
println m // prints "[id:value for id]", as expected
m.each {
it -> println key.hashCode()
} // prints "3355" - hashcode of the String "id"
So the hashcode of the key inside the map is different from the GString hashcode. Ha! Or not. Though it is nice to know, it is actually not relevant, because I still don't know the actual hashcodes in the map's index - I just rehashed a key that had been transformed to a String, not the key that was put into the index. So what else?
Suggestion #3. The equals-method of a GString has an unknown or non-implemented behavior.
Even if two hashcodes are equal, the objects may not represent the same key in a map. That depends on the implementation of the equals method for the class of the key object. If the equals method is, for instance, not implemented, two objects are not equal even if their hashcodes are identical, and therefore the desired map key cannot be addressed properly. So I tried:
def a = "${key}"
def b = "${key}"
assert a.equals(b) // returns true (unfortunate but expected)
So two representations of the same GString are equal by default.
I'll skip some other ideas I tried and continue with the last thing I tried just before writing this post.
Suggestion #4. The syntax of access matters.
That was a real eye-opener. I knew before that there are syntactically different ways to access map values. Each way has its restrictions, but I thought the results stayed the same. Well, this came up:
def key = "id"
def m = ["${key}": "value for ${key}"]
assert m["id"] == null // as before
assert m["${key}"] == null // as before
assert m.get("${key}") == null // assertion fails, value returned
So if I use the get method of a map, I get the actual value, the way I expected it to work in the first place.
What is the explanation for this map access behavior concerning GStrings? (or what kind of rookie mistake is hidden here?)
Thanks for your patience.
EDIT: I am afraid that my actual question is not clearly stated, so here is the case, short and concise:
When I have a map with a GString as a key like this
def m = ["${key}": "value for ${key}"]
why does this return the value
println m.get("${key}")
but that does not
println m["${key}"]
?
You can look at this matter with a very different approach. A map is supposed to have immutable keys (at least with respect to hashcode and equals), because the map implementation depends on this. GString is mutable, and thus not really suited for map keys in general. There is also the problem of calling String#equals(GString). GString is a Groovy class, so we could make its equals method treat an equal-looking String as equal just fine. But String is very different: calling equals on a String with a GString argument will always be false in the Java world, even if hashCode() behaved the same for String and GString. Now imagine a map with String keys, and you ask the map for a value with a GString. It would always return null. On the other hand, a map with GString keys queried with a String could return the "proper" value. This means there will always be a disconnect.
And because of this problem GString#hashCode() is not equal to String#hashCode() on purpose.
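A small check makes both points visible (reusing the key variable from the question):
def key = "id"
def gstr = "${key}"
// from the Java side a String never equals a GString ...
assert !"id".equals(gstr)
// ... and their hash codes intentionally differ as well:
assert "id".hashCode() != gstr.hashCode()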
It is in no way non-deterministic, but a GString's hashcode can change if the participating objects change their toString representation:
def map = [:]
def gstring = "$map"
def hashCodeOld = gstring.hashCode()
assert hashCodeOld == gstring.hashCode()
map.foo = "bar"
assert hashCodeOld != gstring.hashCode()
Here the toString representation of map changes for Groovy and GString, and thus the GString will produce a different hashcode.
The challenge is as follows:
Array Buffers are the backend for Typed Arrays. Whereas Buffer in node
is both the raw bytes as well as the encoding/view, Array Buffers are
only raw bytes and you have to create a Typed Array on top of an Array
Buffer in order to access the data.
When you create a new Typed Array and don't give it an Array Buffer to
be a view on top of, it will create its own new Array Buffer instead.
For this challenge, take the integer from process.argv[2] and write it
as the first element in a single element Uint32Array. Then create a
Uint16Array from the Array Buffer of the Uint32Array and log out to
the console the JSON stringified version of the Uint16Array.
Bonus: try to explain the relevance of the integer from
process.argv[2], or explain why the Uint16Array has the particular
values that it does.
The solution given by the author is as follows:
var num = +process.argv[2]
var ui32 = new Uint32Array(1)
ui32[0] = num
var ui16 = new Uint16Array(ui32.buffer)
console.log(JSON.stringify(ui16))
I don't understand what the plus sign in the first line means, and I can't understand the logic of this block of code either.
Thank you a lot if you can solve my puzzle.
Typed arrays are often used in the context of asm.js, a strongly typed subset of JavaScript that is highly optimisable. "Strongly typed" and "subset of JavaScript" are contradictory requirements, since JavaScript does not natively distinguish integers and floats, and has no way to declare them. Thus, asm.js adopts the convention that the following no-ops on integers and floats respectively serve as declarations and casts:
n|0 is n for every integer
+n is n for every float
Thus,
var num = +process.argv[2]
would be the equivalent of the following line in C or Java:
float num = process.argv[2]
declaring a floating point variable num.
It is puzzling though, I would have expected the following, given the requirement for integers:
var num = (process.argv[2])|0
Even in normal JavaScript, though, these conversions have their uses, because they also turn string representations of integers or floats, respectively, into numbers.
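A small sketch of both conversions, and of why the Uint16Array ends up with the two values it does (the number 123456789 is just an arbitrary example standing in for process.argv[2]):
var s = "123456789"        // what process.argv[2] would look like
console.log(+s)            // 123456789, converted to a Number
console.log(s | 0)         // 123456789, converted and truncated to a 32-bit integer

var ui32 = new Uint32Array(1)
ui32[0] = 123456789
var ui16 = new Uint16Array(ui32.buffer)  // two 16-bit views over the same 4 bytes
console.log(JSON.stringify(ui16))
// on a little-endian machine this prints {"0":52501,"1":1883},
// because 123456789 === 1883 * 65536 + 52501 (high and low 16-bit halves)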
Xapian docs talk about a query constructor that takes a term position parameter, to be used in phrase searches:
Quote:
This constructor actually takes a couple of extra parameters, which
may be used to specify positional and frequency information for terms
in the query:
Xapian::Query(const string & tname_,
Xapian::termcount wqf_ = 1,
Xapian::termpos term_pos_ = 0)
The term_pos represents the position of the term in the query. Again,
this isn't useful for a single term query by itself, but is used for
phrase searching, passage retrieval, and other operations which
require knowledge of the order of terms in the query (such as
returning the set of matching terms in a given document in the same
order as they occur in the query). If such operations are not
required, the default value of 0 may be used.
And in the reference, we have:
Xapian::Query::Query ( const std::string & tname_,
Xapian::termcount wqf_ = 1,
Xapian::termpos pos_ = 0
)
A query consisting of a single term.
And:
typedef unsigned termpos
A term position within a document or query.
So, say I want to build a query for the phrase: "foo bar baz", how do I go about it?!
Does term_pos_ provide relative position values, i.e. define the order of terms within the document:
(I'm using here the python bindings API, as I'm more familiar with it)
q = xapian.Query(xapian.Query.OP_AND, [xapian.Query("foo", wqf, 1),xapian.Query("bar", wqf,2),xapian.Query("baz", wqf,3)] )
And just for the sake of testing, suppose we did:
q = xapian.Query(xapian.Query.OP_AND, [xapian.Query("foo", wqf, 3),xapian.Query("bar", wqf, 4),xapian.Query("baz", wqf, 5)] )
So this would give the same results as the previous example?!
And suppose we have:
q = xapian.Query(xapian.Query.OP_AND, [xapian.Query("foo", wqf, 2),xapian.Query("bar", wqf, 4),xapian.Query("baz", wqf, 5)] )
So now this would match documents where "foo" and "bar" are separated by one term, followed by "baz"?
Is that how it works, or does this parameter refer to absolute positions of the indexed terms?
Edit:
And how is OP_PHRASE related to this? I found some online samples using OP_PHRASE like this:
q = xapian.Query(xapian.Query.OP_PHRASE, term_list)
This makes obvious sense, but then what is the role of the said term_pos_ constructor parameter in phrase searches - is it a more surgical way of doing things?
// give each term of the phrase its own increasing term position,
// then combine the sub-queries with OP_PHRASE:
int pos = 1;
std::list<Xapian::Query> subs;
subs.push_back(Xapian::Query("foo", 1, pos++));
subs.push_back(Xapian::Query("bar", 1, pos++));
querylist.push_back(Xapian::Query(Xapian::Query::OP_PHRASE, subs.begin(), subs.end()));
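For what it's worth, the same construction with the Python bindings used in the question might look roughly like this (an untested sketch; wqf is kept at 1, and only the constructors already shown in the question are used):
import xapian

pos = 1
subqueries = []
for term in ("foo", "bar", "baz"):
    # wqf = 1, and each term of the phrase gets the next term position
    subqueries.append(xapian.Query(term, 1, pos))
    pos += 1

query = xapian.Query(xapian.Query.OP_PHRASE, subqueries)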