Internal array representation in v8/node.js

I wrote a small memory benchmark for node.js: http://pastebin.com/KfZ4Ucn4
It measures memory usage using process.memoryUsage().heapUsed for 3 cases:
Array of objects with 10 properties, different property names for each element
Array of objects with 10 properties, same property names
Array of objects with 10 properties, same property names, represented as an object of arrays
The overhead turns out to be 1300 bytes for Case 1, 300 bytes for Case 2, and 150 bytes for Case 3. Also, only Case 1 garbage-collects; in the latter cases memory usage doesn't go down.
Is there any explanation or documentation for these effects? I'm trying to optimize memory usage for an array of objects of objects, something like
[ {
    foo : { bar : 1, baz : 2 }
  , bar : { bar : 2, baz : 7 }
  }
, {
    foo : { bar : 1, baz : 2 }
  , bar : { bar : 2, baz : 7 }
  } ]
Any clues?

I'm guessing that this has to do with the way that V8 uses "hidden classes" to represent similar objects, but what you are reporting seems to be a pretty dramatic difference in footprints...
You can read more about hidden classes here: https://developers.google.com/v8/design
although that article seems to be more focused on speed than memory usage.
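If that guess is right, the effect can be pictured with a small sketch (illustrative only, not the benchmark itself):

// Case 2 style: same property names in the same order, so V8 can give
// every element the same hidden class and store only the values per object
function makeShared(i) {
    return { p0: i, p1: i, p2: i };
}

// Case 1 style: a property name that differs per element defeats hidden
// class sharing, so each object pays for its own layout
function makeUnique(i) {
    var o = {};
    o['p' + i] = i;
    return o;
}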

What does the int value returned from compareTo function in Kotlin really mean?

In the documentation of compareTo function, I read:
Returns zero if this object is equal to the specified other object, a
negative number if it's less than other, or a positive number if it's
greater than other.
What does this "less than" or "greater than" mean in the context of strings? Is, for example, "Hello World" less than the single character "a"?
val epicString = "Hello World"
println(epicString.compareTo("a")) //-25
Why -25 and not -10 or -1 (for example)?
Other examples:
val epicString = "Hello World"
println(epicString.compareTo("HelloWorld")) //-55
Is "Hello World" less than "HelloWorld"? Why?
Why does it return -55 and not -1, -2, -3, etc.?
val epicString = "HelloWorld"
println(epicString.compareTo("Hello World")) //55
Is "HelloWorld" greater than "Hello World"? Why?
Why does it return 55 and not 1, 2, 3, etc.?
I believe you're asking about the implementation of the compareTo method for java.lang.String. Here is the source code from Java 11:
public int compareTo(String anotherString) {
    byte v1[] = value;
    byte v2[] = anotherString.value;
    if (coder() == anotherString.coder()) {
        return isLatin1() ? StringLatin1.compareTo(v1, v2)
                          : StringUTF16.compareTo(v1, v2);
    }
    return isLatin1() ? StringLatin1.compareToUTF16(v1, v2)
                      : StringUTF16.compareToLatin1(v1, v2);
}
So we have a delegation to either StringLatin1 or StringUTF16 here, so we should look further. Fortunately, StringLatin1 and StringUTF16 have similar implementations when it comes to the compare functionality. Here is the implementation for StringLatin1, for example:
public static int compareTo(byte[] value, byte[] other) {
    int len1 = value.length;
    int len2 = other.length;
    return compareTo(value, other, len1, len2);
}

public static int compareTo(byte[] value, byte[] other, int len1, int len2) {
    int lim = Math.min(len1, len2);
    for (int k = 0; k < lim; k++) {
        if (value[k] != other[k]) {
            return getChar(value, k) - getChar(other, k);
        }
    }
    return len1 - len2;
}
As you see, it iterates over the characters of the shorter string, and if the characters at the same index of the two strings differ, it returns the difference between them. If during the iteration it doesn't find any difference (one string is a prefix of the other), it falls back to comparing the lengths of the two strings.
In your case, there is a difference at the first index already, so it's the same as "H".compareTo("a") --> -25.
The code of 'H' is 72 and the code of 'a' is 97, so 72 - 97 = -25.
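You can verify the character codes from Kotlin directly (Char.code; on versions before Kotlin 1.5 it was Char.toInt()):

println('H'.code)            // 72
println('a'.code)            // 97
println("H".compareTo("a"))  // -25, i.e. 72 - 97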
Short answer: The exact value doesn't have any meaning; only its sign does.
As the specification for compareTo() says, it returns a -ve number if the receiver is smaller than the other object, a +ve number if the receiver is larger, or 0 if the two are considered equal (for the purposes of this ordering).
The specification doesn't distinguish between different -ve numbers, nor between different +ve numbers — and so neither should you.  Some classes always return -1, 0, and 1, while others return different numbers, but that's just an implementation detail — and implementations vary.
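Relying only on the sign is exactly what Kotlin's comparison operators do for Comparable types:

val s1 = "Hello World"
val s2 = "HelloWorld"
println(s1 < s2)               // true: compiled as s1.compareTo(s2) < 0
println(s1.compareTo(s2) < 0)  // equivalent, and all the spec lets you rely on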
Let's look at a very simple hypothetical example:
class Length(val metres: Int) : Comparable<Length> {
    override fun compareTo(other: Length)
        = metres - other.metres
}
This class has a single numerical property, so we can use that property to compare them.  One common way to do the comparison is simply to subtract the two lengths: that gives a number which is positive if the receiver is larger, negative if it's smaller, and zero if they're the same length, which is just what we need.
In this case, the value of compareTo() would happen to be the signed difference between the two lengths.
However, that method has a subtle bug: the subtraction could overflow, and give the wrong results if the difference is bigger than Int.MAX_VALUE.  (Obviously, to hit that you'd need to be working with astronomical distances, both positive and negative — but that's not implausible.  Rocket scientists write programs too!)
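For instance, a quick demonstration of that overflow with the subtraction-based version:

val a = Length(Int.MAX_VALUE)
val b = Length(-1)
// Int.MAX_VALUE - (-1) wraps around to Int.MIN_VALUE, which is negative,
// so `a` wrongly compares as smaller than `b`
println(a.compareTo(b))  // -2147483648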
To fix it, you might change it to something like:
class Length(val metres: Int) : Comparable<Length> {
    override fun compareTo(other: Length) = when {
        metres > other.metres -> 1
        metres < other.metres -> -1
        else -> 0
    }
}
That fixes the bug; it works for all possible lengths.
But notice that the actual return value has changed in most cases: now it only ever returns -1, 0, or 1, and no longer gives an indication of the actual difference in lengths.
If this were your class, then it would be safe to make this change, because it still matches the specification.  Anyone who just looked at the sign of the result would see no change (apart from the bug fix).  Anyone using the exact value would find that their programs were now broken, but that's their own fault: they shouldn't have been relying on that, because it was undocumented behaviour.
Exactly the same applies to the String class and its implementation.  While it might be interesting to poke around inside it and look at how it's written, the code you write should never rely on that sort of detail.  (It could change in a future version.  Or someone could apply your code to another object which didn't behave the same way.  Or you might want to expand your project to be cross-platform, and discover the hard way that the JavaScript implementation didn't behave exactly the same as the Java one.)
In the long run, life is much simpler if you don't assume anything more than the specification promises!
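As an aside, a common overflow-safe way to write such a comparison is to delegate to the property's own compareTo (just a sketch):

class Length(val metres: Int) : Comparable<Length> {
    override fun compareTo(other: Length) = metres.compareTo(other.metres)
}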

sig literals in Alloy

How can I write out a literal for a sig in Alloy? Consider the example below.
sig Foo { a: Int }
fact { #Foo = 1 }
If I execute this, I get
| this/Foo | a |
|----------|---|
| Foo⁰ | 7 |
In the evaluator, I know I can get a reference to the Foo instance with Foo$0 but how can I write a literal that represents the same value?
I've tried {a: 7}, but this is not equal to Foo$0. This is intentionally a trivial example, but I'm debugging a more complex model and I need to be able to write out literals of sigs with multiple fields.
Ah, this is one of the well hidden secrets! :-) Clearly in your model you cannot refer to atoms since the model is defining all possible values of those atoms. However, quite often you need your hands on some atom to reason about it. That is, you want to be able to name some objects.
The best way to get 'constants' is to create a predicate you call from a run clause. In this predicate, you define names for atoms you want to discuss. You only have to make sure this predicate is true.
pred collision[ car1, car2 : Car, road : Road ] {
    // here you can reason about car1 and car2
}
run collision for 10
Another way is to create a quantification whenever you need to have some named objects:
run {
    some car1, car2 : Car, road : Road {
        // here you can reason about car1 and car2 and road
    }
} for 10
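Applied to the original Foo example, such a predicate might look like this (the predicate name is mine):

pred someFoo[ f : Foo ] {
    f.a = 7
    // here you can reason about this particular Foo atom
}
run someFoo for 1 Foo, 4 Int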
There was a recent discussion about adding these kinds of instances to the language so that Kodkod could take advantage of them. (It would allow faster solving, and it is extremely useful for test cases of your model.) However, during that discussion the solution I presented here came forward, and it does not require any new syntax.
Try putting a limit on Int in the run command. I mean:
sig Foo { a : Int }
fact { #Foo = 1 }
pred show {}
run show for 1 Foo, 2 Int

Why doesn't simple integer counterexample occur in Alloy?

I am trying to model a relationship between a numeric variable and a boolean variable, in which if the numeric variable is in a certain range then the boolean variable will change value. I'm new to Alloy, and am having trouble understanding how to constrain my scope sufficiently to yield the obvious counterexample. My code is as follows:
open util/boolean

one sig Object {
    discrete : one Bool,
    integer : one Int
}

fact { all o : Object | o.integer > 0 and o.integer < 10 }
fact { all o : Object | o.integer > 5 iff o.discrete = False }

assert discreteCondition { all o : Object | o.discrete = True }
check discreteCondition for 1000
Since o.integer is integer-valued and lies strictly between 0 and 10, it can only be one of 9 different values. And I specified that each Object should only have one integer and one discrete. So it seems reasonable to me that there are really only 9 cases to check here: one case for each value of integer. And yet even with a scope of 1000, I get
No counterexample found.
If I remove the integer variable and related facts then it does find the counterexample almost immediately. I have also tried using other solvers and increasing various depth and memory values in the Options, but this did not help, so clearly my code is at fault.
How can I limit my scope to make Alloy find the counterexample (by iterating over possible values of the integer)? Thanks!
By default, the bitwidth used to represent integers is 4, so only integers in the range [-8,7] are considered during instance generation; due to integer overflow, your first fact is therefore void (as 10 is outside this range).
To fix the problem, increase the bitwidth to at least 5:
check discreteCondition for 10 but 5 Int
Note that a scope of 1000 does not mean that you consider 1000 cases in your analysis. The scope is the maximum number of atoms of a given signature present in the generated instance. In your case you have only one signature, with multiplicity one, so analyzing your model with a scope of 1 or of 10000 doesn't change anything: there'll still be only one Object atom in the generated instance.
You might want to check this Q/A to learn more about scopes: Specifying A Scope for Sig in Alloy
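For completeness, here is a sketch of the corrected check (the model is unchanged; only the bitwidth grows):

open util/boolean
one sig Object {
    discrete : one Bool,
    integer : one Int
}
fact { all o : Object | o.integer > 0 and o.integer < 10 }
fact { all o : Object | o.integer > 5 iff o.discrete = False }
assert discreteCondition { all o : Object | o.discrete = True }
-- with bitwidth 5, Int covers [-16,15], so integer can take the values
-- 6 through 9, forcing discrete = False and falsifying the assertion
check discreteCondition for 10 but 5 Int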

Model a finite set of integers

Below is an Alloy model representing this set of integers: {0, 2, 4, 6}
As you know, the plus symbol (+) denotes set union. How can 0 be unioned to 2? 0 and 2 are not sets. I thought the union operator applies only to sets? Isn't this violating a basic notion of set union?
Second question: Is there a better way to model this, one that is less cognitively jarring?
one sig List {
    numbers: set Int
} {
    numbers = 0 + 2 + 4 + 6
}
In Alloy, everything you work with is a set of tuples. none is the empty set, and many sets are relations (sets of tuples with arity > 1). So each integer, when you use it, is also a set: a relation of arity 1 and cardinality 1. I.e. in Alloy, when you use 1 it is really {(1)}, a set of a type containing the atom 1. The definition is in reality something like:
enum Int {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7}
Ints in Alloy are just not very good integers :-( The finite set of atoms is normally not a problem but with Ints there are just too few of them to be really useful. Worse, they quickly overflow and Alloy is not good in handling this at all.
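You can convince yourself of this in the Analyzer with a small sketch:

// each int literal below is a singleton set, and + is ordinary set union
one sig List { numbers: set Int } { numbers = 0 + 2 + 4 + 6 }
run { #List.numbers = 4 and 2 in List.numbers } for 4 Int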
But I do agree it looks ugly. I have an even worse problem with seq, where a sequence is really written out as:
0->A + 1->B + 2->C + 3->C
I already experimented with adding literal seq to Alloy and got an experimental version running. Maybe sets could also be implemented this way:
// does not work in Alloy 4
seq [ A, B, C, C ] = 0->A + 1->B + 2->C + 3->C
set [ 1, 2, 3, 3 ] = 1+2+3
Today you could do this:
let x[ a, b ] = { a + b }
run {
    x[1, x[2, x[3, 4]]] = 1+2+3+4
} for 4 Int
But I'm not sure I like this any better. If macros had meta fields, or made the arguments available as a sequence (as most interpreters do), then we could do this:
// does not work in Alloy 4
let list[ args ... ] = Int.args // args = seq univ
run {
    range[ list[1,2,3,4,4] ] = 1+2+3+4
}
If you like the seq [ A, B, C, C ] syntax or the varargs then start a thread on the AlloyTools list. As said, I got the seq [ A, B, C, C ] working in a prototype.

Inconsistencies when using UnsafeMutablePointer with String or Character types

I'm currently trying to implement my own DynamicArray data type in Swift. To do so I'm using pointers a bit. As my root I'm using an UnsafeMutablePointer of a generic type T:
struct DynamicArray<T> {
    private var root: UnsafeMutablePointer<T> = nil
    private var capacity = 0 {
        didSet {
            //...
        }
    }
    //...

    init(capacity: Int) {
        root = UnsafeMutablePointer<T>.alloc(capacity)
        self.capacity = capacity
    }

    init(count: Int, repeatedValue: T) {
        self.init(capacity: count)
        for index in 0..<count {
            (root + index).memory = repeatedValue
        }
        self.count = count
    }
    //...
}
Now as you can see I've also implemented a capacity property which tells me how much memory is currently allocated for root. Accordingly one can create an instance of DynamicArray using the init(capacity:) initializer, which allocates the appropriate amount of memory, and sets the capacity property.
But then I also implemented the init(count:repeatedValue:) initializer, which first allocates the needed memory using init(capacity: count). It then sets each segment in that part of memory to the repeatedValue.
When using the init(count:repeatedValue:) initializer with number types like Int, Double, or Float, it works perfectly fine. With Character or String, though, it crashes. It doesn't crash consistently, but actually works sometimes, as can be seen by compiling the following a few times:
var a = DynamicArray<Character>(count: 5, repeatedValue: "A")
println(a.description) // prints [A, A, A, A, A]
// crashes most of the time

var b = DynamicArray<Int>(count: 5, repeatedValue: 1)
println(b.description) // prints [1, 1, 1, 1, 1]
// works consistently
Why is this happening? Does it have to do with String and Character holding values of different length?
Update #1:
Now @AirspeedVelocity addressed the problem with init(count:repeatedValue:). The DynamicArray contains another initializer, though, which at first worked in a similar fashion to init(count:repeatedValue:). I changed it to work as @AirspeedVelocity described for init(count:repeatedValue:):
init<C: CollectionType where C.Generator.Element == T, C.Index.Distance == Int>(collection: C) {
    let collectionCount = countElements(collection)
    self.init(capacity: collectionCount)
    root.initializeFrom(collection)
    count = collectionCount
}
I'm using the initializeFrom(source:) method as described here. And since collection conforms to CollectionType it should work fine.
I'm now getting this error though:
<stdin>:144:29: error: missing argument for parameter 'count' in call
root.initializeFrom(collection)
^
Is this just a misleading error message again?
Yes, chances are this doesn’t crash with basic inert types like integers but does with strings or arrays because they are more complex and allocate memory for themselves on creation/destruction.
The reason it’s crashing is that UnsafeMutablePointer memory needs to be initialized before it’s used (and similarly, needs to be de-initialized with destroy before it is deallocated).
So instead of assigning to the memory property, you should use the initialize method:
for index in 0..<count {
    (root + index).initialize(repeatedValue)
}
Since initializing from another collection of values is so common, there’s another version of initialize that takes one. You could use that in conjunction with another helper struct, Repeat, that is a collection of the same value repeated multiple times:
init(count: Int, repeatedValue: T) {
    self.init(capacity: count)
    root.initializeFrom(Repeat(count: count, repeatedValue: repeatedValue))
    self.count = count
}
However, there’s something else you need to be aware of: this code is currently going to leak memory. The reason being, you will need to destroy the contents and dealloc the pointed-to memory at some point before your DynamicArray struct is destroyed, otherwise you’ll leak. Since you can’t have a deinit in a struct, only in a class, this won’t be possible to do automatically (assuming you aren’t expecting users of your array to do this themselves manually before it goes out of scope).
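One way around that, sketched below with the same pre-Swift-2 pointer API the question uses (the class and its names are hypothetical, not part of the standard library), is to hold the pointer in a private class so a deinit can clean up:

// Hypothetical backing store: a class, so it can have a deinit.
final class Storage<T> {
    let root: UnsafeMutablePointer<T>
    let capacity: Int
    var count = 0

    init(capacity: Int) {
        self.capacity = capacity
        root = UnsafeMutablePointer<T>.alloc(capacity)
    }

    deinit {
        root.destroy(count)    // de-initialize the initialized elements
        root.dealloc(capacity) // then release the raw memory
    }
}

// DynamicArray<T> would then hold a `private var storage: Storage<T>`
// instead of the raw pointer, and cleanup happens automatically.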
Additionally, if you want to implement value semantics (as with Array and String) via copy-on-write, you’ll also need a way of detecting if your internal buffer is being referenced multiple times. Take a look at ManagedBufferPointer to see a class that handles this for you.
