How big is an empty Hashtable object - java-me

What is the size (in bytes) of the Hashtable object in J2ME? I mean what is the overhead for using a Hashtable?

For an empty hashtable this will probably vary widely by device.
You can get a ballpark measurement yourself as follows:
Runtime rt = Runtime.getRuntime();
long freeMem = rt.freeMemory();
Hashtable ht = new Hashtable();
long sizeofHashtable = freeMem - rt.freeMemory();

A hashtable is 24 bytes for the basic object + 2 ints (4 bytes each) for _numberOfKeys and _threshold. The size of _hash, _key and _value (internal hashtable variables) is determined by the capacity of the hashtable and the size of the objects in the hashtable. The capacity is set to 11 if you don't pass it in the constructor, and the hashtable has logic to increase capacity if more is required.
The _hash is an array of ints (the hashes) and therefore takes the hashtable capacity (note: capacity, not number of keys) * 4 bytes. The _key and _value are arrays of Object type, so even if all entries are null, each slot still takes 4 bytes for the empty pointer.
Hope this helps anyone!
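As a rough worked example, the figures quoted in the answer above can be added up for a given capacity. This is only a sketch of that arithmetic: the function name and constants are made up for illustration, the byte counts are the ones stated in the answer (not measured), and per-array object headers are ignored, so real devices will likely differ.

```python
# Figures taken from the answer above; they are assumptions, not measurements.
BASE_OBJECT_BYTES = 24      # basic Hashtable object
INT_FIELD_BYTES = 4         # _numberOfKeys and _threshold
SLOT_BYTES = 4              # one int hash or one object reference
DEFAULT_CAPACITY = 11       # capacity when none is passed to the constructor


def estimate_hashtable_bytes(capacity=DEFAULT_CAPACITY):
    """Estimate the overhead of an empty hashtable with the given capacity."""
    fields = 2 * INT_FIELD_BYTES
    # _hash, _key and _value each hold `capacity` 4-byte slots.
    arrays = 3 * capacity * SLOT_BYTES
    return BASE_OBJECT_BYTES + fields + arrays


print(estimate_hashtable_bytes())  # 24 + 8 + 3 * 11 * 4 = 164
```

Under these assumptions the default-capacity table costs about 164 bytes before any key or value objects are stored.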

Related

will variables defined in non static method be garbage collected with object

class Utility{
public String a = "aaaa huge string";
public void doSomething() {
String b = "bbbb huge string";
.....
}
}
given class Utility, here are my method calls.
Step 1) Utility u = new Utility();
Step 2) u.doSomething();
Step 3) u = null;
When object u is garbage collected after step 3, will the String b also be removed from the String pool?
When will strings a and b be loaded and removed (if at all) from memory?
Variables defined in methods are not members of the instance, so the GC of objects they reference is completely independent of the GC of the instance.
When object u is garbage collected after step 3
If object u is garbage collected after step 3
will the String b also be removed from the String pool?
u has nothing to do with b: see above.
When will strings a and b be loaded and removed (if at all) from memory?
Loaded with the class, because they are string literals. Removal depends on GC and interning.

Allocation memory for string

In D, string is an alias for immutable(char)[]. So every string operation involves allocating memory. I tried to check it, but after replacing a character in the string I see the same address.
string str = "big";
writeln(&str);
str.replace("i","a");
writeln(&str);
Output:
> app.exe
19FE10
19FE10
I tried using ptr:
string str = "big";
writeln(str.ptr);
str.replace(`i`,`a`);
writeln(str.ptr);
And got the following output:
42E080
42E080
So it's showing the same address. Why?
You made a simple error in your code:
str.replace("i","a");
str.replace returns the new string with the replacement done, it doesn't actually replace the existing variable. So try str = str.replace("i", "a"); to see the change.
But you also made an overly broad statement about allocations:
So every string operation involves allocating memory.
That's false; a great many operations do not require allocation of new memory. Anything that can slice the existing string will do so, avoiding the need for new memory:
import std.string;
import std.stdio;
void main() {
string a = " foo ";
string b = a.strip();
assert(b == "foo"); // whitespace stripped off...
writeln(a.ptr);
writeln(b.ptr); // but notice how close those ptrs are
assert(b.ptr == a.ptr + 2); // yes, b is a slice of a
}
replace will also return the original string if no replacement was actually done:
string a = " foo ";
string b = a.replace("p", "a"); // there is no p to replace
assert(a.ptr is b.ptr); // so same string returned
Indexing and iteration require no new allocation (of course). Believe it or not, but even appending sometimes will not allocate because there may be memory left at the end of the slice that is not yet used (though it usually will).
There are also various functions that return range objects that do the changes as you iterate through them, avoiding allocation. For example, instead of replace(a, "f", "");, you might do something like filter!(ch => ch != 'f')(a); and loop through, which doesn't allocate a new string unless you ask it to.
So it is a lot more nuanced than you might think!
In D, all arrays are a length plus a pointer to the start of the array values. These are usually stored on the stack, which just so happens to be RAM.
When you take the address of a variable (which is in a function body), what you are really doing is getting a pointer to the stack.
To get the address of an array's values, use .ptr.
So replace &str with str.ptr and you will get the correct output.

Fastest, leanest way to append characters to form a string in Swift

I come from a C# background, where System.String is immutable and string concatenation is relatively expensive (as it requires reallocating the string), so we know to use the StringBuilder type instead, as it preallocates a larger buffer where single characters (Char, a 16-bit value-type) and short strings can be concatenated cheaply without extra allocation.
I'm porting some C# code to Swift which reads from a bit-array ([Bool]) at sub-octet indexes with character lengths less than 8 bits (it's a very space-conscious file format).
My C# code does something like this:
StringBuilder sb = new StringBuilder( expectedCharacterCount );
int idxInBits = 0;
Boolean[] bits = ...;
for(int i = 0; i < someLength; i++) {
Char c = ReadNextCharacter( ref idxInBits, 6 ); // each character is 6 bits in this example
sb.Append( c );
}
In Swift, I assume NSMutableString is the equivalent of .NET's StringBuilder, and I found this QA about appending individual characters ( How to append a character to string in Swift? ) so in Swift I have this:
var buffer: NSMutableString
for i in 0..<charCount {
let charValue: Character = readNextCharacter( ... )
buffer.AppendWithFormat("%c", charValue)
}
return String(buffer)
But I don't know why it goes through a format-string first, that seems inefficient (reparsing the format-string on every iteration) and as my code is running on iOS devices I want to be very conservative with my program's CPU and memory usage.
As I was writing this, I learned my code should really be using UnicodeScalar instead of Character, problem is NSMutableString does not let you append a UnicodeScalar value, you have to use Swift's own mutable String type, so now my code looks like:
var buffer: String
for i in 0..<charCount {
let x: UnicodeScalar = readNextCharacter( ... )
buffer.append(x)
}
return buffer
I thought that String was immutable, but I noticed its append method returns Void.
I still feel uncomfortable doing this because I don't know how Swift's String type is implemented internally, and I don't see how I can preallocate a large buffer to avoid reallocations (assuming Swift's String uses a growing algorithm).
(This answer was written based on documentation and source code valid for Swift 2 and 3: possibly needs updates and amendments once Swift 4 arrives)
Since Swift is now open-source, we can actually have a look at the source code for Swift's native String:
swift/stdlib/public/core/String.swift
From the source above, we have the following comment:
/// Growth and Capacity
/// ===================
///
/// When a string's contiguous storage fills up, new storage must be
/// allocated and characters must be moved to the new storage.
/// `String` uses an exponential growth strategy that makes `append` a
/// constant time operation *when amortized over many invocations*.
Given the above, you shouldn't need to worry about the performance of appending characters in Swift (be it via append(_: Character), append(_: UnicodeScalar) or appendContentsOf(_: String)), as reallocation of the contiguous storage for a given String instance should be infrequent relative to the number of single characters that must be appended before a reallocation occurs.
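To make the amortized-cost argument concrete, here is a small simulation of appending into a buffer that grows exponentially. It is a sketch, not Swift's actual implementation: the growth factor of 2, the initial capacity of 1, and the function name are assumptions chosen for illustration (the source comment only says the strategy is exponential).

```python
def simulate_appends(count, growth=2, initial_capacity=1):
    """Count reallocations and copied elements for `count` single appends."""
    capacity = initial_capacity
    length = 0
    reallocations = 0
    copied = 0
    for _ in range(count):
        if length == capacity:
            # Storage is full: allocate a larger block and move everything.
            copied += length
            capacity *= growth
            reallocations += 1
        length += 1
    return reallocations, copied


reallocations, copied = simulate_appends(1000)
print(reallocations, copied)  # 10 1023
```

With doubling, 1000 appends trigger only 10 reallocations and move 1023 elements in total, roughly one extra copy per append on average, which is what "constant time when amortized" means here.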
Also note that NSMutableString is not "purely native" Swift, but belongs to the family of bridged Obj-C classes (accessible via Foundation).
A note to your comment
"I thought that String was immutable, but I noticed its append method returns Void."
String is just a (value) type, that may be used by mutable as well as immutable properties
var foo = "foo" // mutable
let bar = "bar" // immutable
/* (both the above inferred to be of type 'String') */
The mutating void-return instance methods append(_: Character) and append(_: UnicodeScalar) are accessible to mutable as well as immutable String instances, but naturally using them with the latter will yield a compile-time error
let chars : [Character] = ["b","a","r"]
foo.append(chars[0]) // "foob"
bar.append(chars[0]) // error: cannot use mutating member on immutable value ...

ctypes objects at very similar memory address

I was debugging another project when I noticed this funny behaviour:
If I generate two C callables from Python callables, they are always at very similar locations:
from ctypes import *
def foo():
    print("foo")
def bar():
    print("bar")
c_cm=CFUNCTYPE(c_voidp)
c_foo=c_cm(foo)
print(c_foo)
c_bar=c_cm(bar)
print(c_bar)
running this a few times:
<CFunctionType object at 0x7f8ddb65d048>
<CFunctionType object at 0x7f8ddb65d110>
<CFunctionType object at 0x7f40a022e048>
<CFunctionType object at 0x7f40a022e110>
<CFunctionType object at 0x7fa1f1fb1048>
<CFunctionType object at 0x7fa1f1fb1110>
the 7f is not the interesting part, but the 048 and 110.
Does this mean, that my program is always located at a very similar place in ram?
info: I am on linux 3.18.x
CPython's small-object allocator uses 256 KB arenas, divided into 4 KB pools, in which a given pool is dedicated to a particular allocation size (ranging from 8 to 512 bytes, in steps of 8). The lower 3 hexadecimal digits (12 bits) of the address are the object offset into the pool. This design is discussed in extensive comments in Objects/obmalloc.c.
In the case of 64-bit Linux, a ctypes function pointer object is 200 (0xc8) bytes, i.e. sys.getsizeof(c_bar) == 200, so a pool holds 20 function pointers. Note that the first allocated object in the pool is at offset 0x048 instead of 0x000. The pool itself has an initial header (pool_header) that's 48 (0x030) bytes, plus each ctypes object has a garbage collection header (PyGC_Head) that's 24 (0x018) bytes. Without the GC header, a ctypes function pointer is 176 bytes (0x0b0). Thus the next function pointer's GC header is at offset 0x0f8, with the object proper starting 24 bytes later at offset 0x110.
You can print out a bunch to see the pattern, once it starts allocating from completely free pools. For example:
funcs = [c_cm(foo) for i in range(10000)][-40:]
idx = 0
while id(funcs[idx]) & 0xfff != 0x048:
    idx += 1
print(*[funcs[n] for n in range(idx, idx+20)], sep='\n')
Sample output:
<CFunctionType object at 0x7f66ca3df048>
<CFunctionType object at 0x7f66ca3df110>
<CFunctionType object at 0x7f66ca3df1d8>
<CFunctionType object at 0x7f66ca3df2a0>
<CFunctionType object at 0x7f66ca3df368>
<CFunctionType object at 0x7f66ca3df430>
<CFunctionType object at 0x7f66ca3df4f8>
<CFunctionType object at 0x7f66ca3df5c0>
<CFunctionType object at 0x7f66ca3df688>
<CFunctionType object at 0x7f66ca3df750>
<CFunctionType object at 0x7f66ca3df818>
<CFunctionType object at 0x7f66ca3df8e0>
<CFunctionType object at 0x7f66ca3df9a8>
<CFunctionType object at 0x7f66ca3dfa70>
<CFunctionType object at 0x7f66ca3dfb38>
<CFunctionType object at 0x7f66ca3dfc00>
<CFunctionType object at 0x7f66ca3dfcc8>
<CFunctionType object at 0x7f66ca3dfd90>
<CFunctionType object at 0x7f66ca3dfe58>
<CFunctionType object at 0x7f66ca3dff20>
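The offsets in the listing above can be reproduced from the sizes given in the explanation. The following sketch just redoes that arithmetic; the constant names are made up for illustration, and the sizes (4 KB pool, 48-byte pool header, 24-byte GC header, 176-byte object) are the ones quoted for 64-bit Linux in this answer, so they may differ on other platforms or CPython versions.

```python
POOL_SIZE = 4096       # bytes per pool
POOL_HEADER = 48       # pool_header size quoted above
GC_HEADER = 24         # PyGC_Head size quoted above
OBJECT_SIZE = 176      # ctypes function pointer without its GC header
BLOCK_SIZE = GC_HEADER + OBJECT_SIZE  # 200 bytes per allocation

# How many 200-byte blocks fit after the pool header.
blocks_per_pool = (POOL_SIZE - POOL_HEADER) // BLOCK_SIZE

# The address shown in the repr is the object proper, just past its GC header.
offsets = [POOL_HEADER + n * BLOCK_SIZE + GC_HEADER for n in range(blocks_per_pool)]

print(blocks_per_pool)                           # 20
print([format(o, "03x") for o in offsets[:3]])   # ['048', '110', '1d8']
```

The last computed offset is 0xf20, matching the final line of the listing.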
Note that the function pointer object's base address that's printed in the repr has nothing directly to do with the address that gets passed to a C library. A function pointer object (i.e. PyCFuncPtrObject) has a b_ptr field that points at a buffer that holds the actual function address that gets passed to C. You can inspect this value by creating a void * pointer from the function pointer, e.g. addr_bar = c_void_p.from_buffer(c_bar).value. For a callback, ctypes allocates a block of executable memory in which it writes a bit of code that sets up calling closure_fcn to call the target Python function. This is a CThunkObject, which is referenced (kept live) as, for example, c_foo._objects['0'].
The least significant digits are the same, but observe that the digits in the middle are not. So, your program is moving around in memory on different runs, but there's some kind of code page alignment going on, such that the addresses always start at a multiple of 4096. Not really surprising.

hashcode hashmap and equals

Where am I going wrong in the following understanding?
Scenario 1:
Float[] f1= new Float[2];
Float[] f2= new Float[2];
System.out.println("Values"+f1[0]+","+f2[0]+","+(f1==f2));
Output: 0.0, 0.0, false.
What I understood from this is:
1) Whenever we declare a float type array, all the values are initialized to 0.0
2) Though both f1 and f2 have the same values, they are pointing to different addresses in the heap.
3) .equals() in the Object class compares the addresses of the values in the heap along with the values.
Scenario 2:
HashMap<String, Integer> hashMap = new HashMap<>();
String str1 = new String("test");
String str2 = new String("test");
System.out.println((str1==str2));
System.out.println((str1.hashCode()==str2.hashCode()));
hashMap.put(str1,1);
hashMap.put(str2,2);
System.out.println(hashMap.get(str1));
Output:
false
true
2
What I understood from this is:
1) Two immutable string literals "test" are created. Both of them are pointing to different locations in the memory
2) From the Oracle docs, if two objects are equal according to equals(), then their hashCode() values must be the same, but not vice versa. So the deviation from the equals method in the second output is justifiable (i.e. in the comparison of hash values). How does hashCode work internally in this case? Isn't it comparing addresses? Here they are pointing to different locations, so they should have different addresses, right?
3) They have same hashCode so hashMap is replacing the value of str1 with str2.
Scenario #1
1) Whenever we declare a float type array, all the values are initialized to 0.0
Yes, that's true for a primitive float[] array. Note that a Float[] array of wrapper objects is initialized to null, not 0.0.
2) Though both f1 and f2 have the same values, they are pointing to different addresses in the heap.
Yes, == compares the references of the two arrays on the heap.
3) .equals() in the Object class compares the addresses of the values in the heap along with the values.
Not quite: Object.equals() compares only the references of the two objects, the same as == in #2 above. It does not look at the values.
Scenario #2
1) Two immutable string literals "test" are created. Both of them are pointing to different locations in the memory
Yes that's true.
2) From the Oracle docs, if two objects are equal according to equals(), then their hashCode() values must be the same, but not vice versa. So the deviation from the equals method in the second output is justifiable (i.e. in the comparison of hash values). How does hashCode work internally in this case? Isn't it comparing addresses? Here they are pointing to different locations, so they should have different addresses, right?
This is how hashCode method works on String:
public int hashCode() {
int h = hash;
if (h == 0 && value.length > 0) {
char val[] = value;
for (int i = 0; i < value.length; i++) {
h = 31 * h + val[i];
}
hash = h;
}
return h;
}
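To see that this hash depends only on the characters and not on where the object lives, the loop above can be mirrored outside the JVM. The sketch below is a Python transcription, not the JDK code: the function name is made up, the masking emulates Java's 32-bit int overflow, and it assumes the text contains only characters in the Basic Multilingual Plane (so one Python character corresponds to one Java char).

```python
def java_string_hash(text):
    """Mirror the h = 31 * h + val[i] loop with 32-bit wraparound."""
    h = 0
    for ch in text:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed Java int.
    return h - 0x100000000 if h >= 0x80000000 else h


first = java_string_hash("test")
second = java_string_hash("te" + "st")  # same characters, built separately
print(first, second)  # 3556498 3556498
```

Any two strings with the same characters produce the same value, which is why str1.hashCode() == str2.hashCode() printed true even though str1 == str2 was false.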
3) They have same hashCode so hashMap is replacing the value of str1 with str2.
In addition to comparing hash codes, HashMap also checks the two keys for equality with equals(). Only if both conditions are satisfied does it treat them as the same key and overwrite the value stored for it.
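The same two-step rule (matching hash, then an equality check) can be illustrated with a Python dict, which behaves analogously to HashMap in this respect. This is only an analogy, not Java: the Key class is a hypothetical stand-in for String, with equality and hash both defined on its text.

```python
class Key:
    """Hypothetical key whose equality and hash depend only on its text."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        return isinstance(other, Key) and self.text == other.text

    def __hash__(self):
        return hash(self.text)


k1 = Key("test")
k2 = Key("test")

table = {}
table[k1] = 1
table[k2] = 2  # equal hash and equal key: overwrites the entry for k1

same_object = k1 is k2
same_hash = hash(k1) == hash(k2)
value_for_k1 = table[k1]
print(same_object, same_hash, value_for_k1, len(table))  # False True 2 1
```

The two keys are distinct objects, yet the table ends up with a single entry holding 2, mirroring the output of the Java snippet in Scenario 2.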
