What is more efficient - nodejs buffers or typed arrays? What should I use for better performance?
I think that only those who know interiors of V8 and NodeJs could answer this question.
A Node.js buffer should be more efficient than a typed array. The reason is simply because when a new Node.js Buffer is created it does not need to be initialized to all 0's. Whereas, the HTML5 spec states that initialization of typed arrays must have their values set to 0. Allocating the memory and then setting all of the memory to 0's takes more time.
In most applications picking either one won't matter. As always, the devil lies in the benchmarks :) However, I recommend that you pick one and stick with it. If you're often converting back and forth between the two, you'll take a performance hit.
Nice discussion here: https://github.com/joyent/node/issues/4884
There are a few things that I think are worth mentioning:
Buffer instances are Uint8Array instances but there are subtle incompatibilities with the TypedArray specification in ECMAScript 2015. For example, while ArrayBuffer#slice() creates a copy of the slice, the implementation of Buffer#slice() creates a view over the existing Buffer without copying, making Buffer#slice() far more efficient.
When using Buffer.allocUnsafe() and Buffer.allocUnsafeSlow() the memory isn't zeroed-out (as many have pointed out already). So make sure you completely overwrite the allocated memory or you can allow the old data to be leaked when the Buffer memory is read.
TypedArrays are not readable right away, you'll need a DataView for that. Which means you might need to rewrite your code if you were to migrate back to Buffer. Adapter pattern could help here.
You can use for-of on Buffer. You cannot on TypedArrays. Also you won't have the classic entries(), values(), keys() and length support.
Buffer is not supported in the frontend while TypedArray may well be. So if your code is shared between frontend or backend you might consider sticking to one.
More info in the docs here.
This is a tough one, but I think that it will depend on what are you planning to do with them and how much data you are planning to work with?
typed arrays themselves need node buffers, but are easier to play with and you can overcome the 1GB limit (kMaxLength = 0x3fffffff).
If you are doing common stuff such as iterations, setting, getting, slicing, etc... then typed arrays should be your best shot for performance, not memory ( specially if you are dealing with float and 64bits integer types ).
In the end, probably only a good benchmark with what you want to do can shed real light on this doubt.
Related
When running Raku code on Rakudo with the MoarVM backend, is there any way to print information about how a given Str is stored in memory from inside the running program? In particular, I am curious whether there's a way to see how many Strands currently make up the Str (whether via Raku introspection, NQP, or something that accesses the MoarVM level (does such a thing even exist at runtime?).
If there isn't any way to access this info at runtime, is there a way to get at it through output from one of Rakudo's command-line flags, such as --target, or --tracing? Or through a debugger?
Finally, does MoarVM manage the number of Strands in a given Str? I often hear (or say) that one of Raku's super powers is that is can index into Unicode strings in O(1) time, but I've been thinking about the pathological case, and it feels like it would be O(n). For example,
(^$n).map({~rand}).join
seems like it would create a Str with a length proportional to $n that consists of $n Strands – and, if I'm understanding the datastructure correctly, that means that into this Str would require checking the length of each Strand, for a time complexity of O(n). But I know that it's possible to flatten a Strand-ed Str; would MoarVM do something like that in this case? Or have I misunderstood something more basic?
When running Raku code on Rakudo with the MoarVM backend, is there any way to print information about how a given Str is stored in memory from inside the running program?
My educated guess is yes, as described below for App::MoarVM modules. That said, my education came from a degree I started at the Unseen University, and a wizard had me expelled for guessing too much, so...
In particular, I am curious whether there's a way to see how many Strands currently make up the Str (whether via Raku introspection, NQP, or something that accesses the MoarVM level (does such a thing even exist at runtime?).
I'm 99.99% sure strands are purely an implementation detail of the backend, and there'll be no Raku or NQP access to that information without MoarVM specific tricks. That said, read on.
If there isn't any way to access this info at runtime
I can see there is access at runtime via MoarVM.
is there a way to get at it through output from one of Rakudo's command-line flags, such as --target, or --tracing? Or through a debugger?
I'm 99.99% sure there are multiple ways.
For example, there's a bunch of strand debugging code in MoarVM's ops.c file starting with #define MVM_DEBUG_STRANDS ....
Perhaps more interesting are what appears to be a veritable goldmine of sophisticated debugging and profiling features built into MoarVM. Plus what appear to be Rakudo specific modules that drive those features, presumably via Raku code. For a dozen or so articles discussing some aspects of those features, I suggest reading timotimo's blog. Browsing github I see ongoing commits related to MoarVM's debugging features for years and on into 2021.
Finally, does MoarVM manage the number of Strands in a given Str?
Yes. I can see that the string handling code (some links are below), which was written by samcv (extremely smart and careful) and, I believe, reviewed by jnthn, has logic limiting the number of strands.
I often hear (or say) that one of Raku's super powers is that is can index into Unicode strings in O(1) time, but I've been thinking about the pathological case, and it feels like it would be O(n).
Yes, if a backend that supported strands did not manage the number of strands.
But for MoarVM I think the intent is to set an absolute upper bound with #define MVM_STRING_MAX_STRANDS 64 in MoarVM's MVMString.h file, and logic that checks against that (and other characteristics of strings; see this else if statement as an exemplar). But the logic is sufficiently complex, and my C chops sufficiently meagre, that I am nowhere near being able to express confidence in that, even if I can say that that appears to be the intent.
For example, (^$n).map({~rand}).join seems like it would create a Str with a length proportional to $n that consists of $n Strands
I'm 95% confident that the strings constructed by simple joins like that will be O(1).
This is based on me thinking that a Raku/NQP level string join operation is handled by MVM_string_join, and my attempts to understand what that code does.
But I know that it's possible to flatten a Strand-ed Str; would MoarVM do something like that in this case?
If you read the code you will find it's doing very sophisticated handling.
Or have I misunderstood something more basic?
I'm pretty sure I will have misunderstood something basic so I sure ain't gonna comment on whether you have. :)
As far as I understand it, the fact that MoarVM implements strands (aka, a concatenating two strings will only result in creation of a strand that consists of "references" to the original strings), is really that: an implementation detail.
You can implement the Raku Programming Language without needing to implement strands. Therefore there is no way to introspect this, at least to my knowledge.
There has been a PR to expose the nqp:: op that would actually concatenate strands into a single string, but that has been refused / closed: https://github.com/rakudo/rakudo/pull/3975
I have been writing a small kernel lately based on x86-64 architecture. When taking care of some user space code, I realized I am virtually not using rbp. I then looked up at some other things and found out compilers are getting smarter these days and they really dont use rbp anymore. (I could be wrong here.)
I was wondering if conventional use of rbp/epb is not required anymore in many instances or am I wrong here. If that kind of usage is not required then can it be used like a general purpose register?
Thanks
It is only needed if you have variable-length arrays in your stack frames (recording the array length would require more memory and more computations). It is no longer needed for unwinding because there is now metadata for that.
It is still useful if you are hand-writing entire assembly functions, but who does that? Assembly should only be used as glue to jump into a C (or whatever) function.
Copying (generational) garbage collection offers the best performance of any form of automatic memory management, but requires pointers to relocated chunks of data be fixed up. This is enabled, in languages which support this memory management technique, by disallowing pointer arithmetic and making sure all pointers are to the beginning of identifiable objects.
If you're generating code at run time with a JIT compiler, things look a bit trickier because return addresses on the call stack will point to, not the beginning of code blocks, but random locations within them, so fixup is a problem.
How is this typically solved?
Quite often, you don't relocate code. This is both because it is indeed complicated to fix the stack and other addresses (think jumps across code fragments), and because you don't actually need garbage collection for such code (as it is only manipulated by code you write anyway, so you can do manual memory management). You also don't expect to create a whole lot of machine code (compared to application objects), so fragmentation etc. is not a concern.
If you insist on moving machine code and fixing up the stack, there is a way, I think: Similar to Mark-Compact, build a "break table" (I have no idea where this name comes from; "relocation table" might be clearer) that tells you the amount by which you should adjust pointers to moved objects. Now, walk the stack for return addresses (highly platform-specific, of course) and fix them if they refer to relocated code. Instead of looking for exact matches, search for the highest address lower than the return address you're currently replacing. You can check that this address indeed refers to some machine code that moved by looking at the object size (you have a pointer to the start of the object, after all). This approach isn't feasible for all objects, for various reasons.
There are other reasons to do something similar though. Some JIT compilers feature on-stack replacement, which means creating a new version (e.g. more optimized, or less optimized) of some machine code and replacing all occurrences of the old version with it. This is far more complicated than just fixing the return addresses though. You have to ensure the new version logically continue where the old one was left hanging. I am not familiar with how this is implemented, so I will not go into detail.
When studying Java I learned that Strings were not safe for storing passwords, since you can't manually clear the memory associated with them (you can't be sure they will eventually be gc'ed, interned strings may never be, and even after gc you can't be sure the physical memory contents were really wiped). Instead, I were to use char arrays, so I can zero-out them after use. I've tried to search for similar practices in other languages and platforms, but so far I couldn't find the relevant info (usually all I see are code examples of passwords stored in strings with no mention of any security issue).
I'm particularly interested in the situation with browsers. I use jQuery a lot, and my usual approach is just the set the value of a password field to an empty string and forget about it:
$(myPasswordField).val("");
But I'm not 100% convinced it is enough. I also have no idea whether or not the strings used for intermediate access are safe (for instance, when I use $.ajax to send the password to the server). As for other languages, usually I see no mention of this issue (another language I'm interested in particular is Python).
I know questions attempting to build lists are controversial, but since this deals with a common security issue that is largely overlooked, IMHO it's worth it. If I'm mistaken, I'd be happy to know just from JavaScript (in browsers) and Python then. I was also unsure whether to ask here, at security.SE or at programmers.SE, but since it involves the actual code to safely perform the task (not a conceptual question) I believe this site is the best option.
Note: in low-level languages, or languages that unambiguously support characters as primitive types, the answer should be obvious (Edit: not really obvious, as #Gabe showed in his answer below). I'm asking for those high level languages in which "everything is an object" or something like that, and also for those that perform automatic string interning behind the scenes (so you may create a security hole without realizing it, even if you're reasonably careful).
Update: according to an answer in a related question, even using char[] in Java is not guaranteed to be bulletproof (or .NET SecureString, for that matter), since the gc might move the array around so its contents might stick in the memory even after clearing (SecureString at least sticks in the same RAM address, guaranteeing clearing, but its consumers/producers might still leave traces).
I guess #NiklasB. is right, even though the vulnerability exists, the likelyhood of an exploit is low and the difficulty to prevent it is high, that might be the reason this issue is mostly ignored. I wish I could find at least some reference of this problem concerning browsers, but googling for it has been fruitless so far (does this scenario at least have a name?).
The .NET solution to this is SecureString.
A SecureString object is similar to a String object in that it has a text value. However, the value of a SecureString object is automatically encrypted, can be modified until your application marks it as read-only, and can be deleted from computer memory by either your application or the .NET Framework garbage collector.
Note that even for low-level languages like C, the answer isn't as obvious as it seems. Modern compilers can determine that you are writing to the string (zeroing it out) but never reading the values you read out, and just optimize away the zeroing. In order to prevent optimizing away the security, Windows provides SecureZeroMemory.
For Python, there's no way to do that that, according to this answer. A possibility would be using lists of characters (as length-1 strings or maybe code units as integers) instead of strings, so you can overwrite that list after use, but that would require every code that touches it to support this format (if even a single one of them creates a string with its contents, it's over).
There is also a mention to a method using ctypes, but the link is broken, so I'm unaware of its contents. This other answer also refers to it, but there's not a lot of detail.
Tagged pointers are a common optimization when implementing dynamic languages: take advantage of alignment requirements that mean the low two or three bits of a pointer will always be zero, and use them to store type information.
Suppose you're using the Boehm garbage collector, which basically works by looking at active data for things that look like pointers. Tagged pointers don't look like pointers, in the sense that their low bits are nonzero.
Is this a showstopper, i.e. do you have to ditch tagged pointers if you're using Boehm? Or does it have a way around this problem?
AFAIK Boehm can handle this with the right options. It is capable, at a small price, of detecting interior pointers. It is also possible to write your own scanning code. Basically there are probably enough hooks to handle just about anything.
I have written my own collector, it is precise on the heap and conservative on the stack. It does not touch C made pointers. For some applications it will be faster because it knows a lot about my language allocated objects and doesn't care about other stuff which is managed, say, using traditional C++ destructors.
However it isn't incremental or generational, and it doesn't handle threads as well (it's not smart enough to stop threads with signals). On the plus side, however, it doesn't require magic linkage techniques which Boehm does (to capture mallocs, etc). On the seriously minus side you can't put managed objects into unmanaged ones.