Why does MSVC STL not implement sync_with_stdio()? - visual-c++

I looked for the implementation of this function in the MSVC STL source on GitHub, but found no code beyond setting the synchronization flag.
Next, I wrote a simple program that calls this function and ran it under a debugger, hoping to catch a read of that flag's address. But while the program ran, nothing read the flag except the function itself.
Why didn't Microsoft implement this feature?
After that, I went to cppreference and was surprised by the detailed description of the function. Where does this information come from? It turns out the cppreference examples use GCC, so I went to look at the libstdc++ source, where I found that this function is fully implemented.
Why did Microsoft decide to do this? Maybe their goal is to ensure the security and full synchronization of I/O streams?

"setting the synchronization flag" and returning the old one is all it is supposed to do as far as the standard goes: https://eel.is/c++draft/ios.base#ios.members.static-1
Otherwise, called with a false argument, it allows the standard streams to operate independently of the standard C streams.
"allows" doesn't mean "shall". It just means that if you didn't call it with false, the library is not allowed to buffer the standard streams independently of the C streams.
Cppreference says the same https://en.cppreference.com/w/cpp/io/ios_base/sync_with_stdio
If the synchronization is turned off, the C++ standard streams are allowed to buffer their I/O independently
Microsoft (here) and
LLVM (here) did not make use of that allowance and simply update a static boolean flag that is not otherwise used. GNU libstdc++ is one current implementation that went a step further and gave unsynchronized C++ streams their own buffers (here).

Static patching of ELF64 binary on Linux

I have a static executable ELF64 binary. There are certain functions that I want to override and change the handling. It will result in new functions getting added. Now I need to patch this executable with the new handling.
I understand this isn't a new problem, but I couldn't find anything conclusive that works:
LD_PRELOAD isn't useful because I am working with static binaries.
eresi/elfsh doesn't work for x86-64 ELF binaries.
pwntools and other ELF tools lack the ability to patch and extend segments.
I don't want the runtime instrumentation provided by Pin, Valgrind, and similar tools: runtime instrumentation hurts performance, and my executable will be spawned as thousands of processes on the same machine. (A performance hit on the order of ~5% is still OK.)
https://www.blackhat.com/presentations/bh-asia-02/Clowes/bh-asia-02-clowes.pdf
This explains the problem statement and the technical aspects without naming the specific tools used.
What are my other options?
Hot patching would be easier to do and would give you a greater range of choice in implementation, but it sounds like you've already considered this and decided you prefer patching the binary by replacing the symbols instead.
Static binary patchers are easier to develop but they put more burden on the user. Consider elfpatch. In order to use this tool to patch an executable called myelf to replace a function called myfunc() with your own function that returns 0 you would do:
elfpatch.py --apply --symbol-file symbols.txt myelf
Here the contents of symbols.txt would be:
myfunc b800000000c3
Where b800000000c3 is x86 machine code for return 0;. This is simple enough as long as the patched function is simple, but the more complicated the function gets, the more complicated the process is going to get, especially if you have to preserve and call the original function, and if that function calls other functions in other libraries...

v8 Engine - Why is calling native code from JS so expensive?

Based on multiple answers to other questions, calling native C++ from Javascript is expensive.
I checked myself with the node module "benchmark" and came to the same conclusion.
A simple JS function can take ~90 000 000 calls/sec when called directly, whereas calling a C++ function tops out at about 25 000 000 calls/sec. That in itself is not that bad.
But when object creation is added, the JS version still manages about 70 000 000 calls/sec, while the native version suffers dramatically and drops to about 2 000 000.
I assume this has to do with the dynamic nature of how the v8 engine works, and the fact that it compiles the JS code to byte code.
But what keeps them from implementing the same optimizations for the C++ code? (or at least calling / insight into what would help there)
(V8 developer here.) Without seeing the code that you ran, it's hard to be entirely sure what effect you were observing, and based on your descriptions I can't reproduce it. Microbenchmarks in particular tend to be tricky, and the relative speedups or slowdowns they appear to be measuring are often misleading, unless you've verified that what happens under the hood is exactly what you expect to be happening. For example, it could be the case that the optimizing compiler was able to eliminate the entire workload because it could statically prove that the result isn't used anywhere. Or it could be the case that no calls were happening at all, because the compiler chose to inline the callee.
Generally speaking, crossing the JS/C++ boundary is what has a certain cost, due to different calling conventions and some other checks and preparations that need to be done, like checking for exceptions that may have been thrown. Both one JavaScript function calling another, and one C++ function calling another, will be faster than JavaScript calling into C++ or the other way round.
This boundary crossing cost is unrelated to the level of compiler optimization on either side. It's also unrelated to byte code. ("Hot", i.e. frequently executed, JavaScript functions are compiled to machine code anyway.)
Lastly, V8 is not a C++ compiler. It's simply not built to do any optimizations for C++ code. And even if it tried to, there's no reason to assume it could do a better job than your existing C++ compiler with -O3. (V8 also doesn't even see the source code of your C++ module, so before you could experiment with recompiling that, you'd have to figure out how to provide that source.)
Without delving into specific V8 versions and their internal reasons, I can say that the overhead is not in how the C++ backend works vs. the JavaScript one, but rather in the pathway between the languages - that is, the binary interface that implements the invocation of a native method from JavaScript land, and vice versa.
The operations involved in a cross-invocation, to my understanding, are:
Prepare the arguments.
Save the JS context.
Invoke the gate code (which implements the bridge).
The bridge translates the arguments into C++-style parameters.
The bridge also translates the calling convention to match C++.
The bridge invokes a C++ runtime API wrapper in V8.
This wrapper calls the actual method to perform the operation.
The same is performed in reverse when the C++ function returns.
Maybe there are additional steps involved here, but I guess this in itself suffices to explain why the overhead surfaces.
Now, coming to JS optimizations: the JIT compiler that comes with the V8 engine has two parts: the first simply converts the script into machine code, and the second optimizes the code based on collected profile information. This is a purely dynamic process and a great, unique opportunity that a C++ compiler, working in the static compilation space, cannot match. For example, if a block of JS code creates and destroys an object without letting it escape the block's scope, the JIT can optimize the object creation (e.g. via escape analysis, replacing the heap allocation), whereas when the native version is invoked, the object will always live in the JS heap.
Thanks for bringing this up, it is an interesting conversation!

Relation between MSVC Compiler & linker option for COMDAT folding

This question has some answers on SO but mine is slightly different. Before marking as duplicate, please give it a shot.
MSVC has always provided the /Gy compiler option to enable identical functions to be folded into COMDAT sections. At the same time, the linker also provides the /OPT:ICF option. Is my understanding right that these two options must be used in conjunction? That is, while the former packages functions into COMDAT, the latter eliminates redundant COMDATs. Is that correct?
If yes, then either we use both or turn off both?
Answer from someone who communicated with me off-line. Helped me understand these options a lot better.
===================================
That is essentially true. Suppose we talk just C, or C++ but with no member functions. Without /Gy, the compiler creates object files that are in some sense irreducible. If the linker wants just one function from the object, it gets them all. This is especially a consideration in programming for libraries: if you mean to be kind to the library's users, you should write your library as lots of small object files, typically one non-static function per object, so that the user of the library doesn't bloat from having to carry code that actually never executes.
With /Gy, the compiler creates object files that have COMDATs. Each function is in its own COMDAT, which is to some extent a mini-object. If the linker wants just one function from the object, it can pick out just that one. The linker's /OPT switch gives you some control over what the linker does with this selectivity - but without /Gy there's nothing to select.
Or very little. It's at least conceivable that the linker could, for instance, fold functions that are each the whole of the code in an object file and happen to have identical code. It's certainly conceivable that the linker could eliminate a whole object file that contains nothing that's referenced. After all, it does this with object files in libraries. The rule in practice, however, used to be that if you add a non-COMDAT object file to the linker's command line, then you're saying you want that in the binary even if unreferenced. The difference between what's conceivable and what's done is typically huge.
Best, then, to stick with the quick answer. The linker options benefit from being able to separate functions (and variables) from inside each object file, but the separation depends on the code and data to have been organised into COMDATs, which is the compiler's work.
===================================
As answered by Raymond Chen in Jan 2013
As explained in the documentation for /Gy, function-level linking
allows functions to be discardable during the "unused function" pass,
if you ask for it via /OPT:REF. It does not alter the actual classical
model for linking. The flag name is misleading. It's not "perform
function-level linking". It merely enables it by telling the linker
where functions begin and end. And it's not so much function-level
linking as it is function-level unlinking. -Raymond
(This snippet might make more sense with some further context: here are the posts about the classical linking model: 1, 2.)
So, in a nutshell: yes. If you activate one switch without the other, there will be no observable impact.

Android NDK shared memory: how to use ashmem_create_region?

I've found some guides on using shared memory in the Android OS. I've learned that shm_open no longer exists in Android, due to memory leaks caused by processes being force-killed by the OS or the user.
The ASHMEM functions were developed instead. But I cannot find the declaration of ashmem_create_region() and the other functions anywhere in my NDK. Where are they?
As with so many things in Android, the answer is to use JNI. The Java class java.nio.MappedByteBuffer wraps ashmem and provides read/write methods to access it.
Unfortunately, if you're using shared memory to boost performance, multiple round trips through JNI aren't an attractive proposition. Cedric Fung proposes using reflection to retrieve the ashmem handle by name, which will work but may break in future frameworks. (This does happen, BTW. All it takes is somebody deciding that "mFD" is too vague and "mFileDescriptor" would be a better name, or some such.) If you want to play with fire, I'd suggest retrieving the descriptor by type rather than by name, since the type is very unlikely to change.
Cedric also proposes implementing a Binder in C++, but this puts you back where you started because Binder is also not included in the NDK. Instead, you'd need to pass the handle via a binder service implemented in Java.
It's a lot of work for such a simple feature, I know. It's easier to just mmap a file and use that instead, which is too bad since a basic file mapping isn't nearly as mobile-friendly as ashmem. :-(
The header is inside system/core/include/cutils/ashmem.h of AOSP.
You must not use it in a regular application, as the ashmem functions aren't part of the NDK:
https://groups.google.com/forum/#!topic/android-ndk/eS9QK8EY968

How to make a fix in one of the shared libraries (.so) in the project on linux?

I want to make a quick fix to one of the project's .so libraries. Is it safe to just recompile the .so and replace the original? Or do I have to rebuild and reinstall the whole project? Or does it depend?
It depends. The shared library needs to be binary-compatible with your executable.
For example:
If you changed the behaviour of one of the library's internal functions, you probably don't need to recompile.
If you changed the size of a struct known to the application (e.g. by adding a member), you will need to recompile; otherwise the library and the application will disagree about the struct's size, and the program will crash when the library tries to read an extra, uninitialized member that the application never wrote.
If you change the type or the position of arguments of any function visible to the application, you do need to recompile, because the library will try to read more arguments off the stack than the application has put on it. (This is the case with C; in C++ argument types are part of the function signature, so the app will refuse to run rather than crash.)
The rule of thumb (for production releases) is that, if you are not consciously aware that you are maintaining binary compatibility, or not sure what binary compatibility is, you should recompile.
That's certainly the intent of using dynamic libraries: if something in the library needs updating, then you just update the library, and programs that use it don't need to be changed. If the signature of the function you're changing doesn't change, and it accomplishes the same thing, then this will in general be fine.
There are of course always edge cases where a program depends on some undocumented side-effect of a function, and then changing that function's implementation might change the side-effect and break the program; but c'est la vie.
If you have not changed the ABI of the shared library, you can just rebuild and replace the library.
Yes, it depends.
However, assuming you still have the exact same sources and compiler that built everything else: if you only change something in a .cpp file, it is fine.
Other changes, e.g. changing an interface (between the shared lib and the rest of the system) in a header file, are not fine.
If you don't change your library binary interface, it's ok to recompile and redeploy only the shared library.
Good references:
How To Write Shared Libraries
The Little Manual of API Design