Size of fay generated file - haskell

I tried fay-jquery and the included sample test.hs file results in whooping 150 kb of js file.
Even with closure compiling it is still 20 kb.
I understand that it must carry a runtime, stdlib and jquery wrappers with it.
I can tell fay not to generate stdlib (--no-stdlib and --no-builtins flags).
But i do not know how to tell it not to include jquery code.
So my question is, how can i split those static parts into a separate js file and only generate module specific code?
This way large parts of code will be loaded only once (and cached) and i can create many smaller js files for separate web pages.

Yes it's safe to split modules up, as of Fay 0.16 all modules can exist standalone (before that you could still have the runtime and fay-base separate). There are some flags for this, --print-runtime and --no-stdlib. Compile with optimizations (-O, this increases the output size, but closure will be able to minimize it even better).
Also remember that the web server should gzip this. That brings the code size down to 4.5kiB. That's pretty decent, right?
You might want to consider putting all of your javascript in one file, that means a slower initial load but then users will have it cached for future page loads.
The reason the file size is so big is that fay-jquery has a lot of FFI bindings which produce a lot of transcoding information. I think fay-jquery could be optimized a lot here to for instance use Ptr JQuery rather than just JQuery in the types, or by figuring out that a lot of this is unnecessary while compiling, or abstracting the conversions more in the compiler's output.
Another possible issue I realized a couple of days ago is that the output is now in the global scope rather than in a closure, which might mean that google closure can't remove redundant code as well as previously (haven't had time to investigate this yet). The modules generation should perhaps be changed to produce a closure for each module.
Also see Reducing output size on the wiki.

Related

Clang++: Use AddressSanitizer for small portion of library

I am working on a quite large code base and some snippets I added are causing weird memory behaviour so I considered using the -fsanitize=address option for clang++.
However it seems I cannot make it compile the whole library, because of a whole bunch of errors caused by the linker.
I am not really aware of the inner workings of the AddressSanitizer, but is it generally possible to apply it only to a small portion of a library's code base (let's say to a single header) and if so how?

How to Decompile Bytenode "jsc" files?

I've just seen this library ByteNode it's the same as ByteCode of java but this is for NodeJS.
This library compiles your JavaScript code into V8 bytecode, which protect your source code, I'm wondering is there anyway to Decompile byteNode therefore it's not secure enough. I'm wondering because I would like to protect my source code using this library?
TL;DR It'll raise the bar to someone copying the code and trying to pass it off as their own. It won't prevent a dedicated person from doing so. But the primary way to protect your work isn't technical, it's legal.
This library compiles your JavaScript code into V8 bytecode, which protect your source code...
Well, we don't know it's V8 bytecode, but it's "compiled" in some sense. All we know is that it creates a "code cache" via the built-in vm.Script.prototype.createCachedData API, which is officially just a cache used to speed up recompiling the code a second time, third time, etc. In theory, you're supposed to also provide the original source code as a string to the vm.Script constructor. But if you go digging into Node.js's vm.Script and V8 far enough it seems to be the actual code in some compiled form (whether actual V8 bytecode or not), and the code string you give it when running is ignored. (The ByteNode library provides a dummy string when running the code from the code cache, so clearly the actual code isn't [always?] needed.)
I'm wondering is there anyway to Decompile byteNode therefore it's not secure enough.
Naturally, otherwise it would be useless because Node.js wouldn't be able to run it. I didn't find a tool to do it that already exists, but since V8 is open source, it would presumably be possible to find the necessary information to write a decompiler for it that outputs valid JavaScript source code which someone could then try to understand.
Experimenting with it, local variable names appear to be lost, although function names don't. Comments appear to get lost (this may not be as obvious as it seems, given that Function.prototype.toString is required to either return the original source text or a synthetic version [details]).
So if you run the code through a minifier (particularly one that renames functions), then run it through ByteNode (or just do it with vm.Script yourself, ByteNode is a fairly thin wrapper), it will be feasible for someone to decompile it into something resembling source code, but that source code will be very hard to understand. This is very similar to shipping Java class files, which can be decompiled (there's even a standard tool to do it in the JDK, javap), except that the format Java class files are well-documented and don't change from one dot release to the next (though they can change from one major release to another; new releases always support the older format, though), whereas the format of this data is not documented (though it's an open source project) and is subject to change from one dot release to the next.
Certain changes, such as changing the copyright message, are probably fairly easy to make to said source code. More meaningful changes will be harder.
Note that the code cache appears to have a checksum or other similar integrity mechanism, since directly editing the .jsc file to swap one letter for another in a literal string makes the code cache fail to load. So someone tampering with it (for instance, to change a copyright notice) would either need to go the decompilation/recompilation route, or dive into the V8 source to find out how to correct the integrity check.
Fundamentally, the way to protect your work is to ensure that you've put all the relevant notices in the relevant places such that the fact copying it is a violation of copyright is clear, then pursue your legal recourse should you find out about someone passing it off as their own.
is there any way
You could get a hundred answers here saying "I don't know a way", but that still won't guarantee that there isn't one.
not secure enough
Secure enough for what? What's your deployment scenario? What kind of scenario/attack are you trying to defend against?
FWIW, I don't know of an existing tool that "decompiles" V8 bytecode (i.e. produces JavaScript source code with the same behavior). That said, considering that the bytecode is a fairly straightforward translation of the source code, I'm sure it wouldn't be very hard to write such a tool, if someone had a reason to spend some time on it. After all, V8's JS-to-bytecode compiler is open source, so one would only have to look at those sources and implement the reverse direction. So I would assume that shipping as bytecode provides about as much "protection" as shipping as uglified JavaScript, i.e. none that I would trust.
Before you make any decisions, please also keep in mind that bytecode is considered an internal implementation detail of V8; in particular it is not versioned and can change at any time, so it has to be created by exactly the same V8 version that consumes it. If you want to update your Node.js you'll have to recreate all the bytecode, and there is no checking or warning in place that will point out when you forgot to do that.
Node.js source already contains code for decompiling binary bytecode.
You can get a text string from your V8 bytecode and then you would need to analyze it.
But text string would be very long and miss some important information such as a constant pool. So you need to modify the Node.js source.
Please check https://github.com/3DGISKing/pkg10.17.0
I have attached exported xml file.
If you study V8, it would be possible to analyze it and get source code from it.
It keeping it short and sweet, You can try Ghidra node.js package which is based on Ghidra reverse engineering framework which was open-sourced by NSA in the year 2019. Ghidra is capable of disassembling and decompiling the v8 bytecode. The inner working of disassembling is quite complex, this answer is short but sufficient.

React: Importing modules with object destructuring, or individually?

In React, some packages allow you to import Components using either individual assignment: import Card from "#material-ui/core/Card", or via object destructuring: import { Card } from "#material-ui/core".
I read in a Blog that using the object destructuring syntax can have performance ramifications if your environment doesn't have proper tree-shaking functionality. The result being that every component of #material-ui/core is imported, not just the one you wanted.
In what situations could using object destructuring imports cause a decline in application performance and how serious would the impact be? Also, in an environment that does have all the bells and whistles, like the default create-react-app configuration, will using one over the other make any difference at all?
Relying on package internal structure is often discouraged but it's officially valid in Material UI:
import Card from '#material-ui/core/Card';
In order to not depend on this and keep imports shorter, top-level exports can be used
import { Card } from "#material-ui/core"
Both are interchangeable, as long as the setup supports tree-shaking. In case unused top-level exports can be tree-shaken, the second option is preferable. Otherwise the the first option is preferable, it guarantees unused package imports to not be included into the bundle.
create-react-app uses Webpack configuration that supports tree-shaking and can benefit from the second option.
Loading in extra code, such as numerous components from material-ui you may not need, has two primary performance impacts: Download time, and execution time.
Download time is simple: Your JS file(s) are larger, therefore take longer to download, especially over slower connections such as mobile. Properly slimming down your JS using mechanisms like tree shaking is always a good idea.
Execution time is a little less apparent, but also has a similar effect, this time to browsers with less computing power available - again, primarily mobile. Even if the components are never used, the browser must still parse and execute the source and pull it into memory. On your desktop with a powerful processor and plenty of memory you'll probably never notice the difference, but on a slower/older computer or mobile device you may notice a small lag even after the file(s) finish downloading as they are processed.
Assuming your build tooling has properly working tree shaking, my opinion is generally they are roughly equivalent. The build tool will not include the unused components into the compiled JS, so it shouldn't impact either download or execution time.

template instantiation statistics from compilers

Is there a way to get a summary of the instantiated templates (with what types and how many times - like a histogram) within a translation unit or for the whole project (shared object/executable)?
If I have a large codebase and I want to take advantage of the C++11 extern keyword I would like to know which templates are most used within my project (or from the internals of stl - like std::less<MyString> for example).
Also is it possible to have a weight assigned to each template instantiation (time spent by the compiler)?
Even if only one (c++11 enabled) compiler gives me such statistics I would be happy.
How difficult would it be to implement such a thing with Clang's LibTooling?
And is this even reasonable? Many people told me that I can reason which template instantiations I should extern without the use of a tool...
There are several ways to attack this problem.
If you are working with an open-source compiler, it's not hard to make a simple change to the source code that will trace all template substantiations.
If that sounds like too much hassle, you can also try to force the compiler to produce a warning on each template instantiation for a given symbol. Steven Watanabe has written a set of tools that can help you with that.
Finally, possibly the best options is to use the debugging symbols (or map files), generated by the compiler, to track down how many times each function appears in the final image and more importantly how much does it add to the weight in bytes. The best example for such a tool is Andrian Stone's SymbolSort, which is based on the Microsoft's toolset. Another similar tool is the Map File Browser.

What are the porting issues going from VC8 (VS2005) to VC9 (VS2008)?

I have inherited a very large and complex project (actually, a 'solution' consisting of 119 'projects', most of which are DLLs) that was built and tested under VC8 (VS2005), and I have the task of porting it to VC9 (VS2008).
The porting process I used was:
Copy the VC8 .sln file and rename it
to a VC9 .sln file.
Copy all of
the VC8 project files, and rename
them to VC9 project files.
Edit
all of the VC9 project files,
s/vc8/vc9.
Edit the VC9 .sln,
s/vc8/vc9/
Load the VC9 .sln with
VS2008, and let the IDE 'convert'
all of the project files.
Fix
compiler and linker errors until I
got a good build.
So far, I have run into the following issues in that last step.
1) A change in the way decorated names are calculated, causing truncation of the names.
This is more than just a warning (http://msdn.microsoft.com/en-us/library/074af4b6.aspx). Libraries built with this warning will not link with other modules. Applying the solution given in MSDN was non-trivial, but doable. I addressed this problem separately in How do I increase the allowed decorated name length in VC9 (MSVC 2008)?
2) A change that does not allow the assignment of zero to an iterator. This is per the spec, and it was fairly easy to find and fix these previously-allowed coding errors. Instead of assignment of zero to an iterator, use the value end().
3) for-loop scope is now per the ANSI standard. Another easy-to-fix problem.
4) More space required for pre-compiled headers. In some cases a LOT more space was required. I ended up using /Zm999 to provide the maximum PCH space. If PCH memory usage gets bumped up again, I assume that I will have to forgo PCH altogether, and just endure the increase in what is already a very long build time.
5) A change in requirements for copy ctors and default dtors. It appears that in template classes, under certain conditions that I haven't quite figured out yet, the compiler no longer generates a default ctor or a default dtor. I suspect this is a bug in VC9, but there may be something else that I'm doing wrong. If so, I'd sure like to know what it is.
6) The GUIDs in the sln and vcproj files were not changed. This does not appear to impact the build in any way that I can detect, but it is worrisome nevertheless.
Note that despite all of these issues, the project built, ran, and passed extensive QA testing under VC8. I have also back-ported all of the changes to the VC8 projects, where they still build and run just as happily as they did before (using VS2005/VC8). So, all of my changes required for a VC9 build at least appear to be backward-compatible, although the regression testing is still underway.
Now for the really hard problem: I have run into a difference in the startup sequence between VC8 and VC9 projects. The program uses a small-object allocator modeled after Loki, in Andrei Alexandrescu's Book Modern C++ Design. This allocator is initialized using a global variable defined in the main program module.
Under VC8, this global variable is constructed at the very beginning of the program startup, from code in a module crtexe.c. Under VC9, the first module that executes is crtdll.c, which indicates that the startup sequence has been changed. The DLLs that are starting up appear to be confusing the small-object allocator by allocating and deallocating memory before the global object can initialize the statistics, which leads to some spurious diagnostics. The operation of the program does not appear to be materially affected, but the QA folks will not allow the spurious diagnostics to get past them.
Is there some way to force the construction of a global object prior to loading DLLs?
What other porting issues am I likely to encounter?
Is there some way to force the construction of a global object prior to loading DLLs?
How about the DELAYLOAD option? So that DLLs aren't loaded until their first call?
That is a tough problem, mostly because you've inherited a design that's inherently dangerous because you're not supposed to rely on the initialization order of global variables.
It sounds like something you could try to work around by replacing the global variable with a singleton that other functions retrieve by calling a global function or method that returns a pointer to the singleton object. If the object exists at the time of the call, the function returns a pointer to it. Otherwise, it allocates a new one and returns a pointer to the newly allocated object.
The problem, of course, is that I can't think of a singleton implementation that would avoid the problem you're describing. Maybe this discussion would be useful: http://www.oneunified.net/blog/Personal/SoftwareDevelopment/CPP/Singleton.article
That's certainly an interesting problem. I don't have a solution other than perhaps to change the design so that there is no dependence on undefined behavior of the order or link/dll startup. Have you considered linking with the older linker? (or whatever the VS.NET term is)
Because the behavior of your variable and allocator relied on some (unknown at the time) arbitrary order of startup I would probably fix that so that it is not an issue in the future. I guess you are really asking if anyone knows how to do some voodoo in VC9 to make the problem disappear. I am interested in hearing it as well.
How about this,
Make your main program a DLL too, call it main.dll, linked to all the other ones, and export the main function as say, mainEntry(). Remove the global variable.
Create a new main exe which has the global variable and its initialization, but doesn't link statically to any of the other application DLLs (except for the allocator stuff).
This new main.exe then dynamically loads the main.dll using LoadLibrary(), then uses GetProcAddress to call mainEntry().
The solution to the problem turned out to be more straightforward than I originally thought. The initialization order problem was caused by the existence of several global variables of types derived from std container types (a basic design flaw that predated my position with that company). The solution was to replace all such globals with singletons. There were about 100 of them.
Once this was done, the initialization (and destruction) order was under programmer control.

Resources