I downloaded the .ply files from the Stanford 3D Scanning Repository and am using Stanford's code from that page (ply.h, plyfile.c) to parse them. Looking at this code, however, I see that it's rife with mallocs that are never freed. I could close my eyes and look the other way, but it makes my teeth itch.
I can think of two workarounds:
One is to use Hans Boehm's garbage collector, or something similar, which lets you redefine malloc so that every allocation goes through a collector that reclaims unreachable memory. I've never used this library, but perhaps there's a way to have it operate just on the mallocs in the Stanford code and not anywhere else.
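To make that idea concrete, here is a minimal sketch of such a shim, assuming the Boehm collector (libgc) is installed. The header name is my own invention; note that GC_INIT() must run once at startup, and the collector does not scan the regular malloc heap by default, so pointers held only there could be reclaimed early.

    /* gc_shim.h -- a sketch of confining Boehm GC to the Stanford parser.
       Compile only plyfile.c with it (e.g. gcc -include gc_shim.h -c plyfile.c)
       and link with -lgc; the rest of the program keeps the normal allocator.
       Call GC_INIT() once at startup before the first parse. */
    #ifndef GC_SHIM_H
    #define GC_SHIM_H

    #include <gc.h>

    #define malloc(n)     GC_MALLOC(n)
    #define calloc(m, n)  GC_MALLOC((size_t)(m) * (n))  /* GC_MALLOC zeroes */
    #define realloc(p, n) GC_REALLOC((p), (n))
    #define free(p)       GC_FREE(p)

    #endif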
The other workaround is to use a different parser, preferably a C++ one with nicely RAII-ified memory management. I see a few alternative parsers and converters listed at the above link, but rather than kill a day or two trying them all, I was hoping to get a recommendation here.
Can anybody recommend a way to parse .ply files without memory leaks, either by containing the memory leaks in the Stanford parser, or using a different parser, or by some third method I haven't thought of?
Try RPly as well; it's a small ANSI C library that keeps ownership of everything it allocates and releases it in ply_close().
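A minimal read might look like this, based on RPly's documented callback API (input.ply is a placeholder file name):

    #include <stdio.h>
    #include <rply.h>

    /* Called once per vertex coordinate; RPly hands the value over as a double. */
    static int vertex_cb(p_ply_argument argument) {
        printf("%g ", ply_get_argument_value(argument));
        return 1;  /* nonzero means "keep parsing" */
    }

    int main(void) {
        p_ply ply = ply_open("input.ply", NULL, 0, NULL);
        if (!ply || !ply_read_header(ply)) return 1;

        /* Register callbacks; the return value is the element count. */
        long nvertices = ply_set_read_cb(ply, "vertex", "x", vertex_cb, NULL, 0);
        ply_set_read_cb(ply, "vertex", "y", vertex_cb, NULL, 0);
        ply_set_read_cb(ply, "vertex", "z", vertex_cb, NULL, 0);
        printf("%ld vertices\n", nvertices);

        if (!ply_read(ply)) return 1;  /* drives the callbacks */
        return ply_close(ply) ? 0 : 1; /* frees everything RPly allocated */
    }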
This library (Assimp, the Open Asset Import Library) looks promising; until someone else answers this question, I'll mark this as the answer: http://assimp.sourceforge.net/
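For what it's worth, Assimp's C API keeps ownership of the whole imported scene, so a single call releases everything. A sketch, assuming current include paths and a placeholder file name:

    #include <assimp/cimport.h>
    #include <assimp/postprocess.h>
    #include <assimp/scene.h>
    #include <stdio.h>

    int main(void) {
        const struct aiScene *scene =
            aiImportFile("bunny.ply", aiProcess_Triangulate);
        if (!scene) {
            fprintf(stderr, "%s\n", aiGetErrorString());
            return 1;
        }
        if (scene->mNumMeshes > 0)
            printf("meshes: %u, vertices in mesh 0: %u\n",
                   scene->mNumMeshes, scene->mMeshes[0]->mNumVertices);

        aiReleaseImport(scene);  /* frees the scene and everything in it */
        return 0;
    }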
Another option is VCGLib, the library MeshLab is built on: http://vcg.sourceforge.net/index.php/Tutorial
I understand that this question verges into implementation-specific territory, but at this point Rakudo/MoarVM-specific answers would help me too.
I am working on some NativeCall modules and wondering how to debug memory leaks. Some memory is handled in the C library, and I have a good handle on that side; I know that domain is my responsibility and there is nothing MoarVM can do over there. What can I do in the MoarVM domain? What is the best way to check for dangling objects, circular references, etc.?
Is there a way, at the end of a series of operations where I think all of my Perl 6 objects are out of scope, to say "Run Garbage Collection and tell me about anything left"?
Is there some Rakudo/NQP/MoarVM specific code I can run to help me? This isn't to release in production, just for testing/diagnostics while I am developing.
Garbage Collection in MoarVM gives a tantalizing overview, but not enough information for me to do anything with it.
Firstly, while leaked memory on the C-side isn't your problem in this case, it's worth knowing that Rakudo installs a perl6-valgrind-m that runs the program under valgrind. I've used this a number of times to figure out segfaults and leaks when writing native library bindings.
For looking into objects managed by MoarVM, it's possible to get the VM to dump heap snapshots. They are taken after each GC run, and an extra GC run is forced and a final snapshot taken at the end of the program. To record snapshots, run with --profile=heap. The output file can then be fed to moar-ha, which can be installed using zef install App::MoarVM::HeapAnalyzer (it's implemented in Perl 6, which may be worth knowing should you wish to extend it in some way to help you solve your problems).
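End to end, the recording-and-analysis steps above amount to this (the snapshot file name is a placeholder; use whatever your run actually writes):

    # Record heap snapshots: one after each GC run, plus a forced final one
    perl6 --profile=heap myscript.p6

    # Install the analyzer and feed it the output file
    zef install App::MoarVM::HeapAnalyzer
    moar-ha heap-snapshot.mvmheap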
If you have any idea of what kind of objects might be leaking, then it can be useful to search for objects of that type with the find command. There is then a path command that shows how that object is being kept alive. It can also be useful to look at counts of objects between different heap snapshots, to see what is growing in use. Unfortunately there's not yet a snapshot diff feature.
One thing to note is that the snapshots include everything that runs atop the VM. That means the Perl 6 compiler will be in memory, as well as a bunch of objects for things from the language built-ins. (The tool was developed to help track down managed leaks in the compiler and built-ins, so this is considered a feature. :-) Some kind of filtering may be feasible in the future, however.)
Finally, you mentioned circular references. These are not a problem in Perl 6, since GC is done through tracing, not reference counting.
I am working on an obfuscated binary as part of a crackme challenge. It contains sequences of push, pop and nop instructions (repeated thousands of times). Functionally, these chunks have no effect on the program, but they make generating CFGs, and the reversing process in general, very hard.
There are known solutions for patching such instructions to nops, but in my case I would like to strip the instructions out completely, so that I get a better view of the CFG. If instructions are stripped out, I understand that the memory offsets must be adjusted too. As far as I can tell, no tools are available to do this directly.
I am using the IDA Pro evaluation version, but I am open to solutions using other reverse engineering frameworks too; scriptable ones are preferable.
I went through a similar question, but the proposed solution is not applicable in my case.
I would like to completely strip off those instructions ... I understand that the memory offsets must be modified too ...
In general, this is practically impossible:
If the binary exports any dynamic symbols, you would have to update the .dynsym (these are probably the offsets you are thinking of).
You would have to find every statically-assigned function pointer, and update it with the new address, but there is no effective way to find such pointers.
Computed GOTOs and switch statements create tables of code pointers even when none are present in the program source (see the sketch after this list).
As Peter Cordes pointed out, it's possible to write programs that use the delta between two assembly labels (a small immediate value encoded directly into an instruction) to control program flow.
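As an illustration of the computed-GOTO point, here is a sketch using the GCC/Clang labels-as-values extension; the table entries are code addresses stored as data, exactly the kind of pointer an instruction-stripping tool would silently invalidate:

    #include <stdio.h>

    /* Dispatch through a table of label addresses. Removing or moving any
       instruction before the labels changes these addresses, and nothing
       in the binary marks the table as "code pointers, please relocate". */
    static int run(int op, int a, int b) {
        static void *table[] = { &&op_add, &&op_sub };
        goto *table[op];
    op_add: return a + b;
    op_sub: return a - b;
    }

    int main(void) {
        printf("%d %d\n", run(0, 5, 3), run(1, 5, 3));  /* prints: 8 2 */
        return 0;
    }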
It's possible that your target program is free from all of the above complications, but spending much effort on a technique that only works for that one program seems wasteful.
Which is more efficient: Node.js buffers or typed arrays? Which should I use for better performance?
I think that only those who know the internals of V8 and Node.js could answer this question.
A Node.js Buffer should be more efficient to allocate than a typed array. The reason is simply that a new Buffer does not need to be initialized to all 0's, whereas the typed-array specification requires that a new typed array have all of its values set to 0. Allocating the memory and then zeroing all of it takes more time.
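A rough micro-benchmark sketch of exactly that difference (Buffer.allocUnsafe() is the modern uninitialized-allocation API; numbers will vary by Node version and machine):

    const N = 1 << 20;   // 1 MiB per allocation
    const ITER = 1000;

    console.time('Buffer.allocUnsafe (no zero-fill)');
    for (let i = 0; i < ITER; i++) Buffer.allocUnsafe(N);
    console.timeEnd('Buffer.allocUnsafe (no zero-fill)');

    console.time('new Uint8Array (zero-filled per spec)');
    for (let i = 0; i < ITER; i++) new Uint8Array(N);
    console.timeEnd('new Uint8Array (zero-filled per spec)');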
In most applications picking either one won't matter. As always, the devil lies in the benchmarks :) However, I recommend that you pick one and stick with it. If you're often converting back and forth between the two, you'll take a performance hit.
Nice discussion here: https://github.com/joyent/node/issues/4884
There are a few things that I think are worth mentioning:
Buffer instances are Uint8Array instances, but there are subtle incompatibilities with the TypedArray specification in ECMAScript 2015. For example, while ArrayBuffer#slice() creates a copy of the slice, the implementation of Buffer#slice() creates a view over the existing Buffer without copying, making Buffer#slice() far more efficient (demonstrated in the sketch after this list).
When using Buffer.allocUnsafe() and Buffer.allocUnsafeSlow(), the memory isn't zeroed out (as many have pointed out already). So make sure you completely overwrite the allocated memory, or old data can leak out when the Buffer's contents are read (also shown in the sketch after this list).
A raw ArrayBuffer is not readable right away; you need a view over it, either a TypedArray or a DataView. This means you might need to rewrite some code if you were to migrate back to Buffer; the adapter pattern could help here.
You can use for-of on a Buffer and on TypedArrays alike, and both give you the classic entries(), values(), keys() and length support; a DataView offers none of these.
Buffer is not supported in frontend JavaScript (browsers), while TypedArrays are standard and available everywhere. So if your code is shared between frontend and backend, you might consider sticking to TypedArrays.
More info in the Node.js Buffer documentation.
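A quick sketch demonstrating the slice and allocUnsafe() points above; it is safe to paste into a Node REPL:

    // Buffer#slice() returns a view that shares memory with the original:
    const buf = Buffer.from([1, 2, 3, 4]);
    const view = buf.slice(0, 2);
    view[0] = 99;
    console.log(buf[0]);       // 99: the original changed too

    // TypedArray#slice() (like ArrayBuffer#slice()) copies instead:
    const arr = new Uint8Array([1, 2, 3, 4]);
    const copy = arr.slice(0, 2);
    copy[0] = 99;
    console.log(arr[0]);       // 1: the original is untouched

    // Buffer.allocUnsafe() hands back whatever bytes were in that memory:
    const unsafe = Buffer.allocUnsafe(8);
    console.log(unsafe);       // unpredictable contents; overwrite before use
    unsafe.fill(0);            // now safe to expose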
This is a tough one, but I think it will depend on what you are planning to do with them and how much data you plan to work with.
Typed arrays themselves need Node buffers underneath, but they are easier to play with, and you can get past the 1 GB limit (kMaxLength = 0x3fffffff).
If you are doing common operations such as iterating, setting, getting, slicing, etc., then typed arrays should be your best shot for performance, though not for memory (especially if you are dealing with float and 64-bit integer types).
In the end, probably only a good benchmark of what you actually want to do can shed real light on this.
The goal is to emulate multi-threaded behavior for a .so which is not thread-safe. Memory is plentiful, not a problem. What is important for me is down-calls via JNI. What is not important is up-calls and sharing anything between .so instances (the goal is complete isolation).
I have heard that it is possible to link a shared lib more than once, but I have not seen anyone actually do it.
There is an opinion that doing this is a bad idea, but I am not convinced by the argument.
Is this a good/bad idea, and why?
If this turns out to be a good idea under certain conditions, where can I read more about it? Can anyone share some code that does this?
Let me add that making the .so thread-safe is not really an option; serializing all calls behind a mutex is the current implementation, which I am trying to improve on.
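For concreteness, the kind of repeated loading I have heard about would, as I understand it, look something like this on glibc, using dlmopen() to give each copy its own link-map namespace. This is only a sketch: libfoo.so and foo_work are placeholder names, and glibc caps the number of namespaces.

    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void) {
        /* Each LM_ID_NEWLM load gets a private copy of the library,
           including its global state: complete isolation. */
        void *h1 = dlmopen(LM_ID_NEWLM, "./libfoo.so", RTLD_NOW | RTLD_LOCAL);
        void *h2 = dlmopen(LM_ID_NEWLM, "./libfoo.so", RTLD_NOW | RTLD_LOCAL);
        if (!h1 || !h2) { fprintf(stderr, "%s\n", dlerror()); return 1; }

        void (*work1)(void) = (void (*)(void))dlsym(h1, "foo_work");
        void (*work2)(void) = (void (*)(void))dlsym(h2, "foo_work");
        if (work1) work1();  /* mutates one copy's state */
        if (work2) work2();  /* the other copy is unaffected */

        dlclose(h2);
        dlclose(h1);
        return 0;
    }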
The idea of a shared library is just to share a common code segment among multiple applications.
Once you realize this basic fact, you'll see that what you're trying to do doesn't make sense, because the memory the library allocates will be within your process's address space either way.
I am in the process of tackling the Linux kernel learning curve and trying to get my head around the information stored in nested structs, specifically to resolve an ALSA driver issue.
Hence, I am spending a lot of my time in the source code, tracing through structures that have pointers to other structures that in turn have pointers to yet other structures... by which time my head has become so full that I start to lose track of the big picture!
Can anybody point me at either a tool or a website (along the lines of the highly useful Linux Cross Reference, http://lxr.linux.no/) that will allow me to, ideally graphically, expand down through the nested structs of the source code?
At the moment we are developing for an embedded PowerPC in Eclipse CDT version 4.0, but we wouldn't be opposed to switching tool chains.
This may sound old-fashioned, but I've found that tracing through data structures with pencil and paper helps you reverse engineer code better than tools that do it automagically. So my recommendation is to draw them yourself, so that you don't have to keep it all in your head. Once you've done this, your learning curve becomes a lot less steep.
Just a copy/paste of my comment, so that this question has at least 1 answer.
Alternatively, you could use something like Doxygen to generate the diagrams for you. It's worth noting that a lot of the DocBook books get their structure diagrams directly from annotated code.
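If you try the Doxygen route, a handful of Doxyfile options are enough to get struct collaboration graphs even out of uncommented kernel code. A sketch (the INPUT path is a placeholder, and HAVE_DOT requires Graphviz to be installed):

    # Document structs even when they carry no doc comments
    EXTRACT_ALL         = YES
    # Placeholder: point at just the subsystem you care about
    INPUT               = sound/
    RECURSIVE           = YES
    # Use Graphviz to draw the graphs
    HAVE_DOT            = YES
    # Draw struct-A-contains-pointer-to-struct-B edges
    COLLABORATION_GRAPH = YES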
I am currently using KDevelop 4 (the svn version) to walk through the Linux kernel. The navigation capabilities are great, but it takes a good while to parse the tree (give it only the directories you need, omitting, for example, all the drivers you are not interested in), and it is still a little crashy.
Once the stability improves and the parser can cache previously parsed data, I think this will become the most convenient way to walk through the kernel.