How to add security to Spring Boot jar file? [duplicate]

How can I package my Java application into an executable jar that cannot be decompiled (for example, by Jadclipse)?

You can't. If the JRE can run it, someone can decompile it.
The best you can hope for is to make it very hard to read (replace all symbols with combinations of 'l' and '1' and 'O' and '0', put in lots of useless code and so on). You'd be surprised how unreadable you can make code, even with a relatively dumb translation tool.
This is called obfuscation and, while not perfect, it's sometimes adequate.
Remember, you can't stop the determined hacker any more than the determined burglar. What you're trying to do is make things very hard for the casual attacker. When presented with the symbols O001l1ll10O, O001llll10O, OO01l1ll10O, O0Ol11ll10O and O001l1ll1OO, and code that doesn't seem to do anything useful, most people will just give up.

First, you can't avoid people reverse engineering your code. The JVM bytecode has to be readable by the JVM to be executed, and there are several programs to reverse engineer it (the same applies to the .NET CLR). You can only make it more and more difficult, i.e. raise the barrier (the cost) of seeing and understanding your code.
The usual way is to obfuscate the source with some tool. Classes, methods and fields are renamed throughout the codebase, even with invalid identifiers if you choose to, making the code next to impossible to comprehend. I had good results with JODE in the past. After obfuscating, run a decompiler to see what your code looks like...
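To make the renaming concrete, here is a hypothetical before/after sketch (the names are invented for illustration; this is roughly the kind of output a renaming obfuscator produces):

    // Before obfuscation: the intent is obvious from the names.
    class LicenseChecker {
        boolean isLicenseValid(String licenseKey) {
            return licenseKey != null && licenseKey.startsWith("LIC-");
        }
    }

    // After obfuscation (hypothetical output): same behavior, but every
    // name not part of the public API has been renamed to something opaque.
    class a {
        boolean a(String a) {
            return a != null && a.startsWith("LIC-");
        }
    }

A decompiler will still recover the structure, but the names that carried meaning are gone for good; they are not stored anywhere in the obfuscated jar.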
Beyond obfuscation, you can encrypt your class files (all but a small starter class) with some method and use a custom class loader to decrypt them. Unfortunately, the class loader itself can't be encrypted, so people might figure out the decryption algorithm by reading the decompiled code of your class loader. But the window for attacking your code gets smaller. Again, this does not prevent people from seeing your code; it just makes it harder for the casual attacker.
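Here is a minimal sketch of that class loader idea, assuming the class files were XOR-scrambled at build time; the XOR step, the key, the ".class.enc" naming and the class names are all placeholders, not a real scheme:

    import java.io.IOException;
    import java.io.InputStream;

    // Loads classes whose bytes were XOR-scrambled at build time.
    // Note the weakness described above: this loader itself ships as
    // plain bytecode, so the "decryption" can always be reverse engineered.
    public class DecryptingClassLoader extends ClassLoader {
        private static final byte KEY = 0x5A; // placeholder key

        public DecryptingClassLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class.enc";
            try (InputStream in = getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] bytes = in.readAllBytes();
                for (int i = 0; i < bytes.length; i++) {
                    bytes[i] ^= KEY; // undo the build-time scrambling
                }
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

The small starter class would then load the real entry point with new DecryptingClassLoader(...).loadClass("com.example.Main") and invoke it via reflection.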
You could also try to convert the Java application to a Windows EXE, which would hide the clue that it's Java at all (to some degree), or really compile it into machine code, depending on your need for JVM features. (I did not try this.)

GCJ is a free tool that can compile to either bytecode or native code. Keep in mind that this does sort of defeat the purpose of Java. (Note that GCJ has since been removed from GCC.)

A little late I know, but the answer is no.
Even if you write in C and compile to native code, there are disassemblers / debuggers which will allow people to step through your code. Granted, debugging optimized code without symbolic information is a pain, but it can be done; I've had to do it on occasion.
There are steps that you can take to make this harder. For example, on Windows you can call the IsDebuggerPresent API in a loop to see if somebody is debugging your process, and if so (and it is a release build) terminate the process. Of course, a sufficiently determined attacker could intercept your call to IsDebuggerPresent and make it always return false.
A whole variety of techniques has cropped up between people who want to protect something and people who are out to crack it wide open; it is a veritable arms race! Once you go down this path, you will have to constantly keep updating and upgrading your defenses; there is no stopping.

This is not my own practical solution, but here, I think, is a good collection of resources and tutorials for making it happen to the highest level of satisfaction.
A suggestion from this website (Oracle community):
1. (Clean way) Obfuscate your code. There are many open-source and free obfuscator tools; here is a simple list of them: [Open source obfuscators list]. These tools make your code unreadable (though you can still decompile it) by changing names. This is the most common way to protect your code.
2. (Not so clean way) If you have a specific target platform (like Windows), or you can have different versions for different platforms, you can write a sophisticated part of your algorithms in a low-level language like C (which is very hard to decompile and understand) and use it as a native library in your Java application. It is not clean, because many of us use Java for its cross-platform abilities, and this method fades that ability.
And here is a step-by-step guide to follow:
ProtectYourJavaCode
Enjoy! Keep adding your solutions; we need more of them.

Related

How to Decompile Bytenode "jsc" files?

I've just seen the library ByteNode; it's analogous to Java bytecode, but for Node.js.
This library compiles your JavaScript code into V8 bytecode, which protects your source code. I'm wondering: is there any way to decompile ByteNode output? If so, it's not secure enough. I ask because I would like to protect my source code using this library.
TL;DR It'll raise the bar to someone copying the code and trying to pass it off as their own. It won't prevent a dedicated person from doing so. But the primary way to protect your work isn't technical, it's legal.
This library compiles your JavaScript code into V8 bytecode, which protects your source code...
Well, we don't know it's V8 bytecode, but it's "compiled" in some sense. All we know is that it creates a "code cache" via the built-in vm.Script.prototype.createCachedData API, which is officially just a cache used to speed up recompiling the code a second time, third time, etc. In theory, you're supposed to also provide the original source code as a string to the vm.Script constructor. But if you go digging into Node.js's vm.Script and V8 far enough it seems to be the actual code in some compiled form (whether actual V8 bytecode or not), and the code string you give it when running is ignored. (The ByteNode library provides a dummy string when running the code from the code cache, so clearly the actual code isn't [always?] needed.)
Is there any way to decompile ByteNode output? If so, it's not secure enough.
Naturally, otherwise it would be useless because Node.js wouldn't be able to run it. I didn't find a tool to do it that already exists, but since V8 is open source, it would presumably be possible to find the necessary information to write a decompiler for it that outputs valid JavaScript source code which someone could then try to understand.
Experimenting with it, local variable names appear to be lost, although function names don't. Comments appear to get lost (this may not be as obvious as it seems, given that Function.prototype.toString is required to either return the original source text or a synthetic version [details]).
So if you run the code through a minifier (particularly one that renames functions), then run it through ByteNode (or just do it with vm.Script yourself; ByteNode is a fairly thin wrapper), it will be feasible for someone to decompile it into something resembling source code, but that source code will be very hard to understand. This is very similar to shipping Java class files, which can be decompiled (there's even a standard tool to do it in the JDK, javap), except that the format of Java class files is well-documented and doesn't change from one dot release to the next (it can change from one major release to another, though new releases always support the older format), whereas the format of this data is not documented (though it's an open source project) and is subject to change from one dot release to the next.
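To see how easy the Java side of that comparison is, try javap on any compiled class ("Hello" is a placeholder name):

    javac Hello.java
    javap -c Hello      # disassemble the bytecode
    javap -p Hello      # list all members, including private ones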
Certain changes, such as changing the copyright message, are probably fairly easy to make to said source code. More meaningful changes will be harder.
Note that the code cache appears to have a checksum or other similar integrity mechanism, since directly editing the .jsc file to swap one letter for another in a literal string makes the code cache fail to load. So someone tampering with it (for instance, to change a copyright notice) would either need to go the decompilation/recompilation route, or dive into the V8 source to find out how to correct the integrity check.
Fundamentally, the way to protect your work is to ensure that you've put all the relevant notices in the relevant places, so that it is clear that copying it is a copyright violation, and then pursue your legal recourse should you find out about someone passing it off as their own.
is there any way
You could get a hundred answers here saying "I don't know a way", but that still won't guarantee that there isn't one.
not secure enough
Secure enough for what? What's your deployment scenario? What kind of scenario/attack are you trying to defend against?
FWIW, I don't know of an existing tool that "decompiles" V8 bytecode (i.e. produces JavaScript source code with the same behavior). That said, considering that the bytecode is a fairly straightforward translation of the source code, I'm sure it wouldn't be very hard to write such a tool, if someone had a reason to spend some time on it. After all, V8's JS-to-bytecode compiler is open source, so one would only have to look at those sources and implement the reverse direction. So I would assume that shipping as bytecode provides about as much "protection" as shipping as uglified JavaScript, i.e. none that I would trust.
Before you make any decisions, please also keep in mind that bytecode is considered an internal implementation detail of V8; in particular it is not versioned and can change at any time, so it has to be created by exactly the same V8 version that consumes it. If you want to update your Node.js you'll have to recreate all the bytecode, and there is no checking or warning in place that will point out when you forgot to do that.
The Node.js source already contains code for disassembling binary bytecode. You can get a text dump of your V8 bytecode, which you would then need to analyze. But that text dump is very long and misses some important information, such as the constant pool, so you need to modify the Node.js source.
Please check https://github.com/3DGISKing/pkg10.17.0; I have attached an exported XML file there.
If you study V8, it should be possible to analyze the dump and recover source code from it.
Keeping it short and sweet: you can try the Ghidra Node.js package, which is based on the Ghidra reverse-engineering framework open-sourced by the NSA in 2019. Ghidra is capable of disassembling and decompiling V8 bytecode. The inner workings of the disassembly are quite complex; this answer is short but sufficient.

Securely running user's code

I am looking to create an AI environment where users can submit their own code for the AI and let them compete. The language could be anything, but something easy to learn like JavaScript or Python is preferred.
Basically I see three options with a couple of variants:
1. Make my own language, e.g. a JavaScript clone with only very basic features like variables, loops, conditionals, arrays, etc. This is a lot of work if I want to properly implement common language features.
1.1. Take an existing language and strip it down to its core. Just remove lots of features from, say, Python until there is nothing left but the above (variables, conditionals, etc.). Still a lot of work, especially if I want to keep up to date with upstream (though I could also just ignore upstream).
2. Use a language's built-in features to lock it down. I know from PHP that you can disable functions, and searching around, similar solutions seem to exist for Python (with lots and lots of caveats). For this I'd need to have a good understanding of all the language's features and not miss anything.
2.1. Make a preprocessor that rejects code with dangerous stuff (preferably whitelist-based). Similar to option 1, except that I only have to implement the parser, not all the features: the preprocessor has to understand the language well enough that you can have a variable named "eval" but cannot call the function named "eval". Still a lot of work, but more manageable than option 1.
2.2. Run the code in a very locked-down environment. Chroot, no unnecessary permissions... perhaps in a virtual machine or container (see the sketch after this list). Something in that sense. I'd have to research how to achieve this and how to make it give me the results in a secure way, but that seems doable.
3. Manually read through all code. Doable on a small scale or with moderators, though still tedious and error-prone (I might miss stuff like if (user.id = 0)).
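As a rough sketch, 2.2 could look like this with a container; the image name and limits below are placeholders, not a vetted configuration, and a determined attacker may still find kernel or configuration holes:

    # Run one AI submission with no network access and tight resource limits.
    # "ai-sandbox" and the paths are placeholders for your own image/layout.
    docker run --rm \
      --network none \
      --memory 128m \
      --cpus 0.5 \
      --pids-limit 64 \
      --read-only \
      ai-sandbox python /ai/submission.py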
The way I imagine 2.2 working is like this: run both AIs in a virtual machine (or something similar) and constrain them to communicate with the host machine only (no other Internet or LAN access). Each AI runs in a separate machine, and they communicate with each other (well, with the playing field, and thereby they see each other's positions) through an API running on the host.
Option 2.2 seems the most doable, but also relatively hacky... I let someone's code loose in a virtualized or locked-down environment, hoping that it will keep them in while giving them free rein to try to DoS or break out of the environment. Then again, most other options are not much better.
TL;DR: in essence my question is: how do I let people give me 'logic' for an AI (which I think is most easily done using code) and then run that without compromising the functionality of the system? There must be at least 2 AIs working on the same playing field.
This is really just a plugin system, so researching how others implement plugins is a good starting point. In particular, I'd look at web browsers like Chrome and Safari and their plugin systems.
A common theme in modern plugins systems is process isolation. Ideally you should run the plugin in its own process space in a sandbox. In OS X look at XPC, which is designed explicitly for this problem. On Linux (or more portably), I would probably look at NaCl (Native Client). The JVM is also designed to provide sandboxing, and offers a rich selection of languages. (That said, I don't personally consider the JVM a very strong sandbox. It's had a history of security problems.)
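For reference, the JVM sandboxing mentioned here was built around a SecurityManager plus a policy file; that API has been deprecated for removal since Java 17, so treat this purely as an illustration, and the paths and permissions are hypothetical:

    // plugin.policy: grant the plugin code only what it needs
    grant codeBase "file:/opt/plugins/ai-plugin.jar" {
        permission java.net.SocketPermission "localhost:8080", "connect";
    };

The host would then be started with java -Djava.security.manager -Djava.security.policy=plugin.policy ... to enforce it.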
In general, my preference on these kinds of projects is a language-agnostic API. I most often use REST APIs (or "REST-like"). This allows the plugin to be highly restricted, while not restricting the language choice. I like simple HTTP for communications whenever possible because it has rich support in numerous languages, so it puts little restriction on the plugin. In fact, given your description, you wouldn't even have to run the plugin on your hardware (and certainly not on the main server). Making the plugins remote clients removes many potential concerns.
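Here is a minimal sketch of that language-agnostic approach using the JDK's built-in HTTP server; the endpoint names and the JSON shape are invented for illustration:

    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    // The host exposes the playing field over HTTP; the plugins (the AIs)
    // can then be written in any language that can make HTTP requests.
    public class GameHost {
        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);

            // GET /state: the public view of the playing field
            server.createContext("/state", exchange -> {
                byte[] body = "{\"positions\":[[0,0],[5,3]]}"
                        .getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });

            // POST /move: a plugin submits its next move (validation omitted)
            server.createContext("/move", exchange -> {
                String move = new String(
                        exchange.getRequestBody().readAllBytes(),
                        StandardCharsets.UTF_8);
                System.out.println("received move: " + move);
                exchange.sendResponseHeaders(204, -1); // no response body
            });

            server.start();
        }
    }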
But ultimately, I think something like your "2.2" is the right direction.

Protecting shared library

Is there any way to protect a shared library (.so file) against reverse engineering?
Is there any free tool to do that?
The obvious first step is to strip the library of all symbols, except the ones required for the published API you provide.
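For example, with GCC on Linux (library and file names are placeholders); note that -fvisibility=hidden requires marking your public API functions with __attribute__((visibility("default"))):

    # Hide everything that is not explicitly marked as exported.
    gcc -shared -fPIC -fvisibility=hidden -o libfoo.so foo.c
    # Remove remaining unneeded symbols and debug information.
    strip --strip-unneeded libfoo.so
    # Verify what is still visible to consumers of the library.
    nm -D libfoo.so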
The "standard" disassembly-prevention techniques (such as jumping into the middle of instruction, or encrypting some parts of code and decrypting them on-demand) apply to shared libraries.
Other techniques, e.g. detecting that you are running under debugger and failing, do not really apply (unless you want to drive your end-users completely insane).
Assuming you want your end-users to be able to debug the applications they are developing using your library, obfuscation is a mostly lost cause. Your efforts are really much better spent providing features and support.
Reverse engineering protection comes in many forms, here are just a few:
Detecting reversing environments, such as being run in a debugger or a virtual machine, and aborting. This prevents an analyst from figuring out what is going on. Usually used by malware. A common trick is to run undocumented instructions that behave differently in VMware than on a real CPU.
Formatting the binary so that it is malformed, e.g. missing ELF sections. You're trying to prevent normal analysis tools from being able to open the file. On Linux, this means doing something that libbfd doesn't understand (but other libraries like capstone may still work).
Randomizing the binary's symbols and code blocks so that they don't look like what a compiler would produce. This makes decompiling (guessing at the original source code) more difficult. Some commercial programs (like games) are deployed with this kind of protection.
Polymorphic code that changes itself on the fly (e.g. decompresses itself into memory when loaded). The most sophisticated ones are designed for use by malware and are called packers (avoid these unless you want to get flagged by anti-malware tools). There are also open-source ones like UPX (http://upx.sourceforge.net/), which provides a tool to undo the UPX'ing.
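For instance, packing and unpacking with UPX is a one-liner in each direction ("myapp" is a placeholder):

    upx -9 myapp    # pack the binary in place, maximum compression
    upx -d myapp    # decompress, undoing the packing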

Code obfuscation usage in various languages

I recently learned about code obfuscation. It's a nice thing to do when you have spare time, but I have a different question: why do it?
First, there are languages in which I am sure it's a great thing: interpreted ones, like PHP, JavaScript and many more. There it seems like a good and more secure measure.
Second, there are languages where this seems to have no real effect to me: all the natively compiled languages. Take C, for example: when compiled, all the variable names and function names, the targets of most obfuscation techniques, go away. If anything survives into native code, it would be things like recursion instead of for loops, and the disassembled code will in any case have disassembler-generated identifiers instead of names, right?
And the last category contains languages I am not quite sure about, which is the main reason I ask. These languages are Java, C# (.NET), and lastly Silverlight as used in WP7. I ask because I read an article stating that for WP7 apps, code obfuscation helps prevent the code from being hacked. But I always thought of bytecode as being very similar to standard assembler code, and therefore as not containing any information about the real pre-compilation variable names, function names, etc. So where is the truth?
Do it if you want, but don't expect any determined person to be scared away by it. De-obfuscators exist, and people can read obfuscated code as well (just as there are people who can read optimized assembly and reconstruct the original C code). Obfuscation might deter someone who is merely curious, but not those who are serious about stealing your code. All it gives you is a false sense of security, not a real one. Schneier aptly names this "security theater".
Yes, many modern languages that retain more information about the source can be obfuscated better than those that are compiled straight to machine code. For the latter, the compiler already does quite a good job with optimization. Your notion of bytecode being akin to traditional assembler is slightly wrong here, though. .NET bytecode in particular retains enough metadata to reconstruct the original source almost exactly (see Reflector). What isn't retained are the names of local variables and method arguments. But you still need, and therefore retain, the method and class names.
Another issue you should be aware of: If you give your customers an obfuscated executable and your program crashes, make sure you have a way of getting the real stacktrace back instead of the obfuscated one. Saying "Sorry, I cannot determine the root cause of why my program killed hours of your work since I chose to obfuscate it" isn't going to cut it, I guess :-)
Obfuscation is a common technique for mobile applications where you have hardware restrictions. Obfuscated code tends to have shorter identifiers and therefore smaller binaries.

What's the best language to use when creating a Windows shell context menu?

I'm writing an app which integrates with the Windows shell and adds an additional context menu.
I am considering a couple of languages to write it in:
MS .NET - I'd rather not use managed code for this type of app
win32asm - This is my first choice
VC++/C++ - Not sure
So basically it's a toss-up between assembly and C++. Does anyone have any thoughts or considerations that might make my choice easier?
You want shell context applications to have small footprints. This rules out managed code at least for now. This may speak somewhat in favour of win32asm, although the C++ libraries aren't really all that large compared to the .NET runtime (less than a MB, all told, isn't that big these days)!
You want shell context applications to be stable, since otherwise people will kick them out to save their explorer.exe processes. This speaks heavily against win32asm. If you know only you will ever maintain the app, and you have great assembler skills, win32asm may work, though I myself wouldn't go that way. You still have to implement COM interfaces, which is a big enough headache without adding the complexities of assembly coding.
I'd go for VC++ with ATL support without further thought, but with serious unit testing and safeguards against resource leakage. If you aren't comfortable with C++ and templates, though, this may present a rocky road for you. On the plus side, you'll have a much smaller set of source code to maintain and a much easier time finding others to help or take over. You may also improve a still-relevant, valuable skill set.
