Spin - Formal Verification

Spin - Formal Verification - model-checking

Has anyone have contact with model checking with this tool SPIN, even more any information of model checking (concurrent programs)

Yes, SPIN is a very good model checker but I am wondering what you want? Do you just want to hear that yes, I have heard and played with SPIN or do you want suggestions for how to model check source code?
For example, if you're a C programmer, get your hands in ESBMC, write a little program and run ESBMC on it.
That should get you started to understanding what can be done and how to do it. By the way, for starters Model Checking is not static analysis. It is actually much more powerful. It's the anti-static analysis. Model Checking actually 'in a (very narrow) sense' simulates your program and find situations (argument combinations, exceptional situations, border cases) where it would actually fail.

Related

Comparing a concrete execution trace with an Alloy Model

I'm using Alloy to model a system. I would like to check the implemented system matches the Alloy model by comparing log traces from a concrete execution of the actual system with the model.
The way I see this working is:
Add logs to the implemented system at points that correspond to the high level concepts modelled in Alloy, such as "Receptionist checks in guest G1"
Pre-process these into a form understood by Alloy
Give this to Alloy (or some other tool) and say 'Does this model admit this trace?' (this question)
This would be run over the operational logs of the system (or maybe subsets if performance is a problem) and continuously validate that the system was operating 'to spec'.
Is that possible / reasonable?

Possible yes.
Reasonable I'm not quite sure.
To me, Alloy shine at finding unknown unknowns, i.e. pitfalls, in your specifications.
Once the specification is fool-proofed using Alloy analysis, I don't see the point of encumbering your program with unnecessary translations and analysis steps. It's not only error prone, but you might also find yourself limited with the scalability of the analyzer if the traces you want to validate are substantial...
But again, it's doable. So if you want it, sure, do it ... :-)

I'm using Alloy to model a system. I would like to check the implemented system matches the Alloy model by comparing log traces from a concrete execution of the actual system with the model.
Yes, I think that is a bit of work but it should be doable. I would be very interested in getting this to work. I have been thinking about this for a long time.
Loïc argues correctly that Alloy shines in finding solutions but to keep this manageable, Alloy must keep the scope small. Although this is true, Alloy is also a specification language. The timing issue is only in finding a solution. However, the problem you sketch is different, you already have the solution in the log. Each event specifies a transition in the state.
If you're familiar with the Alloy Evaluator then you should be aware that once you have a solution, you can run any Alloy code on that instance. Inside Alloy, there is a full set of classes to simulate an instance and run Alloy code against it.
So I think you can start with an initial instance and use your log event to create a secondary instance and then use Alloy to verify this is a valid transition. This will be very fast and I do not see why this could not handle a very large number of objects. Surely thousands and with a bit of caching wizardry millions.
We are currently working hard on Alloy 6, which will be integrate Electrum, where we will have full temporal logic that will make the rules easier to express.
I've been looking for a customer for a long time that would like to develop the necessary code to bridge Alloy & the trenches. If this works as I think it can work, it would be very interesting for the software industry.

How to Protect an Exe File from Decompilation

What are the methods for protecting an Exe file from Reverse Engineering.Many Packers are available to pack an exe file.Such an approach is mentioned in http://c-madeeasy.blogspot.com/2011/07/protecting-your-c-programexe-files-from.html
Is this method efficient?

The only good way to prevent a program from being reverse-engineered ("understood") is to revise its structure to essentially force the opponent into understanding Turing Machines. Essentially what you do is:
take some problem which generally proven to be computationally difficult
synthesize a version of that whose outcome you know; this is generally pretty easy compared to solving a version
make the correct program execution dependent on the correct answer
make the program compute nonsense if the answer is not correct
Now an opponent staring at your code has to figure what the "correct" computation is, by solving algorithmically hard problems. There's tons of NP-hard problems that nobody has solved efficiently in the literature in 40 years; its a pretty good bet if your program depends on one of these, that J. Random Reverse-Engineer won't suddenly be able to solve them.
One generally does this by transforming the original program to obscure its control flow, and/or its dataflow. Some techniques scramble the control flow by converting some control flow into essentially data flow ("jump indirect through this pointer array"), and then implementing data flow algorithms that require precise points-to analysis, which is both provably hard and has proven difficult in practice.
Here's a paper that describes a variety of techniques rather shallowly but its an easy read:
http://www.cs.sjsu.edu/faculty/stamp/students/kundu_deepti.pdf
Here's another that focuses on how to ensure that the obfuscating transformations lead to results that are gauranteed to be computationally hard:
http://www.springerlink.com/content/41135jkqxv9l3xme/
Here's one that surveys a wide variety of control flow transformation methods,
including those that provide levels of gaurantees about security:
http://www.springerlink.com/content/g157gxr14m149l13/
This paper obfuscates control flows in binary programs with low overhead:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.167.3773&rank=2
Now, one could go through a lot of trouble to prevent a program from being decompiled. But if the decompiled one was impossible to understand, you simply might not bother; that's the approach I'd take.
If you insist on preventing decompilation, you can attack that by considering what decompilation is intended to accomplish. Decompilation essentially proposes that you can convert each byte of the target program into some piece of code. One way to make that fail, is to ensure that the application can apparently use each byte
as both computer instructions, and as data, even if if does not actually do so, and that the decision to do so is obfuscated by the above kinds of methods. One variation on this is to have lots of conditional branches in the code that are in fact unconditional (using control flow obfuscation methods); the other side of the branch falls into nonsense code that looks valid but branches to crazy places in the existing code. Another variant on this idea is to implement your program as an obfuscated interpreter, and implement the actual functionality as a set of interpreted data.
A fun way to make this fail is to generate code at run time and execute it on the fly; most conventional languages such as C have pretty much no way to represent this.
A program built like this would be difficult to decompile, let alone understand after the fact.
Tools that are claimed to a good job at protecting binary code are listed at:
https://security.stackexchange.com/questions/1069/any-comprehensive-solutions-for-binary-code-protection-and-anti-reverse-engineeri

Packing, compressing and any other methods of binary protection will only every serve to hinder or slow reversal of your code, they have never been and never will be 100% secure solutions (though the marketing of some would have you believe that). You basically need to evaluate what sort of level of hacker you are up against, if they are script kids, then any packer that require real effort and skill (ie:those that lack unpacking scripts/programs/tutorials) will deter them. If your facing people with skills and resources, then you can forget about keeping your code safe (as many of the comments say: if the OS can read it to execute it, so can you, it'll just take a while longer). If your concern is not so much your IP but rather the security of something your program does, then you might be better served in redesigning in a manner where it cannot be attack even with the original source (chrome takes this approach).

Decompilation is always possible. The statement
This threat can be eliminated to extend by packing/compressing the
executable(.exe).
on your linked site is a plain lie.

Currently many solutions can be used to protect your application from being anti-compiled. Such as compressing, Obfuscation, Code snippet, etc.
You can looking for a company to help you achieve this.
Such as Nelpeiron, the website is:https://www.nalpeiron.com/
Which can cover many platforms, Windows, Linux, ARM-Linux, Android.
What is more Virbox is also can be taken into consideration:
The website is: https://lm-global.virbox.com/index.html
I recommend is because they have more options to protect your source code, such as import table protection, memory check.

Haskell for mission-critical systems [duplicate]

I've been curious to understand if it is possible to apply the power of Haskell to embedded realtime world, and in googling have found the Atom package. I'd assume that in the complex case the code might have all the classical C bugs - crashes, memory corruptions, etc, which would then need to be traced to the original Haskell code that
caused them. So, this is the first part of the question: "If you had the experience with Atom, how did you deal with the task of debugging the low-level bugs in compiled C code and fixing them in Haskell original code ?"
I searched for some more examples for Atom, this blog post mentions the resulting C code 22KLOC (and obviously no code:), the included example is a toy. This and this references have a bit more practical code, but this is where this ends. And the reason I put "sizable" in the subject is, I'm most interested if you might share your experiences of working with the generated C code in the range of 300KLOC+.
As I am a Haskell newbie, obviously there may be other ways that I did not find due to my unknown unknowns, so any other pointers for self-education in this area would be greatly appreciated - and this is the second part of the question - "what would be some other practical methods (if) of doing real-time development in Haskell?". If the multicore is also in the picture, that's an extra plus :-)
(About usage of Haskell itself for this purpose: from what I read in this blog post, the garbage collection and laziness in Haskell makes it rather nondeterministic scheduling-wise, but maybe in two years something has changed. Real world Haskell programming question on SO was the closest that I could find to this topic)
Note: "real-time" above is would be closer to "hard realtime" - I'm curious if it is possible to ensure that the pause time when the main task is not executing is under 0.5ms.

At Galois we use Haskell for two things:
Soft real time (OS device layers, networking), where 1-5 ms response times are plausible. GHC generates fast code, and has plenty of support for tuning the garbage collector and scheduler to get the right timings.
for true real time systems EDSLs are used to generate code for other languages that provide stronger timing guarantees. E.g. Cryptol, Atom and Copilot.
So be careful to distinguish the EDSL (Copilot or Atom) from the host language (Haskell).
Some examples of critical systems, and in some cases, real-time systems, either written or generated from Haskell, produced by Galois.
EDSLs
Copilot: A Hard Real-Time Runtime Monitor -- a DSL for real-time avionics monitoring
Equivalence and Safety Checking in Cryptol -- a DSL for cryptographic components of critical systems
Systems
HaLVM -- a lightweight microkernel for embedded and mobile applications
TSE -- a cross-domain (security level) network appliance

It will be a long time before there is a Haskell system that fits in small memory and can guarantee sub-millisecond pause times. The community of Haskell implementors just doesn't seem to be interested in this kind of target.
There is healthy interest in using Haskell or something Haskell-like to compile down to something very efficient; for example, Bluespec compiles to hardware.
I don't think it will meet your needs, but if you're interested in functional programming and embedded systems you should learn about Erlang.

Andrew,
Yes, it can be tricky to debug problems through the generated code back to the original source. One thing Atom provides is a means to probe internal expressions, then leaves if up to the user how to handle these probes. For vehicle testing, we build a transmitter (in Atom) and stream the probes out over a CAN bus. We can then capture this data, formated it, then view it with tools like GTKWave, either in post-processing or realtime. For software simulation, probes are handled differently. Instead of getting probe data from a CAN protocol, hooks are made to the C code to lift the probe values directly. The probe values are then used in the unit testing framework (distributed with Atom) to determine if a test passes or fails and to calculate simulation coverage.

I don't think Haskell, or other Garbage Collected languages are very well-suited to hard-realtime systems, as GC's tend to amortize their runtimes into short pauses.
Writing in Atom is not exactly programming in Haskell, as Haskell here can be seen as purely a preprocessor for the actual program you are writing.
I think Haskell is an awesome preprocessor, and using DSEL's like Atom is probably a great way to create sizable hard-realtime systems, but I don't know if Atom fits the bill or not. If it doesn't, I'm pretty sure it is possible (and I encourage anyone who does!) to implement a DSEL that does.
Having a very strong pre-processor like Haskell for a low-level language opens up a huge window of opportunity to implement abstractions through code-generation that are much more clumsy when implemented as C code text generators.

I've been fooling around with Atom. It is pretty cool, but I think it is best for small systems. Yes it runs in trucks and buses and implements real-world, critical applications, but that doesn't mean those applications are necessarily large or complex. It really is for hard-real-time apps and goes to great lengths to make every operation take the exact same amount of time. For example, instead of an if/else statement that conditionally executes one of two code branches that might differ in running time, it has a "mux" statement that always executes both branches before conditionally selecting one of the two computed values (so the total execution time is the same whichever value is selected). It doesn't have any significant type system other than built-in types (comparable to C's) that are enforced through GADT values passed through the Atom monad. The author is working on a static verification tool that analyzes the output C code, which is pretty cool (it uses an SMT solver), but I think Atom would benefit from more source-level features and checks. Even in my toy-sized app (LED flashlight controller), I've made a number of newbie errors that someone more experienced with the package might avoid, but that resulted in buggy output code that I'd rather have been caught by the compiler instead of through testing. On the other hand, it's still at version 0.1.something so improvements are undoubtedly coming.

Using Polymorphic Code for Legitimate Purposes?

I recently came across the term Polymorphic Code, and was wondering if anyone could suggest a legitimate (i.e. in legal and business appropriate software) reason to use it in a computer program? Links to real world examples would be appreciated!
Before someone answers, telling us all about the benefits of polymorphism in object oriented programming, please read the following definition for polymorphic code (taken from Wikipedia):
"Polymorphic code is code that uses a polymorphic engine to mutate while keeping the original algorithm intact. That is, the code changes itself each time it runs, but the function of the code in whole will not change at all."
Thanks, MagicAndi.
Update
Summary of answers so far:
Runtime optimization of the original code
Assigning a "DNA fingerprint" to each individual copy of an application
Obfuscate a program to prevent reverse-engineering
I was also introduced to the term 'metamorphic code'.

Runtime optimization of the original code, based on actual performance statistics gathered when running the application in its real environment and real inputs.

Digitally watermarking music is something often done to determine who was responsible for leaking a track, for example. It makes each copy of the music unique so that copies can be traced back to the original owner, but doesn't affect the audible qualities of the track.
Something similar could be done for compiled software by running each individual copy through a polymorphic engine before distributing it. Then if a cracked version of this software is released onto the Internet, the developer might be able to tell who cracked it by looking for specific variations produced the polymorphic engine (a sort of DNA test). As far as I know, this technique has never been used in practice.
It's not exactly what you were looking for I guess, since the polymorphic engine is not distributed with the code, but I think it's the closest to a legitimate business use you will find for this kind of technique.

Polymorphic code is a nice thing, but metamorphic is even nicer. To the legitimate uses: well, I can't think of anything other than anti-cracking and copy protection. Look at vx.org.ua if you wan't real world uses (not that legitimate though)

As Sami notes, on-the-fly optimisation is an excellent application of polymorphic code. A great example of this is the Fastest Fourier Transform in the West. It has a number of solvers at its disposal, which it combines with self-profiling to adjust the code path and solver parameters on subsequent executions. The result is the program optimises itself for your computing environment, getting faster with subsequent runs!
A related idea that may possibly be of interest is computational steering. This is the practice of altering the execution path of large simulations as the run proceeds, to focus on areas of interest to the researcher. The overall purpose of the simulation is not changed, but the feedback cycle acts to optimise the calculation. In this case the executable code is not being explicitly rewritten, but the effect from a user perspective is similar.

Polymorph code can be used to obfuscate weak or proprietary algorithms - that may use encryption e. g.. There're many "legitimate" uses for that. The term legitimate these days is kind of narrow-minded when it comes to IT. The core-paradigms of IT contain security. Whether you use polymorph shellcode in exploits or detect such code with an AV scanner. You have to know about it.

Obfuscate a program i.e. prevent reverse-engineering: goal being to protect IP (Intellectual Property).

Expert System engine

Next year will be my graduate year to be an informatics engineering person and I am trying to find ideas about the jounior project. Actually, I have an idea of making an expert system engine. I worked with clips and prolog and I liked clips very much but it seems to be an old engine. Can any one advice me about this idea or give me sources for papers or any topics that may help me? I am thinking to use C language to obtain the high performance, and to build a robust data structure. Also, I am thinking about an idea (I dont know if it could be done) of writing facts and rules (like clips) and then generate a C++ optimal code from these rules such that I can obtain the speed of the machine and use exe file.
I need help to make this idea more clear and how it can be done. Specially because I read about fuzzy logic, nueral network and heard about the new generation of expert system, so I dont know how that can be related to such topic.

For your junior project, I would recommend against writing it in C. Your problem sounds like it needs correctness more than it needs speed. Writing it in C will take longer because you will need to implement a lot of primitives that are not included in the language or any standard library. Also, since C is relatively low-level, there are a lot of opportunities to make low-level mistakes. Write it in a higher level language that is closer to the problem domain. You will have more time to focus on your actual problem because you will spend less time getting the framework set up. If you already know Prolog, it would be good to stick with that. Perhaps you might consider Mercury. It is similar to Prolog, but also designed for speed.

JBoss Rules (also known as Drools) offers the best approach to rule-processing. It's written in Java. It allows you to integrate program components in the rules, and rule-bases into your program components. You can even build or modify rule-bases on the fly.
I've heard that Java is catching up in its ability to do math, but outside of that, you have nothing to fear from performance.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string