It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 10 years ago.
By scopes I mean function scopes, classes scopes, libraries, DLLs and so on.
If in the end all code is translated to a series of instructions (by the compiler), how do the scopes of a high-level language have any effect on the code at the lowest level?
That's the whole point - one of them is high-level, the other is low-level. The CPU instructions don't know anything about scopes. It's a matter of memory management.
"Scopes" are part of the high-level programming environment - the compiler checks for you (according to rules of the programming language) that you can access the stuff you're trying to access (for example, you can't access variable x that's local to another function). One might say that this is a means of reducing bugs.
The "real" code being executed doesn't know about scopes.
Each variable name, or function name, etc., corresponds to a location in memory where the corresponding information or code is stored. (That's a slight simplification but you get the idea). A scope represents a mapping from names to locations. Think of it as a dictionary or similar data structure in your favorite language.
Nested scopes work as a sort of stack. When the language syntax introduces a new scope, a new mapping is pushed onto the stack. When you use a variable name, say i, the compiler or interpreter looks for it in the mapping at the top of the stack (the innermost scope), then, if it's not found there, in the enclosing scopes according to the rules of the language. When a scope ends, the relevant mapping is popped off the stack and the previous mapping comes back into effect.
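To make the stack-of-mappings idea concrete, here is a minimal sketch in Python. The class name ScopeStack and the location strings are made up for illustration; a real compiler's data structures will look different.

```python
class ScopeStack:
    """Toy model of nested scopes: a stack of name -> location mappings."""

    def __init__(self):
        self._scopes = [{}]  # the outermost (global) scope

    def push(self):
        """Enter a new scope."""
        self._scopes.append({})

    def pop(self):
        """Leave the current scope; the enclosing one is in effect again."""
        self._scopes.pop()

    def declare(self, name, location):
        """Bind a name in the innermost scope."""
        self._scopes[-1][name] = location

    def lookup(self, name):
        """Search from the innermost scope outward."""
        for scope in reversed(self._scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"{name!r} is not visible in this scope")


scopes = ScopeStack()
scopes.declare("i", "global slot 0")
scopes.push()                            # e.g. entering a function body
scopes.declare("i", "stack offset -4")
inner = scopes.lookup("i")               # the inner binding shadows the outer one
scopes.pop()                             # leaving the function body
outer = scopes.lookup("i")               # the global binding is visible again
print(inner, "|", outer)
```

Note that all of this bookkeeping happens while the code is being compiled or interpreted; in a compiled language, the generated instructions only ever see the resolved locations.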
The ugly details of how this works could vary. In the simplest case, a compiler could generate code that refers to memory locations directly. More realistically, a C compiler produces object files that include a "symbol table" which maps variable and function names to locations in memory. The linker resolves references between modules by looking up references from one module in the symbol table of another. And names are not mapped to absolute locations, but to offsets from some reference point. This allows libraries to be "relocatable code", which means they could be loaded to any location in memory during execution and still work properly. Languages that compile to a virtual machine may cut some corners, but the principles are the same.
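A rough sketch of the symbol-table and relocation idea, in Python for brevity. The module layout, the names main and helper, and the link function are all invented for this example; real object file formats carry far more detail.

```python
# Each hypothetical "object file" exports symbols as offsets from its own start
# and lists the external names it refers to.
module_a = {"symbols": {"main": 0x00}, "needs": ["helper"], "size": 0x40}
module_b = {"symbols": {"helper": 0x10}, "needs": [], "size": 0x20}


def link(modules, base_address):
    """Lay modules out one after another and resolve names to addresses."""
    table = {}
    load_at = base_address
    for module in modules:
        for name, offset in module["symbols"].items():
            table[name] = load_at + offset  # offset + wherever the module landed
        load_at += module["size"]
    # Every reference must be satisfied by some module's symbol table.
    for module in modules:
        for name in module["needs"]:
            if name not in table:
                raise LookupError(f"unresolved symbol: {name}")
    return table


# The same modules work at two different load addresses: only the base changes.
first = link([module_a, module_b], base_address=0x1000)
second = link([module_a, module_b], base_address=0x8000)
print(hex(first["helper"]), hex(second["helper"]))
```

Because every symbol is stored as an offset, moving the whole program only changes the base that gets added at the end.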
Scope is a high level language construct and mechanism. The low level CPU instructions are not concerned with scope.
Closed 10 years ago.
Code clones, also known as duplicate code, are often considered harmful to system quality.
I'm wondering whether such duplicate code can be found in standard APIs or other mature tools.
If that is indeed the case, then which language (such as C, Java, Python, Common Lisp, etc.) do you think is more likely to encourage code cloning?
Code cloning is extremely common no matter what programming language is used, yes, even in C, Python and Java.
People do it because it makes them efficient in the short term; they're doing code reuse. It's arguably bad, because it causes group inefficiencies in the long term: clones reuse code bugs and assumptions, and when those are discovered and need to be fixed, all the code clones need to be fixed, and the programmer doing the fixing doesn't know where the clones are, or even if there are any.
I don't think clones are bad, because of the code reuse effect. I think what is bad is not managing them.
To help with the latter problem, I build clone detectors (see our CloneDR) that automatically find exact and near-miss duplicated code, using the structure of the programming language to guide the search. CloneDR works for a wide variety of programming languages (including OP's set).
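For a feel of what "finding duplicated code" means, here is a deliberately naive sketch. It is not how CloneDR works (that tool uses the structure of the language, as said above); this one only compares whitespace-stripped lines in fixed-size windows, so it catches exact clones and nothing else. The function name and the sample text are made up.

```python
from collections import defaultdict


def find_exact_clones(source, window=3):
    """Return groups of starting line numbers whose next `window` lines match.

    Lines are compared after stripping whitespace; blank lines are ignored.
    """
    lines = [(number, text.strip())
             for number, text in enumerate(source.splitlines(), start=1)
             if text.strip()]
    seen = defaultdict(list)
    for index in range(len(lines) - window + 1):
        chunk = tuple(text for _, text in lines[index:index + window])
        seen[chunk].append(lines[index][0])
    return [starts for starts in seen.values() if len(starts) > 1]


sample = """\
total = 0
for item in cart:
    total += item.price
print(total)

total = 0
for item in cart:
    total += item.price
log(total)
"""
clones = find_exact_clones(sample)
print(clones)  # [[1, 6]]: the three lines starting at 1 reappear at line 6
```

Renaming a single variable in one copy would already defeat this approach, which is why near-miss detection needs something smarter than text comparison.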
In any software system of 100K SLOC or more, at least 10% of the code is cloned. (OK, OK, Sun's JDK is built by an exceptionally good team, they have only about 9.5%). It tends to be worse in older conventional applications; I suspect because the programmers clone more code out of self defense.
(I have seen applications in which the clones comprise 50%+ of the code; yes, those programs tend to be awful for many reasons, not just cloning).
You can see clone reports at the link for applications in several languages, look at the statistics, and see what the clones look like.
All code is the same, regardless of who writes it. Any API that you cite was written by human beings, who made decisions along the way. I haven't seen the perfect API yet - all of them get to redo things in hindsight.
Cloning code flies in the face of DRY, so of course it's recommended that you not do it. It's harmful because more code means more bugs, and duplication means you'll have to remember to fix them in all the clones.
But every rule has its exceptions. I'm sure everyone can think of a circumstance in which they did something that "best practices" and dogma would say they shouldn't, but they did it anyway. Time and other constraints don't always allow you to be perfect.
Suggesting that permission needs to be granted to allow such a thing is ludicrous to me. Be as DRY as you can. If you end up violating it, understand why you did it and remember to go back and fix it if you get a chance.
Closed 11 years ago.
I have learned, and am still learning, programming on my own. I see these words often. I would appreciate it if someone explained them in the context of programming:
Bootstrap
Sandbox
Scaffolding
Syntactic Sugar
Tear Down
Boiler Plate
VPN
Nightly Builds
Unmanaged DLL, e.g.: aspnet_isapi.dll
Bootstrap
The very early part of a computer start-up process. True "bootstrap" loaders have not existed on most systems in 20 years or so -- the term comes from the way a "bootstrap loader" was only big enough to read in the next few instructions and overwrite itself with a new, larger loader. This was necessary since the bootstrap loader had to be keyed in by hand, a tedious process involving switches and lights on the front panel of the computer. "Bootstrap" comes from the phrase "pulling yourself up by your bootstraps".
Sandbox
This is a partition of some sort in a computer system where one can experiment and "play" with new concepts without danger of damaging the rest of the system. This term alludes to the "sandbox" that many US kids played in all summer in the days before video games. It was a large box, typically about 6 feet square and a foot deep, filled with sand. Children (mostly boys) would play in it with toy tractors, toy soldiers, small shovels and pails, etc.
Scaffolding
In the construction trades this is a temporary structure used to assist in the construction or maintenance of something more permanent. You will often see, eg, scaffolding erected around a building to paint it or to repair masonry or what-have-you. In computing it's a similar concept -- the scaffolding is a (purportedly) temporary piece of software used as a "stand in" for more permanent code and to permit testing of a partially coded application. It may, eg, be a "driver" to test a subcomponent separately from a larger system, or it may be a substitute for a subcomponent that has not yet been coded.
Syntactic sugar
This refers to symbols or words in a language syntax that are there purely for human understanding, vs being necessary to specify the intended semantics to the computer. For instance, a language might have a "GO TO xxx" statement, when the "TO" is unnecessary, given that there's no ambiguity in simply saying "GO xxx". C/C++/Java have relatively little syntactic sugar (can't think of any obvious examples offhand), but COBOL, SQL, and a number of other languages have quite a lot.
Boiler plate
Not sure where this term originated, but I suspect it came from business and most likely contract law. It refers to the long, tedious "fine print" sections in some document that were, in all likelihood, copied verbatim from a prior document (and which, with modern word processors, are often embedded in a document using a single macro or document inclusion). Basically it's stuff that's meaningless drivel to all but the lawyers. So, by extension, in software "boiler plate" may be stuff that's always included in a program or procedure, and usually provided automatically or via macros.
VPN
Virtual private network. A concept where a program running on your laptop, say, will provide other programs on your box with an IP connection that is fully encrypted and which connects to a secure computer on the other end. (Ie, it "looks like" a physical ethernet connection to other software.) This allows you to, eg, use a regular browser or email client to communicate with the other end with no fear of having the messages intercepted (except by the CIA, of course), and without having to individually manage the encryption schemes for each tool.
Nightly builds
A technique used in some software shops where every night a product under development is recompiled from scratch and usually subjected to a series of "unit tests". This process may be wholly automatic or may be managed by humans to varying degrees. This is usually reserved for fairly large products (eg, operating systems), or it may be used in, eg, app shops to rebuild and test all of the apps currently under development.
Honestly I don't know them all, but I can say that:
Bootstrap -> it refers to a starting process and related activities
Sandbox -> it represents a mechanism in which an application/code is contained in an area and can't access external resources/hardware/code (iPhone applications are a perfect example)
Tear down -> means destroy objects! It's fundamentally related to unit testing frameworks... they have a tearDown() method in which it is possible to release/destroy the objects used for the test
Boiler Plate -> it's a block of "precooked" code that can be used as a starting point to write your own code... methods generated automatically by IDE can be considered boilerplate code
Bootstrap - the program that resides at a special location on the hard drive and is responsible for loading and executing the operating system.
Syntactic Sugar - refers to a syntax that simplifies another syntax that does the same thing. For example, i += 1; is syntactic sugar for i = i + 1;
Tear down - refers to the process of freeing resources after they're no longer required.
nightly builds - a program that is compiled every night using current source code usually from a repository such as SVN
Unmanaged DLL - refers to any non-.NET DLL.
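As an illustration of "tear down" from the list above, here is a small sketch using Python's unittest module, where setUp acquires a resource before each test and tearDown releases it afterwards. The test class and the scratch file are made up for the example.

```python
import os
import tempfile
import unittest


class ScratchFileTest(unittest.TestCase):
    def setUp(self):
        # Runs before each test: acquire the resource the test needs.
        handle, self.path = tempfile.mkstemp()
        os.close(handle)

    def tearDown(self):
        # Runs after each test, even a failing one: release the resource.
        os.remove(self.path)

    def test_file_round_trip(self):
        with open(self.path, "w") as scratch:
            scratch.write("hello")
        with open(self.path) as scratch:
            self.assertEqual(scratch.read(), "hello")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ScratchFileTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Other unit testing frameworks spell the hooks differently, but the pairing of a set-up step with a tear-down step is the same idea.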
You should be able to find information about them pretty easily if you have some background …
Syntactic Sugar: a + b does not mean anything more than a.__add__(b) (Python).
VPN: Virtual private network -- be in a network from outside with a tunnel.
nightly build: compile during the night
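The a + b example above can be tried directly. A minimal sketch, with a made-up Money class:

```python
class Money:
    """Hypothetical value type used only for this illustration."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)


a, b = Money(150), Money(275)
sugared = a + b            # the operator form
desugared = a.__add__(b)   # the method call it stands for here
print(sugared.cents, desugared.cents)
```

Strictly speaking this is a slight simplification: Python will also try the right-hand operand's __radd__ when __add__ returns NotImplemented, but for a case like this the two forms do the same thing.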
Closed 12 years ago.
I'm looking for programming languages that let you redefine their type system without having to hack into the compiler. Is there anything out there that allows you to do that?
Thanks
In C you can use #define to redefine everything.
#define int double
Whether it's good or bad you can find out here:
What is the worst real-world macros/pre-processor abuse you've ever come across?
If you're talking about redefining an actual type system, like making a statically typed language dynamic or making a weakly-typed language strongly-typed, then no.
Practically every language lets you define your own types, so I don't think that's what you meant either.
The only thing I can think of that might fit what you're asking about is macros in Common Lisp, which let you extend the syntax. This might be able to achieve what you are looking for, but until you state exactly what you're looking for, I can't really elaborate.
Also OCaml and its related languages allow you to do some pretty cool things with types. You can basically define any kind of type you can think of and then match against it with pattern matching, which makes it especially good to write compilers in.
JavaScript, Ruby, and Smalltalk, just the ones I know of, allow you to do all kinds of stuff, even redefining on the fly what an Object can do. Perl allows you to redefine practically the whole language. Basically any decent scripting language, especially one that allows duck typing, should have equal power. But it seems to be really common among functional languages and those with functional abilities.
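Python is not in the list above, but it behaves the same way and makes for a short demonstration of redefining on the fly what an object can do. The Duck class is made up for the example.

```python
class Duck:
    def speak(self):
        return "Quack"


donald = Duck()

# Replace a method on the class while the program is running; existing
# instances pick up the new behaviour because lookup happens at call time.
Duck.speak = lambda self: "Honk"
patched = donald.speak()

# Add a method that the class never declared.
Duck.swim = lambda self: "paddling"
print(patched, donald.swim())
```

Whether this counts as "redefining the type system" is debatable; it changes what existing types can do, not the rules by which types are checked.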
If I remember correctly, Ada has neat type-creation possibilities, especially for measures (for instance, defining a minimum and a maximum, checking operations between different measures...). I've seen it quoted as an example of how to avoid very stupid bugs.
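This is not Ada, but the idea of a bounded measure that refuses to mix with other measures can be imitated in Python. A rough sketch, with made-up names (Measure, celsius, metres) and made-up bounds; unlike Ada, which rejects mixed types at compile time, everything here is only checked when the code runs.

```python
class Measure:
    """Hypothetical bounded quantity with a unit, loosely modelled on the idea above."""

    def __init__(self, value, unit, minimum, maximum):
        if not minimum <= value <= maximum:
            raise ValueError(f"{value} {unit} is outside [{minimum}, {maximum}]")
        self.value, self.unit = value, unit
        self.minimum, self.maximum = minimum, maximum

    def __add__(self, other):
        if not isinstance(other, Measure) or other.unit != self.unit:
            raise TypeError("cannot add measures with different units")
        # The result is range-checked again by the constructor.
        return Measure(self.value + other.value, self.unit, self.minimum, self.maximum)


def celsius(value):
    return Measure(value, "C", -273.15, 1000.0)


def metres(value):
    return Measure(value, "m", 0.0, 10000.0)


warm = celsius(20.0) + celsius(5.0)
print(warm.value, warm.unit)
```

Trying celsius(20.0) + metres(1.0) raises TypeError, and celsius(-300.0) raises ValueError, which is the kind of very stupid bug the answer is talking about.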
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
So is a decompiler really a thing that gives the source of a compiled/interpreted piece of code? Because to me that sounds impossible. How would you get the names of the functions, variables, classes, etc. if it is compiled? Or am I misinterpreting the definition? How does it work? And what is the general principle behind making one?
You're right about your definition of a decompiler: it takes a compiled application and produces source code to match. However, it does not in most cases know the name and structure of variables/functions/classes--it just guesses. It analyzes the flow of the program and tries to find a way to represent that flow through a certain programming language, typically C. However, because the programming language of choice (C, in this example) is often at a higher level than the state of the underlying program (a binary executable), some parts of the program might be impossible to represent accurately; in this case, the decompiler would fail and you would need to use a disassembler. This is why many people like to obfuscate their code: it makes it much harder for decompilers to open it.
Building a decompiler is not a simple task. Basically, you have to take the application that you are decompiling (be it an executable or some other form of compiled application) and parse it into some kind of tree you can work with in memory. You would then analyze the flow of the program and try to find patterns that might suggest that an if statement/variable/function/etc was used in a certain location in the code. It's all really just a guessing game: you'd have to know the patterns that the compiler makes in compiled code, then search for those patterns and replace them with equivalent human-readable source code.
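To give a flavour of the "recognize a pattern, emit source" step, here is a toy sketch for an invented stack machine with five opcodes. Nothing here corresponds to a real instruction set or a real decompiler; it only shows how an expression can be rebuilt from instructions, and how variable names end up as placeholders.

```python
def decompile(instructions):
    """Rebuild an expression string from postfix stack-machine instructions."""
    operators = {"ADD": "+", "SUB": "-", "MUL": "*"}
    stack = []
    for opcode, *operand in instructions:
        if opcode == "PUSH":            # a literal constant
            stack.append(str(operand[0]))
        elif opcode == "LOAD":          # a variable slot; the original name is gone
            stack.append(f"var{operand[0]}")
        elif opcode in operators:       # pattern: two operands, then an operator
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left} {operators[opcode]} {right})")
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# Hypothetical compiled form of something like `price * quantity + 1`.
program = [("LOAD", 0), ("LOAD", 1), ("MUL",), ("PUSH", 1), ("ADD",)]
print(decompile(program))  # ((var0 * var1) + 1)
```

Real machine code is much harder because registers, jumps and compiler optimizations blur these patterns, which is where the guessing comes in.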
This is all much simpler for higher-level programs like Java or .NET, where you don't have to deal with assembly instructions, and things like variables are mostly taken care of for you. There, you don't have to guess as much as just directly translate. You might not have exact variable/method names, but you can at least deduce the program structure fairly easily.
Disclaimer: I have never written a decompiler and thus don't know every detail of what I'm talking about. If you are really interested in writing a decompiler, you should get a book on the topic.
A decompiler basically takes the machine code and reverts it back to the language it was written in. If I'm not mistaken, the decompiler needs to know what language it was compiled from, otherwise it won't work.
The basic purpose of the decompiler is to get back to your source code; for example, one time my Java file got corrupted and the only thing I could do to bring it back was to use a decompiler (since the class file wasn't corrupted).
It works by deducing a "reasonable" (based on some heuristics) representation of what's in the object code. The degree of resemblance between what it produces and what was originally there tends to depend heavily upon how much information is contained in the binary it starts from. If you start with basically a "pure" binary, it's generally stuck with just making up "reasonable" names for the variables, such as using things like i, j and k for loop indexes, and longer names for most others.
On the other hand, a language that supports introspection needs to embed a great deal more information about variable names, types, etc., into the executable. In a case like this, decompiling can produce something much closer to the original, such as typically retaining the original names for functions, variables, etc. In such a case, the decompiler can often produce something quite similar to the original -- possibly losing little more than formatting and comments.
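Python is one such language, and you can see the retained information without any decompiler. A small sketch (the area function is just an example):

```python
def area(width, height):
    scale = 2
    return width * height * scale


code = area.__code__          # the compiled form of the function
print(code.co_name)           # the function's name survives compilation
print(code.co_varnames)       # so do argument and local variable names
```

The comment lines and the original formatting are not stored in the code object, which matches the point above about what tends to get lost.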
That depends on what language you are decompiling. If you are decompiling something like C or C++, then the only information provided to you is function names and arguments (in DLLs). If you are dealing with Java, then the compiler usually inserts line numbers, variable names, field and method names, and so on. If there are no variable names, then you would get names like localInt1, localInt2, localException1, or whatever the decompiler chooses. And it can tell the spacing between lines, because of the line numbers.
Closed 13 years ago.
Often, I am told that security functions are not available at a level of abstraction at which a developer with little security knowledge can use them. What changes would developers want in their development environment, say for Java, that would make securing their software much easier than it is today?
I am looking at new ways like providing configurability at the level where the programmer just has to declare the security function he desires and the level he wants and only really power programmers will need to go and do something really extra.
So 2 part question - what services will you want as a developer and how would you like it to be integrated into your IDE (your development environment) so that you can easily use it.
where the programmer just has to declare the security function he desires
That's like asking "What type of scalpel can I buy so I don't have to learn doctorin'?"
You can't.
The issue of "security" covers a very broad range of concerns. You don't just "turn on security" and it's done. Security issues involve guarding your software from an ever-growing number of malicious behaviors.
At the root of it, computers have to let people do things. To give users that freedom, people will always find ways of getting into things they are not supposed to get into. The people who write operating systems, frameworks, and development environments can patch holes and abstract away some of today's security concerns but new ones will always be developed. Daily. It's a moving target.
Security is complicated because you have to know what types of vulnerabilities your application can be subject to. The only way to learn that is through vigilant study and experience.
You can't just flip a switch.
The more I think about it, the more I realize that what you want actually exists. Programmers tell the computer what they want it to do in a very specific manner, using well defined languages. The computer is expected to do exactly what the programmer dictates, within the definition of the language. If the programmer wants security, s/he can already tell the computer to act in a secure manner - such as using String classes in C++ to avoid buffer overruns with char arrays.
Of course, this isn't enough. Programmers still get things wrong, so language designers try to help with things like Java. In Java, buffer overruns are much harder to exploit beyond a simple application crash (char[]c = new char[1]; System.out.println(c[10]);).
But that's not really enough - the compiler can have bugs that will insert overruns, or the virtual machine may have such bugs. And other vulnerabilities will exist - like bad permissions on files, or exploitable race conditions (aka TOCTOU).
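For readers who haven't met TOCTOU (time of check to time of use), here is a small sketch of the pattern in Python. The file name is made up, and nothing is actually racing in this single-threaded example; it only contrasts a separate check-then-act with a single atomic operation.

```python
import os
import tempfile

directory = tempfile.mkdtemp()
path = os.path.join(directory, "report.txt")

# Racy pattern: another process could create the file between the check
# and the open, and this code would then overwrite it.
if not os.path.exists(path):
    with open(path, "w") as report:
        report.write("first writer\n")

# Atomic alternative: mode "x" creates the file or fails, in one step.
try:
    with open(path, "x") as report:
        report.write("second writer\n")
    created = True
except FileExistsError:
    created = False

print(created)  # False: the file already existed, so nothing was overwritten
```

No language feature removes this class of bug on its own; the programmer still has to know that the check and the use must not be separable.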
As we find vulnerability types, we can take steps to prevent them (developer training, new language features, new compiler features, and new OS features), and we can take steps to identify them (dynamic analysis, source code analysis, binary analysis), but we can't eliminate all bugs. This is especially true as new technologies come into play (once XSS and SQL injection were understood, developers started introducing LDAP injection).
OWASP is trying to do this with its ESAPI project.
The best way for security to work is to have it built into the API, with very context-aware, context-specific programming methods. This is what SqlParameters do, in .NET, and similar things in other languages (they grab as much context as possible; types of variables, and so on, and perform validation).
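As an example of the "similar things in other languages", here is a sketch using Python's built-in sqlite3 module and its ? placeholders. The table and the hostile input are made up; the point is only the difference between pasting input into the SQL text and passing it as a bound parameter.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
connection.executemany("INSERT INTO users VALUES (?, ?)",
                       [("alice", 1), ("bob", 0)])

hostile_input = "nobody' OR '1'='1"

# Unsafe: the input is pasted into the SQL text and changes the query's meaning.
unsafe_sql = f"SELECT name FROM users WHERE name = '{hostile_input}'"
unsafe_rows = connection.execute(unsafe_sql).fetchall()

# Safer: the input travels separately as a bound parameter and is treated
# purely as a value, never as SQL.
safe_rows = connection.execute(
    "SELECT name FROM users WHERE name = ?", (hostile_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))  # 2 0
connection.close()
```

The unsafe query returns every user, while the parameterized one correctly finds nobody with that odd name.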
If you want to get involved, you can probably get in on the OWASP action, as long as you are motivated.
The best way to do security is to let someone else do it, and follow the API and instructions to the letter. But also, you need to have significant trust in the people doing the underlying work, and you need to stay up to date with what is happening with the libraries you use.
On that last point, you also need to stay flexible. If it so happens that a vulnerability is discovered in your underlying system X, then you may need to update or remove it completely (but most likely update). You need to have the facility to do this ASAP. I.e. to swap out hashing functions, or to change encryption routines.
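One way to keep that flexibility, sketched in Python with the standard hashlib module: store the algorithm name next to each digest, so old values can still be verified after you switch. The names CURRENT_ALGORITHM, make_digest and verify_digest, and the "algorithm$digest" storage format, are assumptions made for this example, and this is for integrity checks only; password storage should use a dedicated password-hashing function.

```python
import hashlib

# Assumed configuration value; changing it is the only edit needed to move
# new digests to a different algorithm.
CURRENT_ALGORITHM = "sha256"


def make_digest(data, algorithm=None):
    """Return 'algorithm$hexdigest' so each stored value records how it was made."""
    algorithm = algorithm or CURRENT_ALGORITHM
    return f"{algorithm}${hashlib.new(algorithm, data).hexdigest()}"


def verify_digest(data, stored):
    """Check data against a stored value using the algorithm named inside it."""
    algorithm, _, _ = stored.partition("$")
    return make_digest(data, algorithm) == stored


old = make_digest(b"release-1.0.tar", algorithm="sha1")   # written before a switch
new = make_digest(b"release-1.0.tar")                     # written after it
print(verify_digest(b"release-1.0.tar", old), verify_digest(b"release-1.0.tar", new))
```

With this layout, moving away from a weakened algorithm means changing one setting and re-hashing old records as they are next touched, rather than hunting through the code base.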
Obviously, this area is complicated and interesting. I suggest to you that OWASP is a good place to get started.