Are "subroutine" and "routine" the same concept? - programming-languages

I have seen both "subroutine" and "routine" used in programming language books. Are they the same concept? What does "sub-" mean?
There are many examples you might have seen in computer science books; here is one from Programming Language Pragmatics, by Scott:
In Section 3.2.2 we discussed the allocation of space on a subroutine
call stack (Figure 3.1). Each routine, as it is called, is given a new
stack frame, or activation record, at the top of the stack. This frame
may contain arguments and/or return values, bookkeeping information
(including the return address and saved registers), local variables,
and/or temporaries. When a subroutine returns, its frame is popped
from the stack.
Thanks.

It is my understanding that "subroutine" and "routine" are just names for self-contained blocks of code or instructions that the program runs. For example, in Ruby we'd call subroutines methods, whereas in JavaScript they are called functions.
In the context of the Programming Language Pragmatics example you provided, each routine gets its own stack frame on the call stack as it is called, holding its arguments, bookkeeping information, and locals. When a routine finishes, its frame is popped and execution continues in the routine below it on the stack.
Wikipedia has a great high-level explanation of what is happening within the call stack and how subroutines got their name.

Both terms refer to the same thing: a subroutine is a routine called from inside another routine, which is what the "sub-" prefix signifies. Think of it as a main program (a routine) that has function calls inside it; every call to a function is a subroutine call. However, there are a few differences between functions and routines; you can read more here.
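To make the terminology concrete, here is a minimal C++ sketch (the names are purely illustrative): main is the routine, and greet is a subroutine it calls.

#include <iostream>

// greet is a subroutine: a self-contained block of code the program runs.
void greet() {
    std::cout << "hello" << std::endl;
}

// main is the routine that calls it. Each call to greet pushes a new
// stack frame, which is popped when greet returns.
int main() {
    greet();
    return 0;
}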

Related

Is it possible to embed Haskell in a C library opaquely?

i.e. is it possible to embed Haskell code in a C library so that the user of the library doesn't have to know Haskell is being used? In particular, so that the user could use multiple libraries that embed Haskell, without any conflicts?
As far as I understand things, you embed between calls to hs_init and hs_exit, but these involve global state shenanigans and should conflict with other calls, no?
Yes, it's possible to call Haskell code from C (and vice versa) through the FFI, the Foreign Function Interface. Unfortunately, as the haskell.org docs say, you can't avoid the calls to initialize and finalize the Haskell environment:
The call to hs_init() initializes GHC's runtime system. Do NOT try to
invoke any Haskell functions before calling hs_init(): bad things will
undoubtedly happen.
But this is also interesting:
There can be multiple calls to hs_init(), but each one should be
matched by one (and only one) call to hs_exit()
And furthermore:
The FFI spec requires the implementation to support re-initialising
itself after being shut down with hs_exit(), but GHC does not
currently support that.
Basically, my idea is that you could exploit these specifications to write yourself a wrapper C++ class that manages the calls to hs_init and hs_exit for you, for example by using template methods surrounded by hs_init and hs_exit that you can override with any Haskell call you want.
However, beware of interactions with other libraries that call Haskell code: nested layers of calls to hs_init and hs_exit should be OK (so it's safe to use libraries which call them in between your wrappers), but the total number of calls should always match. This means that if those libraries only initialize the environment without ever closing it, it's up to you to finish the job.
Another (probably better) idea, without resorting to inheritance and overriding, is a simple class HaskellEnv that calls hs_init in its constructor and hs_exit in its destructor. If you declare HaskellEnv objects as automatic variables, the calls to hs_init and hs_exit will always be matched, and the last call to hs_exit will be made as soon as the last HaskellEnv object is destroyed when you leave its scope.
Have a look at this question in order to prevent the creation of objects on the heap (they may be dangerous in this case).
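Here is a minimal sketch of that RAII idea, assuming only the real hs_init/hs_exit entry points declared in GHC's HsFFI.h (the HaskellEnv class itself is hypothetical):

#include "HsFFI.h"

class HaskellEnv {
public:
    // hs_init takes pointers to argc/argv so the runtime can strip its own flags.
    HaskellEnv(int *argc, char ***argv) { hs_init(argc, argv); }
    ~HaskellEnv() { hs_exit(); }

    // Forbid copies so the hs_init/hs_exit calls stay balanced.
    HaskellEnv(const HaskellEnv &) = delete;
    HaskellEnv &operator=(const HaskellEnv &) = delete;
};

int main(int argc, char *argv[]) {
    HaskellEnv env(&argc, &argv);  // hs_init runs here
    // ... call exported Haskell functions through the FFI here ...
    return 0;
}                                  // hs_exit runs when env leaves scope

Because automatic variables are destroyed in reverse order of construction, nested scopes that each create a HaskellEnv keep the init/exit calls matched, as described above.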

What is the fastcall keyword used for in visual c?

I have seen the fastcall notation prepended to many functions. Why is it used?
That notation before the function is called the "calling convention." It specifies how (at a low level) the compiler will pass input parameters to the function and retrieve its results once it's been executed.
There are many different calling conventions, the most popular being stdcall and cdecl.
You might think there's only one way of doing it, but in reality, there are dozens of ways you could call a function and pass variables in and out. You could place the input parameters on a stack (push, push, push to call; pop, pop, pop to read input parameters). Or perhaps you would rather stick them in registers (this is fastcall - it tries to fit some of the input params in registers for speed).
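For instance, with Microsoft's compiler the keyword is __fastcall. A minimal sketch (the add function is invented, but the register assignment is the documented x86 MSVC behavior):

// On x86 MSVC, __fastcall passes the first two integer-sized arguments
// in the ECX and EDX registers; any remaining arguments go on the stack.
int __fastcall add(int a, int b) {  // a -> ECX, b -> EDX
    return a + b;                   // the result comes back in EAX
}

int main() {
    return add(2, 3);
}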
But then what about the order? Do you push them from left to right or right to left? What about the result - there's always only one (assuming no reference parameters), so do you place the result on the stack, in a register, at a certain memory address?
Also, let's assume you're using the stack for communication - whose job is it to actually clear the stack after the function is called - the caller or the callee?
What about backing up and then restoring the contents of (certain) CPU registers - should the caller do it, or will the callee guarantee that it'll return everything the way it was?
The most popular calling convention (by far) is cdecl, which is the standard calling convention in both C and C++. The WIN32 API uses stdcall, which means any code that calls the WIN32 API needs to use stdcall for those function calls (making it another popular choice).
fastcall is a bit of an oddball. People realized that for many functions with only one in/out parameter, pushing to and popping from a memory-based stack is quite a bit of overhead and makes function calls a little heavy, so different compilers introduced (different) calling conventions that place one or more parameters in registers before placing the rest on the stack, for better performance. The problem is that not all compilers use the same rules for what goes where and who does what with fastcall, so you have to be careful when using it because you'll never know who does what. Finally, see Is fastcall really faster? for info on fastcall's performance benefits.
Complicated stuff.
Something important to keep in mind: don't add or change calling conventions if you don't know exactly what you're doing, because if the caller and the callee do not agree on the calling convention, you'll likely end up with stack corruption and a segfault. This usually happens when the function being called lives in a DLL/shared library and a program is written that depends on the DLL/SO/dylib using a certain calling convention (say, cdecl), and then the library is recompiled with a different calling convention (say, fastcall). Now the old program can no longer communicate with the new library.
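As a hedged sketch of how the two sides stay in agreement (the names here are invented): the calling convention is part of the function's declared type, so both the caller and the callee must compile against the same declaration.

// Shared header: the calling convention is baked into the declaration.
extern "C" int __fastcall compute(int x, int y);

// Callee (imagine this living in the DLL): it must match the declaration.
extern "C" int __fastcall compute(int x, int y) {
    return x + y;
}

int main() {
    return compute(1, 2);  // the caller emits a matching __fastcall sequence
}

If the program and the library are built from different versions of that declaration, you get exactly the mismatch described above.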
Wikipedia states that
Conventions entitled fastcall have not been standardized, and have been implemented differently, depending on the compiler vendor. Typically fastcall calling conventions pass one or more arguments in registers which reduces the number of memory accesses required for the call.

Origins of the name 'main' for program entry point?

Out of curiosity, what are the origins of the name 'main' for a program entry point?
Before C, there was IBM's PL/I. In PL/I you declared a procedure with options. If you wrote
MUMBLE: PROC OPTIONS(MAIN);
that told the compiler that the MUMBLE procedure was the main procedure. PL/I may have adopted this convention from elsewhere, or C may have adopted it from PL/I, or maybe it was just in the air. But it definitely predates C.
(If anyone is wondering why all upper case, the IBM keypunches of the day did not support lower-case characters. Yes, I wrote programs on punched cards. That's probably why I'm a bit shaky on the syntax; it has been a while.)
I'm pretty sure that it has to do with the fact that it is the 'main' function of the program. Anything more than that is unknown to me.
In Fortran the main program was the main program even though it didn't have a name. It was distinguished from subroutines and functions by having an executable statement (or other non-commentary statement) without a preceding SUBROUTINE or FUNCTION statement.
When later languages decided they wanted the main routine to start with a beginning line like other procedures or functions, some of them adopted the word MAIN or main in various ways.
As someone else pointed out, Pascal did it differently. Shell scripts and Perl resemble Fortran.
My understanding (though I couldn't find a reference to confirm) is that some early languages had a notion of a main procedure (the first might have been Ada), even though you did not have to name it main().
I think that C was the first language to actually use this token as a name. C largely replaced Pascal, which didn't have a named start procedure, if I remember correctly.
From there it influenced subsequent languages that were C inspired like C++, Java and C#.
It also culturally influenced languages that do not mandate such a function, like Python.

Is the valid state domain of a program a regular language?

If you look at the call stack of a program and treat each return pointer as a token, what kind of automaton is needed to build a recognizer for the valid states of the program?
As a corollary, what kind of automaton is needed to build a recognizer for a specific bug state?
(Note: I'm only looking at the info that could be had from this function.)
My thought is that if these form regular languages, then some interesting tools could be built around that. E.g. given a set of crash/failure dumps, automatically group them and generate a recognizer to identify new instances of known bugs.
Note: I'm not suggesting this as a diagnostic tool but as a data management tool for turning a pile of crash reports into something more useful.
"These 54 crashes seem related, as do those 42."
"These new crashes seem unrelated to anything before date X."
etc.
It would seem that I've not been clear about what I'm thinking of accomplishing, so here's an example:
Say you have a program that has three bugs in it.
Two bugs that cause invalid args to be passed to a single function, tripping the same sanity check.
A function that, if given a (valid) corner case, goes into infinite recursion.
Also, assume that when the program crashes (failed assert, uncaught exception, seg-V, stack overflow, etc.) it grabs a stack trace, extracts the call sites on it, and ships them to a QA reporting server. (I'm assuming that only that information is extracted because (1) it's easy to get with a one-time, per-project cost and (2) it has a simple, definite meaning that can be used without any special knowledge about the program.)
What I'm proposing would be a tool that would attempt to classify incoming reports as connected to one of the known bugs (or as a new bug).
The simplest thing would be to assume that one failure site is one bug, but in the first example, two bugs get detected in the same place. The next easiest thing would be to require the entire stack to match, but again, this doesn't work in cases like the second example, where multiple pieces of valid code can trip the same bug.
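To make those two strategies concrete, here is a minimal C++ sketch with invented stack traces (call sites only, innermost first). Grouping by failure site merges the two distinct bugs from the first example, while requiring the whole stack to match splits the recursion bug from the second example across its callers:

#include <iostream>
#include <map>
#include <string>
#include <vector>

using Stack = std::vector<std::string>;  // call sites, innermost first

int main() {
    std::vector<Stack> reports = {
        {"sanity_check", "parse_args", "main"},   // bug 1
        {"sanity_check", "render", "main"},       // bug 2: same failure site
        {"recurse", "recurse", "handle", "main"}, // bug 3, reached via handle
        {"recurse", "recurse", "batch", "main"},  // bug 3, reached via batch
    };

    std::map<std::string, int> byFailureSite;  // "one failure site, one bug"
    std::map<Stack, int> byWholeStack;         // "entire stack must match"
    for (const auto &s : reports) {
        ++byFailureSite[s.front()];
        ++byWholeStack[s];
    }
    // Prints "2 vs 4" -- neither strategy recovers the real count of 3 bugs.
    std::cout << byFailureSite.size() << " vs " << byWholeStack.size() << "\n";
}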
The return pointer on the stack is just a pointer to memory. In theory, if you look at the call stack of a program that makes just one function call, the return pointer (for that one function) can have a different value on every execution of the program. How would you analyze that?
In theory you could read through a core dump using a map file. But doing so is extremely platform and compiler specific. You would not be able to create a general tool for doing this with any program. Read your compiler's documentation to see if it includes any tools for doing postmortem analysis.
If your program is decorated with assert statements, then each assert statement defines a valid state. The program statements between the assertions define the valid state changes.
A program that crashes has violated enough assertions that something broke.
A program that's incorrect but "flaky" has violated at least one assertion but hasn't failed.
It's not at all clear what you're looking for. The valid states are -- sometimes -- hard to define but -- usually -- easy to represent as simple assert statements.
Since a crashed program has violated one or more assertions, a program with explicit, executable assertions doesn't need crash debugging. It will simply fail an assert statement and die visibly.
If you don't want to put in assert statements, then it's essentially impossible to know what state should have been true and which (never-actually-stated) assertion was violated.
Unwinding the call stack to work out the position and the nesting is trivial. But it's not clear what that shows. It tells you what broke, but not what other things led to the breakage. That would require guessing which assertions were supposed to have been true, which requires deep knowledge of the design.
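A minimal sketch of "assertions define the valid states" (the withdraw function is invented): each assert pins down what must hold at that point, the code between the asserts is the valid state change, and a broken program dies visibly at the first violated assertion.

#include <cassert>

int withdraw(int balance, int amount) {
    assert(balance >= 0 && amount > 0);  // valid entry state
    int newBalance = balance - amount;   // the state change between asserts
    assert(newBalance >= 0);             // valid exit state; fails visibly if violated
    return newBalance;
}

int main() {
    return withdraw(10, 3) == 7 ? 0 : 1;
}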
Edit.
"seem related" and "seem unrelated" are undefinable without recourse to the actual design of the actual application and the actual assertions that should be true in each stack frame.
If you don't know the assertions that should be true, all you have is a random puddle of variables. What can you claim about "related" given a random pile of values?
Crash 1: a = 2, b = 3, c = 4
Crash 2: a = 3, b = 4, c = 5
Related? Unrelated? How can you classify these without knowing everything about the code? If you know everything about the code, you can formulate standard assert-statement conditions that should have been true. And then you know what the actual crash is.

Something higher in the call stack making a call

When a caller is higher in the stack, what does this mean? For example, let's say I start a program and a form loads up (we'll call this a); then this form calls another form (b). The called form will be at the top of the stack, so if this form calls form a, will this be a caller higher in the stack making a call to something below?
Thanks
I think you have the wrong impression of the call stack. The call stack is just a "list" of the functions that have been called. When you have a call chain like the one you describe, where a calls b which calls a, your stack is just:
a.second
b.first
a.first
You can't really call "down" to something. You make another call, and it goes on top of the stack. Even if the same method has been called before, the new call is completely separate from the previous one: it starts a whole new "stack frame".
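A minimal sketch of that chain (function names invented): a calls b, b calls "back into" a, and each call simply pushes a new frame on top of the stack.

#include <iostream>

void a_second() { std::cout << "top of stack: a.second\n"; }
void b_first()  { a_second(); }  // b calling "back into" a is just a new frame
void a_first()  { b_first(); }   // a calls b

int main() {
    a_first();  // at the deepest point the stack is: a.second / b.first / a.first
}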
You need to distinguish between the object making the call (if any), the target of the call, and the method being called. For instance, your call stack could easily look like this:
FormA.Method3()
FormB.Method2()
FormA.Method1()
This is an instance of FormA executing Method1, calling Method2 on an instance of FormB. That then calls Method3 on an instance of FormA - either the same FormA as the first one, or a different one. It doesn't really matter.
It's not really a case of calling "something below" because the stack frames don't represent objects - they represent methods (and the state within those methods). Does that help at all, or is it just confusing things more?
