What is the ... variadic argument syntactically?

What does a C/C++ compiler think ... is? To be clear, I don't think this is a duplicate question, because other stdarg questions are about "what are variadic argument lists/how do they work?" That's not my question.
I have looked through MSVC's include files and found stdarg.h, vcruntime.h, etc., but haven't satisfied myself yet.
Does the compiler see ... as an operator? A linker symbol? A macro? It can't be an identifier, because that source character (.) isn't allowed in identifiers.
If I had to guess, I'd say it's something akin to using __attribute__ macros or inline or register compiler "hints" to inhibit warnings/errors upon invoking the function with multiple parameters.
From ISO9899:
6.5.2.2 Function calls
Constraints
6 The ellipsis notation in a function prototype declarator causes
argument type conversion to stop after the last declared parameter. The default argument
promotions are performed on trailing arguments.
I suppose not everything needs to be nailed down exactly, but I was curious if maybe there was more technical information out there.

A punctuator.
ISO 9899:
6.4.6 Punctuators
Semantics
2 A punctuator is a symbol that has independent syntactic and semantic significance. Depending on context, it may specify an operation to be performed (which in turn may yield a value or a function designator, produce a side effect, or some combination thereof) in which case it is known as an operator (other forms of operator also exist in some contexts). An operand is an entity on which an operator acts.
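To make the punctuator reading concrete, here is a minimal C++ sketch (not taken from the standard or from MSVC's headers) in which ... appears as a plain token in a parameter list. The names sum_ints and sum_ints_examples and the count-first calling convention are assumptions made up for illustration.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical helper: adds up `count` trailing int arguments.
// The `...` below is a punctuator token in the parameter list,
// not an operator, a macro, or an identifier.
int sum_ints(int count, ...) {
    assert(count >= 0);
    std::va_list args;
    va_start(args, count);      // begin reading after the last named parameter
    int total = 0;
    for (int i = 0; i < count; ++i) {
        // Trailing arguments undergo the default argument promotions,
        // so a short or char passed here is read back as int.
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}

// Small usage example.
void sum_ints_examples() {
    assert(sum_ints(3, 1, 2, 3) == 6);
    assert(sum_ints(2, static_cast<short>(40), 2) == 42);  // short promoted to int
}
```

The compiler only checks the named parameter; everything matched by ... is passed after promotion and has to be interpreted by the callee.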


Why can a function be a literal when an expression can't?

I understand that the term literal applies whenever you represent a fixed value in source code exactly as it is meant to be interpreted, as opposed to a variable or a constant, which are names for any one of a class of values or for a single value, respectively.
But they are also opposed to expressions. I thought it was because they could incorporate variables. But even expressions like 1+2 are not (see first answer in What does the word "literal" mean?).
So, when I define a variable this way:
var=1+2
1+2 is not a literal even though it is not a name and evaluates to a single value. I could then guess that it is because it doesn't represent the target value directly; in other words, a literal represents a value "exactly as it is".
But then how is it possible that a function like this one is a literal (as pointed out in the same linked answer)?
(x) => x*x
Only anonymous functions can be literals, because they are not bound to an identifier.
So (x) => x*x is a literal because it is an anonymous function, or function literal,
but
void my_func()
{#something}
is not a literal, because it is bound to an identifier.
See:
https://en.wikipedia.org/wiki/Literal_(computer_programming)
https://en.wikipedia.org/wiki/Anonymous_function
Expressions can be divided into two general types: atomic expressions and composite expressions.
Composite expressions can be divided by operator, and so on; atomic expressions can be divided into variables, constants, and literals. I guess different authors might use other categories or boundaries here, so it might not be universal. But I will argue why this categorization might make sense.
It's fairly obvious why strings or numbers are literals, or why a sum isn't. A function call can be considered composite, as it operates on subexpressions - its parameters. A function definition does not operate on subexpressions. Only when the function so defined is called does that call pass parameters into the function. In a compiled language, the anonymous function will likely be replaced by a target address where the corresponding code is located - that memory location is obviously not dependent on any subexpression.
@rdRahul's answer references this Wikipedia article, which says that object literals, such as {"cat", "dog"}, can be considered literals. This can be easily argued by pointing out that the object which is the value of the expression is just one opaque thing, such as a pointer to the memory location of the object.
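The distinction between a named function and a function literal can be sketched in C++, where a lambda expression plays the role of (x) => x*x. This is only an illustration; the names square_named, square_literal, apply_to_three and nine_via_literal are made up here.

```cpp
#include <cassert>

// A named function: the definition is bound to the identifier square_named.
int square_named(int x) { return x * x; }

// A function literal (lambda expression). The expression on the right has no
// name of its own; square_literal is just a variable initialized from it,
// in the same way that `int n = 3;` initializes n from the literal 3.
const auto square_literal = [](int x) { return x * x; };

// Takes any plain function pointer and applies it to 3.
int apply_to_three(int (*f)(int)) {
    assert(f != nullptr);
    return f(3);
}

// The literal can also be written in place, without ever being named.
int nine_via_literal() {
    return apply_to_three([](int x) { return x * x; });
}
```

In nine_via_literal the function value appears directly in the source, which is the sense in which the answers above call an anonymous function a literal.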

Function pattern matching in Haskell

I'm trying to learn Haskell with the guidance of Learn You a Haskell, but the following puzzles me.
lucky :: (Integral a) => a -> String
lucky 7 = "LUCKY NUMBER SEVEN!"  
lucky x = "Sorry, you're out of luck, pal!"
As you can see, there's one line up there stating the exact types of the function. But is this necessary? Can't the types of parameters and return values be deduced from the patterns below that line?
You are right, they are absolutely not necessary. However, it's a very common practice to state the type of the function nevertheless, for at least two reasons:
To tell the compiler what you actually mean. If you make a mistake writing the function, the compiler will not infer a bad type, but will warn you of your mistake.
To tell the people who read your code. They'll have to find out the type of the function anyway while understanding the code, so you might as well make it easier for them. Having the type explicitly makes the code more readable.
This is why, although they are optional, the types of top level functions are almost always spelled out in Haskell code.
To add to what Zeta said, it is not necessary in this case. However, in some situations it is necessary to specify the type of the function, because the code is too ambiguous for the type to be inferred.
For documentation purposes, and because automatic inference fails for some type system extensions.

Equality operator while checking condition in C++

Is there a difference between these two conditions:
if (a==5) and if (5==a)?
No, there is no difference at all.
People used to write 5==a instead of a==5 so they could catch a=5 errors in C/C++, where that expression is perfectly valid and always evaluates to true. That way, if the programmer writes 5=a by mistake, they get a compiler error.
The two are normally the same.
Some people recommend putting the constant first (if (5==a)) because this way, if you mis-type and leave out one of the = to get: if (5=a), the compiler will give an error message, whereas if (a=5) will compile and execute, but probably not do what you want.
Some compilers will give a warning for the latter (e.g., recent versions of GCC do), but others don't (and Visual C++ is among those that don't).
If a is an object of a type that overloads ==, then a==5 and 5==a may resolve to different overloads, so in theory you may get different results.
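A contrived C++ sketch of that last point follows. The type Odd and its two deliberately inconsistent overloads are hypothetical; well-behaved code would make both operand orders agree.

```cpp
#include <cassert>

// Hypothetical type whose two == overloads deliberately disagree.
struct Odd {
    int value;
};

// Selected for `a == 5`: compares the stored value.
bool operator==(const Odd& lhs, int rhs) { return lhs.value == rhs; }

// Selected for `5 == a`: intentionally inconsistent, always false.
bool operator==(int, const Odd&) { return false; }

// Small usage example.
void operand_order_examples() {
    Odd a{5};
    assert(a == 5);      // first overload: 5 == 5
    assert(!(5 == a));   // second overload: always false
}
```

For built-in types such as int there is no overload to pick, so the two spellings are equivalent.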

Why do some languages require a function to be declared in code before it is called?

Suppose you have this pseudo-code
do_something();
function do_something(){
print "I am saying hello.";
}
Why do some programming languages require the call to do_something() to appear below the function declaration in order for the code to run?
Programming languages use a symbol table to hold the various classes, functions, etc. that are used in the source code. Some languages compile in a single pass, whereby the symbols are pulled out of the symbol table as soon as they are used. Others use two passes, where the first pass is used to populate the table, and then the second is used to find the entries.
Most languages with a static type system are designed to require definition before use, which means there must be some sort of declaration of a function before the call so that the call can be checked (e.g., is the function getting the right number and types of arguments). This sort of design helps both a person and a compiler reading the program: everything you see has already been defined. The ease of reading and the popularity of one-pass compilers may explain the popularity of this design rule.
Unfortunately definition before use does not play well with mutual recursion, and so language designers resorted to an ugly hack whereby you have
Declaration (sometimes called a "forward declaration" from the keyword in Pascal)
Use
Definition
You see the same phenomenon at the type level in C in the form of the "incomplete struct declaration."
Around 1990 some language designers figured out that the one-pass compiler with no abstract-syntax tree should be a thing of the past, and two very nice designs from that era, Modula-3 and Haskell, got rid of definition before use: in those languages, any defined function or variable is visible throughout its scope, including parts of the program textually before the definition. In other words, mutual recursion is the default for both types and functions. Good on them, I say: these languages have no ugly and unnecessary forward declarations.
Why [have definition before use]?
Easy to write a one-pass compiler in 1975.
Without definition before use, you have to think harder about mutual recursion, especially mutually recursive type definitions.
Some people think it makes it easier for a person to read the code.
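The declaration / use / definition pattern described above can be sketched in C++, a language that does require a declaration before use. The functions is_even, is_odd and parity_examples are illustrative names only.

```cpp
#include <cassert>

// Declaration: tells the compiler that is_odd exists and what its type is,
// before its definition has been seen.
bool is_odd(unsigned n);

// Use: is_even may call is_odd thanks to the declaration above.
bool is_even(unsigned n) { return n == 0 ? true : is_odd(n - 1); }

// Definition: supplies the body that the earlier declaration promised.
bool is_odd(unsigned n) { return n == 0 ? false : is_even(n - 1); }

// Small usage example.
void parity_examples() {
    assert(is_even(10));
    assert(is_odd(7));
}
```

Without the first line, the call to is_odd inside is_even would be rejected, because the two functions are mutually recursive and one of them has to come first in the file.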

Why do a lot of programming languages put the type *after* the variable name?

I just came across this question in the Go FAQ, and it reminded me of something that's been bugging me for a while. Unfortunately, I don't really see what the answer is getting at.
It seems like almost every non-C-like language puts the type after the variable name, like so:
var : int
Just out of sheer curiosity, why is this? Are there advantages to choosing one or the other?
There is a parsing issue, as Keith Randall says, but it isn't what he describes. The "not knowing whether it is a declaration or an expression" simply doesn't matter - you don't care whether it's an expression or a declaration until you've parsed the whole thing anyway, at which point the ambiguity is resolved.
Using a context-free parser, it doesn't matter in the slightest whether the type comes before or after the variable name. What matters is that you don't need to look up user-defined type names to understand the type specification - you don't need to have understood everything that came before in order to understand the current token.
Pascal syntax is context-free - if not completely, at least WRT this issue. The fact that the variable name comes first is less important than details such as the colon separator and the syntax of type descriptions.
C syntax is context-sensitive. In order for the parser to determine where a type description ends and which token is the variable name, it needs to have already interpreted everything that came before so that it can determine whether a given identifier token is the variable name or just another token contributing to the type description.
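A small sketch of that context sensitivity, written as C++ (the same effect exists in C with typedef). The names T, t_names_a_type, t_names_a_variable and context_examples are made up for illustration; the point is that the tokens T * x mean different things depending on what T was declared as earlier.

```cpp
#include <cassert>

using T = int;   // at this scope, T names a type

bool t_names_a_type() {
    T * x = nullptr;     // parsed as a declaration: x is a pointer to int
    return x == nullptr;
}

int t_names_a_variable() {
    int T = 6, x = 7;    // in this scope, T names a variable and hides the type
    return T * x;        // the same tokens are now parsed as a multiplication
}

// Small usage example.
void context_examples() {
    assert(t_names_a_type());
    assert(t_names_a_variable() == 42);
}
```

A parser cannot classify T * x without consulting the declarations that came before it, which is the symbol-table feedback described above.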
Because C syntax is context-sensitive, it is very difficult (if not impossible) to parse using traditional parser-generator tools such as yacc/bison, whereas Pascal syntax is easy to parse using the same tools. That said, there are parser generators now that can cope with C and even C++ syntax. Although it's not properly documented or in a 1.? release etc, my personal favorite is Kelbt, which uses backtracking LR and supports semantic "undo" - basically undoing additions to the symbol table when speculative parses turn out to be wrong.
In practice, C and C++ parsers are usually hand-written, mixing recursive descent and precedence parsing. I assume the same applies to Java and C#.
Incidentally, similar issues with context sensitivity in C++ parsing have created a lot of nasties. The "Alternative Function Syntax" for C++0x is working around a similar issue by moving a type specification to the end and placing it after a separator - very much like the Pascal colon for function return types. It doesn't get rid of the context sensitivity, but adopting that Pascal-like convention does make it a bit more manageable.
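For reference, the alternative function syntax ended up in C++11 as the trailing return type. A minimal sketch, with function names invented for illustration:

```cpp
#include <cassert>
#include <type_traits>

// Classic syntax: the return type comes first.
double scale_classic(int a, double b) { return a * b; }

// Trailing return type: an `auto` placeholder first, the type after `->`.
// The parameters a and b are already in scope at that point, so decltype
// can refer to them, which the classic position does not allow.
auto scale_trailing(int a, double b) -> decltype(a * b) { return a * b; }

static_assert(std::is_same<decltype(scale_trailing(1, 1.0)), double>::value,
              "int * double yields double");

// Small usage example.
void trailing_return_examples() {
    assert(scale_classic(2, 1.5) == 3.0);
    assert(scale_trailing(2, 1.5) == 3.0);
}
```

The -> here plays much the same role as the Pascal colon: the return type is written after a separator, once the names it depends on are known.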
The 'most other' languages you speak of are those that are more declarative. They aim to allow you to program more along the lines you think in (assuming you aren't boxed into imperative thinking).
Type last reads as 'create a variable called NAME of type TYPE'.
This is, of course, the opposite of saying 'create a TYPE called NAME', but when you think about it, what the value is for is more important than the type; the type is merely a programmatic constraint on the data.
If the name of the variable starts at column 0, it's easier to find the name of the variable.
Compare
QHash<QString, QPair<int, QString> > hash;
and
hash : QHash<QString, QPair<int, QString> >;
Now imagine how much more readable your typical C++ header could be.
In formal language theory and type theory, it's almost always written as var: type. For instance, in the typed lambda calculus you'll see proofs containing statements such as:
x : A    y : B
-------------
\x.y : A->B
I don't think it really matters, but I think there are two justifications: one is that "x : A" is read "x is of type A", the other is that a type is like a set (e.g. int is the set of integers), and the notation is related to "x ∈ A".
Some of this stuff pre-dates the modern languages you're thinking of.
An increasing trend is to not state the type at all, or to optionally state the type. This could be a dynamically typed language where there really is no type on the variable, or it could be a statically typed language which infers the type from the context.
If the type is sometimes given and sometimes inferred, then it's easier to read if the optional bit comes afterwards.
There are also trends related to whether a language regards itself as coming from the C school or the functional school or whatever, but these are a waste of time. The languages which improve on their predecessors and are worth learning are the ones that are willing to accept input from all different schools based on merit, not be picky about a feature's heritage.
"Those who cannot remember the past are condemned to repeat it."
Putting the type before the variable started innocuously enough with Fortran and Algol, but it got really ugly in C, where some type modifiers are applied before the variable, others after. That's why in C you have such beauties as
int (*p)[10];
or
void (*signal(int x, void (*f)(int)))(int)
together with a utility (cdecl) whose purpose is to decrypt such gibberish.
In Pascal, the type comes after the variable, so the first example becomes
p: pointer to array[10] of int
Contrast with
q: array[10] of pointer to int
which, in C, is
int *q[10]
In C, you need parentheses to distinguish this from int (*p)[10]. Parentheses are not required in Pascal, where only the order matters.
The signal function would be
signal: function(x: int, f: function(int) to void) to (function(int) to void)
Still a mouthful, but at least within the realm of human comprehension.
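The C declarations above can be checked mechanically. The sketch below uses C++ type aliases to spell the same types in a single reading direction, and static_assert to confirm they match; my_signal, my_signal_alias, IntArray10, Handler and declarator_examples are hypothetical names, and the two functions are only declared, never defined.

```cpp
#include <cassert>
#include <type_traits>

int (*p)[10];   // pointer to array of 10 int
int *q[10];     // array of 10 pointers to int

// Aliases let each type be read in one direction.
using IntArray10 = int[10];
using Handler = void (*)(int);   // pointer to function(int) returning void

static_assert(std::is_same<decltype(p), IntArray10*>::value,
              "p is a pointer to int[10]");
static_assert(std::is_same<decltype(q), int* [10]>::value,
              "q is an array of 10 int*");

// The signal-style declaration from the text, and the same type via the alias.
void (*my_signal(int x, void (*f)(int)))(int);
Handler my_signal_alias(int x, Handler f);

static_assert(std::is_same<decltype(my_signal), decltype(my_signal_alias)>::value,
              "both declarations have the same function type");

// Small usage example.
void declarator_examples() {
    assert(p == nullptr);                      // globals are zero-initialized
    assert(sizeof(q) == 10 * sizeof(int*));    // q really holds ten pointers
}
```

The alias version still puts the type before the name, but it avoids wrapping pieces of the type around the name.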
In fairness, the problem isn't that C put the types before the name, but that it perversely insists on putting bits and pieces before, and others after, the name.
But if you try to put everything before the name, the order is still unintuitive:
int [10] a // an int, ahem, ten of them, called a
int [10]* a // an int, no wait, ten, actually a pointer thereto, called a
So, the answer is: A sensibly designed programming language puts the variables before the types because the result is more readable for humans.
I'm not sure, but I think it's got to do with the "name vs. noun" concept.
Essentially, if you put the type first (such as "int varname"), you're declaring an "integer named 'varname'"; that is, you're giving an instance of a type a name. However, if you put the name first, and then the type (such as "varname : int"), you're saying "this is 'varname'; it's an integer". In the first case, you're giving an instance of something a name; in the second, you're defining a noun and stating that it's an instance of something.
It's a bit like if you were defining a table as a piece of furniture; saying "this is furniture and I call it 'table'" (type first) is different from saying "a table is a kind of furniture" (type last).
It's just how the language was designed. Visual Basic has always been this way.
Most (if not all) curly brace languages put the type first. This is more intuitive to me, as the same position also specifies the return type of a method. So the inputs go into the parentheses, and the output goes out the back of the method name.
I always thought the way C does it was slightly peculiar: instead of constructing types, the user has to declare them implicitly. It's not just before/after the variable name; in general, you may need to embed the variable name among the type attributes (or, in some usage, to embed an empty space where the name would be if you were actually declaring one).
As a weak form of pattern-matching, it is intelligible to some extent, but it doesn't seem to provide any particular advantages, either. And, trying to write (or read) a function pointer type can easily take you beyond the point of ready intelligibility. So overall this aspect of C is a disadvantage, and I'm happy to see that Go has left it behind.
Putting the type first helps in parsing. For instance, in C, if you declared variables like
x int;
When you parse just the x, then you don't know whether x is a declaration or an expression. In contrast, with
int x;
When you parse the int, you know you're in a declaration (types always start a declaration of some sort).
Given progress in parsing languages, this slight help isn't terribly useful nowadays.
Fortran puts the type first:
REAL*4 I,J,K
INTEGER*4 A,B,C
And yes, there's a (very feeble) joke there for those familiar with Fortran.
There is room to argue that this is easier than C, which puts the type information around the name when the type is complex enough (pointers to functions, for example).
What about dynamically (cheers @wcoenen) typed languages? You just use the variable.
