Why is it illegal for variables to start with numbers? - programming-languages

Why is it illegal for variables to start with numbers? I know it's a convention, but what's the reason?
Edit:
I mean variables like "1foo" or "23bar" not only numbers like "3"

Because the lexer in most languages will assume you are trying to specify a numeric literal. And then you could declare variables that are indistinguishable from numeric literals, creating a huge bombshell of ambiguity.

Pop quiz: in a hypothetical language that permits a variable to begin with a number, what is this?
0xDEADBEEF
In C (and related languages) this can only be a hexadecimal number. If a language allows a variable name to begin with a digit, that could be a variable or a hexadecimal number. That's one quick example of potentially millions.
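The ambiguity can be sketched with two token rules. This is a minimal illustration, not how any real lexer is written; NUMBER, IDENT_STRICT, IDENT_LOOSE and classify are hypothetical names, and IDENT_LOOSE stands in for the hypothetical language that lets identifiers begin with a digit.

```python
import re

# Numeric literals: hexadecimal (0x...) or plain decimal digits.
NUMBER = re.compile(r"0[xX][0-9a-fA-F]+|\d+")
# The usual rule: an identifier may not begin with a digit.
IDENT_STRICT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Hypothetical rule: digits are allowed in the first position too.
IDENT_LOOSE = re.compile(r"[A-Za-z0-9_]+")


def classify(token, ident_rule):
    """Return every token kind that the whole token could be."""
    kinds = []
    if NUMBER.fullmatch(token):
        kinds.append("number")
    if ident_rule.fullmatch(token):
        kinds.append("identifier")
    return kinds


strict = classify("0xDEADBEEF", IDENT_STRICT)  # only one reading
loose = classify("0xDEADBEEF", IDENT_LOOSE)    # two readings: ambiguous
```

Under the strict rule the token has exactly one reading; under the loose rule the lexer cannot decide from the token alone.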

Numbers are interpreted 'as is' without any syntax whereas strings/characters are mostly represented with quotes.
So the program can tell the difference between a variable name made of characters and a string of characters, but the same does not hold for numerals.

One reason, probably the most obvious one, is that it would make your life more difficult, without bringing anything reasonably useful to the table. For example, in C, you wouldn't be able to tell whether a string of digits is an identifier or a numeric literal.
int 10 = 15;
int 15 = 10 + 5;
In the second line, is 10 a variable holding the numeric literal 15 or is it the numeric literal 10?
Another reason is that allowing a variable name to begin with a digit makes error checking during compilation a lot more complicated, again, without bringing anything reasonably useful to the table.

In languages such as Prolog, Erlang, and some early versions of Fortran, you very nearly got to do this, for completely different reasons.
Prolog/Erlang don't have variable assignment, they have unification. IIRC, if X is a variable, then code following 2 = X, or X = 2 is processed if X may have the value 2. So if X is already unified with a value, then that value must be 2, and if not, X becomes 2 from then on. So writing 3 = 3 is fine - it should become a no-op, and 2 = 3 always fails - either a non-match in Prolog or (I think) a runtime error in Erlang. Numbers behave like variables which have already been unified with the value the numbers represent.
In early Fortran (apologies, I haven't used Fortran in twenty years and have forgotten its syntax), all function arguments were passed by reference, so if you had a function equivalent to void foo ( int &x ) { x = 3; } and called it with a number, the compiler would store the number in a static variable and pass that. So calling foo (2) would set that static stored value of 2 to 3. If it happened to use the same static variable for the literal 2 somewhere else, such as calling another function with the literal 2, then the value passed to the second function would be 3 instead.
So you can have variables which are syntactically identical to numbers, as long as they are automatically initialised to the value of the literal. But if you allow them to be mutable rather than pure variables, weirdness abounds.
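The unification behaviour described above for Prolog/Erlang can be sketched roughly as follows. This is a toy model under stated assumptions, not how either language is implemented: Var, walk and unify are hypothetical names, and only variables and plain values are handled (no compound terms).

```python
class Var:
    """A logic variable; it is unbound until it appears in the bindings."""

    def __init__(self, name):
        self.name = name


def walk(term, bindings):
    """Follow bindings until reaching a value or an unbound variable."""
    while isinstance(term, Var) and term in bindings:
        term = bindings[term]
    return term


def unify(a, b, bindings):
    """Try to make a and b equal, binding variables where needed."""
    a = walk(a, bindings)
    b = walk(b, bindings)
    if isinstance(a, Var):
        if a is not b:
            bindings[a] = b
        return True
    if isinstance(b, Var):
        bindings[b] = a
        return True
    # Two plain values: they act like variables already bound to themselves.
    return a == b


X = Var("X")
env = {}
first = unify(2, X, env)    # X is unbound, so it becomes 2
second = unify(X, 2, env)   # X is already 2, so this succeeds
third = unify(X, 3, env)    # X is 2, so this fails
```

A number on either side simply takes the last branch, which is why 3 = 3 succeeds and 2 = 3 fails.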

Related

Are operations like `2*7` considered literals?

I just had a small question.
Are operations considered literals? Would 2*7, for example, be a literal? Is "hello, " + "world!" a literal?
I know the operands are literals, but the expression is not explicitly 14 or "hello, world!".
The question "Is 2+3 considered as a literal?" asks basically what I am asking, but most answers weren't even helpful: all they do is break the variable declaration down or talk about what compilers do with them. I'm not looking for that, so I would like a more in-depth explanation.
Thank you
It will depend on the language and the compiler, sorry. But just using the concept that a literal is a kind of token, then no, the result is a compile-time constant, not a token.
In C/C++ 2*7 will be optimised by the compiler to make a new constant but it isn't actually legally defined as a literal, though it can be treated as a compile-time constant.
Concatenating "hello" "world" (note no plus) is actually described as a preprocessing step in C++, so it does generate a new literal constant string, but in original C this didn't work.
But note that in C, a macro will treat the parameter phrase 2+7 as separate tokens, so with #define STUPIDMUL3(val) 3 * val, STUPIDMUL3(2+7) expands to 3 * 2+7 and gives the answer 13, not 27. If you could find a way to force macros to treat the two halves of the string differently, I think it would.
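The macro arithmetic can be checked with a rough simulation. This is only an approximation: the C preprocessor substitutes tokens, whereas the hypothetical expand helper below substitutes text, but for this example the effect is the same.

```python
def expand(template, param, argument):
    """Crude stand-in for macro expansion: paste the argument in place."""
    return template.replace(param, argument)


naive = expand("3 * val", "val", "2+7")        # "3 * 2+7"
guarded = expand("3 * (val)", "val", "2+7")    # "3 * (2+7)"

# eval is acceptable here only because the inputs are fixed arithmetic strings.
naive_value = eval(naive)      # multiplication binds tighter than addition
guarded_value = eval(guarded)  # parentheses keep the argument together
```

Wrapping the parameter in parentheses inside the macro body is the usual way to get the product of 3 and the whole argument.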
I would expect an interpreter to take longer to process 2*7 than it would 14 because it might interpret and solve it every time.
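The "a literal is a kind of token" view can be checked in a language that exposes its syntax tree. The sketch below uses Python's ast module, so it shows Python's grammar only and says nothing directly about C or C++; is_single_literal is a hypothetical helper name.

```python
import ast


def is_single_literal(source):
    """True if the expression parses to one constant node, not an operation."""
    node = ast.parse(source, mode="eval").body
    return isinstance(node, ast.Constant)


plain = is_single_literal("14")
product = is_single_literal("2*7")
plus_concat = is_single_literal('"hello, " + "world!"')
adjacent_concat = is_single_literal('"hello, " "world!"')
```

Here 14 and the adjacent string literals (no plus) each parse to a single constant, while 2*7 and the + concatenation parse to an operation whose operands are literals, even if the compiler later folds the result into a constant.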

What are the rules for cpython's string interning?

In Python 3.5, is it possible to predict when we will get an interned string and when we will get a copy? After reading a few Stack Overflow answers on this issue, I found this one the most helpful, but it is still not comprehensive. Then I looked at the Python docs, but interning is not guaranteed by default:
Normally, the names used in Python programs are automatically interned, and the dictionaries used to hold module, class or instance attributes have interned keys.
So my question is about the internal conditions for interning, i.e. the decision-making (whether to intern a string literal or not): why does the same piece of code work on one system and not on another, and what rules did the author of the answer on the mentioned topic mean when saying
the rules for when this happens are quite convoluted
You think there are rules?
The only rule for interning is that the return value of intern is interned. Everything else is up to the whims of whoever decided some piece of code should or shouldn't do interning. For example, "left" gets interned by PyCode_New:
    /* Intern selected string constants */
    for (i = PyTuple_GET_SIZE(consts); --i >= 0; ) {
        PyObject *v = PyTuple_GetItem(consts, i);
        if (!all_name_chars(v))
            continue;
        PyUnicode_InternInPlace(&PyTuple_GET_ITEM(consts, i));
    }
The "rule" here is that a string object in the co_consts of a Python code object gets interned if it consists purely of ASCII characters that are legal in a Python identifier. "left" gets interned, but "as,df" wouldn't be, and "1234" would be interned even though an identifier can't start with a digit. While identifiers can contain non-ASCII characters, such characters are still rejected by this check. Actual identifiers don't ever pass through this code; they get unconditionally interned a few lines up, ASCII or not. This code is subject to change, and there's plenty of other code that does interning or interning-like things.
Asking us for the "rules" for string interning is like asking a meteorologist what the rules are for whether it rains on your wedding. We can tell you quite a lot about how it works, but it won't be much use to you, and you'll always get surprises.
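The character check described above (ASCII letters, digits and underscore only) can be mirrored in Python to see which constants would pass it. This is a sketch of the described rule, not CPython's code: NAME_CHARS and the Python all_name_chars below are hypothetical stand-ins for the C helper, and since the answer says this code is subject to change, other CPython versions may behave differently.

```python
import string

# Characters accepted by the check as described: ASCII letters, digits, "_".
NAME_CHARS = frozenset(string.ascii_letters + string.digits + "_")


def all_name_chars(text):
    """True if every character is an ASCII identifier character."""
    return all(ch in NAME_CHARS for ch in text)


candidates = ["left", "as,df", "1234", "na\u00efve"]
would_pass = {text: all_name_chars(text) for text in candidates}
```

Note that "1234" passes even though it is not a valid identifier, and the non-ASCII string is rejected even though it is one.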
From what I understood from the post you linked:
When you use if a == b, you are checking if the value of a is the value of b, whereas when you use if a is b, you are checking if a and b are the same object (or share the same spot in the memory).
Now python interns the constant strings (defined by "blabla").
So:
>>> a = "abcdef"
>>> a is "abcdef"
True
But when you do:
>>> a = "".join([chr(i) for i in range(ord('a'), ord('g'))])
>>> a
'abcdef'
>>> a is "abcdef"
False
In the C programming language, using a string with "" will make it a const char *. I think this is what is happening here.
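The one behaviour that can be relied on, per the answer above, is that the return value of intern is interned. A minimal sketch: the identity of the two runtime-built strings is deliberately not checked, because it is an implementation detail.

```python
import sys

# Two equal strings built at runtime rather than written as literals.
a = "".join(["abc", "def"])
b = "".join(["abc", "def"])

same_value = (a == b)  # value comparison: always True here

# sys.intern returns the canonical object for that string value.
ia = sys.intern(a)
ib = sys.intern(b)
same_object_after_intern = ia is ib
```

For comparing string contents, == is the right tool; is only answers whether two names refer to the same object.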

Why is the keyword `string` used to verify a variable type

For example, suppose we have a variable named i set to 10. To check whether it is an integer, in Tcl one types: string is integer $i.
Why is the keyword string there? Does it mean the same as in Python and C++? How do I check whether a Tcl string (in the sense of a sequence of characters) is a string? string is string $myString does not work because string is not a class in Tcl.
Tcl doesn't have types. Or rather it does, but they're all serializable to strings and that happens magically behind the scenes; it looks like it doesn't have types, and you're not supposed to talk about them. Tcl does have classes, but they're not used for types of atomic values; something like 1.3 is not an instance of an object, it's just a value (often of floating point type, but it could also be a string or a singleton list or version identifier, or even a command name or variable name if you really want). Tcl's classes define objects that are commands, and those are (deliberately!) heavyweight entities.
The string is family of tests check whether a value meets the requirements for being interpreted as a particular kind of value. There's quite a few kinds of value, some of which make no sense as types at all (e.g., an all-uppercase string). There's nothing for string is string because everything you can ask that about would automatically pass; all values are already strings, or may be transparently converted to them.
There's exactly one way to probe what the type of a value currently is, and that is the command ::tcl::unsupported::representation (8.6 only). That reports the current type of a value as part of its output, and you're not supposed to rely on it (there's quite a few types under the hood, many of which are pretty obscure unless you know a lot about Tcl's implementation).
% set s 1.3
1.3
% ::tcl::unsupported::representation $s
value is a pure string with a refcount of 4, object pointer at 0x100836ca0, string representation "1.3"
% expr {$s + 3}
4.3
% ::tcl::unsupported::representation $s
value is a double with a refcount of 4, object pointer at 0x100836ca0, internal representation 0x3ff4cccccccccccd:0x0, string representation "1.3"
As you can see, types are pretty flexible. You're supposed to ignore them. We mean it. Make your code demand the types it needs, and throw an error if it can't get them. That's what Tcl's C API does for you.
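As a loose analogy only (this is Python, not Tcl, and the accepted spellings differ from what Tcl's string is integer accepts), the idea of asking whether a string can be interpreted as a given kind of value, and of demanding the type you need or raising an error, might look like this. is_integer and add_three are hypothetical names.

```python
def is_integer(text):
    """Ask whether the text can be interpreted as an integer."""
    try:
        int(text)
    except ValueError:
        return False
    return True


def add_three(text):
    """Demand a number; float() raises ValueError if it cannot get one."""
    return float(text) + 3


ten_ok = is_integer("10")
real_ok = is_integer("1.3")
word_ok = is_integer("ten")
total = add_three("1.3")
```

The value stays a string throughout; the check only reports whether an interpretation is possible.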

Interpret strings as variable names in Fortran [duplicate]

This question already has answers here:
Determine variable names dynamically according to a string in Fortran
(4 answers)
Closed 5 years ago.
I'd like to access a real variable with a name equal to a string of characters that I have. Something like this (I'll make the example as clean as possible):
character(len=5) :: some_string
real :: value
value = 100.0
some_string = 'value'
At this point, how do I create an association between the character array value and the name of my real variable, value, so that I can write the value of 100.0 by referring to the string some_string?
That's pretty much not going to happen in Fortran. There are no "dynamic" language features like this available in the language. Variable names are a compile-time only thing, and simply don't exist at runtime (the names have been translated to machine addresses by the compiler).
This is how I work around this:
character(100) :: s
integer :: val
val = 100
write(s,*) val
print *,trim(s)
This prints 100 to the screen. There is some strangeness which I do not understand, however: the character variable s needs to be very large (100 in this case). For instance, if you use 3 instead of 100, it does not work. This is not a critical thing, as the use of trim fixes this, but it would be nice if somebody could answer why this is the case.
Either way, this should work.

What is a strictly typed language? [duplicate]

This question already has answers here:
What are the key aspects of a strongly typed language?
(8 answers)
Closed 1 year ago.
What is a strictly typed language?
Strictly typed languages enforce typing on all data being interacted with.
For example
int i = 3
string s = "4"
From here on out, whenever you use i, you can only interact with it as an integer type. That means you are restricted to using it with methods that work with integers.
As for string s, you can only interact with it as a string type. You can concatenate it with other strings, print it out, etc. However, even though it contains the character "4", you cannot add it to an integer without using some function to convert the string to an integer type.
In a dynamically typed language, you have a lot more flexibility:
i = 3
s = "4"
Types are inferred, meaning they are determined based on the data they are set to. i is ostensibly a number type, and s is a string type, based on how they were set. However, when you have i + s, type inference is used and, depending on your environment, you may get the result i + s = 7, since s was implicitly converted to an int by the programming environment. This operation could also result in the string "34" if the environment infers that an int + string should be a concatenation rather than an addition.
This flexibility has made loosely typed languages very popular. However, because this type inference can sometimes produce unexpected results, it can also result in more bugs in your code if you're not careful. In a typed language, if I perform i + s, I am forced by the compiler to change s into an int first, so I know that by adding i to s I will get 7, because I was forced to convert s to an explicit int first. A dynamic language attempts to do this for you implicitly, but the results may not be what you were expecting, since anything can be in i or s: a string, a number, or even an object. You don't know until you run your code and see what happens.
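Not every dynamically typed language converts implicitly. As one concrete data point, Python refuses i + s at runtime and requires the conversion to be spelled out, which is how either 7 or "34" can be obtained on purpose. A minimal sketch reusing the i and s from the example above:

```python
i = 3
s = "4"

# Mixing int and str directly is rejected at runtime with a TypeError.
try:
    i + s
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Explicit conversions make the intended operation unambiguous.
as_number = i + int(s)  # numeric addition
as_text = str(i) + s    # string concatenation
```

Which result you get is decided by the conversion you write, not by the environment.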
I tried to look up "strict typing" and wasn't able to find a definitive definition for the term. Perhaps it refers to a strongly typed language?
Strong typing refers to a type system that restricts which operations can be performed between two variables of different types. For example, in a very strongly typed language, trying to add a string and a number may lead to an error.
string s;
number n;
s + n; <-- Type error.
The error may occur at compile time for statically typed languages or at runtime for dynamically typed languages. It should be noted that although static/dynamic and strong/weak may sound like similar concepts, they are quite different.
A less strongly typed language may allow casting of variables to allow operations between variables originating from different types:
s + (string)n; <-- Allowed, as (number) has been explicitly
cast to (string), so variable types match.
In a weakly typed language, variables of differing types may be automatically cast to compatible types.
s + n; <-- Allowed, where the language will cast
the (number) to (string)
Perhaps, the "strictly typed language" refers to a very strongly typed language in which there are more strict restrictions as to how operations can be performed on variables of different types.
There are dissenting opinions about how strong or weak various type systems are, but I've generally heard "strictly typed programming language" to mean a very strongly typed programming language. This often describes the static type systems found in several functional languages.
Languages where variables must be declared to contain a specific type of data.
If your variable declarations look like:
String myString = "Fred";
then your language is strictly typed, variable "myString" is explicitly declared to contain only string data.
If the following works:
x = 10;
x = "Fred";
then it's loosely typed (two different types of data in the same variable and scope).
Languages where '1' + 3 would be illegal, because it's adding a string to an integer.
