In which programming language did semicolon first appear as a separator? - history

In similar manner to this question about generics: In which programming language did the semicolon make its first appearance, and how did it spread to C?

Definitely ALGOL.
Hmm, somebody changed the question. That's not quite cricket.
As for how it spread: semicolons spread Algol 60 -> Simula -> C. K&R said that Pascal didn't influence them, IIRC, though some have disputed this claim.
Statement terminators (other than new-line) spread COBOL -> Jovial -> C, though each had a different character as the terminator.

It's not a separator in C - it's a terminator.
However, I believe ALGOL may have been the first to use the semicolon in this sort of way.

Pascal had semicolons as terminators before C did; not sure if it was the first language to have them, though.

ALGOL is my guess too.
The significance is that it freed the user from punch-card-style fixed format.
If you have to use Fortran 77, you know what that means.

Why can't an identifier start with a number?

I have a file named 1_add.rs, and I tried to add it into lib.rs. Yet I got the following error during compilation.
error: expected identifier, found `1_add`
 --> src/lib.rs:1:5
  |
1 | mod 1_add;
  |     ^^^^^ expected identifier
It seems an identifier that starts with a digit is invalid. But why does Rust have this restriction? Is there any workaround if I want to indicate the sequence of different Rust files (for managing the exercise files)?
In your case (you want to name the files like 1_foo.rs) you can write
#[path="1_foo.rs"]
mod mod_1_foo;
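As a hedged sketch of how this looks from lib.rs, applied to the question's 1_add.rs (the second file is hypothetical, just to show the pattern scales):

// lib.rs -- a sketch; #[path] maps a digit-first file name
// onto a valid module identifier
#[path = "1_add.rs"]
mod mod_1_add;

#[path = "2_mul.rs"] // hypothetical further exercise file
mod mod_2_mul;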
Allowing identifiers to start with digits would conflict with type annotations on literals. E.g.
let foo = 1_u32;
sets the type to u32. It would be confusing if 1_u256 could instead name a variable.
But why does Rust have this restriction?
Not only Rust; almost every language I've written a line of code in has this restriction as well.
Food for thought:
let a = 1_2;
Is 1_2 a variable name, or is it a literal for the value 12? What if the variable 1_2 does not exist now, but you add it later? Does this token then stop being a number literal?
While the Rust compiler could probably make it work, it's not worth all the confusion, IMHO.
Allowing identifiers to start with a digit would cause conflicts with many other token types. Here are a few examples:
1e1 is a floating point number.
0x0 is a hexadecimal integer.
8u8 is an integer with an explicit type annotation.
Most importantly, though, I believe allowing identifiers to start with a digit would hurt readability. Currently everything starting with a digit is some kind of number, which in my opinion helps when reading code.
An incomplete list of programming languages not allowing identifiers to start with a digit: Python, Java, JavaScript, C#, Ruby, C, C++, Pascal. I can't think of a language that does allow this (though one most likely exists).
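To make those examples concrete, here is a minimal sketch (plain Rust, compiles as-is) of how a leading digit commits the lexer to reading a number:

fn main() {
    let a = 1e1; // floating point: 10.0
    let b = 0x0; // hexadecimal integer: 0
    let c = 8u8; // integer with an explicit type suffix
    let d = 1_2; // underscore digit separator: the integer 12, not an identifier
    println!("{a} {b} {c} {d}"); // prints: 10 0 8 12
}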
Rust identifiers are based on Unicode® Standard Annex #31
(see The Rust RFC Book), which standardizes some common rules for identifiers in programming languages. It might make it easier to parse text that could otherwise be ambiguous, like 1e10?
"Why?" cannot be reasoned here but by historical tales, the rules are as such. You cannot play against them.
If you urgently want to start your identifiers with a digit, at least for human readers, prepend an underscore like this: _1_add.
Note: To make sure that sorting works well, also use as many leading zeroes as appropriate (_001_add if you expect more than 99 files).

Why do programming languages use commas to separate function parameters?

It seems like all programming languages use commas (,) to separate function parameters.
Why don't they use just spaces instead?
Absolutely not. What about this function call:
function(a, b - c);
How would that look with a space instead of the comma?
function(a b - c);
Does that mean function(a, b - c); or function(a, b, -c);? The use of the comma presumably comes from mathematics, where commas have been used to separate function parameters for centuries.
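A hedged sketch of the same point (Rust here, but any C-family language behaves alike; the names are illustrative): the comma leaves exactly one possible reading.

fn function(a: i32, b: i32) -> i32 { a + b }

fn main() {
    let (a, b, c) = (1, 2, 3);
    // With the comma there is only one reading: two arguments, a and (b - c).
    println!("{}", function(a, b - c));
    // A space instead, as in `function(a b - c)`, could mean
    // function(a, b - c) or function(a, b, -c).
}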
First of all, your premise is false. There are languages that use space as a separator (Lisp, ML, Haskell, possibly others).
The reason that most languages don't is probably that a) f(x,y) is the notation most people are used to from mathematics and b) using spaces leads to lots of nested parentheses (also called "the lisp effect").
Lisp-like languages use: (f arg1 arg2 arg3) which is essentially what you're asking for.
ML-like languages use juxtaposition to apply curried arguments, so you would write f arg1 arg2 arg3.
Tcl uses space as a separator between words passed to commands. Where it has a composite argument, that has to be bracketed or otherwise quoted. Mind you, even there you will find the use of commas as separators – in expression syntax only – but that's because the notation is in common use outside of programming. Mathematics has written n-ary function applications that way for a very long time; computing (notably Fortran) just borrowed.
You don't have to look further than most of our natural languages to see that the comma is used for separating items in lists. So, using anything other than a comma for enumerating parameters would be unexpected for anyone learning a programming language for the first time.
There's a number of historical reasons already pointed out.
Also, in most languages where , serves as the separator, whitespace sequences are largely ignored; or, to be more exact, although they may separate tokens, they do not act as tokens themselves. This is more or less true for all languages deriving their syntax from C. A sequence of whitespace is much like the empty word, and having the empty word delimit anything is probably not the best of ideas.
Also, I think it is clearer and easier to read. Why have whitespace, an invisible character that essentially serves nothing but formatting, as a really meaningful delimiter? It only introduces ambiguity. One example is the one provided by Carl.
A second would be f(a (b + c)). Now is that f(a(b+c)) or f(a, b+c)?
The creators of JavaScript had a very 'useful' idea, similar to yours, which yields just the same problems. The idea was that ENTER could also serve as ;, if the statement before it was complete. Observe:
function a() {
    return "some really long string or expression or whatsoever";
}
function b() {
    return
        "some really long string or expression or whatsoever";
}
alert(a()); // "some really long string or expression or whatsoever"
alert(b()); // undefined, because 'return;' is a valid statement on its own
As a matter of fact, I sometimes tend to use the latter notation in languages that do not have this 'feature'. JavaScript forces a way of formatting my code upon me, because someone had the cool idea of using ENTER instead of ;.
I think there are a number of good reasons why some languages are the way they are. Especially in dynamic languages (such as PHP), where there's no compile-time check, the compiler couldn't warn you that the way it resolved an ambiguity like the one above doesn't match the signature of the call you want to make. You'd get a lot of weird runtime errors and a really hard life.
There are languages which allow this, but there are a number of reasons why they do so. First and foremost, because a bunch of very clever people sat down, spent quite some time designing a language, discovered that its syntax made the , obsolete most of the time, and thus decided to eliminate it.
This may sound a bit glib, but I gather it's for the same reason most Earth languages use it (English, French, and those few others ;-)). Also, it is intuitive to most.
Haskell doesn't use commas.
Example
multList :: [Int] -> Int -> [Int]
multList (x : xs) y = (x * y) : (multList xs y)
multList [] _ = []
The reason for using commas in C/C++ is that reading a long argument list can be difficult without a separator.
Try reading this
void foo(void * ptr point & * big list<pointers<point> > * t)
Commas are useful like spaces are. In Latin, nothing was written with spaces, periods, or lower-case letters.
Try reading this
IAMTHEVERYMODELOFAWHATDOYOUWANTNOTHATSMYBUCKET
It's primarily to help you read things.
This is not true; some languages don't use commas. Functions were maths concepts before they were programming constructs, so some languages keep the old notation. Then, most of the newer ones have been inspired by C (JavaScript, Java, C#, PHP too), and they share some formal rules, such as the comma.
While some languages do use spaces, using a comma avoids ambiguous situations without the need for parentheses. A more interesting question might be why C uses the same character as a separator as is used for the "a then b" operator; the latter question is in some ways more interesting given that the C character set has at least three other characters that do not appear in any context (dollar sign, commercial-at, and grave), and I know at least one of those (the dollar sign) dates back to the 40-character punchcard set.
It seems like all programming languages use commas (,) to separate function parameters.
In natural languages whose script includes the comma, that character is used to separate things. For instance, if you were to enumerate fruits, you'd write: "lemon, orange, strawberry, grape" - that is, using the comma.
Hence, using the comma to separate parameters in a function is more natural than using some other character (| for instance).
Consider:
someFunction( name, age, location )
vs.
someFunction( name|age|location )
Why don't they use just spaces instead?
That's possible. Lisp does it.
The main reason is that space is already used to separate tokens, and it's easier not to assign it an extra function.
I have programmed in quite a few languages, and while the comma does not rule supreme, it is certainly in front. The comma is good because it is a visible character, so a script can be compressed by removing spaces without breaking things. If you allow spaces, then you can have tabs, and that can be a pain in the ... There are also issues with new-lines and spaces at the end of a line. Give me a comma any day: you can see it and you know what it does. Spaces are for readability (generally) and commas are part of syntax. Mind you, there are plenty of exceptions where a space is required or de rigueur. I also like curly brackets.
It is probably tradition. If they used space, they could not pass an expression as a parameter, e.g.
f(a-b c)
would be very different from
f(a -b c)
Some languages, like Boo, allow you to specify the type of parameters or leave it out, like so:
def MyFunction(obj1, obj2, title as String, count as Int):
    ...do stuff...
Meaning: obj1 and obj2 can be of any type (inherited from object), whereas title and count must be of type String and Int respectively. This would be hard to do using spaces as separators.

Are there programming languages that rely on non-latin alphabets?

Every programming language I have ever seen has been based on the Latin alphabet; this is not surprising considering I live in Canada...
But it only really makes sense that there would be programming languages based on other alphabets, or else bright computer scientists across the world would have to learn a new alphabet to go on in the field. I know for a fact that people in countries dominated by other alphabets develop languages based on the Latin alphabet (e.g. Ruby, from Japan), but just how common is it for programming languages to be based on other alphabets like Arabic or Cyrillic, or even on writing systems which are not alphabetic but rather logographic in nature, such as Japanese kanji?
Also are any of these languages in active widespread use, or are they mainly used as teaching tools?
This is something that has bugged me since I started programming, and I have never run across someone who could think of a real answer.
Have you seen Perl?
APL is probably the most widely known. It even has a cool keyboard overlay (or was it a special keyboard you had to buy?).
In the non-alphabetic category, we also have programming languages like LabVIEW, which is mostly graphical. (You can label objects, and you can still do string manipulation, so there's some textual content.) LabVIEW has been used in data acquisition and automation for years, but gained a bit of popularity when it became the default platform for Lego Mindstorms.
There's a list on Wikipedia. I don't think any of them is really prevalent, though. Many programmers can learn to write programs with English keywords even if they don't understand the language. Ruby is a good example: you'll still see Japanese identifiers and comments in some Ruby code.
Well, Brainf* uses no Latin characters, if you'll pardon the language... and the pun.
Many languages allow Unicode identifiers. It's part of standard Java, and both g++ (though you have to use \uNNNN escapes) and MSVC++ allow them (see also this question). And some allow using #define (or maybe something better) to rename control structures.
But in reality, people don't do this for the most part. See past questions such as Language of variable names?, Should all code be written in English?, etc.
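As one more illustration (hedged; the identifier is just for show), recent Rust also accepts Unicode identifiers, though keywords and standard-library names stay Latin:

fn main() {
    let переменная = 42; // "variable", written in Cyrillic
    println!("{}", переменная);
}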
Agda.
Sample Snippet:
mutual
data ωChain : Set where
_∷_,_ : ∀ (x : carrier) (xω : ∞ ωChain) (p : x ≼ xω) → ωChain
head : ωChain → carrier
head (x ∷ _ , _) = x
_≼_ : carrier → ∞ ωChain → Set
x ≼ xω = x ≤ head (♭ xω)
Well, there's always APL. That has its own Unicode characters, and I believe it used to require a special keyboard, too.
There is one language used in a Russian ERP system, named after the company that developed it: 1C. But its identifiers and operators have English analogs.
Also, I know that Haskell supports Unicode identifiers, so you can write programs in any alphabet. But this is not useful (my native language is Russian). It's quite enough that you have to type program messages and helpful comments in your native alphabet.
Other people are answering with languages that use punctuation marks in addition to Latin letters. I wonder why no one mentioned digits 0 to 9 as well.
In some languages, and in some implementations of some languages, programmers can use a wide range of characters in identifiers, such as Arabic or Chinese characters. This doesn't mean that the language relies on them though.
In most languages, programmers can use a wide range of characters in string literals (in quotation marks) and in comments. Again this doesn't mean that the language relies on them.
In every programming language that I've seen, the language does rely on punctuation marks and digits. So this answers your question but not in the way you expect.
Now let's try to find something meaningful. Is there a programming language where keywords are chosen from non-Latin alphabets? I would guess not, except maybe for joke languages. What would be the point of inventing a programming language that makes it impossible for some programmers to even input a program?
EDIT: My guess is wrong. Besides APL's usage of various invented punctuation marks, it does depend on a few Greek keywords, where each keyword is one letter long, such as the letter rho.
I just found an interesting wiki for "esoteric programming languages".

Why don't popular programming languages use some other character to delimit strings? [closed]

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form.
Closed 13 years ago.
Every programming language I know (Perl, Javascript, PHP, Python, ASP, ActionScript, Commodore Basic) uses single and double quotes to delimit strings.
This creates the ongoing situation of having to go to great lengths to treat quotes correctly, since the quote is extremely common in the contents of strings.
Why do programming languages not use some other character to delimit strings, one that is not used in normal conversation (\, | or { }, for example), so we can just get on with our lives?
Is this true, or am I overlooking something? Is there an easy way to stop using quotes for strings in a modern programming language?
print <<<END
I know about here document syntax, but for minor string manipulation it's overly complicated and it complicates formatting.
END;
[UPDATE] Many of you made a good point about the importance of using only ASCII characters. I have updated the examples to reflect that (the backslash, the pipe and braces).
Perl lets you use whatever characters you like
"foo $bar" eq
qq(foo $bar) eq
qq[foo $bar] eq
qq!foo $bar! eq
qq#foo $bar# etc
Meanwhile
'foo $bar' eq
q(foo $bar) eq
q[foo $bar] eq
q!foo $bar! eq
q#foo $bar# etc
The syntax extends to other features, including regular expressions, which is handy if you are dealing with URIs.
"http://www.example.com/foo/bar/baz/" =~ /\/foo/[^\/]+\/baz\//;
"http://www.example.com/foo/bar/baz/" =~ m!/foo/[^/]+/baz/!;
Current: "Typewriter" 'quotation' marks
There are many good reasons for using the quotation marks we are currently using:
Quotes are easily found on keyboards - so they are easy to type, and they have to be easy, because strings are needed so often.
Quotes are in ASCII - most programming tools handle only ASCII well. You can use ASCII in almost any environment imaginable. And that's important when you are fixing your program over a telnet connection to some far-far-away server.
Quotes come in many versions - single quotes, double quotes, back quotes. So a language can assign different meanings to differently quoted strings. These different quotes can also solve the 'quotes "inside" quotes' problem (see the sketch after this list).
Quotes are natural - English used quotation marks to mark up text passages long before programming languages followed. In linguistics, quotes are used in quite the same way as in programming languages. Quotes are natural the same way + and - are natural for addition and subtraction.
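As a hedged sketch of that last point about multiple quote flavors (Rust used purely as one concrete example; raw strings are its extra flavor):

fn main() {
    let plain = "a plain \"escaped\" string";    // double quotes, escapes needed
    let ch = 'c';                                // single quotes mean a char instead
    let raw = r#"no escapes "needed" here"#;     // raw strings: another quote flavor
    println!("{plain} {ch} {raw}");
}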
Alternative: “typographically” ‘correct’ quotes
Technically they are superior. One great advantage is that you can easily differentiate between opening and closing quotes. But they are hard to type and they are not in ASCII. (I had to put them into a headline to make them visible in this StackOverflow font at all.)
Hopefully one day, when ASCII is something that only historians care about and keyboards have changed into something totally different (if we are even going to have keyboards at all), there will come a programming language that uses better quotes...
Python does have an alternative string delimiter with the triple-double quote """Some String""".
Single quotes and double quotes are used in the majority of languages since that is the standard delimiter in most written languages.
Languages (should) try to be as simple to understand as possible, and using something different from quotes to deal with strings introduces unnecessary complexity.
Python has an additional string type, using triple double-quotes,
"""like this"""
In addition to this, Perl allows you to use any delimiter you want,
q^ like this ^
I think that, for the most part, the regular string delimiters are used because they make sense: a string is wrapped in quotes. Beyond that, developers are so used to the conventional presentation of strings that drastically altering it would mean a difficult learning curve.
Using quotation marks to define a set of characters as separate from the enclosing text is more natural to us, and thus easier to read. Also, " and ' are on the keyboard, while those other characters you mentioned are not, so it's easier to type. It may be possible to use a character that is widely available on keyboards, but I can't think of one that won't have the same kind of problem.
Edit: I missed the pipe character, which may actually be a viable alternative. Except that it's currently widely used as the OR operator, and the readability issue still stands.
Ah, so you want old-fashioned FORTRAN, where you'd quote by counting the number of characters in the string and embedding it in an H format, such as: 13HHello, World!. As somebody who did a few things with FORTRAN back in the days when the language name was all caps, I can tell you quotation marks and the ability to escape them are a Good Thing. (For example, you aren't totally screwed if you are off by one in your manual character count.)
Seriously, there is no ideal solution. It will always be necessary, at some point, to have a string containing whatever quote character you like. For practical purposes, the quote delimiters need to be on the keyboard and easily accessible, since they're heavily used. Perl's q#...# syntax will fail if a string contains an example of each possible character. FORTRAN's Hollerith constants are even worse.
Because those other characters you listed aren't ASCII. I'm not sure that we are ready for, or need, a programming language in Unicode...
EDIT: As to why not use {}, | or \: those symbols all already have meanings in most languages. Imagine C or Perl with two different meanings for '{' and '}'!
| means OR (and in some languages already concatenates strings), and how would you get \n if \ were the delimiter?
Fundamentally, I really don't see why this is a problem. Is \" really THAT hard? I mean, in C you often have to use %% (in format strings), \\, and several other two-character escapes, so... Meh.
Because no one has created a language that uses some other character and has gotten popular.
I think that is largely because the demand for changing the character is just not there, most programmers are used to the standard quote and see no compelling reason to change the status quo.
Compare the following.
print "This is a simple string."
print "This \"is not\" a simple string."
print ¤This is a simple string.¤
print ¤This "is not" a simple string.¤
I for one don't really feel like the second is any easier or more readable.
You say "having to go to great lengths to treat quotes correctly"; but it's only in the text representation. All modern languages treat strings as binary blocks, so they really don't care about the content. Remember that the text representation is only a simple way for the programmer to tell the system what to do. Once the string is interned, it doesn't have any trouble managing the quotes.
One good reason would probably be that if this is the only thing you want to improve on an existing language, you're not really creating a new language.
And if you're creating a new language, picking the right character for the string quotes is probably way way WAY down on the todo list of things to actually implement.
You would probably be best off picking a delimiter that exists on all common keyboards and terminal representation sets, so most of the ones you suggest are right out...
And in any case, a quoting mechanism will still be necessary... you gain a reduction in the number of times you use quoting, at the cost of making the language harder for non-specialists to read.
So it is not entirely clear that this is a win, and then there is force of habit.
Ada doesn't use single quotes for strings. Those are only for chars, and don't have to be escaped inside strings.
I find it very rare that the double-quote character comes up in a normal text string that I enter into a computer program. When it does, it is almost always because I am passing that string to a command interpreter, and need to embed another string in it.
I would imagine the main reason none of those other characters are used for string delimiters is that they aren't in the original 7-bit ASCII code table. Perhaps that's not a good excuse these days, but in a world where most language designers are afraid to buck the insanely crappy C syntax, you aren't going to get a lot of takers for an unusual string delimiter choice.
Python allows you to mix single and double quotes to put quotation marks in strings.
print "Please welcome Mr Jim 'Beaner' Wilson."
>>> Please welcome Mr Jim 'Beaner' Wilson.
print 'Please welcome Mr Jim "Beaner" Wilson.'
>>> Please welcome Mr Jim "Beaner" Wilson.
You can also used the previously mentioned triple quotes. These also extend across multiple lines to allow you to also keep from having to print newlines.
print """Please welcome Mr Jim "Beaner" Wilson."""
>>> Please welcome Mr Jim "Beaner" Wilson.
Finally, you can print strings the same way as everyone else.
print "Please welcome Mr Jim \"Beaner\" Wilson."
>>> Please welcome Mr Jim "Beaner" Wilson.

Is there any advantage of being a case-sensitive programming language? [duplicate]

Closed 14 years ago.
I personally do not like programming languages being case sensitive.
(I know that the disadvantages of case sensitivity are now-a-days complemented by good IDEs)
Still I would like to know whether there are any advantages for a programming language if it is case sensitive. Is there any reason why designers of many popular languages chose to make them case sensitive?
EDIT: duplicate of Why are many languages case sensitive?
EDIT: (I cannot believe I asked this question a few years ago)
This is a preference. I prefer case sensitivity; I find it easier to read code this way. For instance, the variable name "myVariable" has a different word shape than "MyVariable," "MYVARIABLE," and "myvariable." This makes it more straightforward to tell identifiers apart at a glance. Of course, you should not, or only very rarely, create identifiers that differ only in case. This is more about consistency than the obvious "benefit" of increasing the number of possible identifiers. Some people think this is a disadvantage. I can't think of any time in which case sensitivity gave me any problems. But again, this is a preference.
Case-sensitivity is inherently faster to parse (albeit only slightly) since it can compare character sequences directly without having to figure out which characters are equivalent to each other.
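A minimal sketch of the difference (Rust, with hypothetical identifier strings): the case-insensitive comparison has to fold every character before it can compare.

fn main() {
    let a = "MyVariable";
    let b = "myvariable";
    println!("{}", a == b);                    // false: plain byte-for-byte comparison
    println!("{}", a.eq_ignore_ascii_case(b)); // true: case-folds each character first
}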
It allows the implementer of a class/library to control how casing is used in the code. Case may also be used to convey meaning.
The code looks more uniform. In the days of BASIC these were equivalent:
PRINT MYVAR
Print MyVar
print myvar
With type checking, case sensitivity prevents a misspelling from silently becoming an unrecognized variable. I have fixed bugs in code in a case-insensitive language without type declarations (FORTRAN 77), where the zero (0) and the capital letter O looked the same in the editor. The language quietly created a new object, and so the output was flawed. With a case-sensitive, typed language, this would not have happened.
In the compiler or interpreter, a case-insensitive language is going to have to fold everything to upper or lower case to test for matches, or otherwise use a case-insensitive matching tool; but that's only a small amount of extra work for the compiler.
Plus case-sensitive code allows certain patterns of declarations such as
MyClassName myClassName = new MyClassName()
and other situations where case sensitivity is nice.
