I love the versatility of Python, but I absolutely hate its (unconventional) syntax, mainly the lack of {}, semicolons, and obvious variable declarations. I know that many people like the syntax, and I can understand that, but I far prefer using braces rather than tabs to define scope; I like knowing a statement is over when I see a ;, and it is second nature for me to write a conventional for (variable; condition; increment) {} rather than for i in range(*, *, *): and so on. You get the point. So, my question is: is it a totally absurd idea to write a text parser (in any language, probably Java) that converts a text file written in the custom syntax into a valid, runnable .py program? This would be mainly a learning experience and for fun; it wouldn't be a serious solution for any large or complex programs.
I am also not sure how this will sit with the Stack Overflow guidelines for a typical question, seeing as it is partly opinion based, so if you think it shouldn't be here, could you tell me where a good place to post it might be?
Thanks, Asher
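For what it's worth, the idea in the question can be sketched in a few lines. This is only a toy, written in Python rather than Java, and it assumes a very restricted input format that is my own invention: every block opener ends its line with {, every } sits on a line by itself, and statements end with ;. The function name braces_to_python is hypothetical. It does not handle things like "} else {", braces inside strings, or C-style for loops.

```python
def braces_to_python(source: str, indent: str = "    ") -> str:
    """Convert a toy brace-and-semicolon dialect into indented Python.

    Assumed input format (hypothetical): block openers end with '{',
    closers are a lone '}', statements may end with ';'.
    """
    out = []
    depth = 0
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "}":
            # An empty block would be a syntax error in Python, so add 'pass'.
            if out and out[-1].endswith(":"):
                out.append(indent * depth + "pass")
            depth -= 1
        elif line.endswith("{"):
            # 'for i in range(n) {' becomes 'for i in range(n):'
            out.append(indent * depth + line[:-1].rstrip() + ":")
            depth += 1
        else:
            # Drop the trailing semicolon; Python does not need it.
            out.append(indent * depth + line.rstrip(";").rstrip())
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    custom = """
    def f(n) {
        total = 0;
        for i in range(n) {
            total += i;
        }
        return total;
    }
    """
    print(braces_to_python(custom))
```

A real converter would need a proper tokenizer so that braces and semicolons inside string literals and comments are left alone.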
I am trying to build my own Objective-C highlighting scheme for vim. The problem is that when I define a rule with contained, it is still applied even if no other rule contains it. I have this in my objc.vim for test purposes:
syntax clear
runtime! syntax/c.vim
syn match firstComponent "[_A-Za-z0-9()]*:" contained
hi link firstComponent Function
I suspect this is because c.vim has a lot of rules with contains=ALLBUT, so they include my rule as well. Are there ways to work around this?
Thanks.
PS: I am building my own scheme to highlight methods because the one I was using before is slow. In particular, method signature matching is slow; I've made a reduction that shows that. I suspect this could be because of the problem above: complicated inner rules get matched everywhere.
Your hunch is right; this is due to contains=ALLBUT. There are limits to reusing an existing syntax. Though you can try to override or :syntax clear certain elements, there comes a point where this becomes overly tedious.
If the original syntax author is still maintaining his syntax, you can discuss this, and submit patches to ease integration, or maybe even completely split off a common sub-syntax that you can then use to base yours on. If that's not the case, or the coupling is undesired, you'd better start creating your own, completely separate syntax, even if that means some duplication.
And I mean that in the same sense that a C/Java for is just a funky syntax for a while loop.
I still remember when first learning about the for loop in C, the mental effort that had to go into understanding the execution sequence of the three control expressions relative to the loop statement. Seems to me the same sort of effort has to be applied to understand Continuations (in Scala and I guess probably other languages).
And then there's the obvious follow-up question... if so, then what's the point? It seems like a lot of pain (language complexity, programmer errors, unreadable programs, etc) for no gain.
In some sense, yes, continuations are funky syntax for using callbacks. You can manually perform a very complex global transformation on your code (the so called continuation-passing-style transformation), and you will get continuations on your hands without direct language support.
However, transforming your entire codebase is probably not very practical, and the resulting code is hard to read, so having the compiler do it for you behind the scenes is MUCH better.
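To make the transformation concrete, here is a small sketch in Python (not Scala) of the same computation written in direct style and then hand-converted to continuation-passing style. All function names are made up for illustration. In the CPS version nothing "returns" a result in the usual sense; each function hands its result to the callback k.

```python
# Direct style: each call returns its result to its caller.
def square(x):
    return x * x

def add(a, b):
    return a + b

def pythagoras(a, b):
    return add(square(a), square(b))


# Continuation-passing style: every function takes an extra argument k,
# the continuation, and passes its result to k instead of returning it.
def square_cps(x, k):
    return k(x * x)

def add_cps(a, b, k):
    return k(a + b)

def pythagoras_cps(a, b, k):
    # "What happens next" is spelled out explicitly as nested callbacks.
    return square_cps(
        a,
        lambda a2: square_cps(
            b,
            lambda b2: add_cps(a2, b2, k),
        ),
    )


if __name__ == "__main__":
    print(pythagoras(3, 4))       # direct style
    pythagoras_cps(3, 4, print)   # CPS: print is the final continuation
```

Even for three tiny functions the CPS version is noticeably harder to read, which is the argument for letting a compiler do the rewriting.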
It is interesting that some languages do not use semicolons and braces, even though their predecessors had them. Personally, it makes me nervous to write code in Python because of this. Semicolons are also missing from Google's Go language, although the lexer uses a rule to insert semicolons automatically as it scans.
Why do some languages not use semicolons and braces?
Every programming language must have some way of distinguishing the end of a statement, function call parameter lists or a block of code from the next one.
Some languages use ; and {} (C, Java)
Some languages rely on known sizes of parameter lists (x86 assembly code)
Some use parentheses to form s-expression (Lisp, Clojure)
Some use whitespace (Python)
Some use special keywords like begin .... end (Pascal, Delphi)
So basically this is mostly just a language design choice. There is always some equivalent of ; or {}, even if it doesn't look the same at first glance...
You can argue that when you use semicolons and braces, you still indent your code with whitespace and new lines - for readability reasons. Therefore those delimiters can be considered redundant in this sense.
The designers for those languages presumably believe that the braces and semi-colons are needless cruft, when line continuations can (usually) be detected due to a statement not being complete, and blocks can be detected by whitespace.
Personally it makes me nervous too, but then the lack of checked exceptions in C# had the same effect on me for a while... I suspect that when you get used to such a scheme, it can improve readability (which is the point). It does mean you need to be more careful with whitespace, of course.
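As an illustration of "blocks can be detected by whitespace": Python's own tokenizer turns changes in indentation into explicit INDENT and DEDENT tokens, which play roughly the role of { and }. The sketch below uses the standard tokenize module; the helper name block_tokens and the sample source are my own.

```python
import io
import tokenize

SOURCE = "if x:\n    y = 1\nz = 2\n"

def block_tokens(source):
    """Return the names of the INDENT/DEDENT tokens found in source, in order."""
    names = []
    reader = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(reader):
        if tok.type in (tokenize.INDENT, tokenize.DEDENT):
            names.append(tokenize.tok_name[tok.type])
    return names

if __name__ == "__main__":
    # One block opens after 'if x:' and closes before 'z = 2'.
    print(block_tokens(SOURCE))
```

So the braces are still there in a sense; the tokenizer synthesizes them from the whitespace before the parser ever sees the code.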
We have been using indentation to indicate statement groupings as a readability aid for a long time. This occasionally causes problems when the indentation and the actual statement groupings (indicated by {};, begin/end, whatever) are in conflict; we read one meaning, but the code actually says something else.
Python took the simplifying approach: if we find indentation a help in clarity of expression, why not make it the way the language itself determines groupings? When we write code, we express intent to other readers, so looking at what writing gurus say is often useful:
A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts. ~William Strunk, Jr., The Elements of Style, 1918
So, maybe a programming language should have no unnecessary syntax elements...??
Two reasons: There are so many different ways to put braces around code blocks (see indent styles) that reading/parsing code written by others can be quite hard. Python code, on the other hand, always looks similar, and the indentation level gives a very clear visual clue for the structure of the code. As a side effect, it forces you to keep your code structure simple since deep nesting makes your code vanish off the right side of the screen :)
As for the semicolons - I've been bitten often enough by for(i=0;i<=100;i++); errors that I'm glad I'm not falling into the same traps in Python...
"Syntactic sugar causes cancer of the semicolon." ~Alan Perlis
Is there any reason why some languages do not use semicolons and braces?
Some designers believe that "syntactic noise" such as semicolons and braces distracts the reader from the code. There are various ways to eliminate it:
Python and Haskell use significant indentation.
Clu and Lua use very carefully engineered grammars.
Standard ML uses keywords to introduce each construct plus let-bindings, which eliminate the need for most semicolons while also providing a handy way to declare local variables.
The Bourne shells use significant newlines to eliminate semicolons.
Scheme uses an extremely regular syntax in which the only syntactic markers are parentheses. Longtime Schemers like Olin Shivers claim that after a few weeks, your brain adjusts and you no longer see the parentheses.
The fact that there are so many designs, with so much variation, suggests that many language designers view semicolons and braces as syntactic noise to be eliminated if possible. By eliminating syntactic noise, designers make programs easier to read and understand all at once, and many programmers feel more productive, as if the signal-to-noise ratio has improved and they are channeling their code more clearly. (I won't say they're right and I won't say they're wrong, but I will say that as language-design decisions go, this one is pretty easy to defend.)
So is there a reason why language designers do use semicolons and braces?
Many of the modern semi-colon-and-brace languages are designed explicitly, or in some cases not so explicitly, to appeal to C programmers. After all, if it has semicolons and braces, it must be easy to learn. Right?
Using delimiters like semicolons and braces, or not, is just a matter of taste. In practical terms, compilers can work without them, so why use them in modern programming languages? As I said, it is a matter of taste, and also a long-established de facto syntax that resembles C. It is difficult to fight against that.
There is one field in which braces and semicolons are useful: code generation. When you generate code that is expected to be compiled/interpreted in a kind of reflective behaviour, it is normally more comfortable to write braces (in, say, just one, single line) than to write the structure needed by a programming language such as Python, for example. Think of a function with a couple of unnested loops. You would have to keep track of the number of tabs needed at each line.
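To show what "keeping track of the number of tabs" looks like in practice, here is a minimal sketch of a generator that emits Python source. The class name PyEmitter and its methods are hypothetical, not taken from any library; a context manager keeps the current depth so the caller never counts indentation by hand.

```python
from contextlib import contextmanager

class PyEmitter:
    """Collects lines of generated Python and tracks the indentation depth."""

    def __init__(self, indent="    "):
        self._indent = indent
        self._depth = 0
        self._lines = []

    def line(self, text):
        # Every emitted line is prefixed according to the current depth.
        self._lines.append(self._indent * self._depth + text)

    @contextmanager
    def block(self, header):
        # Emit 'header:' and indent everything written inside the with-block.
        self.line(header + ":")
        self._depth += 1
        try:
            yield
        finally:
            self._depth -= 1

    def source(self):
        return "\n".join(self._lines) + "\n"


def build_example():
    """Generate a function with two loops that are not nested in each other."""
    out = PyEmitter()
    with out.block("def total(rows)"):
        out.line("acc = 0")
        with out.block("for row in rows"):
            out.line("acc += sum(row)")
        with out.block("for row in rows"):
            out.line("acc += len(row)")
        out.line("return acc")
    return out.source()


if __name__ == "__main__":
    print(build_example())
```

With a brace language the generator could write the braces inline and ignore depth entirely, which is the convenience the answer is pointing at.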
Is there any reason they should use them?
Some people think that semicolons and curly braces are not exactly human-readable text. I personally favor the Pascal-style begin end blocks; they seem more natural and easier to understand in terms of sheer meaning, even for a non-initiate in programming languages: "See, it says begin, then some stuff, then end, so it must be something that begins here and ends over there, some sort of block, huh...". Nevertheless, semicolons and braces are usually easier to parse, which is why designers often use them instead of indentation or other constructs. Designers who consider the language easier to understand without them, but easier to parse with them, apply tricks like the one you mentioned: the lexer uses a rule to insert semicolons automatically as it scans.
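The semicolon-insertion trick is simple enough to sketch. As I understand the Go specification, a semicolon is inserted after a line's final token if that token is an identifier, a literal, one of the keywords break, continue, fallthrough or return, or one of ++, --, ), ] or }. The Python below is a rough line-based approximation of that rule (it looks at the end of each line rather than at real tokens, and it ignores comments and multi-line strings); the function names are my own.

```python
import re

# The 25 Go keywords. Only the four in TERMINATING may be followed
# by an automatically inserted semicolon.
GO_KEYWORDS = {
    "break", "case", "chan", "const", "continue", "default", "defer",
    "else", "fallthrough", "for", "func", "go", "goto", "if", "import",
    "interface", "map", "package", "range", "return", "select", "struct",
    "switch", "type", "var",
}
TERMINATING = {"break", "continue", "fallthrough", "return"}


def needs_semicolon(line):
    """Approximate Go's rule using only the text at the end of the line."""
    text = line.rstrip()
    if not text:
        return False
    word = re.search(r"[A-Za-z_]\w*$", text)
    if word:
        name = word.group(0)
        # Identifiers qualify; keywords only if they are in TERMINATING.
        return name not in GO_KEYWORDS or name in TERMINATING
    if text.endswith(("++", "--", ")", "]", "}")):
        return True
    # A trailing digit or closing quote ends a numeric, string or rune literal.
    return text[-1].isdigit() or text[-1] in "\"'`"


def insert_semicolons(source):
    return "\n".join(
        line.rstrip() + ";" if needs_semicolon(line) else line
        for line in source.splitlines()
    )


if __name__ == "__main__":
    print(insert_semicolons("x := f(a,\nb)\nif x > 0 {\nreturn\n}"))
```

Note how a line ending in a comma or an opening brace gets no semicolon, which is why Go programmers must put the opening brace on the same line as the if.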
Semicolons and {} have semantic meanings (variable lifetime, mostly) as well as just syntax. In C++ I've written code that looked like
{
    lua_table tab;      // lives until the outer closing brace
    {
        lua_string str; // destroyed at the inner closing brace
    }
}
They were of great use because using the Lua stack from C++ sucks terribly.
For some people, semicolons and braces look like noise that makes reading the 'actual' code difficult.
Since parsers can recognize blocks based either on punctuation or on indentation (no technical issue is involved), the choice of one or the other is just a question of programmer preference.
My impression is that this preference is mainly due to the programmer's previous background.
I really don't understand why you ask. You hate writing Python code? Well, don't! Nobody has canceled C/C++/etc.
I need to document the software I'm currently working on. The software is written in several programming and scripting languages, which got me thinking. If a new developer comes along and needs to fix something, they might know Java but maybe not bash scripting. It would be nice if there were a program which would help to understand what
for f in "$#" ; do
means. I was thinking of something that creates a static HTML page with the code plus syntax highlighting and if you hover over something (like the "for"), it would display a pop-up with an explanation:
for starts a loop which iterates over all values that follow in. In the loop, you can access each value via the variable $f. The loop body is between do and done
Does something like that already exist?
[EDIT] This is just an example. You'd get a different explanation for f, in, "$#", ; and do, i.e. each and every element of the line should be explained. Unknown elements (like command names) should link to Google. That way you can understand what the line does even if you're missing some detail.
[EDIT2] I'm aware that you can't write a program which understands what another program does. What I'm looking for is a simple tool which will do "extended syntax highlighting" in the sense that it will color an expression and give a short explanation what it means (plus maybe a link to some in-depth reference).
This is meant for someone who knows how to program but maybe hasn't seen some obscure construct before. Say
echo "Error" 1>&2
Every bash programmer knows what this means but a Java developer might be puzzled by the 1>&2 despite the fact that they can guess that echo == System.out.println. A simple "Redirects stdout to stderr" will clear things up and give that instant "AHA!" which allows them to stay in their current train of thought.
A tool like this could be built using ANTLR, i.e. parse the code into an abstract syntax tree using an ANTLR grammar for that language, and write an HTML generator which produced the annotated code.
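As a rough idea of the output side of such a tool, here is a much cruder stand-in written in Python. It does not use ANTLR or build a syntax tree; it simply splits a line on whitespace and wraps tokens it recognizes in a span whose title attribute shows up as a hover tooltip in a browser. The EXPLANATIONS table and the function name annotate are made up for illustration.

```python
import html

# Hypothetical explanation table; a real tool would derive this from a grammar.
EXPLANATIONS = {
    "for": "starts a loop over the values that follow in",
    "in": "separates the loop variable from the values to iterate over",
    "do": "opens the loop body, which ends at done",
    "done": "closes the loop body",
    "1>&2": "redirects stdout to stderr",
}


def annotate(line):
    """Return an HTML fragment where known tokens carry a hover explanation."""
    parts = []
    for token in line.split():
        text = html.escape(token)
        note = EXPLANATIONS.get(token)
        if note is None:
            # Unknown tokens are emitted as plain, escaped text.
            parts.append(text)
        else:
            parts.append(
                '<span title="{}">{}</span>'.format(html.escape(note), text)
            )
    return "<code>" + " ".join(parts) + "</code>"


if __name__ == "__main__":
    print(annotate('echo "Error" 1>&2'))
```

Splitting on whitespace breaks down as soon as quoting or nesting matters, which is where a real grammar-based parser such as the ANTLR approach earns its keep.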
It sounds like a useful tool to have for language learning, or exploring source code of projects you're not maintaining -- but is it appropriate for documentation?
Why is it important to help the programmers of other languages understand the code at this level of implementation detail? Anyone maintaining the implementation at this level will obviously have to know the language and will probably have an IDE to do most of this.
That said, I'd definitely consider a tool like this as a learning aid.
IMO it would be simpler and more effective to just collect links to good language-specific references and tutorials on a Wiki page.
For all mainstream languages, such sources exist and are maintained regularly. If you try to create your own reference, you need to maintain it too. Fair enough, bash syntax is not going to change very often, but other languages do develop faster, so it is going to be a burden.
If you think about it, it's not that useful to have a tool that explains the syntax. Developers could just google for keywords instead of browsing a website in a similar fashion to http://www.codeweblog.com/source/ .
I believe that good comments will be by far more useful, plus there are tools to extract the documentation by using the comments (for example, HappyDoc does that for Python).
It is a very tricky thing. First of all, it can be proven that a program that will "understand" any program doesn't exist. However, you can still use existing documentation tools. Tools like Doxygen may help: you document your code through comments, and the documentation is generated from them.
A language cannot be explained only through its syntax. The runtime environment plays a great part, together with the underlying philosophy of the language and its libraries.
Moreover, syntax is not that complex for most common languages (given that code has been written with maintainability in mind).
Going on with bash example, you cannot deeply understand bash if you know nothing about processes & job control, environment variables, a big list of unix commands (tr, sort, cut, paste, sed, awk, find, ...) and many other features that don't appear in syntax.
If the tool produced
for starts a loop which iterates over all values that follow in. In the loop, you can access each value via the variable $f. The loop body is between do and done
it would be pretty worthless. This is exactly the kind of comment that trainee (human) programmers are told never to write.