Hardly any programming language has a regular syntactical grammar, since most allow arbitrarily deeply nested parentheses. Rust does, too:
let x = ((((()))));
But is Rust's syntactical grammar at least context-free? If not, what element makes the grammar context-sensitive? Or is the grammar even recursively enumerable, like C++'s syntactical grammar?
Related: Is Rust's lexical grammar regular, context-free or context-sensitive?
Rust includes a macro processor, whose operation is highly context-sensitive.
You could attempt to skate around this issue by only doing syntactic analysis up to but not including macro expansion -- possible, but not particularly useful -- or by assuming that the macro expansion is done by some intermediate tool which is given a free pass to allow it to be Turing complete.
But I'm inclined to say that it simply means that the Rust language is recursively enumerable.
There are a number of restrictions on the validity of macro definitions which probably make the language (at least) context-sensitive, even if you settle for not performing the macro expansions as part of syntactic analysis.
This doesn't mean that a context-free grammar cannot be useful as part of the syntactic analysis of Rust. It's probably essential, and it could even be useful to use a parser generator such as bison or Antlr (examples of both exist). As with most programming languages, there is a simple superset of Rust which is context-free and which can be usefully analysed with context-free grammar tools; however, in the end there are texts which will need to be rejected at compile-time as invalid even though they are part of the CF superset.
Answer straight from the source code for Rust:
Rust's lexical grammar is not context-free. Raw string literals are
the source of the problem. Informally, a raw string literal is an r,
followed by N hashes (where N can be zero), a quote, any characters,
then a quote followed by N hashes. Critically, once inside the first
pair of quotes, another quote cannot be followed by N consecutive
hashes. e.g. r###""###"### is invalid.
I'm not very good at understanding the difference between the imperative and declarative programming paradigms. I have read that Haskell is a declarative language. To an extent I would say yes, but there is something that bothers me with respect to the definition of imperative.
When I have a data structure and use Haskell's functions to transform it, I have effectively only said WHAT to transform. I pass the data structure as an argument to a function and am happy with the result.
But what if there is no function that actually satisfies my needs?
I would start writing my own function which expects the data structure as an argument. After that I would start writing out how the data structure should be processed. Since I can only call native Haskell functions, I'm still within the declarative paradigm, right? But what about when I start using an "if statement"? Wouldn't that end the declarative nature, since from that point on I'm telling the program HOW to do stuff?
Perhaps this is a matter of perspective. The way I see it, there is nothing imperative about defining things in terms of other things because we can always replace something with its definition (as long as the definition is pure). That is, if we have f x = x + 1, then any place we see f z we can replace with z + 1. So pure functions shouldn't really be considered instructions; they should be considered definitions.
A lot of Haskell code is considered declarative for this reason. We simply define things as (pure) functions of other things.
It is possible to write imperative-style code in Haskell as well. Sometimes we really do want to say something like "do A, then do B, then do C". This adds a new dimension to the simple world of function application: we need a notion of 'happens before'. Haskell has embraced the Monad concept to describe computations that have an order of evaluation. This turns out to be very handy because it can encapsulate effects such as changing state.
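For instance, a minimal sketch using the IO monad: do-notation expresses exactly that kind of ordering.

main :: IO ()
main = do
  putStrLn "A"   -- do A
  putStrLn "B"   -- then do B
  putStrLn "C"   -- then do C

Each line must run before the next, which is the "happens before" relation made explicit.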
There is no if statement in Haskell, only an if expression if a then b else c, which is really equivalent to ternary expressions like a ? b : c in C-like languages or b if a else c in Python. You cannot leave out the else, and it can be used anywhere an arbitrary expression can be used. It is further constrained that a must have type Bool, and b and c must have the same type. It is syntactic sugar for
case a of
  True  -> b
  False -> c
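As a small illustration (the names here are mine), the whole if-expression is itself a value, so it can sit anywhere an expression can:

signs :: [Int]
signs = map (\x -> if x < 0 then -1 else 1) [-3, 5]   -- [-1, 1]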
Don't get too hung up on the label "declarative"; it's just a style of programming that the language supports. Consider, for example, a typical declarative definition of the factorial function:
fact 0 = 1
fact n = n * fact (n - 1)
This is also just syntactic sugar for something you would see as more imperative:
fact n = case n of
  0 -> 1
  _ -> n * fact (n - 1)
Frankly, the term "declarative programming" is more of a marketing term than anything else. The informal definition that it means "specifying what to do, not how to do it" is vague and open to interpretation and definitely far from a black-and-white boundary. In practice, it seems to be applied to anything that broadly falls into the categories of functional programming, logic programming or the use of Domain-Specific Languages (DSLs).
Consequently (and I realise this probably doesn't qualify as an answer to your question, but still :)), I recommend you do not waste your time wondering whether something is still declarative or not. The terms imperative vs. functional vs. logic programming are already a bit more meaningful, so perhaps it's more useful to reflect on those.
Declarative languages are constructed from expressions whereas imperative languages are constructed from statements.
The usual explanation is what to do versus how to do it. You've already found the confusion in this. If you take it this way, declarative is using definitions (by name) and imperative is writing definitions. What is a declarative language, then? One which only names definitions? If so, then the only Haskell program you could write is precisely main!
There is a declarative way and an imperative way to have branching, and it follows directly from the definition. A declarative branch is an expression and an imperative branch is a statement. Haskell has case … of … (an expression) and C has if (…) {…} else {…} (a statement).
What is the difference between expressions and statements? Expressions have a value whereas statements have effects. What is the difference between values and effects?
For expressions, there is a function μ which maps any expression e to its value μ(e). This is also called the semantics, or meaning, of e, and is ideally a well-defined mathematical object. This way of defining values is called denotational semantics. There are other methods.
For statements, there is a state P immediately before a statement S and a state Q immediately after. The effect of S is the delta from P to Q. This way of defining effects is called Hoare logic. There are other methods.
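To make this concrete, a small illustration of my own reusing the notation above: for the C statement x = x + 1; the Hoare triple

{ x = n }  x = x + 1;  { x = n + 1 }

describes the effect of S as the change from state P to state Q, whereas denotational semantics assigns the Haskell expression x + 1 its value directly: μ(x + 1) = μ(x) + 1.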
You can't use an "if statement" in Haskell, because there is none. You can use an "if-expression"
if c then a else b
but this is just syntactic sugar for something like
let f True a b = a; f False a b = b in f c a b
which is again fully declarative.
I am looking to create a DSL and I'm looking for a language where you can define your own bracket-style operators for things like the floor and ceiling functions. I'd rather not go the route of defining my own Antlr parser for a custom syntax.
As far as I know, in the languages that do allow you to define custom operators, those operators are all binary infix operators.
tl;dr: Which languages allow for defined paired symbol (like opening bracket/closed bracket) operators?
Also, I don't see how this question can be "too broad" if no one has named a single language that has this and the criteria are very specific and definitely in the programming domain.
Since Fortress is dead, the only languages I know of where something like this would be imaginable are those of FORTH heritage.
In all others that I know of, braces, brackets and parentheses are already heavily used, and can't be overloaded further.
I suggest giving up the quest for such stuff and get comfortable writing
floor x
ceiling y
or however function application is expressed in the language of your choice.
However, in the article you cite, it is said: Unicode contains codepoints for these symbols at U+2308–U+230B: ⌈x⌉, ⌊x⌋.
Thus you can at least define this as an operator in a Haskell-like language and use it like:
infix 5 ⌈⌉
(foo ⌈⌉)
The best I could come up with is something like the following:
--- how to make brackets
module Test where
import Prelude.Math
infix 7 `«`
infix 6 `»`
_ « x = Math.ceil x
(») = const
x = () «2.345» ()
main _ = println x
The output was: 3.0
(This is not actually Haskell, but Frege, a Haskell-like JVM language.)
Note that I used «» instead of ⌈⌉, because I somehow have no font in my IDE that would correctly show the bracket symbols. (This is another reason not to do such things.)
The way it works is that with the infix directives, we get this parsed as
(») ((«) () 2.345) ()
(One could insert any expression in place of the ())
Maybe if you ask in the Haskell group, someone finds a better solution.
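For what it's worth, the trick above seems to translate directly into GHC Haskell (a sketch of my own; GHC's lexer accepts the Unicode bracket codepoints as operator symbols, and the module name and dummy operands are mine):

module Brackets where

infixl 7 ⌈
infixl 6 ⌉

-- ⌈ ignores its left operand and rounds its right operand up
(⌈) :: Double -> Double -> Double
_ ⌈ x = fromIntegral (ceiling x :: Integer)

-- ⌉ returns its left operand, discarding the dummy right operand
(⌉) :: Double -> Double -> Double
(⌉) = const

main :: IO ()
main = print (0 ⌈ 2.345 ⌉ 0)   -- prints 3.0

The dummy 0 operands play the role of the ()s in the Frege version.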
I often read that lazy is not the same as non-strict but I find it hard to understand the difference. They seem to be used interchangeably but I understand that they have different meanings. I would appreciate some help understanding the difference.
I have a few questions which are scattered about this post. I will summarize those questions at the end of this post. I have a few example snippets, I did not test them, I only presented them as concepts. I have added quotes to save you from looking them up. Maybe it will help someone later on with the same question.
Non-Strict Def:
A function f is said to be strict if, when applied to a nonterminating
expression, it also fails to terminate. In other words, f is strict
iff the value of f bot is bot. For most programming languages, all
functions are strict. But this is not so in Haskell. As a simple
example, consider const1, the constant 1 function, defined by:
const1 x = 1
The value of const1 bot in Haskell is 1. Operationally speaking, since
const1 does not "need" the value of its argument, it never attempts to
evaluate it, and thus never gets caught in a nonterminating
computation. For this reason, non-strict functions are also called
"lazy functions", and are said to evaluate their arguments "lazily",
or "by need".
-A Gentle Introduction To Haskell: Functions
I really like this definition. It seems to be the best one I could find for understanding strictness. Is const1 x = 1 lazy as well?
Non-strictness means that reduction (the mathematical term for
evaluation) proceeds from the outside in,
so if you have (a+(b*c)) then first you reduce the +, then you reduce
the inner (b*c).
-Haskell Wiki: Lazy vs non-strict
The Haskell Wiki really confuses me. I understand what they are saying about order, but I fail to see how (a+(b*c)) would evaluate non-strictly if it were passed _|_?
In non-strict evaluation, arguments to a function are not evaluated
unless they are actually used in the evaluation of the function body.
Under Church encoding, lazy evaluation of operators maps to non-strict
evaluation of functions; for this reason, non-strict evaluation is
often referred to as "lazy". Boolean expressions in many languages use
a form of non-strict evaluation called short-circuit evaluation, where
evaluation returns as soon as it can be determined that an unambiguous
Boolean will result — for example, in a disjunctive expression where
true is encountered, or in a conjunctive expression where false is
encountered, and so forth. Conditional expressions also usually use
lazy evaluation, where evaluation returns as soon as an unambiguous
branch will result.
-Wikipedia: Evaluation Strategy
Lazy Def:
Lazy evaluation, on the other hand, means only evaluating an
expression when its results are needed (note the shift from
"reduction" to "evaluation"). So when the evaluation engine sees an
expression it builds a thunk data structure containing whatever values
are needed to evaluate the expression, plus a pointer to the
expression itself. When the result is actually needed the evaluation
engine calls the expression and then replaces the thunk with the
result for future reference.
...
Obviously there is a strong correspondence between a thunk and a
partly-evaluated expression. Hence in most cases the terms "lazy" and
"non-strict" are synonyms. But not quite.
-Haskell Wiki: Lazy vs non-strict
This seems like a Haskell-specific answer. I take it that lazy means thunks and non-strict means partial evaluation. Is that comparison too simplified? Does lazy always mean thunks, and non-strict always mean partial evaluation?
In programming language theory, lazy evaluation or call-by-need is
an evaluation strategy which delays the evaluation of an expression
until its value is actually required (non-strict evaluation) and which
also avoids repeated evaluations (sharing).
-Wikipedia: Lazy Evaluation
Imperative Examples
I know most people say to forget imperative programming when learning a functional language. However, I would like to know whether these qualify as non-strict, lazy, both, or neither. At the very least it would provide something familiar.
Short Circuiting
f1() || f2()
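For comparison, Haskell's (||) is non-strict in its second argument, which a quick GHCi session can confirm:

ghci> True || undefined
True
ghci> False || undefined
*** Exception: Prelude.undefined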
C#, Python and other languages with "yield"
public static IEnumerable<int> Power(int number, int exponent)
{
    int counter = 0;
    int result = 1;
    while (counter++ < exponent)
    {
        result = result * number;
        yield return result;
    }
}
-MSDN: yield (c#)
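A rough Haskell analogue of this iterator (a sketch; the function name is mine): the on-demand production that yield provides falls out of ordinary lazy lists.

-- lazily produces number^1, number^2, ..., number^exponent
power :: Int -> Int -> [Int]
power number exponent = take exponent (iterate (* number) number)

main :: IO ()
main = print (power 2 8)   -- [2,4,8,16,32,64,128,256]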
Callbacks
int f1() { return 1; }
int f2() { return 2; }

int lazy(int (*cb1)(), int (*cb2)(), int x) {
    if (x == 0)
        return cb1();
    else
        return cb2();
}

int eager(int e1, int e2, int x) {
    if (x == 0)
        return e1;
    else
        return e2;
}

lazy(f1, f2, x);
eager(f1(), f2(), x);
Questions
I know the answer is right in front of me in all those resources, but I can't grasp it. The definitions all seem too easily dismissed as implied or obvious.
I know I have a lot of questions. Feel free to answer whatever questions you feel are relevant. I added those questions for discussion.
Is const1 x = 1 also lazy?
How is evaluating from the outside in non-strict? Is it because outside-in evaluation allows reductions of unnecessary expressions to be skipped, as in const1 x = 1? Skipping reductions seems to fit the definition of lazy.
Does lazy always mean thunks and non-strict always mean partial evaluation? Is this just a generalization?
Are the following imperative concepts Lazy, Non-Strict, Both or Neither?
Short Circuiting
Using yield
Passing Callbacks to delay or avoid execution
Is lazy a subset of non-strict, or vice versa, or are they mutually exclusive? For example, is it possible to be non-strict without being lazy, or lazy without being non-strict?
Is Haskell's non-strictness achieved by laziness?
Thank you SO!
Non-strict and lazy, while informally interchangeable, apply to different domains of discussion.
Non-strict refers to semantics: the mathematical meaning of an expression. The world to which non-strict applies has no concept of the running time of a function, memory consumption, or even a computer. It simply talks about what kinds of values in the domain map to which kinds of values in the codomain. In particular, a strict function must map the value ⊥ ("bottom" -- see the semantics link above for more about this) to ⊥; a non-strict function is allowed not to do this.
Lazy refers to operational behavior: the way code is executed on a real computer. Most programmers think of programs operationally, so this is probably what you are thinking. Lazy evaluation refers to implementation using thunks -- pointers to code which are replaced with a value the first time they are executed. Notice the non-semantic words here: "pointer", "first time", "executed".
Lazy evaluation gives rise to non-strict semantics, which is why the concepts seem so close together. But as FUZxxl points out, laziness is not the only way to implement non-strict semantics.
If you are interested in learning more about this distinction, I highly recommend the link above. Reading it was a turning point in my conception of the meaning of computer programs.
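You can watch those thunks at work with Debug.Trace (a small sketch of my own): under call-by-need, a shared thunk is evaluated only once.

import Debug.Trace (trace)

main :: IO ()
main = do
  let x = trace "evaluated!" (2 + 2 :: Int)
  print (x + x)   -- "evaluated!" is printed once, then 8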
An example of an evaluation model that is neither strict nor lazy: optimistic evaluation, which gives some speedup since it can avoid a lot of "easy" thunks:
Optimistic evaluation means that even if a subexpression may not be needed to evaluate the superexpression, we still evaluate some of it using some heuristics. If the subexpression doesn't terminate quickly enough, we suspend its evaluation until it's really needed. This gives us an advantage over lazy evaluation if the subexpression is needed later, as we don't need to generate a thunk. On the other hand, we don't lose too much if the expression doesn't terminate, as we can abort it quickly enough.
As you can see, this evaluation model is not strict: If something that yields _|_ is evaluated, but not needed, the function will still terminate, as the engine aborts the evaluation. On the other hand, it may be possible that more expressions than needed are evaluated, so it's not completely lazy.
Yes, there is some unclear use of terminology here, but the terms coincide in most cases regardless, so it's not too much of a problem.
One major difference is when terms are evaluated. There are multiple strategies for this, ranging on a spectrum from "as soon as possible" to "only at the last moment". The term eager evaluation is sometimes used for strategies leaning toward the former, while lazy evaluation properly refers to a family of strategies leaning heavily toward the latter. The distinction between "lazy evaluation" and related strategies tends to involve when and where the result of evaluating something is retained, versus being tossed aside. The familiar memoization technique in Haskell of assigning a name to a data structure and indexing into it is based on this. In contrast, a language that simply spliced expressions into each other (as in "call-by-name" evaluation) might not support this.
The other difference is which terms are evaluated, ranging from "absolutely everything" to "as little as possible". Since any value actually used to compute the final result can't be ignored, the difference here is how many superfluous terms are evaluated. As well as reducing the amount of work the program has to do, ignoring unused terms means that any errors they would have generated won't occur. When a distinction is being drawn, strictness refers to the property of evaluating everything under consideration (in the case of a strict function, for instance, this means the terms it's applied to. It doesn't necessarily mean sub-expressions inside the arguments), while non-strict means evaluating only some things (either by delaying evaluation, or by discarding terms entirely).
It should be easy to see how these interact in complicated ways; decisions are not at all orthogonal, as the extremes tend to be incompatible. For instance:
Very non-strict evaluation precludes some amount of eagerness; if you don't know whether a term will be needed, you can't evaluate it yet.
Very strict evaluation makes non-eagerness somewhat irrelevant; if you're evaluating everything, the decision of when to do so is less significant.
Alternate definitions do exist, though. For instance, at least in Haskell, a "strict function" is often defined as one that forces its arguments sufficiently that the function will evaluate to _|_ ("bottom") whenever any argument does; note that by this definition, id is strict (in a trivial sense), because forcing the result of id x will have exactly the same behavior as forcing x alone.
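A quick GHCi check of both notions, using undefined as a stand-in for _|_:

ghci> const 1 undefined   -- const is non-strict in its second argument
1
ghci> id undefined        -- id is strict: forcing id x forces x
*** Exception: Prelude.undefined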
This started out as an update, but it grew too long.
Laziness / Call-by-need is a memoized version of call-by-name where, if the function argument is evaluated, that value is stored for subsequent uses. In a "pure" (effect-free) setting, this produces the same results as call-by-name; when the function argument is used two or more times, call-by-need is almost always faster.
Imperative Example - Apparently this is possible. There is an interesting article on Lazy Imperative Languages. It says there are two methods: one requires closures; the second uses graph reduction. Since C does not support closures, you would need to explicitly pass an argument to your iterator. You could wrap a map structure: if the value does not exist, calculate it; otherwise return the cached value.
Note: Haskell implements this by "pointers to code which are replaced with a value the first time they are executed" - luqui.
This is non-strict call-by-name, but with sharing/memoization of the results.
Call-By-Name - In call-by-name evaluation, the arguments to a function are not evaluated before the function is called — rather, they are substituted directly into the function body (using capture-avoiding substitution) and then left to be evaluated whenever they appear in the function. If an argument is not used in the function body, the argument is never evaluated; if it is used several times, it is re-evaluated each time it appears.
Imperative Example: callbacks
Note: This is non-strict as it avoids evaluation if not used.
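To see the operational difference, here is a sketch of my own that simulates call-by-name with explicit thunks (run it without optimizations, or GHC may share the result anyway): the argument is re-evaluated at every use, unlike the shared thunk under call-by-need.

import Debug.Trace (trace)

-- the argument is a function, so each use re-runs it: call-by-name
callByName :: (() -> Int) -> Int
callByName arg = arg () + arg ()

main :: IO ()
main = print (callByName (\() -> trace "evaluated!" (2 + 2)))
-- "evaluated!" is printed twice; call-by-need would print it once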
Non-Strict = In non-strict evaluation, arguments to a function are not evaluated unless they are actually used in the evaluation of the function body.
Imperative Example: short-circuiting
Note: _|_ appears to be a way to test whether a function is non-strict.
So a function can be non-strict but not lazy. A function that is lazy is always non-strict.
Call-By-Need is partly defined by Call-By-Name which is partly defined by Non-Strict
An Excerpt from "Lazy Imperative Languages"
2.1. NON-STRICT SEMANTICS VS. LAZY EVALUATION
We must first clarify the distinction between "non-strict semantics" and "lazy evaluation". Non-strict semantics are those which specify that an expression is not evaluated until it is needed by a primitive operation. There may be various types of non-strict semantics. For instance, non-strict procedure calls do not evaluate the arguments until their values are required. Data constructors may have non-strict semantics, in which compound data are assembled out of unevaluated pieces. Lazy evaluation, also called delayed evaluation, is the technique normally used to implement non-strict semantics. In section 4, the two methods commonly used to implement lazy evaluation are very briefly summarized.

CALL BY VALUE, CALL BY LAZY, AND CALL BY NAME
"Call by value" is the general name used for procedure calls with strict semantics. In call by value languages, each argument to a procedure call is evaluated before the procedure call is made; the value is then passed to the procedure or enclosing expression. Another name for call by value is "eager" evaluation. Call by value is also known as "applicative order" evaluation, because all arguments are evaluated before the function is applied to them. "Call by lazy" (using William Clinger's terminology in [8]) is the name given to procedure calls which use non-strict semantics. In languages with call by lazy procedure calls, the arguments are not evaluated before being substituted into the procedure body. Call by lazy evaluation is also known as "normal order" evaluation, because of the order (outermost to innermost, left to right) of evaluation of an expression. "Call by name" is a particular implementation of call by lazy, used in Algol-60 [18]. The designers of Algol-60 intended that call-by-name parameters be physically substituted into the procedure body, enclosed by parentheses and with suitable name changes to avoid conflicts, before the body was evaluated.

CALL BY LAZY VS. CALL BY NEED
Call by need is an extension of call by lazy, prompted by the observation that a lazy evaluation could be optimized by remembering the value of a given delayed expression, once forced, so that the value need not be recalculated if it is needed again. Call by need evaluation, therefore, extends call by lazy methods by using memoization to avoid the need for repeated evaluation. Friedman and Wise were among the earliest advocates of call by need evaluation, proposing "suicidal suspensions" which self-destructed when they were first evaluated, replacing themselves with their values.
The way I understand it, "non-strict" means trying to reduce workload by reaching completion with a smaller amount of work, whereas "lazy evaluation" and the like try to reduce overall workload by avoiding full completion (hopefully forever).
From your examples...
f1() || f2()
...short-circuiting in this expression cannot possibly trigger "future work", and there is neither a speculative/amortized factor to the reasoning nor any computational-complexity debt accruing.
Whereas in the C# example, "lazy" saves a function call in the overall view but, in exchange, comes with those sorts of difficulties (at least from the point of call until possible full completion; in this code that's a negligible-distance pathway, but imagine those functions had some high-contention locks to put up with).
If we're talking general computer science jargon, then "lazy" and "non-strict" are generally synonyms -- they stand for the same overall idea, which expresses itself in different ways for different situations.
However, in a given particular specialized context, they may have acquired differing technical meanings. I don't think you can say anything accurate and universal about what the difference between "lazy" and "non-strict" is likely to be in the situation where there is a difference to make.
Why are most scripting languages loosely typed? For example, JavaScript, Python, etc.?
First of all, there are some issues with your terminology. There is no such thing as a loosely typed language, and the term scripting language is vague too, most commonly referring to so-called dynamic programming languages.
There is weak vs. strong typing, which is about how rigorously different types are distinguished (i.e. whether 1 + "2" yields 3 or an error).
And there is dynamic vs. static typing, which is about when type information is determined: during or before execution.
So now, what is a dynamic language? A language that is interpreted instead of compiled? Surely not, since the way a language is run is never an inherent characteristic of the language, but a pure implementation detail. In fact, there can be interpreters and compilers for one and the same language. There is GHC and GHCi for Haskell; even C has the Ch interpreter.
But then, what are dynamic languages? I'd like to define them by how one works with them.
In a dynamic language, you like to rapidly prototype your program and just get it to work somehow. What you don't want to do is formally specify the behaviour of your programs; you just want them to behave as intended.
Thus if you write
foo = greatFunction(42)
foo.run()
in a scripting language, you'll simply assume that there is some greatFunction taking a number that will return some object you can run. You don't prove this to the compiler in any way: no predetermined types, no IRunnable, etc. This automatically puts you in the domain of dynamic typing.
But there is type inference too. Type inference means that in a statically-typed language, the compiler automatically figures out the types for you. The resulting code can be extremely concise but is still statically typed. Take, for example,
square list = map (\x -> x * x) list
in Haskell. Haskell figures out all the types involved here in advance: list is a list of numbers, map is a function that applies some other function to every element of a list, and square produces a list of numbers from another list of numbers.
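For instance, GHCi reports the inferred type even though the source carries no annotation (the exact type-variable name may differ):

ghci> :t square
square :: Num a => [a] -> [a]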
Nonetheless, the compiler can prove that everything works out in advance, because the operations each thing supports are formally specified. Hence, I'd never call Haskell a scripting language, though it can reach similar levels of expressiveness (if not more!).
So, all in all, scripting languages are dynamically typed because that allows you to prototype a running system without formally specifying every single operation involved, merely assuming it exists, and that is exactly what scripting languages are used for.
I don't quite understand your question. Apart from PHP, VBScript, COMMAND.COM and the Unix shell(s), I can't really think of any loosely typed scripting languages.
Some examples of scripting languages which are not loosely typed are Python, Ruby, Mondrian, JavaFXScript, PowerShell, Haskell, Scala, ELisp, Scheme, AutoLisp, Io, Ioke, Seph, Groovy, Fantom, Boo, Cobra, Guile, Slate, Smalltalk, Perl, …