Why are the below layout parsed correctly in Haskell? - haskell

I am testing out my understanding of the layout-parsing function in Haskell report (Here)
I could understand that:
test case 1 will pass due to good alignment
test case 2 will fail because "in a + b" is considered a new item at module-level
However, I could not understand why test case 3 would be correctly parsed. So, Questions:
Why will test-case 3 be correctly parsed?
Which pattern in the LHS of the parsing function L (see Here) does test-case 3 match?
-- test case 1
f_1 = let a = 1
b = 2
in a + b
-- test case 2
f_2 = let a = 1
b = 2
in a + b
-- test case 3
f_3 = let a = 1
b = 2
in a + b

Test case 3 matches the parse-error(t) rule. Because the token in is not legal at that point in the let block, a } is inserted before the in to end it.
The parse-error rule can be confusing, but it is also very flexible; using it you can e.g. write Haskell one-liners with rarely any explicit {} at all.

Related

Why i = 1, and i = i + 1 would become 2?

i = 1
i = i + 1
print(i)
I am pretty confused about the code's logic. Why would i eventually become 2?
Lets begin with the first assignment:
i = 1
This creates the variable i and initialize it to the integer value 1.
Then we get to what you seem to have problem understanding:
i = i + 1
This statement can be split into two parts:
The addition
The assignment
The addition i + 1 will take the current values of the variable i, which is 1, and add the value 1 to that. In essence the expression i + 1 is the same as 1 + 1.
The result of the addition will be 2. And this result is then assigned to the variable i, making the value of i be equal to 2.
You then print the (new) current value of i:
print(i)
This will of course print the value 2.
The difference is that one modifies the data-structure itself (in-place operation) b += 1 while the other just reassigns the variable a = a + 1.
Just for completeness:
x += y is not always doing an in-place operation, there are (at least) three exceptions:
If x doesn't implement an __iadd__ method then the x += y statement is just a shorthand for x = x + y. This would be the case if x was something like an int
If __iadd__ returns NotImplemented, Python falls back to x = x + y.
The __iadd__ method could theoretically be implemented to not work in place. It'd be really weird to do that, though.
As it happens your bs are numpy.ndarrays which implements __iadd__ and return itself so your second loop modifies the original array in-place.
You can read more on this in the Python documentation of "Emulating Numeric Types".
'i' is a variable which stored 1 if We add 1 again in 'i' that means
i=1;
i+1 means 1+1=2
i=1
i=i+1// i has already 1 and here we are adding 1 again so result will be 2.
hope you understood.
Let's start from i = 1. So you are assigning i to 1. Now your situation is:
i = i + 1
So if i is 1, then the abovementioned code would be "translated" to:
i = 1 + 1
That's why i = i + 1 is equal to 2.

How to fix indentation problem with haskell if statement

I have the following Haskell code:
f :: Int -> Int
f x =
let var1 = there in
case (there) of
12 -> 0
otherwise | (there - 1) >= 4 -> 2
| (there + 1) <= 2 -> 3
where there = 6
The function alone is garbage, ignore what exactly it does.
I want to replace the guards with if
f x =
let var1 = there in
case (there) of
12 -> 0
otherwise -> if (there - 1) >= 4 then 2
else if (there + 1) <= 2 then 3
where there = 6
I tried moving the if to the next line, the then to the next line, lining them up, unlining them, but nothing seems to work.
I get a parsing error and I don't know how to fix it:
parse error (possibly incorrect indentation or mismatched brackets)
|
40 | where there = 6
| ^
You have a few misunderstandings in here. Let's step through them starting from your original code:
f x =
A function definition, but the function never uses the parameter x. Strictly speaking this is a warning and not an error, but most code bases will use -Werror so consider omitting the parameter or using _ to indicate you are explicitly ignoring the variable.
let var1 = there in
This is unnecessary - again you are not using var1 (the below used there) so why have it?
case (there) of
Sure. Or just case there of, not need for excessive parens cluttering up the code.
12 -> 0
Here 12 is a pattern match, and it's fine.
otherwise ->
Here you used the variable name otherwise as a pattern which will uncondtionally match the value there. This is another warning: otherwise is a global value equal to True so it can be used in guards, such as function foo | foo < 1 = expr1 ; | otherwise = expr2. Your use is not like that, using otherwise as a pattern shadows the global value. Instead consider the catch all pattern with underscore:
_ -> if (there - 1) >= 4
then 2
else if (there + 1) <= 2
then 3
where there = 6
Ok... what if there was equal to 3? 3-1 is not greater than 4. 3+1 is not less than 2. You always need an else with your if statement. There is no if {} in Haskell instead there is if ... else ... much like the ternary operator in C, as explained in the Haskell wiki.

what is this shift used in the simplified galil seiferas string match algorithm?

I'm self-studying problem 32-1 in CLRS; part c), presents the following algorithm for string matching:
REPETITION-MATCHER(P, T)
m = P.length
n = T.length
k = 1 + ρ'(P)
q = 0
s = 0
while s <= n-m
if T[s+q+1] == P[q+1]
q = q+1
if q==m
print "Pattern occurs with shift" s
if q==m or T[s+q+1] != P[q+1]
s = s+max(1, ceil(q/k))
q = 0
Here, ρ'(P), which is a function of P only, is defined as the largest integer r such that some prefix P[1..i] = y^r, e.g. a substring y repeated r times.
This algorithm appears to be 95 percent similar to the naive brute-force string matcher. However, the one part which greatly confuses me, and which seems to be the centerpiece of the entire algorithm, is the second to last line. Here, q is the number of characters of P matched so far. What is the rationale behind ceil(q/k)? It is completely opaque to me. It would have made more sense if that line were something like s = s + max(1+q, 1+i), where i is the length of the prefix that gives rise to ρ'(P).
CLRS claims that this algorithm is due to Galil and Seiferas, but in the reference they provide, I cannot find anything that resembles the algorithm provided above. It appears that reference contains, if anything, a much more advanced version of what is here. Can someone explain this ceil(q/k) value, and/or point me toward a reference that describes this particular algorithm, instead of the more well-known main Galil Seiferas paper?
Example #1:
Match aaaa in aaaaab, here ρ' = 4. Consider state:
aaaa ab
^
We have a mismatch here, and we want to move forward by one symbol, no more, because we will match full pattern again (last line sets q to zero). q = 4 and k = 5, so ceil(q/k) = 1, that's all right.
Example #2: Match abcd.abcd.abcd.X in abcd.abcd.abcd.abcd.X. Consider state:
abcd.abcd.abcd. abcd.X
^
We have a mismatch here, and we would like to move forward by five symbols. q = 15 and k = 4, so ceil(q/k) = 4. That's ok, it is almost 5, we still can match our pattern. Had we bigger ρ', say 10, we would have ceil(50/(10+1)) = 5.
Yeh, algorithms skips forward less symbols than KMP does, in case ρ'=10 its running time is O(10n+m) while KMP has O(n+m).
I figured out the proof of correctness.
let k = ρ'(P) + 1, and ρ'(P) is the largest possible repetition factor out of all the prefixes of P.
Suppose T[s+1..s+q] = P[1..q], and either q=m or T[s+q+1] != P[q+1]
Then, for 1 <= j <= floor(q/k) (except for the case q=m and m mod k = 0, in which the upper limit must be ceil(m/k)), we have
T[s+1..s+j] = P[1..j]
T[s+j+1..s+2j] = P[j+1..2j]
T[s+2j+1..s+3j] = P[2j+1..3j]
...
T[s+(k-1)j+1..s+kj] = P[(k-1)j+1..kj]
where not every quantity on every line is equal, since k cannot be a repetition factor, since the largest possible repetition factor out of any prefix of P is k-1.
Suppose we now make a comparison at shift s' = s+j, so that we will make the following comparisons
T[s+j+1..s+2j] with P[1..j]
T[s+2j+1..s+3j] with P[j+1..2j]
T[s+3j+1..s+4j] with P[2j+1..3j]
...
T[s+kj+1..s+(k+1)j] with P[(k-1)j+1..kj]
We claim that not every comparison can match, e.g. at least one of the above "with"s must be replaced with !=. We prove by contradiction. Suppose every "with" above is replaced by =. Then, comparing to the first set of comparisons we did, we would immediately have the following:
P[1..j] = P[j+1..2j]
P[j+1..2j] = [2j+1..3j]
...
P[(k-2)j+1..(k-1)j] = P[(k-1)j+1..kj]
However, this cannot be true, because k is not a repetition factor, hence a contradiction.
Hence, for any 1 <= j <= floor(q/k), testing a new shift s'=s+j is guaranteed to mismatch.
Hence, the smallest shift that is possible to result in a match is s + floor(q/k) + 1 >= ceil(q/k).
Note the code uses ceil(q/k) for simplicity, solely to deal with the case that q = m and m mod k = 0, in which case k * (floor(q/k)+1) would be greater than m, so only ceil(q/k) would do. However, when q mod k = 0 and q < m, then ceil(q/k) = floor(q/k), so is slightly suboptimal, since that shift is guaranteed to fail, and floor(q/k)+1 is the first shift that has any chance of matching.

Fortran nested WHERE statement

I have a Fortran 90 source code with a nested WHERE statement. There is a problem but it seems difficult to understand what exactly happens. I would like to transform it into DO-IF structure in order to debug. What it is not clear to me is how to translate the nested WHERE.
All the arrays have the same size.
WHERE (arrayA(:) > 0)
diff_frac(:) = 1.5 * arrayA(:)
WHERE (diff_frac(:) > 2)
arrayC(:) = arrayC(:) + diff_frac(:)
ENDWHERE
ENDWHERE
My option A:
DO i=1, SIZE(arrayA)
IF (arrayA(i) > 0) THEN
diff_frac(i) = 1.5 * arrayA(i)
DO j=1, SIZE(diff_frac)
IF (diff_frac(j) > 2) THEN
arrayC(j) = arrayC(j) + diff_frac(j)
ENDIF
ENDDO
ENDIF
ENDDO
My option B:
DO i=1, SIZE(arrayA)
IF (arrayA(i) > 0) THEN
diff_frac(i) = 1.5 * arrayA(i)
IF (diff_frac(i) > 2) THEN
arrayC(i) = arrayC(i) + diff_frac(i)
ENDIF
ENDIF
ENDDO
Thank you
According to the thread "Nested WHERE constructs" in comp.lang.fortran (particularly Ian's reply), it seems that the first code in the Question translates to the following:
do i = 1, size( arrayA )
if ( arrayA( i ) > 0 ) then
diff_frac( i ) = 1.5 * arrayA( i )
endif
enddo
do i = 1, size( arrayA )
if ( arrayA( i ) > 0 ) then
if ( diff_frac( i ) > 2 ) then
arrayC( i ) = arrayC( i ) + diff_frac( i )
endif
endif
enddo
This is almost the same as that in Mark's answer except for the second mask part (see below). Key excerpts from the F2008 documents are something like this:
7.2.3 Masked array assignment – WHERE (page 161)
7.2.3.2 Interpretation of masked array assignments (page 162)
... 2. Each statement in a WHERE construct is executed in sequence.
... 4. The mask-expr is evaluated at most once.
... 8. Upon execution of a WHERE statement that is part of a where-body-construct, the control mask is established to have the value m_c .AND. mask-expr.
... 10. If an elemental operation or function reference occurs in the expr or variable of a where-assignment-stmt or in a mask-expr, and is not within the argument list of a nonelemental function reference, the operation is performed or the function is evaluated only for the elements corresponding to true values of the control mask.
If I understand the above thread/documents correctly, the conditional diff_frac( i ) > 2 is evaluated after arrayA( i ) > 0, so corresponding to double IF blocks (if I assume that A .and. B in Fortran does not specify the order of evaluation).
However, as noted in the above thread, the actual behavior may depend on compilers... For example, if we compile the following code with gfortran5.2, ifort14.0, or Oracle fortran 12.4 (with no options)
integer, dimension(4) :: x, y, z
integer :: i
x = [1,2,3,4]
y = 0 ; z = 0
where ( 2 <= x )
y = x
where ( 3.0 / y < 1.001 ) !! possible division by zero
z = -10
end where
end where
print *, "x = ", x
print *, "y = ", y
print *, "z = ", z
they all give the expected result:
x = 1 2 3 4
y = 0 2 3 4
z = 0 0 -10 -10
But if we compile with debugging options
gfortran -ffpe-trap=zero
ifort -fpe0
f95 -ftrap=division (or with -fnonstd)
gfortran and ifort abort with floating-point exception by evaluating y(i) = 0 in the mask expression, while f95 runs with no complaints. (According to the linked thread, Cray behaves similarly to gfortran/ifort, while NAG/PGI/XLF are similar to f95.)
As a side note, when we use "nonelemental" functions in WHERE constructs, the control mask does not apply and all the elements are used in the function evaluation (according to Sec. 7.2.3.2, sentence 9 of the draft above). For example, the following code
integer, dimension(4) :: a, b, c
a = [ 1, 2, 3, 4 ]
b = -1 ; c = -1
where ( 3 <= a )
b = a * 100
c = sum( b )
endwhere
gives
a = 1 2 3 4
b = -1 -1 300 400
c = -1 -1 698 698
which means that sum( b ) = 698 is obtained from all the elements of b, with the two statements evaluated in sequence.
Why not
WHERE (arrayA(:) > 0)
diff_frac(:) = 1.5 * arrayA(:)
ENDWHERE
WHERE (diff_frac(:) > 2 .and. arrayA(:) > 0)
arrayC(:) = arrayC(:) + diff_frac(:)
ENDWHERE
?
I won't say it can't be done with nested wheres, but I don't see why it has to be. Then, if you must translate to do loops, the translation is very straightforward.
Your own attempts suggest you think of where as a kind of looping construct, I think it's better to think of it as a masked assignment (which is how it's explained in the language standard) in which each individual assignment happens at the same time. These days you might consider translating into do concurrent constructs.
Sorry about deflecting the question a bit, but this is interesting. I am not sure that I can tell how the nested where is going to be compiled. It may even be one of those cases that push the envelope.
I agree with High Performance Mark that where is best thought of as a masking operation and then it is unclear (to me) whether your "A" or "B" will result.
I do think that his solution should be the same as your nested where.
My point: Since this is tricky to even discern, can you write new code instead of this, from scratch? Not to translate it, but delete it, forget about it, and write code to do the job.
If you know exactly what this piece of code needs to do, its pre- and post- conditions, then it shouldn't be difficult. If you don't know that then the algorithm may be too entangled in which case this should be rewritten anyway. There may be subtleties involved between what this was intended to do and what it does. You say you are debugging this code already.
Again, sorry to switch context but I think that there is a possibility that this is one of those situations where code is best served by a complete rewrite.
If you want to keep it and only write loops for debugging: Why not write them and compare output?
Run it with where as it is, then run it with "A" instead, then with "B". Print values.

How to define a variable based on an if/then/else statement

I'm trying to translate some python code to haskell. However I reached a point where I'm not sure how to proceed.
if len(prod) % 2 == 0:
ss = float(1.5 * count_vowels(cust))
else:
ss = float(count_consonants(cust)) # muliplicaton by 1 is implied.
if len(cust_factors.intersection(prod_factors)) > 0:
ss *= 1.5
return ss
I've tried to translate it to this:
if odd length prod
then ss = countConsonants cust
else ss = countVowels cust
if length (cust intersect prod) > 0
then ss = 1.5 * ss
else Nothing
return ss
But I keep getting errors of:
parse error on input `='
Any help or words of wisdom on this would be greatly appreciated.
Don't think of programming in Haskell as "if this, then do that, then do the other thing" — the entire idea of doing things in a sequence is imperative. You're not checking a condition and then defining a variable — you're just calculating a result that depends on a condition. In functional programming, if is an expression and variables are assigned the result of an expression, not assigned inside it.
The most direct translation would be:
let ss = if odd $ length prod
then countConsonants cust
else countVowels cust
in if length (cust `intersect` prod) > 0
then Just $ 1.5 * ss
else Nothing
In Haskell, if is an expression, not a statement. This means it returns a value (like a function) instead of performing an action. Here's one way to translate your code:
ss = if odd length prod
then countConsinants cust
else countVowels cust
return if length ( cust intersect prod) > 0
then Just $ 1.5 * ss
else Nothing
Here's another way:
return if length ( cust intersect prod) > 0
then Just $ 1.5 * if odd length prod
then countConsinants cust
else countVowels cust
else Nothing
As Matt has pointed out, however, your Python code doesn't return None. Every code path sets ss to a number. If this is how it's supposed to work, here's a Haskell translation:
let ss = if odd $ length prod
then countConsonants cust
else countVowels cust
in if length (cust `intersect` prod) > 0
then 1.5 * ss
else ss
If I were you I'd use guards. Maybe I'm a Haskell heathen.
ss prod prodfactors cust | even $ length prod = extratest . (1.5 *) . countvowels cust
| otherwise = extratest . countconsonants cust
where extratest curval | custfactorsintersection prodfactors > 0 = curval * 1.5
| otherwise = curval
I would write it like this in Haskell:
if (not $ null $ cust_factors `intersect` prod_factors)
then ss * 1.5
else ss
where
ss = if (even $ length prod)
then 1.5 * count_vowels cust
else count_cosinants cust
Some comments about what you wrote:
You can do assignment in Haskell using the let and where syntax. In general everything you write in Haskell are expressions. In your case you have to write the whole thing as a single expression and using let or where simplifies that task.
return in Haskell means something different than in Python, it's used for computations with side effects (like IO). For your example there is no need for it.
Nothing is a special value of the type Maybe a. This type represents values of type a with possible failure (the Nothing).
And to answer your direct question.
This Python code
if b:
s = 1
else:
s = 2
would be translated to Haskell to s = if b then 1 else 2 inside a let or where clause.
Functional programming is different from imperative programming. Trying to "translate" line by line isn't how Haskell is meant to be used.
To specifically answer your question. "ss" already has a value. It simply isn't possible to give it a different value. ss = ss * 1.5 makes no sense.

Resources