Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed last year.
Improve this question
I faced with some issue in Haskell: code-style and big functions (I'm continuing to learn Haskell through writing toy-language).
I have some necessary-big-function (see example). And there are two sub-function (nested where) in it. And there is no reason to place its sub-finction in module scoupe.
How "Haskell code-style" or "Haskell code-style best practics" suggest to solve this problem of "ungraceful and clumsy code"?
Function (with later comment):
-- We can delete (on DCE) any useless opers.
-- Useful opers - only opers, whitch determine (directly or transitivery) result of GlobalUse oper
addGlobalUsageStop :: [Var] -> IR -> IR
addGlobalUsageStop guses iR = set iOpers (ios ++ ios') $ set opN opn' iR
where
ios = _iOpers iR
gdefs = _gDefs iR :: M.Map Int Var
opn = _opN iR
guses' = nub $ filter isRegGlobal guses
ogs = catMaybes $ map (concatIOperWithGDef gdefs) $ reverse ios
where
concatIOperWithGDef gdefs' (i, o) = case gdefs' M.!? i of
Nothing -> Nothing
Just gd -> Just (o, gd)
nops = newGUses ogs guses'
where
newGUses [] _ = []
newGUses _ [] = []
newGUses ((Oper _ d _ _, g):os) guses = if elem g guses
then (Oper GlobalUse g (RVar d) None):newGUses os (filter (g /=) guses)
else newGUses os guses
ios' = zip [opn..] nops
opn' = opn + length ios'
Notices:
If you want to know why I even wrote such big function the answer is:
because this is some big (and one-needed functionality in compiler): - for each "returning variable" we shoul find last operation, witch defines it (actually corresponding virtual register), and expand our IR with constructed opers.
I'v seen some similar questions: Haskell nested where clause and "let ... in" syntax
but they are about "how typewrite correct code?", and my question "is this code Code-Style correct, and if it isn't - what should i do?".
And there is no reason to place its sub-function in module scope
Think about it the other way around. Is there any reason to put the sub-function in the local scope? Then do it. This could be because
it needs to access locally bound variables. In this case it must be local, or else you need extra parameters.
it does something very obvious and only relevant to the specific use case. This could be one-line definitions of some operation that you don't care thinking a properly descriptive name, or it could be a go helper that does basically the whole work of the enclosing function.
If neither of these apply, and you can give the local function a descriptive name (as you've already done) then put it in module scope. Add a type signature, which makes it clearer yet. Don't export it from the module.
Putting the function in module scope also removes the need to rename variables like you did with gdefs'. That's one of the more common causes for bugs in Haskell code.
The question is a good one, but the example code isn't a great example. For me, the correct fix in this particular case is not to talk about how to stylishly nest wheres; it's to talk about how to use library functions and language features to simplify the code enough that you don't need where in the first place. In particular, list comprehensions get you very far here. Here's how I would write those two definitions:
import Data.Containers.ListUtils (nubOrdOn)
... where
ogs = [(o, gd) | (i, o) <- reverse ios, Just gd <- [gdefs M.!? i]]
nops = nubOrdOn fun
[ Oper GlobalUse g (RVar d) None
| (Oper _ d _ _, g) <- ogs
, g `elem` guses'
]
fun (Oper _ g _ _) = g -- this seems useful enough to put at the global scope; it may even already exist there
Since ogs isn't mentioned anywhere else in the code, you might consider inlining it:
-- delete the definition of ogs
nops = nubOrdOn fun
[ Oper GlobalUse g (RVar d) None
| (i, Oper _ d _ _) <- reverse ios
, Just g <- [gdefs M.!? i]
, g `elem` guses'
]
Related
Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 6 years ago.
Improve this question
I was working on a version of the problem that's here : http://blog.gja.in/2014/01/functional-programming-101-with-haskell.html#.WIi0J7YrKHo
However if I were to give in an input of chop chop [1,0,0,0], it should return [0,3,0,0]. I tried playing around with the code that was given on the above website however I can't seem to be figuring it out. I've also only started learning Haskell 3-4 days ago so I'm not sure what the right direction is to go on.
Thank you in advance!
I think you misunderstood the intended behavior. As rampion points out,
chop [1,0,0,0] should result in [0,0,0]. I don't understand, where you get [0,3,0,0], but I'll walk you through a little.
First, the definition of chop, with some comments that allow me to refer to each part of the definition.
chop [] = [] --chopNull
chop (1:xs) = xs --chopHead1
chop (n:xs) = (replicate (n - 1) (n - 1)) ++ xs --chopHeadN
You asked about chop chop [1,0,0,0]. This isn't valid code. I'll assume you meant chop (chop [1,0,0,0]). Taking this as the starting point, I'll perform a bit of equational reasoning. That is, I'll transform the program fragment in question by substituting the relevant part of the definition. Each line has a comment indicating how the current line was calculated from the previous one.
chop (chop [1,0,0,0])
= chop (chop (1:0:0:0:[]) --De-sugaring of List
= chop (chop (1:xs)) --Let xs = 0:0:0:[] = [0,0,0]
= chop (xs) --chopHead1
= chop (0:0:0:[]) --def of xs
= (replicate (0 - 1) (0 - 1)) ++ (0:0:[]) --chopHeadN
= [] ++ (0:0:[]) --From definition of replicate
= (0:0:[]) --From defintion of (++)
= [0,0] --re-sugaring
I do a few loose things above. Notably I equate xs to (0:0:0:[]) in the comment. This is just to make it clear how that particular substitution is satisfied by the pattern match in the definition. Next, I used the chopHeadN definition to match the case where n=0, as it is the first thing that matches. You'll have to trust me on the definition of replicate and (++).
So that is what that particular call should be doing. Generally however, if you don't know what a particular function does, it is a good idea to start with some simpler input. For lists, an empty list [] or singleton's, [n] are good starting points. Then move on to two element lists. Like in this example, you can cut out part of a definition and inspect what that part does on known data. Do it yourself in ghci. (Actually, that's what I did for the replicate (0-1) (0-1) expression. I thought it would be an error.)
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
I am learning a haskell for a few days and the laziness is something like buzzword. Because of the fact I am not familiar with laziness ( I have been working mainly with non-functional languages ) it is not easy concept for me.
So, I am asking for any excerise / example which show me what laziness is in the fact.
Thanks in advance ;)
In Haskell you can create an infinite list. For instance, all natural numbers:
[1,2..]
If Haskell loaded all the items in memory at once that wouldn't be possible. To do so you would need infinite memory.
Laziness allows you to get the numbers as you need them.
Here's something interesting: dynamic programming, the bane of every intro. algorithms student, becomes simple and natural when written in a lazy and functional language. Take the example of string edit distance. This is the problem of measuring how similar two DNA strands are or how many bytes changed between two releases of a binary executable or just how 'different' two strings are. The dynamic programming algorithm, expressed mathematically, is simple:
let:
• d_{i,j} be the edit distance of
the first string at index i, which has length m
and the second string at index j, which has length m
• let a_i be the i^th character of the first string
• let b_j be the j^th character of the second string
define:
d_{i,0} = i (0 <= i <= m)
d_{0,j} = j (0 <= j <= n)
d_{i,j} = d_{i - 1, j - 1} if a_i == b_j
d_{i,j} = min { if a_i != b_j
d_{i - 1, j} + 1 (delete)
d_{i, j - 1} + 1 (insert)
d_{i - 1, j - 1} + 1 (modify)
}
return d_{m, n}
And the algorithm, expressed in Haskell, follows the same shape of the algorithm:
distance a b = d m n
where (m, n) = (length a, length b)
a' = Array.listArray (1, m) a
b' = Array.listArray (1, n) b
d i 0 = i
d 0 j = j
d i j
| a' ! i == b' ! j = ds ! (i - 1, j - 1)
| otherwise = minimum [ ds ! (i - 1, j) + 1
, ds ! (i, j - 1) + 1
, ds ! (i - 1, j - 1) + 1
]
ds = Array.listArray bounds
[d i j | (i, j) <- Array.range bounds]
bounds = ((0, 0), (m, n))
In a strict language we wouldn't be able to define it so straightforwardly because the cells of the array would be strictly evaluated. In Haskell we're able to have the definition of each cell reference the definitions of other cells because Haskell is lazy – the definitions are only evaluated at the very end when d m n asks the array for the value of the last cell. A lazy language lets us set up a graph of standing dominoes; it's only when we ask for a value that we need to compute the value, which topples the first domino, which topples all the other dominoes. (In a strict language, we would have to set up an array of closures, doing the work that the Haskell compiler does for us automatically. It's easy to transform implementations between strict and lazy languages; it's all a matter of which language expresses which idea better.)
The blog post does a much better job of explaining all this.
So, I am asking for any excerise / example which show me what laziness is in the fact.
Click on Lazy on haskell.org to get the canonical example. There are many other examples just like it to illustrate the concept of delayed evaluation that benefits from not executing some parts of the program logic. Lazy is certainly not slow, but the opposite of eager evaluation common to most imperative programming languages.
Laziness is a consequence of non-strict function evaluation. Consider the "infinite" list of 1s:
ones = 1:ones
At the time of definition, the (:) function isn't evaluated; ones is just a promise to do so when it is necessary. Such a time would be when you pattern match:
myHead :: [a] -> a
myHead (x:rest) = x
When myHead ones is called, x and rest are needed, but the pattern match against 1:ones simply binds x to 1 and rest to ones; we don't need evaluate ones any further at this time, so we don't.
The syntax for infinite lists, using the .. "operator" for arithmetic sequences, is sugar for calls to enumFrom and enumFromThen. That is
-- An infintite list of ones
ones = [1,1..] -- enumFromThen 1 1
-- The natural numbers
nats = [1..] -- enumFrom 1
so again, laziness just comes from the non-strict evaluation of enumFrom.
Unlike with other languages, Haskell decouples the creation and definition of an object.... You can easily watch this in action using Debug.Trace.
You can define a variable like this
aValue = 100
(the value on the right hand side could include a complicated evaluation, but let's keep it simple)
To see if this code ever gets called, you can wrap the expression in Debug.Trace.trace like this
import Debug.Trace
aValue = trace "evaluating aValue" 100
Note that this doesn't change the definition of aValue, it just forces the program to output "evaluating aValue" whenever this expression is actually created at runtime.
(Also note that trace is considered unsafe for production code, and should only be used to debug).
Now, try two experiments.... Write two different mains
main = putStrLn $ "The value of aValue is " ++ show aValue
and
main = putStrLn "'sup"
When run, you will see that the first program actually creates aValue (you will see the "creating aValue" message, while the second does not.
This is the idea of laziness.... You can put as many definitions in a program as you want, but only those that are used will be actually created at runtime.
The real use of this can be seen with objects of infinite size. Many lists, trees, etc. have an infinite number of elements. Your program will use only some finite number of values, but you don't want to muddy the definition of the object with this messy fact. Take for instance the infinite lists given in other answers here....
[1..] -- = [1,2,3,4,....]
You can again see laziness in action here using trace, although you will have to write out a variant of [1..] in an expanded form to do this.
f::Int->[Int]
f x = trace ("creating " ++ show x) (x:f (x+1)) --remember, the trace part doesn't change the expression, it is just used for debugging
Now you will see that only the elements you use are created.
main = putStrLn $ "the list is " ++ show (take 4 $ f 1)
yields
creating 1
creating 2
creating 3
creating 4
the list is [1,2,3,4]
and
main = putStrLn "yo"
will not show any item being created.
For instance:
let x = 1 in putStrLn [dump|x, x+1|]
would print something like
x=1, (x+1)=2
And even if there isn't anything like this currently, would it be possible to write something similar?
TL;DR There is this package which contains a complete solution.
install it via cabal install dump
and/or
read the source code
Example usage:
{-# LANGUAGE QuasiQuotes #-}
import Debug.Dump
main = print [d|a, a+1, map (+a) [1..3]|]
where a = 2
which prints:
(a) = 2 (a+1) = 3 (map (+a) [1..3]) = [3,4,5]
by turnint this String
"a, a+1, map (+a) [1..3]"
into this expression
( "(a) = " ++ show (a) ++ "\t " ++
"(a+1) = " ++ show (a + 1) ++ "\t " ++
"(map (+a) [1..3]) = " ++ show (map (+ a) [1 .. 3])
)
Background
Basically, I found that there are two ways to solve this problem:
Exp -> String The bottleneck here is pretty-printing haskell source code from Exp and cumbersome syntax upon usage.
String -> Exp The bottleneck here is parsing haskell to Exp.
Exp -> String
I started out with what #kqr put together, and tried to write a parser to turn this
["GHC.Classes.not x_1627412787 = False","x_1627412787 = True","x_1627412787 GHC.Classes.== GHC.Types.True = True"]
into this
["not x = False","x = True","x == True = True"]
But after trying for a day, my parsec-debugging-skills have proven insufficient to date, so instead I went with a simple regular expression:
simplify :: String -> String
simplify s = subRegex (mkRegex "_[0-9]+|([a-zA-Z]+\\.)+") s ""
For most cases, the output is greatly improved.
However, I suspect this to likely mistakenly remove things it shouldn't.
For example:
$(dump [|(elem 'a' "a.b.c", True)|])
Would likely return:
["elem 'a' \"c\" = True","True = True"]
But this could be solved with proper parsing.
Here is the version that works with the regex-aided simplification: https://github.com/Wizek/kqr-stackoverflow/blob/master/Th.hs
Here is a list of downsides / unresolved issues I've found with the Exp -> String solution:
As far as I know, not using Quasi Quotation requires cumbersome syntax upon usage, like: $(d [|(a, b)|]) -- as opposed to the more succinct [d|a, b|]. If you know a way to simplify this, please do tell!
As far as I know, [||] needs to contain fully valid Haskell, which pretty much necessitates the use of a tuple inside further exacerbating the syntactic situation. There is some upside to this too, however: at least we don't need to scratch our had where to split the expressions since GHC does that for us.
For some reason, the tuple only seemed to accept Booleans. Weird, I suspect this should be possible to fix somehow.
Pretty pretty-printing Exp is not very straight-forward. A more complete solution does require a parser after all.
Printing an AST scrubs the original formatting for a more uniform looks. I hoped to preserve the expressions letter-by-letter in the output.
The deal-breaker was the syntactic over-head. I knew I could get to a simpler solution like [d|a, a+1|] because I have seen that API provided in other packages. I was trying to remember where I saw that syntax. What is the name...?
String -> Exp
Quasi Quotation is the name, I remember!
I remembered seeing packages with heredocs and interpolated strings, like:
string = [qq|The quick {"brown"} $f {"jumps " ++ o} the $num ...|]
where f = "fox"; o = "over"; num = 3
Which, as far as I knew, during compile-time, turns into
string = "The quick " ++ "brown" ++ " " ++ $f ++ "jumps " ++ o ++ " the" ++ show num ++ " ..."
where f = "fox"; o = "over"; num = 3
And I thought to myself: if they can do it, I should be able to do it too!
A bit of digging in their source code revealed the QuasiQuoter type.
data QuasiQuoter = QuasiQuoter {quoteExp :: String -> Q Exp}
Bingo, this is what I want! Give me the source code as string! Ideally, I wouldn't mind returning string either, but maybe this will work. At this point I still know quite little about Q Exp.
After all, in theory, I would just need to split the string on commas, map over it, duplicate the elements so that first part stays string and the second part becomes Haskell source code, which is passed to show.
Turning this:
[d|a+1|]
into this:
"a+1" ++ " = " ++ show (a+1)
Sounds easy, right?
Well, it turns out that even though GHC most obviously is capable to parse haskell source code, it doesn't expose that function. Or not in any way we know of.
I find it strange that we need a third-party package (which thankfully there is at least one called haskell-src-meta) to parse haskell source code for meta programming. Looks to me such an obvious duplication of logic, and potential source of mismatch -- resulting in bugs.
Reluctantly, I started looking into it. After all, if it is good enough for the interpolated-string folks (those packaged did rely on haskell-src-meta) then maybe it will work okay for me too for the time being.
And alas, it does contain the desired function:
Language.Haskell.Meta.Parse.parseExp :: String -> Either String Exp
Language.Haskell.Meta.Parse
From this point it was rather straightforward, except for splitting on commas.
Right now, I do a very simple split on all commas, but that doesn't account for this case:
[d|(1, 2), 3|]
Which fails unfortunatelly. To handle this, I begun writing a parsec parser (again) which turned out to be more difficult than anticipated (again). At this point, I am open to suggestions. Maybe you know of a simple parser that handles the different edge-cases? If so, tell me in a comment, please! I plan on resolving this issue with or without parsec.
But for the most use-cases: it works.
Update at 2015-06-20
Version 0.2.1 and later correctly parses expressions even if they contain commas inside them. Meaning [d|(1, 2), 3|] and similar expressions are now supported.
You can
install it via cabal install dump
and/or
read the source code
Conclusion
During the last week I've learnt quite a bit of Template Haskell and QuasiQuotation, cabal sandboxes, publishing a package to hackage, building haddock docs and publishing them, and some things about Haskell too.
It's been fun.
And perhaps most importantly, I now am able to use this tool for debugging and development, the absence of which has been bugging me for some time. Peace at last.
Thank you #kqr, your engagement with my original question and attempt at solving it gave me enough spark and motivation to continue writing up a full solution.
I've actually almost solved the problem now. Not exactly what you imagined, but fairly close. Maybe someone else can use this as a basis for a better version. Either way, with
{-# LANGUAGE TemplateHaskell, LambdaCase #-}
import Language.Haskell.TH
dump :: ExpQ -> ExpQ
dump tuple =
listE . map dumpExpr . getElems =<< tuple
where
getElems = \case { TupE xs -> xs; _ -> error "not a tuple in splice!" }
dumpExpr exp = [| $(litE (stringL (pprint exp))) ++ " = " ++ show $(return exp)|]
you get the ability to do something like
λ> let x = True
λ> print $(dump [|(not x, x, x == True)|])
["GHC.Classes.not x_1627412787 = False","x_1627412787 = True","x_1627412787 GHC.Classes.== GHC.Types.True = True"]
which is almost what you wanted. As you see, it's a problem that the pprint function includes module prefixes and such, which makes the result... less than ideally readable. I don't yet know of a fix for that, but other than that I think it is fairly usable.
It's a bit syntactically heavy, but that is because it's using the regular [| quote syntax in Haskell. If one wanted to write their own quasiquoter, as you suggest, I'm pretty sure one would also have to re-implement parsing Haskell, which would suck a bit.
I'm currently reading Implementing functional languages: a tutorial by SPJ and the (sub)chapter I'll be referring to in this question is 3.8.7 (page 136).
The first remark there is that a reader following the tutorial has not yet implemented C scheme compilation (that is, of expressions appearing in non-strict contexts) of ECase expressions.
The solution proposed is to transform a Core program so that ECase expressions simply never appear in non-strict contexts. Specifically, each such occurrence creates a new supercombinator with exactly one variable which body corresponds to the original ECase expression, and the occurrence itself is replaced with a call to that supercombinator.
Below I present a (slightly modified) example of such transformation from 1
t a b = Pack{2,1} ;
f x = Pack{2,2} (case t x 7 6 of
<1> -> 1;
<2> -> 2) Pack{1,0} ;
main = f 3
== transformed into ==>
t a b = Pack{2,1} ;
f x = Pack{2,2} ($Case1 (t x 7 6)) Pack{1,0} ;
$Case1 x = case x of
<1> -> 1;
<2> -> 2 ;
main = f 3
I implemented this solution and it works like charm, that is, the output is Pack{2,2} 2 Pack{1,0}.
However, what I don't understand is - why all that trouble? I hope it's not just me, but the first thought I had of solving the problem was to just implement compilation of ECase expressions in C scheme. And I did it by mimicking the rule for compilation in E scheme (page 134 in 1 but I present that rule here for completeness): so I used
E[[case e of alts]] p = E[[e]] p ++ [Casejump D[[alts]] p]
and wrote
C[[case e of alts]] p = C[[e]] p ++ [Eval] ++ [Casejump D[[alts]] p]
I added [Eval] because Casejump needs an argument on top of the stack in weak head normal form (WHNF) and C scheme doesn't guarantee that, as opposed to E scheme.
But then the output changes to enigmatic: Pack{2,2} 2 6.
The same applies when I use the same rule as for E scheme, i.e.
C[[case e of alts]] p = E[[e]] p ++ [Casejump D[[alts]] p]
So I guess that my "obvious" solution is inherently wrong - and I can see that from outputs. But I'm having trouble stating formal arguments as to why that approach was bound to fail.
Can someone provide me with such argument/proof or some intuition as to why the naive approach doesn't work?
The purpose of the C scheme is to not perform any computation, but just delay everything until an EVAL happens (which it might or might not). What are you doing in your proposed code generation for case? You're calling EVAL! And the whole purpose of C is to not call EVAL on anything, so you've now evaluated something prematurely.
The only way you could generate code directly for case in the C scheme would be to add some new instruction to perform the case analysis once it's evaluated.
But we (Thomas Johnsson and I) decided it was simpler to just lift out such expressions. The exact historical details are lost in time though. :)
Classic way to define Haskell functions is
f1 :: String -> Int
f1 ('-' : cs) -> f1 cs + 1
f1 _ = 0
I'm kinda unsatisfied writing function name at every line. Now I usually write in the following way, using pattern guards extension and consider it more readable and modification friendly:
f2 :: String -> Int
f2 s
| '-' : cs <- s = f2 cs + 1
| otherwise = 0
Do you think that second example is more readable, modifiable and elegant? What about generated code? (Haven't time to see desugared output yet, sorry!). What are cons? The only I see is extension usage.
Well, you could always write it like this:
f3 :: String -> Int
f3 s = case s of
('-' : cs) -> f3 cs + 1
_ -> 0
Which means the same thing as the f1 version. If the function has a lengthy or otherwise hard-to-read name, and you want to match against lots of patterns, this probably would be an improvement. For your example here I'd use the conventional syntax.
There's nothing wrong with your f2 version, as such, but it seems a slightly frivolous use of a syntactic GHC extension that's not common enough to assume everyone will be familiar with it. For personal code it's not a big deal, but I'd stick with the case expression for anything you expect other people to be reading.
I prefer writing function name when I am pattern matching on something as is shown in your case. I find it more readable.
I prefer using guards when I have some conditions on the function arguments, which helps avoiding if else, which I would have to use if I was to follow the first pattern.
So to answer your questions
Do you think that second example is more readable, modifiable and elegant?
No, I prefer the first one which is simple and readable. But more or less it depends on your personal taste.
What about generated code?
I dont think there will be any difference in the generated code. Both are just patternmatching.
What are cons?
Well patternguards are useful to patternmatch instead of using let or something more cleanly.
addLookup env var1 var2
| Just val1 <- lookup env var1
, Just val2 <- lookup env var2
= val1 + val2
Well the con is ofcourse you need to use an extension and also it is not Haskell98 (which you might not consider much of a con)
On the other hand for trivial pattern matching on function arguments I will just use the first method, which is simple and readable.