Performance of a function that changes functions

Performance of a function that changes functions - haskell

Lets suppose we have a function
type Func = Bool -> SophisticatedData
fun1 :: Func
And we'd like to change this function some input:
change :: SophisticatedData -> Func -> Func
change data func = \input -> if input == False then data else func input
Am I right that after several calls of change (endFunc = change data1 $ change data2 $ startFunc) resulting function would call all intermediate ones each time? Am I right that GC wouldn't able to delete unused data? What is the haskell way to cope with this task?
Thanks.

Well let's start by cleaning up change to be a bit more legible
change sd f input = if input then func input else sd
So when we compose these
change d1 $ change d2 $ change d3
GHC starts by storing a thunk for each of them. Remember that $ is a function to so the whole change d* thing is going to be a thunk at first. Thunks are relatively cheap and if you're not creating 10k or so of them at once you'll be just fine :) so no worries there.
Now the question is, what happens when you start evaluating, the answer is, it'll still not evaluate the complex data, so it's still quite memory efficient, and it only needs to force input to determine which branch it's taking. Because of this, you should never actually fully evaluate SophisticatedData until after choose has run and returned a one to you, then it will be evaluated as need if you use it.
Further more, at each step, GHC can garbage collect the unneeded thunks since they can't be referenced anymore.
In conclusion, you should be just fine. Trust in the laziness

You are correct: if foo is a chain of O(n) calls to change, there will be O(n) overhead on every call to foo. The way to deal with this is to memoize foo:
memoize :: Func -> Func
memoize f = \x -> if x then fTrue else fFalse where
fTrue = f True
fFalse = f False

Related

Haskell recursion loop counter

Is there anyway I could check the number of recursion the program is at. For example, I want to stop the recursion after 2 times. Is there anyway to do that in haskell.

Yes, there is but...
Stopping recursion is normally done when some final state is achieved, something like "I've run out of data to process" or "I have reached the base case." When I see something as arbitrary as "after 2 times" I want to ask where you came up with 2.
However in the interest of answering the question asked:
You need to pass in a counter to the recursive function and bail out when you have done the required number of cycles. For cases like this where the number of cycles is not a matter external to the function, it is typically the case that one creates an auxilary function to introduce it.
myFunction :: Value -> Value
myFunction init = recurse 2 init
where
recurse :: Int -> Value -> Value
recurse 0 result = result
recurse n intermediate = recurse (n-1) (someFun intermediate)

Does Data.Map work in a pass-by-value or pass-by-reference way? (Better explanation inside)

I have a recursive function working within the scope of strictly defined interface, so I can't change the function signatures.
The code compiles fine, and even runs fines without error. My problem is that it's a large result set, so it's very hard to test if there is a semantic error.
My primary question is: In a sequence of function calls A to B to A to B to breaking condition, considering the same original Map is passed to all functions until breaking condition, and that some functions only return an Integer, would an insert on a Map in a function that only returns an Integer still be reflected once control is returned to the first function?
primaryFunc :: SuperType -> MyMap -> (Integer, MyMap)
primaryFunc (SubType1 a) mapInstance = do
let returnInt = func1 a mapInstance
(returnInt, mapInstance)
primaryFunc (SubType2 c) mapInstance = do
let returnInt = primaryFunc_nonprefix_SuperType c mapInstance
let returnSuperType = (Const returnInt)
let returnTable = H.insert c returnSuperType mapInstance
(returnInt, returnTable)
primaryFunc (ConstSubType d) mapInstance = do
let returnInt = d
(returnInt, mapInstance)
func1 :: SubType1 -> MyMap -> Integer
func1 oe vt = do
--do stuff with input and map data to get return int
returnInt = primaryFunc
returnInt
func2 :: SubType2 -> MyMap -> Integer
func2 pe vt = do
--do stuff with input and map data to get return int
returnInt = primaryFunc
returnInt

Your question is almost impossibly dense and ambiguous, but it should be possible to answer what you term your "primary" question from the simplest first principles of Haskell:
No Haskell function updates a value (e.g. a map). At most it can return a modified copy of its input.
Outside of the IO monad, no function can have side effects. No function can affect the value of any variable assigned before it was called; all it can do is return a value.
So if you pass a map as a parameter to a function, nothing the function does can alter your existing reference to that value. If you want an updated value, you can only get that from the output of a function to which you have passed the original value as input. New value, new reference.
Because of this, you should have absolute clarity at any depth within your web of functions about which value you are working with. Knowing this, you should be able to answer your own question. Frankly, this is such a fundamental characteristic of Haskell that I am perplexed that you even need to ask.
If a function only returns an integer, then any operations you perform on any values made available to the function can only affect the output - that is, the integer value returned. Nothing done within the function can affect anything else (short of causing the whole program to crash).
So if function A has a reference to a map and it passes this value to function B which returns an int, nothing function B does can affect A's copy of the map. If function B were allowed to secretly alter A's copy of the map, that would be a side effect. Side effects are not allowed.
You need to understand that Haskell does not have variables as you understand them. It has immutable values, references to immutable values and functions (which take inputs and return new outputs). Functions do not have variables which are in scope for other functions which might alter those variables on the fly. That cannot happen.
As an aside, not only does the code you posted show that you do not understand the basics of Haskell syntax, the question you asked shows that you haven't understood the primary characteristics of Haskell as a language. Not only are these fundamentals things which can be understood before having learned any syntax, they are things you need to know to make sense of the syntax.
If you have a deadline, meet it using a tool you do understand. Then go learn Haskell properly.

In addition, you will find that
an insert on a Map in a function that only returns an Integer
is nearly impossible to express. Yes, you can technically do it like in
insert k v map `seq` 42 -- force an insert and throw away the result
but if you think that, for example:
let done = insert k v map in 42
does anything with the map, you're probably wrong.
In no case, however, is the original map altered.

Using Netwire to spawn lists from a value

I think I'm fundamentally misunderstanding how to attack this type of problem with Netwire:
I have the following test-case:
I'd like to take a string, split it into lines, print each line, then exit.
The pieces I'm missing are:
How to inhibit after a value early in a pipeline, but then if that value is split and the results produced later, not inhibit until all those results are consumed.
What the main loop function should look like
If I should use 'once'
Here is the code I have so far:
import Control.Wire
main :: IO ()
main = recur mainWire
recur :: Wire () IO () () -> IO ()
recur a = do
(e,w) <- stepWire a 0 ()
case e of Left () -> return ()
Right () -> recur w
mainWire :: Wire () IO () ()
mainWire = pure "asdf\nqwer\nzxcv"
>>> once
>>> arr lines
>>> fifo
>>> arr print
>>> perform
This outputs the following:
"asdf"
and then quits. If I remove the once, then the program performs as expected, repeatedly outputting the full list of lines forever.
I'd like the following output:
"asdf"
"qwer"
"zxcv"
I'm sure that I'm just missing some intuition here about the correct way to approach this class of problem with Netwire.

Note: This is for an older version of netwire (before events worked like they do now), so some translating of the code would be required to make this work properly with the current version.
If I understood you right, you want a wire that produces the lines of a string, and then inhibits when it's done with that? It's a bit hard to tell.
once as the name implies, produces exactly once and then inhibits forever. Again it's a bit unclear what your wires are doing (because you didn't tell us) but it's not something you normally put into your "main" wire (so far I've only ever used once with andThen).
If that is correct, I'd probably do it something along the lines of:
produceLines s = produceLines' $ lines s where
produceLines' [] = inhibit mempty
produceLines' (l:ls) = pure s . once --> produceLines' ls
(You could write that as a fold or something, I just thought this was a bit clearer).
--> is pretty for andThen in case you didn't know. Basically this splits the passed string into lines, and turns them into a wire that produces the first line once, and then behaves like a similar wire except with the first element removed. It inhibits indefinitely once all values were produced.
Is that what you wanted?
Update
I see what you were trying to do now.
The wire you were trying to write could be done as
perform . arr print . fifo . ((arr lines . pure "asdf\nqwer\nzxcv" . once) --> pure [])
The part in the parentheses produce ["adf","nqwer","nzxc"] for one instant, and then produces [] forever. fifo takes values from the previous wire, adding the result from the previous wire in every instance (because of that we have to keep producing []). The rest is as you know it (I'm using the function-like notation rather than the arrow notation because I prefer it, but that shouldn't be a problem for you).

Haskell: Need Enlightenment with Calculator program

I have an assignment which is to create a calculator program in Haskell. For example, users will be able to use the calculator by command lines like:
>var cola =5; //define a random variable
>cola*2+1;
(print 11)
>var pepsi = 10
>coca > pepsi;
(print false)
>def coke(x,y) = x+y; //define a random function
>coke(cola,pepsi);
(print 15)
//and actually it's more complicated than above
I have no clue how to program this in Haskell. All I can think of right now is to read the command line as a String, parse it into an array of tokens. Maybe go through the array, detect keywords such "var", "def" then call functions var, def which store variables/functions in a List or something like that. But then how do I store data so that I can use them later in my computation?
Also am I on the right track because I am actually very confused what to do next? :(
*In addition, I am not allowed to use Parsec!*

It looks like you have two distinct kinds of input: declarations (creating new variables and functions) and expressions (calculating things).
You should first define some data structures so you can work out what sort of things you are going to be dealing with. Something like:
data Command = Define Definition | Calculate Expression | Quit
type Name = String
data Definition = DefVar Name Expression | DefFunc Name [Name] Expression
-- ^ alternatively, implement variables as zero-argument functions
-- and merge these cases
data Expression = Var Name | Add Expression Expression | -- ... other stuff
type Environment = [Definition]
To start off with, just parse (tokenise and then parse the tokens, perhaps) the stuff into a Command, and then decide what to do with it.
Expressions are comparatively easy. You assume you already have all the definitions you need (an Environment) and then just look up any variables or do additions or whatever.
Definitions are a bit trickier. Once you've decided what new definition to make, you need to add it to the environment. How exactly you do this depends on how exactly you iterate through the lines, but you'll need to pass the new environment back from the interpreter to the thing which fetches the next line and runs the interpreter on it. Something like:
main :: IO ()
main = mainLoop emptyEnv
where
emptyEnv = []
mainLoop :: Environment -> IO ()
mainLoop env = do
str <- getLine
case parseCommnad str of
Nothing -> do
putStrLn "parse failed!"
mainLoop env
Just Quit -> do
return ()
Just (Define d) -> do
mainLoop (d : env)
Just (Calculate e) -> do
putStrLn (calc env e)
mainLoop env
-- the real meat:
parseCommand :: String -> Maybe Command
calc :: Environment -> Expression -> String -- or Integer or some other appropriate type
calc will need to look stuff up in the environment you create as you go along, so you'll probably also need a function for finding which Definition corresponds to a given Name (or complaining that there isn't one).
Some other decisions you should make:
What do I do when someone tries to redefine a variable?
What if I used one of those variables in the definition of a function? Do I evaluate a function definition when it is created or when it is used?
These questions may affect the design of the above program, but I'll leave it up to you to work out how.

First, you can learn a lot from this tutorial for haskell programming
You need to write your function in another doc with .hs
And you can load the file from you compiler and use all the function you create
For example
plus :: Int -> Int -- that mean the function just work with a number of type int and return Int
plus x y = x + y -- they receive x and y and do the operation

Data value dependencies, updates and memoisation

I'm sorry this problem description is so abstract: its for my job, and for commercial confidentiality reasons I can't give the real-world problem, just an abstraction.
I've got an application that receives messages containing key-value pairs. The keys are from a defined set of keywords, and each keyword has a fixed data type. So if "Foo" is an Integer and "Bar" is a date you might get a message like:
Foo: 234
Bar: 24 September 2011
A message may have any subset of keys in it. The number of keys is fairly large (several dozen). But lets stick with Foo and Bar for now.
Obviously there is a record like this corresponding to the messages:
data MyRecord {
foo :: Maybe Integer
bar :: Maybe UTCTime
-- ... and so on for several dozen fields.
}
The record uses "Maybe" types because that field may not have been received yet.
I also have many derived values that I need to compute from the current values (if they exist). For instance I want to have
baz :: MyRecord -> Maybe String
baz r = do -- Maybe monad
f <- foo r
b <- bar r
return $ show f ++ " " ++ show b
Some of these functions are slow, so I don't want to repeat them unnecessarily. I could recompute baz for each new message and memo it in the original structure, but if a message leaves the foo and bar fields unchanged then that is wasted CPU time. Conversely I could recompute baz every time I want it, but again that would waste CPU time if the underlying arguments have not changed since last time.
What I want is some kind of smart memoisation or push-based recomputation that only recomputes baz when the arguments change. I could detect this manually by noting that baz depends only on foo and bar, and so only recomputing it on messages that change those values, but for complicated functions that is error-prone.
An added wrinkle is that some of these functions may have multiple strategies. For instance you might have a value that can be computed from either Foo or Bar using 'mplus'.
Does anyone know of an existing solution to this? If not, how should I go about it?

I'll assume that you have one "state" record and these message all involve updating it as well as setting it. So if Foo is 12, it may later be 23, and therefore the output of baz would change. If any of this is not the case, then the answer becomes pretty trivial.
Let's start with the "core" of baz -- a function not on a record, but the values you want.
baz :: Int -> Int -> String
Now let's transform it:
data Cached a b = Cached (Maybe (a,b)) (a -> b)
getCached :: Eq a => Cached a b -> a -> (b,Cached a b)
getCached c#(Cached (Just (arg,res)) f) x | x == arg = (res,c)
getCached (Cached _ f) x = let ans = f x in (ans,Cached (Just (x,ans) f)
bazC :: Cached (Int,Int) String
bazC = Cached Nothing (uncurry baz)
Now whenever you would use a normal function, you use a cache-transformed function instead, substituting the resulting cache-transformed function back into your record. This is essentially a manual memotable of size one.
For the basic case you describe, this should be fine.
A fancier and more generalized solution involving a dynamic graph of dependencies goes under the name "incremental computation" but I've seen research papers for it more than serious production implementations. You can take a look at these for starters, and follow the reference trail forward:
http://www.carlssonia.org/ogi/Adaptive/
http://www.andres-loeh.de/Incrementalization/paper_final.pdf
Incremental computation is actually also very related to functional reactive programming, so you can take a look at conal's papers on that, or play with Heinrich Apfelmus' reactive-banana library: http://www.haskell.org/haskellwiki/Reactive-banana
In imperative languages, take a look at trellis in python: http://pypi.python.org/pypi/Trellis or Cells in lisp: http://common-lisp.net/project/cells/

You can build a stateful graph that corresponds to computations you need to do. When new values appear you push these into the graph and recompute, updating the graph until you reach the outputs. (Or you can store the value at the input and recompute on demand.) This is a very stateful solution but it works.
Are you perhaps creating market data, like yield curves, from live inputs of rates etc.?

What I want is some kind of smart memoisation or push-based recomputation that only recomputes baz when the arguments change.
It sounds to me like you want a variable that is sort of immutable, but allows a one-time mutation from "nothing computed yet" to "computed". Well, you're in luck: this is exactly what lazy evaluation gives you! So my proposed solution is quite simple: just extend your record with fields for each of the things you want to compute. Here's an example of such a thing, where the CPU-intensive task we're doing is breaking some encryption scheme:
data Foo = Foo
{ ciphertext :: String
, plaintext :: String
}
-- a smart constructor for Foo's
foo c = Foo { ciphertext = c, plaintext = crack c }
The point here is that calls to foo have expenses like this:
If you never ask for the plaintext of the result, it's cheap.
On the first call to plaintext, the CPU churns a long time.
On subsequent calls to plaintext, the previously computed answer is returned immediately.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string