I'm reading The Joy of Clojure, and in the section about parallelization, the functions pvalues, pmap and pcalls are explained, with a brief example of how each one is used. Below are the examples given for pvalues and pcalls:
(defn sleeper [s thing] (Thread/sleep (* 1000 s)) thing)
(pvalues
  (sleeper 2 :1st)
  (sleeper 3 :2nd)
  (keyword "3rd"))

(pcalls
  #(sleeper 2 :1st)
  #(sleeper 3 :2nd)
  #(keyword "3rd"))
I understand the technical difference between the two -- pvalues takes a variable number of "values" to be computed, whereas pcalls takes "an arbitrary number of functions taking no arguments," to be called in parallel. In both cases, a lazy sequence of the results is returned.
My question is essentially, when would you use one vs. the other? It seems like the only syntactic difference is that you stick a # before each argument to pcalls, turning it into an anonymous function. So, purely from the perspective of saving keystrokes and having simpler code, wouldn't it make more sense to just use pvalues? -- and if so, why even have a pcalls function?
I get that with pcalls, you can substitute a symbol referring to a function, but if you wanted to use such a function with pvalues, you could just put the function in parentheses so that it is called.
I'm just a little confused as to why Clojure has these 2 functions that are so similar. Is there anything that you can do with one, but not the other? Some contrived examples might be helpful.
pvalues is a convenience macro around pcalls, though, because macros are not first class, it can't be composed or passed around the way a function can. Here is an example where only pcalls works:
user> (defn sleeper [s thing] (Thread/sleep (* 1000 s)) thing)
#'user/sleeper
user> (def actions [#(sleeper 2 :1st) #(sleeper 3 :2nd) #(keyword "3rd")])
#'user/actions
user> (apply pcalls actions)
(:1st :2nd :3rd)
user> (apply pvalues actions)
CompilerException java.lang.RuntimeException: Can't take value of a macro: #'clojure.core/pvalues, compiling:(NO_SOURCE_PATH:1:1)
It's important when designing APIs in Clojure that you don't restrict users to interacting with your library only through macros, because then they get caught on limitations such as this.
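For the curious, the definition of pvalues in clojure.core is essentially just a thin wrapper around pcalls (paraphrased here, not the verbatim source): each expression is wrapped in a zero-argument fn and handed to pcalls.

;; roughly how clojure.core defines pvalues (simplified paraphrase):
;; every expression becomes a thunk so pcalls can run them in parallel
(defmacro pvalues [& exprs]
  `(pcalls ~@(map #(list `fn [] %) exprs)))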
Take the three functions below, the first implemented in Haskell and the other two in Clojure:
f :: [Int] -> Int
f = foldl1 (+) . map (*7) . filter even
(defn f [coll]
  ((comp
     (partial reduce +)
     (partial map #(* 7 %))
     (partial filter even?)) coll))
(defn f [coll]
  (transduce
    (comp
      (filter even?)
      (map #(* 7 %)))
    + coll))
when they are applied to a list like [1, 2, 3, 4, 5], they all return 42. I know the machinery behind the first two is similar since map is lazy in Clojure, but the third one uses transducers. Could someone show the intermediate steps for the execution of these functions?
The intermediate steps between the second and third example are the same for this specific example. This is due to the fact that map and filter are implemented as lazy transformations of a sequence into a sequence, as you're no doubt already aware.
The transducer versions of map and filter are defined using the same essential functionality as the non-transducer versions, except that the way they "conj" (or not, in the case of filter) onto the result stream is defined elsewhere. Indeed, if you look at the source for map, there are explicit data-structure construction functions in use, whereas the transducer variant uses no such functions -- they are passed in via rf. Explicitly using cons in the non-transducer versions means they're always going to be dealing with sequences.
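To make that concrete, here is a stripped-down sketch of what the transducer arities of map and filter look like (the names map-xf and filter-xf are mine, and the real clojure.core versions handle a few more cases). Notice that neither one mentions cons or sequences; they only call whatever rf they are handed:

(defn map-xf [f]
  (fn [rf]
    (fn
      ([] (rf))                          ;; init
      ([result] (rf result))             ;; completion
      ([result x] (rf result (f x))))))  ;; step: transform, then delegate to rf

(defn filter-xf [pred]
  (fn [rf]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result x] (if (pred x) (rf result x) result)))))  ;; skip rf when pred fails

;; (transduce (comp (filter-xf even?) (map-xf #(* 7 %))) + [1 2 3 4 5]) ;=> 42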
IMO, the main benefit of using transducers is that you can define the process you're performing separately from the thing that will consume it. So perhaps a more interesting rewrite of your third example is:
(def process (comp (filter even?)
                   (map #(* 7 %))))

(defn f [coll] (transduce process + coll))
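Applied to the same input, this gives the same answer as before:

(f [1 2 3 4 5]) ;=> 42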
It's up to the application author to decide when this sort of abstraction is warranted, but it can definitely open up opportunities for reuse.
It may occur to you that you can simply rewrite
(defn process [coll]
  ((comp
     (partial map #(* 7 %))
     (partial filter even?)) coll))

(reduce + (process coll))
And get the same effect; this is true. When your input is always a sequence (or always the same kind of stream / you know what kind of stream it will be), there's arguably not a good reason to create a transducer. But the power of reuse can be demonstrated here (assume process is the transducer defined above):
(chan 1 process)           ;; an async channel which runs process on all inputs
(into [] process coll)     ;; building a vector
(transduce process + coll) ;; your goal
The motivation behind transducers was essentially to stop having to write new collection functions for different collection types. Rich Hickey mentions his frustration writing functions like map< map> mapcat< mapcat>, and so on in the core.async library -- what map and mapcat do is already defined, but because those definitions assume they operate on sequences (that explicit cons mentioned above), they can't be applied to asynchronous channels. But channels can supply their own rf in the transducer version, which lets them reuse these functions.
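As a sketch of that reuse (assuming core.async is on the classpath and process is the transducer from above), the exact same transformation can be attached to a channel:

(require '[clojure.core.async :as async])

;; every value put onto c is filtered and mapped by process before being taken off
(def c (async/chan 1 process))

(async/>!! c 4)  ;; 4 is even, so it passes the filter
(async/<!! c)    ;; => 28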
I need a function which takes two parameters, an input string and a default string, and returns the input if it is not blank, or the default otherwise.
(defn default-if-blank [input default]
  (if (clojure.string/blank? input)
    default
    input))
I can implement the function myself, but I imagine there are lots of good utility libraries in the Clojure world, like Apache Commons or Guava in the Java world.
Is it common to implement such functions yourself rather than use a library? I'm new to Clojure, so this may be a stupid question, but any advice would help. Thanks.
What you have there is very normal and would be perfectly fine in high-quality Clojure code. It's completely OK to write it this way; because Clojure is a terse language, this does not add much intellectual overhead.
In some cases you won't need it. For instance, get takes an optional default argument when you are pulling values out of a collection:
user> (get {1 "one" 2 "two"}
42
"infinity")
"infinity"
and you can supply defaults when destructuring things like maps:
user> (let [{a :a b :b c :c :or {a :default-a
b :default-b}}
{:a 42 :c 99}]
(println "a is" a)
(println "b is" b)
(println "c is" c))
a is 42
b is :default-b
c is 99
Both of these make situations like this less common, though there are plenty of (if (test foo) foo :default) cases in the codebase I work with on a daily basis. They just haven't been common and consistent enough for us to consolidate them into a helper the way you have done.
Haskell does not have loops like many other languages. I understand the reasoning behind it and some of the different approaches used to solve problems without them. However, when a loop structure is necessary, I am not sure if the way I'm creating the loop is correct/good.
For example (trivial function):
dumdum = do
  putStrLn "Enter something"
  num <- getLine
  putStrLn $ "You entered: " ++ num
  dumdum
This works fine, but is there a potential problem in the code?
A different example:
a = do
  putStrLn "1"
  putStrLn "2"
  a
If implemented in an imperative language like Python, it would look like:
def a():
    print("1")
    print("2")
    a()
This eventually causes a maximum recursion depth error. This does not seem to be the case in Haskell, but I'm not sure if it might cause potential problems.
I know there are other options for creating loops such as Control.Monad.LoopWhile and Control.Monad.forever -- should I be using those instead? (I am still very new to Haskell and do not understand monads yet.)
For general iteration, having a recursive function call itself is definitely the way to go. If your calls are in tail position, they don't use any extra stack space and behave more like goto [1]. For example, here is a function to sum the first n integers using constant stack space [2]:
{-# LANGUAGE BangPatterns #-}  -- needed for the ! annotations below

sum :: Int -> Int
sum n = sum' 0 n

sum' !s 0 = s
sum' !s n = sum' (s + n) (n - 1)
It is roughly equivalent to the following pseudocode:
function sum(N)
    var s, n = 0, N
loop:
    if n == 0 then
        return s
    else
        s, n = (s + n, n - 1)
        goto loop
Notice how in the Haskell version we used function parameters for the sum accumulator instead of a mutable variable. This is a very common pattern for tail-recursive code.
So far, general recursion with tail-call optimization should give you all the looping power of gotos. The only problem is that manual recursion (kind of like gotos, but a little better) is relatively unstructured, and we often need to read code that uses it carefully to see what is going on. Just like imperative languages have looping constructs (for, while, etc.) to describe the most common iteration patterns, in Haskell we can use higher-order functions to do a similar job. For example, many of the list-processing functions like map or foldl' [3] are analogous to straightforward for-loops in pure code, and when dealing with monadic code there are functions in Control.Monad or in the monad-loops package that you can use. In the end, it's a matter of style, but I would err towards using the higher-order looping functions.
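For instance, the sum above can be written with a higher-order looping function instead of manual recursion (a sketch; sumTo is just a name made up here):

import Data.List (foldl')

-- the same constant-space sum, expressed as a strict left fold over [1 .. n]
sumTo :: Int -> Int
sumTo n = foldl' (+) 0 [1 .. n]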
[1] You might want to check out "Lambda: The Ultimate GOTO", a classic article about how tail recursion can be as efficient as traditional iteration. Additionally, since Haskell is a lazy language, there are also some situations where recursion at non-tail positions can still run in O(1) space (search for "tail recursion modulo cons").
[2] Those exclamation marks are there to make the accumulator parameter eagerly evaluated, so the addition happens at the same time as the recursive call (Haskell is lazy by default). You can omit the "!"s if you want, but then you run the risk of a space leak.
[3] Always use foldl' instead of foldl, due to the previously mentioned space-leak issue.
I know there are other options for creating loops such as Control.Monad.LoopWhile and Control.Monad.forever -- should I be using those instead? (I am still very new to Haskell and do not understand monads yet.)
Yes, you should. You'll find that in "real" Haskell code, explicit recursion (i.e. calling your function in your function) is actually pretty rare. Sometimes, people do it because it's the most readable solution, but often, using things such as forever is much better.
In fact, saying that Haskell doesn't have loops is only a half-truth. It's correct that no loops are built into the language. However, in the standard libraries there are more kinds of loops than you'll ever find in an imperative language. In a language such as Python, you have "the for loop" which you use whenever you need to iterate through something. In Haskell, you have
map, fold, any, all, scan, mapAccum, unfold, find, filter (Data.List)
mapM, forM, forever (Control.Monad)
traverse, for (Data.Traversable)
foldMap, asum, concatMap (Data.Foldable)
and many, many others!
Each of these loops is tailored for (and sometimes optimised for) a specific use case.
When writing Haskell code, we make heavy use of these, because they allow us to reason more intelligently about our code and data. When you see someone use a for loop in Python, you have to read and understand the loop to know what it does. When you see someone use a map loop in Haskell, you know without reading it that it will not add any elements to the list – because we have the "Functor laws", which are just rules that say any map function must work in a particular way!
Back to your example, we can first define an askNum "function" (it's technically not a function but an IO value... we can pretend it is a function for the time being) which asks the user to enter something just once, and displays it back to them. When you want your program to keep asking forever, you just give that "function" as an argument to the forever loop and the forever loop will keep asking forever!
The entire thing might look like:
import Control.Monad (forever)

askNum = do
  putStrLn "Enter something"
  num <- getLine
  putStrLn $ "You entered: " ++ num

dumdum = forever askNum
Then a more experienced programmer would probably get rid of the askNum "function" in this case, and turn the entire thing into
dumdum = forever $ do
  putStrLn "Enter something"
  num <- getLine
  putStrLn $ "You entered: " ++ num
I'm playing around with Haskell at the moment and thus stumbled upon the list comprehension feature.
Naturally, I would have used a closure to do this kind of thing:
Prelude> [x|x<-[1..7],x>4] -- list comprehension
[5,6,7]
Prelude> filter (\x->x>4) [1..7] -- closure
[5,6,7]
I don't have a feel for this language yet, so which way would a Haskell programmer go?
What are the differences between these two solutions?
Idiomatic Haskell would be filter (> 4) [1..7]
Note that you are not capturing any of the lexical scope in your closure, and are instead making use of an operator section. That is to say, you want a partial application of >, which operator sections give you directly. List comprehensions are sometimes attractive, but the usual perception is that they do not scale as nicely as the usual suite of higher-order functions ("scale" with respect to more complex compositions). That kind of stylistic decision is, of course, largely subjective, so YMMV.
List comprehensions come in handy if the elements are somewhat complex and one needs to filter them by pattern matching, or the mapping part feels too complex for a lambda abstraction, which should be short (or so I feel), or if one has to deal with nested lists. In the latter case, a list comprehension is often more readable than the alternatives (to me, anyway).
For example something like:
[ (f b, (g . fst) a) | (Just a, Right bs) <- somelist, a `notElem` bs, (_, b) <- bs ]
But for your example, the section (> 4) is a really nice way to write (\a -> a > 4), and because you use it only for filtering, most people would prefer Anthony's solution.
Can all iterative algorithms be expressed recursively?
If not, is there a good counter example that shows an iterative algorithm for which there exists no recursive counterpart?
If it is the case that all iterative algorithms can be expressed recursively, are there cases in which this is more difficult to do?
Also, what role does the programming language play in all this? I can imagine that Scheme programmers have a different take on iteration (= tail-recursion) and stack usage than Java-only programmers.
There's a simple ad hoc proof of this: since you can build a Turing-complete language using strictly iterative structures, and a Turing-complete language using only recursive structures, the two are equivalent.
Can all iterative algorithms be expressed recursively?
Yes, but the proof is not interesting:
Transform the program with all its control flow into a single loop containing a single case statement in which each branch is straight-line control flow possibly including break, return, exit, raise, and so on. Introduce a new variable (call it the "program counter") which the case statement uses to decide which block to execute next.
This construction was discovered during the great "structured-programming wars" of the 1960s when people were arguing the relative expressive power of various control-flow constructs.
Replace the loop with a recursive function, and replace every mutable local variable with a parameter to that function. Voilà! Iteration replaced by recursion.
This procedure amounts to writing an interpreter for the original function. As you may imagine, it results in unreadable code, and it is not an interesting thing to do. However, some of the techniques can be useful for a person with background in imperative programming who is learning to program in a functional language for the first time.
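As a small, hedged sketch (in Clojure, one of the languages used earlier) of both steps applied to a trivial countdown loop: the case dispatches on an explicit "program counter", and the mutable locals of the original loop have become parameters of a recursive function.

(defn countdown [pc i]
  (case pc
    :test (if (pos? i) (countdown :body i) (countdown :done i))  ; decide where to jump
    :body (do (println i) (countdown :test (dec i)))             ; loop body, then back to the test
    :done :finished))                                            ; exit of the loop

(countdown :test 3)  ; prints 3 2 1, returns :finished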
Like you say, every iterative approach can be turned into a "recursive" one, and with tail calls, the stack will not explode either. :-) In fact, that's actually how Scheme implements all common forms of looping. Example in Scheme:
(define (fib n)
  (do ((x 0 y)
       (y 1 (+ x y))
       (i 1 (+ i 1)))
      ((> i n) x)))
Here, although the function looks iterative, it actually recurses on an internal lambda that takes three parameters, x, y, and i, and calls itself with new values at each iteration.
Here's one way that function could be macro-expanded:
(define (fib n)
  (letrec ((inner (lambda (x y i)
                    (if (> i n)
                        x
                        (inner y (+ x y) (+ i 1))))))
    (inner 0 1 1)))
This way, the recursive nature becomes more visually apparent.
Defining iterative as:
function q(vars):
    while X:
        do Y
This can be translated as:
function q(vars):
    if X:
        do Y
        call q(vars)
Y would in most cases include incrementing a counter that is tested by X. This counter will have to be passed along as part of 'vars' in some way when going the recursive route.
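As a concrete sketch of that translation (again in Clojure; the names q, X, and Y mirror the pseudocode above), the counter the imperative loop would mutate is simply passed as a parameter:

(defn q [n]
  (when (< n 3)          ; X: the loop condition
    (println "step" n)   ; Y: the loop body
    (q (inc n))))        ; call q again with the updated counter

(q 0)  ; prints "step 0", "step 1", "step 2"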
As noted by plinth in their answer, we can construct proofs showing how recursion and iteration are equivalent and can both be used to solve the same problem; however, even though we know the two are equivalent, there are drawbacks to using one over the other.
In languages that are not optimized for recursion, you may find that an algorithm using iteration performs faster than the recursive one; likewise, even in optimized languages, you may find that an iterative algorithm written in a different language runs faster than the recursive one. Furthermore, there may not be an obvious way of writing a given algorithm using recursion versus iteration, and vice versa. This can lead to code that is difficult to read, which leads to maintainability issues.
Prolog is a recursion-only language and you can do pretty much everything in it (I don't suggest you do, but you can :))
Recursive solutions are usually relatively inefficient when compared to iterative solutions.
However, there are some problems that are most naturally solved through recursion, where an equivalent iterative solution may be extremely complex to program (a classic example is the Ackermann function, which is not primitive recursive and is very awkward to express without general recursion). Recursive solutions do tend to be elegant and easy to write and understand.