How to idiomatically rotate a string in Clojure for the Burrows-Wheeler transform?
I came up with this, which uses (cycle "string"), but feels a bit imperative:
(let [s (str "^" "banana" "|")
l (count s)
c (cycle s)
m (map #(take l (drop % c)) (range l))]
(apply map str m))
=> ("^banana|" "banana|^" "anana|^b" "nana|^ba" "ana|^ban" "na|^bana" "a|^banan" "|^banana")
I'm not sure if this qualifies as code golf. Is there a cleaner way to do this?
I would do:
(defn bwrot [s]
(let [s (str "^" s "|")]
(for [i (range (count s))]
(str (subs s i) (subs s 0 i)))))
or:
(defn bwrot [s]
(let [n (+ 2 (count s))
s (str "^" s "|^" s "|")]
(for [i (range n)]
(subs s i (+ i n)))))
The second one should allocate less (one string instead of three per iteration).
There used to be a rotations function in clojure.contrib.seq that might be worth a look for inspiration. The source is reproduced below:
(defn rotations
"Returns a lazy seq of all rotations of a seq"
[x]
(if (seq x)
(map
(fn [n _]
(lazy-cat (drop n x) (take n x)))
(iterate inc 0) x)
(list nil)))
Then you could do something like:
(apply map str (rotations "^banana|"))
; => ("^banana|" "banana|^" "anana|^b" "nana|^ba" "ana|^ban" "na|^bana" "a|^banan" "|^banana")
A stepped call to partition works:
(defn bwt[s]
(let [s' (str "^" s "|")
c (cycle s')
l (count s')]
(map last (sort (apply map str (take l (partition l 1 c)))))))
(apply str (bwt "banana"))
=> "|bnn^aaa"
If I were unconcerned about efficiency or character count, I'd write something like:
(defn rotate-string
[s]
(apply str (concat (drop 1 s) (take 1 s))))
(defn string-rotations
[s]
(->> s
(iterate rotate-string)
(take (count s))))
(rotate-string "^banana|") ; "banana|^"
(string-rotations "^banana|") ; ("^banana|" "banana|^" "anana|^b" "nana|^ba" "ana|^ban" "na|^bana" "a|^banan" "|^banana")
In particular, I'd factor the single rotation out into its own function.
Another way to accomplish rotation is to use a "double string" (i.e. concatenate the string to itself) and play around with substrings.
(defn rotations [strng]
(let [indices (range (count strng))
doublestr (str strng strng)]
(map #(subs doublestr % (+ % (count strng))) indices)))
(rotations "^banana|")
;;(^banana| banana|^ anana|^b nana|^ba ana|^ban na|^bana a|^banan |^banana)
Rotations of "foo":
Take the double string "foofoo"
Length n of "foo" = 3
The rotations are all the n substrings of "foofoo" that start with indices 0, 1, 2 and have the same length n
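As a minimal sketch of the steps above (the name `foo-rotations` is only for illustration), each rotation can be paired with the index it starts at in the doubled string:

```clojure
;; Rotations of "foo", read off the doubled string "foofoo".
(def foo-rotations
  (let [s       "foo"
        n       (count s)     ; 3
        doubled (str s s)]    ; "foofoo"
    ;; one substring of length n for each start index 0..n-1
    (for [i (range n)]
      [i (subs doubled i (+ i n))])))

foo-rotations
;; => ([0 "foo"] [1 "oof"] [2 "ofo"])
```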
I used string-length to get the number of characters but I am having difficulties in defining a recursive function. Should I convert the string to a list and then count the elements?
There's no useful way of doing this recursively (or even tail recursively): strings in Scheme are objects which know how long they are. There would be such an approach in a language like C where strings don't know how long they are but are delimited by some special marker. So for instance if (special-marker? s i) told you whether the i'th element of s was the special marker object, then you could write a function to know how long the string was:
(define (silly-string-length s)
(let silly-string-length-loop ([i 1])
(if (special-marker? s i)
(- i 1)
(silly-string-length-loop (+ i 1)))))
But now think about how you would implement special-marker? in Scheme: in particular here's the obvious implementation:
(define (special-marker? s i)
(= i (+ (string-length s) 1)))
And you can see that silly-string-length is now just a terrible version of string-length.
Well, if you wanted to make it look even more terrible, you could, as you suggest, convert the string to a list and then compute the length of the list. Lists are delimited by a special marker object, (), so this approach is reasonable:
(define (length-of-list l)
(let length-of-list-loop ([i 0]
[lt l])
(if (null? lt)
i
(length-of-list-loop (+ i 1) (rest lt)))))
So you could write
(define (superficially-less-silly-string-length s)
(length-of-list
(turn-string-into-list s)))
But, wait, how do you write turn-string-into-list? Well, something like this perhaps:
(define (turn-string-into-list s)
(let ([l (string-length s)])
(let loop ([i 0]
[r '()])
(if (= i l)
(reverse r)
(loop (+ i 1)
(cons (string-ref s i) r))))))
And this ... uses string-length.
What is the problem with this?
(string-length string)
If the question is a puzzle "count characters in a string without using string-length",
then maybe:
(define (my-string-length s)
(define (my-string-length t n)
(if (string=? s t) n
(my-string-length
(string-append t (string (string-ref s n))) (+ n 1))))
(my-string-length "" 0))
or:
(define (my-string-length s)
(define (my-string-length n)
(define (try thunk)
(call/cc (lambda (k)
(with-exception-handler (lambda (x)
(k n))
thunk))))
(try (lambda ()
(string-ref s n)
(my-string-length (+ n 1)))))
(my-string-length 0))
(but of course string-ref will be using the base string-length or equivalent)
Trying to convert Haskell function to Clojure. But facing difficulties. Not sure what's happening.
Here's recursive Haskell function.
mapWidth :: [[Char]] -> Int
mapWidth [] = 0
mapWidth (x:xs)
| length xs == 0 = length x
| length x /= length (xs!!0) = -1
| otherwise = mapWidth(xs)
Here's what I've tried so far :
(defn mapWidth [data_list]
(def data 0)
([[x & xs](seq data_list)](if (= (count x) 0)
(data 0)
(data -1))))
([[x & xs](seq data_list)](if not(= (count xs) length (xs!!0))
(data 0)
(data -1)
mapWidth(xs)))
Any help is appreciated. I'm pretty new to both the languages.
As far as I can see, this function returns the length of an element if all elements have equal length. In that case it could look like this:
(defn map-len [[x & [y :as xs]]]
(cond (empty? xs) (count x)
(not= (count x) (count y)) -1
:else (recur xs)))
which is almost an exact rewrite of the Haskell variant (replacing the direct recursive call with recur):
(map-len [[1 2] [3 4] [5 6]])
;;=> 2
(map-len [[1 2] [3 4 5] [5 6]])
;;=> -1
But since Clojure is about operations on sequences, you can do it in a more idiomatic way (in my opinion, at least):
(defn map-len2 [data]
(cond (empty? data) 0
(apply = (map count data)) (count (first data))
:else -1))
(defn map-len3 [[x & xs]]
(let [c (count x)]
(if (every? #(= c (count %)) xs)
c
-1)))
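The edge cases are worth checking against the Haskell original: no rows gives 0, rows of equal width give that width, and any mismatch gives -1. Here is one more sketch with the same behaviour, not taken from the answers above. The name `map-width` and the use of `distinct` are my own choices for illustration:

```clojure
(defn map-width
  "Hypothetical variant: 0 for no rows, the shared width if all rows agree, else -1."
  [rows]
  ;; collapse the row widths to the distinct values that occur
  (let [widths (distinct (map count rows))]
    (case (count widths)
      0 0                 ; no rows at all
      1 (first widths)    ; every row has the same width
      -1)))               ; at least two different widths

(map-width [])                ; => 0
(map-width ["ab" "cd" "ef"])  ; => 2
(map-width ["ab" "cde"])      ; => -1
```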
I want to modify a string by applying a function to some of its chars (by starting index and length).
For example, I want to increment the ascii representation of the string "aaaaa" from the 2nd index to the 4th.
[start=1 length=3]
"aaaaa" => "abbba"
The only way I could think of is applying map, but it goes over the whole sequence.
You could use subs to get the portions you do and don't want to modify. After modification use str to concatenate the result together:
(defn replace-in-str [f in from len]
(let [before (subs in 0 from)
after (subs in (+ from len))
being-replaced (subs in from (+ from len))
replaced (f being-replaced)]
(str before replaced after)))
You can call it:
(replace-in-str
(fn [sub-str] (apply str (map #(char (+ 1 (int %))) sub-str)))
"aaaaa"
1
3)
;=> "abbba"
Indeed, map applies the function to every element in the sequence. One way to get around that is to start with map-indexed. Unlike map, map-indexed passes the element's index as the first argument to the mapping function. Once we have the element's index, we can use it to decide whether to perform the operation or just return the element as is.
A solution might look like this:
(defn inc-char [c]
(char (inc (long c))))
(defn if-in-range [from to f]
(fn [i x & args]
(if (<= from i (dec to))
(apply f x args)
x)))
(defn map-subs [from to f s]
(apply str (map-indexed (if-in-range from to f) s)))
(map-subs 1 4 inc-char "aaaaa")
;; "abbba"
I thought of using map-indexed to apply the operation only at the specified indices:
((fn [op start length] (map-indexed (fn [i m] (if (<= start i (+ start length -1))
(op m)
m)) "aaaaa"))
#(char (+ 1 (int %)))
1
3)
=> (\a \b \b \b \a)
Here you go:
(defn replace-str
[s start-i end-i]
(apply str (map-indexed (fn [index val]
(if (and (>= index start-i)
(<= index end-i))
(char (+ (int val) 1))
val))
s)))
(replace-str "aaaa" 1 2)
;=> "abba"
The Problem
I need to create a function that, when given a finite sequence of potentially infinite sequences, it produces the sequence that is their "cartesian product".
i.e. given the sequence
'((1 2) (3 4))
the function produces (some ordering of):
'((1 3) (1 4) (2 3) (2 4))
Importantly, for every p in the list of cartesian products ps, there must be some natural number n such that (= p (last (take n ps))). Or, informally, you only need to iterate through the sequence a finite amount to reach any element in it.
This condition becomes important when dealing with infinite lists.
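To see why the condition matters, here is a small sketch of what goes wrong without it (the name `naive-product` is hypothetical). A nested `for` exhausts the inner sequence before advancing the outer one, so with an infinite inner sequence the first coordinate never moves past 0:

```clojure
;; Naive product of two infinite sequences: y runs forever while x stays at 0.
(def naive-product
  (for [x (range)
        y (range)]
    [x y]))

(take 4 naive-product)
;; => ([0 0] [0 1] [0 2] [0 3])
```

No finite n makes `[1 0]` the last element of `(take n naive-product)`, which is exactly what the requirement rules out.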
Solution in Haskell
In Haskell, this is how I would have done it:
interleave :: [a] -> [a] -> [a]
interleave [] ys = ys
interleave (x:xs) ys = x : interleave ys xs
combine :: [[a]] -> [[a]]
combine = foldr prod [[]]
where
prod xs css = foldr1 interleave [ [x:cs | cs <- css] | x <- xs]
And calling it you get the following:
combine [[0..], [0..], [0..]] = [[0,0,0],[1,0,0],[0,1,0],[2,0,0],[0,0,1],[1,1,0],...
Solution in Clojure
And so I attempted to replicate this in Clojure, like so, (It's pretty much a direct translation):
(defn interleave
"Returns a lazy sequence of the interleavings of sequences `xs` and `ys`
(both potentially infinite), leaving no elements discarded."
[xs ys]
(lazy-seq
(if-let [[x & xs*] (seq xs)]
(cons x (interleave ys xs*))
ys)))
(defn interleave*
"Converts a sequence of potentially infinite sequences into its lazy
interleaving."
[xss]
(lazy-seq
(when-let [[xs & xss*] (seq xss)]
(interleave xs (interleave* xss*)))))
(defn combine
"Takes a finite sequence of potentially infinite sequences, and combines
them to produce a possibly infinite sequence of their cartesian product."
[xss]
(if-let [[xs & xss*] (seq xss)]
(interleave*
(for [x xs]
(for [cs (combine xss*)]
(lazy-seq (cons x cs)))))
'(()) ))
But when I run:
(take 1 (combine [(range) (range)]))
I get:
StackOverflowError cfg.util/combine/iter--3464--3468/fn--3469/fn--3470/iter--3471--3475/fn--3476
So, how do I make it lazy enough, so as to avoid the stack overflow? Really, I don't understand how Clojure's lazy sequence model works which is the main problem.
I think your solution may be algorithmically intractable, reconstructing the sub-sequences time and again, much as the simple Fibonacci function:
(defn fib [n]
(case n
(0N 1N) n
(+ (fib (- n 1)) (fib (- n 2)))))
... recomputes its precedents.
In any event, the search for [100 10] in the cartesian product of (range) and (range):
(first (filter #{[100 10]} (combine [(range) (range)])))
... does not return in a reasonable time.
I can offer you a faster though far less elegant solution.
First, a couple of utilities:
Something from @amalloy to compute the Cartesian product of finite sequences:
(defn cart [colls]
(if (empty? colls)
'(())
(for [x (first colls)
more (cart (rest colls))]
(cons x more))))
A function adapted from the Clojure Cookbook to map the values of a map:
(defn map-vals [f m] (zipmap (keys m) (map f (vals m))))
Now for the function we want, which I've called enum-cart, as it enumerates the Cartesian product even of infinite sequences:
(defn enum-cart [colls]
(let [ind-colls (into (sorted-map) (map-indexed (fn [n s] [n (seq s)]) colls))
entries ((fn tins [ss] (let [ss (select-keys ss (map key (filter val ss)))]
(lazy-seq
(if (seq ss)
(concat
(map-vals first ss)
(tins (map-vals next ss)))))))
ind-colls)
seens (reductions
(fn [a [n x]] (update-in a [n] conj x))
(vec (repeat (count colls) []))
entries)]
(mapcat
(fn [sv [n x]] (cart (assoc sv n [x])))
seens entries)))
The idea is to generate an indexed sequence of entries, going round the non-exhausted sequences. From this we generate a companion sequence of what we have already seen from each sequence. We pairwise combine these two, generating the free cartesian product of the new element with what we have of the other sequences. The answer is the concatenation of these free products.
For example
(enum-cart [(range 3) (range 10 15)])
... produces
((0 10)
(1 10)
(0 11)
(1 11)
(2 10)
(2 11)
(0 12)
(1 12)
(2 12)
(0 13)
(1 13)
(2 13)
(0 14)
(1 14)
(2 14))
And
(first (filter #{[100 10]} (enum-cart [(range) (range)])))
;(100 10)
... returns more or less instantly.
Notes
Is this better done in Knuth or elsewhere? I don't have access to
it.
The last non-exhausted sequence need not be kept, as there is nothing
else to use it.
So, I figured it out. The issue is subtle but frustrating. The problem stems from the destructuring I perform in basically every function. I use the idiom [x & xs*] (seq xs); however, this realizes the first element of xs* as well as x. The behaviour is the same as what you would see if you used first and next to get the head and tail of the list respectively.
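The difference can be observed directly. The following is a minimal sketch, assuming a hypothetical helper `counting-seq` that bumps an atom every time one of its elements is realized:

```clojure
(defn counting-seq
  "Lazy 0, 1, 2, ... that increments the atom `counter` each time an
  element is realized. Hypothetical helper, for illustration only."
  [counter]
  (letfn [(step [n]
            (lazy-seq
              (swap! counter inc)
              (cons n (step (inc n)))))]
    (step 0)))

;; Destructuring with & uses next, which forces the second element too.
(def destructured (atom 0))
(let [[x & more] (counting-seq destructured)]
  x)
@destructured ; => 2

;; first/rest only forces the head; rest hands back the unrealized tail.
(def first-rest (atom 0))
(let [s    (counting-seq first-rest)
      x    (first s)
      more (rest s)]
  x)
@first-rest ; => 1
```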
Using first/rest instead of destructuring in this way fixed the stack overflow:
(defn interleave
"Returns a lazy sequence of the interleavings of sequences `xs` and `ys`
(both potentially infinite), leaving no elements discarded."
[xs ys]
(lazy-seq
(if-let [xs* (seq xs)]
(cons (first xs*) (interleave ys (rest xs*)))
ys)))
(defn interleave*
"Converts a sequence of potentially infinite sequences into its lazy
interleaving."
[xss]
(lazy-seq
(when-let [xss* (seq xss)]
(interleave (first xss*)
(interleave* (rest xss*))))))
(defn combine
"Takes a finite sequence of potentially infinite sequences, and combines
them to produce a possibly infinite sequence of their cartesian product."
[xss]
(if-let [xss* (seq xss)]
(interleave*
(for [x (first xss*)]
(for [cs (combine (rest xss*))]
(lazy-seq (cons x cs)))))
'(()) ))
And running it, we get:
(= (take 5 (combine [(range) (range) (range)]))
'((0 0 0) (1 0 0) (0 1 0) (2 0 0) (0 0 1)))
I am going over this Haskell lecture on the countdown game. I don't know any Haskell, but I am interested in the problem, so I am trying to port his code to Clojure.
This is the part where I got stuck; there must be something I don't get in the Haskell:
split :: [a] -> [([a],[a])]
split [] = [([],[])]
split (x:xs) = ([],x:xs) : [(x:ls,rs) | (ls,rs) <- split xs]
nesplit :: [a] -> [([a],[a])]
nesplit = filter ne . split
ne :: ([a],[b]) -> Bool
ne (xs,ys) = not (null xs || null ys)
exprs :: [Int] -> [Expr]
exprs [] = []
exprs [n] = [Val n]
exprs ns = [e | (ls,rs) <- nesplit ns, l <- exprs ls, r <- exprs rs, e <- combine l r]
I have my own split; given 1 2 3 4 it produces:
(((1) (2 3 4)) ((1 2) (3 4)) ((1 2 3) (4)))
(defn split [v]
(if (= (count v) 1)
(list (first v))
(map #(list (take % v) (drop % v)) (range 1 (count v)))))
(defn exprs [v]
(if (= (count v) 1)
v
(map #(concat (exprs (first %)) (exprs (second %))) v)))
(exprs (split [1 2 3 4]))
that gives me,
java.lang.IllegalArgumentException: Don't know how to create ISeq from: java.lang.Integer
Can anyone tell me what am i missing from the haskell code?
His full code listing is available here.
This is closely following the Haskell implementation as far as my limited Haskell fu allows me to do....
(defn split
[s]
(map #(split-at % s) (range 1 (count s))))
(defn ne
[s]
(every? (complement empty?) s))
(defn nesplit
[s]
(filter ne (split s)))
(declare combine)
(defn exprs
[s]
(when-let [s (seq s)]
(if (next s)
(for [[ls rs] (nesplit s)
l (exprs ls)
r (exprs rs)
e (combine l r)]
e)
s)))
Haven't tested it though.
As for your error message: I think the problem is that you don't call split recursively in exprs. Then you get 1 where a sequence is expected...
Random other note: count is linear in time for sequences. Since we just need to know, whether we have more than one element, we can check the value of (next s) against nil.
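A small sketch of that check, with `more-than-one?` as a hypothetical helper name. Because `next` looks at most one element past the head, it also works on infinite sequences, where `count` would never return:

```clojure
(defn more-than-one?
  "Hypothetical helper: true when coll has at least two elements."
  [coll]
  ;; next returns nil when there is nothing after the first element
  (some? (next coll)))

(more-than-one? [])       ; => false
(more-than-one? [1])      ; => false
(more-than-one? [1 2])    ; => true
(more-than-one? (range))  ; => true
```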
The exception results from exprs being called recursively and eventually being invoked with a list of integers. Your code only handles a list of lists or a list of length one.
(exprs '(2 3 4))
leads to the else branch of the if statement which expands out to:
(map #(concat (exprs (first %)) (exprs (second %))) '(2 3 4))
which comes out to:
(concat (exprs (first 2)) (exprs (second 2)))
(concat (exprs (first 3)) (exprs (second 3)))
(concat (exprs (first 4)) (exprs (second 4)))
and (first 2) throws:
java.lang.IllegalArgumentException: Don't know how to create ISeq from: java.lang.Integer