clojure: remove case-insensitive string duplicates - string

What is the idiomatic clojure way to remove strings from an array of strings if there is case-insensitive match?
I need to preserve the case for the results (I always want to preserve the first occurence of insensitive match).
Simple example:
(distinct-case-insensitive ["fish" "Dog" "cat"] ["FISH "DOG"])
would return
["fish" "Dog" "cat"]

This is solution I came up with. To simplify function it accepts just one list with duplicates, so if you need vararg lists (apply concat lists) before.
(defn distinct-case-insensitive [xs]
(->> xs
(group-by clojure.string/lower-case)
(vals)
(map first)))
(distinct-case-insensitive ["fish" "Dog" "cat" "Fish" "DOG"]) =>
("fish" "Dog" "cat")
But, as Leonid mentioned it does not preserve order due to hashmap. For ordered solution use
(defn distinct-case-insesitive [xs]
(->> xs
(group-by clojure.string/lower-case)
(#(map % (map clojure.string/lower-case xs)))
(map first)
(distinct)))

Greedy solution
Obviously, you can't use build-in distinct here, so you should reimplement it yourself.
mishadoff's solution is really beautiful and clujur'ish, but it breaks the order of elements when there are more then 8 unique elements dye to clojure HashMap implementation.
The safest way to do what you want is to use reduce:
(defn concat-distinct [& colls]
(first
(reduce (fn [[coll seen] el]
(let [lc-el (string/lower-case el)]
(if (contains? seen lc-el)
[coll seen]
[(conj coll el) (conj seen lc-el)])))
[[] #{}]
(apply concat colls))))
If works for any number of collections:
user=> (concat-distinct ["fish" "Dog" "cat"] ["FISH" "DOG"] ["snake"] ["CaT" "Camel"])
["fish" "Dog" "cat" "snake" "Camel"]
And for any number of distinct elements (unlike mishadoff's solution):
user=> (concat-distinct ["g" "h" "i" "j" "a" "b" "c" "d" "e" "f"])
["g" "h" "i" "j" "a" "b" "c" "d" "e" "f"]
Lazy solution
In most cases you'll be fine with greedy solution. But if you want it to be lazy, then you won't be able to avoid recursion:
(defn lazy-concat-distinct [& colls]
((fn step [coll seen]
(lazy-seq
(loop [[el & xs :as s] coll]
(when (seq s)
(let [lc-el (string/lower-case el)]
(if (contains? seen lc-el)
(recur xs)
(cons el (step xs (conj seen lc-el)))))))))
(apply concat colls) #{}))
This solution uses lazy sequences:
user=> (def res (lazy-concat-distinct (lazy-seq (println :boo) ["boo"])))
user=> (count res)
:boo
1
You can make it even lazier using lazy-cat macro:
(defmacro lazy-concat-distinct* [& colls]
`(lazy-concat-distinct (lazy-cat ~#colls)))
Now it won't even evaluate it's arguments until they are actually used:
user=> (def res (lazy-concat-distinct* (do (println :boo) ["boo"])))
user=> (count res)
:boo
1
It's useful when you want to aggregate data from some large database without downloading it all at once.
N.B. Be careful with lazy solutions. For example, this solution works almost 4 times slower than the greedy one.

Here's a solution that meets your requirements (the first matching item "wins" and order is preserved), is lazy, and has the benefit of being a higher order function. It takes a keyfn as its first argument, in correspondence with e.g. sort-by and group-by.
(defn distinct-by [keyfn coll]
(letfn [(step [xs seen]
(lazy-seq
((fn [xs]
(when-let [[x & more] (seq xs)]
(let [k (keyfn x)]
(if (seen k)
(recur more)
(cons x (step more (conj seen k)))))))
xs)))]
(step coll #{})))
So your usage would be:
(require '[clojure.string :as str])
(distinct-by str/lower-case ["fish" "Dog" "cat" "Fish" "DOG"])
;=> ("fish" "Dog" "cat")
The use of recur and the inner anonymous function is a relatively minor optimization. clojure.core/distinct uses it but in many cases it's not necessary. Here's a version without the extra noise:
(defn distinct-by [keyfn coll]
(letfn [(step [xs seen]
(lazy-seq
(when-let [[x & more] (seq xs)]
(let [k (keyfn x)]
(if (seen k)
(step more seen)
(cons x (step more (conj seen k))))))))]
(step coll #{})))

A solution is to implement a distinct-by that allows to specify a function to apply on each element before checking the duplicates
(defn distinct-by [f coll]
(let [groups (group-by f coll)]
(map #(first (groups %)) (distinct (map f coll)))))
For the example case this can be used like
(distinct-by clojure.string/lower-case
(concat ["fish" "Dog" "cat"] ["FISH" "DOG"]))
; => ("fish" "Dog" "cat")

Related

How to implement multithreading

Game rules
Consider a simple 2-player game played as follows: An even number of coins is laid out in a row. Taking turns, each player removes the coin on one of the ends of the row. The object is to have the highest value in coins when all coins have been taken.
Player one finds the sum of all of the even-numbered coins, and all the odd-numbered coins. If the sum of the odd numbered coins is higher, player one takes the leftmost coin; otherwise he takes the rightmost.
Player two has a choice, with an odd number of coins. So he tries taking a coin from both ends, to see which option would leave player 1 worse off.
The problem
I basically want to implement multitheading to this program. I am still very new to Clojure, and I couldn't really find any good material online, regarding multithreading, that could be applied to my program.
The code
(ns game.core
(:gen-class))
(defn vector-from-string [s]
(drop 1 (map read-string (clojure.string/split (clojure.string/trim-newline s) #" "))))
(defn string-from-file [f]
(slurp f))
(defn sum-of-evens [v]
(def evens (vector))
(loop [v v, index 1]
(when (seq v)
(if (even? index)
(def evens (conj evens (first v))))
(recur (rest v) (inc index))))
(reduce + evens))
(defn sum-of-odds [v]
(def odds (vector))
(loop [v v, index 1]
(when (seq v)
(if (odd? index)
(def odds (conj odds (first v))))
(recur (rest v) (inc index))))
(reduce + odds))
(defn player-two [v p1score p2score]
(if (not (empty? v))
(if (> (max (sum-of-odds (drop 1 v)) (sum-of-evens (drop 1 v))) (max (sum-of-odds (drop-last v)) (sum-of-evens (drop-last v))))
(player-one (drop-last v) p1score (+ p2score(last v)))
(player-one (drop 1 v) p1score (+ p2score (first v))))
(println "p1score" p1score "p2score" p2score)))
(defn player-one [v p1score p2score]
(if (not (empty? v))
(if (> (sum-of-odds v) (sum-of-evens v))
(player-two (drop 1 v) (+ p1score (first v)) p2score)
(player-two (drop-last v) (+ p1score (last v)) p2score))
(println "p1score" p1score "p2score" p2score)))
(defn -main [& args]
(let [v (vector-from-string (string-from-file "numbers.txt")) ]
(player-one v 0 0)))
So -main runs the player-one function first, and player-one calls player-two, and they both continue on until the end of the program. I would like to somehow implement multithreading to speed up the executing of this game with a higher amount of starting coins.
Your code is currently very unidiomatic.
A few remarks that hopefully help you getting into the right direction:
A def inside a defn (or def) is (almost) always wrong. You're thinking in terms of variable assignment and mutable variables here. This is not how Clojure works. Use variables in your recur instead, if you absolutely must, use a local atom (also almost always wrong, but less often wrong than def inside defn).
Your loops are unnecessarily complicated. You want to sum over the elements at even or odd indices? Use a combination of reduce, take-nth and rest:
(take-nth 2 [1 2 3])
;=> (1 3)
(take-nth 2 (rest [1 2 3 4]))
;=> (2 4)
The whole things looks like you're compiling this over and over again and then run the JVM with it. Am I right? The preferred way is to work at the REPL. How to access it, depends on which editing environment you use. There are many beginner-friendly REPLs out there. Gorilla REPL is one example.
Once you got your code and development workflow in better shape, you may want to explore functions like pmap and future for easy access to multi-threading. More advanced stuff involves a library called core.async, but that's probably not the ideal route for the beginner. You can also fall back to Java interop to create your threads. Again something that, while not really hard to do, requires a bit of experience with Clojure.
Hope that helps, even it is not a direct answer to your question.
First let's look at some issues in your example that will need to be addressed before parallelizing this code.
sum-of-evens is using def inside a function, which is almost always a mistake. This might seem to have the effect you want, but it's not the right way to achieve it. defs are typically used for namespace-level (at the same level as your function defns) values. We can refactor sum-of-evens to not rely on unintentionally side-effecty behavior via def:
(defn sum-of-evens [v]
(loop [v v
index 1
evens []]
(if (seq v)
(recur (rest v)
(inc index)
(if (even? index) ;; add a binding to loop, not a def
(conj evens (first v))
evens)) ;; pass unchanged value when necessary
(reduce + evens))))
But we can further simplify this function with keep-indexed:
(defn sum-of-evens [coll]
(->> coll
(keep-indexed (fn [i v] (when (even? (inc i))
v)))
(apply +)))
And when we do the same for sum-of-odds, we can see the functions are nearly identical except for the condition they use: odd? vs. even?. We can make another function that takes a predicate function:
(defn sum-by-index-pred [f coll]
(->> coll
(keep-indexed (fn [i v] (when (f i) v)))
(apply +)))
;; using partial application and function composition
(def sum-of-evens (partial sum-by-index-pred (comp even? inc)))
(def sum-of-odds (partial sum-by-index-pred (comp odd? inc)))
Looking at the implementation of player-one and player-two, they seem to be mutually recursive. I don't see how you could parallelize this to make it any faster because each turn is dependent on the previous turn's outcome; there's nothing to parallelize.
I'd suggest refactoring this so that your game rules and state are computed in one place, rather than mutually recursive functions.
(loop [scores (array-map :player-1 0 :player-2 0)
turns (cycle (keys scores))
vs (shuffle (range 100))]
(if (seq vs)
(let [higher-odds? (> (sum-of-odds vs) (sum-of-evens vs))
scores (if higher-odds?
(update scores (first turns) + (first vs))
(update scores (first turns) + (last vs)))
remain (if higher-odds?
(rest vs)
(butlast vs))]
(recur scores (rest turns) remain))
(prn scores)))
;; {:player-1 2624, :player-2 2326}
I'm not sure if this preserves your original game logic but it should be close, and it does generalize it for more than two players. Try adding :player-3 0 to the starting scores.

How to modify string by index in Clojure?

I want to modify a string by applying a function to some of its chars (by starting index and length).
For example, I want to increment the ascii representation of the string "aaaaa" from the 2nd index to the 4th.
[start=1 length=3]
"aaaaa" => "abbba"
The only way I could think of is applying map, but it goes over all the sequence.
You could use subs to get the portions you do and don't want to modify. After modification use str to concatenate the result together:
(defn replace-in-str [f in from len]
(let [before (subs in 0 from)
after (subs in (+ from len))
being-replaced (subs in from (+ from len))
replaced (f being-replaced)]
(str before replaced after)))
You can call it:
(replace-in-str
(fn [sub-str] (apply str (map #(char (+ 1 (int %))) sub-str)))
"aaaaa"
1
3)
Indeed map applies the function to every element in the sequence. One way to get around that is to start with map-indexed. Unlike map, map-indexed passes the element's index as the first argument to the mapping function. When we have element's index, we can use it to choose if we need to perform the operation or just return the element as is.
A solution might look like this:
(defn inc-char [c]
(char (inc (long c))))
(defn if-in-range [from to f]
(fn [i x & args]
(if (<= from i (dec to))
(apply f x args)
x)))
(defn map-subs [from to f s]
(apply str (map-indexed (if-in-range from to f) s)))
(map-subs 1 4 inc-char "aaaaa")
;; "abbba"
I thought of using map-index to execute the operation only on the specified index:
((fn [op start length] (map-indexed (fn [i m] (if (<= start i length)
(op m)
m)) "aaaaa"))
#(char (+ 1 (int %)))
1
3)
=> (\a \b \b \b \a)
Here you go:
(defn replace-str
[s start-i end-i]
(apply str (map-indexed (fn [index val]
(if (and (>= index start-i)
(<= index end-i))
(char (+ (int val) 1))
val))
s)))
(replace-str "aaaa" 1 2)
;=> "abba"

Idiomatic string rotation in Clojure

How to idiomatically rotate a string in Clojure for the Burrows-Wheeler transform?
I came up with this, which uses (cycle "string"), but feels a bit imperative:
(let [s (str "^" "banana" "|")
l (count s)
c (cycle s)
m (map #(take l (drop % c)) (range l))]
(apply map str m))
=> ("^banana|" "banana|^" "anana|^b" "nana|^ba" "ana|^ban" "na|^bana" "a|^banan" "|^banana")
I'm not sure if this qualifies as code golf. Is there a cleaner way to do this?
I would do:
(defn bwrot [s]
(let [s (str "^" s "|")]
(for [i (range (count s))]
(str (subs s i) (subs s 0 i)))))
or:
(defn bwrot [s]
(let [n (+ 2 (count s))
s (str "^" s "|^" s "|")]
(for [i (range n)]
(subs s i (+ i n)))))
The second one should allocate less (one string instead of three per iteration).
There used to be a rotations function in clojure.contrib.seq that might be worth a look for inspiration. The source is reproduced below:
(defn rotations
"Returns a lazy seq of all rotations of a seq"
[x]
(if (seq x)
(map
(fn [n _]
(lazy-cat (drop n x) (take n x)))
(iterate inc 0) x)
(list nil)))
Then you could do something like:
(apply map str (rotations "^banana|"))
; => ("^banana|" "banana|^" "anana|^b" "nana|^ba" "ana|^ban" "na|^bana" "a|^banan" "|^banana")
A stepped call to partition works:
(defn bwt[s]
(let [s' (str "^" s "|")
c (cycle s')
l (count s')]
(map last (sort (apply map str (take l (partition l 1 c)))))))
(apply str (bwt "banana"))
=> "|bnn^aaa"
If I was unconcerned about efficiency or number of characters I'd write something like:
(defn rotate-string
[s]
(apply str (concat (drop 1 s) (take 1 s))))
(defn string-rotations
[s]
(->> s
(iterate rotate-string)
(take (count s))))
(rotate-string "^banana|") ; "banana|^"
(string-rotations "^banana|") ; ("^banana|" "banana|^" "anana|^b" "nana|^ba" "ana|^ban" "na|^bana" "a|^banan" "|^banana")
In particular, factoring out the single rotation into its own function.
Another way to accomplish rotation is to use a "double string" (i.e. concatenate the string to itself) and play around with substrings.
(defn rotations [strng]
(let [indices (range (count strng))
doublestr (str strng strng)]
(map #(subs doublestr % (+ % (count strng))) indices)))
(rotations "^banana|")
;;(^banana| banana|^ anana|^b nana|^ba ana|^ban na|^bana a|^banan |^banana)
Rotations of "foo":
Take the double string "foofoo"
Length n of "foo" = 3
The rotations are all the n substrings of "foofoo" that start with indices 0, 1, 2 and have the same length n

Cartesian Product of a finite sequence of potentially infinite sequences

The Problem
I need to create a function that, when given a finite sequence of potentially infinite sequences, it produces the sequence that is their "cartesian product".
i.e. given the sequence
'((1 2) (3 4))
the function produces (some ordering of):
'((1 3) (1 4) (2 3) (2 4)
Importantly, for every p in the list of cartesian products ps, there must be some natural number n such that (= p (last (take n ps))). Or, informally, you only need to iterate through the sequence a finite amount to reach any element in it.
This condition becomes important when dealing with infinite lists.
Solution in Haskell
In Haskell, this is how I would have done it:
interleave :: [a] -> [a] -> [a]
interleave [] ys = ys
interleave (x:xs) ys = x : interleave ys xs
combine :: [[a]] -> [[a]]
combine = foldr prod [[]]
where
prod xs css = foldr1 interleave [ [x:cs | cs <- css] | x <- xs]
And calling it you get the following:
combine [[0..] [0..]] = [[0,0,0],[1,0,0],[,1,0],[2,0,0],[0,0,1],[1,1,0],...
Solution in Clojure
And so I attempted to replicate this in Clojure, like so, (It's pretty much a direct translation):
(defn interleave
"Returns a lazy sequence of the interleavings of sequences `xs` and `ys`
(both potentially infinite), leaving no elements discarded."
[xs ys]
(lazy-seq
(if-let [[x & xs*] (seq xs)]
(cons x (interleave ys xs*))
ys)))
(defn interleave*
"Converts a sequence of potentially infinite sequences into its lazy
interleaving."
[xss]
(lazy-seq
(when-let [[xs & xss*] (seq xss)]
(interleave xs (interleave* xss*)))))
(defn combine
"Takes a finite sequence of potentially infinite sequences, and combines
them to produce a possibly infinite sequence of their cartesian product."
[xss]
(if-let [[xs & xss*] (seq xss)]
(interleave*
(for [x xs]
(for [cs (combine xss*)]
(lazy-seq (cons x cs)))))
'(()) ))
But when I run:
(take 1 (combine [(range) (range)]))
I get:
StackOverflowError cfg.util/combine/iter--3464--3468/fn--3469/fn--3470/iter--3471--3475/fn--3476
So, how do I make it lazy enough, so as to avoid the stack overflow? Really, I don't understand how Clojure's lazy sequence model works which is the main problem.
I think your solution may be algorithmically intractable, reconstructing the sub-sequences time and again, much as the simple Fibonacci function:
(defn fib [n]
(case n
(0N 1N) n
(+ (fib (- n 1)) (fib (- n 2)))))
... recomputes its precedents.
In any event, the search for [100 10] in the cartesian product of (range) and (range):
(first (filter #{[100 10]} (combine [(range) (range)])))
... does not return in a reasonable time.
I can offer you a faster though far less elegant solution.
First, a couple of utilities:
Something from #amalloy to compute the Cartesian product of finite sequences:
(defn cart [colls]
(if (empty? colls)
'(())
(for [x (first colls)
more (cart (rest colls))]
(cons x more))))
A function adapted from the Clojure Cookbook to map the values of a map:
(defn map-vals [f m] (zipmap (keys m) (map f (vals m))))
Now for the function we want, which I've called enum-cart, as it enumerates the Cartesian product even of infinite sequences:
(defn enum-cart [colls]
(let [ind-colls (into (sorted-map) (map-indexed (fn [n s] [n (seq s)]) colls))
entries ((fn tins [ss] (let [ss (select-keys ss (map key (filter val ss)))]
(lazy-seq
(if (seq ss)
(concat
(map-vals first ss)
(tins (map-vals next ss)))))))
ind-colls)
seens (reductions
(fn [a [n x]] (update-in a [n] conj x))
(vec (repeat (count colls) []))
entries)]
(mapcat
(fn [sv [n x]] (cart (assoc sv n [x])))
seens entries)))
The idea is to generate an indexed sequence of entries, going round the non-exhausted sequences. From this we generate a companion sequence of what we have already seen from each sequence. We pairwise combine these two, generating the free cartesian product of the new element with what we have of the other sequences. The answer is the concatenation of these free products.
For example
(enum-cart [(range 3) (range 10 15)])
... produces
((0 10)
(1 10)
(0 11)
(1 11)
(2 10)
(2 11)
(0 12)
(1 12)
(2 12)
(0 13)
(1 13)
(2 13)
(0 14)
(1 14)
(2 14))
And
(first (filter #{[100 10]} (enum-cart [(range) (range)])))
;(100 10)
... returns more or less instantly.
Notes
Is this better done in Knuth or elsewhere? I don't have access to
it.
The last non-exhausted sequence need not be kept, as there is nothing
else to use it.
So, I figured it out. And the issue is a subtle, but frustrating one. The problem stems from the destructuring I perform, in basically every function: I use this sort of idiom: [x & xs*] (seq xs), however, this realizes the first element of xs*, as well as realizing x. This behaviour is similar to what you would see if you were to use first and next to get the head and tail of the list respectively.
Using first/rest instead of destructuring in this way fixed the stack overflow:
(defn interleave
"Returns a lazy sequence of the interleavings of sequences `xs` and `ys`
(both potentially infinite), leaving no elements discarded."
[xs ys]
(lazy-seq
(if-let [xs* (seq xs)]
(cons (first xs*) (interleave ys (rest xs*)))
ys)))
(defn interleave*
"Converts a sequence of potentially infinite sequences into its lazy
interleaving."
[xss]
(lazy-seq
(when-let [xss* (seq xss)]
(interleave (first xss*)
(interleave* (rest xss*))))))
(defn combine
"Takes a finite sequence of potentially infinite sequences, and combines
them to produce a possibly infinite sequence of their cartesian product."
[xss]
(if-let [xss* (seq xss)]
(interleave*
(for [x (first xss*)]
(for [cs (combine (rest xss*))]
(lazy-seq (cons x cs)))))
'(()) ))
And running it, we get:
(= (take 5 (combine [(range) (range) (range)]))
'((0 0 0) (1 0 0) (0 1 0) (2 0 0) (0 0 1)))

Create a proxy for an specific instance of an object in clojure

I am trying to create a proxy object that adds some functionality to some of the methods of an object, using a closure (let / proxy) I ca do the thing, unfortunately I hava to re write ALL of the methods from the original object o I get an UnsupportedOpretationException here is an example:
;; the real object
(def realcon (java.sql.DriverManager/getConnection "jdbc:h2:tcp://localhost:9092/test"))
(def con
(let [msg "FG>"
xcon rcon]
(proxy [java.sql.Connection] []
(createStatement []
(println msg) ;; access to closure context !
(.createStatement xcon)))))
(def stmt (.createStatement con))
;;output FG>
(def rs (.executeQuery stmt "select count(*) from serie_sat"))
If I invoque any other method from java.sql.Connection I get UnsupportedOperationException I can do by hand the proxying of ALL methods but may be there is a beter way!.
Thanks
Here's an alternative using reify instead of proxy since, according to the docs, it's "preferable in all cases where its constraints are not prohibitive."
(defmacro override-delegate
[type delegate & body]
(let [d (gensym)
overrides (group-by first body)
methods (for [m (.getMethods (resolve type))
:let [f (-> (.getName m)
symbol
(with-meta {:tag (-> m .getReturnType .getName)}))]
:when (not (overrides f))
:let [args (for [t (.getParameterTypes m)]
(with-meta (gensym) {:tag (.getName t)}))]]
(list f (vec (conj args 'this))
`(. ~d ~f ~#(map #(with-meta % nil) args))))]
`(let [~d ~delegate]
(reify ~type ~#body ~#methods))))
;; Modifying your example slightly...
(def realcon (java.sql.DriverManager/getConnection "jdbc:h2:tcp://localhost:9092/test"))
(def con
(let [msg "FG>"]
(override-delegate java.sql.Connection realcon
(createStatement [this]
(println msg)
(.createStatement realcon)))))
The override-delegate macro expects the body to contain the reify specs of the methods you wish to override. Any you don't override will be invoked on the delegate. All reify specs generated by the macro will include type hints for each method's arguments and return value.
There is a caveat with my implementation: it only checks for method names in the body, ignoring argument arity/type for overloaded methods. So in the example above, where the java.sql.Connection interface provides multiple createStatement overloads, none of those that accept arguments will be defined for con. It wouldn't be too difficult to extend the macro to account for overloads, but when I need this behavior, I typically have to override all of them anyway.
I have just written the most ridiculous macro of my life to support this functionality. There might be a simpler way -- if I can think of one, I will certainly post it -- but this gave me a cool, trippy feeling and actually seems to work, so... here goes.
Edit: Here's a simpler way; define a function returning a regular proxy which delegates all methods (write it by hand or create it automatically -- the code of delegating-proxy contains a way of doing that), the use update-proxy on individual instances to replace just the methods which need replacing. This is clearly less cool than the crazy macro, so the latter shall remain below.
Here's the new, simplified approach (still not extremely clear due to some issues with the position parameter count limit and varargs):
;;; delegates all methods
(defmacro delegating-proxy [o class-and-ifaces ctor-args]
(let [oname (gensym)
impls (->> class-and-ifaces
(map resolve)
(mapcat #(.getDeclaredMethods ^Class %))
(group-by #(.getName ^java.lang.reflect.Method %))
(vals)
(map (fn delegating-impls [^java.lang.reflect.Method ms]
(let [mname (symbol (.getName ^java.lang.reflect.Method (first ms)))
arity-groups (partition-by #(count (.getParameterTypes ^java.lang.reflect.Method %)) ms)
max-arity (max-key #(count (.getParameterTypes ^java.lang.reflect.Method %)) ms)]
`(~mname
~#(remove
nil?
(map (fn [agroup]
(let [param-types (.getParameterTypes ^java.lang.reflect.Method (first agroup))
vararg? (and (seq param-types) (or (.isArray ^Class (last param-types)) (<= 20 (count param-types))))
arity ((if vararg? dec identity) (count param-types))
params (vec (repeatedly arity gensym))
params (if vararg? (conj params '& (gensym)) params)]
(when-not (and vararg? (not= arity max-arity))
(list params `(. ~oname (~mname ~#params))))))
arity-groups)))))))]
`(let [~oname ~o]
(proxy ~class-and-ifaces ~ctor-args ~#impls))))
A demo:
user> (def p (delegating-proxy (fn [& args] :foo) [clojure.lang.IFn] []))
#'user/p
user> (update-proxy p {"applyTo" (fn [& args] :bar)})
#<Object$IFn$4c646ebb user.proxy$java.lang.Object$IFn$4c646ebb#1c445f88>
user> (p 1)
:foo
user> (apply p (seq [1]))
:bar
Edit: the original macro follows.
First, a demo:
user> (.invoke (delegating-proxy (fn [x y] (prn x y))
[clojure.lang.IFn] []
(invoke [x] :foo))
:bar)
:foo
user> (.invoke (delegating-proxy (fn [x y] (prn x y))
[clojure.lang.IFn] []
(invoke [x] :foo))
:bar :quux)
:bar :quux
nil
delegating-proxy accepts an object to which it delegates when called upon to execute a method not explicitly implemented followed by the regular proxy arguments.
Second, the code. I think it's safe to assume there are various imperfections lurking in there. Actually the general shape of it is right out there; no lurking. If it's sufficiently useful to someone, it could probably be tested & improved into some degree of assured robustness.
The Gist is somewhat easier to read.
(defmacro delegating-proxy [o class-and-ifaces ctor-args & impls]
(let [oname (gensym)]
(letfn [(delegating-impls [^java.lang.reflect.Method ms]
(let [mname (symbol (.getName ^java.lang.reflect.Method (first ms)))
arity-groups (partition-by #(count (.getParameterTypes ^java.lang.reflect.Method %)) ms)
max-arity (max-key #(count (.getParameterTypes ^java.lang.reflect.Method %)) ms)]
`(~mname
~#(remove
nil?
(map (fn [agroup]
(let [param-types (.getParameterTypes ^java.lang.reflect.Method (first agroup))
vararg? (and (seq param-types) (or (.isArray ^Class (last param-types)) (<= 20 (count param-types))))
arity ((if vararg? dec identity) (count param-types))
params (vec (repeatedly arity gensym))
params (if vararg? (conj params '& (gensym)) params)]
(when-not (and vararg? (not= arity max-arity))
(list params `(. ~oname (~mname ~#params))))))
arity-groups)))))
(combine-impls [eimpls dimpls]
(map (fn [e d]
(let [e (if (vector? (second e))
(list (first e) (next e))
e)]
(list* (first e) (concat (next e) (next d)))))
eimpls
dimpls))]
(let [klass (resolve (first class-and-ifaces))
methods (->> class-and-ifaces
(map resolve)
(mapcat #(.getDeclaredMethods ^Class %)))
eimpl-specs (set (map (juxt first (comp count second)) impls))
rm-fn (fn rm-fn [^java.lang.reflect.Method m]
(contains? eimpl-specs [(symbol (.getName m)) (count (.getParameterTypes m))]))
dimpls (->> methods
(remove rm-fn)
(remove #(let [mods (.getModifiers ^java.lang.reflect.Method %)]
(or (java.lang.reflect.Modifier/isPrivate mods)
(java.lang.reflect.Modifier/isProtected mods))))
(sort-by #(.getName ^java.lang.reflect.Method %))
(partition-by #(.getName ^java.lang.reflect.Method %))
(map delegating-impls))
dimpl-names (set (map first dimpls))
eimpl-names (set (map first eimpl-specs))
{eonly false eboth true} (group-by (comp boolean dimpl-names first) impls)
{donly false dboth true} (group-by (comp boolean eimpl-names first) dimpls)
all-impls (concat eonly donly (combine-impls eboth dboth))]
`(let [~oname ~o]
(proxy ~class-and-ifaces ~ctor-args
~#all-impls))))))
Thank you very much I lerned a lot by seeing your answer, then I found some minor errors.
In the function delegating-impls, the parameter is an array of Method objects do the type cast is wrong. This means that max-arity is not a number and does not contain the bigest arity.
This took me to understand the code that has to do with de varargs. and realized that the var arg constructor in java (...) presets the las parameter as a array, the problem is that is the object has one method with for example 2 parameters and an other method with one parameter followed by a vararg (...) the we end up with 2 methods of the same arity, the code of delegating-proxy macro never enters the:
(when-not (and vararg? (not= arity max-arity)) because max-arity is no a number! so the proxy object omits any method with an array as the last parameter.
This took me to make a rewrite of the delegating-proxy and I ended up with the folowing code that work fine if there are no vararg (...) parameters, otherwise this methods will not be covered by the proxy implementation.
Here is the code:
(defmacro instance-proxy [obj mtd-re-filter pre-func post-func]
(let [cls (class (eval obj))
interfaces (.getInterfaces cls)
ifaces (into [] (map #(symbol (.getName %)) interfaces))
oname (gensym)
info (gensym)
impls (->> ifaces
(map resolve)
(mapcat #(.getDeclaredMethods ^Class %))
(group-by #(.getName ^java.lang.reflect.Method %))
(vals)
(map (fn delegating-impls [ms] ;; ms is an array of "Method" objects
(let [mname (symbol (.getName ^java.lang.reflect.Method (first ms)))
arity-groups (partition-by #(count (.getParameterTypes ^java.lang.reflect.Method %)) ms)]
`(~mname
~#(remove
nil?
(map (fn [agroup]
(let [param-types (.getParameterTypes ^java.lang.reflect.Method (first agroup))
arity (count param-types)
vararg? (and
(seq param-types)
(.isArray ^Class (last param-types)))
params (vec (repeatedly arity gensym))]
(when-not vararg?
(if (re-matches mtd-re-filter (name mname))
(list params
`(swap! ~info ~pre-func)
`(let [result# (. ~oname (~mname ~#params))]
(swap! ~info ~post-func)
result#))
(list params `(. ~oname (~mname ~#params)))))))
arity-groups)))))))]
`(let [~oname ~obj
~info (atom {})]
(proxy ~ifaces [] ~#impls))))
;;The code abobe is used like so:
(defn pre-test [m]
(println "ejecutando pre")
(assoc m :t0 (System/currentTimeMillis)))
(defn post-test [m]
(println "ejecutando post " m)
(let [now (System/currentTimeMillis)
mm (assoc m :t1 now :delta (- now (:t0 m)))]
(println mm)
mm))
(def rcon (java.sql.DriverManager/getConnection "jdbc:h2:tcp://localhost:9092/test"))
(def pcon (instance-proxy rcon #"prepareStatement" pre-test post-test))
(def stmt (.prepareStatement pcon "select * from SERIE_SAT"))
;;ejecutando pre
;;ejecutando post {:t0 1330046744318}
;;{:delta 3, :t1 1330046744321, :t0 1330046744318}
;;#'mx.interware.analyzer.driver/stmt
;;Here we obtain the statistics in a non-intrusive way wich was the objective of this code !
Thats all for now and thanks again for the very clever macro !
Saludos

Resources