I am just learning Scheme, but I would love to be able to repeat myself less.
Is there a way I can assign a name to a subexpression in the local scope?
As per the comment:
Haskell where clause:
x = s * t
  where s = 10
        t = 20
x should be 200 in this case.
Use let (or letrec for recursive bindings), e.g.:
(define (f g)
  (let ((x 1) (y (* g 2)))
    (+ x y)))
The cosine similarity of two lists can be calculated in linear time using a for-loop. I'm curious as to how one would achieve this using a Lisp-like language. Below is an example of my code in Python and Hy (Hylang).
Python:
def cos_sim(A,B):
    import math as _math
    n,da,db,d = 0,0,0,0
    for a,b in zip(A,B):
        n += a*b
        da += a*a
        db += b*b
    da = _math.sqrt(da)
    db = _math.sqrt(db)
    d = da*db
    return n / (d + 1e-32)
Hy (Lisp):
(import math)
(defn l2norm [a]
(math.sqrt (reduce + (map (fn [s](* s s)) a))))
(defn dot [a b]
(reduce + (map * a b)))
(defn cossim [a b]
(/ (dot a b) (* (l2norm a) (l2norm b))))
"I'm curious as to how one would achieve this using a Lisp-like language." This really depends on which Lisp you are using. In Scheme you might do something similar to the posted Hy solution:
(define (cos-sim-1 u v)
  (/ (dot-prod u v)
     (* (norm u) (norm v))))
(define (dot-prod u v)
  (fold-left + 0 (map * u v)))
(define (norm u)
  (sqrt (fold-left (lambda (acc x) (+ acc (* x x)))
                   0
                   u)))
This is linear in time complexity, but it could be improved by a constant factor by passing over the input only once. Scheme provides a named let construct that can be used to bind a name to a procedure; this is convenient here as it provides a simple mechanism for building the dot product and norms:
(define (cos-sim-2 u v)
  (let iter ((u u)
             (v v)
             (dot-product 0)
             (U^2 0)
             (V^2 0))
    (if (null? u)
        (/ dot-product (sqrt (* U^2 V^2)))
        (let ((x (car u))
              (y (car v)))
          (iter (cdr u)
                (cdr v)
                (+ dot-product (* x y))
                (+ U^2 (* x x))
                (+ V^2 (* y y)))))))
Both of these procedures assume that the input lists have the same length; it might be useful to add some validation code that checks this. Note that fold-left is standard in R6RS Scheme, but other standards rely on SRFIs for this, and some implementations may use different names, but the fold-left functionality is commonly available (perhaps as foldl or reduce).
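The validation mentioned above could look roughly like the following. This is a minimal Python sketch rather than Scheme, to match the Python in the question. It assumes the inputs are sequences that support len, and the zero-vector check is an extra assumption that the original procedures do not make.

```python
import math

def cos_sim(u, v):
    """Single-pass cosine similarity with basic input validation."""
    # Assumption: u and v support len(); generators would need a different check.
    if len(u) != len(v):
        raise ValueError("inputs must have the same length")
    dot = uu = vv = 0
    for x, y in zip(u, v):
        dot += x * y   # running dot product
        uu += x * x    # running squared norm of u
        vv += y * y    # running squared norm of v
    # Assumption: treat a zero vector as an error instead of dividing by zero.
    if uu == 0 or vv == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / math.sqrt(uu * vv)
```

Raising an error is only one possible policy; returning a sentinel value for mismatched or zero inputs would work just as well, depending on the caller.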
It is possible to solve the problem in Common Lisp using either of the basic methods shown above, though in Common Lisp you would use labels instead of named let. But it would be typical to see a Common Lisp solution using the loop macro. The Common Lisp standard does not guarantee tail call elimination (though some implementations do support that), so explicit loops are seen much more often than in Scheme. The loop macro is pretty powerful, and one way that you could solve this problem while passing over the input lists only once is this:
(defun cos-sim (u v)
  (loop :for x :in u
        :for y :in v
        :sum (* x y) :into dot-product
        :sum (* x x) :into u2
        :sum (* y y) :into v2
        :finally (return (/ dot-product (sqrt (* u2 v2))))))
Here are some sample interactions:
Scheme (Chez Scheme):
> (cos-sim-1 '(1 0 0) '(1 0 0))
1
> (cos-sim-1 '(1 0 0) '(-1 0 0))
-1
> (cos-sim-1 '(1 0 0) '(0 1 0))
0
> (cos-sim-1 '(1 1 0) '(0 1 0))
0.7071067811865475
> (cos-sim-2 '(1 0 0) '(1 0 0))
1
> (cos-sim-2 '(1 0 0) '(-1 0 0))
-1
> (cos-sim-2 '(1 0 0) '(0 1 0))
0
> (cos-sim-2 '(1 1 0) '(0 1 0))
0.7071067811865475
Common Lisp:
CL-USER> (cos-sim '(1 0 0) '(1 0 0))
1.0
CL-USER> (cos-sim '(1 0 0) '(-1 0 0))
-1.0
CL-USER> (cos-sim '(1 0 0) '(0 1 0))
0.0
CL-USER> (cos-sim '(1 1 0) '(0 1 0))
0.70710677
A simple option is to translate the Python version literally to Hy, like this:
(defn cos_sim [A B]
(import math :as _math)
(setv [n da db d] [0 0 0 0])
(for [[a b] (zip A B)]
(+= n (* a b))
(+= da (* a a))
(+= db (* b b)))
(setv
da (_math.sqrt da)
db (_math.sqrt db)
d (* da db))
(/ n (+ d 1e-32)))
I think your proposed solution is fairly 'lispy': build several short, easy to read functions that combine into your solution. EG:
(defun n (A B)
  (sqrt (reduce #'+ (map 'list #'* A B))))
(defun da (A)
  (sqrt (reduce #'+ (map 'list #'* A A))))
(defun db (B)
  (sqrt (reduce #'+ (map 'list #'* B B))))
(defun cos-sim (A B)
  (let ((n (n A B))
        (da (da A))
        (db (db B)))
    (/ (* n n) (+ (* da db) 1e-32))))
But notice that n, da, and db look very similar. We can see whether they can be made into a single function or macro. In this case, a function with an optional second list parameter is easy enough. (Note that I defined n in a slightly weird way to emphasize the similarity, but we would prefer not to take a square root and then square it in the final calculation. One way to change this would be to check whether the optional parameter was passed (included as B-p below); I instead chose to move the square root out of the combined function and into cos-sim.) Anyway, this gives us:
(defun d (A &optional (B A B-p))
(reduce #'+ (map 'list #'* A B)))
(defun cos-sim (A B)
(let ((n (d A B))
(da (sqrt (d A)))
(db (sqrt (d B))))
(/ n (+ (* da db) 1e-32))))
Alternatively, using loop is very Common Lisp-y, and is more directly similar to the Python:
(defun cos-sim (A B)
(loop for a in A
for b in B
sum (* a b) into n
sum (* a a) into da
sum (* b b) into db
finally (return (/ n (+ (sqrt (* da db)) 1e-32)))))
Here is a fairly natural (I think) approach in Racket. Essentially this is a process of folding a pair of sequences of numbers, so that's what we do. Note that this uses no explicit assignment, and also pulls the square root up a level (sqrt(a) * sqrt(b) = sqrt(a*b)), as taking roots is likely expensive (though this probably does not matter in practice). It also doesn't do the weird adding of a tiny float, which I presume was an attempt to coerce a value which might not be a float to a float. If so, that's the wrong way to do that, and it's also not needed in a language like Racket (and most Lisps), which strives to do arithmetic correctly where possible.
(define (cos-sim a b)
;; a and b are sequences of numbers
(let-values ([(a^2-sum b^2-sum ab-sum)
(for/fold ([a^2-running 0]
[b^2-running 0]
[ab-running 0])
([ai a] [bi b])
(values (+ (* ai ai) a^2-running)
(+ (* bi bi) b^2-running)
(+ (* ai bi) ab-running)))])
(/ ab-sum (sqrt (* a^2-sum b^2-sum)))))
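For readers coming from the Python in the question, the same fold can be sketched with functools.reduce, carrying the three running sums in a tuple instead of multiple values. This is only an illustration of the folding idea: the helper name step is made up here, and unlike the Racket version the result is always a float because math.sqrt returns one.

```python
import math
from functools import reduce

def cos_sim(a, b):
    """Cosine similarity as a fold over pairs, with no explicit assignment loop."""
    def step(acc, pair):
        # acc holds (sum of a_i^2, sum of b_i^2, sum of a_i*b_i) so far
        a2, b2, ab = acc
        ai, bi = pair
        return (a2 + ai * ai, b2 + bi * bi, ab + ai * bi)

    a2_sum, b2_sum, ab_sum = reduce(step, zip(a, b), (0, 0, 0))
    # One square root of the product, as in the Racket code above
    return ab_sum / math.sqrt(a2_sum * b2_sum)
```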
You can relatively easily turn this into typed Racket:
(define (cos-sim (a : (Sequenceof Number))
(b : (Sequenceof Number)))
: Number
(let-values ([(a^2-sum b^2-sum ab-sum)
(for/fold ([a^2-running : Number 0]
[b^2-running : Number 0]
[ab-running : Number 0])
([ai a] [bi b])
(values (+ (* ai ai) a^2-running)
(+ (* bi bi) b^2-running)
(+ (* ai bi) ab-running)))])
(/ ab-sum (sqrt (* a^2-sum b^2-sum)))))
This probably is no faster, but it is fussier.
This might be faster though:
(define (cos-sim/flonum (a : (Sequenceof Flonum))
(b : (Sequenceof Flonum)))
: Flonum
(let-values ([(a^2-sum b^2-sum ab-sum)
(for/fold ([a^2-running : Flonum 0.0]
[b^2-running : Flonum 0.0]
[ab-running : Flonum 0.0])
([ai a] [bi b])
(values (+ (* ai ai) a^2-running)
(+ (* bi bi) b^2-running)
(+ (* ai bi) ab-running)))])
(/ ab-sum (assert (sqrt (* a^2-sum b^2-sum)) flonum?))))
I have not checked that it is, however.
Your Hy example is already linear time. None of the loops is nested, so none of them multiplies its number of iterations by the length of the input. It could be simplified to make this easier to see:
(import math)
(defn dot [a b]
(sum (map * a b)))
(defn l2norm [a]
(math.sqrt (dot a a)))
(defn cossim [a b]
(/ (dot a b) (* (l2norm a) (l2norm b))))
I think this version is clearer than the Python version, because it's closer to the math notation.
Let's also inline the l2norm to make the number of loops easier to see.
(defn cossim [a b]
(/ (dot a b)
(* (math.sqrt (dot a a))
(math.sqrt (dot b b)))))
Python's map() is lazy, so the sum() and map() together only loop once. You effectively have three loops, one for each dot, and none of them are nested. Your Python version had one loop, but it was doing more calculations each iteration. Theoretically, it doesn't matter if you calculate row-by-row or column-by-column: multiplication is commutative, either rows by columns or columns by rows are the same number of calculations.
However, in practice, Python does have significant overhead for function calls, so I would expect the Hy version using higher-order functions to be slower than the Python version that doesn't have any function calls in the loop body. This is a constant factor slowdown, so it's still linear time.
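The pass counts described above can be checked directly. The following sketch uses a made-up Counted list subclass that records how often it is iterated; the function names are also just for illustration. It measures passes over the data, not running time.

```python
import math
from operator import mul

class Counted(list):
    """Hypothetical helper: a list that counts how many times it is iterated."""
    def __init__(self, items):
        super().__init__(items)
        self.passes = 0

    def __iter__(self):
        self.passes += 1
        return super().__iter__()

def dot(a, b):
    # map() is lazy; sum() drives it, so each argument is traversed once
    return sum(map(mul, a, b))

def cossim_three_dots(a, b):
    # Three separate, non-nested loops: dot(a, b), dot(a, a), dot(b, b)
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def cossim_one_loop(a, b):
    # One loop that does three multiply-adds per element
    n = da = db = 0
    for x, y in zip(a, b):
        n += x * y
        da += x * x
        db += y * y
    return n / (math.sqrt(da) * math.sqrt(db))
```

With the three-dot version each input is iterated three times, and with the single loop once. Either way the work grows linearly with the input length.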
If you want fast loops for calculations in Python, put your data in a matrix and use NumPy.
I have the following two Haskell expressions:
map (\f x -> f x 5) [(-),(+),(*)]
map (\f x -> f 5 x) [(-),(+),(*)]
And I'm trying to figure out whether either expression above is equivalent to the following expression:
map ($ 5) [(-),(+),(*)]
I am trying to understand what the difference between the first two expressions is.
Since for both expressions, there is only one parameter passed to the lambda function (e.g. the operator), the function will be partially applied.
Is it correct to say that the elements in the result list from the first expression will be:
(1) - x 5 = (- x) 5
(2) + x 5 = (+ x) 5
(3) * x 5 = (* x) 5
And for the second expression:
(1) - 5 x = (- 5) x
(2) + 5 x = (+ 5) x
(3) * 5 x = (* 5) x
However, I don't think that either expression is equivalent to map ($ 5) [(-),(+),(*)]. This is because (- x) 5 (where x is a number) gives an error in GHCI and is an invalid expression. Similarly (- 5) x also gives an error.
On the other hand, map ($ 5) [(-)] results in a list containing a function that takes a number and subtracts it from 5.
Is this reasoning correct? Any insights are appreciated.
(- 5) 5 gives an error because prefix minus is a special case in the language syntax: (- 5) means minus five, the number, and not a function that subtracts five (see also: Currying subtraction). That being so, I will focus on the (+) case, which is not exceptional.
In your second expression, map (\f x -> f 5 x) [(-),(+),(*)], the second element of the result list will be:
(\f x -> f 5 x) (+)
When evaluating such a thing by hand, it is important to be careful not to mix up prefix, infix and sectioned uses of operators. Application here gives...
\x -> (+) 5 x -- Prefix syntax (note the parentheses around the operator)
... which is equivalent to...
\x -> 5 + x -- Infix syntax
... and to:
\x -> (5 +) x -- Left section
\x -> (+ x) 5 -- Right section
(5 +) -- Left section, pointfree
So the sections, which are patterned after infix usage of the operators, should be the other way around relative to your question. As for map ($ 5) [(-),(+),(*)], it is equivalent to map (\f x -> f 5 x) [(-),(+),(*)], your second expression. You can confirm that by using the fact that ($) f x = f x to figure out what the ($ 5) right section is.
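The difference in argument order can also be seen outside Haskell. Here is a rough Python analogue, using operator.sub, operator.add and operator.mul as stand-ins for (-), (+) and (*), and functools.partial as a stand-in for applying a curried function to 5. The names first, second and dollar5 are made up for this sketch.

```python
from functools import partial
from operator import add, mul, sub

ops = [sub, add, mul]

# map (\f x -> f x 5) ops : 5 becomes the SECOND argument
first = [lambda x, f=f: f(x, 5) for f in ops]

# map (\f x -> f 5 x) ops : 5 becomes the FIRST argument
second = [lambda x, f=f: f(5, x) for f in ops]

# map ($ 5) ops : apply each function to 5, leaving one argument open
dollar5 = [partial(f, 5) for f in ops]

# Subtraction is not commutative, so it exposes the difference at x = 2
print([g(2) for g in first])    # [-3, 7, 10]
print([g(2) for g in second])   # [3, 7, 10]
print([g(2) for g in dollar5])  # [3, 7, 10]
```

Only the subtraction entry differs between the first and second lists, which is why (+) and (*) alone cannot tell the two expressions apart.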
(define n 100)
(define (f a) n)
(define (g n) n)
(define (h n) (f 0))
Why does (define (h n) (f 0)) evaluate to 100 instead of 10 when calling (h 10)?
When calling (h 10), will n be redefined as 10 or still 100? How about (g 10)?
So when you make a procedure and introduce a binding, it is only accessible within the scope of that binding:
((lambda (w) (+ w 10)) 2) ; ==> 12
w ; ERROR: Undefined variable (it only exists inside the lambda)
You can rename any parameter as long as you rename all of its uses. E.g. this is exactly the same:
((lambda (w2) (+ w2 10)) 2) ; w renamed to w2; this has no effect on the outcome.
Actually, many compilers rename all identifiers so that they are unique. I'll do that with your code now:
(define n_1 100)
(define (f_1 a_1) n_1)
(define (g_1 n_2) n_2)
(define (h_1 n_3) (f_1 0))
(h_1 10) ; ==> 100
This is the same code as in your question. All I've done is rename so that bindings are not shadowed by new closures that use the same names.
Do you see now why (h_1 10) evaluates to 100 now that there are no shadowed bindings?
Fun fact: In a Lisp without closures we have dynamic binding. There, the binding created most recently at runtime dictates what n is. In a dynamic Scheme my rewrite would work the same as in a normal lexical Scheme, but your original code would evaluate (h 10) ; ==> 10, since the local dynamic binding of n to 10 is the closest.
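Python is lexically scoped like Scheme, so a direct translation of the question's code behaves the same way. The second half of this sketch emulates dynamic binding with an explicit stack of frames; dynamic_env, lookup, dyn_f and dyn_h are hypothetical names invented for the illustration, not a real language feature.

```python
# Lexical scoping: f's n always refers to the global n where f was defined.
n = 100

def f(a):
    return n

def g(n):
    return n          # the parameter shadows the global n

def h(n):
    return f(0)       # h's parameter n is invisible inside f

# Emulated dynamic scoping: lookups search the most recent runtime bindings.
dynamic_env = [{"n": 100}]   # stack of binding frames, innermost last

def lookup(name):
    for frame in reversed(dynamic_env):
        if name in frame:
            return frame[name]
    raise NameError(name)

def dyn_f(a):
    return lookup("n")

def dyn_h(n):
    dynamic_env.append({"n": n})   # binding n is visible to everything h calls
    try:
        return dyn_f(0)
    finally:
        dynamic_env.pop()          # the binding disappears when h returns
```

Here h(10) returns 100 and g(10) returns 10, while the emulated dyn_h(10) returns 10 because the most recent binding of n wins.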
Is it considered idiomatic or non-idiomatic to have a LET block nested inside a FLET/LABELS block?
Again, I may be coming at this all wrong, but I'm trying to mimic the generic where block in Haskell (so I have a defun and I want to write code that uses certain temporary bindings for values and functions).
In case these are non-idiomatic (after all, I shouldn't expect to transfer over usage from one language to another), what's the right way to do this ?
E.g. something like (stupid example follows ...)
(defun f (x)
  (let* ((x 4)
         (y (1+ x)))
    (flet ((g (x) (+ 2 x)))
      (g y))))
You want to know if it's a difference of preference between:
(defun f (x)
(let* ((x 4) (y (1+ x)))
(flet ((g (x) (+ 2 x)))
(g y))))
and
(defun f (x)
(flet ((g (x) (+ 2 x)))
(let* ((x 4) (y (1+ x)))
(g y))))
?
It really doesn't matter which order you put flet/labels and let/let* in this case. It will produce the same result and your CL implementation might optimize your code such that the result would be the same anyway.
In a LISP-1 you would have put it in the same let and then the question would be if you should put the lambda first or last. Seems like taste to me.
The only case where there is a difference is when you are making calculations that are free variables in your function. Like this:
(defun f (x)
(let ((y (1+ x)))
(flet ((g (x) (+ 2 x y))) ; y is free, bound by the enclosing let
(g x))))
(f 5) ; ==> 13
Switching order is now impossible without moving logic since the function uses a free variable. You could put the let inside the definition of g like this:
(defun f (x)
  (flet ((g (z) ; renamed to not shadow original x
           (let* ((y (1+ x)))
             (+ 2 z y))))
    (g x)))
But imagine you used it with mapcar, reduce or recursion. Then it would have done the calculation for every iteration instead of once before the call. These are the cases that really matter.
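To make the cost difference concrete, here is a small Python sketch of the same two arrangements, with the local function applied to every element of a list. The expensive function and the calls counter are made up purely to count how often the shared value is computed.

```python
calls = {"count": 0}

def expensive(x):
    """Hypothetical stand-in for a costly computation; counts its invocations."""
    calls["count"] += 1
    return x + 1

def f_hoisted(xs):
    y = expensive(len(xs))        # computed once, before g is ever called
    def g(z):
        return 2 + z + y          # y is a free variable captured by g
    return [g(z) for z in xs]

def f_inner(xs):
    def g(z):
        y = expensive(len(xs))    # recomputed on every call to g
        return 2 + z + y
    return [g(z) for z in xs]
```

Both functions return the same list, but for a three-element input f_hoisted calls expensive once and f_inner calls it three times.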
In Haskell it is possible to partially apply an infix function using sections. For instance, given the infix function < (less than), one can partially apply either of the function's arguments: (5 <), (< 5)
In other words, in Haskell we have the following shorthand notation:
op :: a -> b -> c
(`op` y) === \x -> x `op` y
(x `op`) === \y -> x `op` y
Does F# have a similar concept?
No, neither of those (apart from standard partial application like (=) x).
Whereas I like the succinctness of Seq.find ((=) x), things like Seq.filter ((<) 3) (or even Seq.map (flip (-) 1)) are simply awkward to read and should immediately be replaced by a lambda expression, imo.
If you want to invent your own standards...
let lsection x f y = f x y
let rsection f y x = f x y
Then lsection 5 (<) === (5 <) and rsection (<) 5 === (< 5).
Though really, without language support, just put a lambda in there and it'll be clearer.
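The same pair of helpers can be sketched in any language with first-class functions. Here is a Python version for comparison, using operator.lt and operator.sub in place of (<) and (-). The names lsection and rsection simply mirror the helpers above and are not part of any standard library.

```python
from operator import lt, sub

def lsection(x, f):
    """Fix the LEFT operand: corresponds to Haskell's (x `op`)."""
    return lambda y: f(x, y)

def rsection(f, y):
    """Fix the RIGHT operand: corresponds to Haskell's (`op` y)."""
    return lambda x: f(x, y)

greater_than_5 = lsection(5, lt)   # \y -> 5 < y
less_than_5 = rsection(lt, 5)      # \x -> x < 5
decrement = rsection(sub, 1)       # \x -> x - 1

print(list(filter(greater_than_5, [3, 5, 7])))  # [7]
print(list(filter(less_than_5, [3, 5, 7])))     # [3]
print(decrement(10))                            # 9
```

As with the F# version, a plain lambda at the call site is usually at least as readable as these helpers.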