How to write a PicoLisp function that does not shadow variables with its parameters - scope

I am idly exploring PicoLisp, and find myself perplexed about how to write meta-programming functions that would traditionally be handled with macros (in other lisp dialects). The biggest source of concern for me is that I do not see how I can prevent variable name shadowing. Reviewing the examples in Metaprogramming 101 has, if anything, just left me more confused.
Examples on how to implement the function mapeach, as seen in the linked article:
[de mapeach "Args" # expression
   (let [(@Var @List . @Body) "Args"]
      (macro
         (mapcar
            '((@Var) . @Body)
            @List ]
(de mapeach "Args"
(mapcar
(cons (cons (car "Args")) (cddr "Args"))
(eval (cadr "Args")) ) )
(de mapeach "Args"
(mapcar
'(("E")
(bind (car "Args")
(set (car "Args") "E")
(run (cddr "Args")) ) )
(eval (cadr "Args")) ) )
(de mapeach "Args"
(let "Vars" (pop '"Args")
(apply mapcar
(mapcar eval (cut (length "Vars") '"Args"))
(cons "Vars" "Args") ) ) )
I have tested each of these with the call (let "Args" * (mapeach N (1 2 3) ("Args" N N))). As expected, the PicoLisp interpreter (started with the command pil +) experiences a segfault and crashes. I assume this is because mapeach's "Args" shadows the "Args" defined at call point.
I also tried both of their implementations of map@ (the "cuter" alternative to mapeach).
(de map@ "Args"
   (mapcar
      '(("E") (and "E" (run (cdr "Args")))) # 'and' sets '@'
      (eval (car "Args")) ) )
(de map@ "Args"
   (mapcar
      '((@) (run (cdr "Args")))
      (eval (car "Args")) ) )
I used (let "Args" * (map@ (1 2 3) ("Args" @ @))) to test each of those implementations. Bizarrely, the first time I tested the first implementation, not only did it not segfault, it actually produced the correct result (1 4 9). Each subsequent test has resulted in a segfault. For clarity, the snippet from the prompt:
: (de map@ "Args"
   (mapcar
      '(("E") (and "E" (run (cdr "Args")))) # 'and' sets '@'
      (eval (car "Args")) ) )
-> map@
: (let "Args" * (mapeach N (1 2 3) ("Args" N N)))
!? (mapeach N (1 2 3) ("Args" N N))
mapeach -- Undefined
?
: (let "Args" * (map@ (1 2 3) ("Args" @ @)))
-> (1 4 9)
I believe that the segfault was somehow prevented by the call to the (then) undefined function mapeach. I also tried (ooga booga), which similarly prevented the segfault. If I do not have the erroneous call separating the definition from the proper call, a segfault always occurs.
This ultimately culminates in 2 questions:
How can I prevent name shadowing? Clearly the examples do not succeed in that regard.
Why does that call to map@ not result in a segfault?

According to this "The index for transient symbols is cleared automatically before and after loading a source file, or it can be reset explicitly with the ==== function". It doesn't specify any way that it is automatically cleared during regular REPL usage, which is the context in which I was testing this.
This code runs properly:
[de mapeach "Args" # expression
   (let [(@Var @List . @Body) "Args"]
      (macro
         (mapcar
            '((@Var) . @Body)
            @List ]
(====)
(let "Args" * (mapeach N (1 2 3) ("Args" N N)))
It also runs as expected without the call to ====, but only if the call to mapeach is not in the same file.
To address the 2 parts of my question:
You can prevent name shadowing by using transient symbols either in different files, or followed by a call to ====.
Those calls likely worked because the debugger clears the index which contains the transient symbols.

Related

Binary Search Tree "not a procedure" issue for In-Order-Traversal in Racket/Scheme/Lisp [duplicate]

During the execution of my code I get the following errors in the different Scheme implementations:
Racket:
application: not a procedure;
expected a procedure that can be applied to arguments
given: '(1 2 3)
arguments...:
Ikarus:
Unhandled exception
Condition components:
1. &assertion
2. &who: apply
3. &message: "not a procedure"
4. &irritants: ((1 2 3))
Chicken:
Error: call of non-procedure: (1 2 3)
Gambit:
*** ERROR IN (console)#2.1 -- Operator is not a PROCEDURE
((1 2 3) 4)
MIT Scheme:
;The object (1 2 3) is not applicable.
;To continue, call RESTART with an option number:
; (RESTART 2) => Specify a procedure to use in its place.
; (RESTART 1) => Return to read-eval-print level 1.
Chez Scheme:
Exception: attempt to apply non-procedure (1 2 3)
Type (debug) to enter the debugger.
Guile:
ERROR: In procedure (1 2 3):
ERROR: Wrong type to apply: (1 2 3)
Chibi:
ERROR in final-resumer: non procedure application: (1 2 3)
Why is it happening
Scheme procedure/function calls look like this:
(operator operand ...)
Both the operator and the operands can be variables, like test and +, that evaluate to different values. For a procedure call to work, the operator has to evaluate to a procedure. From the error message it seems likely that test is not a procedure but the list (1 2 3).
All parts of a form can also be expressions, so something like ((proc1 4) 5) is valid syntax, and it is expected that the call (proc1 4) returns a procedure that is then called with 5 as its sole argument.
Common mistakes that produce these errors.
Trying to group expressions or create a block
(if (< a b)
((proc1)
(proc2))
#f)
When the predicate/test is true, Scheme will evaluate both (proc1) and (proc2), and then call the result of (proc1) with the result of (proc2) as its argument, because of the extra parentheses. To create a block in Scheme you use begin:
(if (< a b)
(begin
(proc1)
(proc2))
#f)
In this version (proc1) is called just for effect, and the result of the form will be the result of the last expression, (proc2).
Shadowing procedures
(define (test list)
(list (cdr list) (car list)))
Here the parameter is called list which makes the procedure list unavailable for the duration of the call. One variable can only be either a procedure or a different value in Scheme and the closest binding is the one that you get in both operator and operand position. This would be a typical mistake made by common-lispers since in CL they can use list as an argument without messing with the function list.
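The Common Lisp side of that comparison can be sketched as follows. This is only a minimal illustration, assuming a standard Common Lisp implementation; the name tail-then-head is made up for the example. Because variables and functions live in separate namespaces there, the parameter list does not hide the function list:

```lisp
;; Hypothetical name, for illustration only.
;; The parameter LIST is a variable; the operator LIST is still the
;; standard function, because Common Lisp keeps separate namespaces.
(defun tail-then-head (list)
  (list (cdr list) (car list)))

(tail-then-head '(1 2 3)) ; => ((2 3) 1)
```

The same body in Scheme fails with the "not a procedure" error, since there the parameter is what gets looked up in operator position.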
Wrapping variables in cond
(define test #t) ; this might be result of a procedure
(cond
((< 5 4) result1)
((test) result2)
(else result3))
Next to the predicate expression (< 5 4), (test) may look correct, but test is a value that is checked for truthiness, not a procedure to be called. It has more in common with the else term and should be written like this:
(cond
((< 5 4) result1)
(test result2)
(else result3))
A procedure that should return a procedure doesn't always
Since Scheme doesn't enforce return type your procedure can return a procedure in one situation and a non procedure value in another.
(define (test v)
(if (> v 4)
(lambda (g) (* v g))
'(1 2 3)))
((test 5) 10) ; ==> 50
((test 4) 10) ; ERROR! application: not a procedure
Undefined values like #<void>, #!void, #<undef>, and #<unspecified>
These are usually values returned by mutating forms like set!, set-car!, set-cdr!, define.
(define (test x)
((set! f x) 5))
(test (lambda (x) (* x x)))
The result of this code is undefined since set! can return any value. I know some Scheme implementations, like MIT Scheme, actually return the bound value or the original value, and the result would be 25 or 10, but in many implementations you get a constant value like #<void>, and since it is not a procedure you get the same error. Relying on how one implementation fills in an underspecified part of the standard gives you non-portable code.
Passing arguments in wrong order
Imagine you have a function like this:
(define (double v f)
(f (f v)))
(double 10 (lambda (v) (* v v))) ; ==> 10000
If you swapped the arguments by mistake:
(double (lambda (v) (* v v)) 10) ; ERROR: 10 is not a procedure
In higher order functions such as fold and map not passing the arguments in the correct order will produce a similar error.
Trying to apply as in Algol derived languages
In Algol-derived languages, like JavaScript and C++, applying fun with argument arg looks like:
fun(arg)
This gets interpreted as two separate expressions in Scheme:
fun ; ==> evaluates to a procedure object
(arg) ; ==> call arg with no arguments
The correct way to apply fun with arg as argument is:
(fun arg)
Superfluous parentheses
This is the general "catch-all" for other errors. Code like ((+ 4 5)) will not work in Scheme since each set of parentheses in this expression is a procedure call. You simply cannot add as many as you like, and thus you need to keep it as (+ 4 5).
Why allow these errors to happen?
Allowing expressions in operator position, and calling variables the same way as library functions, gives the language expressive power. These are features you will love having once you have become used to them.
Here is an example of abs:
(define (abs x)
((if (< x 0) - values) x))
This switches between doing (- x) and (values x) (here values acts as an identity that returns its argument), and as you can see it calls the result of an expression. Here is an example of copy-list using CPS:
(define (copy-list lst)
(define (helper lst k)
(if (null? lst)
(k '())
(helper (cdr lst)
(lambda (res) (k (cons (car lst) res))))))
(helper lst values))
Notice that k is a variable that we pass a function to, and that it is called as a function. If we passed anything other than a function there, we would get the same error.
Is this unique to Scheme?
Not at all. All languages with one namespace that can pass functions as arguments will have similar challenges. Below is some JavaScript code with similar issues:
function double (f, v) {
  return f(f(v));
}
double(v => v * v, 10); // ==> 10000
double(10, v => v * v);
// TypeError: f is not a function
//     at double (repl:2:10)
// similar to having extra parentheses
function test (v) {
  return v;
}
test(5)(6); // ==> TypeError: test(...) is not a function
// But it works if it's designed to return a function:
function test2 (v) {
  return v2 => v2 + v;
}
test2(5)(6); // ==> 11

Can someone break down what's happening in this piece of code?

I'm very new to LISP programming and I'm having a real hard time with the syntax. The following code is from my notes and I know what it does, but I'd really appreciate a line-by-line breakdown to better understand what's happening here. The "when" loop seemed pretty simple to understand, but specifically I'm having a hard time trying to understand the first 3 lines in the "do" loop. Also I'm not sure why (:= acc (1+ acc)) was used in the last line of the when loop.
(defun count-lower-case-vowels (str)
(do ((i 0 (1+ i))
(acc 0)
(len (length str)))
((= i len) acc)
(when (or (equal (aref str i) #\a) (equal (aref str i) #\e)
(equal (aref str i) #\i) (equal (aref str i) #\o)
(equal (aref str i) #\u))
(:= acc (1+ acc)))))
I'm a big proponent of lots and lots of extra white space, to achieve visual code alignment (in 2D, yes, as if on a piece of paper) to improve readability:
(defun count-lower-case-vowels (str)
  (do ( (i   0              (1+ i) )     ; loop var `i`, its init, step exprs
        (acc 0                     )     ; loop var `acc`, its init expr
        (len (length str)          ) )   ; loop var `len`, its init expr
      ((= i len)                         ; loop stop condition
       acc)                              ; return value when loop stops
    (if                                  ; loop body: if
      (find (aref str i) "aeiou")        ;   test
      (setf acc (1+ acc)))))             ;   consequent
Is this better?
It is definitely not the accepted standard of LISP code formatting. But whatever makes it more readable, I think is for the best.
The step expression for i means that on each iteration, after the stop test has failed and the body has been evaluated, i is updated as if by (setf i (1+ i)). acc and len have no step expressions, so nothing is done to them automatically between iterations.
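To make that order of events concrete, here is a rough hand-written approximation of the same loop. It is only a sketch: the name count-lower-case-vowels-by-hand is made up, and a real do binds and steps its variables in parallel, which makes no difference here because only i has a step form.

```lisp
;; A hand-written approximation of what the DO form does; the name
;; count-lower-case-vowels-by-hand is made up for this sketch.
(defun count-lower-case-vowels-by-hand (str)
  (let ((i 0)                               ; init forms run once
        (acc 0)
        (len (length str)))
    (loop
      (when (= i len)                       ; stop test, checked first
        (return acc))                       ; result form
      (when (find (aref str i) "aeiou")     ; loop body
        (setf acc (1+ acc)))
      (setf i (1+ i)))))                    ; step form for i

(count-lower-case-vowels-by-hand "hello world") ; => 3
```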
As to the "when loop" you mention, it is not a loop at all, and is not a part at all of the do loop's looping mechanism. A when form is just like an if without the alternative, which also allows for multiple statements in the consequent, as if with an implicit progn:
(when test a1 a2 ...)
===
(if test (progn a1 a2 ...))
It just so happens that this loop's body consists of one form which is a when form. I have re-written it with an equivalent if.
do is a macro expecting 3 parameters:
(do ((i 0 (1+ i))
(acc 0)
(len (length str))) ;; first argument
((= i len) acc) ;; Second one
(when ...) ;; third
)
The first argument is itself a list, each element of which has the following form:
(<var-name> <var-initial-value> <var-next-value>)
In your case, the form (i 0 (1+ i)) means that in the body of the do macro (= in the third argument), you introduce a new, local variable called i. It starts with the value 0, and at each step of the loop, it gets updated to the value (1+ i) (i.e. it gets incremented by 1).
You see that the second element of this list is (acc 0), with no <var-next-value> in it. It means that acc won't get updated automatically by the macro, and its value will change only according to what is done in the body.
The second argument is a list of one or (optionally) two elements, (<condition> <return-val>). The first one, <condition>, states when to stop the iteration: once it evaluates to true, the macro stops. It gets evaluated before each iteration. The second, optional part is a form stating what the do form returns. By default, it returns nil, but if you specify a form there, it will be evaluated before exiting the loop and its value is returned instead.
The third argument is simply a list of forms that will get executed at each step, provided the condition is false.
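The three parts can be seen together in a smaller example. This is just a sketch; sum-below is a hypothetical name that does not come from your notes:

```lisp
;; sum-below is a hypothetical name used only for this sketch.
(defun sum-below (n)
  (do ((i 0 (1+ i))       ; i: initial value 0, next value (1+ i)
       (total 0))         ; total: initial value 0, no next-value form
      ((= i n) total)     ; stop when i reaches n, then return total
    (setf total (+ total i))))   ; body: the only place total changes

(sum-below 5) ; => 10
(sum-below 0) ; => 0

;; With no return form after the condition, DO returns NIL.
(do ((i 0 (1+ i)))
    ((= i 3)))            ; => NIL
```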
Note that the code you have posted is older style.
Nowadays it can be written much shorter with loop and find:
(defun count-lower-case-vowels (string)
(loop for c across string
count (find c "aeiou")))

Returning a string in a function in Clojure

I'm trying to return a string value when I read this CSV file that contains cities and city attributes. Here is what I have so far:
(defn city [name]
(with-open [rdr (reader)]
(doseq [line (drop 1 (line-seq rdr))]
(def x2 line)
(def y (string/split x2 #","))
(if (= name (y 0))
(println line)
))))
(city "Toronto")
=> Toronto,43.666667,-79.416667,Canada,CA,Ontario,admin,5213000,3934421
I can get it to print out the row, but how would I go about getting the function to return the row instead, if that makes sense?
With how you have the code set up currently, you can't. doseq is meant to carry out side effects; it always returns nil. You rarely want to use doseq, and you should rarely use def inside function definitions.
You want to find the first line where (= name (y 0)) is true. There's a few ways of approaching that. A basic way would be using loop and just stopping it once you find the line. I think using map or for to loop over the line-seq, then grabbing the first result would work out well here though:
(defn city [name]
  (with-open [rdr (reader)]
    (first
      (for [line (drop 1 (line-seq rdr)) ; Same syntax here as with doseq
            :let [y (string/split line #",")] ; Use let instead of def for local definitions
            :when (= name (y 0))] ; Only keep lines where (= name (y 0)) is true
        line))))
for is like Python's generator expression (if you're familiar with Python). It is not like a normal imperative for loop like in Java. The for will return a list of lines for which (= name (y 0)) was true. Because presumably there's only one such valid line in the file though, we only want one result, so we pass the list to first to get the first valid line found.
And note that for is lazy. This does not iterate the entire file before passing off to first. first requests the first element before for has even iterated, and no more iteration is done once a matching line is found.
The println function is meant for side effects and always returns nil. Adjust your function to return line as the last item after the if. This only helps once the if is no longer wrapped in doseq, which discards the value of its body:
(if (= name (y 0))
line)
If you haven't seen it yet, look at
Brave Clojure (free & book)
Getting Clojure (book)
Clojure Cheatsheet
Here is a better organized version of the code. Your project dependencies will need to look like:
:dependencies [
[org.clojure/clojure "1.10.1"]
[prismatic/schema "1.1.12"]
[tupelo "0.9.168"]
]
and the code can then look like:
(ns tst.demo.core
(:use tupelo.core tupelo.test)
(:require
[clojure.string :as str]
[clojure.java.io :as io]
[tupelo.string :as ts]))
(def city-data
" city,lat,lon,country,ccode,province,unk1,unk2,unk3
Toronto,43.666667,-79.416667,Canada,CA,Ontario,admin,5213000,3934421
Chicago,40.666667,-99.416667,USA,US,dummy,admin,5213000,3934421
")
(defn city->fields [city-str]
(str/split city-str #","))
(defn city [name]
(with-open [rdr (io/reader (ts/string->stream city-data))]
(let [lines (mapv str/trim (line-seq rdr))
hdrs-line (first lines)
city-lines (rest lines)
cities-fields (mapv city->fields city-lines)
city-match (first (filterv #(= name (first %)) cities-fields))]
; debug printouts
(spyx hdrs-line)
(spyx-pretty city-lines)
(spyx city-match)
city-match))) ; <= return value
(dotest
(println "Result: " (city "Toronto"))
)
with result:
-------------------------------
Clojure 1.10.1 Java 13
-------------------------------
Testing tst.demo.core
hdrs-line => "city,lat,lon,country,ccode,province,unk1,unk2,unk3"
city-lines =>
("Toronto,43.666667,-79.416667,Canada,CA,Ontario,admin,5213000,3934421"
"Chicago,40.666667,-99.416667,USA,US,dummy,admin,5213000,3934421"
"")
city-match => ["Toronto" "43.666667" "-79.416667" "Canada" "CA" "Ontario" "admin" "5213000" "3934421"]
Result: [Toronto 43.666667 -79.416667 Canada CA Ontario admin 5213000 3934421]

Split a string even if the last character is a delimiter

I want to delete some characters at the end of a string.
I made this function:
(defun del-delimiter-at-end (string)
  (cond
    ((eq (delimiterp (char string (- (length string) 1))) nil)
     string)
    (t
     (del-delimiter-at-end (subseq string 0 (- (length string) 1))))))
with this:
(defun delimiterp (c) (position c " ,.;!?/"))
But I don't understand why it doesn't work. I have the following error:
Index must be positive and not -1
Note that I want to split a string into a list of strings. I already looked here:
Lisp - Splitting Input into Separate Strings
but it doesn't work if the end of the string is a delimiter, that's why I'm trying to do that.
What am I doing wrong?
Thanks in advance.
The Easy Way
Just use string-right-trim:
(string-right-trim " ,.;!?/" s)
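A few sample calls, as a quick sketch of how it behaves at the edges (the input strings are made up for illustration):

```lisp
;; Only trailing characters from the bag are removed; the comma and
;; space in the middle of the string are kept.
(string-right-trim " ,.;!?/" "Hello, world!? ") ; => "Hello, world"

;; A string made only of delimiters, and the empty string, are both safe.
(string-right-trim " ,.;!?/" "?!")              ; => ""
(string-right-trim " ,.;!?/" "")                ; => ""
```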
Your Error
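If you pass an empty string to your del-delimiter-at-end, you will be passing -1 as the second argument to char. The same happens for a string made up entirely of delimiters, because the recursion eventually reaches the empty string.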
Your Code
There is no reason to do (eq (delimiterp ...) nil); just use (delimiterp ...) instead (and switch the clauses!)
It is more idiomatic to use if and not cond when you have just two clauses and each has just one form.
You call subseq recursively, which means that you not only allocate memory for no reason, your algorithm is also quadratic in string length.
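Putting those points together, one possible rewrite looks like the sketch below. The name trim-delimiters-at-end is hypothetical, chosen so that it does not clash with the function in the question. It moves an index instead of copying the string at every step, and it checks for the empty string before indexing:

```lisp
(defun delimiterp (c)
  (position c " ,.;!?/"))

;; Hypothetical name, to avoid clashing with the function in the question.
(defun trim-delimiters-at-end (string)
  (let ((end (length string)))
    ;; Move END left while the character just before it is a delimiter.
    ;; The PLUSP check stops at the empty string instead of indexing -1.
    (loop while (and (plusp end)
                     (delimiterp (char string (1- end))))
          do (decf end))
    (subseq string 0 end)))        ; a single copy at the very end

(trim-delimiters-at-end "hello!?") ; => "hello"
(trim-delimiters-at-end "?!")      ; => ""
(trim-delimiters-at-end "")        ; => ""
```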
There are really two questions here. One is more specific, and is described in the body of the question. The other is more general, and is what the title asks about (how to split a sequence). I'll handle the immediate question that's in the body, of how to trim some elements from the end of a sequence. Then I'll handle the more general question of how to split a sequence in general, and how to split a list in the special case, since people who find this question based on its title may be interested in that.
Right-trimming a sequence
sds answered this perfectly if you're only concerned with strings. The language already includes string-right-trim, so that's probably the best way to solve this problem, if you're only concerned with strings.
A solution for sequences
That said, if you want a subseq based approach that works with arbitrary sequences, it makes sense to use the other sequence manipulation functions that the language provides. Many functions take a :from-end argument and have -if-not variants that can help. In this case, you can use position-if-not to find the rightmost non-delimiter in your sequence, and then use subseq:
(defun delimiterp (c)
(position c " ,.;!?/"))
(defun right-trim-if (sequence test)
(let ((pos (position-if-not test sequence :from-end t)))
(subseq sequence 0 (if (null pos) 0 (1+ pos)))))
(right-trim-if "hello!" 'delimiterp) ; some delimiters to trim
;=> "hello"
(right-trim-if "hi_there" 'delimiterp) ; nothing to trim, with other stuff
;=> "hi_there"
(right-trim-if "?" 'delimiterp) ; only delimiters
;=> ""
(right-trim-if "" 'delimiterp) ; nothing at all
;=> ""
Using complement and position
Some people may point out that position-if-not is deprecated. If you don't want to use it, you can use complement and position-if to achieve the same effect. (I haven't noticed an actual aversion to the -if-not functions though.) The HyperSpec entry on complement says:
In Common Lisp, functions with names like xxx-if-not are related
to functions with names like xxx-if in that
(xxx-if-not f . arguments) == (xxx-if (complement f) . arguments)
For example,
(find-if-not #'zerop '(0 0 3)) ==
(find-if (complement #'zerop) '(0 0 3)) => 3
Note that since the xxx-if-not functions and the :test-not
arguments have been deprecated, uses of xxx-if functions or :test
arguments with complement are preferred.
That said, position-if and position-if-not take function designators, which means that you can pass the symbol delimiterp to them, as we did in
(right-trim-if "hello!" 'delimiterp) ; some delimiters to trim
;=> "hello"
complement, though, doesn't want a function designator (i.e., a symbol or function), it actually wants a function object. So you can define right-trim-if as
(defun right-trim-if (sequence test)
(let ((pos (position-if (complement test) sequence :from-end t)))
(subseq sequence 0 (if (null pos) 0 (1+ pos)))))
but you'll have to call it with the function object, not the symbol:
(right-trim-if "hello!" #'delimiterp)
;=> "hello"
(right-trim-if "hello!" 'delimiterp)
; Error
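If you want to keep accepting symbols as well, one option is to convert the designator to a function object before handing it to complement. The following is only a sketch; right-trim-if* is a hypothetical variant name, and it relies on coerce with result type function, which accepts either a function object or a symbol that names a global function:

```lisp
(defun delimiterp (c)
  (position c " ,.;!?/"))

;; Hypothetical variant name. COERCE with result type FUNCTION accepts
;; either a function object or a symbol naming a global function.
(defun right-trim-if* (sequence test)
  (let* ((fn (coerce test 'function))
         (pos (position-if (complement fn) sequence :from-end t)))
    (subseq sequence 0 (if (null pos) 0 (1+ pos)))))

(right-trim-if* "hello!" 'delimiterp)  ; => "hello"
(right-trim-if* "hello!" #'delimiterp) ; => "hello"
```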
Splitting a sequence
If you're not just trying to right-trim the sequence, then you can implement a split function without too much trouble. The idea is to increment a "start" pointer into the sequence. It first points to the beginning of the sequence. Then you find the first delimiter and grab the subsequence between them. Then find the next non-delimiter after that, and treat that as the new start point.
(defun split (sequence test)
(do ((start 0)
(results '()))
((null start) (nreverse results))
(let ((p (position-if test sequence :start start)))
(push (subseq sequence start p) results)
(setf start (if (null p)
nil
(position-if-not test sequence :start p))))))
This works on multiple kinds of sequences, and you don't end up with delimiters in your subsequences:
CL-USER> (split '(1 2 4 5 7) 'evenp)
((1) (5 7))
CL-USER> (split '(1 2 4 5 7) 'oddp)
(NIL (2 4))
CL-USER> (split "abc123def456" 'alpha-char-p)
("" "123" "456")
CL-USER> (split #(1 2 3 foo 4 5 6 let 7 8 list) 'symbolp)
(#(1 2 3) #(4 5 6) #(7 8))
Although this works for sequences of all types, it's not very efficient for lists, since subseq, position, etc., all have to traverse the list up to the start position. For lists, it's better to use a list specific implementation:
(defun split-list (list test)
(do ((results '()))
((endp list)
(nreverse results))
(let* ((tail (member-if test list))
(head (ldiff list tail)))
(push head results)
(setf list (member-if-not test tail)))))
CL-USER> (split-list '(1 2 4 5 7) 'oddp)
(NIL (2 4))
CL-USER> (split-list '(1 2 4 5 7) 'evenp)
((1) (5 7))
Instead of member-if and ldiff, you could also use cut from this answer to Idiomatic way to group a sorted list of integers?.

Alternate version of swap! also returning swapped out value

I talked about this a bit on IRC's #clojure channel today but would like to go more in detail here. Basically, in order to better understand atoms, swap!, deref and Clojure concurrency as a whole, I'd like to try to write a function which not only returns the value that was swapped-in using swap!, but also the value that was swapped out.
(def foo (atom 42))
.
.
.
((fn [a]
   (do
     (println "swapped out: " @a)
     (println "swapped in: " (swap! a rand-int)))) foo)
may print:
swapped out: 42
swapped in: 14
However, if another thread calls swap! on the same atom between the @a deref and the call to swap!, then I may be swapping out a value that is not 42.
How can I write a function which gives back correctly both values (the swapped out and the swapped in)?
I don't care about the various values that the atom does change to: all I want to know is what was the value swapped out.
Can this be written using code that is guaranteed not to deadlock and if so why?
Clojure's swap! is just a spinning compare-and-set. You can define an alternate version that returns whatever you like:
(defn alternate-swap [atom f & args]
  (loop []
    (let [old @atom
          new (apply f old args)]
      (if (compare-and-set! atom old new)
        [old new] ; return value
        (recur)))))
Atoms are uncoordinated, so any attempt to do this outside of the swapping function itself will likely fail. You could write a function that you call instead of swap! which constructs a function that saves the existing value before applying the real function, and then pass this constructed function to swap!.
user> (def foo (atom []))
#'user/foo
user> (defn save-n-swap! [a f & args]
(swap! a (fn [old-val]
(let [new-val (apply f (cons old-val args))]
(println "swapped out: " old-val "\n" "swapped in: " new-val)
new-val))))
#'user/save-n-swap!
user> (save-n-swap! foo conj 4)
swapped out: []
swapped in: [4]
[4]
user> (save-n-swap! foo conj 4)
swapped out: [4]
swapped in: [4 4]
[4 4]
This example prints the values. It would also make sense to push them to a changelog stored in another atom.
If you want the return value, Stuart's answer is the correct one, but if you are just going to do a bunch of println calls to understand how atoms/refs work, I would recommend adding a watch to the atom/ref: http://clojuredocs.org/clojure_core/1.2.0/clojure.core/add-watch
(add-watch your-atom :debug (fn [_ _ old new] (println "out" old "new" new)))
You could use a macro like:
(defmacro swap!-> [atom & args]
  `(let [old-val# (atom nil)
         new-val# (swap! ~atom #(do
                                  (swap! old-val# (constantly %))
                                  (-> % ~args)))]
     {:old @old-val# :new new-val#}))
(def data (atom {}))
(swap!-> data assoc :a 3001)
=> {:new {:a 3001} :old {}}
Refer to swap-vals! available since 1.9: https://clojuredocs.org/clojure.core/swap-vals%21
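You could rely on a promise to store the current value inside the swap! operation. Then you return the new and old value in a vector, as follows. Be aware that swap! may call the function more than once under contention, while a promise keeps only the first value delivered to it, so the recorded old value can be stale after a retry: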
(defn- swap-and-return-old-value!
  [^clojure.lang.IAtom atom f & args]
  (let [old-value-promise (promise)
        new-value (swap! atom
                         (fn [old-value]
                           (deliver old-value-promise old-value)
                           (apply f old-value args)))]
    [new-value @old-value-promise]))

Resources