Split a string even if the last character is a delimiter


I want to delete some characters at the end of a string.
I made this function :
(defun del-delimiter-at-end (string)
  (cond
    ((eq (delimiterp (char string (- (length string) 1))) nil)
     string)
    (t
     (del-delimiter-at-end (subseq string 0 (- (length string) 1))))))
with this :
(defun delimiterp (c) (position c " ,.;!?/"))
But I don't understand why it doesn't work. I have the following error :
Index must be positive and not -1
Note that I want to split a string in list of strings, I already looked here :
Lisp - Splitting Input into Separate Strings
but it doesn't work if the end of the string is a delimiter, that's why I'm trying to do that.
What am I doing wrong?
Thanks in advance.

The Easy Way
Just use string-right-trim:
(string-right-trim " ,.;!?/" s)
Your Error
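If you pass an empty string to your del-delimiter-at-end, you will be passing -1 as the second argument to char. The same happens for a string that consists only of delimiters, because the recursion eventually reaches the empty string.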
Your Code
There is no reason to do (eq (delimiterp ...) nil); just use (delimiterp ...) instead (and switch the clauses!)
It is more idiomatic to use if and not cond when you have just two clauses and each has just one form.
You call subseq on every recursive step, which means that you not only allocate memory for no reason; your algorithm is also quadratic in the string length.
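Putting these points together, one possible non-recursive version might look like the sketch below. It is not code from the question or from this answer; the index variable end and the loop structure are illustrative choices. It steps an index back from the end and calls subseq only once, so it also copes with the empty string.

```lisp
(defun delimiterp (c)
  (position c " ,.;!?/"))

(defun del-delimiter-at-end (string)
  ;; END is an illustrative index: one past the last character we keep.
  (let ((end (length string)))
    ;; Step back over trailing delimiters; stop at 0 so CHAR never sees -1.
    (loop while (and (plusp end)
                     (delimiterp (char string (1- end))))
          do (decf end))
    ;; A single SUBSEQ at the end, instead of one copy per recursive call.
    (subseq string 0 end)))

(del-delimiter-at-end "hello, world!?")  ;=> "hello, world"
(del-delimiter-at-end "?!")              ;=> ""
(del-delimiter-at-end "")                ;=> ""
```

For plain strings, string-right-trim remains the shorter option; this only shows how the original function could be repaired.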

There are really two questions here. One is more specific, and is described in the body of the question. The other is more general, and is what the title asks about (how to split a sequence). I'll handle the immediate question that's in the body, of how to trim some elements from the end of a sequence. Then I'll handle the more general question of how to split a sequence in general, and how to split a list in the special case, since people who find this question based on its title may be interested in that.
Right-trimming a sequence
sds answered this perfectly if you're only concerned with strings: the language already includes string-right-trim, so that's probably the best way to solve this problem.
A solution for sequences
That said, if you want a subseq based approach that works with arbitrary sequences, it makes sense to use the other sequence manipulation functions that the language provides. Many functions take a :from-end argument and have -if-not variants that can help. In this case, you can use position-if-not to find the rightmost non-delimiter in your sequence, and then use subseq:
(defun delimiterp (c)
  (position c " ,.;!?/"))
(defun right-trim-if (sequence test)
  (let ((pos (position-if-not test sequence :from-end t)))
    (subseq sequence 0 (if (null pos) 0 (1+ pos)))))
(right-trim-if "hello!" 'delimiterp) ; some delimiters to trim
;=> "hello"
(right-trim-if "hi_there" 'delimiterp) ; nothing to trim, with other stuff
;=> "hi_there"
(right-trim-if "?" 'delimiterp) ; only delimiters
;=> ""
(right-trim-if "" 'delimiterp) ; nothing at all
;=> ""
Using complement and position
Some people may point out that position-if-not is deprecated. If you don't want to use it, you can use complement and position-if to achieve the same effect. (I haven't noticed an actual aversion to the -if-not functions though.) The HyperSpec entry on complement says:
In Common Lisp, functions with names like xxx-if-not are related
to functions with names like xxx-if in that
(xxx-if-not f . arguments) == (xxx-if (complement f) . arguments)
For example,
(find-if-not #'zerop '(0 0 3)) ==
(find-if (complement #'zerop) '(0 0 3)) => 3
Note that since the xxx-if-not functions and the :test-not
arguments have been deprecated, uses of xxx-if functions or :test
arguments with complement are preferred.
That said, position-if and position-if-not take function designators, which means that you can pass the symbol delimiterp to them, as we did in
(right-trim-if "hello!" 'delimiterp) ; some delimiters to trim
;=> "hello"
complement, though, doesn't want a function designator (i.e., a symbol or function), it actually wants a function object. So you can define right-trim-if as
(defun right-trim-if (sequence test)
  (let ((pos (position-if (complement test) sequence :from-end t)))
    (subseq sequence 0 (if (null pos) 0 (1+ pos)))))
but you'll have to call it with the function object, not the symbol:
(right-trim-if "hello!" #'delimiterp)
;=> "hello"
(right-trim-if "hello!" 'delimiterp)
; Error
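If you want the complement-based version to keep accepting symbols, one option is to convert the designator to a function object first. The sketch below does that with coerce; the local name test-fn is an illustrative choice, and this is only one way to do the conversion.

```lisp
(defun delimiterp (c)
  (position c " ,.;!?/"))

(defun right-trim-if (sequence test)
  ;; TEST-FN is an illustrative name: COERCE turns a symbol such as
  ;; 'DELIMITERP into its function object and leaves functions unchanged.
  (let* ((test-fn (coerce test 'function))
         (pos (position-if (complement test-fn) sequence :from-end t)))
    (subseq sequence 0 (if (null pos) 0 (1+ pos)))))

(right-trim-if "hello!" 'delimiterp)   ;=> "hello"
(right-trim-if "hello!" #'delimiterp)  ;=> "hello"
(right-trim-if '(1 2 3 4 6) 'evenp)    ;=> (1 2 3)
```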
Splitting a sequence
If you're not just trying to right-trim the sequence, then you can implement a split function without too much trouble. The idea is to increment a "start" pointer into the sequence. It first points to the beginning of the sequence. Then you find the first delimiter at or after it and grab the subsequence between the two. Then find the next non-delimiter after that, and treat that as the new start point.
(defun split (sequence test)
  (do ((start 0)
       (results '()))
      ((null start) (nreverse results))
    (let ((p (position-if test sequence :start start)))
      (push (subseq sequence start p) results)
      (setf start (if (null p)
                      nil
                      (position-if-not test sequence :start p))))))
This works on multiple kinds of sequences, and you don't end up with delimiters in your subsequences:
CL-USER> (split '(1 2 4 5 7) 'evenp)
((1) (5 7))
CL-USER> (split '(1 2 4 5 7) 'oddp)
(NIL (2 4))
CL-USER> (split "abc123def456" 'alpha-char-p)
("" "123" "456")
CL-USER> (split #(1 2 3 foo 4 5 6 let 7 8 list) 'symbolp)
(#(1 2 3) #(4 5 6) #(7 8))
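Applied to the case the question asks about, a string whose last character is a delimiter, the same function can be used with delimiterp as the test. The snippet below repeats both definitions from this answer so that it runs on its own; the sample strings are made up for illustration.

```lisp
(defun delimiterp (c)
  (position c " ,.;!?/"))

(defun split (sequence test)
  (do ((start 0)
       (results '()))
      ((null start) (nreverse results))
    (let ((p (position-if test sequence :start start)))
      (push (subseq sequence start p) results)
      ;; After a delimiter run, jump to the next non-delimiter; NIL ends the loop.
      (setf start (if (null p)
                      nil
                      (position-if-not test sequence :start p))))))

(split "Hello, world! How are you?" 'delimiterp)
;=> ("Hello" "world" "How" "are" "you")
(split "a,b," 'delimiterp)   ; trailing delimiter adds no empty string
;=> ("a" "b")
(split ",a" 'delimiterp)     ; a leading delimiter does give an empty first piece
;=> ("" "a")
```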
Although this works for sequences of all types, it's not very efficient for lists, since subseq, position, etc., all have to traverse the list up to the start position. For lists, it's better to use a list specific implementation:
(defun split-list (list test)
  (do ((results '()))
      ((endp list)
       (nreverse results))
    (let* ((tail (member-if test list))
           (head (ldiff list tail)))
      (push head results)
      (setf list (member-if-not test tail)))))
CL-USER> (split-list '(1 2 4 5 7) 'oddp)
(NIL (2 4))
CL-USER> (split-list '(1 2 4 5 7) 'evenp)
((1) (5 7))
Instead of member-if and ldiff, you could also use cut from this answer to Idiomatic way to group a sorted list of integers?.

Related

Binary Search Tree "not a procedure" issue for In-Order-Traversal in Racket/Scheme/Lisp [duplicate]

During the execution of my code I get the following errors in the different Scheme implementations:
Racket:
application: not a procedure;
expected a procedure that can be applied to arguments
given: '(1 2 3)
arguments...:
Ikarus:
Unhandled exception
Condition components:
1. &assertion
2. &who: apply
3. &message: "not a procedure"
4. &irritants: ((1 2 3))
Chicken:
Error: call of non-procedure: (1 2 3)
Gambit:
*** ERROR IN (console)#2.1 -- Operator is not a PROCEDURE
((1 2 3) 4)
MIT Scheme:
;The object (1 2 3) is not applicable.
;To continue, call RESTART with an option number:
; (RESTART 2) => Specify a procedure to use in its place.
; (RESTART 1) => Return to read-eval-print level 1.
Chez Scheme:
Exception: attempt to apply non-procedure (1 2 3)
Type (debug) to enter the debugger.
Guile:
ERROR: In procedure (1 2 3):
ERROR: Wrong type to apply: (1 2 3)
Chibi:
ERROR in final-resumer: non procedure application: (1 2 3)

Why is it happening
Scheme procedure/function calls look like this:
(operator operand ...)
Both the operator and the operands can be variables, like test and +, that evaluate to different values. For a procedure call to work, the operator has to evaluate to a procedure. From the error message it seems likely that test is not a procedure but the list (1 2 3).
All parts of a form can also be expressions, so something like ((proc1 4) 5) is valid syntax, and it is expected that the call (proc1 4) returns a procedure that is then called with 5 as its sole argument.
Common mistakes that produce these errors.
Trying to group expressions or create a block
(if (< a b)
    ((proc1)
     (proc2))
    #f)
When the predicate/test is true, Scheme will try to evaluate both (proc1) and (proc2), then it will call the result of (proc1) because of the parentheses. To create a block in Scheme you use begin:
(if (< a b)
    (begin
      (proc1)
      (proc2))
    #f)
Here (proc1) is called just for effect, and the result of the form will be the result of the last expression, (proc2).
Shadowing procedures
(define (test list)
(list (cdr list) (car list)))
Here the parameter is called list which makes the procedure list unavailable for the duration of the call. One variable can only be either a procedure or a different value in Scheme and the closest binding is the one that you get in both operator and operand position. This would be a typical mistake made by common-lispers since in CL they can use list as an argument without messing with the function list.
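For contrast, the same definition written in Common Lisp works, because function names and variable names live in separate namespaces there. A minimal sketch (the name test is simply carried over from the Scheme example):

```lisp
(defun test (list)
  ;; LIST in operator position names the function;
  ;; LIST in argument position is the parameter.
  (list (cdr list) (car list)))

(test '(1 2 3))  ;=> ((2 3) 1)
```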
wrapping variables in cond
(define test #t) ; this might be result of a procedure
(cond
((< 5 4) result1)
((test) result2)
(else result3))
While (test) looks correct next to the predicate expression (< 5 4), since it is a value that is checked for truthiness, it has more in common with the else term and should be written like this:
(cond
((< 5 4) result1)
(test result2)
(else result3))
A procedure that should return a procedure doesn't always
Since Scheme doesn't enforce return type your procedure can return a procedure in one situation and a non procedure value in another.
(define (test v)
(if (> v 4)
(lambda (g) (* v g))
'(1 2 3)))
((test 5) 10) ; ==> 50
((test 4) 10) ; ERROR! application: not a procedure
Undefined values like #<void>, #!void, #<undef>, and #<unspecified>
These are usually values returned by mutating forms like set!, set-car!, set-cdr!, define.
(define (test x)
((set! f x) 5))
(test (lambda (x) (* x x)))
The result of this code is undetermined, since set! can return any value. Some Scheme implementations, like MIT Scheme, actually return the bound value or the original value, and the result would be 25 or 10, but in many implementations you get a constant value like #<void>, and since it is not a procedure you get the same error. Relying on how one implementation happens to fill in an underspecified part of the standard gives you non-portable code.
Passing arguments in wrong order
Imagine you have a function like this:
(define (double v f)
(f (f v)))
(double 10 (lambda (v) (* v v))) ; ==> 10000
If you by error swapped the arguments:
(double (lambda (v) (* v v)) 10) ; ERROR: 10 is not a procedure
In higher order functions such as fold and map not passing the arguments in the correct order will produce a similar error.
Trying to apply as in Algol derived languages
In Algol-derived languages, like JavaScript and C++, applying fun with argument arg looks like:
fun(arg)
This gets interpreted as two separate expressions in Scheme:
fun ; ==> evaluates to a procedure object
(arg) ; ==> call arg with no arguments
The correct way to apply fun with arg as argument is:
(fun arg)
Superfluous parentheses
This is the general "catch all" for other errors. Code like ((+ 4 5)) will not work in Scheme, since each set of parentheses in this expression is a procedure call. You simply cannot add as many as you like, and thus you need to keep it as (+ 4 5).
Why allow these errors to happen?
Allowing expressions in operator position, and allowing variables to be called just like library functions, gives the language expressive power. These are features you will love having once you have become used to them.
Here is an example of abs:
(define (abs x)
((if (< x 0) - values) x))
This switches between doing (- x) and (values x) (the identity, which returns its argument), and as you can see it calls the result of an expression. Here is an example of copy-list using CPS:
(define (copy-list lst)
(define (helper lst k)
(if (null? lst)
(k '())
(helper (cdr lst)
(lambda (res) (k (cons (car lst) res))))))
(helper lst values))
Notice that k is a variable to which we pass a function, and that it is called as a function. If we passed anything other than a function there, you would get the same error.
Is this unique to Scheme?
Not at all. All languages with one namespace that can pass functions as arguments will have similar challenges. Below is some JavaScript code with similar issues:
function double (f, v) {
return f(f(v));
}
double(v => v * v, 10); // ==> 10000
double(10, v => v * v);
// TypeError: f is not a function
//     at double (repl:2:10)
// similar to having extra parentheses
function test (v) {
return v;
}
test(5)(6); // ==> TypeError: test(...) is not a function
// But it works if it's designed to return a function:
function test2 (v) {
return v2 => v2 + v;
}
test2(5)(6); // ==> 11

Can someone breakdown what's happening in this piece of code?

I'm very new to LISP programming and I'm having a real hard time with the syntax. The following code is from my notes and I know what it does, but I'd really appreciate a line-by-line breakdown to better understand what's happening here. The "when" loop seemed pretty simple to understand, but specifically I'm having a hard time trying to understand the first 3 lines in the "do" loop. Also, I'm not sure why (:= acc (1+ acc)) was used in the last line of the when loop.
(defun count-lower-case-vowels (str)
(do ((i 0 (1+ i))
(acc 0)
(len (length str)))
((= i len) acc)
(when (or (equal (aref str i) #\a) (equal (aref str i) #\e)
(equal (aref str i) #\i) (equal (aref str i) #\o)
(equal (aref str i) #\u))
(:= acc (1+ acc)))))

I'm a big proponent of lots and lots of extra white space, to achieve visual code alignment (in 2D, yes, as if on a piece of paper) to improve readability:
(defun count-lower-case-vowels (str)
  (do ( (i   0   (1+ i)       )    ; loop var `i`, its init, step exprs
        (acc 0                )    ; loop var `acc`, its init expr
        (len (length str)     ) )  ; loop var `len`, its init expr
      ((= i len)                   ; loop stop condition
       acc)                        ; return value when loop stops
    (if                            ; loop body: if
      (find (aref str i) "aeiou")  ; test
      (setf acc (1+ acc)))))       ; consequent
Is this better?
It is definitely not the accepted standard of LISP code formatting. But whatever makes it more readable, I think is for the best.
The i's step expression's meaning is that on each step after the loop didn't stop and its body was evaluated, (setf i (1+ i)) is called. acc and len have no step expressions, so for them nothing is called on each step.
As to the "when loop" you mention, it is not a loop at all, and is not a part at all of the do loop's looping mechanism. A when form is just like an if without the alternative, which also allows for multiple statements in the consequent, as if with an implicit progn:
(when test a1 a2 ...)
===
(if test (progn a1 a2 ...))
It just so happens that this loop's body consists of one form which is a when form. I have re-written it with an equivalent if.

do is a macro expecting 3 parameters:
(do ((i 0 (1+ i))
     (acc 0)
     (len (length str))) ;; first argument
    ((= i len) acc)      ;; second one
  (when ...)             ;; third
  )
The first argument is itself a list, each element of which has the following form:
(<var-name> <var-initial-value> <var-next-value>)
In your case, the form (i 0 (1+ i)) means that in the body of the do macro (= in the third argument), you introduce a new, local variable called i. It starts with the value 0, and at each step of the loop, it gets updated to the value (1+ i) (i.e. it gets incremented by 1).
You see that the second element of this list is (acc 0), with no <var-next-value> in it. It means that acc won't get updated automatically by the macro, and its value will change only according to what is done in the body.
The second argument is a list of one or (optionally) two elements: (<condition> <return-val>). The first one, <condition>, states when to stop the iteration: once it evaluates to true, the macro stops. It gets evaluated before each iteration. The second, optional part is a form stating what the do form returns. By default, it returns nil, but if you specify a form there, it will be evaluated before exiting the loop and its value is returned instead.
The third argument is simply a list of forms that will get executed at each step, provided the condition is false.
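To make the order of evaluation concrete, here is roughly what the do form from the question does, written out by hand with let and a plain loop. This is a simplified sketch, not the actual macro expansion (a real do also wraps its body in a tagbody and assigns all step forms in parallel), and it uses setf and find in place of the := and the chain of equal tests from the question.

```lisp
(defun count-lower-case-vowels (str)
  ;; The variable list: each variable gets its initial value once.
  (let ((i 0)
        (acc 0)
        (len (length str)))
    (loop
      ;; The end test runs before every iteration; ACC is the return value.
      (when (= i len)
        (return acc))
      ;; The body.
      (when (find (aref str i) "aeiou")
        (setf acc (1+ acc)))
      ;; The step form: only I has one, so only I is updated automatically.
      (setf i (1+ i)))))

(count-lower-case-vowels "education")  ;=> 5
(count-lower-case-vowels "AEIOU")      ;=> 0
```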

Note that the code you have posted is older style.
Nowadays it can be written much shorter with loop and find:
(defun count-lower-case-vowels (string)
(loop for c across string
count (find c "aeiou")))

How to do looping in lisp?

Just started learning and coding lisp,
I'm trying to create a program that will continuously accept a number and stops only if and only if the last input number is twice the previous number.
Here's my code
----------
(let((a 0)
(b 0)
(count 0))
(loop
(= a b))
(princ"Enter Number: ")
(defvar a(read))
(format t "~% a = ~d" a)
(setq count (1+ count))
(while
(!= b(* a 2) || <= count 1)
(princ "Program Terminated Normally")
)
Thank you

A bit of feedback:
(let ((a 0)
(b 0)
(count 0))
(loop
(= a b)) ; here the LOOP is already over.
; You have a closing parenthesis
; -> you need better formatting
(princ"Enter Number: ")
(defvar a(read))
(format t "~% a = ~d" a)
(setq count (1+ count))
(while
(!= b(* a 2) || <= count 1)
(princ "Program Terminated Normally")
)
Some improved formatting:
(let ((a 0)
      (b 0)
      (count 0))
  (loop
    (= a b))         ; LOOP ends here, that's not a good idea
  (princ "Enter Number: ")
  (defvar a(read))   ; DEFVAR is the wrong construct,
                     ; you want to SETQ an already defined variable
  (format t "~% a = ~d" a)
  (setq count (1+ count))
  (while             ; WHILE does not exist as an operator
    (!= b(* a 2) || <= count 1) ; This expression is not valid Lisp
    (princ "Program Terminated Normally")
    )
You may need to learn a few more Lisp operators before you can really write such loops. You may also want to use Lisp interactively and try things out, instead of writing code into an editor and never getting feedback from a Lisp...
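For comparison, here is one way the intended program could look once those points are addressed. It is only a sketch under a few assumptions: the function name read-until-double and the :in and :out keyword parameters are invented for this example, input is expected as one integer per line, and reaching the end of the input signals an error.

```lisp
(defun read-until-double (&key (in *standard-input*) (out *standard-output*))
  "Read integers, one per line, until one is twice the one before it.
Returns how many numbers were read."
  (let ((previous nil)   ; there is no previous number before the first input
        (count 0))
    (loop                ; the whole body is inside LOOP this time
      (format out "~&Enter Number: ")
      (finish-output out)
      (let ((current (parse-integer (read-line in))))
        (setq count (1+ count))
        (when (and previous (= current (* 2 previous)))
          (format out "~&Program Terminated Normally~%")
          (return count))          ; RETURN leaves the LOOP
        (setq previous current))))) ; SETQ, not DEFVAR, updates the variable
```

Calling (read-until-double) at the REPL and entering 3, 5 and 10 on separate lines stops after the third number, because 10 is twice 5.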

Here's an answer which definitely is not how you would do this in real life, but if you understand what it does you will understand one of the two big important things about Lisps.
(If you understand why the equivalent program would not work reliably in Scheme you'll also understand one of the important things about writing safe programs! Fortunately this is Common Lisp, not Scheme, so it's OK here.)
First of all let's write a little helper function to read integers. This is just fiddly detail: it's not important.
(defun read-integer (&key (prompt "Integer: ")
(stream *query-io*))
;; Read an integer. This is just fiddly details
(format stream "~&~A" prompt)
(values (parse-integer (read-line stream))))
OK, now here's a slightly odd function called mu (which stands for 'mutant U'):
(defun mu (f &rest args)
(apply f f args))
And now here is our program:
(defun read-integers-until-double-last ()
  (mu (lambda (c getter current next factor)
        (if (= next (* current factor))
            (values current next)
            (mu c getter next (funcall getter) factor)))
      #'read-integer
      (read-integer)
      (read-integer)
      2))
And here it is working:
> (read-integers-until-double-last)
Integer: 0
Integer: 4
Integer: 3
Integer: -2
Integer: -4
-2
-4
For extra mysteriosity you can essentially expand out the calls to mu here, which makes it either more clear or less clear: I'm not quite sure which:
(defun read-integers-until-double-last ()
  ((lambda (c)
     (funcall c c
              #'read-integer
              (read-integer)
              (read-integer)
              2))
   (lambda (c getter current next factor)
     (if (= next (* current factor))
         (values current next)
         (funcall c c getter next (funcall getter) factor)))))
Again, this is not how you do it in real life, but if you understand what this does and how it does it you will understand quite an important thing about Lisps and their theoretical underpinnings. This is not all (not even most) of the interesting things about them, but it is a thing worth understanding, I think.

Returning a string in a function in Clojure

I'm trying to return a string value when I read this CSV file that contains cities and city attributes. Here is what I have so far:
(defn city [name]
(with-open [rdr (reader)]
(doseq [line (drop 1 (line-seq rdr))]
(def x2 line)
(def y (string/split x2 #","))
(if (= name (y 0))
(println line)
))))
(city "Toronto")
=> Toronto,43.666667,-79.416667,Canada,CA,Ontario,admin,5213000,3934421
I can get it to print out the row, but how would I go about getting the function to return the row instead, if that makes sense?

With how you have the code setup currently, you can't. doseq is meant to carry out side effects; it doesn't return anything. Rarely do you ever want to use doseq, and rarely should you ever use def inside of function definitions.
You want to find the first line where (= name (y 0)) is true. There's a few ways of approaching that. A basic way would be using loop and just stopping it once you find the line. I think using map or for to loop over the line-seq, then grabbing the first result would work out well here though:
(defn city [name]
  (with-open [rdr (reader)]
    (first
      (for [line (drop 1 (line-seq rdr)) ; Same syntax here as with doseq
            :let [y (string/split line #",")] ; Use let instead of def for local definitions
            :when (= name (y 0))] ; Only keep lines where (= name (y 0)) is true
        line))))
for is like Python's generator expression (if you're familiar with Python). It is not like a normal imperative for loop like in Java. The for will return a list of lines for which (= name (y 0)) was true. Because presumably there's only one such valid line in the file though, we only want one result, so we pass the list to first to get the first valid line found.
And note that for is lazy. This does not iterate the entire file before passing off to first. first requests the first element before for has even iterated, and no more iteration is done once a matching line is found.

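The println function is meant for side effects, and always returns nil. The first step is to have the if produce line instead of printing it (doseq also always returns nil, so on its own this is not enough; see the reorganized version below):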
(if (= name (y 0))
line)
If you haven't seen it yet, look at
Brave Clojure (free & book)
Getting Clojure (book)
Clojure Cheatsheet
Here is a better organized version of the code. Your project dependencies will need to look like:
:dependencies [
[org.clojure/clojure "1.10.1"]
[prismatic/schema "1.1.12"]
[tupelo "0.9.168"]
]
and the code can then look like:
(ns tst.demo.core
(:use tupelo.core tupelo.test)
(:require
[clojure.string :as str]
[clojure.java.io :as io]
[tupelo.string :as ts]))
(def city-data
" city,lat,lon,country,ccode,province,unk1,unk2,unk3
Toronto,43.666667,-79.416667,Canada,CA,Ontario,admin,5213000,3934421
Chicago,40.666667,-99.416667,USA,US,dummy,admin,5213000,3934421
")
(defn city->fields [city-str]
(str/split city-str #","))
(defn city [name]
(with-open [rdr (io/reader (ts/string->stream city-data))]
(let [lines (mapv str/trim (line-seq rdr))
hdrs-line (first lines)
city-lines (rest lines)
cities-fields (mapv city->fields city-lines)
city-match (first (filterv #(= name (first %)) cities-fields))]
; debug printouts
(spyx hdrs-line)
(spyx-pretty city-lines)
(spyx city-match)
city-match))) ; <= return value
(dotest
(println "Result: " (city "Toronto"))
)
with result:
-------------------------------
Clojure 1.10.1 Java 13
-------------------------------
Testing tst.demo.core
hdrs-line => "city,lat,lon,country,ccode,province,unk1,unk2,unk3"
city-lines =>
("Toronto,43.666667,-79.416667,Canada,CA,Ontario,admin,5213000,3934421"
"Chicago,40.666667,-99.416667,USA,US,dummy,admin,5213000,3934421"
"")
city-match => ["Toronto" "43.666667" "-79.416667" "Canada" "CA" "Ontario" "admin" "5213000" "3934421"]
Result: [Toronto 43.666667 -79.416667 Canada CA Ontario admin 5213000 3934421]

How to convert a string to a list of symbols?

I am not sure if I am missing something very basic here, but I want to read in a string from a file and use it in my later program as a list of symbols, i.e.
"8C TS" should become ((8 C) (T S))
I know that I can split the initial string using the split-sequence library without problems, but as a string is a sequence of characters I end up with
> (loop :for c :across "8C" :collect c)
(#\8 #\C)
Is it possible to convert the initial string as specified above or is there some reason why this should not/could not be done?

If you want to represent cards as a generic datastructure, you might as well use a vector instead of a list. A vector of characters is just a string, so (split-sequence #\space hand), which gives ("8C" "TS"), should be enough. You'd define a hand to be a list of cards, a card a string of length 2 containing value and suit, and value and suit as characters.
You then use simple readers to access the attributes:
(defun card-value (card)
(aref card 0))
(defun card-suit (card)
(aref card 1))
If you want a more explicit approach, you might prefer defining classes or structs for each:
(defclass hand ()
((cards :initarg :cards
:reader hand-cards)))
(defclass card ()
((value :initarg :value
:reader card-value)
(suit :initarg :suit
:reader card-suit)))
Parsing creates such objects:
(defun read-hand (string &aux (upcased (string-upcase string)))
  (make-instance 'hand
                 :cards (mapcar #'read-card
                                (split-sequence #\space upcased))))
(defun read-card (string)
  (make-instance 'card
                 :value (case (aref string 0)
                          (#\T 10)
                          (#\J 11)
                          (#\Q 12)
                          (#\K 13)
                          (#\A 14)
                          (t (parse-integer (string (aref string 0)))))
                 ;; INTERN needs a string, not a character
                 :suit (intern (string (aref string 1)) '#:keyword)))
This would represent the value as an integer and the suit as a keyword. You might then want to define predicates like card=, card-suit-=, card-value-< etc.
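Such predicates could be sketched as follows. The card class is repeated from above so that the snippet runs on its own; the predicate bodies are one possible reading of the names card=, card-suit-= and card-value-<, not definitions taken from this answer.

```lisp
(defclass card ()
  ((value :initarg :value
          :reader card-value)
   (suit :initarg :suit
         :reader card-suit)))

(defun card-suit-= (a b)
  ;; Suits are keywords, so EQ is enough.
  (eq (card-suit a) (card-suit b)))

(defun card-value-< (a b)
  ;; Values are integers (ten through ace are 10 to 14).
  (< (card-value a) (card-value b)))

(defun card= (a b)
  ;; Two cards are equal when both value and suit match.
  (and (= (card-value a) (card-value b))
       (card-suit-= a b)))

(let ((eight-of-clubs (make-instance 'card :value 8 :suit :c))
      (ten-of-spades  (make-instance 'card :value 10 :suit :s)))
  (list (card-value-< eight-of-clubs ten-of-spades)
        (card= eight-of-clubs ten-of-spades)))
;=> (T NIL)
```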
