string-trim does not work on global variable - string

Why does string-trim not work on a global variable in Common Lisp?
(defvar *whitespaces* '(#\Space #\Newline #\Backspace #\Tab
                        #\Linefeed #\Page #\Return #\Rubout))
(defvar *str* "Hello World")
(defun trim (s)
  (string-trim *whitespaces* s))
(print (trim *str*))
;; output "Hello World"

As the manual says:
string-trim returns a substring of string, with all characters in character-bag stripped off the beginning and end.
So only the two ends of the string are affected:
CL-USER> (defvar *str* " Hello World ")
*STR*
CL-USER> (trim *str*)
"Hello World"
If you also want to collapse runs of whitespace between words, you can use a library such as cl-str:
CL-USER> (ql:quickload "str")
...
CL-USER> (str:collapse-whitespaces *str*)
"Hello World"
T
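The difference between trimming the ends and collapsing inner whitespace can also be sketched outside Lisp. The following Python snippet is only an illustration, not part of the original answer; the names WHITESPACE, trim and collapse_whitespace are hypothetical, and the character set is a rough analogue of *whitespaces* above.

```python
import re

# Rough analogue of *whitespaces*: space, newline, backspace, tab,
# page (form feed), return and rubout (delete).
WHITESPACE = " \n\b\t\f\r\x7f"

def trim(s: str) -> str:
    # Like string-trim: strips only the beginning and end of the string.
    return s.strip(WHITESPACE)

def collapse_whitespace(s: str) -> str:
    # Replaces every run of whitespace inside the string with one space.
    return re.sub(r"\s+", " ", s)

print(repr(trim("  Hello   World  ")))
print(repr(collapse_whitespace("Hello   World")))
```

Trimming leaves the three inner spaces alone; only the second function touches them.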

Related

Is there a function in the included Clojure libraries to split a string around another string?

I know that in clojure.string there is the split function which returns a sequence of the parts of the string excluding the given pattern.
(require '[clojure.string :as str-utils])
(str-utils/split "Yes, hello, this is dog yes hello it is me" #"hello")
;; -> ["Yes, " ", this is dog yes " " it is me"]
However, I'm trying to find a function that instead leaves the token as an element in the returned vector. So it would be like
(split-around "Yes, hello, this is dog yes hello it is me" #"hello")
;; -> ["Yes, " "hello" ", this is dog yes " "hello" " it is me"]
Is there a function that does this in any of the included libraries? Any in external libraries? I've been trying to write it myself but haven't been able to figure it out.
You can also use regex lookahead/lookbehind for that:
user> (clojure.string/split "Yes, hello, this is dog yes hello it is me" #"(?<=hello)|(?=hello)")
;;=> ["Yes, " "hello" ", this is dog yes " "hello" " it is me"]
You can read it as "split on the zero-length string at every point where the preceding or the following text is 'hello'".
Note that it also drops the dangling empty strings for adjacent matches and for leading/trailing ones:
user> (clojure.string/split "helloYes, hello, this is dog yes hellohello it is mehello" #"(?<=hello)|(?=hello)")
;;=> ["hello"
;; "Yes, "
;; "hello"
;; ", this is dog yes "
;; "hello"
;; "hello"
;; " it is me"
;; "hello"]
You can wrap it in a function like this, for example:
(defn split-around [source word]
  (let [word (java.util.regex.Pattern/quote word)]
    (->> (format "(?<=%s)|(?=%s)" word word)
         re-pattern
         (clojure.string/split source))))
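For comparison, here is a minimal sketch of the same idea in Python, given purely as an illustration. It does not use lookarounds: Python's re.split keeps the separators when the pattern contains a capturing group. The function name split_around simply mirrors the Clojure one above.

```python
import re

def split_around(source: str, word: str) -> list:
    # re.escape plays the role of Pattern/quote: word is matched literally.
    # The capturing group makes re.split keep each separator in the result.
    parts = re.split(f"({re.escape(word)})", source)
    # Drop the empty strings produced by adjacent or leading/trailing matches.
    return [p for p in parts if p]

print(split_around("Yes, hello, this is dog yes hello it is me", "hello"))
```

Unlike the lookaround version, this one produces empty strings for adjacent matches, so they are filtered out explicitly.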
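Alternatively, if a marker character such as ~ never occurs in the input, you can wrap each match in the marker and split on it (str is an alias for clojure.string):
(-> "Yes, hello, this is dog yes hello it is me"
    (str/replace #"hello" "~hello~")
    (str/split #"~"))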
Example using @Shlomi's solution:
(ns tst.demo.core
  (:use tupelo.core tupelo.test)
  (:require [clojure.string :as str]))
(dotest
  (let [input-str "Yes, hello, this is dog yes hello it is me"
        segments  (mapv str/trim
                    (str/split input-str #"hello"))
        result    (interpose "hello" segments)]
    (is= segments ["Yes," ", this is dog yes" "it is me"])
    (is= result ["Yes," "hello" ", this is dog yes" "hello" "it is me"])))
Update
It might be best to write a custom loop for this use case, something like:
(ns tst.demo.core
  (:use tupelo.core tupelo.test)
  (:require
    [clojure.string :as str]))
(defn strseg
  "Will segment a string like '<a><tgt><b><tgt><c>' at each occurrence of `tgt`, producing
  an output vector like [ <a> <tgt> <b> <tgt> <c> ]."
  [tgt source]
  (let [tgt-len  (count tgt)
        segments (loop [result []
                        src    source]
                   (if (empty? src)
                     result
                     (let [i (str/index-of src tgt)]
                       (if (nil? i)
                         (let [result-next (into result [src])
                               src-next    nil]
                           (recur result-next src-next))
                         (let [pre-tgt     (subs src 0 i)
                               result-next (into result [pre-tgt tgt])
                               src-next    (subs src (+ tgt-len i))]
                           (recur result-next src-next))))))
        result   (vec
                   (remove (fn [s] (or (nil? s)
                                       (empty? s)))
                     segments))]
    result))
with unit tests:
(dotest
  (is= (strseg "hello" "Yes, hello, this is dog yes hello it is me")
    ["Yes, " "hello" ", this is dog yes " "hello" " it is me"])
  (is= (strseg "hello" "hello")
    ["hello"])
  (is= (strseg "hello" "") [])
  (is= (strseg "hello" nil) [])
  (is= (strseg "hello" "hellohello") ["hello" "hello"])
  (is= (strseg "hello" "abchellodefhelloxyz") ["abc" "hello" "def" "hello" "xyz"]))
Here is another solution that avoids the problems with repetitive patterns and double recognitions present in leetwinski's answer (see my comments) and also computes the parts as lazily as possible:
(defn partition-str [s sep]
  (->> s
       (re-seq
         (->> sep
              java.util.regex.Pattern/quote ; remove this to treat sep as a regex
              (format "((?s).*?)(?:(%s)|\\z)")
              re-pattern))
       (mapcat rest)
       (take-while some?)
       (remove empty?))) ; remove this to keep empty parts
However, this does not behave correctly or intuitively when the separator is, or matches, the empty string.
Another way could be to use both re-seq and split with the same pattern and interleave the resulting sequences as shown in this related question. Unfortunately this way every occurrence of the separator will be recognized twice.
Perhaps a better approach would be to build on a more primitive basis using re-matcher and re-find.
Finally, to offer a straighter answer to the initial question, there is no such function in Clojure's standard library or any external library AFAIK. Moreover I don't know of any simple and completely unproblematic solution to this problem (especially with a regex-separator).
UPDATE
Here is the best solution I can think of right now, working on a lower level, lazily and with a regex-separator:
(defn re-partition [re s]
  (let [mr (re-matcher re s)]
    ((fn rec [i]
       (lazy-seq
         (if-let [m (re-find mr)]
           (list* (subs s i (.start mr)) m (rec (.end mr)))
           (list (subs s i)))))
     0)))
(def re-partition+ (comp (partial remove empty?) re-partition))
Notice that we can (re)define:
(def re-split (comp (partial take-nth 2) re-partition))
(def re-seq (comp (partial take-nth 2) rest re-partition))
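The same low-level approach can be sketched in Python with a generator, shown here only as an illustration. The name re_partition mirrors the Clojure function above; re.finditer stands in for re-matcher and re-find, and the generator gives the laziness. Like re-partition, it keeps empty parts.

```python
import re
from typing import Iterator

def re_partition(pattern: str, s: str) -> Iterator[str]:
    pos = 0
    for m in re.finditer(pattern, s):
        yield s[pos:m.start()]   # text before the separator (may be empty)
        yield m.group(0)         # the separator itself
        pos = m.end()
    yield s[pos:]                # whatever follows the last separator

parts = list(re_partition("hello", "abchellodefhelloxyz"))
print(parts)        # parts and separators, alternating
print(parts[::2])   # the parts only, like re-split above
print(parts[1::2])  # the separators only, like re-seq above
```

Every even position holds a part and every odd position a separator, which is why the two slices recover the split and the match sequence.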

Elisp: How to search a wordlist and copy the results to another buffer?

I have a wordlist
dempron {hic, haec, hoc, huius, huic, hunc, hanc, hac, hi, hae, horum, harum, his, hos, has}
I have a xml-kind-of text
<p>Hoc templum magnum est.</p>
<p>Templa Romanorum magna sunt.</p>
<p>Claudia haec templa in foro videt.</p>
I would like to search the wordlist "dempron" and copy the sentences that have words from the wordlist to a buffer called results.
I agree with Simon Fromme, but hopefully this will get you started. Let me know if you have any questions!
(require 'seq) ;; for `seq-some'
(defconst dempron
  '("hic" "haec" "hoc" "huius" "huic" "hunc" "hanc" "hac" "hi" "hae" "horum"
    "harum" "his" "hos" "has"))
(defun dempron-search ()
  "A function to naively search for sentences in XML <p> tags
containing words from `dempron'. Run this in the buffer you want
to search, and it will search from POINT onwards, writing results
to a buffer called 'results'."
  (interactive)
  (beginning-of-line)
  (while (not (eobp)) ;; while we're not at the end of the buffer
    (let ((cur-line ;; get the current line as a string
           (buffer-substring-no-properties
            (progn (beginning-of-line) (point))
            (progn (end-of-line) (point)))))
      ;; See if our current line is in a <p> tag (and run `string-match' so we
      ;; can extract the sentence with `match-string')
      (when (string-match "^<p>\\(.*\\)</p>$" cur-line)
        ;; then extract the actual sentence with `match-string'
        (setq cur-line (match-string 1 cur-line))
        ;; Split the sentence on whitespace and anything it is likely to end
        ;; with; `downcase' makes the search case-insensitive. `seq-some'
        ;; stops at the first hit, so each sentence is written only once.
        (when (seq-some (lambda (word) (member (downcase word) dempron))
                        (split-string cur-line "[[:space:].?!\"]+"))
          ;; We have a match! Temporarily switch to the
          ;; results buffer and write the sentence
          (with-current-buffer (get-buffer-create "results")
            (insert cur-line "\n")))))
    (forward-line 1))) ;; Move to the next line
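The matching logic itself is independent of Emacs. As a rough sketch of what the function does (in Python, for illustration only; the names DEMPRON and matching_sentences are hypothetical), the sample text from the question can be filtered like this:

```python
import re

DEMPRON = {"hic", "haec", "hoc", "huius", "huic", "hunc", "hanc", "hac",
           "hi", "hae", "horum", "harum", "his", "hos", "has"}

def matching_sentences(text: str) -> list:
    results = []
    for line in text.splitlines():
        # Only consider lines that consist of a single <p>...</p> element.
        m = re.fullmatch(r"<p>(.*)</p>", line.strip())
        if m is None:
            continue
        sentence = m.group(1)
        # Split on whitespace and likely sentence-ending punctuation.
        words = re.split(r"[\s.?!\"]+", sentence)
        # Lower-case each word so the search is case-insensitive.
        if any(w.lower() in DEMPRON for w in words):
            results.append(sentence)
    return results

sample = """<p>Hoc templum magnum est.</p>
<p>Templa Romanorum magna sunt.</p>
<p>Claudia haec templa in foro videt.</p>"""
print(matching_sentences(sample))
```

For the three sample sentences this keeps the first and the third, which contain "Hoc" and "haec".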

In emacs: exclude folders from searching IDs "gid"

I have this snippet of code in my dotemacs file that helps me view IDs.
How is it possible to exclude some folders from the ID search?
; gid.el -- run gid using compilation mode.
;(require 'compile)
;(require 'elisp-utils)
;(provide 'gid)
(defvar gid-command "gid" "The command run by the gid function.")
(defun gid (args)
  "Run gid, with user-specified ARGS, and collect output in a buffer.
While gid runs asynchronously, you can use the \\[next-error] command to
find the text that gid hits refer to. The command actually run is
defined by the gid-command variable."
  (interactive (list
                ;(read-input (concat "Run " gid-command " (with args): ") ;confirmation
                (word-around-point)))
  ;)
  ;; Preserve the present compile-command
  (let (compile-command
        (gid-buffer ;; if gid for each symbol use: compilation-buffer-name-function
         (lambda (mode) (concat "*gid " args "*"))))
    ;; For portability between v18 & v19, use compile rather than compile-internal
    (compile (concat gid-command " " args))))
(defun word-around-point ()
  "Return the word around the point as a string."
  (save-excursion
    (if (not (eobp))
        (forward-char 1))
    (forward-word -1)
    (forward-word 1)
    (forward-sexp -1)
    (let ((beg (point)))
      (forward-sexp 1)
      (buffer-substring beg (point)))))
Found the solution.
When building the ID database, simply prune any uninteresting folders, like this:
mkid --prune X

REBOL metaprogramming questions

I'm very new to REBOL (i.e. yesterday).
I am using the term "metaprogramming" here, but I'm not sure if it is accurate. At any rate, I'm trying to understand how REBOL can execute words. To give an example, here is some code in TCL:
> # puts is the print command
> set x puts
> $x "hello world"
hello world
I've tried many different ways to do something similar in REBOL, but can't get quite the same effect. Can someone offer a few different ways to do it (if possible)?
Thanks.
Here are a few ways:
x: :print ;; assign 'x to 'print
x "hello world" ;; and execute it
hello world
blk: copy [] ;; create a block
append blk :print ;; put 'print in it
do [blk/1 "hello world"] ;; execute first entry in the block (which is 'print)
hello world
x: 'print ;; assign 'x to the value 'print
do x "hello world" ;; execute the value contained in 'x (ie 'print)
hello world
x: "print" ;; assign x to the string "print"
do load x "hello world" ;; execute the value you get from loading the string in 'x
hello world
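For readers coming from other languages, the first, second and last variants correspond roughly to the following Python sketch. It is an analogy only and says nothing about how REBOL evaluates words; the variable names are taken from the examples above.

```python
import builtins

x = print                 # bind x to the function value itself
x("hello world")          # and call it

blk = []                  # create a list
blk.append(print)         # put the function in it
blk[0]("hello world")     # call the first entry

name = "print"            # only the name, as a string
getattr(builtins, name)("hello world")  # look the function up, then call it
```

Each of the three calls prints hello world.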

Racket eof-object read from input port

I tried to read a string from an input port in Racket, but no matter which API function I used to read (read, read-string, read-bytes, etc.), the return value was never an eof object.
(define (some_process inp)
  (begin
    (let ([c (read-string 1 inp)])
      (if (eof-object? c)
          (begin
            (display "EOF \n")
            #f)
          (if (equal? c "\n")
              (begin
                (display "NEWLINE \n"))
              (some_process inp))))))
Can c never be an eof object?
If I display c, it is always a newline.
Read the reference:
read-char: "Reads a single character from in – which may involve reading several bytes to UTF-8-decode them into a character. If no bytes are available before an end-of-file, then eof is returned."
read-string: "Returns a string containing the next amt characters from in. If no characters are available before an end-of-file, then eof is returned."
Examples:
> (read-char (open-input-string "char"))
#\c
> (read-string 50 (open-input-string "the string"))
"the string"
>
But if there are no character(s) in the buffer, you'll get eof:
> (read-char (open-input-string ""))
#<eof>
> (read-string 50 (open-input-string ""))
#<eof>
I think you just want to read some number of characters in a loop and do something with them. If so, the solution would look something along the lines of:
(define (another-process inp)
  (let ([c (read-char inp)])
    (if (eof-object? c)
        (begin (display "==== EOF ====") (newline))
        (begin (display c) (newline)
               (another-process inp)))))
Example:
> (another-process (open-input-string "OK"))
O
K
==== EOF ====
> (another-process (open-input-string ""))
==== EOF ====
>
Notice that the second call to another-process, with an empty string, detects eof immediately and exits the loop.
EDIT:
In case you need to check whether the character read is a newline:
(define (process-moo inp)
  (let ([c (read-char inp)])
    (cond
      ((eof-object? c)
       (display "==== EOF ====") (newline))
      ((char=? c #\newline)
       (newline) (display "---- NEWLINE ----") (newline)
       (process-moo inp))
      (else
       (display c)
       (process-moo inp)))))
Example:
> (call-with-input-string "Hello\nhumans\n!" process-moo)
Hello
---- NEWLINE ----
humans
---- NEWLINE ----
!==== EOF ====
>
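The same read-until-EOF pattern exists in most languages. As a small illustration in Python (not Racket; the function name read_events is hypothetical), a text stream returns the empty string at end-of-file instead of a dedicated eof object, and the loop is otherwise the same:

```python
import io

def read_events(inp) -> list:
    """Read inp one character at a time until end-of-file."""
    events = []
    while True:
        c = inp.read(1)            # "" signals end-of-file for text streams
        if c == "":
            events.append("EOF")
            return events
        events.append("NEWLINE" if c == "\n" else c)

print(read_events(io.StringIO("OK")))
print(read_events(io.StringIO("")))
```

As in the Racket version, an empty input reaches the EOF branch on the very first read.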
Hope that helps.
If you are entering your input from the console, try pressing Ctrl+D (on Unix and macOS), or Ctrl+Z followed by Enter (on Windows). This signals the end of the input.
