Given the following string:
(def text "this is the first sentence . And this is the second sentence")
I wanted to count the number of times a word like "this" appears in the text, by appending the count after each occurrence of the word. Like this:
["this: 1" "is" "the" "first" "sentence" "." "And" "this: 2" ...]
As a first step, I tokenized the string:
(def words (split text #" "))
Then I created a helper function to get the number of times "this" appears in the text:
(defn count-this [] (count (re-seq #"this" text)))
Finally I tried to use the result of the count-this function inside this loop:
(for [x words]
(if (= x "this")
(str "this: "(apply str (take (count-this)(iterate inc 0))))
x))
Here is what I get:
("this: 01" "is" "the" "first" "sentence" "." "And" "this: 01" "is" ...)
This can be achieved fairly succinctly using reduce to thread a counter through your vector traversal, in addition to building the new strings as needed:
(def text "this is the first sentence. And this is the second sentence.")
(defn notate-occurences [word string]
(->
(reduce
(fn [[count string'] member]
(if (= member word)
(let [count' (inc count)]
[count' (conj string' (str member ": " count'))])
[count (conj string' member)]))
[0 []]
(clojure.string/split string #" "))
second))
(notate-occurences "this" text)
;; ["this: 1" "is" "the" "first" "sentence." "And" "this: 2" "is" "the" "second" "sentence."]
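For readers more at home outside Clojure, the same idea of threading a counter through a reduce can be sketched in Python. This is only an illustration, not part of the original answer: the name notate_occurrences and the (count, tokens) accumulator shape are assumptions chosen to mirror the [0 []] seed above.

```python
from functools import reduce


def notate_occurrences(word, text):
    """Append a running count to each occurrence of `word` in `text`."""

    def step(state, token):
        # state is a (count, tokens-so-far) pair, like [0 []] in the Clojure version
        count, out = state
        if token == word:
            count += 1
            return count, out + [f"{token}: {count}"]
        return count, out + [token]

    _, result = reduce(step, text.split(" "), (0, []))
    return result


print(notate_occurrences("this", "this is the first sentence. And this is the second sentence."))
```

The counter never lives outside the reduction: each step receives the previous count and returns the next one.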
(defn split-by-word [word text]
(remove empty?
(flatten
(map #(if (number? %) (str word ": " (+ 1 %)) (clojure.string/split (clojure.string/trim %) #" "))
(butlast (interleave
(clojure.string/split (str text " ") (java.util.regex.Pattern/compile (str "\\b" word "\\b")))
(range)))))))
You need to keep some state as you are going along. reduce, loop/recur and iterate all do this. iterate just transitions from one state to another. Here is the transition function:
(defn transition [word]
(fn [[[head & tail] counted out]]
(let [[next-counted to-append] (if (= word head)
[(inc counted) (str head ": " (inc counted))]
[counted head])]
[tail next-counted (conj out to-append)])))
Then you can use iterate to exercise this function until there is no input left:
(let [in (s/split "this is the first sentence . And this is the second sentence" #" ")
step (transition "this")]
(->> (iterate step [in 0 []])
(drop-while (fn [[[head & _] _ _]]
head))
(map #(nth % 2))
first))
;; => ["this: 1" "is" "the" "first" "sentence" "." "And" "this: 2" "is" "the" "second" "sentence"]
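The transition-function idea is not specific to Clojure. As a rough sketch in Python (the names transition and run are hypothetical, and a plain while loop stands in for iterate plus drop-while), each state is a (remaining, counted, out) triple and the step function maps one state to the next:

```python
def transition(word):
    """Return a step function mapping one (remaining, counted, out) state to the next."""

    def step(state):
        remaining, counted, out = state
        head, tail = remaining[0], remaining[1:]
        if head == word:
            counted += 1
            head = f"{head}: {counted}"
        return tail, counted, out + [head]

    return step


def run(word, tokens):
    step = transition(word)
    state = (tokens, 0, [])
    # Keep stepping until there is no input left, then return the output part.
    while state[0]:
        state = step(state)
    return state[2]


print(run("this", "this is the first sentence . And this is the second sentence".split(" ")))
```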
The problem with that approach is that (apply str (take (count-this)(iterate inc 0))) is going to evaluate to the same thing every time.
To exert complete control over variables you generally want to use the loop form.
e.g.
(defn add-indexes [word phrase]
(let [words (str/split phrase #"\s+")]
(loop [src words
dest []
counter 1]
(if (seq src)
(if (= word (first src))
(recur (rest src) (conj dest (str word " " counter)) (inc counter))
(recur (rest src) (conj dest (first src)) counter))
dest))))
user=> (add-indexes "this" "this is the first sentence . And this is the second sentence")
["this 1" "is" "the" "first" "sentence" "." "And" "this 2" "is" "the" "second" "sentence"]
loop lets you specify the value of each loop variable on every pass, so you can decide whether or not to change it according to your own logic.
If you're willing to dip into Java and maybe do something that feels like cheating, this would work too.
(defn add-indexes2 [word phrase]
(let [count (java.util.concurrent.atomic.AtomicInteger. 1)]
(map #(if (= word %) (str % " " (.getAndIncrement count)) %)
(str/split phrase #"\s+"))))
user=> (add-indexes2 "this" "this is the first sentence . And this is the second sentence")
("this 1" "is" "the" "first" "sentence" "." "And" "this 2" "is" "the" "second" "sentence")
Using the mutable counter may not be pure, but on the other hand, it never escapes the context of the function, so its behavior cannot be changed by external forces.
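The same "mutable but confined" trick can be illustrated in Python, assuming itertools.count as a stand-in for AtomicInteger; the function name add_indexes is just borrowed from above for the sketch. The counter is created inside the function and is never handed out, so callers cannot interfere with it.

```python
from itertools import count


def add_indexes(word, phrase):
    """Suffix each occurrence of `word` with a running index, starting at 1."""
    counter = count(1)  # mutable state, but local to this call
    # next(counter) is only evaluated for tokens equal to `word`
    return [f"{tok} {next(counter)}" if tok == word else tok for tok in phrase.split()]


print(add_indexes("this", "this is the first sentence . And this is the second sentence"))
```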
Usually, you can find a simple way of composing your solution from existing Clojure functions in a very succinct way.
Here are two quite short solutions to your problem. First, if you don't need the result as a sequence and replacements within the string are OK:
(require 'clojure.string)
(def text "this is the first sentence . And this is the second sentence")
(defn replace-token [ca token]
(swap! ca inc)
(str token ": " @ca))
(defn count-this [text]
(let [counter (atom 0)
replacer-fn (partial replace-token counter)]
(clojure.string/replace text #"this" replacer-fn)))
(count-this text)
; => "this: 1 is the first sentence . And this: 2 is the second sentence"
The above solution makes use of the fact that a function can be supplied to clojure.string/replace.
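Many regex libraries accept a function as the replacement in the same way. As a hedged sketch, Python's re.sub takes a callable that receives each match. The name count_token is hypothetical, and unlike the #"this" pattern above this version adds word boundaries, which is an assumption about the desired behaviour.

```python
import re
from itertools import count


def count_token(text, token="this"):
    """Replace each whole-word `token` with 'token: n', where n counts occurrences."""
    counter = count(1)
    # \b restricts matches to whole words (an addition, not in the original pattern)
    pattern = re.compile(rf"\b{re.escape(token)}\b")
    return pattern.sub(lambda m: f"{m.group(0)}: {next(counter)}", text)


print(count_token("this is the first sentence . And this is the second sentence"))
```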
Second, if you need the result as a sequence, there is some overhead from tokenizing:
(defn count-seq [text]
(let [counter (atom 0)
replacer-fn (partial replace-token counter)
converter (fn [tokens] (map #(if (not= % "this")
%
(replacer-fn %))
tokens))]
(-> text
(clojure.string/split #" ")
(converter))))
(count-seq text)
; => ("this: 1" "is" "the" "first" "sentence" "." "And" "this: 2" "is" "the" "second" "sentence")
The loop-recur pattern is very common for beginning Clojurians, who come from non-functional languages. In most cases, there is a cleaner and more idiomatic solution using functional processing with map, reduce, and friends.
Like other answers have stated, the main issue in your original attempt is the binding of your counter. In fact, (iterate inc 0) is not bound to anything. Look at my examples above to think through the scope of the bound atom counter. As a reference, here is an example of using closures, which could also be used in this case with great success!
As a footnote to the examples above: for cleaner code, you should make a more general solution by extracting and reusing the common parts of the count-seq and count-this functions. The local converter function could also be extracted from count-seq. replace-token already works for any token, but consider how the whole solution could be extended to match text other than "this". These are left as exercises for the reader.
We are trying to implement a DraftSight/AutoCAD script that will transform an SVG file into a CAD drawing.
The basic idea is to read the file line by line (performed by ReadSVGData), split the SVG definitions at spaces (ReadHTMLItemData), read the individual HTML attributes into a list and, based on the type of the SVG item, draw a CAD element. So much for the principle...
The unusual part is that whenever HTML attributes such as id="Box_8_0" are passed to the findchar function by the attrlis function, the script fails, although the same arrangement worked before.
Does anybody have a hint where my mistake is hidden?
(defun findchar (FindChar Text)
(setq
;current location in string
coord 1
;Init Return Coordinate
ReturnCoord 0
;Length of Searched Item, to enable string searching
FindCharLen (strlen FindChar)
;Nil Count: Requires as regular expressions like (/t) are identified as two times ascii char 9
NilCnt 0
;Storage of last Char Ascii to identify regular expressions
LastCharAsci -1
)
;iterate the String and break in case of the first occurrence
(while (and (<= coord (strlen Text) ) (= ReturnCoord 0))
;Current Character
(setq CurChar (substr Text coord FindCharLen))
;Find Searched String
(if (= FindChar CurChar)
(setq ReturnCoord coord)
)
;Check for regular expression
(if (and (= LastCharAsci 9) (= (ascii CurChar) 9))
(setq NilCnt (+ NilCnt 1))
)
;Update String position and String
(setq LastCharAsci (ascii CurChar))
(setq coord (+ coord 1))
)
;return variable
(- ReturnCoord NilCnt)
)
(defun attrlis (HTMLAttr)
(setq Koordi 0)
(progn
(setq CharLoc (findchar "<" HTMLAttr))
(princ HTMLAttr)
(terpri)
)
(+ Koordi 1)
)
(defun ReadHTMLItemData(HTMLItem)
(setq
coord 1
HTMLItmBgn 1
Attributes 0
CurChar 0
Dictionary 0
)
;(princ HTMLItem)
;(terpri)
(while (<= coord (strlen HTMLItem))
(setq CurChar (substr HTMLItem coord 1))
(if (or (= (ascii CurChar) 32) (= (ascii CurChar) 62))
(progn
(if (> (- coord HTMLItmBgn) 0)
(progn
(setq htmlattr (substr HTMLItem HTMLItmBgn (- coord HTMLItmBgn)))
(setq Result (attrlis htmlattr))
(princ Result)
(setq HTMLItmBgn (+ coord 1))
)
)
)
)
(setq coord (+ coord 1))
)
)
(defun ReadLineContents(Line)
(if (/= Line nil)
(progn
;(princ Line)
;(terpri)
(setq
Bgn (findchar "<" Line)
End (findchar ">" Line)
ItemDef (substr Line (+ Bgn (strlen "<")) End)
)
(ReadHTMLItemData ItemDef)
)
)
)
(defun C:ReadSVGData()
(setq SVGFile (open (getfiled "Select a file" "" "svg" 0) "r"))
(setq Line 1)
(while (/= Line nil)
(setq Line (read-line SVGFile))
(ReadLineContents Line)
)
(close SVGFile)
(princ "Done")
)
Reading the following file:
<svg class="boxview" id="boxview" style="width:1198.56px; height:486.8004px; display:block;" viewBox="0 0 1198.56 486.8004">
<g id="BD_box">
<rect class="box" id="Box_8_0" x="109.21" y="394.119" width="58.512" height="62.184" box="4047"></rect>
</g>
</svg>
EDIT
Change of substring Index, based on satraj's answer
The problem lies in the way the AutoLISP substr function is used. The start index of substr begins at 1, not 0, so your code must be changed so that the start indexes are initialized to 1. The following lines in your code fail:
(setq CurChar (substr HTMLItem coord 1))
(setq htmlattr (substr HTMLItem HTMLItmBgn (- coord HTMLItmBgn)))
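Since the coord and HTMLItmBgn variables are initialized to 0, the substr function fails.
Also, why not use the vl-string-search function if you want to find the position of a text in a string? You can then get rid of the findchar function. Note that, unlike substr, vl-string-search returns a zero-based position (or nil when nothing is found).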
An Example:
(setq CharLoc (vl-string-search "<" HTMLAttr))
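Mixing a zero-based search result with a one-based substring call is an easy source of off-by-one errors. The following Python sketch only emulates the two conventions to make the conversion explicit; substr and find_1_based are hypothetical helpers written for this illustration, not AutoLISP functions, and the error raised for a start index below 1 is an assumption about how a strict one-based substring would behave.

```python
def substr(s, start, length=None):
    """Emulate a 1-based substring: start=1 is the first character."""
    if start < 1:
        raise ValueError("start index must be 1 or greater")
    begin = start - 1
    return s[begin:] if length is None else s[begin:begin + length]


def find_1_based(needle, haystack):
    """Return the 1-based position of `needle`, or None if it is absent."""
    pos = haystack.find(needle)  # str.find is 0-based and returns -1 when absent
    return pos + 1 if pos >= 0 else None


attr = 'id="Box_8_0"'
eq = find_1_based("=", attr)
print(substr(attr, 1, eq - 1))  # attribute name
print(substr(attr, eq + 1))     # attribute value, including quotes
```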
In general, if you want to debug failures in AutoLISP, add the following function to your Lisp file. It prints a stack trace when an error occurs, which lets you locate the exact place where the error occurred.
(defun *error* (msg)
(vl-bt)
)
I want to write a function that deletes all vowels from a string. I thought of defining a function that detects the vowels, something similar to symbolp, zerop and so on and if it is a vowel, delete it. How can I do this? I would appreciate any input on this. Thanks
(defun deletevowels (string)
(go through the list
(if vowel-p deletevowels )
)
)
However, the function I have so far only deletes a vowel if it is the last character. How can I modify it to delete all vowels in a string? The code below contains that function, along with the vowel-p predicate I mentioned.
(defun strip-vowel (word)
"Strip off a trailing vowel from a string."
(let* ((str (string word))
(end (- (length str) 1)))
(if (vowel-p (char str end))
(subseq str 0 end)
str)))
(defun vowel-p (char) (find char "aeiou" :test #'char-equal))
Also, would it be easier to use the function below to turn the string into a list, and then loop over the list instead of the string to find and remove the vowels?
(defun string-to-list (string)
(loop for char across string collect char))
CL-USER 27 > (defun vowel-p (char)
(find char "aeiou" :test #'char-equal))
VOWEL-P
CL-USER 28 > (remove-if #'vowel-p "abcdef")
"bcdf"
See: Common Lisp Hyperspec, REMOVE-IF.
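The pattern here is "filter a sequence with a predicate", and a string is just a sequence of characters. For comparison, a minimal Python sketch of the same approach (is_vowel and delete_vowels are illustrative names, and like char-equal the check is case-insensitive):

```python
def is_vowel(ch):
    """Case-insensitive vowel test for a single character."""
    return ch.lower() in "aeiou"


def delete_vowels(s):
    """Keep only the characters that are not vowels."""
    return "".join(ch for ch in s if not is_vowel(ch))


print(delete_vowels("abcdef"))
```

No conversion to a list is needed in either language: the predicate is applied directly to the characters of the string.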
CL-USER 29 > (defun deletevowels (string)
(remove-if #'vowel-p string))
DELETEVOWELS
CL-USER 30 > (deletevowels "spectacular")
"spctclr"
In Emacs or Vim, what's a smooth way to join strings as in this example:
Transform from:
(alpha, beta, gamma) blah (123, 456, 789)
To:
(alpha=123, beta=456, gamma=789)
It would need to scale to:
many lines of these
many elements in the parentheses
I have recently found myself needing this kind of transformation often.
I use Evil in Emacs which is why a Vim answer would likely also help.
UPDATE:
The solutions were not as general as I had hoped. For example, I'd like the solution to also work when I have a list of strings and wish to distribute them into a large XML document. eg:
<item foo="" bar="barval1"/>
<item foo="" bar="barval2"/>
<item foo="" bar="barval3"/>
<item foo="" bar="barval4"/>
fooval1
fooval2
fooval3
fooval4
I formulated a solution and have added it as an answer.
%s/(\(\S\{-}\), \(\S\{-}\), \(\S\{-}\)).\{-}(\(\S\{-}\), \(\S\{-}\), \(\S\{-}\))/(\1=\4, \2=\5, \3=\6)
%s: search and replace on every line of the buffer
\(\S\{-}\),: non-greedy match of non-whitespace characters up to the next comma, wrapped in \( and \) for backreferencing
\1=\4 : prints out the first match, an "=" sign, then the fourth match
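To see the capture-group logic outside of Vim, here is a rough Python equivalent. It is a sketch only: PATTERN and merge_triples are hypothetical names, it uses +? (one or more) where the Vim pattern uses \{-} (zero or more), and like the substitution above it handles exactly three elements per list.

```python
import re

# Six non-greedy groups: three names, any text in between, then three values.
PATTERN = re.compile(r"\((\S+?), (\S+?), (\S+?)\).*?\((\S+?), (\S+?), (\S+?)\)")


def merge_triples(line):
    # \1..\3 are the names and \4..\6 the values, paired positionally.
    return PATTERN.sub(r"(\1=\4, \2=\5, \3=\6)", line)


print(merge_triples("(alpha, beta, gamma) blah (123, 456, 789)"))
```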
For this kind of text transformation, I would go with awk.
This one-liner may help:
awk -F'\\(|\\)' '{n=split($2,k,/, */);split($4,v,/, */);printf "(";for(i=1;i<=n;i++)printf "%s%s=%s",(i>1?", ":""),k[i],v[i];print ")"}' file
A little test:
kent$ cat test
(alpha, beta, gamma) blah (123, 456, 789)
(a, b, c) foo (1, 2, 3)
(x, y, z, m, n) bar (100, 200, 300, 400, 500)
kent$ awk -F'\\(|\\)' '{n=split($2,k,/, */);split($4,v,/, */);printf "(";for(i=1;i<=n;i++)printf "%s%s=%s",(i>1?", ":""),k[i],v[i];print ")"}' test
(alpha=123, beta=456, gamma=789)
(a=1, b=2, c=3)
(x=100, y=200, z=300, m=400, n=500)
An Emacs Lisp version of Prince Goulash's answer:
(require 'cl)
(defun split-and-trim (str separator)
(let ((strs (split-string str separator)))
(mapcar (lambda (s)
(replace-regexp-in-string "^\\s-+" "" s))
(mapcar (lambda (s)
(replace-regexp-in-string "\\s-+$" "" s)) strs))))
(defun my/merge-list (beg end)
(interactive "r")
(goto-char beg)
(let ((endmark (set-mark end))
(regexp "(\\([^)]+\\))[^(]+(\\([^)]+\\))"))
(while (re-search-forward regexp end t)
(let ((replace-start (match-beginning 0))
(replace-end (match-end 0))
(keys-str (match-string-no-properties 1))
(values-str (match-string-no-properties 2)))
(let* ((keys (split-and-trim keys-str ","))
(values (split-and-trim values-str ",")))
(while (> (length keys) (length values))
(setq values (append values '(""))))
(let* ((pairs (mapcar* (lambda (k v)
(format "%s=%s" k v)) keys values))
(transformed (format "(%s)" (mapconcat #'identity pairs ", "))))
(goto-char replace-start)
(delete-region replace-start replace-end)
(insert transformed)))))
(goto-char (marker-position endmark))))
For example, select the following region:
(alpha, beta, gamma) blah (123, 456, 789)
(alpha, beta, gamma, delta) blah (123, 456, 789, aaa)
After M-x my/merge-list, it becomes:
(alpha=123, beta=456, gamma=789)
(alpha=123, beta=456, gamma=789, delta=aaa)
This method I'm going to describe is a bit wacky, but it involves the minimum amount of Elisp code I could manage. It's only applicable if the lists to be joined can be interpreted as Lisp lists once the commas in them are removed. Numbers and sequences of alphabetic characters, as in your example, would be fine.
First, make sure that the Common Lisp library is loaded: M-:(require 'cl)RET.
Now, starting with the cursor at the start of the first list:
M-C-k ; kill-sexp
C-e ; move-end-of-line
M-C-b ; backward-sexp
M-C-k ; kill-sexp
C-a ; move-beginning-of-line
C-k ; kill-line
Now blah (or whatever) is the first entry in the kill ring, the second list is the second entry, and the first list is the third entry.
Type (, then M-: (eval-expression), take a deep breath, and type this:
(loop with (a b) = (mapcar (lambda (x) (car (read-from-string (remove ?, x))))
(subseq kill-ring 1 3))
for x in a for y in b do (insert (format "%s=%s, " y x)))
(I've broken it up for presentation purposes, but you can type it all on one line.)
Finally, type DEL DEL ) to remove the trailing comma and space and close the list, and you're done! You could turn it into a keyboard macro, if you wanted.
Here is a Vimscript solution. It is nowhere near as elegant as ash's answer, but it works with lists of any length.
function! ListMerge()
" Get line, remove text between lists, split lists at parentheses:
let curline = getline('.')
let curline = substitute(curline,')\zs.*\ze(','','g')
let curline = substitute(curline,'(','','g')
let lists = map(split(curline,')'),'split(v:val,",")')
" Return if we don't have two lists of equal length:
if len(lists) != 2 || len(lists[0]) != len(lists[1])
return
endif
" Loop over the lists, remove whitespace, build the replacement string:
let i=0
let string = '('
while i<len(lists[0])
let string .= substitute(lists[0][i],'^ *','','')
let string .= '='
let string .= substitute(lists[1][i],'^ *','','')
let string .= ', '
let i+=1
endwhile
" Add the concluding bracket:
let string = substitute(string,', $',')','')
" Replace the current line with the string:
execute "normal! S" . string
endfunction
You can then call this function on all lines like this:
:%call ListMerge()
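The steps the function performs (extract the two lists, check that they have the same length, trim whitespace, pair the elements up) can also be sketched in Python for comparison. The names LISTS and merge_lists are hypothetical, and returning the line unchanged when the check fails is an assumption that mirrors the early return above.

```python
import re

LISTS = re.compile(r"\(([^)]*)\)")  # contents of each parenthesised list


def merge_lists(line):
    """Turn '(a, b) text (1, 2)' into '(a=1, b=2)'; leave other lines unchanged."""
    groups = LISTS.findall(line)
    if len(groups) != 2:
        return line
    keys, values = ([item.strip() for item in g.split(",")] for g in groups)
    if len(keys) != len(values):
        return line
    return "(" + ", ".join(f"{k}={v}" for k, v in zip(keys, values)) + ")"


print(merge_lists("(x, y, z, m, n) bar (100, 200, 300, 400, 500)"))
```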
My approach is to create one command to set a match-list, then use replace-regexp as the second command to distribute match-list, leveraging replace-regexp's existing \, facility.
Evaluate Elisp, such as in the .emacs file:
(defvar match-list nil
"A list of matches, as set through the set-match-list and consumed by the cycle-match-list function. ")
(defvar match-list-iter nil
"Iterator through the global match-list variable. ")
(defun reset-match-list-iter ()
"Set match-list-iter to the beginning of match-list and return it. "
(interactive)
(setq match-list-iter match-list))
(defun make-match-list (match-regexp use-regexp beg end)
"Set the match-list variable as described in the documentation for set-match-list. "
;; Starts at the beginning of region, searches forward and builds match-list.
;; For efficiency, matches are appended to the front of match-list and then reversed
;; at the end.
;;
;; Note that the behavior of re-search-backward is such that the same match-list
;; is not created by starting at the end of the region and searching backward.
(let ((match-list nil))
(save-excursion
(goto-char beg)
(while
(let ((old-pos (point)) (new-pos (re-search-forward match-regexp end t)))
(when (equal old-pos new-pos)
(error "re-search-forward makes no progress. old-pos=%s new-pos=%s end=%s match-regexp=%s"
old-pos new-pos end match-regexp))
new-pos)
(setq match-list
(cons (replace-regexp-in-string match-regexp
use-regexp
(match-string 0)
t)
match-list)))
(setq match-list (nreverse match-list)))))
(defun set-match-list (match-regexp use-regexp beg end)
"Set the match-list global variable to a list of regexp matches. MATCH-REGEXP
is used to find matches in the region from BEG to END, and USE-REGEXP is the
regexp to place in the match-list variable.
For example, if the region contains the text: {alpha,beta,gamma}
and MATCH-REGEXP is: \\([a-z]+\\),
and USE-REGEXP is: \\1
then match-list will become the list of strings: (\"alpha\" \"beta\")"
(interactive "sMatch regexp: \nsPlace in match-list: \nr")
(setq match-list (make-match-list match-regexp use-regexp beg end))
(reset-match-list-iter))
(defun cycle-match-list (&optional after-end-string)
"Return the next element of match-list.
If AFTER-END-STRING is nil, cycle back to the beginning of match-list.
Else return AFTER-END-STRING once the end of match-list is reached."
(let ((ret-elm (car match-list-iter)))
(unless ret-elm
(if after-end-string
(setq ret-elm after-end-string)
(reset-match-list-iter)
(setq ret-elm (car match-list-iter))))
(setq match-list-iter (cdr match-list-iter))
ret-elm))
(defadvice replace-regexp (before my-advice-replace-regexp activate)
"Advise replace-regexp to support match-list functionality. "
(reset-match-list-iter))
Then to solve the original problem:
M-x set-match-list
Match regexp: \([0-9]+\)[,)]
Place in match-list: \1
M-x replace-regexp
Replace regexp: \([a-z]+\)\([,)]\)
Replace regexp with: \1=\,(cycle-match-list)\2
And to solve the XML example:
[Select fooval strings.]
M-x set-match-list
Match regexp: .+
Place in match-list: \&
[Select XML tags.]
M-x replace-regexp
Replace regexp: foo=""
Replace regexp with: foo="\,(cycle-match-list)"
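The underlying technique, collecting a list of values first and then handing them out one per replacement, is easy to try outside Emacs. Below is a minimal Python sketch under stated assumptions: distribute is a hypothetical helper, itertools.cycle plays the role of cycle-match-list (wrapping around when the list is exhausted), and the template uses str.format with a single {} placeholder.

```python
import re
from itertools import cycle


def distribute(values, text, pattern, template):
    """Replace each match of `pattern` with `template` filled by the next value.

    `values` must be non-empty; it is cycled if there are more matches than values.
    """
    it = cycle(values)
    return re.sub(pattern, lambda m: template.format(next(it)), text)


xml = '<item foo="" bar="barval1"/>\n<item foo="" bar="barval2"/>'
print(distribute(["fooval1", "fooval2"], xml, r'foo=""', 'foo="{}"'))
```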