I tried to read a string from an input port in Racket, but no matter which API function I used (read, read-string, read-bytes, etc.), the return value was never an eof object.
(define (some_process inp)
  (begin
    (let ([c (read-string 1 inp)])
      (if (eof-object? c)
          (begin
            (display "EOF \n")
            #f)
          (if (equal? c "\n")
              (begin
                (display "NEWLINE \n"))
              (some_process inp))))))
Can c never be an eof object?
If you display c, it is always a newline.
Read reference:
read-char: "Reads a single character from in – which may involve reading several bytes to UTF-8-decode them into a character. If no bytes are available before an end-of-file, then eof is returned."
read-string: "Returns a string containing the next amt characters from in. If no characters are available before an end-of-file, then eof is returned."
Examples:
> (read-char (open-input-string "char"))
#\c
> (read-string 50 (open-input-string "the string"))
"the string"
>
But if no characters are available in the port, you'll get eof:
> (read-char (open-input-string ""))
#<eof>
> (read-string 50 (open-input-string ""))
#<eof>
I think you just want to read some amount of characters in a loop and do something with them. If so, the solution would look something along the lines of:
(define (another-process inp)
  (let ([c (read-char inp)])
    (if (eof-object? c)
        (begin (display "==== EOF ====") (newline))
        (begin (display c) (newline)
               (another-process inp)))))
Example:
> (another-process (open-input-string "OK"))
O
K
==== EOF ====
> (another-process (open-input-string ""))
==== EOF ====
>
Notice the second call to another-process, with an empty string: it detects eof immediately and exits the loop.
EDIT:
In case you need to check whether the character read is a newline:
(define (process-moo inp)
  (let ([c (read-char inp)])
    (cond
      ((eof-object? c)
       (display "==== EOF ====") (newline))
      ((char=? c #\newline) ; c is known to be a character here, so char=? is safe
       (newline) (display "---- NEWLINE ----") (newline)
       (process-moo inp))
      (else
       (display c)
       (process-moo inp)))))
Example:
> (call-with-input-string "Hello\nhumans\n!" process-moo)
Hello
---- NEWLINE ----
humans
---- NEWLINE ----
!==== EOF ====
>
Hope that helps.
If you are entering your input from the console, try pressing Ctrl+D (on Unix and macOS), or Ctrl+Z followed by Enter (on Windows). This signals the end of the input.
Why does string-trim not work on a global variable in Common Lisp?
(defvar *whitespaces* '(#\Space #\Newline #\Backspace #\Tab
                        #\Linefeed #\Page #\Return #\Rubout))
(defvar *str* "Hello World")
(defun trim (s)
  (string-trim *whitespaces* s))
(print (trim *str*))
;; output "Hello World"
As said in the manual:
string-trim returns a substring of string, with all characters in character-bag stripped off the beginning and end.
So,
CL-USER> (defvar *str* " Hello World ")
*STR*
CL-USER> (trim *str*)
"Hello World"
If you want to collapse the runs of space characters between words, you can use a library, for instance cl-str:
CL-USER> (ql:quickload "str")
...
CL-USER> (str:collapse-whitespaces *str*)
"Hello World"
T
I know that in clojure.string there is the split function which returns a sequence of the parts of the string excluding the given pattern.
(require '[clojure.string :as str-utils])
(str-utils/split "Yes, hello, this is dog yes hello it is me" #"hello")
;; -> ["Yes, " ", this is dog yes " " it is me"]
However, I'm trying to find a function that instead leaves the token as an element in the returned vector. So it would be like
(split-around "Yes, hello, this is dog yes hello it is me" #"hello")
;; -> ["Yes, " "hello" ", this is dog yes " "hello" " it is me"]
Is there a function that does this in any of the included libraries? Any in external libraries? I've been trying to write it myself but haven't been able to figure it out.
You can also use the regex lookahead/lookbehind feature for that:
user> (clojure.string/split "Yes, hello, this is dog yes hello it is me" #"(?<=hello)|(?=hello)")
;;=> ["Yes, " "hello" ", this is dog yes " "hello" " it is me"]
You can read it as "split on the zero-length string at any point where the preceding or following word is 'hello'".
Notice that it also drops the dangling empty strings for adjacent patterns and for leading/trailing ones:
user> (clojure.string/split "helloYes, hello, this is dog yes hellohello it is mehello" #"(?<=hello)|(?=hello)")
;;=> ["hello"
;; "Yes, "
;; "hello"
;; ", this is dog yes "
;; "hello"
;; "hello"
;; " it is me"
;; "hello"]
You can wrap it into a function like this, for example:
(defn split-around [source word]
  (let [word (java.util.regex.Pattern/quote word)]
    (->> (format "(?<=%s)|(?=%s)" word word)
         re-pattern
         (clojure.string/split source))))
(-> "Yes, hello, this is dog yes hello it is me"
(str/replace #"hello" "~hello~")
(str/split #"~"))
Example using @Shlomi's solution:
(ns tst.demo.core
  (:use tupelo.core tupelo.test)
  (:require [clojure.string :as str]))
(dotest
  (let [input-str "Yes, hello, this is dog yes hello it is me"
        segments  (mapv str/trim
                    (str/split input-str #"hello"))
        result    (interpose "hello" segments)]
    (is= segments ["Yes," ", this is dog yes" "it is me"])
    (is= result ["Yes," "hello" ", this is dog yes" "hello" "it is me"])))
Update
Might be best to write a custom loop for this use case. Something like:
(ns tst.demo.core
  (:use tupelo.core tupelo.test)
  (:require
    [clojure.string :as str]))
(defn strseg
  "Will segment a string like '<a><tgt><b><tgt><c>' at each occurrence of `tgt`, producing
  an output vector like [ <a> <tgt> <b> <tgt> <c> ]."
  [tgt source]
  (let [tgt-len  (count tgt)
        segments (loop [result []
                        src    source]
                   (if (empty? src)
                     result
                     (let [i (str/index-of src tgt)]
                       (if (nil? i)
                         (let [result-next (into result [src])
                               src-next    nil]
                           (recur result-next src-next))
                         (let [pre-tgt     (subs src 0 i)
                               result-next (into result [pre-tgt tgt])
                               src-next    (subs src (+ tgt-len i))]
                           (recur result-next src-next))))))
        result   (vec
                   (remove (fn [s] (or (nil? s)
                                       (empty? s)))
                     segments))]
    result))
With unit tests:
(dotest
  (is= (strseg "hello" "Yes, hello, this is dog yes hello it is me")
    ["Yes, " "hello" ", this is dog yes " "hello" " it is me"])
  (is= (strseg "hello" "hello")
    ["hello"])
  (is= (strseg "hello" "") [])
  (is= (strseg "hello" nil) [])
  (is= (strseg "hello" "hellohello") ["hello" "hello"])
  (is= (strseg "hello" "abchellodefhelloxyz") ["abc" "hello" "def" "hello" "xyz"]))
Here is another solution that avoids the problems with repetitive patterns and double recognitions present in leetwinski's answer (see my comments) and also computes the parts as lazily as possible:
(defn partition-str [s sep]
  (->> s
       (re-seq
         (->> sep
              java.util.regex.Pattern/quote ; remove this to treat sep as a regex
              (format "((?s).*?)(?:(%s)|\\z)")
              re-pattern))
       (mapcat rest)
       (take-while some?)
       (remove empty?))) ; remove this to keep empty parts
However, this does not behave correctly/intuitively when the separator is, or matches, the empty string.
Another way could be to use both re-seq and split with the same pattern and interleave the resulting sequences as shown in this related question. Unfortunately this way every occurrence of the separator will be recognized twice.
Perhaps a better approach would be to build on a more primitive basis using re-matcher and re-find.
Finally, to answer the initial question more directly: there is no such function in Clojure's standard library or, AFAIK, in any external library. Moreover, I don't know of any simple and completely unproblematic solution to this problem (especially with a regex separator).
UPDATE
Here is the best solution I can think of right now, working on a lower level, lazily and with a regex-separator:
(defn re-partition [re s]
  (let [mr (re-matcher re s)]
    ((fn rec [i]
       (lazy-seq
         (if-let [m (re-find mr)]
           (list* (subs s i (.start mr)) m (rec (.end mr)))
           (list (subs s i)))))
     0)))
(def re-partition+ (comp (partial remove empty?) re-partition))
Notice that we can (re)define:
(def re-split (comp (partial take-nth 2) re-partition))
(def re-seq (comp (partial take-nth 2) rest re-partition))
I have a wordlist
dempron {hic, haec, hoc, huius, huic, hunc, hanc, hac, hi, hae, horum, harum, his, hos, has}
I have an XML-like text
<p>Hoc templum magnum est.</p>
<p>Templa Romanorum magna sunt.</p>
<p>Claudia haec templa in foro videt.</p>
I would like to search the text for words from the wordlist "dempron" and copy the sentences that contain them to a buffer called results.
I agree with Simon Fromme, but hopefully this will get you started. Let me know if you have any questions!
(defconst dempron
  '("hic" "haec" "hoc" "huius" "huic" "hunc" "hanc" "hac" "hi" "hae" "horum"
    "harum" "his" "hos" "has"))
(defun dempron-search ()
  "A function to naively search for sentences in XML <p> tags
containing words from `dempron'. Run this in the buffer you want
to search, and it will search from POINT onwards, writing results
to a buffer called 'results'."
  (interactive)
  (beginning-of-line)
  (while (not (eobp)) ;; while we're not at the end of the buffer
    (let ((cur-line ;; get the current line as a string
           (buffer-substring-no-properties
            (progn (beginning-of-line) (point))
            (progn (end-of-line) (point)))))
      ;; See if our current line is in a <p> tag (and run `string-match' so we
      ;; can extract the sentence with `match-string')
      (if (string-match "^<p>\\(.*\\)</p>$" cur-line)
          (progn
            ;; then extract the actual sentence with `match-string'
            (setq cur-line (match-string 1 cur-line))
            ;; For each word in our sentence... (split on whitespace and
            ;; anything the sentence is likely to end with)
            (dolist (word (split-string cur-line "[[:space:].?!\"]+"))
              ;; `downcase' to make our search case-insensitive
              (if (member (downcase word) dempron)
                  ;; We have a match! Temporarily switch to the
                  ;; results buffer and write the sentence
                  (with-current-buffer (get-buffer-create "results")
                    (insert cur-line "\n")))))))
    (forward-line 1))) ;; Move to the next line
I have this snippet of code in my dotemacs file that helps me view IDs.
How is it possible to exclude some folders from the ID search?
; gid.el -- run gid using compilation mode.
;(require 'compile)
;(require 'elisp-utils)
;(provide 'gid)
(defvar gid-command "gid" "The command run by the gid function.")
(defun gid (args)
"Run gid, with user-specified ARGS, and collect output in a buffer.
While gid runs asynchronously, you can use the \\[next-error] command to
find the text that gid hits refer to. The command actually run is
defined by the gid-command variable."
(interactive (list
;(read-input (concat "Run " gid-command " (with args): ") ;confirmation
(word-around-point)))
;)
;; Preserve the present compile-command
(let (compile-command
(gid-buffer ;; if gid for each symbol use: compilation-buffer-name-function
(lambda (mode) (concat "*gid " args "*"))))
;; For portability between v18 & v19, use compile rather than compile-internal
(compile (concat gid-command " " args))))
(defun word-around-point ()
"Return the word around the point as a string."
(save-excursion
(if (not (eobp))
(forward-char 1))
(forward-word -1)
(forward-word 1)
(forward-sexp -1)
(let ((beg (point)))
(forward-sexp 1)
(buffer-substring beg (point)))))
Found the solution.
When building the ID database, simply prune any uninteresting folders like this:
mkid --prune X
I wrote a script in elisp that merges ls -l and du (showing the real folder size instead of what ls reports). I named it lsd. Here is a screenshot:
http://i.imgur.com/PfSq6.png
Now I'll list the implementation. I am not a good coder, so I would appreciate any information about bugs and things that could be made better.
lsd.el
#!/usr/bin/emacs --script
(progn
(setq argz command-line-args-left)
(setq folder "./")
(while argz
;; (message (car argz))
(if (/= ?- (aref (car argz) 0))
(setq folder (car argz)))
(setq argz (cdr argz)))
(if (/= ?/ (aref folder (1- (length folder)))) (setq folder (concat folder "/")))
(switch-to-buffer " *lsd*")
(erase-buffer)
(shell-command (concat "ls -l -h --color=always " " " (apply 'concat (mapcar '(lambda(arg) (concat arg " ")) command-line-args-left))) (current-buffer))
(switch-to-buffer " *du*")
(erase-buffer)
(shell-command (concat "du -h -d 1 " folder) (current-buffer))
(goto-char 1)
(while (search-forward "Permission denied" (point-max) t nil)
(goto-char (point-at-bol))
(let ((beg (point)))
(forward-line)
(delete-region beg (point)))) ; Remove all permission denied lines, thus show only permitted size.
(goto-char 1)
(while (and (search-forward folder (point-max) t nil) (/= (point-max) (1+ (point-at-eol)))) ; we do not need last line(the folder itself), so this line is something complex.
(setq DIR (buffer-substring (point) (point-at-eol)))
(goto-char (point-at-bol))
(setq SIZE (buffer-substring (point) (1- (search-forward " " (point-at-eol) nil nil))))
(goto-char (point-at-eol))
(switch-to-buffer " *lsd*")
(goto-char 1)
(if (search-forward DIR (point-max) t nil)
(progn
(goto-char (point-at-bol))
(search-forward-regexp "[0-9]+" (point-at-eol) nil nil)
(search-forward-regexp " *[0-9]+[^ \n]*[ \n]*" (point-at-eol) nil nil)
;; If ls have options, that makes some numbers before size column - we are doomed. (-s, for example)
(setq SIZE (concat SIZE " "))
(while (< (length SIZE) (length (match-string 0))) (setq SIZE (concat " " SIZE)))
(replace-match SIZE)))
(switch-to-buffer " *du*"))
(switch-to-buffer " *lsd*")
(message "%s" (buffer-substring (point-min) (point-max)))
(defun error(&rest args) args)
(defun message(&rest args) args)) ; Do not show any messages.
lsd
(I made this script to start emacs without loading anything but the script. If this can be done more easily, please point it out.)
#!/bin/bash
emacs -Q --script /usr/local/bin/lsd.el "$@"
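A side note on wrappers like this one: in bash, $# expands to the number of arguments, while "$@" expands to the arguments themselves, each kept as a separate word. A minimal sketch of the difference (the function names below are made up for illustration and are not part of lsd):

```shell
# Illustrative helpers only; none of these names come from the lsd script.
count_args() { echo "$#"; }           # prints how many arguments it received
forward_all() { count_args "$@"; }    # passes every argument through intact
forward_count() { count_args $#; }    # passes a single word: the argument count

forward_all "my dir" -l      # prints 2
forward_count "my dir" -l    # prints 1
```

So a wrapper that should hand its arguments on to emacs needs "$@" on the last line.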
And here is the problem: how can I use this lsd in dired?
Can I change something in dired to use lsd instead of ls?
Can I rename ls to oldls and write an ls bash script that passes all arguments to oldls if there is no --lsd flag, and to lsd if --lsd is present?
Is it a good idea at all?
In Emacs 24 there is also `insert-directory-program' to set the ls executable. Put
(setq insert-directory-program "/usr/local/bin/lsd")
(adjust the path accordingly) in your .emacs or init.el, and dired will use your lsd script.
I don't know if this is the most efficient way to do things; I'm still a bit of an Emacs beginner. But here's how I would do it.
Since you're on Linux, you should start by telling Emacs to use its built-in ls emulation. A simple (require 'ls-lisp) in your init file should suffice.
Set the variable ls-lisp-use-insert-directory-program to t. This tells Emacs to use an external program for ls.
The actual program it uses can be customized by setting the variable insert-directory-program to point to your lsd script.
Here's an example of how to do this:
;; Put this in your init file
(require 'ls-lisp)
(setq ls-lisp-use-insert-directory-program t) ; lowercase t: Emacs Lisp symbols are case-sensitive
(setq insert-directory-program "~/path/to/lsd")
Let me know if this works for you. I use Emacs on Windows, so I'm not sure how well this carries over to Linux (the ls emulation part, that is).
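For the wrapper idea raised in the question (an ls script that hands off to lsd only when --lsd is given), a minimal sketch might look like the following. The paths in REAL_LS and LSD and the function name ls_dispatch are assumptions for illustration, not anything the question specifies; the --lsd flag is stripped before the remaining arguments are forwarded.

```shell
# Assumed locations; adjust to wherever the real ls and the lsd script live.
REAL_LS="${REAL_LS:-/bin/ls}"
LSD="${LSD:-/usr/local/bin/lsd}"

# Hypothetical dispatcher: forwards to lsd if --lsd appears anywhere in the
# arguments, otherwise to the real ls. Uses only POSIX sh features.
ls_dispatch() {
  use_lsd=0
  for arg in "$@"; do
    shift                      # drop the argument being examined from the front
    if [ "$arg" = "--lsd" ]; then
      use_lsd=1                # remember the flag, but do not forward it
    else
      set -- "$@" "$arg"       # append every other argument back, in order
    fi
  done
  if [ "$use_lsd" -eq 1 ]; then
    "$LSD" "$@"
  else
    "$REAL_LS" "$@"
  fi
}
```

Installed as a script, the last line would be ls_dispatch "$@". Shadowing ls affects every program that calls it, whereas setting insert-directory-program as described above changes only what dired runs.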