Decode a single character from octets in Lisp

How can I decode a single character from a vector of octets in Common Lisp?
I want something like:
(decode-character vector :start i :encoding :utf-8)
or more specifically:
(decode-character #(195 164 195 173 99 195 176) :start 0)
=> #\LATIN_SMALL_LETTER_A_WITH_DIAERESIS
which would return the UTF-8 encoded character that starts at position i in vector.
I can't figure out how to do that using either babel or flexi-streams.

(defun decode-character (vector &rest args)
  (char (apply #'babel:octets-to-string
               (coerce vector '(vector (unsigned-byte 8))) args)
        0))

This is maybe not what you are looking for (I'd gladly update if I can).
I did not look at Babel, but you could generalize the approach for other encodings I guess. I'll stick with trivial-utf-8 here. I would do this:
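(defun decode-utf-8-char (octet-vector &key (start 0))
  (char (trivial-utf-8:utf-8-bytes-to-string
         octet-vector
         :start start
         ;; clamp the window so it never runs past the end of the vector
         :end (min (length octet-vector) (+ start 4)))
        0))
This gives the result you want with your example vector.
It works because a UTF-8 character is at most 4 bytes long, so a 4-byte window (clamped to the end of the vector) always contains the whole first character. The call to char picks the first character in case more than one was decoded. Note that the window may end in the middle of a later character; depending on how the decoder treats a truncated trailing sequence, that may need extra handling.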
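For comparison, here is a minimal library-free sketch of the same idea: it reads the sequence length from the lead octet instead of decoding a fixed 4-byte window. The names utf-8-sequence-length and decode-utf-8-char-at are hypothetical, the code assumes an implementation whose characters are Unicode code points, and it does not validate continuation octets or reject overlong encodings.

```lisp
(defun utf-8-sequence-length (lead-octet)
  "Return how many octets the UTF-8 sequence starting with LEAD-OCTET occupies."
  (cond ((< lead-octet #x80) 1)                      ; 0xxxxxxx
        ((= (logand lead-octet #xE0) #xC0) 2)        ; 110xxxxx
        ((= (logand lead-octet #xF0) #xE0) 3)        ; 1110xxxx
        ((= (logand lead-octet #xF8) #xF0) 4)        ; 11110xxx
        (t (error "Invalid UTF-8 lead octet: ~D" lead-octet))))

(defun decode-utf-8-char-at (octets &key (start 0))
  "Decode the character that starts at START in OCTETS.
Returns the character and the index just past it. Sketch only: continuation
octets are not validated."
  (let* ((lead (aref octets start))
         (len (utf-8-sequence-length lead))
         ;; Keep only the payload bits of the lead octet.
         (code (if (= len 1)
                   lead
                   (logand lead (ash #xFF (- (1+ len)))))))
    ;; Each continuation octet contributes its low six bits.
    (loop for i from (1+ start) below (+ start len)
          do (setf code (logior (ash code 6)
                                (logand (aref octets i) #x3F))))
    (values (code-char code) (+ start len))))

(decode-utf-8-char-at #(195 164 195 173 99 195 176) :start 0)
;; first value has char-code 228 (a with diaeresis), second value is 2
```

The second return value makes it easy to step through a vector one character at a time.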

Related

How to distinguish escaped characters from non-escaped e.g. "\x27" from "x27" in a string in Common Lisp?

While solving part 2 of Advent of Code 2015, task 8, I ran into the problem of distinguishing an occurrence of "\x27" in a string from plain "x27".
But I don't see how I can do it, because
(length "\x27") ;; is 3
(length "x27") ;; is also 3
(subseq "\x27" 0 1) ;; is "x"
(subseq "x27" 0 1) ;; is "x"
Neither print, prin1, princ made a difference.
;; nor does coerce
(coerce "\x27" 'list)
;; (#\x #\2 #\7)
So how then to distinguish in a string when "\x27" or any of such
hexadecimal representation occurs?
It turned out, one doesn't need to solve this to solve the task. However, now I still would like to know whether there is a way to distinguish "\x" from "x" in common lisp.
The string literal "\x27" is read as the same string as "x27", because \ is the escape character in string literals. If you want a string with the contents \x27, you need to write the literal as "\\x27" (i.e. escape the escape character). This is purely a matter of reader syntax and has nothing to do with the strings themselves. If you read a line containing \x27 from a file (e.g. with read-line), you get the four-character string \x27.
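A small sketch of the difference at the REPL, assuming the standard readtable; the variable names are made up for illustration.

```lisp
;; In a string literal, a backslash escapes the character that follows it.
(defparameter *plain* "x27")        ; x 2 7
(defparameter *escaped-x* "\x27")   ; the reader drops the backslash: x 2 7
(defparameter *backslash* "\\x27")  ; an escaped backslash: \ x 2 7

(list (length *plain*) (length *escaped-x*) (length *backslash*))
;; => (3 3 4)

(string= *plain* *escaped-x*)
;; => T

(char *backslash* 0)
;; => #\\
```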
By the time that the Lisp reader gets to work, \x is the same as x. There may be some way to turn this off - I wouldn't be surprised - but the original text talks about Santa's file.
So, I created my own file, like this:
x27
\x27
And I read the data into special variables like this:
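(defvar x)      ; the line "x27"
(defvar x-esc)  ; the line "\x27"
(defun read-line-crlf (stream)
  ;; return "" at end of file instead of trimming NIL
  (string-right-trim '(#\Return) (read-line stream nil "")))
(defun read-lines (filename)
  (with-open-file (stream filename)
    (setf x (read-line-crlf stream))
    (setf x-esc (read-line-crlf stream))))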
The length of x is then 3, and the length of x-esc is 4. On Windows the returned string must be trimmed, or a suitable external format declared, because otherwise SBCL leaves the CR half of each CR-LF at the end of the strings it reads.
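As a variation, here is a sketch that returns the lines as a list instead of setting global variables. The name read-trimmed-lines is hypothetical, and a string stream stands in for the CR-LF file so the example is self-contained.

```lisp
(defun read-trimmed-lines (stream)
  "Return all lines of STREAM as a list, with any trailing CR removed."
  (loop for line = (read-line stream nil)
        while line
        collect (string-right-trim '(#\Return) line)))

;; Simulate a two-line file with CR-LF line endings.
(defparameter *sample*
  (format nil "x27~C~%\\x27~C~%" #\Return #\Return))

(with-input-from-string (in *sample*)
  (mapcar #'length (read-trimmed-lines in)))
;; => (3 4)
```

With a real file, the same function can be called inside with-open-file.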

In DrRacket how do I check if a string has a certain amount of characters, as well how do I determine what the first character in a string is

Basically I have a problem, here is the information needed to solve the problem.
PigLatin. Pig Latin is a way of rearranging letters in English words for fun. For example, the sentence “pig latin is stupid” becomes “igpay atinlay isway upidstay”.
Vowels (‘a’, ‘e’, ‘i’, ‘o’, and ‘u’) are treated separately from the consonants (any letter that isn’t a vowel).
For simplicity, we will consider ‘y’ to always be a consonant. Although various forms of Pig Latin exist, we will use the following rules:
(1) Words of two letters or less simply have “way” added on the end. So “a” becomes “away”.
(2) In any word that starts with consonants, the consonants are moved to the end, and “ay” is added. If a word begins with more than two consonants, move only the first two letters. So “hello” becomes “ellohay”, and “string” becomes “ringstay”.
(3) Any word which begins with a vowel simply has “way” added on the end. So “explain” becomes “explainway”.
Write a function (pig-latin L) that consumes a non-empty (listof Str) and returns a Str containing the words in L converted to Pig Latin.
Each value in L should contain only lower case letters and have a length of at least 1.
I understand that I need to set up three main conditions here, but I'm struggling with Racket and with learning the proper syntax to write out my solution. First, I need a condition that looks at a string and checks whether its length is 2 or less, to cover rule (1). For (2) I need to look at the first two characters of a string; I'm assuming I have to convert the string into a list of chars (string->list). For (3) I understand I just have to look at the first character of the string, so I basically repeat what I did for (2) but only look at the first character.
I don't know how to manipulate a list of chars, though. I also don't know how to check that string-length meets a criterion. Any assistance would be appreciated. I have barely any code for this problem since I am baffled about what to do here.
An example of the problem is
(pig-latin (list "this" "is" "a" "crazy" "exercise")) =>
"isthay isway away azycray exerciseway"
The best strategy to solve this problem is:
Check in the documentation all the available string procedures. We don't need to transform the input string to a list of chars to operate upon it, and you'll find that there are existing procedures that meet all of our needs.
Write helper procedures. In fact, we only need a procedure that tells us if a string contains a vowel at a given position; the problem states that only a-z characters are used so we can negate this procedure to also find consonants.
It's also important to identify the best order to write the conditions, for example: conditions 1 and 3 can be combined in a single case. This is my proposal:
(define (vowel-at-index? text index)
  (member (string-ref text index)
          '(#\a #\e #\i #\o #\u)))
(define (pigify text)
  ; cases 1 and 3
  (cond ((or (<= (string-length text) 2)
             (vowel-at-index? text 0))
         (string-append text "way"))
        ; case 2.1
        ((and (not (vowel-at-index? text 0))
              (vowel-at-index? text 1))
         (string-append (substring text 1)
                        (substring text 0 1)
                        "ay"))
        ; case 2.2
        (else
         (string-append (substring text 2)
                        (substring text 0 2)
                        "ay"))))
(define (pig-latin lst)
  (string-join (map pigify lst)))
For the final step, we only need to apply the pigify procedure to each element in the input, and that's what map does. It works as expected:
(pig-latin '("this" "is" "a" "crazy" "exercise"))
=> "isthay isway away azycray exerciseway"

Select random char from string in Common Lisp

I'm learning Common Lisp and writing a simple password generator as an intro project.
Here is my code:
(setq chars
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")
(print (nth (random (length chars)) chars))
But using CLISP I just get
*** - NTH: "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789" is not a list
I thought every string in Lisp was a list? How can I "cast" the string to a list?
NTH works only for lists. Strings are not lists, but vectors of characters.
Here is the dictionary for strings. CHAR is an accessor for strings.
CL-USER 7 > (char "abc" 1)
#\b
Since strings are also sequences, all sequence operations apply. See: Sequence dictionary.
CL-USER 8 > (elt "abc" 1)
#\b
Lisp is an interactive system. Learn to have conversations with the REPL:
CL-USER> (type-of "abc")
(SIMPLE-ARRAY CHARACTER (3))
You should get a similar result from CLISP.
Can you take it from here?
A string is not a list. Not everything is a list in Lisp ;)
You can use coerce to create a list of the characters (coerce "some string" 'list).
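Putting the answers together, a password generator along the lines of the question might look like the sketch below. The names *password-chars*, random-char and random-password are hypothetical, and random is a general-purpose generator, not a cryptographically secure one, so treat this as a learning exercise only.

```lisp
(defparameter *password-chars*
  "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")

(defun random-char (chars)
  "Pick one character of the string CHARS at random, using CHAR instead of NTH."
  (char chars (random (length chars))))

(defun random-password (size &optional (chars *password-chars*))
  "Return a fresh string of SIZE characters drawn from CHARS."
  (let ((password (make-string size)))
    (dotimes (i size password)
      (setf (char password i) (random-char chars)))))

(print (random-password 12))
```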

Appending character to string in Common Lisp

I have a character ch that I want to append to a string str. I realize you can concatenate strings like such:
(setf str (concatenate 'string str (list ch)))
But that seems rather inefficient. Is there a faster way to just append a single character?
Yes, if the string has a fill-pointer and, so that it can grow when full, is also adjustable.
Adjustable = can change its size.
fill-pointer = the content size, the length, can be less than the string size.
VECTOR-PUSH = add an element at the end and increment the fill-pointer.
VECTOR-PUSH-EXTEND = as VECTOR-PUSH, additionally resizes the array, if it is too small.
We can make an adjustable string from a normal one:
CL-USER 32 > (defun make-adjustable-string (s)
               (make-array (length s)
                           :fill-pointer (length s)
                           :adjustable t
                           :initial-contents s
                           :element-type (array-element-type s)))
MAKE-ADJUSTABLE-STRING
CL-USER 33 > (let ((s (make-adjustable-string "Lisp")))
               (vector-push-extend #\! s)
               s)
"Lisp!"
If you want to extend a single string multiple times, it is often
quite efficient to use with-output-to-string and write to the stream
it provides. For performance, prefer write, princ, etc. over format.
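A minimal sketch of the with-output-to-string approach; the function name append-chars is hypothetical.

```lisp
(defun append-chars (base chars)
  "Return a fresh string: BASE followed by every character in the sequence CHARS."
  (with-output-to-string (out)
    (write-string base out)
    ;; write-char avoids the overhead of parsing a format control string
    (map nil (lambda (ch) (write-char ch out)) chars)))

(append-chars "Lisp" '(#\! #\?))
;; => "Lisp!?"
```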

Displaying a string while using cond in Lisp

I'm just starting off with Lisp and need some help. This is technically homework, but I gave it a try and am getting somewhat what I wanted:
(defun speed (kmp)
  (cond ((> kmp 100) "Fast")
        ((< kmp 40) "Slow")
        (t "Average")))
However, if I run the program it displays "Average" instead of just Average (without the quotes).
How can I get it to display the string without quotes?
You can use symbols instead of strings. But keep in mind that symbols will be converted to uppercase:
> 'Average
AVERAGE
If you care about case or want to embed spaces, use format:
> (format t "Average")
Average
The read-eval-print loop displays the return value of your function, which is one of the strings in a cond branch. Strings are printed readably by surrounding them with double-quotes.
You could use (write-string (speed 42)). Don't worry that it also shows the string in double-quotes - that's the return value of write-string, displayed after the quoteless output.
You can also use symbols instead of strings:
(defun speed (kmp)
  (cond ((> kmp 100) 'fast)
        ((< kmp 40) 'slow)
        (t 'average)))
Symbols are uppercased by default, so internally fast is then FAST.
You can write any symbol in any case and with any characters using escaping with vertical bars:
|The speed is very fast!|
The above is a valid symbol in Common Lisp and is stored internally just as you write it, with case preserved.
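The sketch below puts the printing behaviours side by side, assuming the standard readtable and printer settings. The function is named speed-label here (a hypothetical name) because SPEED is itself a symbol exported from the COMMON-LISP package, and some implementations object to defining a function on it.

```lisp
(defun speed-label (km/h)
  (cond ((> km/h 100) "Fast")
        ((< km/h 40) "Slow")
        (t "Average")))

(prin1 (speed-label 60))   ; prints "Average" with quotes (readable form)
(terpri)
(princ (speed-label 60))   ; prints Average without quotes
(terpri)

(symbol-name 'average)     ; => "AVERAGE" (the reader upcases by default)
(symbol-name '|Average|)   ; => "Average" (vertical bars preserve case)
```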

Resources