How do you securely parse untrusted input in Common Lisp, given that there is no parse-float etc., and that read-from-string will execute reader macros like #. (read-time eval)?
e.g.
(read-from-string "#.(+ 1 2)") => 3
I can't find the other question or comment that described some of the safe input handling procedures for Common Lisp (if someone else finds them, please post a comment!), but there are at least two important things that you might do:
Use with-standard-io-syntax to make sure that you're reading with the standard readtable, etc. Note that this will bind *read-eval* to true, so be sure to also:
Bind *read-eval* to false (within with-standard-io-syntax). This disables the sharpsign-dot (#.) macro mentioned in the question.
(let ((*readtable* (copy-readtable)))
  (set-macro-character #\n (constantly 'injected))
  (read-from-string "(#.(+ 2 5) n)"))
;;=> (7 INJECTED)
(let ((*readtable* (copy-readtable)))
  (set-macro-character #\n (constantly 'injected))
  (with-standard-io-syntax
    (let ((*read-eval* nil))
      (read-from-string "(#.(+ 2 5) n)"))))
;; Evaluation aborted on #<SB-INT:SIMPLE-READER-ERROR
;; "can't read #. while *READ-EVAL* is NIL" {1004DA3603}>.
(let ((*readtable* (copy-readtable)))
  (set-macro-character #\n (constantly 'injected))
  (list (read-from-string "(n)")
        (with-standard-io-syntax
          (let ((*read-eval* nil))
            (read-from-string "(n)")))))
;; ((INJECTED) (N))
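The two steps above can be wrapped in a small helper. This is a minimal sketch, not a complete sandbox: the name safe-read-from-string is hypothetical, and the reader will still intern any symbols it sees and can build arbitrarily large structures, so this only removes read-time evaluation and custom reader macros.

```lisp
;; SAFE-READ-FROM-STRING is a hypothetical helper name, not a standard function.
(defun safe-read-from-string (string)
  "Read one object from STRING with the standard readtable and #. disabled.
Returns NIL and the condition as a second value if reading fails."
  (handler-case
      (with-standard-io-syntax
        (let ((*read-eval* nil))
          ;; VALUES drops the second return value (the end position).
          (values (read-from-string string))))
    ((or reader-error end-of-file) (condition)
      (values nil condition))))

;; (safe-read-from-string "(1 2 3)")    returns (1 2 3)
;; (safe-read-from-string "#.(+ 1 2)")  returns NIL and a READER-ERROR condition
```

Note that a successful read of the text "nil" is indistinguishable from a failure by the first value alone, so callers that care should check the second value.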
Generally, the fact that the standard code reader is so readily available and can read many kinds of input does not mean that you should use it to read anything but code.
There are many libraries for parsing all kinds of things, e.g. parse-number for the Lisp number formats, fare-csv for CSV files (among many other CSV libraries), and json-streams for JSON (again, among many others). For most formats, you can just do a system-apropos lookup with Quicklisp.
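For plain integers, the standard parse-integer function avoids the reader entirely. A minimal sketch (the wrapper name parse-integer-or-nil is hypothetical); floats still need a library such as parse-number:

```lisp
;; PARSE-INTEGER-OR-NIL is a hypothetical wrapper around the standard PARSE-INTEGER.
(defun parse-integer-or-nil (string)
  "Return the integer written in STRING, or NIL if STRING is not a valid integer."
  (handler-case (parse-integer string)
    ;; Without :JUNK-ALLOWED, malformed input signals PARSE-ERROR.
    (parse-error () nil)))

;; (parse-integer-or-nil "42")         returns 42
;; (parse-integer-or-nil "#.(+ 1 2)")  returns NIL
```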
The question
Suppose you have a struct, like this:
(struct soldier (name rank serial-number) #:transparent)
(define s (soldier 'Smith 'private 100134))
How can you find out what fields soldier or s contains? Or what generic interfaces it supports, or what structure type properties it has?
Research efforts so far
(Skip this section if you already know the answer.)
I've been reading through the documentation on structs for the last few days, and I haven't been able to figure out how you're supposed to put the pieces together. I'm probably just missing some elementary tidbit of information that goes without saying to people who know Racket.
The chapter on Reflection and Security has a section "Structure Inspectors", which says:
An inspector provides access to structure fields and structure type information without the normal field accessors and mutators.
but I haven't understood how to get an inspector to provide that.
struct-info and struct-type-info provide some information, but not field names, interfaces, properties, etc.:
> (struct-type-info struct:soldier)
'soldier
3
0
#<procedure:soldier-ref>
#<procedure:soldier-set!>
'(0 1 2)
#f
#f
struct->vector and struct->list provide access to an instance's contents and the above data, but that's all:
> (struct->vector s)
'#(struct:soldier Smith private 100134)
If you could show me an example of how to inspect a struct type to see what's in it, that would probably clarify whatever soon-to-be-obvious-in-hindsight thing I'm not seeing here.
The field names are not available at run time. However, at expansion time you can use syntax-local-value on the struct name to get some information.
A quick example:
#lang racket
(require (for-syntax racket/struct-info))
(struct foo (a b))
(begin-for-syntax
  (display (extract-struct-info (syntax-local-value #'foo))))
Update
In this example:
#lang racket
(require (for-syntax racket/struct-info))
(struct foo (a [b #:mutable] c))
(begin-for-syntax
  (display (extract-struct-info (syntax-local-value #'foo))))
The list of identifiers for mutators is: (#f #<syntax:4:8 set-foo-b!> #f).
That is, only the second field is mutable.
The information is available at expansion time, so you can transfer it to run time by calling a macro that expands into a definition like (define info '(#f set-foo-b! #f)) or similar.
For rapid prototyping purposes in common-lisp it would be convenient to be able to easily function-modify an object in an arbitrary data structure. This would seem to involve calling an arbitrary function on a place in the data structure, replacing the object at that place with the result of the function call. Common-lisp has a number of specialized modification macros (eg, incf, push, getf, etc) for particular types of objects, and setf for generalized place modification (eg, setf-second, setf-aref, setf-gethash, etc). But rather than inventing new specialized macros for other object types, or having to mentally consider the characteristics of each macro (slowing down development), it might be nice to have a generalized setf-like modification capability that was simpler to use than setf. For example, instead of (setf (second (getf plist indicator)) (1+ (second (getf plist indicator)))) or (incf (second (getf plist indicator))), one might write (callf (second (getf plist indicator)) #'1+), using any of the normal one argument functions (or lambda-expression) provided by common-lisp or the user. Here is an attempt at code:
(defun call (object function) (funcall function object))
(define-modify-macro callf (&rest args) call)
Will something like this work for all general cases, and can it actually simplify code in practice?
I think what you are looking for is _f from OnLisp 12.4:
(defmacro _f (op place &rest args)
  "Modify place using `op`, e.g., (incf a) == (_f 1+ a)"
  (multiple-value-bind (vars forms var set access)
      (get-setf-expansion place)
    `(let* (,@(mapcar #'list vars forms)
            (,(car var) (,op ,access ,@args)))
       ,set)))
This uses get-setf-expansion - the workhorse of generalized reference handling.
Note that _f only works with single-value places.
IOW, (_f 1+ (values a b c)) will not increment all 3 variables.
It is not all that hard to fix that though...
can it actually simplify code in practice?
This really depends on your coding style and the specific problem you are solving.
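As for whether the callf definition from the question works at all: a quick check suggests it does for ordinary single-value places. The sketch below reuses the question's definitions, with the lambda list narrowed to a single function argument (an assumption about the intended usage); *plist* and *table* are made-up test data.

```lisp
(defun call (object function)
  (funcall function object))

;; (callf place fn) expands to roughly (setf place (call place fn)),
;; with the subforms of PLACE evaluated only once.
(define-modify-macro callf (fn) call)

;; Hypothetical test data, built with LIST so that it may be modified.
(defparameter *plist* (list :a (list 1 2) :b (list 3 4)))
(callf (second (getf *plist* :a)) #'1+)
;; *plist* is now (:A (1 3) :B (3 4))

(defparameter *table* (make-hash-table))
(setf (gethash :count *table*) 10)
(callf (gethash :count *table*) (lambda (n) (* n 2)))
;; (gethash :count *table*) is now 20
```

Like _f, this only handles places with a single store variable.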
This is a console program in Common Lisp for a Hangman type game. The first player enters a string to be guessed by the second player. My input function is below --- unfortunately the characters typed by the first player remain visible.
With JavaScript it's simple, just use a password text entry box. With VB it's simple using the same sort of facility. Is there any way to do this using a native Common Lisp function?
Thanks, CC.
(defun get-answer ()
(format t "Enter the word or phrase to be guessed: ~%")
(coerce (string-upcase (read-line)) 'list))
(defun start-hangman ()
(setf tries 6)
(greeting)
(setf answer (get-answer))
(setf obscure (get-obscure answer))
(game-loop answer obscure))
Each implementation supports this differently.
You might want to use an auxiliary library like iolib.termios or cl-charms (interface to libcurses) if you want a portability layer above different implementations.
SBCL
I found a discussion thread about it for SBCL, and here is the code for that implementation, from Richard M. Kreuter:
(require :sb-posix)
(defun echo-off ()
  (let ((tm (sb-posix:tcgetattr sb-sys:*tty*)))
    (setf (sb-posix:termios-lflag tm)
          (logandc2 (sb-posix:termios-lflag tm) sb-posix:echo))
    (sb-posix:tcsetattr sb-sys:*tty* sb-posix:tcsanow tm)))
(defun echo-on ()
  (let ((tm (sb-posix:tcgetattr sb-sys:*tty*)))
    (setf (sb-posix:termios-lflag tm)
          (logior (sb-posix:termios-lflag tm) sb-posix:echo))
    (sb-posix:tcsetattr sb-sys:*tty* sb-posix:tcsanow tm)))
And so, here is finally an opportunity to talk about PROG2:
(defun read-silently ()
  (prog2
      (echo-off)
      (read-line sb-sys:*tty*)
    (echo-on)))
However, you might want to ensure that the echo is always reset when unwinding the stack, and clear the input before inputting things:
(defun read-silently ()
  (echo-off)
  (unwind-protect
       (progn
         (clear-input sb-sys:*tty*)
         (read-line sb-sys:*tty*))
    (echo-on)))
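To connect this back to the question's get-answer, one option is to pass the line-reading function in as an argument, so that read-silently can be used on a real terminal while plain read-line (or a stub) is used elsewhere. This is only a sketch; the optional reader parameter is an addition that is not in the original code.

```lisp
;; The READER parameter is hypothetical; the question's GET-ANSWER takes no arguments.
(defun get-answer (&optional (reader #'read-line))
  "Prompt for the secret phrase and return it as a list of upcased characters.
READER is a function of no arguments that returns the typed line."
  (format t "Enter the word or phrase to be guessed: ~%")
  (finish-output)
  (coerce (string-upcase (funcall reader)) 'list))

;; On SBCL, with READ-SILENTLY defined as above:
;;   (get-answer #'read-silently)
```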
CL-CHARMS
Here is an alternative using libcurses. The following is sufficient to make a simple test work.
(defun read-silently ()
  (let (input)
    (charms:with-curses ()
      (charms:disable-echoing)
      (charms:enable-raw-input)
      (clear-input *terminal-io*)
      (setf input (read-line *terminal-io*))
      (charms:disable-raw-input)
      (charms:enable-echoing))
    input))
Besides, using libcurses might help you implement a nice-looking hangman console game.
Are you printing to a console? That's an inherent limitation of standard consoles.
You'll need to print a ton of newlines to push the text off the screen.
Many consoles aren't capable of fancy things like selectively erasing parts of the screen.
I have a convoluted search request. Let's say that I am searching for a URI pattern. I know the scheme and the authority, say http://mycompany.com.
After this prefix, most of the URIs in my search domain have two path variables:
/Context/Resource. There could be more, but there will always be a context.
I would like to find the distinct set of first path variables; I do not care about the second and subsequent ones. Suppose the qname is myc and I have this:
myc:/context1/resource1
myc:/context1/resource2
myc:/context2/resource1
myc:/context3/resource1
myc:/context4/resource8
myc:/context1/resource12
I would have to get context1 through context4. Thank you for your time.
If I understand you correctly,
(require 'cl-lib) ; the older `cl' package is deprecated in recent Emacs versions
(cl-remove-duplicates
 (cl-loop while (re-search-forward "myc:/\\(.*?\\)/" nil t)
          collect (match-string-no-properties 1))
 :test #'string=)
Emacs supports regex searches which are normally bound to C-M-s.
The Emacs manual has a nice section about regular expressions in Emacs.
There is also M-x regexp-builder to help you build the search string with real-time feedback.
Is there any programming language that allows Names to include white spaces ? (By names, I intend variables, methods, field, etc.)
Scala does allow whitespace characters in identifier names, but for that to work you need to surround the identifier with a pair of backticks.
Example (executed at Scala REPL):
Welcome to Scala version 2.8.0.final (Java HotSpot(TM) Client VM, Java 1.6.0_22).
Type in expressions to have them evaluated.
Type :help for more information.
scala> val `lol! this works! :-D` = 4
lol! this works! :-D: Int = 4
scala> val `omg!!!` = 4
omg!!!: Int = 4
scala> `omg!!!` + `lol! this works! :-D`
res0: Int = 8
In SQL you can have spaces and other non-identifier characters in field names and such. You just have to quote them like [field name] or "field name".
Common Lisp can do it with variables, if you surround the variable name with pipes (|):
CL-USER> (setf |hello world| 42)
42
CL-USER> |hello world|
42
Worth noting is that "piped" variable names also are case sensitive (which variable names normally aren't in CL).
CL-USER> |Hello World|
The variable |Hello World| is unbound.
[Condition of type UNBOUND-VARIABLE]
CL-USER> (setf hello-world 99)
99
CL-USER> hello-world
99
CL-USER> HeLlO-WoRlD
99
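The case sensitivity follows from how the reader works: with the default readtable, unescaped characters in a symbol name are converted to upper case, while characters between pipes are kept exactly as written. A short illustration using only standard functions, assuming the default readtable case:

```lisp
(symbol-name 'hello-world)          ; => "HELLO-WORLD"
(symbol-name '|hello world|)        ; => "hello world"
(eq 'hello-world 'HeLlO-WoRlD)      ; => T
(eq 'hello-world '|HELLO-WORLD|)    ; => T
(eq '|hello world| '|Hello World|)  ; => NIL
```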
PHP can: http://blog.riff.org/2008_05_11_spaces_php_variable_names
Perl also:
${'some var'} = 42;
print ${'some var'}, "\n";
${'my method'} = sub {
print "method called\n";
};
&${'my method'};
A more recent and experimental example is PogoScript, a language that compiles to JavaScript: https://github.com/featurist/pogoscript/wiki
wind speed = 25
average temperature = 32
becomes
windSpeed = 25
averageTemperature = 32
behind the scenes. It also has flexible rules on where arguments go within a name, so you can write:
y = compute some value from (z) and return it
md5 hash (read all text from file "sample.txt")
Becomes:
var y;
y = computeSomeValueFromAndReturnIt(z);
md5Hash(readAllTextFromFile("sample.txt"));
In Ruby you can have symbols that are named as :"this has a space" but it is enclosed in double-quotes so I'm not sure if you count that.
If other languages allowed whitespace as a valid character in symbol names, then you would have to use some other character to separate them.
The problem with spaces in variable names is that it's subject to interpretation since whitespace normally means "ok, end of the current token, starting another." Exceptions to this rule must have some special indicator such as quotation marks in a string ("This is a test").
Our PARLANSE parallel programming language is one such. In fact, it allows any character in identifiers, although many of them, including spaces, have to be escaped (preceded by ~) to be included in the name. Here's an example:
~'Buffer~ Marker~'
This is used to let PARLANSE easily refer to arbitrary symbols from other languages (in particular, from EBNFs taken from arbitrary reference documents, where we can't control the punctuation used).
We don't use this feature a lot, but when it is needed it means we can stay true to tokens from other documents.
You might be able to find esoteric languages that don't separate expression elements with whitespaces on this website: http://99-bottles-of-beer.net
For example... whitespace :D
Some dialects of SQL allow databases, tables, and fields to have spaces in their names.
For example, in SQL Server, you can refer to a table with a space in its name, either by putting the table name in [square brackets] or (depending on connection options) in "double quotes".
There shouldn't be many problems creating languages that support whitespace in identifiers, as long as there are enough separating tokens that tell the parser where the identifiers end (such as operators, braces, commas and the infamous semicolon). It just doesn't improve the readability of the source code much.