Single-quote notation for characters in Coq?

In most programming languages, 'c' is a character and "c" is a string of length 1. But Coq (according to its standard ascii and string library) uses "c" as the notation for both, which requires constant use of Open Scope to clarify which one is being referred to. How can you avoid this and designate characters in the usual way, with single quotes? It would be nice if there is a solution that only partially overrides the standard library, changing the notation but recycling the rest.

Require Import Ascii.
Require Import String.
Check "a"%char.
Check "b"%string.
or this
Program Definition c (s:string) : ascii :=
match s with "" => " "%char | String a _ => a end.
Check (c"A").
Check ("A").

I am quite confident that there is no smart way of doing this, but there is a somewhat annoying one: simply declare one notation for each character.
Notation "''c''" := "c" : char_scope.
Notation "''a''" := "a" : char_scope.
Check 'a'.
Check 'c'.
It shouldn't be too hard to write a script for automatically generating those declarations. I don't know if this has any negative side-effects on Coq's parser, though.
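The generation step mentioned above could be scripted in any language; a minimal sketch in Python follows. It assumes the declaration pattern shown in this answer is accepted by Coq for every character emitted, which has not been checked here, and it restricts itself to ASCII letters and digits, since characters such as ' or " would presumably need extra escaping inside a notation string. The function name coq_char_notations is made up for this illustration.

```python
import string


def coq_char_notations(chars=string.ascii_letters + string.digits):
    """Return one Coq Notation line per character, copying the pattern above.

    Assumption: the pattern  Notation "''c''" := "c" : char_scope.
    carries over unchanged to every character in `chars`.
    """
    lines = []
    for ch in chars:
        # Two apostrophes on each side, exactly as in the hand-written examples.
        lines.append(f"Notation \"''{ch}''\" := \"{ch}\" : char_scope.")
    return lines


if __name__ == "__main__":
    # Print the declarations so they can be pasted or redirected into a .v file.
    print("\n".join(coq_char_notations()))
```

The output is plain text, so it can be reviewed by hand before being loaded, which also makes it easy to see whether the extra notations affect parsing elsewhere.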

Related

Can I get escape characters to still behave as such in a string provided by f:read() in Lua?

I'm working on a simple localization function for my scripts and, although it's working quite well so far, I don't know how to keep escape/special characters from showing up in the UI as part of the text after feeding the widgets with the strings returned by f:read().
For example, if a line of Strings.ES.txt contains: Ignorar \"Etiquetas de capa\", I'd expect the backslashes not to show up, just as when I feed the widget with a normal string between double quotes like: "Ignorar \"Etiquetas de capa\"", or at least to have a way to avoid it. I've been trial-and-erroring with the tostring() and load() functions and different (surely nonsense 🙄) concatenations like: load(tostring("[[" .. f:read()" .. ]]")) and such without any success, so here I am again...
Does anyone know if there is a way to make escape characters in a string returned by f:read() behave as specially as they do in a regular string literal?
I don't know how to avoid [e]scape/special characters to be shown in UI as part of the text
What you want is to "unescape" or "unquote" a string to interpret escape sequences as if it were parsed as a quoted string by Lua.
[...] with the strings returned by f:read() [...]
The fact that this string was obtained using f:read() can be ignored; all that matters is that it is a string literal without quotes using quoted string escapes.
I've been trial-and-erroring with tostring() and load() functions and different [...] concatenations like: load(tostring("[[" .. f:read()" .. ]]")) and such without any success [...]
This is almost how to do it, except you chose the wrong string literal type: "long" strings, delimited by pairs of square brackets ([[ and ]]), do not interpret escape sequences at all. They are intended for including long, raw, possibly multiline strings in Lua programs and often come in handy when you need to represent literal strings with backslashes (e.g. regular expressions, not to be confused with Lua patterns, which use % for escapes and lack the basic alternation operator of regular expressions).
If you instead use single or double quotes to wrap the string, it will work fine:
local function unescape_string(escaped)
    return assert(load(('return "%s"'):format(escaped)))()
end
This will produce a tiny Lua program (a "chunk") for each string, which just consists of return "<contents>". Recall that Lua chunks are just functions. Thus you can simply call the function to obtain the value of the string it returns. That way, Lua will interpret the escape sequences for us. The same approach is often used to read data serialized as Lua code.
Note also the use of assert for error handling: load returns nil, err if there is a syntax error. To deal with this gracefully, we can wrap the call to load in assert: assert returns its first argument (the chunk returned by load) if it is truthy; otherwise, if it is falsy (e.g. nil in this case), assert errors, using its second argument as an error message. If you omit the assert and your input causes a syntax error, you will instead get a cryptic "attempt to call a nil value" error.
You probably want to do additional validation, especially if these escaped strings are user-provided - otherwise a malicious string like str"; os.execute("...") can trivially invoke a remote code execution (RCE) vulnerability, allowing it to both execute Lua e.g. to block (while 1 do end), slow down or hijack your application, as well as shell commands using os.execute. To guard against this, searching for an unescaped closing quote should be sufficient (syntax errors e.g. through invalid escapes will still be possible, but RCE should not be possible excepting Lua interpreter bugs):
local function unescape_string(escaped)
    -- match start & end of sequences of zero or more backslashes followed by a double quote
    for from, to in escaped:gmatch'()\\*()"' do
        -- number of preceding backslashes must be odd for the double quote to be escaped
        assert((to - from) % 2 ~= 0, "unescaped double quote")
    end
    return assert(load(('return "%s"'):format(escaped)))()
end
Alternatively, a more robust (but also more complex) and presumably more efficient way of unescaping this would be to manually implement escape sequences through string.gsub; that way you get full control, which is more suitable for user-provided input:
-- Single-character backslash escapes of Lua 5.1 according to the reference manual: https://www.lua.org/manual/5.1/manual.html#2.1
local escapes = {a = '\a', b = '\b', f = '\f', n = '\n', r = '\r', t = '\t', v = '\v', ['\\'] = '\\', ["'"] = "'", ['"'] = '"'}
local function unescape_string(escaped)
    -- unknown escapes are left untouched; the parentheses drop gsub's second result (the match count)
    return (escaped:gsub("\\(.)", escapes))
end
You may implement escapes here as you see fit; for example, this misses decimal escapes, which could easily be implemented as escaped:gsub("\\(%d%d?%d?)", string.char) (this relies on string.char coercing strings to numbers and on passing a replacement function as the second argument to string.gsub).
This function can finally be used straightforwardly as unescape_string(f:read()).

Python regular expressions with foreign characters in PyQt5

This problem might be very simple but I find it a bit confusing & that is why I need help.
With relevance to this question I posted that got solved, I got a new issue that I just noticed.
Source code:
from PyQt5 import QtCore,QtWidgets
app=QtWidgets.QApplication([])
def scroll():
    #QtCore.QRegularExpression(r'\b'+'cat'+'\b')
    item = listWidget.findItems(r'\bcat\b', QtCore.Qt.MatchRegularExpression)
    for d in item:
        print(d.text())
window = QtWidgets.QDialog()
window.setLayout(QtWidgets.QVBoxLayout())
listWidget = QtWidgets.QListWidget()
window.layout().addWidget(listWidget)
cats = ["love my cat","catirization","cat in the clouds","catść"]
for i,cat in enumerate(cats):
    QtWidgets.QListWidgetItem(f"{i} {cat}", listWidget)
btn = QtWidgets.QPushButton('Scroll')
btn.clicked.connect(scroll)
window.layout().addWidget(btn)
window.show()
app.exec_()
Output GUI:
Now as you can see I am just trying to print out the text data based on the regex r"\bcat\b" when I press the "Scroll" button and it works fine!
Output:
0 love my cat
2 cat in the clouds
3 catść
However, as you can see, item #3 should not be printed because it obviously does not match the regular expression r"\bcat\b". Yet it does, and I think it has something to do with the special foreign characters ść that make it a match (which they shouldn't, right?).
I'm expecting an output like:
0 love my cat
2 cat in the clouds
Research I have tried
I found this question, which mentions \p{L}. According to the answer, it means:
If all you want to match is letters (including "international"
letters) you can use \p{L}.
To be honest, I'm not sure how to apply that with PyQt5. I made some attempts anyway and tried changing the regex to r'\b'+r'\p{cat}'+r'\b'. However, I got this error:
QString::contains: invalid QRegularExpression object
QString::contains: invalid QRegularExpression object
QString::contains: invalid QRegularExpression object
QString::contains: invalid QRegularExpression object
Obviously the error says it's not a valid regex. Can someone educate me on how to solve this issue? Thank you!
In general, when you need to make your shorthand character classes and word boundaries Unicode-aware, you need to pass the QRegularExpression.UseUnicodePropertiesOption option to the regex compiler. See the QRegularExpression.UseUnicodePropertiesOption reference:
The meaning of the \w, \d, etc., character classes, as well as the meaning of their counterparts (\W, \D, etc.), is changed from matching ASCII characters only to matching any character with the corresponding Unicode property. For instance, \d is changed to match any character with the Unicode Nd (decimal digit) property; \w to match any character with either the Unicode L (letter) or N (digit) property, plus underscore, and so on. This option corresponds to the /u modifier in Perl regular expressions.
In Python, you could declare it as
rx = QtCore.QRegularExpression(r'\bcat\b', QtCore.QRegularExpression.UseUnicodePropertiesOption)
However, since QListWidget.findItems does not support a QRegularExpression as an argument and only accepts the regex as a string, you can use the (*UCP) PCRE verb as an alternative:
r'(*UCP)\bcat\b'
Make sure you define it at the regex beginning.
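The effect of Unicode-aware word boundaries can be reproduced outside Qt. The sketch below uses Python's own re module rather than QRegularExpression, so it is only an analogy: re.ASCII is assumed to stand in for the ASCII-only default of \w and \b described above, and the default behaviour for str patterns stands in for (*UCP). The list mirrors the item texts from the question.

```python
import re

items = ["0 love my cat", "1 catirization", "2 cat in the clouds", "3 catść"]

# ASCII-only classes: "ś" is not a word character, so a boundary exists after "cat".
ascii_rx = re.compile(r"\bcat\b", re.ASCII)
# Default for str patterns: \w and \b follow Unicode, so "ś" continues the word.
unicode_rx = re.compile(r"\bcat\b")

ascii_hits = [s for s in items if ascii_rx.search(s)]
unicode_hits = [s for s in items if unicode_rx.search(s)]

print(ascii_hits)    # includes "3 catść", like the unexpected output in the question
print(unicode_hits)  # only items 0 and 2, the expected output
```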

How can I use arbitrary text as a function name in Rust?

Is there a way in Rust to use any text as a function name? Something like:
fn 'This is the name of the function' { ... }
I find it useful for test functions, and it is allowed by other languages.
There's no way. According to the official reference:
An identifier is any nonempty ASCII string of the following form:
Either
The first character is a letter.
The remaining characters are alphanumeric or _.
Or
The first character is _.
The identifier is more than one character. _ alone is not an identifier.
The remaining characters are alphanumeric or _.
A raw identifier is like a normal identifier, but prefixed by r#. (Note that
the r# prefix is not included as part of the actual identifier.)
Unlike a normal identifier, a raw identifier may be any strict or reserved
keyword except the ones listed above for RAW_IDENTIFIER.
You can't have spaces in function names (and this is true of most programming languages). Usual practice for function names in Rust is to replace spaces with underscores, so the following is allowed:
fn This_is_the_name_of_the_function() { ... }
although usual practice (snake_case) would use a lower-case t.

What's the point of nesting brackets in Lua?

I'm currently teaching myself Lua for iOS game development, since I've heard lots of very good things about it. I'm really impressed by the level of documentation there is for the language, which makes learning it that much easier.
My problem is that I've found a Lua concept that nobody seems to have a "beginner's" explanation for: nested brackets for quotes. For example, I was taught that long strings with escaped single and double quotes like the following:
string_1 = "This is an \"escaped\" word and \"here\'s\" another."
could also be written without the overall surrounding quotes. Instead one would simply replace them with double brackets, like the following:
string_2 = [[This is an "escaped" word and "here's" another.]]
Those both make complete sense to me. But I can also write the string_2 line with "nested brackets," which include equal signs between both sets of the double brackets, as follows:
string_3 = [===[This is an "escaped" word and "here's" another.]===]
My question is simple. What is the point of the syntax used in string_3? It gives the same result as string_1 and string_2 when given as an input for print(), so I don't understand why nested brackets even exist. Can somebody please help a noob (me) gain some perspective?
It would be used if your string contains a substring that is equal to the delimiter. For example, the following would be invalid:
string_2 = [[This is an "escaped" word, the characters ]].]]
Therefore, in order for it to work as expected, you would need to use a different string delimiter, like in the following:
string_3 = [===[This is an "escaped" word, the characters ]].]===]
I think it's safe to say that not a lot of string literals contain the substring ]], in which case there may never be a reason to use the above syntax.
It helps to, well, nest them:
print [==[malucart[[bbbb]]]bbbb]==]
Will print:
malucart[[bbbb]]]bbbb
But if that's not useful enough, you can use them to put whole programs in a string:
loadstring([===[print "o m g"]===])()
Will print:
o m g
I personally use them for my static/dynamic library implementation. In the case you don't know if the program has a closing bracket with the same amount of =s, you should determine it with something like this:
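local c = 0
-- plain find (fourth argument true), so "]" is not treated as a pattern character
while prog:find("]" .. string.rep("=", c) .. "]", 1, true) do
    c = c + 1
end
-- level c no longer occurs in prog; do stuff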

Lua string.format options

This may seem like a stupid question, but what are the symbols used for string replacement in string.format? Can someone point me to a simple example of how to use it?
string.format in Lua follows the same patterns as printf in C:
https://cplusplus.com/reference/cstdio/printf/
There are some exceptions, for those see here:
http://pgl.yoyo.org/luai/i/string.format
Chapter 20 of PiL describes string.format near the end:
The function string.format is a powerful tool when formatting strings, typically for output. It returns a formatted version of its variable number of arguments following the description given by its first argument, the so-called format string. The format string has rules similar to those of the printf function of standard C: It is composed of regular text and directives, which control where and how each argument must be placed in the formatted string.
The Lua Reference says:
The format string follows the same rules as the printf family of standard C functions. The only differences are that the options/modifiers *, l, L, n, p, and h are not supported and that there is an extra option, q.
The function is implemented by str_format() in strlib.c which itself interprets the format string, but defers to the C library's implementation of sprintf() to actually format each field after determining what type of value is expected (string or number, essentially) to correspond to each field.
There should be a "Lua Quick Reference" HTML file on your hard disk, if you used an installation package.
(for example: ../Lua/5.1/docs/luarefv51.html)
There you'll find, among other things,
string.format (s [, args ])
Formatting directives
Formatting field types
Formatting flags
Formatting examples
To add to the other answers: Lua does have a boolean data type, where C does not. C uses numbers for that, where 0 is false and everything else is true.
However, trying to format a boolean with %d in Lua,
local text = string.format("bool is %d", truth)
gets (at least in Hammerspoon):
bad argument #2 to 'format' (number expected, got boolean)
You can instead use %s for booleans (as for strings):
local text = string.format("bool is %s", truth)
