Cannot assign string with single quote in golang - string

I am learning go and when playing with string I noticed that if a string is in single quotes then golang is giving me an error but double quotes are working fine.
func main() {
var a string
a = 'hello' //will give error
a = "hello" //will not give error
}
This is the error I get on my system:
illegal rune literal
While when I try to do the same on playground I am getting this error:
prog.go:9: missing '
prog.go:9: syntax error: unexpected name, expecting semicolon or newline or }
prog.go:9: newline in string
prog.go:9: empty character literal or unescaped ' in character literal
prog.go:9: missing '
I am not able to understand the exact reason behind this as in for example Python, Perl one can declare a string with both single and double quote.

In Go, '⌘' represents a single character (called a Rune), whereas "⌘" represents a string containing the character ⌘.
This is true in many programming languages where the difference between strings and characters is notable, such as C++.
Check out the "Code points, characters, and runes" section in the Go Blog on Strings

Another option, if you are wanting to embed double quotes:
package main
func main() {
s := `west "north" east`
println(s)
}
https://golang.org/ref/spec#raw_string_lit

Go is a statically typed language. Also Go is not a scripting language. Though we see Go is running like a scripting language, it is compiling the source we write and then execute the main function. So, we should treat Go as C, Java, C++ where single quote ' is used to declare characters (rune, char) unlike scripting languages like Python or JavaScript.
I think as this is a new language, and current trend is lying with scripting languages, this confusion has been occurred.

Related

Can I get scape characters still behave as such for a string provided by f:read() in Lua?

I'm working on a simple localization function for my scripts and, although it's starting to work quite well so far, I don't know how to avoid scape/special characters to be shown in UI as part of the text after feeding the widgets with the strings returned by f:read().
For example, if in a certain Strings.ES.txt's line I have: Ignorar \"Etiquetas de capa\", I'd expect backslashes didn't end showing up just like when I feed the widget with a normal string between doble quotes like: "Ignorar \"Etiquetas de capa\"", or at least have a way to avoid it. I've been trial-and-erroring with tostring() and load() functions and different (surely nonsense 🙄) concatenations like: load(tostring("[[" .. f:read()" .. ]]")) and such without any success, so here I'm again...
Do someone know if there is a way to get scape characters in a string returned by f:read() still behave as special as when they are found in a regular one?
I don't know how to avoid [e]scape/special characters to be shown in UI as part of the text
What you want is to "unescape" or "unquote" a string to interpret escape sequences as if it were parsed as a quoted string by Lua.
[...] with the strings returned by f:read() [...]
The fact that this string was obtained using f:read() can be ignored; all that matters is that it is a string literal without quotes using quoted string escapes.
I've been trial-and-erroring with tostring() and load() functions and different [...] concatenations like: load(tostring("[[" .. f:read()" .. ]]")) and such without any success [...]
This is almost how to do it, except you chose the wrong string literal type: "Long" strings using pairs square brackets ([ and ]) do not interpret escape sequences at all; they are intended for including long, raw, possibly multiline strings in Lua programs and often come in handy when you need to represent literal strings with backslashes (e.g. regular expressions - not to be confused with Lua patterns, which use % for escapes, and lack the basic alternation operator of regular expressions).
If you instead use single or double quotes to wrap the string, it will work fine:
local function unescape_string(escaped)
return assert(load(('return "%s"'):format(escaped)))()
end
this will produce a tiny Lua program (a "chunk") for each string, which just consists of return "<contents>". Recall that Lua chunks are just functions. Thus you can simply call the function to obtain the value of the string it returns. That way, Lua will interpret the escape sequences for us. The same approach is often used to use Lua for reading data serialized as Lua code.
Note also the use of assert for error handling: load returns nil, err if there is a syntax error. To deal with this gracefully, we can wrap the call to load in assert: assert returns its first argument (the chunk returned by load) if it is truthy; otherwise, if it is falsy (e.g. nil in this case), assert errors, using its second argument as an error message. If you omit the assert and your input causes a syntax error, you will instead get a cryptic "attempt to call a nil value" error.
You probably want to do additional validation, especially if these escaped strings are user-provided - otherwise a malicious string like str"; os.execute("...") can trivially invoke a remote code execution (RCE) vulnerability, allowing it to both execute Lua e.g. to block (while 1 do end), slow down or hijack your application, as well as shell commands using os.execute. To guard against this, searching for an unescaped closing quote should be sufficient (syntax errors e.g. through invalid escapes will still be possible, but RCE should not be possible excepting Lua interpreter bugs):
local function unescape_string(escaped)
-- match start & end of sequences of zero or more backslashes followed by a double quote
for from, to in escaped:gmatch'()\\*()"' do
-- number of preceding backslashes must be odd for the double quote to be escaped
assert((to - from) % 2 ~= 0, "unescaped double quote")
end
return assert(load(('return "%s"'):format(escaped)))()
end
Alternatively, a more robust (but also more complex) and presumably more efficient way of unescaping this would be to manually implement escape sequences through string.gsub; that way you get full control, which is more suitable for user-provided input:
-- Single-character backslash escapes of Lua 5.1 according to the reference manual: https://www.lua.org/manual/5.1/manual.html#2.1
local escapes = {a = '\a', b = '\b', f = '\b', n = '\n', r = '\r', t = '\t', v = '\v', ['\\'] = '\\', ["'"] = "'", ['"'] = '"'}
local function unescape_string(escaped)
return escaped:gsub("\\(.)", escapes)
end
you may implement escapes here as you see fit; for example, this misses decimal escapes, which could easily be implemented as escaped:gsub("\\(%d%d?%d?)", string.char) (this uses coercion of strings to numbers in string.char and a replacement function as second argument to string.gsub).
This function can finally be used straightforwardly as unescape_string(f:read()).

What $() syntax means for Groovy language?

I found this in Groovy Syntax documentation at 4.6.1. Special cases:
As slashy strings were mostly designed to make regexp easier so a few
things that are errors in GStrings like $() or $5 will work with
slashy strings.
What $() syntax means? give some usage examples please
I also found it at Define the Contract Locally in the Repository of the Fraud Detection Service:
body([ // (4)
"client.id": $(regex('[0-9]{10}')),
loanAmount : 99999
])
but I don't understand what $() means when used with regex('[0-9]{10}').
It means nothing (or what you make of it). There are two places, you
are addressing, but they have nothing to do with each other.
The docs just mention this as "you can use slashy strings to write
things, that would give you an error with a GString" - the same is true
for just using '-Strings.
E.g.
"hello $()"
Gives this error:
unknown recognition error type: groovyjarjarantlr4.v4.runtime.LexerNoViableAltException
org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:
/tmp/x.groovy: 1: token recognition error at: '(' # line 1, column 9.
"hello $()"
The parser either wants a { or any char, that is a valid first char
for a variable (neither ( nor 5 is).
The other place you encountered $() (in Spring cloud contract), this
is just a function with the name $.
Form the docs 8. Contract DSL:
You can set the properties inside the body either with the value method or, if you use the Groovy map notation, with $()
So this is just a function, with a very short name.
E.g. you can try this yourself:
void $(x) { println x }
$("Hello")

Extract single unicode character from string

The problem begins when I stumble upon unicode characters. For example, árbol. Right now I handle this by asking if the character at position i, that is, string (i:i) is less than 127. That means that it belongs to the ASCII table, with this I know for sure thatstring (i:i) is a complete single character. In the other case (>= 127) and for my example 'árbol', string (1,2) is the complete character.
I think the way I'm handling the strings solves the problem for my practical purposes (handling files in spanish, polish and russian), but in the case of handling chinese letter where characters may take up to 4 bytes then I would have problems.
Is there a way in fortran to single out unicode characters inside a string?
gfortran does not currently support non-ASCII characters in UTF-8 encoded files, see here. You can find the corresponding bug report here.
As a work-around, you can specify the unicode char in Hex-notation: char(int(z'00E1'), ucs4), or '\u00E1'. The latter requires the compile option -fbackslash to enable the evaluation of the backslash.
program character_kind
use iso_fortran_env
implicit none
integer, parameter :: ucs4 = selected_char_kind ('ISO_10646')
character(kind=ucs4, len=20) :: string
! string = ucs4_'árbol' ! This does not work
! string = char(int(z'00E1'), ucs4) // ucs4_'rbol' ! This works
string = ucs4_'\u00E1rbol' ! This is also working
open (output_unit, encoding='UTF-8')
print *, string(1:1)
print *, string
end program character_kind
ifort seems not to support ISO_10646 at all, selected_char_kind ('ISO_10646') returns -1. With ifort 15.0.0 I get the same message as described here.

Single-quote notation for characters in Coq?

In most programming languages, 'c' is a character and "c" is a string of length 1. But Coq (according to its standard ascii and string library) uses "c" as the notation for both, which requires constant use of Open Scope to clarify which one is being referred to. How can you avoid this and designate characters in the usual way, with single quotes? It would be nice if there is a solution that only partially overrides the standard library, changing the notation but recycling the rest.
Require Import Ascii.
Require Import String.
Check "a"%char.
Check "b"%string.
or this
Program Definition c (s:string) : ascii :=
match s with "" => " "%char | String a _ => a end.
Check (c"A").
Check ("A").
I am quite confident that there is no smart way of doing this, but there is a somewhat annoying one: simply declare one notation for each character.
Notation "''c''" := "c" : char_scope.
Notation "''a''" := "a" : char_scope.
Check 'a'.
Check 'c'.
It shouldn't be too hard to write a script for automatically generating those declarations. I don't know if this has any negative side-effects on Coq's parser, though.

What's the point of nesting brackets in Lua?

I'm currently teaching myself Lua for iOS game development, since I've heard lots of very good things about it. I'm really impressed by the level of documentation there is for the language, which makes learning it that much easier.
My problem is that I've found a Lua concept that nobody seems to have a "beginner's" explanation for: nested brackets for quotes. For example, I was taught that long strings with escaped single and double quotes like the following:
string_1 = "This is an \"escaped\" word and \"here\'s\" another."
could also be written without the overall surrounding quotes. Instead one would simply replace them with double brackets, like the following:
string_2 = [[This is an "escaped" word and "here's" another.]]
Those both make complete sense to me. But I can also write the string_2 line with "nested brackets," which include equal signs between both sets of the double brackets, as follows:
string_3 = [===[This is an "escaped" word and "here's" another.]===]
My question is simple. What is the point of the syntax used in string_3? It gives the same result as string_1 and string_2 when given as an an input for print(), so I don't understand why nested brackets even exist. Can somebody please help a noob (me) gain some perspective?
It would be used if your string contains a substring that is equal to the delimiter. For example, the following would be invalid:
string_2 = [[This is an "escaped" word, the characters ]].]]
Therefore, in order for it to work as expected, you would need to use a different string delimiter, like in the following:
string_3 = [===[This is an "escaped" word, the characters ]].]===]
I think it's safe to say that not a lot of string literals contain the substring ]], in which case there may never be a reason to use the above syntax.
It helps to, well, nest them:
print [==[malucart[[bbbb]]]bbbb]==]
Will print:
malucart[[bbbb]]]bbbb
But if that's not useful enough, you can use them to put whole programs in a string:
loadstring([===[print "o m g"]===])()
Will print:
o m g
I personally use them for my static/dynamic library implementation. In the case you don't know if the program has a closing bracket with the same amount of =s, you should determine it with something like this:
local c = 0
while contains(prog, "]" .. string.rep("=", c) .. "]") do
c = c + 1
end
-- do stuff

Resources