I'm having a bit of an issue trying to do an ASCII art challenge in GS, since it requires finishing a line with the \ symbol.
The problem is that "\"p breaks the program since it thinks you escaped a quote, and "\\"p prints two backslashes. I've tried string concatenation, removing one character at a time, printing substrings, and so on. Nothing seems to work!
I need this string to be printed out, so how would this be done?
It seems that the behavior with p is buggy. I'll look for a place to report it.
However, "\\" by itself does not print two backslashes; it prints one.
Here's a test link to prove it.
Output:
\
"\\" creates a string with 1 backslash because strings are escaped. This is the same as languages like Ruby.
p escapes strings, so a string with one backslash will be displayed as two. This is also the same as languages like Ruby.
So if you want to print a single backslash, or print things without the quotes, you need to print unescaped strings. The best way to do this is with implicit IO (anything on the stack that is left over is printed unescaped).
The program
"\\"
should print
\
You could also use print or puts if you don't want to use implicit IO.
I would like to create a string in the following exact format:
"\"\"\"\nThis is a beautiful world.\n"
But the code:
test ="\"\"\"\nthis is a beautiful world.\n"
test
gives the output:
'"""\nthis is a beautiful world.'
Please help me get the exact text.
My string test should look exactly like the string it was initialized to, but after initialization it actually gives the output shown above. I want to concatenate the test string to another string.
The symbol "\" is called an escape character in most programming languages. It is used to put characters into a string that would otherwise be hard to include. For example, to put a double quote inside a string, write \":
a = "he said, \"hello\" to me"
This would print:
he said, "hello" to me
Here, the "\" tells the interpreter to treat the following character as part of the string, where it would otherwise raise an error (for example, by ending the string early).
To include a backslash itself in a string, put an extra backslash in front of it:
a = "\\"
Here, the value of a is \.
If you still haven't been able to understand it, try this tutorial.
For your code, try this:
test = "\\\"\\\"\\\"\\nThis is a beautiful world.\\n"
And if you also want the double quotes at the ends:
test = "\"\\\"\\\"\\\"\\nThis is a beautiful world.\\n\""
The first thing to note is that just typing the variable name when running Python interactively returns the canonical string representation of the object and not (necessarily) the plain value of the object.
For strings this means (among other things) that quotes are added around the output (in your example the outermost single quotes) and any newlines are shown as "\n".
This means that, although your output does show "\n", the actual string contains a newline character in its place.
To check what a string looks like, you should use the print() function to, well, print it.
>>> test = "\"\"\"\nthis is a beautiful world."
>>> test
'"""\nthis is a beautiful world.'
>>> print(test)
"""
this is a beautiful world.
>>>
Also, when running the code from a file, lines just containing variable names will not result in any output.
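To see the difference outside the interactive prompt, here is a small script sketch. It uses the string from the question and only built-in functions; the variable name shown is just for illustration.

```python
# The string from the question: three double quotes, a newline,
# the sentence, and a trailing newline.
test = "\"\"\"\nthis is a beautiful world.\n"

# repr() is what the interactive prompt shows when you type a bare name:
# quotes are added and each newline is displayed as the two characters \n.
shown_by_prompt = repr(test)
print(shown_by_prompt)

# print() writes the actual characters, so the newlines become line breaks.
print(test)

# The string itself contains real newline characters and no backslashes.
print(test.count("\n"))   # number of newline characters
print("\\" in test)       # False: there is no backslash in the value
```

Because the value is unchanged by how it is displayed, concatenating it to another string works as expected.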
To answer the question
There are a few ways to handle that.
Assuming that the desired output is
"""\nThis is a beautiful world.\n
i.e. the outermost double quotes are not supposed to be part of the string, that is
While using double quotes ("…") to denote strings: escape any \ or " by prepending it with \:
>>> test ="\\\"\\\"\\\"\\nthis is a beautiful world.\\n"
>>> print(test)
\"\"\"\nthis is a beautiful world.\n
Within regular strings, \ is used to introduce escape sequences for control characters. For example, \n is interpreted as a newline and \b as a backspace. If you need a literal \ in a string, you actually need to write two: \\.
If you are usually using "…" for string notation, this allows for a more consistent coding style but it is (especially in this case) quite ugly and might be hard to understand at a glance.
As your string contains a lot of " characters, just use single quotes ('…') to designate the string. This removes the need to escape ":
>>> test = '\\"\\"\\"\\nthis is a beautiful world.\\n'
>>> print(test)
\"\"\"\nthis is a beautiful world.\n
This is less consistent (if "…" is usually used for strings), but it allows the code to be quite a bit closer to the desired output.
Use raw strings (r'…' or r"…") to disable the interpretation of control characters and allow the use of " within the string:
>>> test = r'\"\"\"\nthis is a beautiful world.\n'
>>> print(test)
\"\"\"\nthis is a beautiful world.\n
or even
>>> test = r"\"\"\"\nthis is a beautiful world.\n"
>>> print(test)
\"\"\"\nthis is a beautiful world.\n
This allows the code to be identical to the desired output, but it has some limitations when it comes to freely mixing " and ' within a single string as it is not possible to escape the quotation marks within the string without also adding \ to the string output. This can be seen in the second example, where we use \" to escape the double quote within r"…" in the code but where the \s are still present in the output. While this works well in this specific case, I would recommend against using \' within r'…' or \" within r"…" to avoid confusion.
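As a quick check that the three notations really produce the same value, here is a minimal sketch. The variable names are made up for the comparison; the literals are the ones shown above.

```python
# 1. Double-quoted literal: every \ and " is escaped.
escaped_double = "\\\"\\\"\\\"\\nthis is a beautiful world.\\n"

# 2. Single-quoted literal: only \ needs escaping.
escaped_single = '\\"\\"\\"\\nthis is a beautiful world.\\n'

# 3. Raw literal: backslashes are kept as written.
raw = r'\"\"\"\nthis is a beautiful world.\n'

# All three spell the same sequence of characters.
print(escaped_double == escaped_single == raw)   # True

# The value contains literal backslashes and no newline character.
print("\n" in raw)        # False
print(raw.count("\\"))    # 5: three before the quotes, two before the n's
```

Which notation to prefer is mostly a readability choice, since the resulting string objects are identical.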
Please note that this question is similar to this one, but different enough that those answers won't solve my problem:
For insertion of control characters like e.g. \x08, it seems that I have to use double quotes ".
All spaces need to be preserved exactly as given. For line breaks I explicitly use \n.
I have some string data which I need to store in YAML, e.g.:
" This is my quite long string data "
"This is my quite long string data"
"This_is_my_quite_long_string_data"
"Sting data\nwhich\x08contains control characters"
and need it in YAML as something like this:
Key: " This is my" +
" quite long " +
" string data "
This is no problem as long as I stay on a single line, but I don't know how to spread the string content across multiple lines.
YAML block scalar styles (>, |) won't help here, because they don't allow escaping and they even do some whitespace stripping and newline / space substitution, which is useless for my case.
It looks like the only way is to use double quotes " and backslashes \, like this:
Key: "\
This is \
my quite \
long string data\
"
Trying this in an online YAML parser results in "This is my quite long string data", as expected.
But it unfortunately fails if one of the "sub-lines" has a leading space, like this:
Key: "\
This is \
my quite\
long st\
ring data\
"
This results in "This is my quitelong string data", with the space between the words quite and long removed. The only thing that comes to my mind to solve that is to replace the first leading space of each sub-line with \x20, like this:
Key: "\
This is \
my quite\
\x20long st\
ring data\
"
As I chose YAML to get the most human-readable format possible, I find \x20 a somewhat ugly solution. Maybe someone knows a better approach?
To keep things human-readable, I also don't want to use !!binary for this.
Instead of \x20, you can simply escape the first non-indentation space on the line:
Key: "\
This is \
my quite\
\ long st\
ring data\
"
This works with multiple spaces, you only need to escape the first one.
You are right in your observation that control characters can only be represented in double quoted scalars.
However the parser doesn't fail if the sub-lines (in YAML speak: continuation lines) have a leading space. It is your interpretation of the YAML standard that is incorrect. The standard explicitly states that for multi-line double quoted scalars:
All leading and trailing white space characters are excluded from the content.
So you can put as many spaces as you want before the word long; it will not make a difference.
The representer for double quoted scalars for Python (both in ruamel.yaml and PyYAML) always does represent newlines as \n. I am not aware of YAML representers in other languages where you have more control over this (and e.g. get double newlines to represent \n in your double quoted scalars). So you probably have to write your own representer.
While writing a representer you can try to make the line breaking be smart, in that it minimizes the number of escaped spaces (by putting them between words on the same line). But especially on strings with a high double space to word ratio, combined with a small width to operate in, it will be hard (if not impossible) to do without escaped spaces.
Such a representer should IMO first check if double quoting is necessary (i.e. there are control characters apart from newlines). If not, and there are newlines, you are probably better off representing the string as a block style literal scalar (for which spaces at the beginning or end of a line are not excluded).
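A rough sketch of what such a representer could look like in plain Python follows. Everything in it is an assumption for illustration: fold_double_quoted and escape_units are hypothetical helper names (not part of PyYAML or ruamel.yaml), only ASCII control characters are handled, the key is assumed not to need quoting, and lines are broken at a fixed width rather than "smartly" between words.

```python
def escape_units(value):
    """Yield the escaped form of each character as one indivisible unit,
    so that a line break is never placed inside an escape sequence."""
    named = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\t": "\\t"}
    for ch in value:
        if ch in named:
            yield named[ch]
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            # Other ASCII control characters, e.g. \x08.
            yield "\\x%02x" % ord(ch)
        else:
            yield ch


def fold_double_quoted(key, value, width=10, indent="  "):
    """Emit `key: "..."` as a multi-line double quoted scalar.

    Every line ends with a backslash (escaped line break). A space that
    would start a continuation line is written as an escaped space,
    because unescaped leading white space is dropped by the parser.
    """
    lines, current = [], ""
    for unit in escape_units(value):
        if current and len(current) + len(unit) > width:
            lines.append(current)
            current = ""
        if not current and unit == " ":
            unit = "\\ "  # escape the first space on a continuation line
        current += unit
    if current:
        lines.append(current)
    body = "".join(indent + line + "\\\n" for line in lines)
    return f'{key}: "\\\n{body}{indent}"'


# Hypothetical usage with one of the strings from the question.
print(fold_double_quoted("Key", "This is my quite long string data"))
```

Trailing spaces before the line-ending backslash are left alone, since they are preserved in front of an escaped line break. If PyYAML is available, loading the result with yaml.safe_load and comparing it to the input would be a sensible round-trip check.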
So I have this string:
{"scores":{"1":["John",60],"2":["Jude",60],"3":["Max",60],"4":["Kyle",60],"5":["Smith",60],"6":["Mark",50],"7":["Luke",40],"8":["Anne",30],"9":["Bruce",20],"10":["kazuo",10]}}
There are a number of integers there that have quotes around them, and I want to get rid of them. How do I do that? I already tried out:
print(string.gsub(string, '/"(\d)"/', "%1"));
but it does not work. :(
Lua does not have regular expressions like Perl; instead, it has patterns. These are similar, with a few differences.
There is no need for delimiting slashes / /, and the escape character is %, not \. Otherwise, your attempt is essentially correct:
print(string.gsub(str, '"(%d+)"', "%1"))
Where str is the variable containing the input string. Also note that string.gsub returns 2 values, which are both printed, the second result being the number of substitutions. Use an extra pair of parentheses to keep only the first result.
You can simplify the notation a little using the colon (:) operator:
print((str:gsub('"(%d+)"', "%1")))
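For comparison only, the same substitution in Python's re module might look like the sketch below. Python is not what the question asks about; the point is that re.subn also returns both the new string and the number of substitutions, much like string.gsub. The shortened input string is made up from the one in the question.

```python
import re

# Shortened version of the string from the question.
data = '{"scores":{"1":["John",60],"10":["kazuo",10]}}'

# Python regexes use \d where Lua patterns use %d, and \1 where Lua uses %1.
# re.subn returns a (new_string, number_of_substitutions) pair.
result, count = re.subn(r'"(\d+)"', r"\1", data)

print(result)  # {"scores":{1:["John",60],10:["kazuo",10]}}
print(count)   # 2
```

Use re.sub instead of re.subn if only the resulting string is wanted.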
While running an R-plugin in SPSS, I receive a Windows path string as input e.g.
'C:\Users\mhermans\somefile.csv'
I would like to use that path in subsequent R code, but then the backslashes need to be replaced with forward slashes, otherwise R interprets them as escapes (e.g. "\U used without hex digits" errors).
I have however not been able to find a function that can replace the backslashes with forward slashes or double escape them. All those functions assume those characters are escaped.
So, is there something along the lines of:
>gsub('\\', '/', 'C:\Users\mhermans')
C:/Users/mhermans
You can try to use the 'allowEscapes' argument in scan()
X=scan(what="character",allowEscapes=F)
C:\Users\mhermans\somefile.csv
print(X)
[1] "C:\\Users\\mhermans\\somefile.csv"
As of version 4.0, introduced in April 2020, R provides a syntax for specifying raw strings. The string in the example can be written as:
path <- r"(C:\Users\mhermans\somefile.csv)"
From ?Quotes:
Raw character constants are also available using a syntax similar to the one used in C++: r"(...)" with ... any character sequence, except that it must not contain the closing sequence )". The delimiter pairs [] and {} can also be used, and R can be used in place of r. For additional flexibility, a number of dashes can be placed between the opening quote and the opening delimiter, as long as the same number of dashes appear between the closing delimiter and the closing quote.
First you need to get it assigned to a name:
pathname <- 'C:\\Users\\mhermans\\somefile.csv'
Notice that in order to get it into a named vector you needed to double them all, which gives a hint about how you could use regex. Actually, if you read it in from a text file, then R will do all the doubling for you. Mind you, it is not really doubling the backslashes. Each one is stored as a single backslash, but it is displayed doubled and needs to be entered doubled at the console. Otherwise the R interpreter tries (and often fails) to turn it into a special character. And to compound the problem, regex uses the backslash as an escape as well. So to detect a backslash with grep or sub or gsub you need to quadruple the backslashes:
gsub("\\\\", "/", pathname)
# [1] "C:/Users/mhermans/somefile.csv"
You needed to doubly "double" the backslashes. The first of each couple of \'s is to signal to the grep machine that what next comes is a literal.
Consider:
nchar("\\A")
# returns `[1] 2`
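The same two layers of escaping (string literal first, regex second) exist in other languages too. As an illustration only, here is how it looks in Python, which is not what the question uses; the variable names are made up.

```python
import re

# A raw literal keeps the backslashes as typed, like R's r"(...)" syntax.
path = r"C:\Users\mhermans\somefile.csv"

# "\\A" is two characters: one backslash and the letter A.
print(len("\\A"))  # 2

# Plain string replacement: the literal "\\" is a single backslash.
with_replace = path.replace("\\", "/")

# Regex replacement: four backslashes in the literal become two in the
# pattern, which the regex engine reads as one literal backslash.
with_regex = re.sub("\\\\", "/", path)

print(with_replace)  # C:/Users/mhermans/somefile.csv
print(with_regex)    # C:/Users/mhermans/somefile.csv
```

In R, gsub with fixed = TRUE similarly avoids the regex layer, so only the string-literal doubling is needed.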
If file E:\Data\junk.txt contains the following text (without quotes): C:\Users\mhermans\somefile.csv
You may get a warning with the following statement, but it will work:
texinp <- readLines("E:\\Data\\junk.txt")
If file E:\Data\junk.txt contains the following text (with quotes): "C:\Users\mhermans\somefile.csv"
The above readLines statement might also give you a warning, and texinp will now contain (as displayed by R):
"\"C:\\Users\\mhermans\\somefile.csv\""
So, to get what you want, make sure there aren't quotes in the incoming file, and use:
texinp <- suppressWarnings(readLines("E:\\Data\\junk.txt"))
I am looking for some best practices as far as handling csv and tab delimited files.
For CSV files I am already doing some formatting if a value contains a comma or double quote but what if the value contains a new line character? Should I leave the new line intact and encase the value in double quotes + escape any double quotes within the value?
Same question for tab delimited files. I assume the answer would be very similar if not the same.
Usually you keep \n unaltered, relying on the fact that the newline character will be enclosed in a " " string. This doesn't create ambiguities, but it's really ugly if you have to take a look at the file using a normal text editor.
But it is how you should do it, since you don't escape anything inside a string in a CSV except for the double quote itself.
@Jack is right that your best bet is to keep the \n unaltered, since you'll expect it inside of double quotes if that is the case.
As with most things, I think consistency here is key. As far as I know, your values only need to be double-quoted if they span multiple lines, contain commas, or contain double-quotes. In some implementations I've seen, all values are escaped and double-quoted, since it makes the parsing algorithm simpler (there's never a question of escaping and double-quoting, and the reverse on reading the CSV).
This isn't the most space-optimized solution, but makes reading and writing the file a trivial affair, for both your own library and others that may consume it in the future.
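To make the convention concrete, here is a small sketch using Python's standard csv module, which follows the same rules by default: fields containing a newline, comma, or double quote are wrapped in double quotes, and embedded double quotes are doubled. The sample row is made up.

```python
import csv
import io

# One row whose middle field contains both double quotes and a newline.
row = ["1", 'He said "hi"\nand left', "a,b"]

# newline="" keeps the csv module in charge of line endings.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow(row)
text = buffer.getvalue()

# The newline stays as-is inside the quoted field; only " is escaped (as "").
print(repr(text))

# Reading it back yields the original values, embedded newline included.
parsed = list(csv.reader(io.StringIO(text, newline="")))
print(parsed == [row])  # True
```

Passing quoting=csv.QUOTE_ALL to csv.writer gives the "quote every value" variant mentioned above.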
For TSV, if you want lossless representation of values, the "Linear TSV" specification is worth considering: http://paulfitz.github.io/dataprotocols/linear-tsv/index.html
For obvious reasons, most such conventions adhere to the following at a minimum:
\n for newline,
\t for tab,
\r for carriage return,
\\ for backslash
Some tools add \0 for NUL.
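A minimal sketch of that escaping scheme in Python is shown below. The function names escape_field and unescape_field are made up for this example, and it covers only the four escapes listed above (not \0); check the specification before relying on it for real data.

```python
import re

# The four escapes listed above: backslash, tab, newline, carriage return.
_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}
_UNESCAPES = {escaped: raw for raw, escaped in _ESCAPES.items()}


def escape_field(value):
    """Replace each special character with its two-character escape."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def unescape_field(text):
    """Reverse escape_field, scanning left to right so that an escaped
    backslash followed by 'n' is not mistaken for an escaped newline."""
    return re.sub(r"\\[\\tnr]", lambda m: _UNESCAPES[m.group(0)], text)


# Hypothetical value with a tab, a literal backslash + n, and a newline.
value = "a\tb\\n\nc"
encoded = escape_field(value)
print(encoded)                           # a\tb\\n\nc
print(unescape_field(encoded) == value)  # True
```

With every tab and newline escaped inside the values, a row can be split safely on real tab characters and a file on real newlines.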