Create a string in golang with characters that require escaping

I am trying to make a string variable containing:
"C:\Program Files\Sublime Text 3\sublime_text.exe" C:\Users\User\Desktop\Guess.py
Unfortunately I am not succeeding. Is there a way to put the text above as is into a variable, double quotes and all?

Your example string contains two characters that need escaping: " and \. Inside a double-quoted string literal, escape each of them with a backslash:
fmt.Println("\"C:\\Program Files\\Sublime Text 3\\sublime_text.exe\" C:\\Users\\User\\Desktop\\Guess.py")
You can also use back quotes to create what is called a raw string literal, in which backslashes have no special meaning and double quotes need no escaping:
fmt.Println(`"C:\Program Files\Sublime Text 3\sublime_text.exe" C:\Users\User\Desktop\Guess.py`)
List of escapes:
\a U+0007 alert or bell
\b U+0008 backspace
\f U+000C form feed
\n U+000A line feed or newline
\r U+000D carriage return
\t U+0009 horizontal tab
\v U+000B vertical tab
\\ U+005C backslash
\' U+0027 single quote (valid escape only within rune literals)
\" U+0022 double quote (valid escape only within string literals)
See the official docs.
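As a quick check that the two literal forms above denote the same value, here is a minimal sketch. The names interpreted and raw are made up for illustration; strconv.Quote is the standard library function that produces the escaped, double-quoted form of a string.

```go
package main

import (
	"fmt"
	"strconv"
)

// interpreted is a double-quoted literal, so every " and \ must be escaped.
const interpreted = "\"C:\\Program Files\\Sublime Text 3\\sublime_text.exe\" C:\\Users\\User\\Desktop\\Guess.py"

// raw is a back-quoted literal, so the text is written exactly as it should appear.
const raw = `"C:\Program Files\Sublime Text 3\sublime_text.exe" C:\Users\User\Desktop\Guess.py`

func main() {
	// Both literals produce the same string value.
	fmt.Println(interpreted == raw)

	// The value itself, with no escapes visible.
	fmt.Println(raw)

	// strconv.Quote goes the other way: it renders the value as an
	// escaped, double-quoted Go literal.
	fmt.Println(strconv.Quote(raw))
}
```

A raw string literal cannot contain a back quote, so text that includes one still needs the interpreted form.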

Related

How to unescape special characters in Rust?

For example, if you have an escaped string like
Hello wo\\\\rld.txt
you want to unescape it to get
Hello wo\\rld.txt
essentially mapping
\\ -> \, \\r -> \r, \\n -> \n,
and so on.
I tried doing string replace like:
out = out.replace("\\", "\");
but that is a syntax error.
In "\", the backslash escapes the closing quote, so the literal never terminates. You still have to escape every \ in your string literals, i.e. out = out.replace("\\\\", "\\");
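The replace call above handles only the doubled backslash. Once \n, \r and the rest are added as further replace calls, the order of the calls starts to matter, because an earlier pass can create text that a later pass misreads. A single left-to-right pass avoids that. The sketch below shows the idea in Go rather than Rust; the function name unescape and the chosen set of escapes are assumptions for illustration. strings.NewReplacer scans the input once and never rescans its own output.

```go
package main

import (
	"fmt"
	"strings"
)

// escapes maps each two-character escape to the character it stands for.
// Raw (back-quoted) literals are used for the escaped side so that the
// backslashes are written exactly once.
var escapes = strings.NewReplacer(
	`\\`, `\`,
	`\n`, "\n",
	`\r`, "\r",
	`\t`, "\t",
)

// unescape converts the escapes listed above in one left-to-right pass.
func unescape(s string) string {
	return escapes.Replace(s)
}

func main() {
	// Four backslashes in the input become two in the output.
	fmt.Println(unescape(`Hello wo\\\\rld.txt`))

	// An escaped backslash followed by n stays a backslash and an n;
	// it is not turned into a line feed.
	fmt.Println(unescape(`a\\n`))
}
```

Chaining separate replacements instead (first \\ to \, then \n to a line feed) would turn the second input into "a" followed by a line feed, which is not what the escaped text meant.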

how to find special characters like \t, \n, \u and the single quotation mark (') in presto

I'm using Presto. I have a table that contains address information in a varchar column.
How do I find addresses that contain special characters like:
\t (tab)
\n (newline)
\u
single quotation mark (')
You can use LIKE with a literal containing a newline. A convenient way is to use Unicode escapes for this (newline \n is U+000A in Unicode):
col LIKE U&'%\000A%'
U&'...' creates string literal, just like '...'.
The only difference is that U&'...' supports \hhhh escapes for Unicode.
Example:
presto:default> SELECT 'abc
-> def' LIKE U&'%\000A%';
_col0
-------
true
(1 row)
Tested on Presto 324.

how do you count and replace a string in a text file that starts at the end of one line and continues on the next using linux commands?

I have a large (4 GB) Windows .csv text file (each line ends in "\r\n") in a Linux environment. It was supposed to be a delimited file (delimiter = '|', text qualifier = '"') with each field separated by a pipe and enclosed in double quotes. Any narrative text field with embedded double quotes was supposed to have each double quote escaped with a second double quote (i.e. "the quick "brown" fox" was supposed to be represented as "the quick ""brown"" fox"). Unfortunately, the embedded double quotes were not escaped. Further, the text fields may include embedded line breaks (i.e. Windows CRLF, \r\n), which need to be retained.
Sample lines might look as follows:
"1234567890123456"|"2016-07-30"|"2016-08-01"|"123"|"456"|"789"|"text narrative field starts\r\n
with text lines that may have embedded double quotes "For example"\r\n
and may include measurements such as 1/2" x 2" with \r\n
the text continuing and includes embedded line breaks \r\n
which will finally be terminated with a double quote"\r\n
"9876543210654321"|"2017-01-31"|"2018-08-01"|"123"|"456"|"789"|"text narrative field"\r\n
"2345678901234567"|"...."\r\n
with the objective to have the output appear as follows:
~1234567890123456~|~2016-07-30~|~2016-08-01~|~123~|~456~|~789~|~text narrative field starts\r\n
with text lines that may have embedded double quotes ""For example""\r\n
and may include measurements such as 1/2"" x 2"" with \r\n
the text continuing and includes embedded line breaks \r\n
which will finally be terminated with a double quote~\r\n
~9876543210654321~|~2017-01-31~|~2018-08-01~|~123~|~456~|~789~|~text narrative field~\r\n
~2345678901234567~|~....~\r\n
The solution I was attempting to implement was to:
1. SUCCESSFUL: change all the "|" sequences to ~|~
2. SUCCESSFUL: change the double quote (") at the start of the first line and at the end of the last line to a tilde (~)
3. change the ending and starting double quotes to tildes wherever a line ends in a double quote followed by CRLF (e.g. ..."\r\n) and the next line begins with a double quote, a 16-digit number and a tilde (e.g. "1234567890123456~...), i.e. wherever a new record starts
4. convert all remaining double quote characters to two successive double quotes (change " to "")
5. then reverse the first 3 steps above, changing all ~ back to double quotes.
I started by using sed to replace all strings with double quote, followed by a pipe, followed by a double quote (i.e. "|") with a tilde, pipe, tilde (i.e. ~|~). I then manually replaced the first and last doublequote in the file with a tilde.
This is where I ran into issues as I tried to count the number of occurrences where a line ends with a doublequote(") and the start of the next line begins with a doublequote followed by a 16 digit number and a "~" which will tell me the actual number of csv records in the file (minus one) as opposed to the number of lines. I attempted to do this using grep: grep '"\r\n"\d{16}~' | wc -l but that didn't work
I then need to replace those double quotes wherein a double quote ends a record and the succeeding record begins with a double quote followed by a 16 digit number and a "~" leaving everything else intact.
I tried to use sed: sed 's/"\r\n"(\d{16}~)/~\r\n~\1' windows_file.txt but it is not working as hoped.
I would welcome any recommendations as to how to accomplish the above.
The script below does what you expect using awk, except for the very last line of the file, since it cannot know where the final record ends.
That could be fixed by counting the lines in the file first, but it would be impractical for a file this large.
Looking at the data structure, records are separated by "\r\n" and fields by "|", so let's use those as awk's separators.
gawk 'BEGIN{
    RS="\"\r\n\""   # input record separator: two double quotes with a DOS line ending in between
    FS="\"\\|\""    # input field separator: two double quotes with a pipe in between
    ORS="~\r\n~"    # output record separator
    OFS="~|~"       # output field separator
} {
    $1=$1           # assigning a field makes awk rebuild $0 using OFS
    if (NR == 1) {  # first record: replace the leading double quote
        print "~" substr($0,2)
    } else {
        print $0
    }
} ' test.txt
Result (assuming lines end with \r\n):
~1234567890123456~|~2016-07-30~|~2016-08-01~|~123~|~456~|~789~|~text narrative field starts
with text lines that may have embedded double quotes "For example"
and may include measurements such as 1/2" x 2" with
the text continuing and includes embedded line breaks
which will finally be terminated with a double quote~
~9876543210654321~|~2017-01-31~|~2018-08-01~|~123~|~456~|~789~|~text narrative field~
~10654321~|~2018-09-31~|~2018-08-01~|~123~|~456~|~789~|~asdasdasdasdad asasda"
~
~
PS: this will break if a field contains a line that starts with " and the preceding line within the same field ends with "\r\n, since that pattern matches the proposed RS. For example:
"10654321"|"2018-09-31"|"2018-08-01"|"123"|"456"|"789"|"asdasdasdasdad asasda"\r\n
"some more"\r\n
"22222"|".... (another record)

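For the CSV question above, the awk answer only swaps the delimiters for tildes. The remaining steps (doubling the embedded quotes and restoring the delimiters) can be folded into one transformation. The following is a minimal in-memory sketch in Go, not a drop-in tool for a 4 GB file; the function name escapeQuotes is hypothetical. It assumes the data starts with a double quote and ends with "\r\n, and it shares the awk script's limitation: the sequences "\r\n" and "|" must never occur inside a field.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	recordSep = "\"\r\n\"" // closing quote, CRLF, opening quote of the next record
	fieldSep  = `"|"`      // closing quote, pipe, opening quote of the next field
)

// escapeQuotes doubles every double quote that is part of a field's text
// and leaves the quotes that act as text qualifiers untouched.
func escapeQuotes(data string) string {
	// Strip the opening quote of the first record and the closing
	// quote plus line ending of the last one.
	body := strings.TrimPrefix(data, `"`)
	body = strings.TrimSuffix(body, "\"\r\n")

	records := strings.Split(body, recordSep)
	for i, rec := range records {
		fields := strings.Split(rec, fieldSep)
		for j, f := range fields {
			// Any quote still present belongs to the field text.
			fields[j] = strings.ReplaceAll(f, `"`, `""`)
		}
		records[i] = strings.Join(fields, fieldSep)
	}

	// Put the qualifiers and line endings back.
	return `"` + strings.Join(records, recordSep) + "\"\r\n"
}

func main() {
	in := "\"1\"|\"a \"b\" 1/2\" x\r\nmore\"\r\n\"2\"|\"c\"\r\n"
	fmt.Printf("%q\n", escapeQuotes(in))
}
```

For the real file, the same splitting logic would have to be applied to a stream (for example with a bufio.Scanner and a custom split function) rather than to one string.
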
Unescape a string with escaped sequences in Delphi

I use Delphi 5 and have a String like this from a http-connection:
str :='content=bell=7'#$8'size=20'#$8'other1'#$D#$A#$8'other2'
This string contains some sequences with escape characters and I want to unescape these characters. If I use the Trim function, the escape sequences are still inside. Maybe this is because '#$8' is not a printable character?
How can I replace '#$8' separately, for example with '&', so that I get the string:
str1 :='content=bell=7&size=20&other1'#$D#$A'&other2'
After this I can use trim to unescape the other sequences.
str2 :='content=bell=7&size=20&other1#13#10&other2'
Those are Delphi character sequences. The compiler interprets them as it processes your source file. It converts #$8 into a backspace character in the string. If you want to replace that character with something else, you could call StringReplace. (If that's your real code, then you could just skip the extra function call and use the desired characters in the string literal directly in your code.)
str2 := StringReplace(str1, #8, '&', [rfReplaceAll]);
Trim removes whitespace from the start and end of a string, but your characters aren't at either end.

How to use backslash escape char for new line in JavaCC?

I have an assignment to create a lexical analyser and I've got everything working except for one bit.
I need to create a string that will accept a new line, and the string is delimited by double quotes.
The string accepts any number, letter, some specified punctuation, backslashes and double quotes within the delimiters.
I can't seem to figure out how to escape a new line character.
Is there a certain way of escaping characters like new line and tab?
Here's some of my code that might help
< STRING : ( < QUOTE> (< QUOTE > | < BACKSLASH > | < ID > | < NUM > | " " )* <QUOTE>) >
< #QUOTE : "\"" >
< #BACKSLASH : "\\" >
So my string should allow for a quote, then any of the following characters like a backslash, a whitespace, a number etc, and then followed by another quote.
The newline char like "\n" is what's not working.
Thanks in advance!
For string literals, JavaCC borrows the syntax of Java. So, a single-character literal comprising a carriage return is escaped as "\r", and a single-character literal comprising a line feed is escaped as "\n".
However, the processed string value is just a single character; it is not the escape itself. So, suppose you define a token for line feed:
< LF : "\n" >
A match of the token <LF> will be a single line-feed character. When substituting the token in the definition of another token, the single character is effectively substituted. So, suppose you have the higher-level definition:
< STRING : "\"" ( <LF> ) "\"" >
A match of the token <STRING> will be three characters: a quotation mark, followed by a line feed, followed by a quotation mark. What you seem to want instead is for the escape sequence to be recognized:
< STRING : "\"" ( "\\n" ) "\"" >
Now a match of the token <STRING> will be four characters: a quotation mark, followed by an escape sequence representing a line feed, followed by a quotation mark.
In your current definition, I see that other often-escaped metacharacters like quotation mark and backslash are also being recognized literally, rather than as escape sequences.
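The difference between a real line feed and its two-character escape is easy to see outside JavaCC as well. The sketch below uses Go instead of a JavaCC grammar; the names withLF, withEscape and stringLit are made up for illustration, and the regular expression is only an assumption about what a string token with escape sequences might accept (a backslash followed by any one character, or any character other than a quote or a backslash).

```go
package main

import (
	"fmt"
	"regexp"
)

const (
	withLF     = "\"\n\"" // quote, real line feed, quote: 3 characters
	withEscape = `"\n"`   // quote, backslash, n, quote: 4 characters
)

// stringLit matches a double-quoted literal whose body consists of
// two-character escapes or of characters other than " and \.
var stringLit = regexp.MustCompile(`^"(?:\\.|[^"\\])*"$`)

func main() {
	fmt.Println(len(withLF), len(withEscape)) // prints "3 4"

	fmt.Println(stringLit.MatchString(withEscape)) // true: \n is consumed as an escape
	fmt.Println(stringLit.MatchString(`"a\"b"`))   // true: the inner quote is escaped
	fmt.Println(stringLit.MatchString(`"a"b"`))    // false: the inner quote is bare
}
```

The same shape carries over to a token definition: the alternative for an escape has to consume the backslash and the following character together, so that an escaped quote does not end the string.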

Resources