I use Delphi 5 and have a String like this from a http-connection:
str :='content=bell=7'#$8'size=20'#$8'other1'#$D#$A#$8'other2'
This string contains escape sequences, and I want to unescape them. If I use the Trim function, the escape sequences are still there. Maybe this is because '#$8' is not a printable character?
How can I replace '#$8' on its own, for example with '&', so that I get this string:
str1 :='content=bell=7&size=20&other1'#$D#$A'&other2'
After this I can use trim to unescape the other sequences.
str2 :='content=bell=7&size=20&other1#13#10&other2'
Those are Delphi character sequences. The compiler interprets them as it processes your source file. It converts #$8 into a backspace character in the string. If you want to replace that character with something else, you could call StringReplace. (If that's your real code, then you could just skip the extra function call and use the desired characters in the string literal directly in your code.)
str1 := StringReplace(str, #8, '&', [rfReplaceAll]);
Trim removes whitespace from the start and end of a string, but your characters aren't at either end.
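For comparison, the same two behaviours can be sketched in Python rather than Delphi (the variable names are only illustrative): trimming touches only the ends of a string, while a replace call reaches the backspace characters in the middle.

```python
# Same string as in the question: \x08 is the backspace that Delphi writes as #$8.
raw = "content=bell=7\x08size=20\x08other1\r\n\x08other2"

# strip() only removes whitespace at the ends, so nothing changes here.
assert raw.strip() == raw

# replace() substitutes every backspace, wherever it occurs.
replaced = raw.replace("\x08", "&")
print(repr(replaced))  # 'content=bell=7&size=20&other1\r\n&other2'
```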
I'm new to Perl, and I'm trying to print out the folderName from mork files (from Thunderbird).
From: https://github.com/KevinGoodsell/mork-converter/blob/master/doc/mork-format.txt
The second type of special character sequence is a dollar sign
followed by two hexadecimal digits which give the value of the
replacement byte. This is often used for bytes that are non-printable
as ASCII characters, especially in UTF-16 text. For example, a string
with the Unicode snowman character (U+2603):
☃snowman☃
may be represented as UTF-16 text in an Alias this way:
<(83=$03$26s$00n$00o$00w$00m$00a$00n$00$03$26)>
From all the Thunderbird files I've seen it's actually encoded in UTF-8 (2 to 4 bytes).
The following characters need to be escaped (with \) within the string to be used literally: $, ) and \
Example: aaa\$AA$C3$B1b$E2$98$BA$C3$AD\\x08 should print aaa$AAñb☺í\x08
$C3$B1 is ñ; $E2$98$BA is ☺; $C3$AD is í
I tried using a regex to replace each unescaped $ with \x
my $unescaped = qr/(?<!\\)(?:(\\\\)*)/;
$folder =~ s/$unescaped\$/\\x/g;
$folder =~ s/\\([\\$)])/$1/g; # unescape "\ $ )"
Within Perl it just prints the literal string.
My workaround is feeding it into bash's printf and it succeeds... unless there's a literal "\x" in the string
$ folder=$(printf "$(mork.pl 8777646a.msf)")
$ echo "$folder"
aaa$AAñb☺í
Questions I consulted:
Convert UTF-8 character sequence to real UTF-8 bytes
But it seems it interprets every byte by itself, not in groups.
In Perl, how can I convert an array of bytes to a Unicode string?
I don't know how to apply this solution to my use case.
Is there any way to achieve this in Perl?
The following substitution seems to work for your input:
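s/\\([\$\\)])|\$([0-9A-Fa-f]{2})/defined $2 ? chr hex $2 : $1/ge;
The first alternative captures an escaped $, ) or \ and replaces it with the bare character. Otherwise, the second alternative captures the two hex digits after $ and converts them to the corresponding byte.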
If you want to work with the result in Perl, don't forget to decode it from UTF-8.
use Encode qw(decode);
my $chars = decode('UTF-8', $bytes);
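The two steps (undo the escapes at byte level, then decode the bytes as UTF-8) can also be sketched in Python. This is only an illustration under the escaping rules quoted in the question; `unescape_mork` is a hypothetical helper name, not part of any Mork library.

```python
import re

# \$, \) or \\ stands for the literal character; $XX is the byte with hex value XX.
_TOKEN = re.compile(rb"\\([$)\\])|\$([0-9A-Fa-f]{2})")

def unescape_mork(raw: bytes) -> str:
    """Hypothetical helper: undo Mork escapes, then decode the result as UTF-8."""
    def repl(match):
        if match.group(2) is not None:
            # $XX: turn the two hex digits into a single byte.
            return bytes([int(match.group(2), 16)])
        # Backslash escape: keep only the escaped character.
        return match.group(1)
    # Substitute on bytes first so multi-byte UTF-8 sequences are reassembled
    # before decoding.
    return _TOKEN.sub(repl, raw).decode("utf-8")

sample = rb"aaa\$AA$C3$B1b$E2$98$BA$C3$AD\\x08"
result = unescape_mork(sample)  # "aaa$AAñb☺í" followed by a literal \x08
```

Working on bytes and decoding once at the end is what keeps $C3$B1 together as one character instead of two.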
I am trying to make a string variable containing :
"C:\Program Files\Sublime Text 3\sublime_text.exe" C:\Users\User\Desktop\Guess.py
Unfortunately I am not succeeding in doing so. Is there a way to put the text above as is into a variable, double quotes and all?
In your example string you have characters that need escaping: " and \
fmt.Println("\"C:\\Program Files\\Sublime Text 3\\sublime_text.exe\" C:\\Users\\User\\Desktop\\Guess.py")
You can also use back quotes to create what is called a raw string which doesn't require escaping those characters.
fmt.Println(`"C:\Program Files\Sublime Text 3\sublime_text.exe" C:\Users\User\Desktop\Guess.py`)
List of escapes:
\a   U+0007   alert or bell
\b   U+0008   backspace
\f   U+000C   form feed
\n   U+000A   line feed or newline
\r   U+000D   carriage return
\t   U+0009   horizontal tab
\v   U+000B   vertical tab
\\   U+005C   backslash
\'   U+0027   single quote (valid escape only within rune literals)
\"   U+0022   double quote (valid escape only within string literals)
See the official docs.
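The same split between an interpreted literal and a raw literal exists in other languages. As a rough comparison only, here is a Python sketch (Python's `r` prefix is an analogue of Go's back quotes, not an equivalent: the quoting rules differ in detail).

```python
# Interpreted literal: every backslash and double quote has to be escaped.
escaped = "\"C:\\Program Files\\Sublime Text 3\\sublime_text.exe\" C:\\Users\\User\\Desktop\\Guess.py"

# Raw literal (r prefix): backslashes are kept as written. Single quotes delimit
# the literal, so the double quotes inside need no escaping either.
raw = r'"C:\Program Files\Sublime Text 3\sublime_text.exe" C:\Users\User\Desktop\Guess.py'

# Both spellings produce the same string.
assert escaped == raw
print(raw)
```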
Does string.gsub recognize the newline character in a string literal? I have a scenario in which I am trying to gsub a portion of a string indicated by a given operator from the start of the operator to the newline like so:
local function removeComments(str, operator)
    local new_Sc = (str):gsub(operator..".*\n", "");
    return new_Sc;
end
local source = [[
int hi = 123; //a basic comment
char ok = "abc"; //another comment
]];
source = removeComments(source, "//");
print(source);
however in the output I see that it removed the rest of the string literal after the first comment:
int hi = 123;
I tried using the literal newline character by using string.char(10) like so (str):gsub(operator..".*"..string.char(10), ""); however I still got the same output; it removes the comment and the rest of the string instead of the start of the comment to the newline.
So is there any way to gsub a string literal for a pattern containing a newline character?
Thanks
The problem you are facing is akin to greedy vs. lazy matching in regular expressions (.* vs .*?).
In Lua patterns, X.*\n means "match X, then match as many characters as possible, followed by a newline". gsub has no special handling for newlines, and . matches them too, so the match runs on to the last newline and swallows everything in between. You want to match as few characters as possible, which is written .- in Lua patterns.
Also, I am not sure if it is intended or not, but this strategy will not remove the comment from the last line, if it is not (properly) ended by a newline. I am not sure if it can be represented by a single pattern, but this function will remove comments from all lines:
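local function removeComments(str, operator)
    -- Lazy match (.-) stops at the first newline; the newline itself is kept.
    local new_Sc = str:gsub(operator..".-\n", "\n");
    -- Remove a comment on a last line that has no terminating newline.
    new_Sc = new_Sc:gsub(operator.."[^\n]*$", "");
    return new_Sc;
end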
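The greedy versus lazy difference can be reproduced with regular expressions. The following is a sketch in Python, not Lua; `re.DOTALL` is used only so that `.` also matches newlines, as it does in Lua patterns.

```python
import re

source = 'int hi = 123; //a basic comment\nchar ok = "abc"; //another comment\n'

# Greedy: .* runs to the end of the text and backtracks to the LAST newline.
greedy = re.sub(r"//.*\n", "", source, flags=re.DOTALL)

# Lazy: .*? stops at the FIRST newline after each "//"; the newline is put back.
lazy = re.sub(r"//.*?\n", "\n", source, flags=re.DOTALL)

print(repr(greedy))  # 'int hi = 123; '
print(repr(lazy))    # 'int hi = 123; \nchar ok = "abc"; \n'
```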
Say I have the following string:
"abcdefghijklmnopqrstuvwxyz"
And I think it's too long for one line in my YAML file. Is there some way to split it over several lines?
>-
abcdefghi
jklmnopqr
stuvwxyz
Would result in "abcdefghi jklmnopqr stuvwxyz" which is close, but it shouldn't have any spaces.
Use double-quotes, and escape the newline:
"abcdefghi\
jklmnopqr\
stuvwxyz"
There are some subtleties that Jesse's answer will miss.
YAML (like many programming languages) treats single and double quotes differently. Consider this document:
regexp: "\d{4}"
This will fail to parse with an error such as:
found unknown escape character while parsing a quoted scalar at line 1 column 9
Compare that to:
regexp: '\d{4}'
Which will parse correctly. To use a backslash character inside a double-quoted string, you need to escape it, as in:
regexp: "\\d{4}"
I'd also like to highlight Steve's comment about single-quoted strings. Consider this document:
s1: "this\
  is\
  a\
  test"
s2: 'this\
  is\
  a\
  test'
When parsed, you will find that it is equivalent to:
s1: thisisatest
s2: "this\\ is\\ a\\ test"
This is a direct result of the fact that YAML treats single-quoted strings as literals, while double-quoted strings are subject to escape character expansion.
I need to strip leading and trailing whitespace from a string in Tcl. How?
Try this:
string trim string ?chars?
Returns a value equal to string except that any leading or trailing characters from the set given by chars are removed. If chars is not specified then white space is removed (spaces, tabs, newlines, and carriage returns).
Original source: http://wiki.tcl.tk/10174
Try this. Note that it removes every space character in the string, not only the leading and trailing ones:
[string map {" " ""} $a];
a is your string
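The two answers do different things. A small Python sketch (an analogy only, not Tcl; the sample string is made up) shows the difference between trimming the ends and removing every space:

```python
padded = "  \t hello world \n"

# Comparable to Tcl's `string trim`: only leading and trailing whitespace goes.
trimmed = padded.strip()

# Comparable to `string map {" " ""}`: every space goes, including the one
# between the words, while tabs and newlines are left in place.
no_spaces = padded.replace(" ", "")

print(repr(trimmed))    # 'hello world'
print(repr(no_spaces))  # '\thelloworld\n'
```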