How to trim multi-space lines but keep single-space lines in Dart?

I am trying to trim a string that contains both multi-line whitespace and single spaces. I only want to remove the multi-line whitespace, which varies in length, e.g. " ", " ".
Can anyone help me with this? Thanks in advance.
I tried trim and replaceAll with a regular expression (\s) pattern, but they do not give the results I want.

Line breaks are encoded by special characters that are different to a regular space. Usually, they are represented by "\n", whereas a whitespace is just " ".
So for your use case, a command like
modifiedString = originalString.replaceAll("\n", "");
will do the trick. It replaces every newline with a zero-length string, thus removing them.
If you want to dive deep into the topic, it will also be interesting to look into the differences of "\r" and "\n".
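The question can be read two ways: removing line breaks, or collapsing runs of several spaces while keeping single spaces. The thread is about Dart, so the following is only a Python sketch of both readings; the function names are hypothetical and not part of any library.

```python
import re


def strip_line_breaks(text: str) -> str:
    # Remove carriage returns and newlines; ordinary spaces are untouched.
    return text.replace("\r", "").replace("\n", "")


def collapse_space_runs(text: str) -> str:
    # Replace every run of two or more spaces with a single space.
    return re.sub(r" {2,}", " ", text)


print(strip_line_breaks("one\ntwo\r\nthree"))
print(collapse_space_runs("keep this   but  not    that"))
```

Which of the two applies depends on what the whitespace in the original string actually consists of, so it is worth inspecting the string for "\n" and "\r" first.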

How do you add a single backslash to a string in Golfscript?

I'm having a bit of an issue trying to do an ascii art challenge in GS, since it requires you finishing a line with the \ symbol.
The problem is that "\"p breaks the program since it thinks you escaped a quote, and "\\"p prints two backslashes. I've tried string concatenation, removing one character at a time, printing substrings, etc. Nothing seems to work!
I need this string to be printed out, so how would this be done?
It seems that the behavior with p is buggy. I'll look for a place to report it.
However, "\\" by itself does not print two backslashes; it prints one.
Here's a test link to prove it.
Output:
\
"\\" creates a string with 1 backslash because strings are escaped. This is the same as languages like Ruby.
p escapes strings, so a string with one backslash will be displayed as two. This is also the same as languages like Ruby.
So if you want to print a single backslash, or print things without the quotes, you need to print unescaped strings. The best way to do this is with implicit IO (anything on the stack that is left over is printed unescaped).
The program
"\\"
should print
\
You could also use print or puts if you don't want to use implicit IO.
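The same distinction between an escaped literal and an escaped (inspect-style) display exists in other languages. As an illustration only, here it is in Python rather than GolfScript, with repr playing the role that p plays above:

```python
s = "\\"          # escaped literal: the string holds ONE backslash

print(len(s))     # 1
print(s)          # \     (unescaped output)
print(repr(s))    # '\\'  (escaped display, quotes included)
```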

How to break a string over multiple lines and preserve spaces in YAML?

Please note that this question is similar to this one, but different enough that those answers won't solve my problem:
For insertion of control characters like e.g. \x08, it seems that I have to use double quotes ".
All spaces need to be preserved exactly as given. For line breaks I explicitly use \n.
I have some string data which I need to store in YAML, e.g.:
" This is my quite long string data "
"This is my quite long string data"
"This_is_my_quite_long_string_data"
"Sting data\nwhich\x08contains control characters"
and need it in YAML as something like this:
Key: " This is my" +
" quite long " +
" string data "
This is no problem as long as I stay on a single line, but I don't know how to put the string content to multiple lines.
YAML block scalar styles (>, |) won't help here, because they don't allow escaping and they even do some whitespace stripping, newline / space substitution which is useless for my case.
It looks like the only way is to use double quotes " and backslashes \, like this:
Key: "\
This is \
my quite \
long string data\
"
Trying this in YAML online parser results in "This is my quite long string data" as expected.
But it unfortunately fails if one of the "sub-lines" has a leading space, like this:
Key: "\
This is \
my quite\
 long st\
ring data\
"
This results in "This is my quitelong string data": the space between the words "quite" and "long" has been removed. The only solution that comes to mind is to replace the first leading space of each sub-line with \x20, like this:
Key: "\
This is \
my quite\
\x20long st\
ring data\
"
Since I chose YAML to get the most human-readable format possible, I find \x20 a rather ugly solution. Does anyone know a better approach?
To keep it human-readable, I also don't want to use !!binary for this.
Instead of \x20, you can simply escape the first non-indentation space on the line:
Key: "\
This is \
my quite\
\ long st\
ring data\
"
This works with multiple spaces, you only need to escape the first one.
You are right in your observation that control characters can only be represented in double quoted scalars.
However the parser doesn't fail if the sub-lines (in YAML speak: continuation lines) have a leading space. It is your interpretation of the YAML standard that is incorrect. The standard explicitly states that for multi-line double quoted scalars:
All leading and trailing white space characters are excluded from the content.
So you can put as many spaces as you want before "long"; it will not make a difference.
The representer for double quoted scalars for Python (both in ruamel.yaml and PyYAML) always does represent newlines as \n. I am not aware of YAML representers in other languages where you have more control over this (and e.g. get double newlines to represent \n in your double quoted scalars). So you probably have to write your own representer.
While writing a representer you can try to make the line breaking be smart, in that it minimizes the number of escaped spaces (by putting them between words on the same line). But especially on strings with a high double space to word ratio, combined with a small width to operate in, it will be hard (if not impossible) to do without escaped spaces.
Such a representer should IMO first check whether double quoting is necessary (i.e. there are control characters apart from newlines). If not, and there are newlines, you are probably better off representing the string as a block style literal scalar (for which spaces at the beginning or end of a line are not excluded).
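A hand-written representer along these lines could be sketched as follows. This is a minimal standard-library illustration, not ruamel.yaml or PyYAML code: the names escape_chunk and fold_double_quoted are hypothetical, the line breaking is a naive fixed-width split rather than the "smart" breaking described above, and only ASCII control characters are escaped.

```python
def escape_chunk(chunk: str) -> str:
    # Escape characters that need it inside a YAML double-quoted scalar.
    out = []
    for ch in chunk:
        if ch == "\\":
            out.append("\\\\")
        elif ch == '"':
            out.append('\\"')
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\t":
            out.append("\\t")
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            out.append("\\x%02x" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


def fold_double_quoted(key: str, value: str, width: int = 8, indent: str = "  ") -> str:
    # Split the RAW value first so that no escape sequence is cut in half.
    chunks = [value[i:i + width] for i in range(0, len(value), width)]
    lines = [key + ': "\\']
    for chunk in chunks:
        text = escape_chunk(chunk)
        # Leading white space on a continuation line is dropped by the parser,
        # so the first leading space is escaped to keep it as content.
        if text.startswith(" "):
            text = "\\" + text
        lines.append(indent + text + "\\")
    lines.append(indent + '"')
    return "\n".join(lines)


print(fold_double_quoted("Key", "This is my quite long string data"))
```

With the default width of 8, the third continuation line comes out as "\ long st\", which is the escaped-space form shown in the first answer. Checking the result with a real YAML parser is advisable before relying on it.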

Haskell new line character

I need to compare Windows new line character ('\r\n') but I get
lexical error in string/character literal at character '\\'
['\r\n']
How can I solve this?
Thanks!
Note that for files/streams in the default text mode on Windows, "\r\n" is automatically converted to "\n" on input, and vice versa on output.
If you have special needs that the default doesn't handle, you may wish to look into the System.IO.hSetNewlineMode function, which allows you to set the newline conversion used for a specific handle. For example:
hSetNewlineMode handle universalNewlineMode
sets a handle to accept either "\r\n" or "\n" as newline (internally "\n") on input, but uses the native OS convention on output.
The compiler is telling you that '\r\n' is not a valid character. This should not be surprising since Windows uses two characters to signify the end of a line. This means that you need a String:
"\r\n"
This also means that you will need more complex analysis when parsing input. Looking for a sequence of two characters is a little more difficult than looking for a single character.
One solution is to remove all '\r' characters with a simple filter applied to every String that might contain "\r\n":
deMicrosoftifyString = filter (/= '\r')
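Filtering out every '\r' is not quite the same as converting "\r\n" to "\n": the filter also removes carriage returns that are not part of a line ending. The difference is shown below as a Python sketch rather than Haskell; both function names are hypothetical.

```python
def normalize_crlf(text: str) -> str:
    # Convert only the two-character Windows line ending.
    return text.replace("\r\n", "\n")


def drop_all_cr(text: str) -> str:
    # Equivalent in spirit to `filter (/= '\r')`: removes every carriage return.
    return text.replace("\r", "")


sample = "first\r\nsecond\rstill second"
print(repr(normalize_crlf(sample)))
print(repr(drop_all_cr(sample)))
```

For input that only ever contains "\r\n" line endings the two give the same result, so the simpler filter is usually sufficient.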

Replace character with a safe character and vice-versa

Here's my problem:
I need to store sentences "somewhere" (it doesn't matter where).
The sentences must not contain spaces.
When I extract the sentences from that "somewhere", I need to restore the spaces.
So, before storing the sentence "I am happy" I could replace the spaces with a safe character, such as &. In C#:
theString.Replace(' ', '&');
This would yield 'I&am&happy'.
And when retrieving the sentence, I would do the reverse:
theString.Replace('&', ' ');
But what if the original sentence already contains the '&' character?
Say I did the same thing with the sentence 'I am happy & healthy'. With the design above, the string would come back with the '&' turned into a space, so 'happy' and 'healthy' would be separated only by spaces and the original '&' would be lost.
(Of course, I could change the & character to a more unlikely symbol, such as ¤, but I want this to be bulletproof)
I used to know how to solve this, but I forgot how.
Any ideas?
Thanks!
Fredrik
Maybe you can use url encoding (percent encoding) as an inspiration.
Characters that are not valid in a url are escaped by writing %XX where XX is a numeric code that represents the character. The % sign itself can also be escaped in the same way, so that way you never run into problems when translating it back to the original string.
There are probably other similar encodings, and for your own application you can use an & just as well as a %, but by using an existing encoding like this, you can probably also find existing functions to do the encoding and decoding for you.
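As an illustration of the idea, Python's standard library ships percent-encoding functions (urllib.parse.quote and urllib.parse.unquote). The question is about C#, so this is only a sketch of the round trip in another language; the sentence is the one from the question.

```python
from urllib.parse import quote, unquote

original = "I am happy & healthy"

# safe="" asks quote() to escape every reserved character, including "/".
stored = quote(original, safe="")
restored = unquote(stored)

print(stored)    # I%20am%20happy%20%26%20healthy
print(restored)  # I am happy & healthy
```

Because the escape character % is itself escaped (as %25), a sentence that already contains % or & survives the round trip unchanged.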

New lines in tab delimited or comma delimited output

I am looking for some best practices as far as handling csv and tab delimited files.
For CSV files I am already doing some formatting if a value contains a comma or double quote but what if the value contains a new line character? Should I leave the new line intact and encase the value in double quotes + escape any double quotes within the value?
Same question for tab delimited files. I assume the answer would be very similar if not the same.
Usually you keep \n unaltered, relying on the fact that the newline character is enclosed in a double-quoted string. This doesn't create ambiguities, but it's really ugly if you have to look at the file in a normal text editor.
But that is how you should do it, since you don't escape anything inside a string in a CSV except for the double quote itself.
@Jack is right that your best bet is to keep the \n unaltered, since you'll expect it inside double quotes in that case.
As with most things, I think consistency here is key. As far as I know, your values only need to be double-quoted if they span multiple lines, contain commas, or contain double-quotes. In some implementations I've seen, all values are escaped and double-quoted, since it makes the parsing algorithm simpler (there's never a question of escaping and double-quoting, and the reverse on reading the CSV).
This isn't the most space-optimized solution, but makes reading and writing the file a trivial affair, for both your own library and others that may consume it in the future.
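The quoting rules described above can be seen with Python's standard csv module, used here only as one example of a CSV library. The helper names write_rows and read_rows and the sample data are hypothetical.

```python
import csv
import io

rows = [
    ["id", "note"],
    ["1", 'first line\nsecond "quoted" line, with comma'],
]


def write_rows(data):
    # newline="" stops the buffer from translating line endings itself.
    buf = io.StringIO(newline="")
    csv.writer(buf).writerows(data)
    return buf.getvalue()


def read_rows(text):
    return list(csv.reader(io.StringIO(text, newline="")))


text = write_rows(rows)
print(text)
print(read_rows(text) == rows)
```

With the default settings the writer quotes only the fields that need it, leaves the embedded newline as-is inside the quotes, and doubles the embedded double quotes. Passing quoting=csv.QUOTE_ALL to csv.writer gives the "quote every value" variant mentioned above.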
For TSV, if you want lossless representation of values, the "Linear TSV" specification is worth considering: http://paulfitz.github.io/dataprotocols/linear-tsv/index.html
For obvious reasons, most such conventions adhere to the following at a minimum:
\n for newline,
\t for tab,
\r for carriage return,
\\ for backslash
Some tools add \0 for NUL.
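The four escapes listed above can be sketched in Python as follows. This is an illustration of the convention only, not a complete implementation of the Linear TSV specification; the function names are hypothetical.

```python
import re

_ESCAPES = {"\\": "\\\\", "\n": "\\n", "\t": "\\t", "\r": "\\r"}
_UNESCAPES = {escaped: raw for raw, escaped in _ESCAPES.items()}


def escape_field(value: str) -> str:
    # One pass over the value, so backslashes are never escaped twice.
    return re.sub(r"[\\\n\t\r]", lambda m: _ESCAPES[m.group(0)], value)


def unescape_field(value: str) -> str:
    # One pass again, so an escaped backslash followed by "n" stays as-is.
    return re.sub(r"\\[\\ntr]", lambda m: _UNESCAPES[m.group(0)], value)


def write_row(fields):
    return "\t".join(escape_field(f) for f in fields) + "\n"


def read_row(line):
    return [unescape_field(f) for f in line.rstrip("\n").split("\t")]


row = ["plain", "tab\there", "two\nlines", "back\\nslash"]
line = write_row(row)
print(repr(line))
print(read_row(line) == row)
```

Since every tab and newline inside a value is escaped, each record occupies exactly one physical line and splitting on the raw tab character is unambiguous.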
