When is it acceptable to not trim a user input string?

Can someone give me a real-world scenario of a method/function with a string argument which came from user input (e.g. form field, parsed data from file, etc.) where leading or trailing spaces SHOULD NOT have been trimmed?
I can't ever recall such a situation for myself.
EDIT: Mind you, I didn't say trimming any whitespace. I said trimming leading or trailing (only) spaces (or whitespace).

Search string in any "Find" dialog in an editor.

Password input boxes. There's lots of data out there where whitespace can genuinely be considered an important part of the string. Restricting the question to leading and trailing whitespace narrows things down a lot, but there are still many examples. Stuff you pass through a PHP-style nl2br function.

If you are inputting code, there may be scenarios where whitespace at the beginning and end is necessary.
Also, look at Stack Overflow's markdown editor. Code examples are indented. If you posted just a code example, then it will require leading and trailing white space not be trimmed.

Perhaps a Whitespace interpreter.

Python....

A Stack Overflow answer, or more generally input written in Markdown (four leading spaces -> code block).

A paragraph entry.

If the input is Python code (say, for a pastebin kinda thing), you certainly can't trim leading whitespace; but you also can't trim trailing whitespace, because it could be part of a multi-line (triple-quoted) string.

I've used whitespace as a delimiter before, so there. Also, for anything that involves concatenating multiple inputs, removing leading/trailing whitespace can break formatting or possibly do worse. Aside from that, as Spencer said, for indented paragraphs you probably would not want to remove the leading whitespace.

Obviously passwords should not be trimmed. Passwords can contain leading or trailing whitespace that needs to be treated as valid characters.
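To make the password case concrete, here is a minimal Python sketch. The helper names `hash_password` and `verify`, the salt handling, and the iteration count are assumptions for illustration only, not a recommended production setup. It shows that a password stored with surrounding spaces no longer verifies once the input is trimmed.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    # Hypothetical helper: derive a key from the password exactly as typed.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def verify(candidate: str, salt: bytes, expected: bytes) -> bool:
    # Constant-time comparison of the derived keys.
    return hmac.compare_digest(hash_password(candidate, salt), expected)


salt = os.urandom(16)
stored = hash_password(" hunter2 ", salt)  # the user chose surrounding spaces

matches_untrimmed = verify(" hunter2 ", salt, stored)
matches_trimmed = verify(" hunter2 ".strip(), salt, stored)
```

Here `matches_untrimmed` is true and `matches_trimmed` is false, so trimming on the login form would lock this user out.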

Excel/CSV - putting a single quote to keep leading zeros does not work on a Mac

Others suggested that, to keep the leading zeros in Excel/CSV, I can just add a single quote character. However, this does not work properly on a Mac: the single quote shows up in the cell, which is something I do not want.
Please see image for more details.
https://social.msdn.microsoft.com/Forums/en-US/aae07b39-865f-4c68-a07f-7cad2dfd6733/how-do-i-open-csv-using-excel-without-deleting-leading-zeros?forum=isvvba
I can just use "=""00100""" to keep the leading zeros.
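As a rough sketch of how that workaround looks when the CSV is generated by a script, the Python snippet below writes the value wrapped in `="..."` and lets the standard `csv` module do the quote doubling. The column names are made up for illustration, and whether a given Excel version on Mac interprets the cell as intended is not verified here.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer)

writer.writerow(["id", "code"])
# Wrap the value in ="..." so the spreadsheet sees a formula that yields text.
writer.writerow([1, '="00100"'])

csv_text = buffer.getvalue()
```

The second row comes out as `1,"=""00100"""`, which is the same form quoted above.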

Kentico - Transformation Eval() & trailing space at the end

This must be the simplest question ever, but again, I don't know the answer. I've just noticed that (in my case) using something like Eval("Location") always creates a trailing blank space at the end of the output. Normally I don't care about that trailing space, but in one particular situation, it has to be removed. I've tried using replace() but that only works for the text itself but not the trailing blank space, such as "San Francisco " becomes "SanFrancisco ", but the trailing space still exists. Please let me know how to get rid of it. I've checked my text, and it doesn't have any space at the end.
Eval("Location").ToString().Replace(" ","")
The function you are looking for is .Trim() which will remove both trailing and leading blank spaces from a string. So you can use
Eval("Location").ToString().Trim()
However, if using .Replace() isn't removing that trailing space, then I would say the space is not coming from the field itself, but rather from something after it in the transformation.
If your code is like:
"<%#Eval("Location").ToString().Trim()%> [other content]" then there will always be a space between that field content and the rest of the content. Perhaps check the transformation and see if there is a space after you are evaluating that field?
As Brandon already mentioned, the function you're looking for is .Trim(). This only works on strings, though, so if it's not working, you will most likely need to cast that object to a string using one of the following:
ValidationHelper.GetString(Eval("Location"), "").Trim()
Eval<string>("Location").Trim()

Vim matches implicit newline characters instead of the visible \n sequences I am trying to strip

Here's some text containing literal \n sequences, which show up as visible characters in my Vim. I'm trying to replace them with actual newline characters, the kind I don't see.
But when I search using /[\n]/, what I get isn't these visible \n sequences, but instead the implicit line breaks. So I can't do a search and replace.
How should I address this? Here is the text:
The Reason that can be reasoned\n is not the eternal Reason.The name that can\n be namedis not the eternal Name. The Unnamable is of heaven and earth the beginning.\n The Namable becomes of the\n ten thousand things the mother.Therefore it is said:\n '\n\n He\n who desireless is found\n The spiritual of the world will sound.\n But he who by desire is bound\n Sees the mere shell of things around.' These two things are the same in sour ce but different in name.\n Their sameness\n is called a mystery.Indeed
it is the mystery\n
You need to search for \\n, not [\n].
Running:
%s/\\n/\r/g
should solve your problem (I have no idea why, but Vim needs \r instead of \n in the replacement).
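The same distinction between a literal backslash-n pair and a real line break exists outside Vim. As a small illustration, assuming the text has been loaded into a Python string (the sample is shortened from the question), the equivalent replacement would be:

```python
raw = r"The Reason that can be reasoned\n is not the eternal Reason."

# "\\n" is a backslash followed by the letter n; "\n" is a real line break.
fixed = raw.replace("\\n", "\n")
```

After this, `fixed` contains one real line break and no literal backslash-n pair.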

Why does this Excel string comparison fail?

Is this an Excel bug? Has anyone experienced this issue? Please help.
Just a thought but here's what MS says about TRIM
The TRIM function was designed to trim the 7-bit ASCII space character
(value 32) from text. In the Unicode character set, there is an
additional space character called the nonbreaking space character that
has a decimal value of 160. This character is commonly used in Web
pages as the HTML entity, &nbsp;. By itself, the TRIM function does
not remove this nonbreaking space character.
You might try this to replace the non-breaking space (if that is your problem here).
=TRIM(SUBSTITUTE(A5,CHAR(160),CHAR(32)))
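The effect can be reproduced outside Excel. The Python sketch below is only a rough imitation of TRIM (the function names `ascii_trim` and `nbsp_aware_trim` are made up for this example), but it shows why a value ending in a non-breaking space still fails an equality check after an ASCII-only trim.

```python
import re

NBSP = "\u00a0"  # non-breaking space, decimal 160


def ascii_trim(text: str) -> str:
    # Rough imitation of TRIM: only the ASCII space (32) is handled.
    return re.sub(" +", " ", text).strip(" ")


def nbsp_aware_trim(text: str) -> str:
    # Same idea as =TRIM(SUBSTITUTE(A5,CHAR(160),CHAR(32))).
    return ascii_trim(text.replace(NBSP, " "))


cell = "San Francisco" + NBSP
naive_equal = ascii_trim(cell) == "San Francisco"
fixed_equal = nbsp_aware_trim(cell) == "San Francisco"
```

Here `naive_equal` is false and `fixed_equal` is true, which matches the symptom of a comparison that looks correct but fails.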
I would have to agree with @Jeeped. Your formula looks correct in all aspects. It must be a non-printing character. If this data is coming from some outside source (i.e. another file), then there very well could be a non-printing character. I just typed in everything you had manually and came up with this.

New lines in tab delimited or comma delimited output

I am looking for some best practices as far as handling csv and tab delimited files.
For CSV files I am already doing some formatting if a value contains a comma or double quote but what if the value contains a new line character? Should I leave the new line intact and encase the value in double quotes + escape any double quotes within the value?
Same question for tab delimited files. I assume the answer would be very similar if not the same.
Usually you keep \n unaltered and rely on the fact that the newline character will be enclosed in a double-quoted string. This doesn't create ambiguities, but it's really ugly if you have to look at the file in a normal text editor.
But that is how you should do it, since you don't escape anything inside a string in a CSV except for the double quote itself.
@Jack is right that your best bet is to keep the \n unaltered, since a reader will expect it inside double quotes when it occurs.
As with most things, I think consistency here is key. As far as I know, your values only need to be double-quoted if they span multiple lines, contain commas, or contain double-quotes. In some implementations I've seen, all values are escaped and double-quoted, since it makes the parsing algorithm simpler (there's never a question of escaping and double-quoting, and the reverse on reading the CSV).
This isn't the most space-optimized solution, but makes reading and writing the file a trivial affair, for both your own library and others that may consume it in the future.
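As a sketch of both approaches, the snippet below uses Python's standard `csv` module (the sample rows are invented). The default writer quotes only the fields that need it, while `csv.QUOTE_ALL` quotes every value; in both cases the embedded newline is kept as-is and double quotes are doubled.

```python
import csv
import io

rows = [["id", "note"], ["1", 'line one\nline two, with a "quote"']]

# Quote only where needed (fields with commas, quotes, or line breaks).
minimal = io.StringIO()
csv.writer(minimal).writerows(rows)
minimal_text = minimal.getvalue()

# Quote every value, which keeps the format uniform.
quoted = io.StringIO()
csv.writer(quoted, quoting=csv.QUOTE_ALL).writerows(rows)
quoted_text = quoted.getvalue()

# newline="" lets the csv module handle embedded line breaks itself.
parsed_minimal = list(csv.reader(io.StringIO(minimal_text, newline="")))
parsed_quoted = list(csv.reader(io.StringIO(quoted_text, newline="")))
```

Both `parsed_minimal` and `parsed_quoted` come back equal to `rows`, so the multi-line value survives the round trip either way.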
For TSV, if you want lossless representation of values, the "Linear TSV" specification is worth considering: http://paulfitz.github.io/dataprotocols/linear-tsv/index.html
For obvious reasons, most such conventions adhere to the following at a minimum:
\n for newline,
\t for tab,
\r for carriage return,
\\ for backslash
Some tools add \0 for NUL.
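The four escapes listed above can be sketched in Python as follows. This is an illustration of the escaping idea only, not a complete implementation of the Linear TSV specification; the function names are made up, and keeping unknown escape sequences verbatim is an assumption.

```python
_ESCAPES = {"\\": "\\\\", "\n": "\\n", "\t": "\\t", "\r": "\\r"}
_UNESCAPES = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\"}


def escape_field(value: str) -> str:
    # Replace each special character with its two-character escape.
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def unescape_field(field: str) -> str:
    # Single left-to-right pass, so an escaped backslash followed by "n"
    # is not misread as a newline escape.
    out = []
    chars = iter(field)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            # Unknown escapes are kept verbatim (an assumption of this sketch).
            out.append(_UNESCAPES.get(nxt, "\\" + nxt))
        else:
            out.append(ch)
    return "".join(out)


def encode_row(values):
    # After escaping, a raw tab can only be a field separator.
    return "\t".join(escape_field(v) for v in values)


def decode_row(line):
    return [unescape_field(f) for f in line.split("\t")]


row = ["a\tb", "line1\nline2", "back\\slash", "\\n literal"]
encoded = encode_row(row)
decoded = decode_row(encoded)
```

Because every tab and newline inside a value is escaped, the encoded row stays on one physical line and `decoded` equals the original `row`.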
