Why are " preferred over ' in R - string

From help("'"):
Single and double quotes delimit character constants. They can be used
interchangeably but double quotes are preferred (and character
constants are printed using double quotes), so single quotes are
normally only used to delimit character constants containing double
quotes.
If they are interchangeable, why are double quotes preferred? I've yet to find a difference between them in my own usage. Particularly surprising is that mixed character vectors are allowable:
> c("a",'b',"c")
[1] "a" "b" "c"
Edit
I'm really asking two questions here, I guess:
Are there any situations in which ' and " behave differently?
If not, why was " chosen as the preferred version by convention?
Answers so far have been related to (2), but (1) is at least as much of interest.

I do not know of any cases where single-quotes are different than doubles. I think the preference is due to readability and to avoid potential confusion of single quotes with back-ticks which are handled differently. It's probably very hard for the eye-brain system in the wetware to pick up a mismatched back-tick paired with a single quote.
> `newfn` <- function() {}
> newfn
function() {}
> "newfn" <- function() {}
> newfn
function() {}
> 'newfn' <- function() {}
> newfn
function() {}
> var <- c(`a`, "b", 'c')
Error: object 'a' not found
> var <- c( "b", 'c')
> var
[1] "b" "c"
> a <- 1
> identical(`a`, a)
[1] TRUE
So for assignment to names, single quotes, double quotes, and back-ticks are all handled the same way on the LHS of an assignment, but when evaluated on the command line the unquoted a and the back-ticked a are the same, and both differ from the quoted "a" or 'a'.
The other situation where there may be a difference is in data input. People's names may contain single quotes (apostrophes), and in that case you may want to review how the read.table function handles the two kinds of quotes. By default it treats both as quoting characters, but it may be necessary to "turn off" the quoting action of single quotes by setting quote="\"" so that you don't get big blobs of data turned into a single text field by mistake. The count.fields function has the same defaults as read.table, so it makes sense to do a preliminary run with it to check for shortened lines caused by mismatched single quotes:
table( count.fields('filnam.ext') )

My guess is that "single quotes" occur much more often as apostrophes, so preferring double-quotes will reduce the chance of messing things up with an apostrophe.

Concerning the first question, Are there any situations in which ' and " behave differently?, I think it is important to note that since
identical("a", 'a')
TRUE
R users (including package developers) have no way of telling the difference, hence no way of creating different behaviors for one or the other.

To avoid confusion for those who are accustomed to programming in the
C family of languages (C, C++, Java), where there is a difference in
the meaning of single quotes and double quotes.
A C programmer reads 'a' as a single character and "a" as a character
string consisting of the letter 'a' followed by a null character to
terminate the string. In R there is no character data type, there are
only character strings. For consistency with other languages it helps
if character strings are delimited by double quotes. The single quote
version in R is for convenience. On most keyboards you don't need to
use the shift key to type a single quote but you do need the shift for
a double quote.

Related

Difference between single quote and double quote string *types* in octave? Reason of warning?

I am aware that in octave escape sequences are treated differently in single/double quotes. Nevertheless, there seems to be a type difference:
Whereas class("bla") and class('bla') are both char,
typeinfo("bla") is string, whereas typeinfo('bla') is sq_string,
which may be short for single quote string.
More interesting, warning("on", "Octave:mixed-string-concat") activates warning
that these two types are mixed.
So after activation, ["bla" 'bla'] yields a warning.
Note that typeinfo(["bla" "bla"]) is string,
whereas if one of the two strings concatenated is single quote, so is the result,
e.g. typeinfo(['bla' "bla"]) is sq_string.
I have a situation where someone activates the warning
and so I want to write my code so as to avoid these warnings.
Thus my question: is there a way to convert sq_string to string?
The core of my problem is that fieldnames seem to be single quoted strings.
What an interesting question. I've never thought one might have a need for such a warning or conversion ... though now that I think about it, it makes sense if you want to collect 'raw' strings, and have their escape sequences interpreted and vice versa ...
After some experimentation, I have found a way to do what you want: use sprintf. This seems to return a (double-quoted) string if your format string is in double quotes, and an sq_string if it's in single quotes. If your format string is simply "%s", then you can pass a bunch of strings as subsequent arguments, and these will be concatenated (as a double-quoted string).
If you'd prefer to go in the reverse direction and ensure your strings are always single-quoted, you can still do the above with a single-quoted format string, or you can just use strcat: this does not trigger your warning, can also be called with a single argument, and seems to always return an sq_string.
Also, since I would generally recommend using either of these with "cell-generated sequence" syntax for convenience, this means that you would be better off "collecting" individual strings in cells more generally. E.g.
a = { 'one', 'two', 'three' }
b = { "four", "five", "six" }
typeinfo( sprintf( "%s", a{:} ) ) % outputs: string
typeinfo( strcat( b{:} ) ) % outputs: sq_string

' vs " " vs ' ' ' in Groovy .When to Use What?

I am getting really confused in Groovy. When to use what for Strings in Groovy ?
1) Single Quotes - ' '
2) Double Quotes - " "
3) Triple Quotes - '''
My code:
println("Tilak Rox")
println('Tilak Rox')
println('''Tilak Rox''')
All tend to produce same results.
When to Use What ?
I would confuse you even more by saying that you can also use slash /, dollar-slash $/ and triple double quotes """ with the same result. =)
So, what's the difference:
Single vs Double quote: Most important difference. Single-quoted is ordinary Java-like string. Double-quoted is a GString, and it allows string-interpolation. I.e. you can have expressions embedded in it: println("${40 + 5}") prints 45, while println('${ 40 + 5}') will produce ${ 40 + 5}. This expression can be pretty complex, can reference variables or call methods.
Triple single quotes and triple double quotes are the way to make a string multiline. You can open it on one line in your code, copy-paste a big piece of XML, a poem or an SQL expression into it, and not bother with string concatenation.
Slashy / and dollar-slashy $/ strings are here to help with regular expressions. They have special escape rules for '\' and '/' respectively.
As @tim pointed out, there is good official documentation for that, explaining the small differences in escaping rules and containing examples as well.
Most probably you won't need multiline or slashy strings very often, as they are used in very particular scenarios. But when you do, they make a huge difference in the readability of your code!
Single quotes ' are for basic Strings
Double quotes " are for templated Strings ie:
def a = 'tim'
assert "Hi $a" == 'Hi tim'
Triple single quotes ''' are for multi-line basic Strings
Triple double """ quotes are for multi-line templated strings
There's also slashy strings /hello $a/ which are templated
And dollar slashy Strings $/hello $a/$ which are multi-line and templated
They're all documented quite well in the documentation

Efficient way to insert characters between other characters in a string

What is an efficient way in MATLAB to replace/insert one symbol (in series of symbols) with several others that correspond to the one that is being replaced?
For example, consider having a string Eq: Eq = 'A*exp(-((x-xc)/w)^2)'. Is there a way to replace * with .*, / with ./,\ with .\, and ^ with .^ without writing four separate strrep() lines?
Regular expressions will do the job nicely. Regular expressions simply find patterns in text. You specify what kind of pattern you are looking for by a regular expression, and the output gives you the locations of where the pattern occurred.
For our particular case, not only do we want to find where patterns occur, we also want to replace those patterns with something else. Specifically, use the function regexprep from MATLAB to replace matches in a string with something else. What you want to do is replace all *, /, \ and ^ symbols by adding a . in front of each.
How regexprep works is that the first input is the string you're looking at, and the second input is the pattern you're trying to find. In our case, we want to find any of *, /, \ and ^. To specify this pattern, you put the desired symbols in [] brackets. Regular expressions reserve \ as an escape character for symbols that would otherwise be interpreted as regular expression syntax. As such, you need to use \\ for the \ character and \^ for the ^ character. The third input is what you want to replace each match with. In our case, we simply want to reuse each matched character, but with a . added in front of it. This is done with \.$0 in the replacement string. $0 refers to the entire match, which here is the single matched symbol from the pattern. . is also a reserved symbol in regular expressions, so we prepend it with a \ character.
Without further ado:
>> Eq = 'A*exp(-((x-xc)/w)^2)';
>> out = regexprep(Eq, '[*/\\\^]', '\.$0')
out =
A.*exp(-((x-xc)./w).^2)
The pattern we are looking for is [*/\\\^], which means that we want to find any of *, /, \ (denoted as \\ in regex), and ^ (denoted as \^ in regex). We want to find any of these symbols and replace each with the same symbol preceded by a . character: \.$0.
As a more complicated example, let's make sure that we include all of the symbols you're looking for in a sample equation:
>> A = 'A*exp(-((x-xc)/w)^2) \ b^2';
>> out = regexprep(A, '[*/\\\^]', '\.$0')
out =
A.*exp(-((x-xc)./w).^2) .\ b.^2
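For readers who want to check the character class outside MATLAB, here is a rough Python sketch of the same substitution. It is an illustration only, not part of the answer above: the function name add_dots is made up, and Python's re module writes "the whole match" as \g<0> where MATLAB uses $0.

```python
import re


def add_dots(expr: str) -> str:
    """Prefix each of the operators * / \\ ^ with a dot (hypothetical helper)."""
    # The character class matches any one of the four operators. The backslash
    # and the caret are escaped inside the class, as in the MATLAB pattern.
    # In the replacement, "." is a literal dot and \g<0> is the whole match.
    return re.sub(r"[*/\\\^]", r".\g<0>", expr)


print(add_dots("A*exp(-((x-xc)/w)^2)"))
print(add_dots(r"A*exp(-((x-xc)/w)^2) \ b^2"))
```

The two calls reuse the sample equations from the MATLAB session, so the printed strings can be compared against the outputs shown there.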
I'd go with regexp as in rayryeng's answer. But here's another approach, just to provide an alternative.
ops = '*/\^'; %// operators that need a dot
ii = find(ismember(Eq, ops)); %// find where dots should be inserted
[~, jj] = sort([1:numel(Eq) ii-.5]); %// will be used to properly order the result
result = [Eq repmat('.',1,numel(ii))]; %// insert dots at the end
result = result(jj); %// properly order the result
And a variant:
ops = '*/\^'; %// operators that need a dot
ii = find(ismember(Eq, ops)); %// find where dots should be inserted
jj = sort([1:numel(Eq) ii-.5]); %// dot locations are marked with fractional part
result = Eq(ceil(jj)); %// repeat characters where the dots will be placed
result(mod(jj,1)>0) = '.'; %// place dots at indices with fractional part
The vectorize function already does almost all of what you want except that it does not convert mldivide (\) to ldivide (.\).
By "efficient," do you mean fewer lines of code or faster? Regular expressions are almost always slower than other approaches and less readable. I don't think they're necessary or a good choice in this case. If you only need to convert your string once, then speed is less of a concern than readability (strrep will still be faster). If you need to do it many times, this simple code that you alluded to is 4–5 times faster than regexprep for short strings like your example (and much faster for longer strings):
out = strrep(Eq,'*','.*');
out = strrep(out,'/','./');
out = strrep(out,'\','.\');
out = strrep(out,'^','.^');
If you want one line, use:
out = strrep(strrep(strrep(strrep(Eq,'*','.*'),'/','./'),'\','.\'),'^','.^');
which will also be slightly faster still. Or create your own version of vectorize and call that.
Where regular expressions shine is in more complex cases, e.g., if your string is already partially vectorized: Eq = 'A.*exp(-((x-xc)/w)^2)'. Even still, the vectorize function just uses strrep and then calls strfind to "remove any possible '..*', '../', etc." and replace them with the proper element-wise operators because it's faster (symbolic math strings can get very large, for example).
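The plain-replacement idea can be sketched in Python as well, purely as an illustration. The names DOT_TABLE, vectorize_chained and vectorize_single_pass are hypothetical, and nothing here says anything about MATLAB timings. The sketch shows the chained replacement next to a single-pass translation table, and it also shows the limitation mentioned above: an already dotted operator gets a second dot.

```python
# Map each operator to its element-wise form. str.maketrans accepts a dict
# whose keys are single characters and whose values may be longer strings.
DOT_TABLE = str.maketrans({"*": ".*", "/": "./", "\\": ".\\", "^": ".^"})


def vectorize_chained(expr: str) -> str:
    """Four successive plain replacements, like the chained strrep calls."""
    for op in "*/\\^":
        expr = expr.replace(op, "." + op)
    return expr


def vectorize_single_pass(expr: str) -> str:
    """One pass over the string using a translation table."""
    return expr.translate(DOT_TABLE)


eq = "A*exp(-((x-xc)/w)^2)"
print(vectorize_chained(eq))
print(vectorize_single_pass(eq))
# Neither version knows about operators that are already element-wise:
print(vectorize_single_pass("A.*exp(-((x-xc)/w)^2)"))  # the first operator becomes "..*"
```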

Multiline string literal in Matlab?

Is there a multiline string literal syntax in Matlab or is it necessary to concatenate multiple lines?
I found the verbatim package, but it only works in an m-file or function and not interactively within editor cells.
EDIT: I am particularly after readability and ease of modifying the literal in the code (imagine it contains indented blocks of different levels) - it is easy to make multiline strings, but I am looking for the most convenient syntax for doing that.
So far I have
t = {...
'abc'...
'def'};
t = cellfun(@(x) [x sprintf('\n')],t,'Unif',false);
t = horzcat(t{:});
which gives size(t) = 1 8, but is obviously a bit of a mess.
EDIT 2: Basically verbatim does what I want except it doesn't work in Editor cells, but maybe my best bet is to update it so it does. I think it should be possible to get current open file and cursor position from the java interface to the Editor. The problem would be if there were multiple verbatim calls in the same cell how would you distinguish between them.
I'd go for:
multiline = sprintf([ ...
'Line 1\n'...
'Line 2\n'...
]);
Matlab is an oddball in that escape processing in strings is a function of the printf family of functions instead of the string literal syntax. And no multiline literals. Oh well.
I've ended up doing two things. First, make CR() and LF() functions that just return processed \r and \n respectively, so you can use them as pseudo-literals in your code. I prefer doing this way rather than sending entire strings through sprintf(), because there might be other backslashes in there you didn't want processed as escape sequences (e.g. if some of your strings came from function arguments or input read from elsewhere).
function out = CR()
out = char(13); % # sprintf('\r')
function out = LF()
out = char(10); % # sprintf('\n');
Second, make a join(glue, strs) function that works like Perl's join or the cellfun/horzcat code in your example, but without the final trailing separator.
function out = join(glue, strs)
strs = strs(:)';
strs(2,:) = {glue};
strs = strs(:)';
strs(end) = [];
out = cat(2, strs{:});
And then use it with cell literals like you do.
str = join(LF, {
'abc'
'defghi'
'jklm'
});
You don't need the "..." ellipses in cell literals like this; omitting them does a vertical vector construction, and it's fine if the rows have different lengths of char strings because they're each getting stuck inside a cell. That alone should save you some typing.
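The interleave-then-trim trick used by join(glue, strs) above can be sketched in Python to make the steps explicit. This is only an illustration under the assumption that the inputs are plain strings; the function name join_interleaved is hypothetical, and in everyday Python one would simply call glue.join(strs).

```python
def join_interleaved(glue, strs):
    """Join strs with glue between the elements, with no trailing separator."""
    # Build [s1, glue, s2, glue, ..., sn, glue], like the two-row cell array.
    interleaved = [piece for s in strs for piece in (s, glue)]
    # Drop the final glue, like strs(end) = [] in the MATLAB version.
    if interleaved:
        interleaved.pop()
    # Concatenate everything, like cat(2, strs{:}).
    return "".join(interleaved)


lines = ["abc", "defghi", "jklm"]
text = join_interleaved("\n", lines)
print(repr(text))
print(text == "\n".join(lines))  # same result as the built-in join
```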
Bit of an old thread but I got this
multiline = join([
"Line 1"
"Line 2"
], newline)
I think it makes things pretty easy but obviously it depends on what one is looking for :)

How to break a big lua string into small ones

I have a big string (a base64 encoded image) and it is 1050 characters long. How can I append a big string formed of small ones, like this in C
function GetIcon()
return "Bigggg string 1"\
"continuation of string"\
"continuation of string"\
"End of string"
According to Programming in Lua 2.4 Strings:
We can delimit literal strings also by matching double square brackets [[...]]. Literals in this bracketed form may run for several lines, may nest, and do not interpret escape sequences. Moreover, this form ignores the first character of the string when this character is a newline. This form is especially convenient for writing strings that contain program pieces; for instance,
page = [[
<HTML>
<HEAD>
<TITLE>An HTML Page</TITLE>
</HEAD>
<BODY>
Lua
[[a text between double brackets]]
</BODY>
</HTML>
]]
This is the closest thing to what you are asking for, but using the above method keeps the newlines embedded in the string, so this will not work directly.
You can also do this with string concatenation (using ..):
value = "long text that" ..
" I want to carry over" ..
"onto multiple lines"
Most answers here solve this issue at run-time and not at compile-time.
Lua 5.2 introduces the escape sequence \z to solve this problem elegantly without incurring any run-time expense.
> print "This is a long \z
>> string with \z
>> breaks in between, \z
>> and is spanning multiple lines \z
>> but still is a single string only!"
This is a long string with breaks in between, and is spanning multiple lines but still is a single string only!
\z skips all subsequent characters in a string literal [1] until the first non-space character. This works for non-multiline literal text too.
> print "This is a simple \z string"
This is a simple string
From Lua 5.2 Reference Manual
The escape sequence '\z' skips the following span of white-space characters, including line breaks; it is particularly useful to break and indent a long literal string into multiple lines without adding the newlines and spaces into the string contents.
1: All escape sequences, including \z, work only on short literal strings ("…", '…') and, understandably, not on long literal strings ([[...]], etc.)
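To make the effect of \z concrete, here is a small Python sketch that imitates it on the raw source text of a literal. It is a simplified model, not how the Lua lexer is implemented: the function name apply_z_escape is made up, and the sketch ignores corner cases such as an escaped backslash directly before a z.

```python
import re


def apply_z_escape(raw: str) -> str:
    """Remove every backslash-z together with the whitespace run after it."""
    # \s* also matches line breaks, so continuation lines and their
    # indentation vanish from the resulting string.
    return re.sub(r"\\z\s*", "", raw)


source = (
    "This is a long \\z\n"
    "    string with \\z\n"
    "    breaks in between"
)
print(apply_z_escape(source))
print(apply_z_escape("This is a simple \\z string"))
```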
I'd put all chunks in a table and use table.concat on it. This avoids the creation of new strings at every concatenation. For example (without counting overhead for strings in Lua):
-- bytes used
foo="1234".. -- 4 = 4
"4567".. -- 4 + 4 + 8 = 16
"89ab" -- 16 + 4 + 12 = 32
-- | | | \_ grand total after concatenation on last line
-- | | \_ second operand of concatenation
-- | \_ first operand of concatenation
-- \_ total size used until last concatenation
As you can see, this explodes pretty rapidly. It's better to:
foo=table.concat{
"1234",
"4567",
"89ab"}
Which will take about 3*4+12=24 bytes.
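The byte counts in this answer follow a simple accounting model, which can be reproduced with a short Python sketch. The sketch only replays that simplified arithmetic (each chunk is stored once, and every concatenation creates a new string holding everything so far); it does not measure a real Lua VM, and both function names are hypothetical.

```python
def bytes_with_repeated_concat(chunks):
    """Total bytes under the model where each concatenation makes a new string."""
    total = 0
    accumulated = 0
    for index, chunk in enumerate(chunks):
        total += len(chunk)        # the chunk literal itself
        accumulated += len(chunk)  # length of the string built so far
        if index > 0:
            total += accumulated   # new intermediate result of this concatenation
    return total


def bytes_with_single_concat(chunks):
    """Total bytes when all chunks are joined once, as with table.concat."""
    return sum(len(c) for c in chunks) + len("".join(chunks))


chunks = ["1234", "4567", "89ab"]
print(bytes_with_repeated_concat(chunks))  # 32, matching 16 + 4 + 12
print(bytes_with_single_concat(chunks))    # 24, matching 3*4 + 12
```

Under this model the repeated version grows roughly quadratically with the number of chunks, while the single join grows linearly.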
Have you tried the string.sub(s, i [, j]) function?
You may like to look here:
http://lua-users.org/wiki/StringLibraryTutorial
This:
return "Bigggg string 1"\
"continuation of string"\
"continuation of string"\
"End of string"
C/C++ syntax causes the compiler to see it all as one large string. It is generally used for readability.
The Lua equivalent would be:
return "Bigggg string 1" ..
"continuation of string" ..
"continuation of string" ..
"End of string"
Do note that the C/C++ syntax is compile-time, while the Lua equivalent likely does the concatenation at runtime (though the compiler could theoretically optimize it). It shouldn't be a big deal though.
