What characters can I use to quote this Ruby string?

I'm embedding JRuby in Java, because I need to call some Ruby methods with Java strings as arguments. The thing is, I'm calling the methods like this:
String text = ""; // this can span over multiple lines, and will contain ruby code
Ruby ruby = Ruby.newInstance();
RubyRuntimeAdapter adapter = JavaEmbedUtils.newRuntimeAdapter();
String rubyCode = "require \"myscript\"\n" +
"str = build_string(%q~"+text+"~)\n"+
"str";
IRubyObject object = adapter.eval(ruby, rubyCode);
The thing is, I don't know what characters I can use as delimiters, because the text I'm sending to build_string will itself contain Ruby code. Right now I'm using ~, but I think this could break my code. What characters can I use as delimiters to make sure my code will work no matter what the Ruby code is?

Use the heredoc format:
"require \"myscript\"\n" +
"str = build_string(<<'THISSHOULDNTBE'\n" + text + "\nTHISSHOULDNTBE\n)\n"+
"str";
This, however, assumes you won't have "THISSHOULDNTBE" on a separate line in the input.
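One way to drop that assumption is to choose the terminator after looking at the input. Below is a minimal sketch of the idea, written in Python rather than Java for brevity. The helper names pick_heredoc_terminator and build_ruby_code are made up for illustration; build_string and myscript come from the question.

```python
def pick_heredoc_terminator(text, base="THISSHOULDNTBE"):
    """Return a terminator that does not occur as a whole line of text."""
    # Compare against stripped lines: stricter than Ruby requires, but safe.
    lines = {line.strip() for line in text.splitlines()}
    candidate = base
    counter = 0
    while candidate in lines:
        counter += 1
        candidate = f"{base}{counter}"
    return candidate


def build_ruby_code(text):
    """Wrap text in a single-quoted heredoc so Ruby performs no interpolation."""
    term = pick_heredoc_terminator(text)
    return (
        'require "myscript"\n'
        f"str = build_string(<<'{term}'\n"
        f"{text}\n"
        f"{term}\n"
        ")\n"
        "str"
    )
```

Keep in mind that a heredoc body ends with a newline, so build_string receives the text plus a trailing "\n".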

Since the string text can contain any character, there is no character that is guaranteed to be safe as a delimiter like the ~ you're using now. You would still need to escape the tilde in text in Java before appending it to the string you're building.
Something like (untested, not a Java guru):
// replace() is literal, unlike replaceAll(); backslashes are escaped first so a trailing \ cannot cancel the escaped ~
String rubyCode = "require \"myscript\"\n" +
"str = build_string(%q~" + text.replace("\\", "\\\\").replace("~", "\\~") + "~)\n"+
"str";


Kotlin String.split, ignore when delimiter is inside a quote

I have a string:
Hi there, "Bananas are, by nature, evil.", Hey there.
I want to split the string with commas as the delimiter. How do I get the .split method to ignore the commas inside the quotes, so that it returns 3 strings and not 5?
You can use a regex in the split method.
According to this answer, the following regex only matches a , outside of the " marks:
,(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)
so try this code:
str.split(",(?=(?:[^\\\"]*\\\"[^\\\"]*\\\")*[^\\\"]*\$)".toRegex())
You can use split overload that accepts regular expressions for that:
val text = """Hi there, "Bananas are, by nature, evil.", Hey there."""
val matchCommaNotInQuotes = Regex("""\,(?=([^"]*"[^"]*")*[^"]*$)""")
println(text.split(matchCommaNotInQuotes))
Would print:
[Hi there, "Bananas are, by nature, evil.", Hey there.]
Consider reading this answer on how the regular expression works in this case.
You have to use a regular expression capable of handling quoted values. See Java: splitting a comma-separated string but ignoring commas in quotes and C#, regular expressions : how to parse comma-separated values, where some values might be quoted strings themselves containing commas
The following code shows a very simple version of such a regular expression.
fun main(args: Array<String>) {
"Hi there, \"Bananas are, by nature, evil.\", Hey there."
.split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)".toRegex())
.forEach { println("> $it") }
}
outputs
> Hi there
> "Bananas are, by nature, evil."
> Hey there.
Be aware of the regex backtracking problem: https://www.regular-expressions.info/catastrophic.html. You might be better off writing a parser.
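A hand-written scanner for this does not need much code. Here is a minimal sketch in Python (the function name split_outside_quotes is made up for illustration). It makes a single pass over the string and tracks whether it is currently inside quotes. It assumes quotes are balanced and does not handle escaped quotes.

```python
def split_outside_quotes(s, delimiter=",", quote='"'):
    """Split s on delimiter, ignoring delimiters between quote characters."""
    parts = []
    current = []
    in_quotes = False
    for ch in s:
        if ch == quote:
            # Toggle the state and keep the quote character in the output.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == delimiter and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts
```

Like the regex versions, this keeps the whitespace after each comma; trim the pieces afterwards if you don't want it.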
If you don't want regular expressions:
val s = "Hi there, \"Bananas are, by nature, evil.\", Hey there."
val hold = s.substringAfter("\"").substringBefore("\"")
val temp = s.split("\"")
val splitted: MutableList<String> = (temp[0] + "\"" + temp[2]).split(",").toMutableList()
splitted[1] = "\"" + hold + "\""
splitted is the List you want

Macros and string interpolation (Julia)

Let's say I make this simple string macro
macro e_str(s)
return string("I touched this: ",s)
end
If I apply it to a string with interpolation, I
obtain:
julia> e"foobar $(log(2))"
"I touched this: foobar \$(log(2))"
Whereas I would like to obtain:
julia> e"foobar $(log(2))"
"I touched this: foobar 0.6931471805599453"
What changes do I have to make to my macro declaration?
It's better to parse the string at compile-time than to delegate to Julia. Basically, put the string into an IOBuffer, scan the string for $ signs, and use the parse function whenever they come up.
macro e_str(s)
components = []
buf = IOBuffer(s)
while !eof(buf)
push!(components, rstrip(readuntil(buf, '$'), '$'))
if !eof(buf)
push!(components, parse(buf; greedy=false))
end
end
quote
string($(map(esc, components)...))
end
end
This doesn't work with escaped $ characters, but that can be resolved with some minor changes to handle \ also. I have included a basic example at the bottom of this post.
I wrote it this way because string macros are generally not for emulating Julia strings; regular macros with regular string literals are better for that purpose. So writing the parsing yourself isn't that bad, especially because it allows customized extensions. If you really want parsing to be identical to how Julia parses it, you could escape the string and then reparse it, as @MattB suggested:
macro e_str(s)
esc(parse("\"$(escape_string(s))\""))
end
The resulting expression is a :string expression which you could dump and inspect, and then analyse the usual way.
String macros do not come with built-in interpolation facilities. However, it is possible to manually implement this functionality. Note that it is not possible to embed without escaping string literals that have the same delimiter as the surrounding string macro; that is, although """ $("x") """ is possible, " $("x") " is not. Instead, this must be escaped as " $(\"x\") ".
There are two approaches to implementing interpolation manually: implement parsing manually, or get Julia to do the parsing. The first approach is more flexible, but the second approach is easier.
Manual parsing
macro interp_str(s)
components = []
buf = IOBuffer(s)
while !eof(buf)
push!(components, rstrip(readuntil(buf, '$'), '$'))
if !eof(buf)
push!(components, parse(buf; greedy=false))
end
end
quote
string($(map(esc, components)...))
end
end
Julia parsing
macro e_str(s)
esc(parse("\"$(escape_string(s))\""))
end
This method escapes the string (but note that escape_string does not escape the $ signs) and passes it back to Julia's parser to parse. Escaping the string is necessary to ensure that " and \ do not affect the string's parsing. The resulting expression is a :string expression, which can be examined and decomposed for macro purposes.

In Swift how to obtain the "invisible" escape characters in a string variable into another variable

In Swift I can create a String variable such as this:
let s = "Hello\nMy name is Jack!"
And if I use s, the output will be:
Hello
My name is Jack!
(because the \n is a linefeed)
But what if I want to programmatically obtain the raw characters in the s variable? As in if I want to actually do something like:
let sRaw = s.raw
I made the .raw up, but something like this. So that the literal value of sRaw would be:
Hello\nMy name is Jack!
and it would literally print the string, complete with literal "\n"
Thank you!
The newline is the "raw character" contained in the string.
How exactly you formed the string (in this case from a string literal with an escape sequence in source code) is not retained (it is only available in the source code, but not preserved in the resulting program). It would look exactly the same if you read it from a file, a database, the concatenation of multiple literals, a multi-line literal, a numeric escape sequence, etc.
If you want to print newline as \n you have to convert it back (by doing text replacement) -- but again, you don't know if the string was really created from such a literal.
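The text replacement described above can be sketched as follows. This is in Python rather than Swift, purely to illustrate the idea, and the function name show_escapes is made up. Backslashes are escaped first so that a backslash already in the string is not confused with one added by the replacement.

```python
def show_escapes(s):
    """Return s with backslash, newline and tab written as escape sequences."""
    return (
        s.replace("\\", "\\\\")  # must come first
         .replace("\n", "\\n")
         .replace("\t", "\\t")
    )
```

This only covers the three characters listed. It also cannot tell you how the string was originally written, only produce one possible spelling of it.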
You can do this with escaped characters such as \n:
let secondaryString = "really"
let s = "Hello\nMy name is \(secondaryString) Jack!"
let find = Character("\n")
let r = String(s.characters.split(find).joinWithSeparator(["\\","n"]))
print(r) // -> "Hello\nMy name is really Jack!"
However, once the string s is generated the \(secondaryString) has already been interpolated to "really" and there is no trace of it other than the replaced word. I suppose if you already know the interpolated string you could search for it and replace it with "\\(secondaryString)" to get the result you want. Otherwise it's gone.

Convert underscores to spaces in Matlab string?

So say I have a string with some underscores like hi_there.
Is there a way to auto-convert that string into "hi there"?
(the original string, by the way, is a variable name that I'm converting into a plot title).
Surprising that no-one has yet mentioned strrep:
>> strrep('string_with_underscores', '_', ' ')
ans =
string with underscores
which should be the official way to do simple string replacements. For such a simple case, regexprep is overkill: yes, regular expressions are Swiss Army knives that can do everything possible, but they come with a long manual. The string indexing shown by AndreasH only works for replacing single characters; it cannot do this:
>> s = 'string*-*with*-*funny*-*separators';
>> strrep(s, '*-*', ' ')
ans =
string with funny separators
>> s(s=='*-*') = ' '
Error using ==
Matrix dimensions must agree.
As a bonus, it also works for cell-arrays with strings:
>> strrep({'This_is_a','cell_array_with','strings_with','underscores'},'_',' ')
ans =
'This is a' 'cell array with' 'strings with' 'underscores'
Try this Matlab code for a string variable 's'
s(s=='_') = ' ';
If you ever have to do anything more complicated, say replacing multiple variable-length strings, s(s == '_') = ' ' will be a huge pain. If your replacement needs ever get more complicated, consider using regexprep:
>> regexprep({'hi_there', 'hey_there'}, '_', ' ')
ans =
'hi there' 'hey there'
That being said, in your case @AndreasH.'s solution is the most appropriate and regexprep is overkill.
A more interesting question is why you are passing variables around as strings?
regexprep() may be what you're looking for and is a handy function in general.
regexprep('hi_there','_',' ')
Will take the first argument string, and replace instances of the second argument with the third. In this case it replaces all underscores with a space.
In Matlab strings are vectors, so performing simple string manipulations can be achieved using standard operators e.g. replacing _ with whitespace.
text = 'variable_name';
text(text=='_') = ' '; % replace all occurrences of underscore with whitespace
=> text = variable name
I know this was already answered; however, in my case I was looking for a way to correct plot titles so that I could include a filename (which could have underscores). I wanted the underscores to print literally instead of being rendered as subscripts. So, using the great info above, I replaced each underscore with an escaped underscore rather than a space.
For example:
% Have the user select a file:
[infile inpath]=uigetfile('*.txt','Get some text file');
figure
% this is a problem for filenames with underscores
title(infile)
% this correctly displays filenames with underscores
title(strrep(infile,'_','\_'))

Multiline string literal in Matlab?

Is there a multiline string literal syntax in Matlab or is it necessary to concatenate multiple lines?
I found the verbatim package, but it only works in an m-file or function and not interactively within editor cells.
EDIT: I am particularly after readability and ease of modifying the literal in the code (imagine it contains indented blocks of different levels). It is easy to make multiline strings, but I am looking for the most convenient syntax for doing that.
So far I have
t = {...
'abc'...
'def'};
t = cellfun(@(x) [x sprintf('\n')],t,'Unif',false);
t = horzcat(t{:});
which gives size(t) = 1 8, but is obviously a bit of a mess.
EDIT 2: Basically verbatim does what I want except it doesn't work in Editor cells, but maybe my best bet is to update it so it does. I think it should be possible to get current open file and cursor position from the java interface to the Editor. The problem would be if there were multiple verbatim calls in the same cell how would you distinguish between them.
I'd go for:
multiline = sprintf([ ...
'Line 1\n'...
'Line 2\n'...
]);
Matlab is an oddball in that escape processing in strings is a function of the printf family of functions instead of the string literal syntax. And no multiline literals. Oh well.
I've ended up doing two things. First, make CR() and LF() functions that just return processed \r and \n respectively, so you can use them as pseudo-literals in your code. I prefer doing this way rather than sending entire strings through sprintf(), because there might be other backslashes in there you didn't want processed as escape sequences (e.g. if some of your strings came from function arguments or input read from elsewhere).
function out = CR()
out = char(13); % # sprintf('\r')
function out = LF()
out = char(10); % # sprintf('\n');
Second, make a join(glue, strs) function that works like Perl's join or the cellfun/horzcat code in your example, but without the final trailing separator.
function out = join(glue, strs)
strs = strs(:)';
strs(2,:) = {glue};
strs = strs(:)';
strs(end) = [];
out = cat(2, strs{:});
And then use it with cell literals like you do.
str = join(LF, {
'abc'
'defghi'
'jklm'
});
You don't need the "..." ellipses in cell literals like this; omitting them does a vertical vector construction, and it's fine if the rows have different lengths of char strings because they're each getting stuck inside a cell. That alone should save you some typing.
Bit of an old thread but I got this
multiline = join([
"Line 1"
"Line 2"
], newline)
I think it makes things pretty easy, but obviously it depends on what one is looking for :)
