I'm guessing this is a well known issue and there's an efficient workaround somehow.
I am getting output in which some lines contain a fixed number of spaces and nothing else. I'm doing a string comparison such as the one below as part of a unit test. Is there a way to get this to pass without modifying the strings using stripIndent() or the like?
Note: in the multiline string below, the seemingly empty line between testStart and testEnd is supposed to contain 4 spaces. However, Stack Overflow may be removing them.
String singleLine = 'testStart\n    \ntestEnd'
String multiLine =
'''
testStart
testEnd
'''
println singleLine
println multiLine
assert singleLine == multiLine
String singleLine = 'testStart\n    \ntestEnd'
String multiLine =
'''
testStart
(assume there are 4 spaces on this line)
testEnd
'''
println singleLine
println multiLine
assert singleLine == multiLine
That assertion is supposed to fail. The first character in singleLine is the t from testStart. The first character in multiLine is a newline, because the string begins immediately after the opening ''' and the first character after that is a newline. The same issue occurs at the end of the string, where a newline precedes the closing '''. You could solve that in a couple of ways:
String multiLine =
'''\
testStart
(assume there are 4 spaces on this line)
testEnd\
'''
Or:
String multiLine =
'''testStart
(assume there are 4 spaces on this line)
testEnd'''
This was caused by an IntelliJ default setting that strips trailing spaces. I have now resolved it.
http://blog.darrenscott.com/2015/01/24/intellij-idea-14-how-to-stop-stripping-of-trailing-spaces/
I am trying to remove only a specific pattern from a string using a regex. The pattern includes braces and question marks, and all other braces and question marks in the string must stay unchanged.
Here is the code I tried:
import re
string = "aaaa{v?}a $1?{23ru{n?}kkkk"
pattern = '[{?}]'
replace = ''
new_string = re.sub(pattern, replace, string)
print(new_string)
It generates the output below:
"aaaava $123runkkkk"
I want the output to be:
"aaaava $1?{23runkkkk"
Notice that the braces ({}) and the question mark (?) are removed only where they appear in the form {v?} or {n?}.
Braces and question marks everywhere else are left unchanged.
You can use
re.sub(r'{([a-z])\?}', r'\1', text)
Details:
{ - a { char
([a-z]) - Group 1 (\1 refers to this group text from the replacement pattern): any lowercase ASCII letter
\? - a ? char
} - a } char.
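As a quick check, here is a small sketch that applies this pattern to the sample string from the question. The braces are escaped here purely for readability (Python also accepts the unescaped form used above), and the variable names are illustrative.

```python
import re

# Sample input from the question.
text = "aaaa{v?}a $1?{23ru{n?}kkkk"

# Match "{", one lowercase ASCII letter (captured as group 1), "?", "}"
# and replace the whole match with just the captured letter.
pattern = r'\{([a-z])\?\}'
new_text = re.sub(pattern, r'\1', text)

print(new_text)
```

The lone `?` and `{` in `$1?{23` survive because they are not part of a complete `{letter?}` sequence.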
I have stumbled upon a very strange situation, where "$" doesn't work (does not match line-ends) while "(\r|\n)" DOES work, in a simple re.search() command with the re.MULTILINE flag set, and I really wonder why this is?
Here is a short PoC, including its runtime output (when executed on Python 3.7.1):
import re
subj = 'row1\r\nContent-Length: 1234\r\nrow3'
test1 = re.search(r'^Content-Length: ([0-9]+)$', subj, re.MULTILINE)
test2 = re.search(r'^Content-Length: ([0-9]+)(\r|\n)', subj, re.MULTILINE)
if test1:
    print('test1 is a match!')
else:
    print('test1 is NOT a match!')
if test2:
    print('test2 is a match!')
else:
    print('test2 is NOT a match!')
and here is the output when running this code:
test1 is NOT a match!
test2 is a match!
As far as I can read in all the docs, the "$" should represent any line-break when using regexps in multiline mode, so why does it refuse to match in this case?
A quick check of the Python re documentation shows that only \n is treated as a newline character; \r is not (although it is a whitespace character). In multiline mode, "$" matches at the zero-length position immediately before a newline character (or at the end of the string). In your subject the character before each \n is \r, not a digit, so the pattern cannot match.
I suggest using this regex instead:
r'^Content-Length: ([0-9]+)\r?$'
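A minimal sketch comparing the original pattern with the suggested one, using the subject string from the question (the names `plain` and `fixed` are just illustrative):

```python
import re

subj = 'row1\r\nContent-Length: 1234\r\nrow3'

# "$" only matches right before "\n", so the "\r" after the digits
# prevents the original pattern from matching.
plain = re.search(r'^Content-Length: ([0-9]+)$', subj, re.MULTILINE)

# Allowing an optional "\r" before "$" copes with both "\r\n" and "\n" endings.
fixed = re.search(r'^Content-Length: ([0-9]+)\r?$', subj, re.MULTILINE)

print(plain)
print(fixed.group(1))
```

Because the `\r` sits outside the capturing group, group 1 contains only the digits.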
I'd like to indent a multiline string in Groovy, but I can't figure out the right regex syntax or flags to achieve that.
Here's what I tried so far:
def s="""This
is
multiline
"""
println s.replaceAll('/(.*)/'," \1")
println s.replaceAll('/^/'," ")
println s.replaceAll('(?m)/^/'," \1")
println s.replaceAll('(?m)/(.*)/'," \1")
None of these worked as expected.
The only thing that has worked so far is this block:
def indented = ""
s.eachLine {
    indented = indented + " " + it + "\n"
}
println indented
Is there a shorter / more efficient way to indent all lines of a string in Groovy?
You need to put the (?m) flag inside the regular expression, and the pattern must be a slashy string, not a single-quoted string with slashes inside it:
s.replaceAll(/(?m)^/, " ")
You could split and join. I don't know if it's more efficient, but it is shorter:
def s="""This
is
multiline
"""
def indent = " "
println indent + s.split("\\n").join("\n" + indent);
Or perhaps use just the replace method from Java, which takes a literal string rather than a regex and is potentially faster:
def s="""\
This
is
multiline
"""
println ' ' + s.replace('\n', '\n ')
which prints:
 This
 is
 multiline
Note: for those who are picky, replace does use the Java regex implementation (as in Pattern), but with a LITERAL pattern, which means it ignores all the usual regex metacharacters and escapes. So the above is probably still faster than split for large strings, but it does make you wish there were a method that did a plain replace without any involvement of the potentially slow Pattern implementation.
I am looking to remove special characters from a string using Groovy. I'm nearly there, but my code also removes the whitespace that is already in place, which I want to keep. I only want to remove the special characters (without leaving a space in their place). I am running the code below on the PostCode value L&65$$ OBH
def removespecialpostcodce = PostCode.replaceAll("[^a-zA-Z0-9]+","")
log.info removespecialpostcodce
Currently it returns L65OBH, but I want it to return L65 OBH
Can anyone help?
Use the code below:
PostCode.replaceAll("[^a-zA-Z0-9 ]+","")
instead of
PostCode.replaceAll("[^a-zA-Z0-9]+","")
To remove all special characters from a String you can use a negated character class:
String str = "..\\.-._./-^+* ".replaceAll("[^A-Za-z0-9]","");
System.out.println("str: <"+str+">");
output:
str: <>
To keep the spaces in the text, add a space to the character class:
String str = "..\\.-._./-^+* ".replaceAll("[^A-Za-z0-9 ]","");
System.out.println("str: <"+str+">");
output:
str: < >
This is really nice in Groovy:
println '''First line,
           second line,
           last line'''
Multi-line strings. In some languages I have seen tools that go a step further and remove the indentation of the second and subsequent lines, so that statement would print:
First line,
second line,
last line
and not
First line,
           second line,
           last line
Is it possible in Groovy?
You can use stripMargin() for this:
println """hello world!
|from groovy
|multiline string
""".stripMargin()
If you don't want a leading character (like the pipe in this case), there is stripIndent() as well, but the string needs to be formatted a little differently (because the minimum indent matters):
println """
hello world!
from groovy
multiline string
""".stripIndent()
From the docs of stripIndent:
Strip leading spaces from every line in a String. The line with the least number of leading spaces determines the number to remove. Lines only containing whitespace are ignored when calculating the number of leading spaces to strip.
Update:
Regarding using an operator for this: personally, I would not recommend it. But for the record, it can be done using the Extension Mechanism or using Categories (simpler but clunkier). A Categories example follows:
class StringCategory {
    static String previous(String string) { // overloads `--`
        return string.stripIndent()
    }
}
use (StringCategory) {
    println(--'''
        hello world!
        from groovy
        multiline string
    ''')
}