Unpredictable behaviour of scala triple quoted string

I'm using JUnit in Scala to compare string output from my Scala code. Something like:
val expected = """<Destination id="0" type="EMAIL">
<Address>
me#test.com
</Address>
<TimeZone>
US/Eastern
</TimeZone>
<Message>
</Message>
</Destination>
"""
val actual = getTextToLog()
info(" actual = " + actual)
assert(expected == actual)
The issue is that for some strings, assertions like:
assert(expected == actual)
work, and for some strings they don't. Even when I copy actual from the Eclipse console (where it is logged) and paste it into expected just to be sure, the assertion still fails.
What am I missing?

OK, since this turns out to be a whitespace issue, you should sanitise the two strings before comparing them. Look at the RichString methods like .lines, for example, which might let you create a comparison method that is agnostic to line endings or whitespace.
Here is one naive way of doing this with implicit conversions:
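import scala.language.implicitConversions
object WhiteSpace {
  implicit def strToWhiteSpace(s: String): WhiteSpace = new WhiteSpace(s)
}
class WhiteSpace(val s: String) {
  // Compares line by line, so \n and \r\n line endings are treated alike.
  // On JDK 11+, String.lines is the Java method (it returns a java.util.stream.Stream); use linesIterator there.
  def `~==` (other: String) = s.lines.toList == other.lines.toList
}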
which allows
import WhiteSpace._
assert(expected ~== actual)
Or you could extend the appropriate JUnit class to add an agnostic version of assertEquals.
Note that this comparison deconstructs both strings in the same way. This is much safer than sending one of them on a round-trip conversion.
Whitespace/CRLF issues are so common that there's no point fighting them by trying to stop the mismatches; just do agnostic comparisons.
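For readers who want to try the idea outside Scala, here is a minimal Python sketch of the same line-ending-agnostic comparison. The function name same_lines and the sample strings are illustrative assumptions, not part of the original answer.

```python
def same_lines(expected: str, actual: str) -> bool:
    """Compare two strings line by line, ignoring the line-ending style."""
    # str.splitlines() splits on \n, \r\n and \r (among other line boundaries)
    # and drops the terminators, so both strings are deconstructed the same way.
    return expected.splitlines() == actual.splitlines()


expected = "<Address>\nme#test.com\n</Address>\n"
actual = "<Address>\r\nme#test.com\r\n</Address>\r\n"  # same text, Windows line endings

print(expected == actual)            # False
print(same_lines(expected, actual))  # True
```

Like the Scala version, this still treats trailing spaces within a line as significant; stripping each line before comparing would relax that further.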

Related

Python GDAL, SetAttributeFilter not working

I am trying to use GDAL's SetAttributeFilter() to filter the features in a layer of my shapefile, but the filter seems to have no effect.
My current data is a shapefile from the US Census Bureau, but I have tried with other shapefiles and get a similar result.
For example
from osgeo import ogr
shapefile_path = '../input/processed/shapefile/'
shapefile_ds = ogr.Open(shapefile_path)
cbsa = shapefile_ds.GetLayer('cb_2016_us_cbsa_500k')
print(cbsa.GetFeatureCount())
cbsa.SetAttributeFilter('NAME = "Chicago-Naperville-Elgin, IL-IN-WI"')
feat = cbsa.GetNextFeature()
print(feat.GetField('NAME'))
print(cbsa.GetFeatureCount())
Yields
945
Platteville, WI
945
I'm using Python 3.6 and GDAL 2.2.1

You can capture the return value of the SetAttributeFilter statement and make sure it's 0; otherwise something went wrong.
In this particular case, it's probably due to the quoting. Single quotes refer to string literals (a value), and double quotes refer to a column/table name.
Depending on how you run this Python code, somewhere in the stdout/stderr GDAL prints something like:
ERROR 1: "Chicago-Naperville-Elgin, IL-IN-WI" not recognised as an available field.
More details can be found at:
https://trac.osgeo.org/gdal/wiki/rfc52_strict_sql_quoting
To get it working, simply swap the single/double quoting, so:
cbsa.SetAttributeFilter("NAME='Chicago-Naperville-Elgin, IL-IN-WI'")
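The two points above (check the return value, and quote the column name and the literal differently) can be combined in a small helper. This is only a sketch: set_string_filter and FakeLayer are hypothetical names introduced here, and FakeLayer merely stands in for a real OGR layer so the example runs without GDAL installed.

```python
def set_string_filter(layer, field, value):
    """Apply "field" = 'value' to an OGR-style layer and check it was accepted."""
    if "'" in value or '"' in field:
        # Escaping rules are out of scope for this sketch, so refuse such input.
        raise ValueError("quotes inside the field name or value are not handled here")
    # Double quotes delimit the column name, single quotes the string literal.
    where = "\"{}\" = '{}'".format(field, value)
    err = layer.SetAttributeFilter(where)
    if err != 0:
        raise RuntimeError("SetAttributeFilter rejected {!r} (code {})".format(where, err))
    return where


class FakeLayer:
    """Hypothetical stand-in for an OGR layer so the sketch runs without GDAL."""

    def __init__(self):
        self.where = None

    def SetAttributeFilter(self, where):
        self.where = where
        return 0  # 0 signals success, as described above


layer = FakeLayer()
print(set_string_filter(layer, "NAME", "Chicago-Naperville-Elgin, IL-IN-WI"))
```

With a real layer, cbsa would be passed in place of FakeLayer(). If exceptions are enabled through ogr.UseExceptions(), a bad filter may raise instead of returning a non-zero code.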

This question is from a while ago, but when I learn something I like to record what worked for me in case I search for it again.
In my case, the syntax had to look like this:
cbsa.SetAttributeFilter('"NAME" = \'Chicago-Naperville-Elgin\'') # I didn't test multiple values
where the referenced page of the accepted answer says:
<delimited identifier> ::= <double quote> <delimited identifier body> <double quote>
<character string literal> ::= <quote> [ <character representation> ... ] <quote>
It may be that an update to OGR has changed this since 2017.

Different results for SHA-1 hashing between Groovy and Python

This question is related to my earlier query titled "Create WS security headers for REST web service in SoapUI Pro". I raised that one because I couldn't get the script to work at all. I came up with a solution for it, but it only works about 66% of the time.
I have noticed that the code I use to hash the string sometimes produces different results from a Python script hashing the same input.
Below is my Groovy code that hashes an input value.
MessageDigest cript2 = MessageDigest.getInstance("SHA-1");
cript2.update(nonce.getBytes("ASCII"));
PasswordDigest = new String(cript2.digest());
If I run it with the input nonce value 201703281329 it produces the below
ë±ËËÐùìÓ0¼ÕM�¹,óßP‹
If I use the same input value using the Python code below then it produces ëﾱËËÐùìÓ0ﾼÕMﾏﾹ,óßPﾋ
digest = sha.new(nonce).digest()
However if I run the groovy and python scripts with input value 201703281350 then they both produce ..
iàvè©®É¹m:F Ë Â¾
Could someone tell me why I am seeing differences for some input values and not others and how I can modify my groovy code so that it produces same values as Python code?
much appreciated.

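If you compare the bytes returned by the digest method of both languages, you'll find that they are indeed the same. The difference only appears when the raw bytes are turned into text: new String(bytes) decodes them with the platform's default charset, and some byte combinations do not result in printable Java Strings.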
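To compare them, hex-encode the digest bytes in Groovy:
// PasswordDigest must be the byte[] returned by digest(), not a String
def hexString = PasswordDigest.collect { String.format('%02x', it) }.join()
Compare that to the output of hexdigest() in Python:
hexString = sha.new(nonce).hexdigest()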
Both should produce ebb1cbcbd0f9ecd330bcd51b4d8fb92cf3df508b
Edited
Perhaps I didn't make it clear that passwordDigest should not be converted to a String. Below is the complete groovy and python code. Both programs produce the same output:
Groovy:
import java.security.*
String nonce = '201703281329'
MessageDigest digest = MessageDigest.getInstance("SHA-1")
digest.update(nonce.getBytes("ASCII"))
byte[] passwordDigest = digest.digest() // byte[], not string
String hexString = passwordDigest.collect { String.format('%02x', it) }.join()
println hexString
Python:
import sha  # Python 2 only; the sha module was removed in Python 3
nonce = '201703281329'
print sha.new(nonce).hexdigest()
The output of both: ebb1cbcbd0f9ecd330bcd51b4d8fb92cf3df508b.
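The Python snippet above relies on the Python 2 sha module. On Python 3 the same comparison can be sketched with hashlib from the standard library. The variable names below are illustrative, and encoding the nonce as ASCII is an assumption chosen to mirror getBytes("ASCII") in the Groovy code.

```python
import hashlib

nonce = "201703281329"

# Hash the ASCII bytes of the nonce, mirroring nonce.getBytes("ASCII") in Groovy.
digest = hashlib.sha1(nonce.encode("ascii"))

raw_bytes = digest.digest()      # 20 raw bytes; not meant to be printed as text
hex_string = digest.hexdigest()  # 40 hex characters; safe to print and compare

print(hex_string)
```

Comparing hex_string with the Groovy hexString avoids any dependence on how each console renders non-printable bytes.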

Basic string formatting with NIM

I am trying to do some very basic string formatting and I got immediately stuck.
What is wrong with this code?
import strutils
import parseopt2
for kind, key, val in getopt():
  echo "$1 $2 $3" % [kind, key, val]
I get Error: type mismatch: got (TaintedString) but expected 'CmdLineKind = enum' but I don't understand how shall I fix it.

The problem here is that Nim's formatting operator % expects an array of objects with the same type. Since the first element of the array here has the CmdLineKind enum type, the compiler expects the rest of the elements to have the same type. What you really want is for all of the elements to have the string type, and you can enforce this by explicitly converting the first parameter to string (with the $ operator).
import strutils
import parseopt2
for kind, key, val in getopt():
  echo "$1 $2 $3" % [$kind, key, val]
In case you are also wondering what the TaintedString type in the error message is: it is a special type indicating non-validated external input to the program. Since non-validated input data poses a security risk, the language supports a special "taint mode", which helps you keep track of where the inputs may need validation. This mode is inspired by a similar set of features available in the Perl programming language:
http://docstore.mik.ua/orelly/linux/cgi/ch08_04.htm

If you use Nim's built-in strformat library, the same code snippet can be more concise:
import parseopt # parseopt2 has been deprecated!
import strformat
for kind, key, val in getopt():
  echo fmt"{kind} {key} {val}"
Also note that parseopt replaces the deprecated parseopt2 library, at least as of today on Nim 0.19.2.

what is the workaround for QString.contains() method for pyqt4+python3?

I have been converting Qt/C++ widget code to PyQt4+Python3. I have a QFileSystemModel defined, and the items it returns have "data" with the filename as type "str". (This is of type QString in Qt/C++ or Python 2.x.)
I have to search for a filter based on QRegExp. In Qt/C++ and Python 2.x this is achieved with QString.contains(QRegExp).
I found that QString has been removed in Python3. Since everything is now of type "str" in Python3, how can I implement the old method QString.contains(QRegExp)?
Thanks,
Kodanda

For string manipulation, Python is generally superior to anything Qt has to offer (particularly when it comes to regular expressions).
But if you must use QRegExp:
from PyQt4.QtCore import QRegExp
# test whether string contains pattern
if QRegExp(pattern).indexIn(string) != -1:
    print('found')
In plain Python:
import re
if re.search(pattern, string):
    print('found')
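Since the question is about filtering file names, here is a small sketch of the plain-Python approach applied to a list of names. The helper filter_names, the sample file names and the pattern are all made up for illustration. Note that Python's re syntax is not identical to QRegExp's, so existing patterns may need small adjustments.

```python
import re


def filter_names(names, pattern):
    """Return the names that contain a match for the regular expression."""
    regex = re.compile(pattern)
    # search() matches anywhere in the string, like QString.contains(QRegExp).
    return [name for name in names if regex.search(name)]


file_names = ["report_2016.csv", "notes.txt", "report_2017.csv"]
print(filter_names(file_names, r"report_\d{4}\.csv"))
# prints ['report_2016.csv', 'report_2017.csv']
```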

Building a stack in a map

I have a string that looks like this:
"7-6-4-1"
or
"7"
or
""
That is, a set of numbers separated by -. There may be zero or more numbers.
I want to return a stack with the numbers pushed on in that order (i.e. push 7 first and 1 last, for the first example).
If I just wanted to return a list I could just write str.split("-").map{_.toInt} (although this doesn't work on the empty string).
There's no toStack to convert to a Stack, though. So currently, I have
{
  val s = new Stack[Int]
  if (x.nonEmpty)
    x.split('-').foreach {
      y => s.push(y.toInt)
    }
  s
}
Which works, but is pretty ugly. What am I missing?
EDIT: Thanks to all the responders, I learnt quite a bit from this discussion

Stack(x.split("-").map(_.toInt).reverse: _*)
The trick here is to pass the array you get from split into the Stack companion object builder. By default the items go in in the same order as the array, so you have to reverse the array first.
Note the "treat this as a list, not as a single item" annotation, : _*.
Edit: if you don't want to catch the empty string case separately, like so (use the bottom one for mutable stacks, the top for immutable):
if (x.isEmpty) Stack() else Stack(x.split("-").map(_.toInt).reverse: _*)
if (x.isEmpty) Stack[Int]() else Stack(x.split("-").map(_.toInt).reverse: _*)
then you can filter out empty strings:
Stack(x.split("-").filterNot(_.isEmpty).map(_.toInt).reverse: _*)
which will also "helpfully" handle things like 7-9----2-2-4 for you (it will give Stack(4,2,2,9,7)).
If you want to handle even more dastardly formatting errors, you can
val guard = scala.util.control.Exception.catching[Int](classOf[NumberFormatException])
Stack(x.split("-").flatMap(x => guard.opt(x.toInt)).reverse: _*)
to return only those items that actually can be parsed.

(Stack[Int]() /: (if(x.isEmpty) Array.empty else x.split("-")))(
(stack, value) =>
stack.push(value toInt))

Don't forget the ever handy breakOut, which affords slightly better performance than col: _* (see Daniel's excellent explanation). This applies to Scala 2.12 and earlier; breakOut was removed in Scala 2.13.
Used here with Rex Kerr's .filterNot(_.isEmpty) solution:
import scala.collection.immutable.Stack
import scala.collection.breakOut
object StackFromString {
  def stackFromString(str: String): Stack[Int] =
    str.split("-").filterNot(_.isEmpty)
      .reverse.map(_.toInt)(breakOut)
  def main(args: Array[String]): Unit = {
    println(stackFromString("7-6-4-1"))
    println(stackFromString("7"))
    println(stackFromString(""))
  }
}
Will output:
Stack(1, 4, 6, 7)
Stack(7)
Stack()
