Different results for SHA-1 hashing between Groovy and Python - groovy

This question is related to my earlier query titled "Create WS security headers for REST web service in SoapUI Pro". I raised that one because I couldn't get the script to work at all. I have since come up with a solution, but it only works about 66% of the time.
I have noticed that the code that I use to hash the string sometimes produces different results when compared to a python script hashing the same input.
Below is my groovy code that hashes an input value.
MessageDigest cript2 = MessageDigest.getInstance("SHA-1");
cript2.update(nonce.getBytes("ASCII"));
PasswordDigest = new String(cript2.digest());
If I run it with the input nonce value 201703281329 it produces the below
ë±ËËÐùìÓ0¼ÕM�¹,óßP‹
If I use the same input value using the Python code below then it produces ëᄆËËÐùìÓ0ᄐÕMマᄍ,óßPヒ
digest = sha.new(nonce).digest()
However if I run the groovy and python scripts with input value 201703281350 then they both produce ..
iàvè©®É¹m:F ˠ¾
Could someone tell me why I am seeing differences for some input values and not others and how I can modify my groovy code so that it produces same values as Python code?
much appreciated.

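If you compare the bytes returned by the digest method in both languages, you'll find that they are indeed the same. The visible difference comes from converting those bytes to text: new String(bytes) decodes them with the platform default charset, and some byte combinations do not map to printable characters in a Java String.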
To compare them:
def hexString = PasswordDigest.collect { String.format('%02x', it) }.join()
Compare this with the output of hexdigest() in Python:
hexString = sha.new(nonce).hexdigest()
Both should produce ebb1cbcbd0f9ecd330bcd51b4d8fb92cf3df508b
Edited
Perhaps I didn't make it clear that passwordDigest should not be converted to a String. Below is the complete groovy and python code. Both programs produce the same output:
Groovy:
import java.security.*
String nonce = '201703281329'
MessageDigest digest = MessageDigest.getInstance("SHA-1")
digest.update(nonce.getBytes("ASCII"))
byte[] passwordDigest = digest.digest() // byte[], not string
String hexString = passwordDigest.collect { String.format('%02x', it) }.join()
println hexString
Python:
import sha
nonce = '201703281329'
# Python 2 only: the sha module was removed in Python 3
print sha.new(nonce).hexdigest()
The output of both: ebb1cbcbd0f9ecd330bcd51b4d8fb92cf3df508b.
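For readers on Python 3, where the sha module is no longer available, the same comparison can be sketched with hashlib. This is a minimal illustration rather than part of the original answer; the variable names are chosen here for the example. It compares the digest as hex and shows how one could check whether decoding the raw bytes as text loses information.

```python
import hashlib

nonce = "201703281329"

# Raw SHA-1 digest: 20 bytes, not text.
raw = hashlib.sha1(nonce.encode("ascii")).digest()

# Hex is a safe, printable representation to compare across languages.
hex_digest = raw.hex()
print(len(raw))
print(hex_digest)

# Decoding arbitrary digest bytes as text is generally lossy:
# invalid sequences are replaced, so the round trip may not match.
as_text = raw.decode("utf-8", errors="replace")
round_trip = as_text.encode("utf-8")
print(round_trip == raw)
```

Comparing hex strings (or Base64) sidesteps any dependence on the console or default charset.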

Related

How to correctly convert a unicode string to array of bytes in Python 3

I'm having trouble with a conversion. I'm using bcrypt with keyring to store and retrieve passwords. However, keyring is giving me back a unicode string, but bcrypt needs an array of bytes. I'm trying to use the best (most pythonic) way of converting the string that keyring is passing.
I have the following code sample:
password = b"super secret password"
hashed = bcrypt.hashpw(password, bcrypt.gensalt())
keyring.set_password("test_id", "user", hashed)
retrieved_pw = keyring.get_password("test_id", "user")
retrieved_pw2 = retrieved_pw.encode()
The PyCharm debugger shows the following types and values. What I need to pass back to bcrypt is the duplicate of the variable, hashed. When I try to use str.encode(), it seems to encapsulate the answer I need inside another string. I've also tried using bytearray(), but that is basically a duplicate of str.encode()
hashed: {bytes 60} b'$2b$12$267Tbd09J5.BzST7XLS8OOKD9/ebKPIJyu.miCzi8571JrLXyYWnO'
retrieved_pw: {str} 'b'$2b$12$267Tbd09J5.BzST7XLS8OOKD9/ebKPIJyu.miCzi8571JrLXyYWnO''
retrieved_pw2: {bytes 63} b"b'$2b$12$267Tbd09J5.BzST7XLS8OOKD9/ebKPIJyu.miCzi8571JrLXyYWnO'"
What is the best way to have retrieved_pw end up as a duplicate of hashed?
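Judging from the debugger values above, the stored value appears to be the text form of the bytes object (including the b'...' wrapper) rather than its contents. A minimal sketch of one way around that, assuming the store only accepts str: decode before storing and encode after retrieving. The `store` dict below is a hypothetical stand-in for keyring, so the example runs without keyring or bcrypt installed.

```python
import ast

hashed = b"$2b$12$267Tbd09J5.BzST7XLS8OOKD9/ebKPIJyu.miCzi8571JrLXyYWnO"

# Hypothetical stand-in for a string-only store such as keyring.
store = {}

# What appears to have happened: the bytes were stringified, wrapper and all.
store["bad"] = str(hashed)  # starts with "b'"

# Safer: decode to str before storing, encode again after retrieving.
store["good"] = hashed.decode("ascii")
restored = store["good"].encode("ascii")
print(restored == hashed)

# If a wrapped value is already stored, the bytes literal can be parsed back.
recovered = ast.literal_eval(store["bad"])
print(recovered == hashed)
```

A bcrypt hash contains only ASCII characters, which is why "ascii" is used for both directions here.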

Is it possible to use a string in a SELECT or WHERE statement in spark?

I'm processing some textual data and I transform them into interpretable commands that would be used as argument for a WHERE statement but I get a string and I don't know how to use it.
For example from the string :
'c_programme_nom == "2-Broke-Girls"'
I get :
"F.col('name').like('%2-Broke-Girls%')"
But I get a string and I would like to use it as a parameter in a WHERE statement.
The expected result would be :
df.where(F.col('name').like('%2-Broke-Girls%'))
I don't know if there is a way to do it.
It seems you are looking to execute strings containing code.
You can use exec in Python:
The exec() function dynamically executes Python code supplied either as a string or as a code object. A string is parsed as a suite of Python statements and then executed unless a syntax error occurs; a code object is simply executed. Note that exec() always returns None, so to obtain the value of an expression such as the one above you would use eval() instead. Either way, only run strings you trust.
exec('print("The sum of 5 and 10 is", (5+10))')
# The sum of 5 and 10 is 15
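The difference between executing a statement and evaluating an expression matters for the question, since a filter condition is an expression whose value is needed. The sketch below uses plain arithmetic so it runs without Spark; the names are illustrative only. For the string in the question, the analogous call would presumably be eval with F available in the namespace, and evaluating untrusted input this way is unsafe.

```python
expr_source = "(5 + 10) * 2"

# exec runs the code but discards the value: it always returns None.
exec_result = exec(expr_source)
print(exec_result)

# eval evaluates a single expression and returns its value.
eval_result = eval(expr_source)
print(eval_result)

# Passing explicit namespaces limits which names the string can see.
scoped = eval("limit * 2", {"__builtins__": {}}, {"limit": 21})
print(scoped)
```

As an alternative that avoids eval entirely, Spark's where also accepts a SQL condition written as a string, so generating that form of condition may be simpler than generating Python source.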

Output of a random generated list has an extra brackets

Here is my code:
import random
def randnum():
    rando = random.sample(range(1000,9999),1)
    strrando = list(str(rando))
    return strrando
The purpose is mostly fulfilled: I am generating a list of 4 random digits, each converted to a string. The only problem is that the list also contains the bracket characters as strings, so it has 6 elements instead of 4. Could anyone explain why this happens?
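A likely explanation, sketched here as an illustration: random.sample returns a list, so str() produces the text of that list, brackets included, and list() then splits that text character by character. Drawing a single integer avoids the wrapper. The use of randrange below is a substitution made for this sketch, keeping the same range as the original code.

```python
import random

# str() of a one-element list includes the brackets, e.g. '[4821]'.
sample_repr = str(random.sample(range(1000, 9999), 1))
print(list(sample_repr))  # 6 characters: '[', four digits, ']'

def randnum():
    # Draw one integer directly, then split its digits.
    number = random.randrange(1000, 9999)
    return list(str(number))

digits = randnum()
print(digits)
```

Indexing the sample with `[0]` before calling str() would work just as well.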

how use struct.pack for list of strings

I want to write a list of strings to a binary file. Suppose I have a list of strings, mylist. Assume each item of the list has a '\t' at the end, except the last one, which has a '\n' at the end (to help me recover the data later). Example: ['test\t', 'test1\t', 'test2\t', 'testl\n']
For a numpy ndarray, I found the following script that worked (got it from here numpy to r converter):
binfile = open('myfile.bin','wb')
for i in range(mynpdata.shape[1]):
    binfile.write(struct.pack('%id' % mynpdata.shape[0], *mynpdata[:,i]))
binfile.close()
Does binfile.write automatically parse all the data if the variable has a * in front of it (as in the *mynpdata[:,i] example above)? Would this work with a list of integers in the same way (e.g. *myIntList)?
How can I do the same with a list of strings?
I tried it on a single string using (which I found somewhere on the net):
oneString = 'test'
oneStringByte = bytes(oneString,'utf-8')
struct.pack('I%ds' % (len(oneString),), len(oneString), oneString)
but I couldn't understand why the % within 'I%ds' above is filled by (len(oneString),) instead of len(oneString) as in the ndarray example, and also why both len(oneString) and oneString are passed.
Can someone help me with writing a list of strings (if necessary, assuming it is written to the same binary file where I wrote out the ndarray)?
There's no need for struct. Simply join the strings and encode them using either a specified or an assumed text encoding in order to turn them into bytes.
''.join(L).encode('utf-8')
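The two approaches can be put side by side. The sketch below is illustrative only; `pack_strings` and `unpack_strings` are hypothetical helper names, not part of struct. It shows the join-and-encode approach from the answer, and a length-prefixed layout built with struct.pack, which is what the 'I%ds' format in the question does: the integer length is written first, followed by that many bytes, so both values must be passed.

```python
import struct

def pack_strings(strings, encoding="utf-8"):
    """Write each string as a 4-byte little-endian length plus its bytes."""
    chunks = []
    for s in strings:
        data = s.encode(encoding)
        chunks.append(struct.pack("<I%ds" % len(data), len(data), data))
    return b"".join(chunks)

def unpack_strings(blob, encoding="utf-8"):
    """Read back strings written by pack_strings."""
    strings, offset = [], 0
    while offset < len(blob):
        (length,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        strings.append(blob[offset:offset + length].decode(encoding))
        offset += length
    return strings

mylist = ['test\t', 'test1\t', 'test2\t', 'testl\n']

# Approach from the answer: the delimiters already mark the boundaries.
joined = ''.join(mylist).encode('utf-8')

# Length-prefixed alternative: no delimiter needed inside the data.
packed = pack_strings(mylist)
print(unpack_strings(packed) == mylist)
```

The length-prefixed form is mainly useful when the strings share a file with other binary data, since a reader can tell where each string ends without scanning for a delimiter.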

Unpredictable behaviour of scala triple quoted string

I'm using junit in scala to compare string output from my scala code. Something like :
val expected = """<Destination id="0" type="EMAIL">
<Address>
me#test.com
</Address>
<TimeZone>
US/Eastern
</TimeZone>
<Message>
</Message>
</Destination>
"""
val actual = getTextToLog()
info(" actual = " + actual)
assert(expected == actual)
The issue is that for some strings, assertions like :
assert(expected == actual)
work, and for some strings they don't. Even when I copy actual from the Eclipse console (where it is logged) and paste it into expected just to be sure, the assertion still fails.
What am I missing?
OK, since this turns out to be a whitespace issue, you should sanitise the two strings before comparing them. Look at the RichString methods like .lines, for example, which might let you create a line-ending-agnostic or whitespace-agnostic comparison method.
Here is one naive way of doing this with implicit conversions:
import scala.language.implicitConversions
object WhiteSpace {
  implicit def strToWhiteSpace(s: String) = new WhiteSpace(s)
}
class WhiteSpace(val s: String) {
  def `~==` (other: String) = s.lines.toList == other.lines.toList
}
which allows
import WhiteSpace._
assert(expected ~== actual)
Or you could extend the appropriate jutils class to add an agnostic version of assertEquals.
Note that this comparison deconstructs both strings in the same way. This is much safer than sending one of them on a round-trip conversion.
Whitespace/crlf issues are so common that there's no point fighting it by trying to stop the mismatches; just do agnostic comparisons.
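The same idea, deconstructing both strings identically before comparing, can be sketched in Python for illustration. The `same_lines` helper is a hypothetical name introduced for this example, and the sample strings are shortened versions of the XML in the question.

```python
def same_lines(a, b):
    """Compare two strings line by line, ignoring the line-ending style."""
    return a.splitlines() == b.splitlines()

expected = "<Address>\nme#test.com\n</Address>\n"
actual = "<Address>\r\nme#test.com\r\n</Address>\r\n"

print(expected == actual)            # differs only in line endings
print(same_lines(expected, actual))  # line-ending-agnostic comparison
```

Both inputs pass through the same splitting step, so neither is favoured by the normalisation.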
