How to securely verify an HMAC in Python 2.7?

I'm using Python 2.7 and am creating an HMAC using the hmac library. Python 3.3 includes a compare_digest() function that will compare two digests and resist timing attacks, but that's not available in 2.7. Prevailing advice is not to roll my own crypto, so are there any mature Python libraries that provide that functionality? PyCrypto does not appear to.

For anyone finding this from search, if using Django, then you can also use the constant_time_compare function in django.utils.crypto.
>>> from django.utils.crypto import constant_time_compare
>>> constant_time_compare("foo", "bar")
False
>>> constant_time_compare("foo", "foo")
True
Note that this comes with the same caveat as hmac.compare_digest (and actually uses hmac.compare_digest if it exists):
Note: If a and b are of different lengths, or if an error occurs, a timing attack could theoretically reveal information about the types and lengths of a and b, but not their values.

I would suggest you just replicate the secure compare method available in 3.3.
Here is an implementation very similar to the pure-Python one, adapted so it runs on 2.7 (where iterating over bytes yields one-character strings):
def compare_digest(x, y):
    if not (isinstance(x, bytes) and isinstance(y, bytes)):
        raise TypeError("both inputs should be instances of bytes")
    if len(x) != len(y):
        return False
    result = 0
    for a, b in zip(x, y):
        result |= ord(a) ^ ord(b)  # ord() because on 2.7 bytes iterate as str
    return result == 0
Can't see how that would breach any licenses.

If you have access to Python 2.7.7 or later, compare_digest() has been backported to that version under PEP 466 (the more secure 3.x SSL module followed in 2.7.9):
https://www.python.org/dev/peps/pep-0466/
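Putting the pieces together, a minimal verification helper might look like the sketch below. It assumes both sides exchange hex-encoded SHA-256 MACs and that hmac.compare_digest is available (Python 2.7.7+ / 3.3+); otherwise, substitute one of the constant-time compares above:
import hmac
import hashlib

def verify_hmac(key, message, received_mac):
    # Recompute the MAC over the received message...
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    # ...and compare it to the received value in constant time.
    return hmac.compare_digest(expected, received_mac)

# e.g. verify_hmac(b'secret-key', b'payload', received_hex_mac)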

Related

How did numpy add the @ operator?

How did they do it? Can I also add my own new operators to Python 3? I searched on google but I did not find any information on this.
No, you can't add your own. The numpy team cooperated with the core Python team to add @ to the core language. It's in the core Python docs (for example, in the operator precedence table), although core Python doesn't use it for anything in the standard CPython distribution. The core distribution nevertheless recognizes the operator symbol and generates an appropriate BINARY_MATRIX_MULTIPLY opcode for it:
>>> import dis
>>> def f(a, b):
...     return a @ b
...
>>> dis.dis(f)
  2           0 LOAD_FAST                0 (a)
              2 LOAD_FAST                1 (b)
              4 BINARY_MATRIX_MULTIPLY
              6 RETURN_VALUE
Answering your second question,
Can I also add my own new operators to Python 3?
A similar question with some very interesting answers can be found here: Python: defining my own operators?
At PyCon US 2022, Sebastiaan Zeeff delivered a talk showing how to implement a new operator. He warns that the implementation is purely educational, though. However, it turns out you actually can implement a new operator yourself! Locally, of course :). You can find his talk here, and his code repository here. And if you think your operator could enhance the Python language, why not propose a PEP for it?
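As a side note to the answers above: while you cannot invent new operator symbols, any class of your own can hook into the existing @ operator (Python 3.5+) by defining __matmul__ (and __rmatmul__/__imatmul__ if needed). A toy sketch:
class Matrix2x2:
    def __init__(self, a, b, c, d):
        self.a, self.b, self.c, self.d = a, b, c, d

    def __matmul__(self, other):
        # Plain 2x2 matrix multiplication, dispatched by the @ operator
        return Matrix2x2(
            self.a * other.a + self.b * other.c,
            self.a * other.b + self.b * other.d,
            self.c * other.a + self.d * other.c,
            self.c * other.b + self.d * other.d,
        )

m = Matrix2x2(1, 2, 3, 4) @ Matrix2x2(5, 6, 7, 8)
print(m.a, m.b, m.c, m.d)  # 19 22 43 50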

Implementing "openssl_private_encrypt" in latest Python 3 versions

I'm trying to maintain a Python implementation of the FastSpring e-commerce platform's Secure Payload API.
Their documentation has examples for encrypting (or technically signing?) the payload with a private key in Java and PHP: https://developer.fastspring.com/docs/pass-a-secure-request#locally-encrypt-the-session-object
I had previously been using an implementation based on the Python "cryptography" library, from this repository:
https://github.com/klokantech/flask-fastspring/blob/0462833f67727cba9755a26c88941f11f0159a48/flask_fastspring.py#L247
However, that relies on the undocumented OpenSSL binding "_lib.RSA_private_encrypt()", which is no longer exposed in cryptography versions newer than 2.9.2. That release is already several years old, no longer ships binary wheels for the latest Python versions, and pip has to compile it from source.
PyCryptodome seems to include similar RSA private-key signing with PKCS #1 v1.5 padding, but it requires the payload to be a Hash object, so naturally the produced output doesn't match what FastSpring expects regardless of which hash function I use: https://pycryptodome.readthedocs.io/en/latest/src/signature/pkcs1_v1_5.html?highlight=pkcs1_15#pkcs-1-v1-5-rsa
I have been trying to figure out alternative ways to implement this kind of "private key encryption" without success. So my question is: is there ANY way to do this with up-to-date Python libraries, or am I stuck using an outdated cryptography version until it is no longer supported at all?
The two linked code samples implement low-level signing with RSASSA-PKCS1-v1_5, but use a modified encoding for the message rather than EMSA-PKCS1-v1_5, so the processing differs from the standard.
The two major Python crypto libraries, PyCryptodome and Cryptography, only support high-level signing, which encapsulates the entire process, follows the standard, and thus does not allow any modification of the message encoding.
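(For contrast, the high-level path in PyCryptodome looks roughly like the sketch below; the library insists on a Hash object and applies the EMSA-PKCS1-v1_5 encoding internally, which is precisely the step the FastSpring scheme replaces. The key filename is a placeholder.)
from Crypto.PublicKey import RSA
from Crypto.Signature import pkcs1_15
from Crypto.Hash import SHA256

key = RSA.import_key(open('private.pem').read())  # placeholder key file
h = SHA256.new(b'payload')                        # the API requires a Hash object
signature = pkcs1_15.new(key).sign(h)             # will NOT match FastSpring's expected output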
The most efficient way to solve the problem would be a Python library that also supports low-level signing, so that the message encoding from the linked Java or Python code could be reused. However, I am not aware of such a library.
If you don't know of such a library either, there is the following alternative: since RSASSA-PKCS1-v1_5 is pretty simple and Python natively supports large integers and their operations, a custom implementation in combination with the helper functions of e.g. PyCryptodome is easily possible. At least you wouldn't have to rely on the legacy library anymore:
from Crypto.Util import number

def customizedSign(key, msg):
    modBits = number.size(key.n)
    k = number.ceil_div(modBits, 8)
    ps = b'\xFF' * (k - len(msg) - 3)
    em = b'\x00\x01' + ps + b'\x00' + msg
    em_int = number.bytes_to_long(em)
    m_int = key._decrypt(em_int)
    signature = number.long_to_bytes(m_int, k)
    return signature
Explanation:
The implementation follows the PyCryptodome implementation of the sign() method. The only functional difference is that the encoding from the linked code is used instead of EMSA-PKCS1-v1_5.
EMSA-PKCS1-v1_5 is defined as:
EM = 0x00 || 0x01 || PS || 0x00 || T
where T is the concatenation of the DER encoded DigestInfo value and the hashed message, see here.
The encoding in the linked code simply uses the message MSG instead of T:
EM = 0x00 || 0x01 || PS || 0x00 || MSG
In both cases, PS is a padding with 0xFF values up to the key size (i.e. size of the modulus).
Usage and test:
Since the signature is deterministic, the same key and the same message always provide the same signature. This way it is easy to show that the above function is equivalent to the linked Java or Python code:
from Crypto.PublicKey import RSA
import base64
# For simplicity, a 512-bit key is used. Note that such a small key may only be used for testing; in practice the key size has to be >= 2048 bits for security reasons.
pkcs8 = """-----BEGIN PRIVATE KEY-----
MIIBVQIBADANBgkqhkiG9w0BAQEFAASCAT8wggE7AgEAAkEA2gdsVIRmg5IH0rG3
u3w+gHCZq5o4OMQIeomC1NTeHgxbkrfznv7TgWVzrHpr3HHK8IpLlG04/aBo6U5W
2umHQQIDAQABAkEAu7wulGvZFat1Xv+19BMcgl3yhCdsB70Mi+7CH98XTwjACk4T
+IYv4N53j16gce7U5fJxmGkdq83+xAyeyw8U0QIhAPIMhbtXlRS7XpkB66l5DvN1
XrKRWeB3RtvcUSf30RyFAiEA5ph7eWXbXWpIhdWMoe50yffF7pW+C5z07tzAIH6D
Ko0CIQCyveSTr917bdIxk2V/xNHxnx7LJuMEC5DcExorNanKMQIgUxHRQU1hNgjI
sXXZoKgfaHaa1jUZbmOPlNDvYYVRyS0CIB9ZZee2zubyRla4qN8PQxCJb7DiICmH
7nWP7CIvcQwB
-----END PRIVATE KEY-----"""
key = RSA.import_key(pkcs8)
msg = 'The quick brown fox jumps over the lazy dog'.encode('utf8')
signature = customizedSign(key, msg)
print(base64.b64encode(signature).decode('utf8')) # OwpVG/nPmkIbVxONRwXHvOqLdYNnP67YtiWA+GcKBZ3rIzAJ+8izvmlqUQnzVp03Wrrzq2ogUmCMaLSPlInDNw==
The linked Java code provides the same signature for the same key and message.
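If you want to sanity-check such a signature locally, the matching raw verification is just the RSA public operation followed by a comparison against the rebuilt encoding. Again only a sketch, using the same PyCryptodome helpers and the key/msg/signature from the test above:
from Crypto.Util import number

def customizedVerify(pub_key, msg, signature):
    modBits = number.size(pub_key.n)
    k = number.ceil_div(modBits, 8)
    # RSA public operation: s^e mod n
    em_int = pow(number.bytes_to_long(signature), pub_key.e, pub_key.n)
    em = number.long_to_bytes(em_int, k)
    # Rebuild the expected encoding 0x00 || 0x01 || PS || 0x00 || MSG and compare
    ps = b'\xFF' * (k - len(msg) - 3)
    return em == b'\x00\x01' + ps + b'\x00' + msg

print(customizedVerify(key.publickey(), msg, signature))  # True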

Python int hash, is that a feature or a bug for 'hash(int(-1))'? [duplicate]

Possible Duplicate:
When is a python object's hash computed and why is the hash of -1 different?
Why do -1 and -2 both hash to the same number in Python?
Since they do, how does Python tell these two numbers apart?
>>> -1 is -2
False
>>> hash(-1) is hash(-2)
True
>>> hash(-1)
-2
>>> hash(-2)
-2
-1 is a reserved value at the C level of CPython, which prevents hash functions from ever producing a hash value of -1. As noted by DSM, the same is not true in IronPython and PyPy, where hash(-1) != hash(-2).
See this Quora answer:
If you write a type in a C extension module and provide a tp_hash
method, you have to avoid -1 — if you return -1, Python will assume
you meant to throw an error.
If you write a class in pure Python and provide a __hash__ method,
there's no such requirement, thankfully. But that's because the C code
that invokes your __hash__ method does that for you — if your
__hash__ returns -1, then hash() applied to your object will actually return -2.
Which really just repackages the information from effbot:
The hash value -1 is reserved (it’s used to flag errors in the C
implementation). If the hash algorithm generates this value, we simply
use -2 instead.
You can also see this in the source. For example, for Python 3's int object, this is at the end of the hash implementation:
    if (x == (Py_uhash_t)-1)
        x = (Py_uhash_t)-2;
    return (Py_hash_t)x;
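You can watch the same substitution happen from pure Python in CPython (as noted above, PyPy and IronPython behave differently):
>>> hash(-1)
-2
>>> class Weird(object):
...     def __hash__(self):
...         return -1
...
>>> hash(Weird())  # the C code that calls __hash__ swaps -1 for -2
-2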
Since they do, how does Python tell these two numbers apart?
Since all hash functions map a large input space to a smaller input space, collisions are always expected, no matter how good the hash function is. Think of hashing strings, for example. If hash codes are 32-bit integers, you have 2^32 (a little more than 4 billion) hash codes. If you consider all ASCII strings of length 6, you have (2^7)^6 (just under 4.4 trillion) different items in your input space. With only this set, you are guaranteed to have many, many collisions no matter how good you are. Add Unicode characters and strings of unlimited length to that!
Therefore, the hash code only hints at the location of an object; an equality test follows to check candidate keys. To implement a membership test in a hash-table set, the hash code gives you the "bucket" number in which to search for the value. However, all set items with the same hash code land in the same bucket, so you also need an equality test to distinguish between all candidates in the bucket.
This hash code and equality duality is hinted at in the CPython documentation on hashable objects. In other languages/frameworks, there is a guideline/rule that if you provide a custom hash code function, you must also provide a custom equality test (performed on the same fields as the hash code function).
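Here is the duality at work on the -1/-2 case from above: both keys hash to the same value, yet the dictionary keeps them apart via the equality check:
>>> hash(-1) == hash(-2)
True
>>> d = {-1: 'minus one', -2: 'minus two'}
>>> d[-1], d[-2]
('minus one', 'minus two')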
Indeed, today's Python release addresses exactly this, with a security patch for the efficiency problem when identical hash values are produced on a massive scale as a denial-of-service attack: http://mail.python.org/pipermail/python-list/2012-April/1290792.html

How to efficiently write raw bytes to numpy array data in python 3

While migrating some old python 2 code to python 3, I ran into some problems populating structured numpy arrays from bytes objects.
I have a parser that defines a specific dtype for each type of data structure I might encounter. Since, in general, a given data structure may have variable-length or variable-type fields, these have been represented in the numpy array as fields of object dtype (np.object, alternatively np.dtype('O')).
The array is obtained from bytes (or a bytearray) by first populating the fixed-dtype fields. After this, the dtype of any sub-arrays (contained in 'object' fields) can be built using information from the fixed fields that precede it.
Here is a partial example of this process (dealing only with the fixed-dtype fields) that works in python 2. Note that we have a field named 'nSamples', which will presumably tell us the length of the array pointed to by the 'samples' field of the array, which would be interpreted as a numpy array with shape (2,) and dtype sampleDtype:
import numpy as np

fancyDtype = np.dtype([('blah', '<u4'),
                       ('bleh', 'S5'),
                       ('nSamples', '<u8'),
                       ('samples', 'O')])
sampleDtype = np.dtype([('sampleId', '<u2'),
                        ('val', '<f4')])

bytesFromFile = bytearray(
    b'*\x00\x00\x00hello\x02\x00\x00\x00\x00\x00\x00\x00\xd0\xb5'
    b'\x14_\xa1\x7f\x00\x00"\x00\x00\x00\x80?]\x00\x00\x00\xa0#')

arr = np.zeros((1,), dtype=fancyDtype)
numBytesFixedPortion = 17

# Start out by just reading the fixed-type portion of the array
arr.data[:numBytesFixedPortion] = bytesFromFile[:numBytesFixedPortion]
memoryview(arr.data)[:numBytesFixedPortion] = bytesFromFile[:numBytesFixedPortion]
Both of the last two statements work in Python 2.7.
Of note is that if I type
arr.data
I get <read-write buffer for 0x7f7a93bb7080, size 25, offset 0 at 0x7f7a9339cf70>, which tells me this is a buffer. Obviously, memoryview(arr.data) returns a memoryview object.
Both of these statements raise the following exception in python 3.6:
NotImplementedError: memoryview: unsupported format T{I:blah:5s:bleh:=Q:nSamples:O:samples:}
This tells me that numpy is returning a different type with its data attribute access, a memoryview rather than a buffer. It also tells me that memoryviews worked in python 2.7 but don't in python 3.6 for this purpose.
I found a similar issue in numpy's issue tracker: https://github.com/numpy/numpy/issues/13617
However, the issue was closed quickly, with the numpy developer indicating that it is a bug in ctypes. Since ctypes is a builtin, I kind of gave up hope on just updating it to get a fix.
I did finally stumble upon a solution that works, though it takes roughly twice as long as the python 2.7 method. It is:
import struct

struct.pack_into(
    'B' * numBytesFixedPortion,   # fmt
    arr.data,                     # buffer
    0,                            # offset
    *buf[:numBytesFixedPortion]   # unpacked byte values
)
A coworker also suggested attempting to use this solution:
arrView = arr.view('u1')
arrView[:numBytesFixedPortion] = buf[:numBytesFixedPortion]
However, on doing this, I get the exception:
File "/home/tintedFrantic/anaconda2/envs/py3/lib/python3.6/site-packages/numpy/core/_internal.py", line 461, in _view_is_safe
raise TypeError("Cannot change data-type for object array.")
TypeError: Cannot change data-type for object array.
Note that I get this exception in both python 2.7 and 3.6. It appears numpy disallows views on arrays with any object fields. (Aside: I was able to get numpy to do this correctly by commenting out the check for object-type fields in the numpy code, though that seems a dangerous solution (and not a very portable one either)).
I've also tried creating separate arrays, one with the fixed-dtype fields and one with the object-dtype field and then using numpy.lib.recfunctions.merge_arrays to merge them. That fails with a cryptic message that I can't remember.
I am at a bit of a loss. I just want to write some arbitrary bytes to the numpy array's underlying memory and do it efficiently. This doesn't seem like it should be too hard to do, but I haven't come across a good way to do it. I would like a solution that isn't a hack either, as this is going into systems that need high reliability. If nothing better exists, I will use the struct.pack_into() solution, but I am hoping someone out there knows a better way. By the way, NOT using object-dtype fields is NOT a viable option, as the cost of doing so would be prohibitive.
If it matters, I am using numpy 1.16.2 in python 2.7 and 1.17.4 for python 3.6.
Per the suggestion of @nawsleahcimnoraa, I found out that in python 3.3+ (so not in python 2.7), the memoryview object, which is returned by arr.data in my python 3 environment, has a cast() method. Thus, I can do
arr.data.cast('B')[startIdx:endIdx] = buf[:numBytes]
This is much more like what I had in python 2.7. It is a lot more concise and also performs a little better than the struct method above.
One thing I noticed in testing these solutions is that, in general, the python 3 solutions were slower than the python 2 versions. For example, I tried the struct solution both using python 2 and python 3 and found a significant increase in processing time for python 3.
I also found fairly sizable discrepancies between different python environments of the same version. For example, I found that my system install of python 3.6 performed better than a virtual environment install of python 3.6, so it seems that the results will likely depend largely on a given environment's configuration.
Overall, I am happy with the results of using the cast() method of the memoryview object returned by arr.data and will use that for now. However, if someone discovers something that works better, I would still love to hear about it.
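For anyone wanting to reuse this, the cast() approach wraps up into a small helper. This is only a sketch under the same assumptions as above (C-contiguous array, fixed-dtype fields occupying the first bytes of each record), not a tested drop-in:
def write_fixed_bytes(arr, raw, num_bytes):
    # Reinterpret the structured array's buffer as plain unsigned bytes
    # and copy the fixed-dtype portion in place (needs Python 3.3+ memoryview.cast).
    arr.data.cast('B')[:num_bytes] = raw[:num_bytes]

# e.g. write_fixed_bytes(arr, bytesFromFile, numBytesFixedPortion)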

Canonical way to generate random numbers in Cython

What is the best way to generate pseudo-uniform random numbers (a double in [0, 1)) that is:
Cross-platform (ideally with the same sample sequence)
Thread safe (explicit passing of the mutated state of the PRNG, or using a thread-local state internally)
Without GIL lock
Easily wrappable in Cython
There was a similar post over 3 years ago about this, but a lot of the answers don't meet all the criteria. For example, drand48 is POSIX-specific.
The only method I'm aware of which seems (though I'm not sure) to meet these criteria is:
from libc.stdlib cimport rand, RAND_MAX
random = rand() / (RAND_MAX + 1.0)
Note @ogrisel asked the same question about 3 years ago.
Edit
Calling rand is not thread safe. Thanks for pointing that out @DavidW.
Big pre-answer caveat: this answer recommends using C++ because the question specifically asks for a solution that runs without the GIL. If you don't have this requirement (and you probably don't...) then Numpy is the simplest and easiest solution. Provided that you're generating large amounts of numbers at a time you will find Numpy perfectly quick. Don't be misled into a complicated exercise in wrapping C++ because someone asked for a no-gil solution.
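To make the Numpy route concrete, it is roughly the sketch below (using the Generator API from numpy >= 1.17; older versions would use np.random.RandomState instead):
import numpy as np

rng = np.random.default_rng(5)   # seeded, reproducible across platforms
samples = rng.random(1000000)    # a million doubles uniform in [0, 1)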
Original answer:
I think the easiest way to do this is to use the C++11 standard library, which provides nice encapsulated random number generators and ways to use them. This is of course not the only option, and you could wrap pretty much any suitable C/C++ library (one good option might be to use whatever library numpy uses, since that's most likely already installed).
My general advice is to only wrap the bits you need and not bother with the full hierarchy and all the optional template parameters. By way of example I've shown one of the default generators, fed into a uniform float distribution.
# distutils: language = c++
# distutils: extra_compile_args = -std=c++11

cdef extern from "<random>" namespace "std":
    cdef cppclass mt19937:
        mt19937()  # we need to define this constructor to stack allocate classes in Cython
        mt19937(unsigned int seed)  # not worrying about matching the exact int type for seed

    cdef cppclass uniform_real_distribution[T]:
        uniform_real_distribution()
        uniform_real_distribution(T a, T b)
        T operator()(mt19937 gen)  # ignore the possibility of using other classes for "gen"

def test():
    cdef:
        mt19937 gen = mt19937(5)
        uniform_real_distribution[double] dist = uniform_real_distribution[double](0.0, 1.0)
    return dist(gen)
(The -std=c++11 at the start is for GCC. For other compilers you may need to tweak this. Increasingly c++11 is a default anyway, so you can drop it)
With reference to your criteria:
Cross platform on anything that supports C++. I believe the sequence should be specified so it's repeatable.
Thread safe, since the state is stored entirely within the mt19937 object (each thread should have its own mt19937).
No GIL - it's C++, with no Python parts
Reasonably easy.
Edit: about using discrete_distribution.
This is a bit harder because the constructors for discrete_distribution are less obvious to wrap (they involve iterators). I think the easiest thing to do is to go via a C++ vector, since support for that is built into Cython and it is readily convertible to/from a Python list.
# use Cython's built in wrapping of std::vector
from libcpp.vector cimport vector

cdef extern from "<random>" namespace "std":
    # mt19937 as before

    cdef cppclass discrete_distribution[T]:
        discrete_distribution()
        # The following constructor is really a more generic template class
        # but tell Cython it only accepts vector iterators
        discrete_distribution(vector.iterator first, vector.iterator last)
        T operator()(mt19937 gen)

# an example function
def test2():
    cdef:
        mt19937 gen = mt19937(5)
        vector[double] values = [1, 3, 3, 1]  # autoconvert vector from Python list
        discrete_distribution[int] dd = discrete_distribution[int](values.begin(), values.end())
    return dd(gen)
Obviously that's a bit more involved than the uniform distribution, but it's not impossibly complicated (and the nasty bits could be hidden inside a Cython function).
